Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf updates from Ingo Molnar:
 "Kernel side changes:

   - Add branch type profiling/tracing support. (Jin Yao)

   - Add the PERF_SAMPLE_PHYS_ADDR ABI to allow the tracing/profiling of
     physical memory addresses, where the PMU supports it. (Kan Liang)

   - Export some PMU capability details in the new
     /sys/bus/event_source/devices/cpu/caps/ sysfs directory. (Andi
     Kleen)

   - Aux data fixes and updates (Will Deacon)

   - kprobes fixes and updates (Masami Hiramatsu)

   - AMD uncore PMU driver fixes and updates (Janakarajan Natarajan)

  On the tooling side, here's a (limited!) list of highlights - there
  were many other changes that I could not list, see the shortlog and
  git history for details:

  UI improvements:

   - Implement a visual marker for fused x86 instructions in the
     annotate TUI browser, available now in 'perf report', more work
     needed to have it available as well in 'perf top' (Jin Yao)

     Further explanation from one of Jin's patches:

             │   ┌──cmpl   $0x0,argp_program_version_hook
       81.93 │   ├──je     20
             │   │  lock   cmpxchg %esi,0x38a9a4(%rip)
             │   │↓ jne    29
             │   │↓ jmp    43
       11.47 │20:└─→cmpxch %esi,0x38a999(%rip)

     That means the cmpl+je is a fused instruction pair and they should
     be considered together.

   - Record the branch type and then show statistics and info about it in
     callchain entries (Jin Yao)

     Example from one of Jin's patches:

        # perf record -g -j any,save_type
        # perf report --branch-history --stdio --no-children

        38.50%  div.c:45                [.] main                    div
                |
                ---main div.c:42 (RET CROSS_2M cycles:2)
                   compute_flag div.c:28 (cycles:2)
                   compute_flag div.c:27 (RET CROSS_2M cycles:1)
                   rand rand.c:28 (cycles:1)
                   rand rand.c:28 (RET CROSS_2M cycles:1)
                   __random random.c:298 (cycles:1)
                   __random random.c:297 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (COND_BWD CROSS_2M cycles:1)
                   __random random.c:295 (cycles:1)
                   __random random.c:295 (RET CROSS_2M cycles:9)

  namespaces support:

   - Add initial support for namespaces, using setns to access files in
     namespaces, grabbing their build-ids, etc. (Krister Johansen)

  perf trace enhancements:

   - Beautify pkey_{alloc,free,mprotect} arguments in 'perf trace'
     (Arnaldo Carvalho de Melo)

   - Add initial 'clone' syscall args beautifier in 'perf trace'
     (Arnaldo Carvalho de Melo)

   - Ignore 'fd' and 'offset' args for MAP_ANONYMOUS in 'perf trace'
     (Arnaldo Carvalho de Melo)

   - Beautifiers for the 'cmd' arg of several ioctl types, including:
     sound, DRM, KVM, vhost virtio and perf_events. (Arnaldo Carvalho de
     Melo)

   - Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data'
     CTF conversion, allowing CTF trace visualization tools to show
     callchains and to resolve symbols (Geneviève Bastien)

   - Beautify the fcntl syscall, which is an interesting one in the
     sense that infrastructure had to be put in place to change the
     formatters of some arguments according to the value in a previous
     one, i.e. cmd dictates how arg and the syscall return will be
     formatted. (Arnaldo Carvalho de Melo)

  perf stat enhancements:

   - Use group read for event groups in 'perf stat', reducing overhead
     when groups are defined in the event specification, i.e. when using
     {} to enclose a list of events, asking them to be read at the same
     time, e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa)

  pipe mode improvements:

   - Process tracing data in 'perf annotate' pipe mode (David
     Carrillo-Cisneros)

   - Add header record types to pipe-mode, now this command:

        $ perf record -o - -e cycles sleep 1 | perf report --stdio --header

     Will show the same as in non-pipe mode, i.e. involving a perf.data
     file (David Carrillo-Cisneros)

  Vendor specific hardware event support updates/enhancements:

   - Update POWER9 vendor events tables (Sukadev Bhattiprolu)

   - Add POWER9 PMU events (Sukadev Bhattiprolu)

   - Support additional POWER8+ PVR in PMU mapfile (Shriya)

   - Add Skylake server uncore JSON vendor events (Andi Kleen)

   - Support exporting Intel PT data to sqlite3 with python perf
     scripts, this is in addition to the postgresql support that was
     already there (Adrian Hunter)"
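
Illustration for the 'perf stat' group read item above: a minimal sketch,
assuming the group read goes through the PERF_FORMAT_GROUP read format
described in perf_event_open(2). It only decodes the buffer that a single
read() on the group leader would return. The function name
parse_group_read() and the synthetic buffer are hypothetical, nothing here
is taken from the perf tool sources, and newer fields such as
PERF_FORMAT_LOST are not handled.

```python
import struct

# Bit values of perf_event_attr.read_format, as documented in perf_event_open(2).
PERF_FORMAT_TOTAL_TIME_ENABLED = 1 << 0
PERF_FORMAT_TOTAL_TIME_RUNNING = 1 << 1
PERF_FORMAT_ID = 1 << 2
PERF_FORMAT_GROUP = 1 << 3


def parse_group_read(buf, read_format):
    """Decode one PERF_FORMAT_GROUP read buffer (hypothetical helper).

    Layout: u64 nr, optional u64 time_enabled, optional u64 time_running,
    then nr entries of u64 value followed by an optional u64 id.
    """
    if not read_format & PERF_FORMAT_GROUP:
        raise ValueError("read_format lacks PERF_FORMAT_GROUP")

    def take_u64(offset):
        # Native byte order, fixed 8-byte width.
        (value,) = struct.unpack_from("=Q", buf, offset)
        return value, offset + 8

    result = {"values": []}
    nr, off = take_u64(0)
    if read_format & PERF_FORMAT_TOTAL_TIME_ENABLED:
        result["time_enabled"], off = take_u64(off)
    if read_format & PERF_FORMAT_TOTAL_TIME_RUNNING:
        result["time_running"], off = take_u64(off)
    for _ in range(nr):
        entry = {}
        entry["value"], off = take_u64(off)
        if read_format & PERF_FORMAT_ID:
            entry["id"], off = take_u64(off)
        result["values"].append(entry)
    return result


if __name__ == "__main__":
    # Synthetic buffer standing in for read() on a {cycles,instructions} leader.
    fmt = (PERF_FORMAT_GROUP | PERF_FORMAT_ID |
           PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING)
    demo = struct.pack("=7Q", 2, 1000, 900, 111, 7, 222, 8)
    print(parse_group_read(demo, fmt))
```

The point of the layout is that every counter in the group comes back from
one read() call, taken at the same time, instead of one call per event.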

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (253 commits)
  perf symbols: Fix plt entry calculation for ARM and AARCH64
  perf probe: Fix kprobe blacklist checking condition
  perf/x86: Fix caps/ for !Intel
  perf/core, x86: Add PERF_SAMPLE_PHYS_ADDR
  perf/core, pt, bts: Get rid of itrace_started
  perf trace beauty: Beautify pkey_{alloc,free,mprotect} arguments
  tools headers: Sync cpu features kernel ABI headers with tooling headers
  perf tools: Pass full path of FEATURES_DUMP
  perf tools: Robustify detection of clang binary
  tools lib: Allow external definition of CC, AR and LD
  perf tools: Allow external definition of flex and bison binary names
  tools build tests: Don't hardcode gcc name
  perf report: Group stat values on global event id
  perf values: Zero value buffers
  perf values: Fix allocation check
  perf values: Fix thread index bug
  perf report: Add dump_read function
  perf record: Set read_format for inherit_stat
  perf c2c: Fix remote HITM detection for Skylake
  perf tools: Fix static build with newer toolchains
  ...
diff --git a/Documentation/Makefile b/Documentation/Makefile
index a423203..85f7856 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -22,6 +22,8 @@
 
 .DEFAULT:
 	$(warning The '$(SPHINXBUILD)' command was not found. Make sure you have Sphinx installed and in PATH, or set the SPHINXBUILD make variable to point to the full path of the '$(SPHINXBUILD)' executable.)
+	@echo
+	@./scripts/sphinx-pre-install
 	@echo "  SKIP    Sphinx $@ target."
 
 else # HAVE_SPHINX
@@ -95,16 +97,6 @@
 # The following targets are independent of HAVE_SPHINX, and the rules should
 # work or silently pass without Sphinx.
 
-# no-ops for the Sphinx toolchain
-sgmldocs:
-	@:
-psdocs:
-	@:
-mandocs:
-	@:
-installmandocs:
-	@:
-
 cleandocs:
 	$(Q)rm -rf $(BUILDDIR)
 	$(Q)$(MAKE) BUILDDIR=$(abspath $(BUILDDIR)) $(build)=Documentation/media clean
diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html
index 95b30fa..62e847b 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.html
+++ b/Documentation/RCU/Design/Requirements/Requirements.html
@@ -2080,6 +2080,8 @@
 <li>	<a href="#Scheduler and RCU">Scheduler and RCU</a>.
 <li>	<a href="#Tracing and RCU">Tracing and RCU</a>.
 <li>	<a href="#Energy Efficiency">Energy Efficiency</a>.
+<li>	<a href="#Scheduling-Clock Interrupts and RCU">
+	Scheduling-Clock Interrupts and RCU</a>.
 <li>	<a href="#Memory Efficiency">Memory Efficiency</a>.
 <li>	<a href="#Performance, Scalability, Response Time, and Reliability">
 	Performance, Scalability, Response Time, and Reliability</a>.
@@ -2532,6 +2534,134 @@
 Flaming me on the Linux-kernel mailing list was apparently not
 sufficient to fully vent their ire at RCU's energy-efficiency bugs!
 
+<h3><a name="Scheduling-Clock Interrupts and RCU">
+Scheduling-Clock Interrupts and RCU</a></h3>
+
+<p>
+The kernel transitions between in-kernel non-idle execution, userspace
+execution, and the idle loop.
+Depending on kernel configuration, RCU handles these states differently:
+
+<table border=3>
+<tr><th><tt>HZ</tt> Kconfig</th>
+	<th>In-Kernel</th>
+		<th>Usermode</th>
+			<th>Idle</th></tr>
+<tr><th align="left"><tt>HZ_PERIODIC</tt></th>
+	<td>Can rely on scheduling-clock interrupt.</td>
+		<td>Can rely on scheduling-clock interrupt and its
+		    detection of interrupt from usermode.</td>
+			<td>Can rely on RCU's dyntick-idle detection.</td></tr>
+<tr><th align="left"><tt>NO_HZ_IDLE</tt></th>
+	<td>Can rely on scheduling-clock interrupt.</td>
+		<td>Can rely on scheduling-clock interrupt and its
+		    detection of interrupt from usermode.</td>
+			<td>Can rely on RCU's dyntick-idle detection.</td></tr>
+<tr><th align="left"><tt>NO_HZ_FULL</tt></th>
+	<td>Can only sometimes rely on scheduling-clock interrupt.
+	    In other cases, it is necessary to bound kernel execution
+	    times and/or use IPIs.</td>
+		<td>Can rely on RCU's dyntick-idle detection.</td>
+			<td>Can rely on RCU's dyntick-idle detection.</td></tr>
+</table>
+
+<table>
+<tr><th>&nbsp;</th></tr>
+<tr><th align="left">Quick Quiz:</th></tr>
+<tr><td>
+	Why can't <tt>NO_HZ_FULL</tt> in-kernel execution rely on the
+	scheduling-clock interrupt, just like <tt>HZ_PERIODIC</tt>
+	and <tt>NO_HZ_IDLE</tt> do?
+</td></tr>
+<tr><th align="left">Answer:</th></tr>
+<tr><td bgcolor="#ffffff"><font color="ffffff">
+	Because, as a performance optimization, <tt>NO_HZ_FULL</tt>
+	does not necessarily re-enable the scheduling-clock interrupt
+	on entry to each and every system call.
+</font></td></tr>
+<tr><td>&nbsp;</td></tr>
+</table>
+
+<p>
+However, RCU must be reliably informed as to whether any given
+CPU is currently in the idle loop, and, for <tt>NO_HZ_FULL</tt>,
+also whether that CPU is executing in usermode, as discussed
+<a href="#Energy Efficiency">earlier</a>.
+It also requires that the scheduling-clock interrupt be enabled when
+RCU needs it to be:
+
+<ol>
+<li>	If a CPU is either idle or executing in usermode, and RCU believes
+	it is non-idle, the scheduling-clock tick had better be running.
+	Otherwise, you will get RCU CPU stall warnings.  Or at best,
+	very long (11-second) grace periods, with a pointless IPI waking
+	the CPU from time to time.
+<li>	If a CPU is in a portion of the kernel that executes RCU read-side
+	critical sections, and RCU believes this CPU to be idle, you will get
+	random memory corruption.  <b>DON'T DO THIS!!!</b>
+
+	<br>This is one reason to test with lockdep, which will complain
+	about this sort of thing.
+<li>	If a CPU is in a portion of the kernel that is absolutely
+	positively no-joking guaranteed to never execute any RCU read-side
+	critical sections, and RCU believes this CPU to be idle,
+	no problem.  This sort of thing is used by some architectures
+	for light-weight exception handlers, which can then avoid the
+	overhead of <tt>rcu_irq_enter()</tt> and <tt>rcu_irq_exit()</tt>
+	at exception entry and exit, respectively.
+	Some go further and avoid the entireties of <tt>irq_enter()</tt>
+	and <tt>irq_exit()</tt>.
+
+	<br>Just make very sure you are running some of your tests with
+	<tt>CONFIG_PROVE_RCU=y</tt>, just in case one of your code paths
+	was in fact joking about not doing RCU read-side critical sections.
+<li>	If a CPU is executing in the kernel with the scheduling-clock
+	interrupt disabled and RCU believes this CPU to be non-idle,
+	and if the CPU goes idle (from an RCU perspective) every few
+	jiffies, no problem.  It is usually OK for there to be the
+	occasional gap between idle periods of up to a second or so.
+
+	<br>If the gap grows too long, you get RCU CPU stall warnings.
+<li>	If a CPU is either idle or executing in usermode, and RCU believes
+	it to be idle, of course no problem.
+<li>	If a CPU is executing in the kernel, the kernel code
+	path is passing through quiescent states at a reasonable
+	frequency (preferably about once per few jiffies, but the
+	occasional excursion to a second or so is usually OK) and the
+	scheduling-clock interrupt is enabled, of course no problem.
+
+	<br>If the gap between a successive pair of quiescent states grows
+	too long, you get RCU CPU stall warnings.
+</ol>
+
+<table>
+<tr><th>&nbsp;</th></tr>
+<tr><th align="left">Quick Quiz:</th></tr>
+<tr><td>
+	But what if my driver has a hardware interrupt handler
+	that can run for many seconds?
+	I cannot invoke <tt>schedule()</tt> from a hardware
+	interrupt handler, after all!
+</td></tr>
+<tr><th align="left">Answer:</th></tr>
+<tr><td bgcolor="#ffffff"><font color="ffffff">
+	One approach is to do <tt>rcu_irq_exit();rcu_irq_enter();</tt>
+	every so often.
+	But given that long-running interrupt handlers can cause
+	other problems, not least for response time, shouldn't you
+	work to keep your interrupt handler's runtime within reasonable
+	bounds?
+</font></td></tr>
+<tr><td>&nbsp;</td></tr>
+</table>
+
+<p>
+But as long as RCU is properly informed of kernel state transitions between
+in-kernel execution, usermode execution, and idle, and as long as the
+scheduling-clock interrupt is enabled when RCU needs it to be, you
+can rest assured that the bugs you encounter will be in some other
+part of RCU or some other part of the kernel!
+
 <h3><a name="Memory Efficiency">Memory Efficiency</a></h3>
 
 <p>
diff --git a/Documentation/RCU/checklist.txt b/Documentation/RCU/checklist.txt
index 6beda55..4974771 100644
--- a/Documentation/RCU/checklist.txt
+++ b/Documentation/RCU/checklist.txt
@@ -23,6 +23,14 @@
 	Yet another exception is where the low real-time latency of RCU's
 	read-side primitives is critically important.
 
+	One final exception is where RCU readers are used to prevent
+	the ABA problem (https://en.wikipedia.org/wiki/ABA_problem)
+	for lockless updates.  This does result in the mildly
+	counter-intuitive situation where rcu_read_lock() and
+	rcu_read_unlock() are used to protect updates, however, this
+	approach provides the same potential simplifications that garbage
+	collectors do.
+
 1.	Does the update code have proper mutual exclusion?
 
 	RCU does allow -readers- to run (almost) naked, but -writers- must
@@ -40,7 +48,9 @@
 	explain how this single task does not become a major bottleneck on
 	big multiprocessor machines (for example, if the task is updating
 	information relating to itself that other tasks can read, there
-	by definition can be no bottleneck).
+	by definition can be no bottleneck).  Note that the definition
+	of "large" has changed significantly:  Eight CPUs was "large"
+	in the year 2000, but a hundred CPUs was unremarkable in 2017.
 
 2.	Do the RCU read-side critical sections make proper use of
 	rcu_read_lock() and friends?  These primitives are needed
@@ -55,6 +65,12 @@
 	Disabling of preemption can serve as rcu_read_lock_sched(), but
 	is less readable.
 
+	Letting RCU-protected pointers "leak" out of an RCU read-side
+	critical section is every bit as bad as letting them leak out
+	from under a lock.  Unless, of course, you have arranged some
+	other means of protection, such as a lock or a reference count
+	-before- letting them out of the RCU read-side critical section.
+
 3.	Does the update code tolerate concurrent accesses?
 
 	The whole point of RCU is to permit readers to run without
@@ -78,10 +94,10 @@
 
 		This works quite well, also.
 
-	c.	Make updates appear atomic to readers.  For example,
+	c.	Make updates appear atomic to readers.	For example,
 		pointer updates to properly aligned fields will
 		appear atomic, as will individual atomic primitives.
-		Sequences of perations performed under a lock will -not-
+		Sequences of operations performed under a lock will -not-
 		appear to be atomic to RCU readers, nor will sequences
 		of multiple atomic primitives.
 
@@ -168,8 +184,8 @@
 
 5.	If call_rcu(), or a related primitive such as call_rcu_bh(),
 	call_rcu_sched(), or call_srcu() is used, the callback function
-	must be written to be called from softirq context.  In particular,
-	it cannot block.
+	will be called from softirq context.  In particular, it cannot
+	block.
 
 6.	Since synchronize_rcu() can block, it cannot be called from
 	any sort of irq context.  The same rule applies for
@@ -178,11 +194,14 @@
 	synchronize_sched_expedite(), and synchronize_srcu_expedited().
 
 	The expedited forms of these primitives have the same semantics
-	as the non-expedited forms, but expediting is both expensive
-	and unfriendly to real-time workloads.	Use of the expedited
-	primitives should be restricted to rare configuration-change
-	operations that would not normally be undertaken while a real-time
-	workload is running.
+	as the non-expedited forms, but expediting is both expensive and
+	(with the exception of synchronize_srcu_expedited()) unfriendly
+	to real-time workloads.  Use of the expedited primitives should
+	be restricted to rare configuration-change operations that would
+	not normally be undertaken while a real-time workload is running.
+	However, real-time workloads can use the rcupdate.rcu_normal kernel
+	boot parameter to completely disable expedited grace periods,
+	though this might have performance implications.
 
 	In particular, if you find yourself invoking one of the expedited
 	primitives repeatedly in a loop, please do everyone a favor:
@@ -193,11 +212,6 @@
 	of the system, especially to real-time workloads running on
 	the rest of the system.
 
-	In addition, it is illegal to call the expedited forms from
-	a CPU-hotplug notifier, or while holding a lock that is acquired
-	by a CPU-hotplug notifier.  Failing to observe this restriction
-	will result in deadlock.
-
 7.	If the updater uses call_rcu() or synchronize_rcu(), then the
 	corresponding readers must use rcu_read_lock() and
 	rcu_read_unlock().  If the updater uses call_rcu_bh() or
@@ -321,7 +335,7 @@
 	Similarly, disabling preemption is not an acceptable substitute
 	for rcu_read_lock().  Code that attempts to use preemption
 	disabling where it should be using rcu_read_lock() will break
-	in real-time kernel builds.
+	in CONFIG_PREEMPT=y kernel builds.
 
 	If you want to wait for interrupt handlers, NMI handlers, and
 	code under the influence of preempt_disable(), you instead
@@ -356,23 +370,22 @@
 	not the case, a self-spawning RCU callback would prevent the
 	victim CPU from ever going offline.)
 
-14.	SRCU (srcu_read_lock(), srcu_read_unlock(), srcu_dereference(),
-	synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu())
-	may only be invoked from process context.  Unlike other forms of
-	RCU, it -is- permissible to block in an SRCU read-side critical
-	section (demarked by srcu_read_lock() and srcu_read_unlock()),
-	hence the "SRCU": "sleepable RCU".  Please note that if you
-	don't need to sleep in read-side critical sections, you should be
-	using RCU rather than SRCU, because RCU is almost always faster
-	and easier to use than is SRCU.
+14.	Unlike other forms of RCU, it -is- permissible to block in an
+	SRCU read-side critical section (demarked by srcu_read_lock()
+	and srcu_read_unlock()), hence the "SRCU": "sleepable RCU".
+	Please note that if you don't need to sleep in read-side critical
+	sections, you should be using RCU rather than SRCU, because RCU
+	is almost always faster and easier to use than is SRCU.
 
-	Also unlike other forms of RCU, explicit initialization
-	and cleanup is required via init_srcu_struct() and
-	cleanup_srcu_struct().	These are passed a "struct srcu_struct"
-	that defines the scope of a given SRCU domain.	Once initialized,
-	the srcu_struct is passed to srcu_read_lock(), srcu_read_unlock()
-	synchronize_srcu(), synchronize_srcu_expedited(), and call_srcu().
-	A given synchronize_srcu() waits only for SRCU read-side critical
+	Also unlike other forms of RCU, explicit initialization and
+	cleanup is required either at build time via DEFINE_SRCU()
+	or DEFINE_STATIC_SRCU() or at runtime via init_srcu_struct()
+	and cleanup_srcu_struct().  These last two are passed a
+	"struct srcu_struct" that defines the scope of a given
+	SRCU domain.  Once initialized, the srcu_struct is passed
+	to srcu_read_lock(), srcu_read_unlock() synchronize_srcu(),
+	synchronize_srcu_expedited(), and call_srcu().	A given
+	synchronize_srcu() waits only for SRCU read-side critical
 	sections governed by srcu_read_lock() and srcu_read_unlock()
 	calls that have been passed the same srcu_struct.  This property
 	is what makes sleeping read-side critical sections tolerable --
@@ -390,10 +403,16 @@
 	Therefore, SRCU should be used in preference to rw_semaphore
 	only in extremely read-intensive situations, or in situations
 	requiring SRCU's read-side deadlock immunity or low read-side
-	realtime latency.
+	realtime latency.  You should also consider percpu_rw_semaphore
+	when you need lightweight readers.
 
-	Note that, rcu_assign_pointer() relates to SRCU just as it does
-	to other forms of RCU.
+	SRCU's expedited primitive (synchronize_srcu_expedited())
+	never sends IPIs to other CPUs, so it is easier on
+	real-time workloads than is synchronize_rcu_expedited(),
+	synchronize_rcu_bh_expedited() or synchronize_sched_expedited().
+
+	Note that rcu_dereference() and rcu_assign_pointer() relate to
+	SRCU just as they do to other forms of RCU.
 
 15.	The whole point of call_rcu(), synchronize_rcu(), and friends
 	is to wait until all pre-existing readers have finished before
@@ -435,3 +454,33 @@
 
 	These debugging aids can help you find problems that are
 	otherwise extremely difficult to spot.
+
+18.	If you register a callback using call_rcu(), call_rcu_bh(),
+	call_rcu_sched(), or call_srcu(), and pass in a function defined
+	within a loadable module, then it is necessary to wait for
+	all pending callbacks to be invoked after the last invocation
+	and before unloading that module.  Note that it is absolutely
+	-not- sufficient to wait for a grace period!  The current (say)
+	synchronize_rcu() implementation waits only for all previous
+	callbacks registered on the CPU that synchronize_rcu() is running
+	on, but it is -not- guaranteed to wait for callbacks registered
+	on other CPUs.
+
+	You instead need to use one of the barrier functions:
+
+	o	call_rcu() -> rcu_barrier()
+	o	call_rcu_bh() -> rcu_barrier_bh()
+	o	call_rcu_sched() -> rcu_barrier_sched()
+	o	call_srcu() -> srcu_barrier()
+
+	However, these barrier functions are absolutely -not- guaranteed
+	to wait for a grace period.  In fact, if there are no call_rcu()
+	callbacks waiting anywhere in the system, rcu_barrier() is within
+	its rights to return immediately.
+
+	So if you need to wait for both an RCU grace period and for
+	all pre-existing call_rcu() callbacks, you will need to execute
+	both rcu_barrier() and synchronize_rcu(), if necessary, using
+	something like workqueues to execute them concurrently.
+
+	See rcubarrier.txt for more information.
diff --git a/Documentation/RCU/rcu.txt b/Documentation/RCU/rcu.txt
index 745f429..7d4ae110 100644
--- a/Documentation/RCU/rcu.txt
+++ b/Documentation/RCU/rcu.txt
@@ -76,15 +76,12 @@
 	Of these, one was allowed to lapse by the assignee, and the
 	others have been contributed to the Linux kernel under GPL.
 	There are now also LGPL implementations of user-level RCU
-	available (http://lttng.org/?q=node/18).
+	available (http://liburcu.org/).
 
 o	I hear that RCU needs work in order to support realtime kernels?
 
-	This work is largely completed.  Realtime-friendly RCU can be
-	enabled via the CONFIG_PREEMPT_RCU kernel configuration
-	parameter.  However, work is in progress for enabling priority
-	boosting of preempted RCU read-side critical sections.	This is
-	needed if you have CPU-bound realtime threads.
+	Realtime-friendly RCU can be enabled via the CONFIG_PREEMPT_RCU
+	kernel configuration parameter.
 
 o	Where can I find more information on RCU?
 
diff --git a/Documentation/RCU/rcu_dereference.txt b/Documentation/RCU/rcu_dereference.txt
index b2a613f..1acb26b 100644
--- a/Documentation/RCU/rcu_dereference.txt
+++ b/Documentation/RCU/rcu_dereference.txt
@@ -25,35 +25,35 @@
 	for an example where the compiler can in fact deduce the exact
 	value of the pointer, and thus cause misordering.
 
+o	You are only permitted to use rcu_dereference on pointer values.
+	The compiler simply knows too much about integral values to
+	trust it to carry dependencies through integer operations.
+	There are a very few exceptions, namely that you can temporarily
+	cast the pointer to uintptr_t in order to:
+
+	o	Set bits and clear bits down in the must-be-zero low-order
+		bits of that pointer.  This clearly means that the pointer
+		must have alignment constraints, for example, this does
+		-not- work in general for char* pointers.
+
+	o	XOR bits to translate pointers, as is done in some
+		classic buddy-allocator algorithms.
+
+	It is important to cast the value back to pointer before
+	doing much of anything else with it.
+
 o	Avoid cancellation when using the "+" and "-" infix arithmetic
 	operators.  For example, for a given variable "x", avoid
-	"(x-x)".  There are similar arithmetic pitfalls from other
-	arithmetic operators, such as "(x*0)", "(x/(x+1))" or "(x%1)".
-	The compiler is within its rights to substitute zero for all of
-	these expressions, so that subsequent accesses no longer depend
-	on the rcu_dereference(), again possibly resulting in bugs due
-	to misordering.
+	"(x-(uintptr_t)x)" for char* pointers.	The compiler is within its
+	rights to substitute zero for this sort of expression, so that
+	subsequent accesses no longer depend on the rcu_dereference(),
+	again possibly resulting in bugs due to misordering.
 
 	Of course, if "p" is a pointer from rcu_dereference(), and "a"
 	and "b" are integers that happen to be equal, the expression
 	"p+a-b" is safe because its value still necessarily depends on
 	the rcu_dereference(), thus maintaining proper ordering.
 
-o	Avoid all-zero operands to the bitwise "&" operator, and
-	similarly avoid all-ones operands to the bitwise "|" operator.
-	If the compiler is able to deduce the value of such operands,
-	it is within its rights to substitute the corresponding constant
-	for the bitwise operation.  Once again, this causes subsequent
-	accesses to no longer depend on the rcu_dereference(), causing
-	bugs due to misordering.
-
-	Please note that single-bit operands to bitwise "&" can also
-	be dangerous.  At this point, the compiler knows that the
-	resulting value can only take on one of two possible values.
-	Therefore, a very small amount of additional information will
-	allow the compiler to deduce the exact value, which again can
-	result in misordering.
-
 o	If you are using RCU to protect JITed functions, so that the
 	"()" function-invocation operator is applied to a value obtained
 	(directly or indirectly) from rcu_dereference(), you may need to
@@ -61,25 +61,6 @@
 	This issue arises on some systems when a newly JITed function is
 	using the same memory that was used by an earlier JITed function.
 
-o	Do not use the results from the boolean "&&" and "||" when
-	dereferencing.	For example, the following (rather improbable)
-	code is buggy:
-
-		int *p;
-		int *q;
-
-		...
-
-		p = rcu_dereference(gp)
-		q = &global_q;
-		q += p != &oom_p1 && p != &oom_p2;
-		r1 = *q;  /* BUGGY!!! */
-
-	The reason this is buggy is that "&&" and "||" are often compiled
-	using branches.  While weak-memory machines such as ARM or PowerPC
-	do order stores after such branches, they can speculate loads,
-	which can result in misordering bugs.
-
 o	Do not use the results from relational operators ("==", "!=",
 	">", ">=", "<", or "<=") when dereferencing.  For example,
 	the following (quite strange) code is buggy:
diff --git a/Documentation/RCU/rcubarrier.txt b/Documentation/RCU/rcubarrier.txt
index b10cfe7..5d77590 100644
--- a/Documentation/RCU/rcubarrier.txt
+++ b/Documentation/RCU/rcubarrier.txt
@@ -263,6 +263,11 @@
 	are delayed for a full grace period? Couldn't this result in
 	rcu_barrier() returning prematurely?
 
+The current rcu_barrier() implementation is more complex, due to the need
+to avoid disturbing idle CPUs (especially on battery-powered systems)
+and the need to minimally disturb non-idle CPUs in real-time systems.
+However, the code above illustrates the concepts.
+
 
 rcu_barrier() Summary
 
diff --git a/Documentation/RCU/torture.txt b/Documentation/RCU/torture.txt
index 278f6a9..55918b5 100644
--- a/Documentation/RCU/torture.txt
+++ b/Documentation/RCU/torture.txt
@@ -276,15 +276,17 @@
 	somehow gets incremented farther than it should.
 
 Different implementations of RCU can provide implementation-specific
-additional information.  For example, SRCU provides the following
+additional information.  For example, Tree SRCU provides the following
 additional line:
 
-	srcu-torture: per-CPU(idx=1): 0(0,1) 1(0,1) 2(0,0) 3(0,1)
+	srcud-torture: Tree SRCU per-CPU(idx=0): 0(35,-21) 1(-4,24) 2(1,1) 3(-26,20) 4(28,-47) 5(-9,4) 6(-10,14) 7(-14,11) T(1,6)
 
-This line shows the per-CPU counter state.  The numbers in parentheses are
-the values of the "old" and "current" counters for the corresponding CPU.
-The "idx" value maps the "old" and "current" values to the underlying
-array, and is useful for debugging.
+This line shows the per-CPU counter state, in this case for Tree SRCU
+using a dynamically allocated srcu_struct (hence "srcud-" rather than
+"srcu-").  The numbers in parentheses are the values of the "old" and
+"current" counters for the corresponding CPU.  The "idx" value maps the
+"old" and "current" values to the underlying array, and is useful for
+debugging.  The final "T" entry contains the totals of the counters.
 
 
 USAGE
@@ -304,3 +306,9 @@
 "FAILURE", or "RCU_HOTPLUG" indication to be printk()ed.  The first
 two are self-explanatory, while the last indicates that while there
 were no RCU failures, CPU-hotplug problems were detected.
+
+However, the tools/testing/selftests/rcutorture/bin/kvm.sh script
+provides better automation, including automatic failure analysis.
+It assumes a qemu/kvm-enabled platform, and runs guest OSes out of initrd.
+See tools/testing/selftests/rcutorture/doc/initrd.txt for instructions
+on setting up such an initrd.
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index 8ed6c9f..df62466 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -890,6 +890,8 @@
 	srcu_read_lock_held
 
 SRCU:	Initialization/cleanup
+	DEFINE_SRCU
+	DEFINE_STATIC_SRCU
 	init_srcu_struct
 	cleanup_srcu_struct
 
@@ -913,7 +915,8 @@
 b.	What about the -rt patchset?  If readers would need to block
 	in an non-rt kernel, you need SRCU.  If readers would block
 	in a -rt kernel, but not in a non-rt kernel, SRCU is not
-	necessary.
+	necessary.  (The -rt patchset turns spinlocks into sleeplocks,
+	hence this distinction.)
 
 c.	Do you need to treat NMI handlers, hardirq handlers,
 	and code segments with preemption disabled (whether
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index d9c171c..3a99cc9 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2633,9 +2633,10 @@
 			In kernels built with CONFIG_NO_HZ_FULL=y, set
 			the specified list of CPUs whose tick will be stopped
 			whenever possible. The boot CPU will be forced outside
-			the range to maintain the timekeeping.
-			The CPUs in this range must also be included in the
-			rcu_nocbs= set.
+			the range to maintain the timekeeping.  Any CPUs
+			in this list will have their RCU callbacks offloaded,
+			just as if they had also been called out in the
+			rcu_nocbs= boot parameter.
 
 	noiotrap	[SH] Disables trapped I/O port accesses.
 
diff --git a/Documentation/arm/firmware.txt b/Documentation/arm/firmware.txt
index da6713a..7f175db 100644
--- a/Documentation/arm/firmware.txt
+++ b/Documentation/arm/firmware.txt
@@ -60,7 +60,7 @@
 
 	/* some platform code, e.g. SMP initialization */
 
-	__raw_writel(virt_to_phys(exynos4_secondary_startup),
+	__raw_writel(__pa_symbol(exynos4_secondary_startup),
 		CPU1_BOOT_REG);
 
 	/* Call Exynos specific smc call */
diff --git a/Documentation/conf.py b/Documentation/conf.py
index 71b032b..f9054ab 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -29,7 +29,7 @@
 # -- General configuration ------------------------------------------------
 
 # If your documentation needs a minimal Sphinx version, state it here.
-needs_sphinx = '1.2'
+needs_sphinx = '1.3'
 
 # Add any Sphinx extension module names here, as strings. They can be
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
@@ -344,8 +344,8 @@
 if major == 1 and minor <= 4:
     latex_elements['preamble']  += '\\usepackage[margin=0.5in, top=1in, bottom=1in]{geometry}'
 elif major == 1 and (minor > 5 or (minor == 5 and patch >= 3)):
-    latex_elements['sphinxsetup'] = 'hmargin=0.5in, vmargin=0.5in'
-
+    latex_elements['sphinxsetup'] = 'hmargin=0.5in, vmargin=1in'
+    latex_elements['preamble']  += '\\fvset{fontsize=auto}\n'
 
 # Grouping the document tree into LaTeX files. List of tuples
 # (source start file, target name, title,
diff --git a/Documentation/core-api/genalloc.rst b/Documentation/core-api/genalloc.rst
new file mode 100644
index 0000000..6b38a39
--- /dev/null
+++ b/Documentation/core-api/genalloc.rst
@@ -0,0 +1,144 @@
+The genalloc/genpool subsystem
+==============================
+
+There are a number of memory-allocation subsystems in the kernel, each
+aimed at a specific need.  Sometimes, however, a kernel developer needs to
+implement a new allocator for a specific range of special-purpose memory;
+often that memory is located on a device somewhere.  The author of the
+driver for that device can certainly write a little allocator to get the
+job done, but that is the way to fill the kernel with dozens of poorly
+tested allocators.  Back in 2005, Jes Sorensen lifted one of those
+allocators from the sym53c8xx_2 driver and posted_ it as a generic module
+for the creation of ad hoc memory allocators.  This code was merged
+for the 2.6.13 release; it has been modified considerably since then.
+
+.. _posted: https://lwn.net/Articles/125842/
+
+Code using this allocator should include <linux/genalloc.h>.  The action
+begins with the creation of a pool using one of:
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_create
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: devm_gen_pool_create
+
+A call to :c:func:`gen_pool_create` will create a pool.  The granularity of
+allocations is set with min_alloc_order; it is a log-base-2 number like
+those used by the page allocator, but it refers to bytes rather than pages.
+So, if min_alloc_order is passed as 3, then all allocations will be a
+multiple of eight bytes.  Increasing min_alloc_order decreases the memory
+required to track the memory in the pool.  The nid parameter specifies
+which NUMA node should be used for the allocation of the housekeeping
+structures; it can be -1 if the caller doesn't care.
+
+The "managed" interface :c:func:`devm_gen_pool_create` ties the pool to a
+specific device.  Among other things, it will automatically clean up the
+pool when the given device is destroyed.
+
+A pool is shut down with:
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_destroy
+
+It's worth noting that, if there are still allocations outstanding from the
+given pool, this function will take the rather extreme step of invoking
+BUG(), crashing the entire system.  You have been warned.
+
+A freshly created pool has no memory to allocate.  It is fairly useless in
+that state, so one of the first orders of business is usually to add memory
+to the pool.  That can be done with one of:
+
+.. kernel-doc:: include/linux/genalloc.h
+   :functions: gen_pool_add
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_add_virt
+
+A call to :c:func:`gen_pool_add` will place the size bytes of memory
+starting at addr (in the kernel's virtual address space) into the given
+pool, once again using nid as the node ID for ancillary memory allocations.
+The :c:func:`gen_pool_add_virt` variant associates an explicit physical
+address with the memory; this is only necessary if the pool will be used
+for DMA allocations.
+
+The functions for allocating memory from the pool (and putting it back)
+are:
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_alloc
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_dma_alloc
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_free
+
+As one would expect, :c:func:`gen_pool_alloc` will allocate size bytes
+from the given pool.  The :c:func:`gen_pool_dma_alloc` variant allocates
+memory for use with DMA operations, returning the associated physical
+address in the space pointed to by dma.  This will only work if the memory
+was added with :c:func:`gen_pool_add_virt`.  Note that this function
+departs from the usual genpool pattern of using unsigned long values to
+represent kernel addresses; it returns a void * instead.
+
+That all seems relatively simple; indeed, some developers clearly found it
+to be too simple.  After all, the interface above provides no control over
+how the allocation functions choose which specific piece of memory to
+return.  If that sort of control is needed, the following functions will be
+of interest:
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_alloc_algo
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_set_algo
+
+Allocations with :c:func:`gen_pool_alloc_algo` specify an algorithm to be
+used to choose the memory to be allocated; the default algorithm can be set
+with :c:func:`gen_pool_set_algo`.  The data value is passed to the
+algorithm; most ignore it, but it is occasionally needed.  One can,
+naturally, write a special-purpose algorithm, but there is a fair set
+already available:
+
+- gen_pool_first_fit is a simple first-fit allocator; this is the default
+  algorithm if none other has been specified.
+
+- gen_pool_first_fit_align forces the allocation to have a specific
+  alignment (passed via data in a genpool_data_align structure).
+
+- gen_pool_first_fit_order_align aligns the allocation to the order of the
+  size.  A 60-byte allocation will thus be 64-byte aligned, for example.
+
+- gen_pool_best_fit, as one would expect, is a simple best-fit allocator.
+
+- gen_pool_fixed_alloc allocates at a specific offset (passed in a
+  genpool_data_fixed structure via the data parameter) within the pool.
+  If the indicated memory is not available the allocation fails.
+
+There is a handful of other functions, mostly for purposes like querying
+the space available in the pool or iterating through chunks of memory.
+Most users, however, should not need much beyond what has been described
+above.  With luck, wider awareness of this module will help to prevent the
+writing of special-purpose memory allocators in the future.
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_virt_to_phys
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_for_each_chunk
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: addr_in_gen_pool
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_avail
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_size
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: gen_pool_get
+
+.. kernel-doc:: lib/genalloc.c
+   :functions: of_gen_pool_get
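
The granularity rule described in genalloc.rst above can be sketched in
plain userspace C. This is only an illustration, not code from
lib/genalloc.c: the helper names round_to_granule() and tracking_bits()
are hypothetical, and the sketch assumes that the pool tracks memory with
one bitmap bit per (1 << min_alloc_order) bytes, which is why a larger
order needs less tracking memory.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not a kernel API): round a request up to the
 * pool granularity of (1 << min_alloc_order) bytes.
 */
static size_t round_to_granule(size_t size, unsigned int min_alloc_order)
{
	size_t granule = (size_t)1 << min_alloc_order;

	return (size + granule - 1) & ~(granule - 1);
}

/*
 * Hypothetical helper (not a kernel API): number of bitmap bits needed
 * to track a chunk, assuming one bit per granule.
 */
static size_t tracking_bits(size_t chunk_size, unsigned int min_alloc_order)
{
	return chunk_size >> min_alloc_order;
}
```

Under these assumptions, a min_alloc_order of 3 turns a 13-byte request
into a 16-byte allocation, and raising the order from 3 to 5 cuts the
bitmap for a 4096-byte chunk from 512 bits to 128 bits.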
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 0606be3..d5bbe03 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -20,6 +20,7 @@
    genericirq
    flexible-arrays
    librs
+   genalloc
 
 Interfaces for kernel debugging
 ===============================
diff --git a/Documentation/core-api/kernel-api.rst b/Documentation/core-api/kernel-api.rst
index 17b0091..8282099 100644
--- a/Documentation/core-api/kernel-api.rst
+++ b/Documentation/core-api/kernel-api.rst
@@ -344,3 +344,52 @@
 
 .. kernel-doc:: include/linux/clk.h
    :internal:
+
+Synchronization Primitives
+==========================
+
+Read-Copy Update (RCU)
+----------------------
+
+.. kernel-doc:: include/linux/rcupdate.h
+   :external:
+
+.. kernel-doc:: include/linux/rcupdate_wait.h
+   :external:
+
+.. kernel-doc:: include/linux/rcutree.h
+   :external:
+
+.. kernel-doc:: kernel/rcu/tree.c
+   :external:
+
+.. kernel-doc:: kernel/rcu/tree_plugin.h
+   :external:
+
+.. kernel-doc:: kernel/rcu/tree_exp.h
+   :external:
+
+.. kernel-doc:: kernel/rcu/update.c
+   :external:
+
+.. kernel-doc:: include/linux/srcu.h
+   :external:
+
+.. kernel-doc:: kernel/rcu/srcutree.c
+   :external:
+
+.. kernel-doc:: include/linux/rculist_bl.h
+   :external:
+
+.. kernel-doc:: include/linux/rculist.h
+   :external:
+
+.. kernel-doc:: include/linux/rculist_nulls.h
+   :external:
+
+.. kernel-doc:: include/linux/rcu_sync.h
+   :external:
+
+.. kernel-doc:: kernel/rcu/sync.c
+   :external:
+
diff --git a/Documentation/dev-tools/gdb-kernel-debugging.rst b/Documentation/dev-tools/gdb-kernel-debugging.rst
index 5e93c9b..19df792 100644
--- a/Documentation/dev-tools/gdb-kernel-debugging.rst
+++ b/Documentation/dev-tools/gdb-kernel-debugging.rst
@@ -31,11 +31,13 @@
   CONFIG_DEBUG_INFO_REDUCED off. If your architecture supports
   CONFIG_FRAME_POINTER, keep it enabled.
 
-- Install that kernel on the guest.
+- Install that kernel on the guest and, if necessary, turn off KASLR by
+  adding "nokaslr" to the kernel command line.
   Alternatively, QEMU allows to boot the kernel directly using -kernel,
   -append, -initrd command line switches. This is generally only useful if
   you do not depend on modules. See QEMU documentation for more details on
-  this mode.
+  this mode. In this case, you should build the kernel with
+  CONFIG_RANDOMIZE_BASE disabled if the architecture supports KASLR.
 
 - Enable the gdb stub of QEMU/KVM, either
 
diff --git a/Documentation/dev-tools/kgdb.rst b/Documentation/dev-tools/kgdb.rst
index 7527320..d38be58 100644
--- a/Documentation/dev-tools/kgdb.rst
+++ b/Documentation/dev-tools/kgdb.rst
@@ -348,6 +348,15 @@
     - ``echo 1 > /sys/module/debug_core/parameters/kgdbreboot``
     - Enter the debugger on reboot notify.
 
+Kernel parameter: ``nokaslr``
+-----------------------------
+
+If the architecture that you are using enables KASLR by default,
+you should consider turning it off.  KASLR randomizes the
+virtual address where the kernel image is mapped, which confuses
+gdb because it resolves kernel symbol addresses from the symbol
+table of vmlinux.
+
 Using kdb
 =========
 
@@ -358,7 +367,7 @@
 
 1. Configure kgdboc at boot using kernel parameters::
 
-	console=ttyS0,115200 kgdboc=ttyS0,115200
+	console=ttyS0,115200 kgdboc=ttyS0,115200 nokaslr
 
    OR
 
diff --git a/Documentation/devicetree/bindings/display/bridge/dw_mipi_dsi.txt b/Documentation/devicetree/bindings/display/bridge/dw_mipi_dsi.txt
new file mode 100644
index 0000000..b13adf3
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/bridge/dw_mipi_dsi.txt
@@ -0,0 +1,32 @@
+Synopsys DesignWare MIPI DSI host controller
+============================================
+
+This document defines device tree properties for the Synopsys DesignWare MIPI
+DSI host controller. It doesn't constitute a device tree binding specification
+by itself but is meant to be referenced by platform-specific device tree
+bindings.
+
+When referenced from platform device tree bindings the properties defined in
+this document are defined as follows. The platform device tree bindings are
+responsible for defining whether each optional property is used or not.
+
+- reg: Memory mapped base address and length of the DesignWare MIPI DSI
+  host controller registers. (mandatory)
+
+- clocks: References to all the clocks specified in the clock-names property
+  as specified in [1]. (mandatory)
+
+- clock-names:
+  - "pclk" is the peripheral clock for either AHB or APB. (mandatory)
+  - "px_clk" is the pixel clock for the DPI/RGB input. (optional)
+
+- resets: References to all the resets specified in the reset-names property
+  as specified in [2]. (optional)
+
+- reset-names: string reset name, must be "apb" if used. (optional)
+
+- panel or bridge node: see [3]. (mandatory)
+
+[1] Documentation/devicetree/bindings/clock/clock-bindings.txt
+[2] Documentation/devicetree/bindings/reset/reset.txt
+[3] Documentation/devicetree/bindings/display/mipi-dsi-bus.txt
diff --git a/Documentation/devicetree/bindings/display/exynos/exynos5433-decon.txt b/Documentation/devicetree/bindings/display/exynos/exynos5433-decon.txt
index 549c538..fc25882 100644
--- a/Documentation/devicetree/bindings/display/exynos/exynos5433-decon.txt
+++ b/Documentation/devicetree/bindings/display/exynos/exynos5433-decon.txt
@@ -25,12 +25,6 @@
 	 size-cells must 1 and 0, respectively.
 - port: contains an endpoint node which is connected to the endpoint in the mic
 	node. The reg value muset be 0.
-- i80-if-timings: specify whether the panel which is connected to decon uses
-		  i80 lcd interface or mipi video interface. This node contains
-		  no timing information as that of fimd does. Because there is
-		  no register in decon to specify i80 interface timing value,
-		  it is not needed, but make it remain to use same kind of node
-		  in fimd and exynos7 decon.
 
 Example:
 SoC specific DT entry:
@@ -59,9 +53,3 @@
 		};
 	};
 };
-
-Board specific DT entry:
-&decon {
-	i80-if-timings {
-	};
-};
diff --git a/Documentation/devicetree/bindings/display/repaper.txt b/Documentation/devicetree/bindings/display/repaper.txt
new file mode 100644
index 0000000..f5f9f9c
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/repaper.txt
@@ -0,0 +1,52 @@
+Pervasive Displays RePaper branded e-ink displays
+
+Required properties:
+- compatible:		"pervasive,e1144cs021" for 1.44" display
+			"pervasive,e1190cs021" for 1.9" display
+			"pervasive,e2200cs021" for 2.0" display
+			"pervasive,e2271cs021" for 2.7" display
+
+- panel-on-gpios:	Timing controller power control
+- discharge-gpios:	Discharge control
+- reset-gpios:		RESET pin
+- busy-gpios:		BUSY pin
+
+Required property for e2271cs021:
+- border-gpios:		Border control
+
+The node for this driver must be a child node of a SPI controller, hence
+all mandatory properties described in ../spi/spi-bus.txt must be specified.
+
+Optional property:
+- pervasive,thermal-zone:	name of thermometer's thermal zone
+
+Example:
+
+	display_temp: lm75@48 {
+		compatible = "lm75b";
+		reg = <0x48>;
+		#thermal-sensor-cells = <0>;
+	};
+
+	thermal-zones {
+		display {
+			polling-delay-passive = <0>;
+			polling-delay = <0>;
+			thermal-sensors = <&display_temp>;
+		};
+	};
+
+	papirus27@0 {
+		compatible = "pervasive,e2271cs021";
+		reg = <0>;
+
+		spi-max-frequency = <8000000>;
+
+		panel-on-gpios = <&gpio 23 0>;
+		border-gpios = <&gpio 14 0>;
+		discharge-gpios = <&gpio 15 0>;
+		reset-gpios = <&gpio 24 0>;
+		busy-gpios = <&gpio 25 0>;
+
+		pervasive,thermal-zone = "display";
+	};
diff --git a/Documentation/devicetree/bindings/display/rockchip/dw_hdmi-rockchip.txt b/Documentation/devicetree/bindings/display/rockchip/dw_hdmi-rockchip.txt
index 046076c..fad8b76 100644
--- a/Documentation/devicetree/bindings/display/rockchip/dw_hdmi-rockchip.txt
+++ b/Documentation/devicetree/bindings/display/rockchip/dw_hdmi-rockchip.txt
@@ -11,7 +11,9 @@
 
 Required properties:
 
-- compatible: Shall contain "rockchip,rk3288-dw-hdmi".
+- compatible: should be one of the following:
+		"rockchip,rk3288-dw-hdmi"
+		"rockchip,rk3399-dw-hdmi"
 - reg: See dw_hdmi.txt.
 - reg-io-width: See dw_hdmi.txt. Shall be 4.
 - interrupts: HDMI interrupt number
@@ -30,7 +32,8 @@
   I2C master controller.
 - clock-names: See dw_hdmi.txt. The "cec" clock is optional.
 - clock-names: May contain "cec" as defined in dw_hdmi.txt.
-
+- clock-names: May contain "grf", power for grf io.
+- clock-names: May contain "vpll", external clock for some hdmi phy.
 
 Example:
 
diff --git a/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt b/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt
index 9eb3f0a..5d835d9 100644
--- a/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt
+++ b/Documentation/devicetree/bindings/display/rockchip/rockchip-vop.txt
@@ -8,8 +8,12 @@
 - compatible: value should be one of the following
 		"rockchip,rk3036-vop";
 		"rockchip,rk3288-vop";
+		"rockchip,rk3368-vop";
+		"rockchip,rk3366-vop";
 		"rockchip,rk3399-vop-big";
 		"rockchip,rk3399-vop-lit";
+		"rockchip,rk3228-vop";
+		"rockchip,rk3328-vop";
 
 - interrupts: should contain a list of all VOP IP block interrupts in the
 		 order: VSYNC, LCD_SYSTEM. The interrupt specifier
diff --git a/Documentation/devicetree/bindings/display/sitronix,st7586.txt b/Documentation/devicetree/bindings/display/sitronix,st7586.txt
new file mode 100644
index 0000000..1d0dad1
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/sitronix,st7586.txt
@@ -0,0 +1,22 @@
+Sitronix ST7586 display panel
+
+Required properties:
+- compatible:	"lego,ev3-lcd".
+- a0-gpios:	The A0 signal (since this binding is for serial mode, this is
+                the pin labeled D1 on the controller, not the pin labeled A0)
+- reset-gpios:	Reset pin
+
+The node for this driver must be a child node of a SPI controller, hence
+all mandatory properties described in ../spi/spi-bus.txt must be specified.
+
+Optional properties:
+- rotation:	panel rotation in degrees counterclockwise (0,90,180,270)
+
+Example:
+	display@0 {
+		compatible = "lego,ev3-lcd";
+		reg = <0>;
+		spi-max-frequency = <10000000>;
+		a0-gpios = <&gpio 43 GPIO_ACTIVE_HIGH>;
+		reset-gpios = <&gpio 80 GPIO_ACTIVE_HIGH>;
+	};
diff --git a/Documentation/devicetree/bindings/display/st,stm32-ltdc.txt b/Documentation/devicetree/bindings/display/st,stm32-ltdc.txt
index 8e14769..74b5ac7 100644
--- a/Documentation/devicetree/bindings/display/st,stm32-ltdc.txt
+++ b/Documentation/devicetree/bindings/display/st,stm32-ltdc.txt
@@ -1,7 +1,6 @@
 * STMicroelectronics STM32 lcd-tft display controller
 
 - ltdc: lcd-tft display controller host
-  must be a sub-node of st-display-subsystem
   Required properties:
   - compatible: "st,stm32-ltdc"
   - reg: Physical base address of the IP registers and length of memory mapped region.
@@ -13,8 +12,40 @@
   Required nodes:
     - Video port for RGB output.
 
-Example:
+* STMicroelectronics STM32 DSI controller specific extensions to Synopsys
+  DesignWare MIPI DSI host controller
 
+The STMicroelectronics STM32 DSI controller uses the Synopsys DesignWare MIPI
+DSI host controller. For all mandatory properties & nodes, please refer
+to the related documentation in [5].
+
+Mandatory properties specific to STM32 DSI:
+- #address-cells: Should be <1>.
+- #size-cells: Should be <0>.
+- compatible: "st,stm32-dsi".
+- clock-names:
+  - phy pll reference clock string name, must be "ref".
+- resets: see [5].
+- reset-names: see [5].
+
+Mandatory nodes specific to STM32 DSI:
+- ports: A node containing DSI input & output port nodes with endpoint
+  definitions as documented in [3] & [4].
+  - port@0: DSI input port node, connected to the ltdc rgb output port.
+  - port@1: DSI output port node, connected to a panel or a bridge input port.
+- panel or bridge node: A node containing the panel or bridge description as
+  documented in [6].
+  - port: panel or bridge port node, connected to the DSI output port (port@1).
+
+Note: You can find more documentation in the following references
+[1] Documentation/devicetree/bindings/clock/clock-bindings.txt
+[2] Documentation/devicetree/bindings/reset/reset.txt
+[3] Documentation/devicetree/bindings/media/video-interfaces.txt
+[4] Documentation/devicetree/bindings/graph.txt
+[5] Documentation/devicetree/bindings/display/bridge/dw_mipi_dsi.txt
+[6] Documentation/devicetree/bindings/display/mipi-dsi-bus.txt
+
+Example 1: RGB panel
 / {
 	...
 	soc {
@@ -34,3 +65,73 @@
 		};
 	};
 };
+
+Example 2: DSI panel
+
+/ {
+	...
+	soc {
+	...
+		ltdc: display-controller@40016800 {
+			compatible = "st,stm32-ltdc";
+			reg = <0x40016800 0x200>;
+			interrupts = <88>, <89>;
+			resets = <&rcc STM32F4_APB2_RESET(LTDC)>;
+			clocks = <&rcc 1 CLK_LCD>;
+			clock-names = "lcd";
+
+			port {
+				ltdc_out_dsi: endpoint {
+					remote-endpoint = <&dsi_in>;
+				};
+			};
+		};
+
+
+		dsi: dsi@40016c00 {
+			#address-cells = <1>;
+			#size-cells = <0>;
+			compatible = "st,stm32-dsi";
+			reg = <0x40016c00 0x800>;
+			clocks = <&rcc 1 CLK_F469_DSI>, <&clk_hse>;
+			clock-names = "ref", "pclk";
+			resets = <&rcc STM32F4_APB2_RESET(DSI)>;
+			reset-names = "apb";
+
+			ports {
+				#address-cells = <1>;
+				#size-cells = <0>;
+
+				port@0 {
+					reg = <0>;
+					dsi_in: endpoint {
+						remote-endpoint = <&ltdc_out_dsi>;
+					};
+				};
+
+				port@1 {
+					reg = <1>;
+					dsi_out: endpoint {
+						remote-endpoint = <&dsi_in_panel>;
+					};
+				};
+
+			};
+
+			panel-dsi@0 {
+				reg = <0>; /* dsi virtual channel (0..3) */
+				compatible = ...;
+				enable-gpios = ...;
+
+				port {
+					dsi_in_panel: endpoint {
+						remote-endpoint = <&dsi_out>;
+					};
+				};
+
+			};
+
+		};
+
+	};
+};
diff --git a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
index b83e601..2ee6ff0 100644
--- a/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
+++ b/Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
@@ -4,15 +4,33 @@
 The Allwinner A10 Display pipeline is composed of several components
 that are going to be documented below:
 
-For the input port of all components up to the TCON in the display
-pipeline, if there are multiple components, the local endpoint IDs
-must correspond to the index of the upstream block. For example, if
-the remote endpoint is Frontend 1, then the local endpoint ID must
-be 1.
+For all connections between components up to the TCONs in the display
+pipeline, when there are multiple components of the same type at the
+same depth, the local endpoint ID must be the same as the remote
+component's index. For example, if the remote endpoint is Frontend 1,
+then the local endpoint ID must be 1.
 
-Conversely, for the output ports of the same group, the remote endpoint
-ID must be the index of the local hardware block. If the local backend
-is backend 1, then the remote endpoint ID must be 1.
+    Frontend 0  [0] ------- [0]  Backend 0  [0] ------- [0]  TCON 0
+		[1] --   -- [1]             [1] --   -- [1]
+		      \ /                         \ /
+		       X                           X
+		      / \                         / \
+		[0] --   -- [0]             [0] --   -- [0]
+    Frontend 1  [1] ------- [1]  Backend 1  [1] ------- [1]  TCON 1
+
+For a two-pipeline system such as the one depicted above, the lines
+represent the connections between the components, while the numbers
+within the square brackets correspond to the IDs of the local endpoints.
+
+The same rule also applies to DE 2.0 mixer-TCON connections:
+
+    Mixer 0  [0] ----------- [0]  TCON 0
+	     [1] ----   ---- [1]
+		     \ /
+		      X
+		     / \
+	     [0] ----   ---- [0]
+    Mixer 1  [1] ----------- [1]  TCON 1
 
 HDMI Encoder
 ------------
diff --git a/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt b/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
index cf44605..367c8203 100644
--- a/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
+++ b/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
@@ -11,6 +11,8 @@
 
 - #size-cells : should be 1.
 
+- #cooling-cells: should be 2.
+
 - reg : address and length of the register set for the device.
 
 - pinctrl-names : a pinctrl state named "default" must be defined.
@@ -28,12 +30,17 @@
 Under fan subnode there can upto 8 child nodes, with each child node
 representing a fan. If there are 8 fans each fan can have one PWM port and
 one/two Fan tach inputs.
+Each PWM port can be configured with cooling-levels to create a cooling device.
+The cooling device can then be bound to a thermal zone for thermal control.
 
 Required properties for each child node:
 - reg : should specify PWM source port.
 	integer value in the range 0 to 7 with 0 indicating PWM port A and
 	7 indicating PWM port H.
 
+- cooling-levels: PWM duty cycle values in a range from 0 to 255
+                  which correspond to thermal cooling states.
+
 - aspeed,fan-tach-ch : should specify the Fan tach input channel.
                 integer value in the range 0 through 15, with 0 indicating
 		Fan tach channel 0 and 15 indicating Fan tach channel 15.
@@ -50,6 +57,7 @@
 pwm_tacho: pwmtachocontroller@1e786000 {
 	#address-cells = <1>;
 	#size-cells = <1>;
+	#cooling-cells = <2>;
 	reg = <0x1E786000 0x1000>;
 	compatible = "aspeed,ast2500-pwm-tacho";
 	clocks = <&pwm_tacho_fixed_clk>;
@@ -58,6 +66,7 @@
 
 	fan@0 {
 		reg = <0x00>;
+		cooling-levels = /bits/ 8 <125 151 177 203 229 255>;
 		aspeed,fan-tach-ch = /bits/ 8 <0x00>;
 	};
 
diff --git a/Documentation/devicetree/bindings/hwmon/ibm,cffps1.txt b/Documentation/devicetree/bindings/hwmon/ibm,cffps1.txt
new file mode 100644
index 0000000..f68a0a6
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/ibm,cffps1.txt
@@ -0,0 +1,21 @@
+Device-tree bindings for IBM Common Form Factor Power Supply Version 1
+----------------------------------------------------------------------
+
+Required properties:
+ - compatible = "ibm,cffps1";
+ - reg = < I2C bus address >;		: Address of the power supply on the
+					  I2C bus.
+
+Example:
+
+    i2c-bus@100 {
+        #address-cells = <1>;
+        #size-cells = <0>;
+        #interrupt-cells = <1>;
+        < more properties >
+
+        power-supply@68 {
+            compatible = "ibm,cffps1";
+            reg = <0x68>;
+        };
+    };
diff --git a/Documentation/devicetree/bindings/hwmon/ltq-cputemp.txt b/Documentation/devicetree/bindings/hwmon/ltq-cputemp.txt
new file mode 100644
index 0000000..33fd00a
--- /dev/null
+++ b/Documentation/devicetree/bindings/hwmon/ltq-cputemp.txt
@@ -0,0 +1,10 @@
+Lantiq CPU temperature sensor
+
+Required node properties:
+- compatible value :
+	"lantiq,cputemp"
+
+Example:
+	cputemp@0 {
+		compatible = "lantiq,cputemp";
+	};
diff --git a/Documentation/devicetree/bindings/net/dwmac-sun8i.txt b/Documentation/devicetree/bindings/net/dwmac-sun8i.txt
deleted file mode 100644
index 725f3b1..0000000
--- a/Documentation/devicetree/bindings/net/dwmac-sun8i.txt
+++ /dev/null
@@ -1,84 +0,0 @@
-* Allwinner sun8i GMAC ethernet controller
-
-This device is a platform glue layer for stmmac.
-Please see stmmac.txt for the other unchanged properties.
-
-Required properties:
-- compatible: should be one of the following string:
-		"allwinner,sun8i-a83t-emac"
-		"allwinner,sun8i-h3-emac"
-		"allwinner,sun8i-v3s-emac"
-		"allwinner,sun50i-a64-emac"
-- reg: address and length of the register for the device.
-- interrupts: interrupt for the device
-- interrupt-names: should be "macirq"
-- clocks: A phandle to the reference clock for this device
-- clock-names: should be "stmmaceth"
-- resets: A phandle to the reset control for this device
-- reset-names: should be "stmmaceth"
-- phy-mode: See ethernet.txt
-- phy-handle: See ethernet.txt
-- #address-cells: shall be 1
-- #size-cells: shall be 0
-- syscon: A phandle to the syscon of the SoC with one of the following
- compatible string:
-  - allwinner,sun8i-h3-system-controller
-  - allwinner,sun8i-v3s-system-controller
-  - allwinner,sun50i-a64-system-controller
-  - allwinner,sun8i-a83t-system-controller
-
-Optional properties:
-- allwinner,tx-delay-ps: TX clock delay chain value in ps. Range value is 0-700. Default is 0)
-- allwinner,rx-delay-ps: RX clock delay chain value in ps. Range value is 0-3100. Default is 0)
-Both delay properties need to be a multiple of 100. They control the delay for
-external PHY.
-
-Optional properties for the following compatibles:
-  - "allwinner,sun8i-h3-emac",
-  - "allwinner,sun8i-v3s-emac":
-- allwinner,leds-active-low: EPHY LEDs are active low
-
-Required child node of emac:
-- mdio bus node: should be named mdio
-
-Required properties of the mdio node:
-- #address-cells: shall be 1
-- #size-cells: shall be 0
-
-The device node referenced by "phy" or "phy-handle" should be a child node
-of the mdio node. See phy.txt for the generic PHY bindings.
-
-Required properties of the phy node with the following compatibles:
-  - "allwinner,sun8i-h3-emac",
-  - "allwinner,sun8i-v3s-emac":
-- clocks: a phandle to the reference clock for the EPHY
-- resets: a phandle to the reset control for the EPHY
-
-Example:
-
-emac: ethernet@1c0b000 {
-	compatible = "allwinner,sun8i-h3-emac";
-	syscon = <&syscon>;
-	reg = <0x01c0b000 0x104>;
-	interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_HIGH>;
-	interrupt-names = "macirq";
-	resets = <&ccu RST_BUS_EMAC>;
-	reset-names = "stmmaceth";
-	clocks = <&ccu CLK_BUS_EMAC>;
-	clock-names = "stmmaceth";
-	#address-cells = <1>;
-	#size-cells = <0>;
-
-	phy-handle = <&int_mii_phy>;
-	phy-mode = "mii";
-	allwinner,leds-active-low;
-	mdio: mdio {
-		#address-cells = <1>;
-		#size-cells = <0>;
-		int_mii_phy: ethernet-phy@1 {
-			reg = <1>;
-			clocks = <&ccu CLK_BUS_EPHY>;
-			resets = <&ccu RST_BUS_EPHY>;
-		};
-	};
-};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index daf465be..36b27b1 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -249,6 +249,7 @@
 panasonic	Panasonic Corporation
 parade	Parade Technologies Inc.
 pericom	Pericom Technology Inc.
+pervasive	Pervasive Displays, Inc.
 phytec	PHYTEC Messtechnik GmbH
 picochip	Picochip Ltd
 pine64	Pine64
diff --git a/Documentation/doc-guide/sphinx.rst b/Documentation/doc-guide/sphinx.rst
index 84e8e8a..a241763 100644
--- a/Documentation/doc-guide/sphinx.rst
+++ b/Documentation/doc-guide/sphinx.rst
@@ -19,6 +19,110 @@
 ``Documentation``. Some of these will likely be converted to reStructuredText
 over time, but the bulk of them will remain in plain text.
 
+.. _sphinx_install:
+
+Sphinx Install
+==============
+
+The ReST markups currently used by the Documentation/ files are meant to be
+built with ``Sphinx`` version 1.3 or later. If you want to build
+PDF output, it is recommended to use version 1.4.6 or later.
+
+There's a script that checks for the Sphinx requirements. Please see
+:ref:`sphinx-pre-install` for further details.
+
+Most distributions are shipped with Sphinx, but its toolchain is fragile,
+and it is not uncommon that upgrading it or some other Python packages
+on your machine would cause the documentation build to break.
+
+A way to avoid that is to use a different version than the one shipped
+with your distribution. In order to do that, it is recommended to install
+Sphinx inside a virtual environment, using ``virtualenv-3``
+or ``virtualenv``, depending on how your distribution packaged Python 3.
+
+.. note::
+
+   #) Sphinx versions below 1.5 don't work properly with Python's
+      docutils version 0.13.1 or later. So, if you want to use
+      those versions, you should run ``pip install 'docutils==0.12'``.
+
+   #) It is recommended to use the RTD theme for html output. Depending
+      on the Sphinx version, it should be installed separately,
+      with ``pip install sphinx_rtd_theme``.
+
+   #) Some ReST pages contain math expressions. Due to the way Sphinx works,
+      those expressions are written using LaTeX notation. It needs texlive
+      installed with amsfonts and amsmath in order to evaluate them.
+
+In summary, if you want to install Sphinx version 1.4.9, you should do::
+
+       $ virtualenv sphinx_1.4
+       $ . sphinx_1.4/bin/activate
+       (sphinx_1.4) $ pip install -r Documentation/sphinx/requirements.txt
+
+After running ``. sphinx_1.4/bin/activate``, the prompt will change,
+in order to indicate that you're using the new environment. If you
+open a new shell, you need to rerun this command to re-enter
+the virtual environment before building the documentation.
+
+Image output
+------------
+
+The kernel documentation build system contains an extension that
+handles images in both GraphViz and SVG formats (see
+:ref:`sphinx_kfigure`).
+
+For it to work, you need to install both GraphViz and ImageMagick
+packages. If those packages are not installed, the build system will
+still build the documentation, but won't include any images in the
+output.
+
+PDF and LaTeX builds
+--------------------
+
+Such builds are currently supported only with Sphinx versions 1.4 and later.
+
+For PDF and LaTeX output, you'll also need ``XeLaTeX`` version 3.14159265.
+
+Depending on the distribution, you may also need to install a series of
+``texlive`` packages that provide the minimal set of functionalities
+required for ``XeLaTeX`` to work.
+
+.. _sphinx-pre-install:
+
+Checking for Sphinx dependencies
+--------------------------------
+
+There's a script that automatically checks for Sphinx dependencies. If it can
+recognize your distribution, it will also give a hint about the install
+command line options for your distro::
+
+	$ ./scripts/sphinx-pre-install
+	Checking if the needed tools for Fedora release 26 (Twenty Six) are available
+	Warning: better to also install "texlive-luatex85".
+	You should run:
+
+		sudo dnf install -y texlive-luatex85
+		/usr/bin/virtualenv sphinx_1.4
+		. sphinx_1.4/bin/activate
+		pip install -r Documentation/sphinx/requirements.txt
+
+	Can't build as 1 mandatory dependency is missing at ./scripts/sphinx-pre-install line 468.
+
+By default, it checks all the requirements for both html and PDF, including
+the requirements for images, math expressions and LaTeX build, and assumes
+that a virtual Python environment will be used. The ones needed for html
+builds are assumed to be mandatory; the others to be optional.
+
+It supports two optional parameters:
+
+``--no-pdf``
+	Disable checks for PDF;
+
+``--no-virtualenv``
+	Use OS packaging for Sphinx instead of Python virtual environment.
+
+
 Sphinx Build
 ============
 
@@ -118,7 +222,7 @@
 the C domain
 ------------
 
-The `Sphinx C Domain`_ (name c) is suited for documentation of C API. E.g. a
+The **Sphinx C Domain** (name c) is suited for documentation of C API. E.g. a
 function prototype:
 
 .. code-block:: rst
@@ -229,6 +333,7 @@
 
         - column 3
 
+.. _sphinx_kfigure:
 
 Figures & Images
 ================
diff --git a/Documentation/driver-api/basics.rst b/Documentation/driver-api/basics.rst
index ab82250..73fa7d4 100644
--- a/Documentation/driver-api/basics.rst
+++ b/Documentation/driver-api/basics.rst
@@ -4,7 +4,7 @@
 Driver Entry and Exit points
 ----------------------------
 
-.. kernel-doc:: include/linux/init.h
+.. kernel-doc:: include/linux/module.h
    :internal:
 
 Driver device table
@@ -103,9 +103,6 @@
 .. kernel-doc:: kernel/panic.c
    :export:
 
-.. kernel-doc:: kernel/sys.c
-   :export:
-
 .. kernel-doc:: kernel/rcu/tree.c
    :export:
 
diff --git a/Documentation/driver-api/dma-buf.rst b/Documentation/driver-api/dma-buf.rst
index 31671b4..dc384f2 100644
--- a/Documentation/driver-api/dma-buf.rst
+++ b/Documentation/driver-api/dma-buf.rst
@@ -139,9 +139,6 @@
 Seqno Hardware Fences
 ~~~~~~~~~~~~~~~~~~~~~
 
-.. kernel-doc:: drivers/dma-buf/seqno-fence.c
-   :export:
-
 .. kernel-doc:: include/linux/seqno-fence.h
    :internal:
 
diff --git a/Documentation/driver-api/miscellaneous.rst b/Documentation/driver-api/miscellaneous.rst
index 8da7d11..304ffb1 100644
--- a/Documentation/driver-api/miscellaneous.rst
+++ b/Documentation/driver-api/miscellaneous.rst
@@ -47,4 +47,3 @@
 
 .. kernel-doc:: drivers/pwm/core.c
    :export:
-   
diff --git a/Documentation/driver-api/s390-drivers.rst b/Documentation/driver-api/s390-drivers.rst
index 7060da1..ecf8851 100644
--- a/Documentation/driver-api/s390-drivers.rst
+++ b/Documentation/driver-api/s390-drivers.rst
@@ -75,7 +75,7 @@
 data which is made available by the channel subsystem for each channel
 attached device.
 
-.. kernel-doc:: arch/s390/include/asm/cmb.h
+.. kernel-doc:: arch/s390/include/uapi/asm/cmb.h
    :internal:
 
 .. kernel-doc:: drivers/s390/cio/cmf.c
diff --git a/Documentation/driver-api/scsi.rst b/Documentation/driver-api/scsi.rst
index 859fb67..5a2aa7a 100644
--- a/Documentation/driver-api/scsi.rst
+++ b/Documentation/driver-api/scsi.rst
@@ -224,14 +224,6 @@
 .. kernel-doc:: drivers/scsi/hosts.c
    :export:
 
-drivers/scsi/constants.c
-~~~~~~~~~~~~~~~~~~~~~~~~
-
-mid to lowlevel SCSI driver interface
-
-.. kernel-doc:: drivers/scsi/constants.c
-   :export:
-
 Transport classes
 -----------------
 
diff --git a/Documentation/features/core/tracehook/arch-support.txt b/Documentation/features/core/tracehook/arch-support.txt
index 5e97a89..dfb638c 100644
--- a/Documentation/features/core/tracehook/arch-support.txt
+++ b/Documentation/features/core/tracehook/arch-support.txt
@@ -25,7 +25,7 @@
     |     mn10300: |  ok  |
     |       nios2: |  ok  |
     |    openrisc: |  ok  |
-    |      parisc: | TODO |
+    |      parisc: |  ok  |
     |     powerpc: |  ok  |
     |        s390: |  ok  |
     |       score: | TODO |
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 73e7d91..405a3df 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -829,9 +829,7 @@
   swap_activate: Called when swapon is used on a file to allocate
 	space if necessary and pin the block lookup information in
 	memory. A return value of zero indicates success,
-	in which case this file can be used to back swapspace. The
-	swapspace operations will be proxied to this address space's
-	->swap_{out,in} methods.
+	in which case this file can be used to back swapspace.
 
   swap_deactivate: Called during swapoff on files where swap_activate
 	was successful.
diff --git a/Documentation/gpu/drm-internals.rst b/Documentation/gpu/drm-internals.rst
index 0d936c6..5ee9674 100644
--- a/Documentation/gpu/drm-internals.rst
+++ b/Documentation/gpu/drm-internals.rst
@@ -201,6 +201,8 @@
 Open/Close, File Operations and IOCTLs
 ======================================
 
+.. _drm_driver_fops:
+
 File Operations
 ---------------
 
diff --git a/Documentation/gpu/drm-kms-helpers.rst b/Documentation/gpu/drm-kms-helpers.rst
index 7c5e254..13dd237 100644
--- a/Documentation/gpu/drm-kms-helpers.rst
+++ b/Documentation/gpu/drm-kms-helpers.rst
@@ -296,3 +296,12 @@
 
 .. kernel-doc:: drivers/gpu/drm/drm_modeset_helper.c
    :export:
+
+Framebuffer GEM Helper Reference
+================================
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_framebuffer_helper.c
+   :doc: overview
+
+.. kernel-doc:: drivers/gpu/drm/drm_gem_framebuffer_helper.c
+   :export:
diff --git a/Documentation/gpu/drm-kms.rst b/Documentation/gpu/drm-kms.rst
index 2d77c95..3072841 100644
--- a/Documentation/gpu/drm-kms.rst
+++ b/Documentation/gpu/drm-kms.rst
@@ -523,9 +523,6 @@
 .. kernel-doc:: drivers/gpu/drm/drm_color_mgmt.c
    :doc: overview
 
-.. kernel-doc:: include/drm/drm_color_mgmt.h
-   :internal:
-
 .. kernel-doc:: drivers/gpu/drm/drm_color_mgmt.c
    :export:
 
@@ -554,60 +551,8 @@
 Vertical Blanking
 =================
 
-Vertical blanking plays a major role in graphics rendering. To achieve
-tear-free display, users must synchronize page flips and/or rendering to
-vertical blanking. The DRM API offers ioctls to perform page flips
-synchronized to vertical blanking and wait for vertical blanking.
-
-The DRM core handles most of the vertical blanking management logic,
-which involves filtering out spurious interrupts, keeping race-free
-blanking counters, coping with counter wrap-around and resets and
-keeping use counts. It relies on the driver to generate vertical
-blanking interrupts and optionally provide a hardware vertical blanking
-counter. Drivers must implement the following operations.
-
--  int (\*enable_vblank) (struct drm_device \*dev, int crtc); void
-   (\*disable_vblank) (struct drm_device \*dev, int crtc);
-   Enable or disable vertical blanking interrupts for the given CRTC.
-
--  u32 (\*get_vblank_counter) (struct drm_device \*dev, int crtc);
-   Retrieve the value of the vertical blanking counter for the given
-   CRTC. If the hardware maintains a vertical blanking counter its value
-   should be returned. Otherwise drivers can use the
-   :c:func:`drm_vblank_count()` helper function to handle this
-   operation.
-
-Drivers must initialize the vertical blanking handling core with a call
-to :c:func:`drm_vblank_init()` in their load operation.
-
-Vertical blanking interrupts can be enabled by the DRM core or by
-drivers themselves (for instance to handle page flipping operations).
-The DRM core maintains a vertical blanking use count to ensure that the
-interrupts are not disabled while a user still needs them. To increment
-the use count, drivers call :c:func:`drm_vblank_get()`. Upon
-return vertical blanking interrupts are guaranteed to be enabled.
-
-To decrement the use count drivers call
-:c:func:`drm_vblank_put()`. Only when the use count drops to zero
-will the DRM core disable the vertical blanking interrupts after a delay
-by scheduling a timer. The delay is accessible through the
-vblankoffdelay module parameter or the ``drm_vblank_offdelay`` global
-variable and expressed in milliseconds. Its default value is 5000 ms.
-Zero means never disable, and a negative value means disable
-immediately. Drivers may override the behaviour by setting the
-:c:type:`struct drm_device <drm_device>`
-vblank_disable_immediate flag, which when set causes vblank interrupts
-to be disabled immediately regardless of the drm_vblank_offdelay
-value. The flag should only be set if there's a properly working
-hardware vblank counter present.
-
-When a vertical blanking interrupt occurs drivers only need to call the
-:c:func:`drm_handle_vblank()` function to account for the
-interrupt.
-
-Resources allocated by :c:func:`drm_vblank_init()` must be freed
-with a call to :c:func:`drm_vblank_cleanup()` in the driver unload
-operation handler.
+.. kernel-doc:: drivers/gpu/drm/drm_vblank.c
+   :doc: vblank handling
 
 Vertical Blanking and Interrupt Handling Functions Reference
 ------------------------------------------------------------
diff --git a/Documentation/gpu/drm-mm.rst b/Documentation/gpu/drm-mm.rst
index 9412798..b08e9dc 100644
--- a/Documentation/gpu/drm-mm.rst
+++ b/Documentation/gpu/drm-mm.rst
@@ -191,7 +191,7 @@
 holding the lock.
 
 When the last reference to a GEM object is released the GEM core calls
-the :c:type:`struct drm_driver <drm_driver>` gem_free_object
+the :c:type:`struct drm_driver <drm_driver>` gem_free_object_unlocked
 operation. That operation is mandatory for GEM-enabled drivers and must
 free the GEM object and all associated resources.
 
@@ -492,7 +492,7 @@
    :doc: Overview
 
 .. kernel-doc:: include/drm/drm_syncobj.h
-   :export:
+   :internal:
 
 .. kernel-doc:: drivers/gpu/drm/drm_syncobj.c
    :export:
diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index 8584575..679373b 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -160,6 +160,8 @@
 visible to user-space and accessible beyond open-file boundaries, they
 cannot support render nodes.
 
+.. _drm_driver_ioctl:
+
 IOCTL Support on Device Nodes
 =============================
 
diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 9c7ed3e..2e7ee03 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -417,6 +417,10 @@
    :functions: i915_perf_open_ioctl
 .. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
    :functions: i915_perf_release
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_add_config_ioctl
+.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
+   :functions: i915_perf_remove_config_ioctl
 
 i915 Perf Stream
 ----------------
@@ -477,4 +481,16 @@
 .. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
    :internal:
 
-.. WARNING: DOCPROC directive not supported: !Cdrivers/gpu/drm/i915/i915_irq.c
+Style
+=====
+
+The drm/i915 driver codebase has some style rules in addition to (and, in some
+cases, deviating from) the kernel coding style.
+
+Register macro definition style
+-------------------------------
+
+The style guide for ``i915_reg.h``.
+
+.. kernel-doc:: drivers/gpu/drm/i915/i915_reg.h
+   :doc: The i915 register macro definition style guide
diff --git a/Documentation/gpu/todo.rst b/Documentation/gpu/todo.rst
index 1ae4200..22af55d 100644
--- a/Documentation/gpu/todo.rst
+++ b/Documentation/gpu/todo.rst
@@ -108,8 +108,8 @@
   crtc state, clear that to the max values, x/y = 0 and w/h = MAX_INT, in
   __drm_atomic_helper_crtc_duplicate_state().
 
-- Move tinydrm_merge_clips into drm_framebuffer.c, dropping the tinydrm_
-  prefix ofc and using drm_fb_. drm_framebuffer.c makes sense since this
+- Move tinydrm_merge_clips into drm_framebuffer.c, dropping the tinydrm\_
+  prefix ofc and using drm_fb\_. drm_framebuffer.c makes sense since this
   is a function useful to implement the fb->dirty function.
 
 - Create a new drm_fb_dirty function which does essentially what e.g.
diff --git a/Documentation/hwmon/ftsteutates b/Documentation/hwmon/ftsteutates
index 8c10a91..af54db9 100644
--- a/Documentation/hwmon/ftsteutates
+++ b/Documentation/hwmon/ftsteutates
@@ -18,6 +18,10 @@
 8 fans. It also contains an integrated watchdog which is currently
 implemented in this driver.
 
+To clear a temperature or fan alarm, execute the following command with the
+correct path to the alarm file:
+	echo 0 >XXXX_alarm
+
 Specification of the chip can be found here:
 ftp://ftp.ts.fujitsu.com/pub/Mainboard-OEM-Sales/Services/Software&Tools/Linux_SystemMonitoring&Watchdog&GPIO/BMC-Teutates_Specification_V1.21.pdf
 ftp://ftp.ts.fujitsu.com/pub/Mainboard-OEM-Sales/Services/Software&Tools/Linux_SystemMonitoring&Watchdog&GPIO/Fujitsu_mainboards-1-Sensors_HowTo-en-US.pdf
diff --git a/Documentation/hwmon/ibm-cffps b/Documentation/hwmon/ibm-cffps
new file mode 100644
index 0000000..e05ecd8
--- /dev/null
+++ b/Documentation/hwmon/ibm-cffps
@@ -0,0 +1,54 @@
+Kernel driver ibm-cffps
+=======================
+
+Supported chips:
+  * IBM Common Form Factor power supply
+
+Author: Eddie James <eajames@us.ibm.com>
+
+Description
+-----------
+
+This driver supports IBM Common Form Factor (CFF) power supplies. This driver
+is a client to the core PMBus driver.
+
+Usage Notes
+-----------
+
+This driver does not auto-detect devices. You will have to instantiate the
+devices explicitly. Please see Documentation/i2c/instantiating-devices for
+details.
+
+Sysfs entries
+-------------
+
+The following attributes are supported:
+
+curr1_alarm		Output current over-current alarm.
+curr1_input		Measured output current in mA.
+curr1_label		"iout1"
+
+fan1_alarm		Fan 1 warning.
+fan1_fault		Fan 1 fault.
+fan1_input		Fan 1 speed in RPM.
+fan2_alarm		Fan 2 warning.
+fan2_fault		Fan 2 fault.
+fan2_input		Fan 2 speed in RPM.
+
+in1_alarm		Input voltage under-voltage alarm.
+in1_input		Measured input voltage in mV.
+in1_label		"vin"
+in2_alarm		Output voltage over-voltage alarm.
+in2_input		Measured output voltage in mV.
+in2_label		"vout1"
+
+power1_alarm		Input fault or alarm.
+power1_input		Measured input power in uW.
+power1_label		"pin"
+
+temp1_alarm		PSU inlet ambient temperature over-temperature alarm.
+temp1_input		Measured PSU inlet ambient temp in millidegrees C.
+temp2_alarm		Secondary rectifier temp over-temperature alarm.
+temp2_input		Measured secondary rectifier temp in millidegrees C.
+temp3_alarm		ORing FET temperature over-temperature alarm.
+temp3_input		Measured ORing FET temperature in millidegrees C.
diff --git a/Documentation/hwmon/lm25066 b/Documentation/hwmon/lm25066
index 2cb20eb..3fa6bf8 100644
--- a/Documentation/hwmon/lm25066
+++ b/Documentation/hwmon/lm25066
@@ -29,6 +29,11 @@
     Addresses scanned: -
     Datasheet:
 	http://www.national.com/pf/LM/LM5066.html
+  * Texas Instruments LM5066I
+    Prefix: 'lm5066i'
+    Addresses scanned: -
+    Datasheet:
+	http://www.ti.com/product/LM5066I
 
 Author: Guenter Roeck <linux@roeck-us.net>
 
@@ -37,8 +42,8 @@
 -----------
 
 This driver supports hardware monitoring for National Semiconductor / TI LM25056,
-LM25063, LM25066, LM5064, and LM5066 Power Management, Monitoring, Control, and
-Protection ICs.
+LM25063, LM25066, LM5064, and LM5066/LM5066I Power Management, Monitoring,
+Control, and Protection ICs.
 
 The driver is a client driver to the core PMBus driver. Please see
 Documentation/hwmon/pmbus for details on PMBus client drivers.
diff --git a/Documentation/infiniband/tag_matching.txt b/Documentation/infiniband/tag_matching.txt
new file mode 100644
index 0000000..d2a3bf8
--- /dev/null
+++ b/Documentation/infiniband/tag_matching.txt
@@ -0,0 +1,64 @@
+Tag matching logic
+
+The MPI standard defines a set of rules, known as tag-matching, for matching
+source send operations to destination receives.  The following source and
+destination parameters must match:
+*	Communicator
+*	User tag - wild card may be specified by the receiver
+*	Source rank – wild card may be specified by the receiver
+*	Destination rank – wild
+The ordering rules require that when more than one pair of send and receive
+message envelopes may match, the pair that includes the earliest posted-send
+and the earliest posted-receive is the pair that must be used to satisfy the
+matching operation. However, this doesn’t imply that tags are consumed in
+the order they are created, e.g., a later generated tag may be consumed, if
+earlier tags can’t be used to satisfy the matching rules.
+
+When a message is sent from the sender to the receiver, the communication
+library may attempt to process the operation either after or before the
+corresponding matching receive is posted.  If a matching receive is posted,
+this is an expected message, otherwise it is called an unexpected message.
+Implementations frequently use different matching schemes for these two
+different matching instances.
+
+To keep MPI library memory footprint down, MPI implementations typically use
+two different protocols for this purpose:
+
+1.	The Eager protocol - the complete message is sent when the send is
+processed by the sender. A completion send is received in the send_cq
+notifying that the buffer can be reused.
+
+2.	The Rendezvous Protocol - the sender sends the tag-matching header,
+and perhaps a portion of data when first notifying the receiver. When the
+corresponding buffer is posted, the responder will use the information from
+the header to initiate an RDMA READ operation directly to the matching buffer.
+A fin message needs to be received in order for the buffer to be reused.
+
+Tag matching implementation
+
+There are two types of matching objects used, the posted receive list and the
+unexpected message list. The application posts receive buffers through calls
+to the MPI receive routines in the posted receive list and posts send messages
+using the MPI send routines. The head of the posted receive list may be
+maintained by the hardware, with the software expected to shadow this list.
+
+When a send is initiated and arrives at the receive side, if there is no
+pre-posted receive for this arriving message, it is passed to the software and
+placed in the unexpected message list. Otherwise the match is processed,
+including rendezvous processing, if appropriate, delivering the data to the
+specified receive buffer. This allows overlapping receive-side MPI tag
+matching with computation.
+
+When a receive-message is posted, the communication library will first check
+the software unexpected message list for a matching receive. If a match is
+found, data is delivered to the user buffer, using a software controlled
+protocol. The UCX implementation uses either an eager or rendezvous protocol,
+depending on data size. If no match is found, the entire pre-posted receive
+list is maintained by the hardware, and there is space to add one more
+pre-posted receive to this list, this receive is passed to the hardware.
+Software is expected to shadow this list, to help with processing MPI cancel
+operations. In addition, because hardware and software are not expected to be
+tightly synchronized with respect to the tag-matching operation, this shadow
+list is used to detect the case that a pre-posted receive is passed to the
+hardware, as the matching unexpected message is being passed from the hardware
+to the software.
diff --git a/Documentation/input/input.rst b/Documentation/input/input.rst
index 3b3a229..47f86a4 100644
--- a/Documentation/input/input.rst
+++ b/Documentation/input/input.rst
@@ -109,7 +109,7 @@
 keyboard
 ~~~~~~~~
 
-``keyboard`` is in-kernel input handler ad is a part of VT code. It
+``keyboard`` is in-kernel input handler and is a part of VT code. It
 consumes keyboard keystrokes and handles user input for VT consoles.
 
 mousedev
diff --git a/Documentation/input/joydev/index.rst b/Documentation/input/joydev/index.rst
index 8d9666c..ebcff43 100644
--- a/Documentation/input/joydev/index.rst
+++ b/Documentation/input/joydev/index.rst
@@ -12,7 +12,6 @@
 
 .. toctree::
 	:maxdepth: 3
-	:numbered:
 
 	joystick
 	joystick-api
diff --git a/Documentation/kbuild/makefiles.txt b/Documentation/kbuild/makefiles.txt
index 7003141..329e740 100644
--- a/Documentation/kbuild/makefiles.txt
+++ b/Documentation/kbuild/makefiles.txt
@@ -297,9 +297,9 @@
 	ccflags-y specifies options for compiling with $(CC).
 
 	Example:
-		# drivers/acpi/Makefile
-		ccflags-y := -Os
-		ccflags-$(CONFIG_ACPI_DEBUG) += -DACPI_DEBUG_OUTPUT
+		# drivers/acpi/acpica/Makefile
+		ccflags-y			:= -Os -D_LINUX -DBUILDING_ACPICA
+		ccflags-$(CONFIG_ACPI_DEBUG)	+= -DACPI_DEBUG_OUTPUT
 
 	This variable is necessary because the top Makefile owns the
 	variable $(KBUILD_CFLAGS) and uses it for compilation flags for the
diff --git a/Documentation/locking/rt-mutex-design.txt b/Documentation/locking/rt-mutex-design.txt
index 8666070..6c6e8c2 100644
--- a/Documentation/locking/rt-mutex-design.txt
+++ b/Documentation/locking/rt-mutex-design.txt
@@ -97,9 +97,9 @@
            a process being blocked on the mutex, it is fine to allocate
            the waiter on the process's stack (local variable).  This
            structure holds a pointer to the task, as well as the mutex that
-           the task is blocked on.  It also has the plist node structures to
-           place the task in the waiter_list of a mutex as well as the
-           pi_list of a mutex owner task (described below).
+           the task is blocked on.  It also has rbtree node structures to
+           place the task in the waiters rbtree of a mutex as well as the
+           pi_waiters rbtree of a mutex owner task (described below).
 
            waiter is sometimes used in reference to the task that is waiting
            on a mutex. This is the same as waiter->task.
@@ -179,53 +179,34 @@
                          |
                    F->L5-+
 
+If process G has the highest priority in the chain, then all the tasks up
+the chain (A and B in this example), must have their priorities increased
+to that of G.
 
-Plist
------
-
-Before I go further and talk about how the PI chain is stored through lists
-on both mutexes and processes, I'll explain the plist.  This is similar to
-the struct list_head functionality that is already in the kernel.
-The implementation of plist is out of scope for this document, but it is
-very important to understand what it does.
-
-There are a few differences between plist and list, the most important one
-being that plist is a priority sorted linked list.  This means that the
-priorities of the plist are sorted, such that it takes O(1) to retrieve the
-highest priority item in the list.  Obviously this is useful to store processes
-based on their priorities.
-
-Another difference, which is important for implementation, is that, unlike
-list, the head of the list is a different element than the nodes of a list.
-So the head of the list is declared as struct plist_head and nodes that will
-be added to the list are declared as struct plist_node.
-
-
-Mutex Waiter List
+Mutex Waiters Tree
 -----------------
 
-Every mutex keeps track of all the waiters that are blocked on itself. The mutex
-has a plist to store these waiters by priority.  This list is protected by
-a spin lock that is located in the struct of the mutex. This lock is called
-wait_lock.  Since the modification of the waiter list is never done in
-interrupt context, the wait_lock can be taken without disabling interrupts.
+Every mutex keeps track of all the waiters that are blocked on itself. The
+mutex has an rbtree to store these waiters by priority.  This tree is protected
+by a spin lock that is located in the struct of the mutex. This lock is called
+wait_lock.
 
 
-Task PI List
+Task PI Tree
 ------------
 
-To keep track of the PI chains, each process has its own PI list.  This is
-a list of all top waiters of the mutexes that are owned by the process.
-Note that this list only holds the top waiters and not all waiters that are
+To keep track of the PI chains, each process has its own PI rbtree.  This is
+a tree of all top waiters of the mutexes that are owned by the process.
+Note that this tree only holds the top waiters and not all waiters that are
 blocked on mutexes owned by the process.
 
-The top of the task's PI list is always the highest priority task that
+The top of the task's PI tree is always the highest priority task that
 is waiting on a mutex that is owned by the task.  So if the task has
 inherited a priority, it will always be the priority of the task that is
-at the top of this list.
+at the top of this tree.
 
-This list is stored in the task structure of a process as a plist called
-pi_list.  This list is protected by a spin lock also in the task structure,
+This tree is stored in the task structure of a process as an rbtree called
+pi_waiters.  It is protected by a spin lock also in the task structure,
 called pi_lock.  This lock may also be taken in interrupt context, so when
 locking the pi_lock, interrupts must be disabled.
 
@@ -312,15 +293,12 @@
 
 The mutex structure contains a pointer to the owner of the mutex.  If the
 mutex is not owned, this owner is set to NULL.  Since all architectures
-have the task structure on at least a four byte alignment (and if this is
-not true, the rtmutex.c code will be broken!), this allows for the two
-least significant bits to be used as flags.  This part is also described
-in Documentation/rt-mutex.txt, but will also be briefly described here.
+have the task structure on at least a two byte alignment (and if this is
+not true, the rtmutex.c code will be broken!), this allows for the least
+significant bit to be used as a flag.  Bit 0 is used as the "Has Waiters"
+flag. It's set whenever there are waiters on a mutex.
 
-Bit 0 is used as the "Pending Owner" flag.  This is described later.
-Bit 1 is used as the "Has Waiters" flags.  This is also described later
-  in more detail, but is set whenever there are waiters on a mutex.
-
+See Documentation/locking/rt-mutex.txt for further details.
 
 cmpxchg Tricks
 --------------
@@ -359,40 +337,31 @@
 --------------------
 
 The implementation of the PI code in rtmutex.c has several places that a
-process must adjust its priority.  With the help of the pi_list of a
+process must adjust its priority.  With the help of the pi_waiters of a
 process this is rather easy to know what needs to be adjusted.
 
-The functions implementing the task adjustments are rt_mutex_adjust_prio,
-__rt_mutex_adjust_prio (same as the former, but expects the task pi_lock
-to already be taken), rt_mutex_getprio, and rt_mutex_setprio.
+The functions implementing the task adjustments are rt_mutex_adjust_prio
+and rt_mutex_setprio. rt_mutex_setprio is only used in rt_mutex_adjust_prio.
 
-rt_mutex_getprio and rt_mutex_setprio are only used in __rt_mutex_adjust_prio.
+rt_mutex_adjust_prio examines the priority of the task, and the highest
+priority process that is waiting on any of the mutexes owned by the task. Since
+the pi_waiters of a task holds, ordered by priority, all the top waiters
+of all the mutexes that the task owns, we simply need to compare the top
+pi waiter to its own normal/deadline priority and take the higher one.
+Then rt_mutex_setprio is called to adjust the priority of the task to the
+new priority. Note that rt_mutex_setprio is defined in kernel/sched/core.c
+to implement the actual change in priority.
 
-rt_mutex_getprio returns the priority that the task should have.  Either the
-task's own normal priority, or if a process of a higher priority is waiting on
-a mutex owned by the task, then that higher priority should be returned.
-Since the pi_list of a task holds an order by priority list of all the top
-waiters of all the mutexes that the task owns, rt_mutex_getprio simply needs
-to compare the top pi waiter to its own normal priority, and return the higher
-priority back.
+(Note:  For the "prio" field in task_struct, the lower the number, the
+	higher the priority. A "prio" of 5 is of higher priority than a
+	"prio" of 10.)
 
-(Note:  if looking at the code, you will notice that the lower number of
-        prio is returned.  This is because the prio field in the task structure
-        is an inverse order of the actual priority.  So a "prio" of 5 is
-        of higher priority than a "prio" of 10.)
-
-__rt_mutex_adjust_prio examines the result of rt_mutex_getprio, and if the
-result does not equal the task's current priority, then rt_mutex_setprio
-is called to adjust the priority of the task to the new priority.
-Note that rt_mutex_setprio is defined in kernel/sched/core.c to implement the
-actual change in priority.
-
-It is interesting to note that __rt_mutex_adjust_prio can either increase
+It is interesting to note that rt_mutex_adjust_prio can either increase
 or decrease the priority of the task.  In the case that a higher priority
-process has just blocked on a mutex owned by the task, __rt_mutex_adjust_prio
+process has just blocked on a mutex owned by the task, rt_mutex_adjust_prio
 would increase/boost the task's priority.  But if a higher priority task
 were for some reason to leave the mutex (timeout or signal), this same function
-would decrease/unboost the priority of the task.  That is because the pi_list
+would decrease/unboost the priority of the task.  That is because the pi_waiters
 always contains the highest priority task that is waiting on a mutex owned
 by the task, so we only need to compare the priority of that top pi waiter
 to the normal priority of the given task.
@@ -412,9 +381,10 @@
 
 rt_mutex_adjust_prio_chain is called with a task to be checked for PI
 (de)boosting (the owner of a mutex that a process is blocking on), a flag to
-check for deadlocking, the mutex that the task owns, and a pointer to a waiter
+check for deadlocking, the mutex that the task owns, a pointer to a waiter
 that is the process's waiter struct that is blocked on the mutex (although this
-parameter may be NULL for deboosting).
+parameter may be NULL for deboosting), a pointer to the mutex on which the task
+is blocked, and a top_task as the top waiter of the mutex.
 
 For this explanation, I will not mention deadlock detection. This explanation
 will try to stay at a high level.
@@ -424,133 +394,14 @@
 
 Before this function is called, the task has already had rt_mutex_adjust_prio
 performed on it.  This means that the task is set to the priority that it
-should be at, but the plist nodes of the task's waiter have not been updated
-with the new priorities, and that this task may not be in the proper locations
-in the pi_lists and wait_lists that the task is blocked on.  This function
+should be at, but the rbtree nodes of the task's waiter have not been updated
+with the new priorities, and this task may not be in the proper locations
+in the pi_waiters and waiters trees that the task is blocked on. This function
 solves all that.
 
-A loop is entered, where task is the owner to be checked for PI changes that
-was passed by parameter (for the first iteration).  The pi_lock of this task is
-taken to prevent any more changes to the pi_list of the task.  This also
-prevents new tasks from completing the blocking on a mutex that is owned by this
-task.
-
-If the task is not blocked on a mutex then the loop is exited.  We are at
-the top of the PI chain.
-
-A check is now done to see if the original waiter (the process that is blocked
-on the current mutex) is the top pi waiter of the task.  That is, is this
-waiter on the top of the task's pi_list.  If it is not, it either means that
-there is another process higher in priority that is blocked on one of the
-mutexes that the task owns, or that the waiter has just woken up via a signal
-or timeout and has left the PI chain.  In either case, the loop is exited, since
-we don't need to do any more changes to the priority of the current task, or any
-task that owns a mutex that this current task is waiting on.  A priority chain
-walk is only needed when a new top pi waiter is made to a task.
-
-The next check sees if the task's waiter plist node has the priority equal to
-the priority the task is set at.  If they are equal, then we are done with
-the loop.  Remember that the function started with the priority of the
-task adjusted, but the plist nodes that hold the task in other processes
-pi_lists have not been adjusted.
-
-Next, we look at the mutex that the task is blocked on. The mutex's wait_lock
-is taken.  This is done by a spin_trylock, because the locking order of the
-pi_lock and wait_lock goes in the opposite direction. If we fail to grab the
-lock, the pi_lock is released, and we restart the loop.
-
-Now that we have both the pi_lock of the task as well as the wait_lock of
-the mutex the task is blocked on, we update the task's waiter's plist node
-that is located on the mutex's wait_list.
-
-Now we release the pi_lock of the task.
-
-Next the owner of the mutex has its pi_lock taken, so we can update the
-task's entry in the owner's pi_list.  If the task is the highest priority
-process on the mutex's wait_list, then we remove the previous top waiter
-from the owner's pi_list, and replace it with the task.
-
-Note: It is possible that the task was the current top waiter on the mutex,
-      in which case the task is not yet on the pi_list of the waiter.  This
-      is OK, since plist_del does nothing if the plist node is not on any
-      list.
-
-If the task was not the top waiter of the mutex, but it was before we
-did the priority updates, that means we are deboosting/lowering the
-task.  In this case, the task is removed from the pi_list of the owner,
-and the new top waiter is added.
-
-Lastly, we unlock both the pi_lock of the task, as well as the mutex's
-wait_lock, and continue the loop again.  On the next iteration of the
-loop, the previous owner of the mutex will be the task that will be
-processed.
-
-Note: One might think that the owner of this mutex might have changed
-      since we just grab the mutex's wait_lock. And one could be right.
-      The important thing to remember is that the owner could not have
-      become the task that is being processed in the PI chain, since
-      we have taken that task's pi_lock at the beginning of the loop.
-      So as long as there is an owner of this mutex that is not the same
-      process as the tasked being worked on, we are OK.
-
-      Looking closely at the code, one might be confused.  The check for the
-      end of the PI chain is when the task isn't blocked on anything or the
-      task's waiter structure "task" element is NULL.  This check is
-      protected only by the task's pi_lock.  But the code to unlock the mutex
-      sets the task's waiter structure "task" element to NULL with only
-      the protection of the mutex's wait_lock, which was not taken yet.
-      Isn't this a race condition if the task becomes the new owner?
-
-      The answer is No!  The trick is the spin_trylock of the mutex's
-      wait_lock.  If we fail that lock, we release the pi_lock of the
-      task and continue the loop, doing the end of PI chain check again.
-
-      In the code to release the lock, the wait_lock of the mutex is held
-      the entire time, and it is not let go when we grab the pi_lock of the
-      new owner of the mutex.  So if the switch of a new owner were to happen
-      after the check for end of the PI chain and the grabbing of the
-      wait_lock, the unlocking code would spin on the new owner's pi_lock
-      but never give up the wait_lock.  So the PI chain loop is guaranteed to
-      fail the spin_trylock on the wait_lock, release the pi_lock, and
-      try again.
-
-      If you don't quite understand the above, that's OK. You don't have to,
-      unless you really want to make a proof out of it ;)
-
-
-Pending Owners and Lock stealing
---------------------------------
-
-One of the flags in the owner field of the mutex structure is "Pending Owner".
-What this means is that an owner was chosen by the process releasing the
-mutex, but that owner has yet to wake up and actually take the mutex.
-
-Why is this important?  Why can't we just give the mutex to another process
-and be done with it?
-
-The PI code is to help with real-time processes, and to let the highest
-priority process run as long as possible with little latencies and delays.
-If a high priority process owns a mutex that a lower priority process is
-blocked on, when the mutex is released it would be given to the lower priority
-process.  What if the higher priority process wants to take that mutex again.
-The high priority process would fail to take that mutex that it just gave up
-and it would need to boost the lower priority process to run with full
-latency of that critical section (since the low priority process just entered
-it).
-
-There's no reason a high priority process that gives up a mutex should be
-penalized if it tries to take that mutex again.  If the new owner of the
-mutex has not woken up yet, there's no reason that the higher priority process
-could not take that mutex away.
-
-To solve this, we introduced Pending Ownership and Lock Stealing.  When a
-new process is given a mutex that it was blocked on, it is only given
-pending ownership.  This means that it's the new owner, unless a higher
-priority process comes in and tries to grab that mutex.  If a higher priority
-process does come along and wants that mutex, we let the higher priority
-process "steal" the mutex from the pending owner (only if it is still pending)
-and continue with the mutex.
-
+The main operation of this function is summarized by Thomas Gleixner in
+rtmutex.c. See the 'Chain walk basics and protection scope' comment for further
+details.
 
 Taking of a mutex (The walk through)
 ------------------------------------
@@ -563,14 +414,14 @@
 fails).  Only when the owner field of the mutex is NULL can the lock be
 taken with the CMPXCHG and nothing else needs to be done.
 
-If there is contention on the lock, whether it is owned or pending owner
-we go about the slow path (rt_mutex_slowlock).
+If there is contention on the lock, we go about the slow path
+(rt_mutex_slowlock).
 
 The slow path function is where the task's waiter structure is created on
 the stack.  This is because the waiter structure is only needed for the
 scope of this function.  The waiter structure holds the nodes to store
-the task on the wait_list of the mutex, and if need be, the pi_list of
-the owner.
+the task on the waiters tree of the mutex, and if need be, the pi_waiters
+tree of the owner.
 
 The wait_lock of the mutex is taken since the slow path of unlocking the
 mutex also takes this lock.
@@ -581,102 +432,45 @@
 
 try_to_take_rt_mutex is used every time the task tries to grab a mutex in the
 slow path.  The first thing that is done here is an atomic setting of
-the "Has Waiters" flag of the mutex's owner field.  Yes, this could really
-be false, because if the mutex has no owner, there are no waiters and
-the current task also won't have any waiters.  But we don't have the lock
-yet, so we assume we are going to be a waiter.  The reason for this is to
-play nice for those architectures that do have CMPXCHG.  By setting this flag
-now, the owner of the mutex can't release the mutex without going into the
-slow unlock path, and it would then need to grab the wait_lock, which this
-code currently holds.  So setting the "Has Waiters" flag forces the owner
-to synchronize with this code.
+the "Has Waiters" flag of the mutex's owner field. By setting this flag
+now, the current owner of the mutex being contended for can't release the mutex
+without going into the slow unlock path, and it would then need to grab the
+wait_lock, which this code currently holds. So setting the "Has Waiters" flag
+forces the current owner to synchronize with this code.
 
-Now that we know that we can't have any races with the owner releasing the
-mutex, we check to see if we can take the ownership.  This is done if the
-mutex doesn't have a owner, or if we can steal the mutex from a pending
-owner.  Let's look at the situations we have here.
+The lock is taken if the following are true:
+   1) The lock has no owner
+   2) The current task has a higher priority than all other
+      waiters of the lock
 
-  1) Has owner that is pending
-  ----------------------------
+If the task succeeds in acquiring the lock, then the task is set as the
+owner of the lock, and if the lock still has waiters, the top_waiter
+(highest priority task waiting on the lock) is added to this task's
+pi_waiters tree.
 
-  The mutex has a owner, but it hasn't woken up and the mutex flag
-  "Pending Owner" is set.  The first check is to see if the owner isn't the
-  current task.  This is because this function is also used for the pending
-  owner to grab the mutex.  When a pending owner wakes up, it checks to see
-  if it can take the mutex, and this is done if the owner is already set to
-  itself.  If so, we succeed and leave the function, clearing the "Pending
-  Owner" bit.
-
-  If the pending owner is not current, we check to see if the current priority is
-  higher than the pending owner.  If not, we fail the function and return.
-
-  There's also something special about a pending owner.  That is a pending owner
-  is never blocked on a mutex.  So there is no PI chain to worry about.  It also
-  means that if the mutex doesn't have any waiters, there's no accounting needed
-  to update the pending owner's pi_list, since we only worry about processes
-  blocked on the current mutex.
-
-  If there are waiters on this mutex, and we just stole the ownership, we need
-  to take the top waiter, remove it from the pi_list of the pending owner, and
-  add it to the current pi_list.  Note that at this moment, the pending owner
-  is no longer on the list of waiters.  This is fine, since the pending owner
-  would add itself back when it realizes that it had the ownership stolen
-  from itself.  When the pending owner tries to grab the mutex, it will fail
-  in try_to_take_rt_mutex if the owner field points to another process.
-
-  2) No owner
-  -----------
-
-  If there is no owner (or we successfully stole the lock), we set the owner
-  of the mutex to current, and set the flag of "Has Waiters" if the current
-  mutex actually has waiters, or we clear the flag if it doesn't.  See, it was
-  OK that we set that flag early, since now it is cleared.
-
-  3) Failed to grab ownership
-  ---------------------------
-
-  The most interesting case is when we fail to take ownership. This means that
-  there exists an owner, or there's a pending owner with equal or higher
-  priority than the current task.
-
-We'll continue on the failed case.
-
-If the mutex has a timeout, we set up a timer to go off to break us out
-of this mutex if we failed to get it after a specified amount of time.
-
-Now we enter a loop that will continue to try to take ownership of the mutex, or
-fail from a timeout or signal.
-
-Once again we try to take the mutex.  This will usually fail the first time
-in the loop, since it had just failed to get the mutex.  But the second time
-in the loop, this would likely succeed, since the task would likely be
-the pending owner.
-
-If the mutex is TASK_INTERRUPTIBLE a check for signals and timeout is done
-here.
-
-The waiter structure has a "task" field that points to the task that is blocked
-on the mutex.  This field can be NULL the first time it goes through the loop
-or if the task is a pending owner and had its mutex stolen.  If the "task"
-field is NULL then we need to set up the accounting for it.
+If the lock is not taken by try_to_take_rt_mutex(), then the
+task_blocks_on_rt_mutex() function is called. This will add the task to
+the lock's waiter tree and propagate the pi chain of the lock as well
+as the lock's owner's pi_waiters tree. This is described in the next
+section.
 
 Task blocks on mutex
 --------------------
 
 The accounting of a mutex and process is done with the waiter structure of
 the process.  The "task" field is set to the process, and the "lock" field
-to the mutex.  The plist nodes are initialized to the processes current
-priority.
+to the mutex.  The rbtree nodes of the waiter are initialized to the process's
+current priority.
 
 Since the wait_lock was taken at the entry of the slow lock, we can safely
-add the waiter to the wait_list.  If the current process is the highest
-priority process currently waiting on this mutex, then we remove the
-previous top waiter process (if it exists) from the pi_list of the owner,
-and add the current process to that list.  Since the pi_list of the owner
+add the waiter to the mutex's waiters tree.  If the current process is the
+highest priority process currently waiting on this mutex, then we remove the
+previous top waiter process (if it exists) from the pi_waiters of the owner,
+and add the current process to that tree.  Since the pi_waiters of the owner
 has changed, we call rt_mutex_adjust_prio on the owner to see if the owner
 should adjust its priority accordingly.
 
-If the owner is also blocked on a lock, and had its pi_list changed
+If the owner is also blocked on a lock, and had its pi_waiters changed
 (or deadlock checking is on), we unlock the wait_lock of the mutex and go ahead
 and run rt_mutex_adjust_prio_chain on the owner, as described earlier.
 
@@ -686,30 +480,23 @@
 Waking up in the loop
 ---------------------
 
-The schedule can then wake up for a few reasons.
-  1) we were given pending ownership of the mutex.
-  2) we received a signal and was TASK_INTERRUPTIBLE
-  3) we had a timeout and was TASK_INTERRUPTIBLE
+The task can then wake up for a couple of reasons:
+  1) The previous lock owner released the lock, and the task is now top_waiter
+  2) The task received a signal or timed out
 
-In any of these cases, we continue the loop and once again try to grab the
-ownership of the mutex.  If we succeed, we exit the loop, otherwise we continue
-and on signal and timeout, will exit the loop, or if we had the mutex stolen
-we just simply add ourselves back on the lists and go back to sleep.
+In both cases, the task will try again to acquire the lock. If it
+does, then it will take itself off the waiters tree and set itself back
+to the TASK_RUNNING state.
 
-Note: For various reasons, because of timeout and signals, the steal mutex
-      algorithm needs to be careful. This is because the current process is
-      still on the wait_list. And because of dynamic changing of priorities,
-      especially on SCHED_OTHER tasks, the current process can be the
-      highest priority task on the wait_list.
+In the first case, if the lock was acquired by another task before this task
+could get the lock, then it will go back to sleep and wait to be woken again.
 
-Failed to get mutex on Timeout or Signal
-----------------------------------------
-
-If a timeout or signal occurred, the waiter's "task" field would not be
-NULL and the task needs to be taken off the wait_list of the mutex and perhaps
-pi_list of the owner.  If this process was a high priority process, then
-the rt_mutex_adjust_prio_chain needs to be executed again on the owner,
-but this time it will be lowering the priorities.
+The second case is only applicable for tasks that are grabbing a mutex
+that can wake up before getting the lock, either due to a signal or
+a timeout (e.g. rt_mutex_timed_futex_lock()). When woken, it will try to
+take the lock again. If it succeeds, the task will return with the
+lock held; otherwise it will return with -EINTR if the task was woken
+by a signal, or -ETIMEDOUT if it timed out.
 
 
 Unlocking the Mutex
@@ -739,25 +526,12 @@
 owner field is set to NULL, the wait_lock is released and nothing more is
 needed.
 
-If there are waiters, then we need to wake one up and give that waiter
-pending ownership.
+If there are waiters, then we need to wake one up.
 
 On the wake up code, the pi_lock of the current owner is taken.  The top
-waiter of the lock is found and removed from the wait_list of the mutex
-as well as the pi_list of the current owner.  The task field of the new
-pending owner's waiter structure is set to NULL, and the owner field of the
-mutex is set to the new owner with the "Pending Owner" bit set, as well
-as the "Has Waiters" bit if there still are other processes blocked on the
-mutex.
-
-The pi_lock of the previous owner is released, and the new pending owner's
-pi_lock is taken.  Remember that this is the trick to prevent the race
-condition in rt_mutex_adjust_prio_chain from adding itself as a waiter
-on the mutex.
-
-We now clear the "pi_blocked_on" field of the new pending owner, and if
-the mutex still has waiters pending, we add the new top waiter to the pi_list
-of the pending owner.
+waiter of the lock is found and removed from the waiters tree of the mutex
+as well as the pi_waiters tree of the current owner. The "Has Waiters" bit is
+marked to prevent lower priority tasks from stealing the lock.
 
 Finally we unlock the pi_lock of the pending owner and wake it up.
 
@@ -772,10 +546,14 @@
 -------
 
 Author:  Steven Rostedt <rostedt@goodmis.org>
+Updated: Alex Shi <alex.shi@linaro.org>	- 7/6/2017
 
-Reviewers:  Ingo Molnar, Thomas Gleixner, Thomas Duetsch, and Randy Dunlap
+Original Reviewers:  Ingo Molnar, Thomas Gleixner, Thomas Duetsch, and
+		     Randy Dunlap
+Update (7/6/2017) Reviewers: Steven Rostedt and Sebastian Siewior
 
 Updates
 -------
 
 This document was originally written for 2.6.17-rc3-mm1
+and was updated for 4.12
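
As an illustration of the lock-taking rule stated in the updated
rt-mutex-design.txt text above (the lock has no owner, and the current task
outranks every other waiter), here is a minimal userspace sketch. All struct
and function names are hypothetical. It ignores the wait_lock, the rbtrees and
the PI bookkeeping, and models only the decision itself. It assumes that a
lower prio value means higher priority and that a tie goes to the existing top
waiter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; these are not the kernel's structures. */
struct task {
	int prio;			/* assumption: lower value = higher priority */
};

struct lock {
	struct task *owner;		/* NULL when the lock is free */
	struct task *top_waiter;	/* highest priority waiter, or NULL */
};

/* Returns true and records ownership if 'cur' may take the lock. */
static bool try_to_take(struct lock *l, struct task *cur)
{
	/* 1) The lock must have no owner. */
	if (l->owner)
		return false;

	/* 2) 'cur' must be the top waiter itself, or outrank it. */
	if (l->top_waiter && l->top_waiter != cur &&
	    cur->prio >= l->top_waiter->prio)
		return false;

	/* Simplification: no waiters other than the top one are modelled. */
	if (l->top_waiter == cur)
		l->top_waiter = NULL;

	l->owner = cur;
	return true;
}
```

In this model a lower priority task that races with a woken top waiter fails
the second check and would have to block, which matches the "go back to sleep"
behaviour described for the wake-up loop.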
diff --git a/Documentation/locking/rt-mutex.txt b/Documentation/locking/rt-mutex.txt
index 243393d..35793e0 100644
--- a/Documentation/locking/rt-mutex.txt
+++ b/Documentation/locking/rt-mutex.txt
@@ -28,14 +28,13 @@
 well-designed applications to use userspace locks in critical parts of
 an high priority thread, without losing determinism.
 
-The enqueueing of the waiters into the rtmutex waiter list is done in
+The enqueueing of the waiters into the rtmutex waiter tree is done in
 priority order. For same priorities FIFO order is chosen. For each
 rtmutex, only the top priority waiter is enqueued into the owner's
-priority waiters list. This list too queues in priority order. Whenever
+priority waiters tree. This tree too queues in priority order. Whenever
 the top priority waiter of a task changes (for example it timed out or
-got a signal), the priority of the owner task is readjusted. [The
-priority enqueueing is handled by "plists", see include/linux/plist.h
-for more details.]
+got a signal), the priority of the owner task is readjusted. The
+priority enqueueing is handled by "pi_waiters".
 
 RT-mutexes are optimized for fastpath operations and have no internal
 locking overhead when locking an uncontended mutex or unlocking a mutex
@@ -46,34 +45,29 @@
 The state of the rt-mutex is tracked via the owner field of the rt-mutex
 structure:
 
-rt_mutex->owner holds the task_struct pointer of the owner. Bit 0 and 1
-are used to keep track of the "owner is pending" and "rtmutex has
-waiters" state.
+lock->owner holds the task_struct pointer of the owner. Bit 0 is used to
+keep track of the "lock has waiters" state.
 
- owner		bit1	bit0
- NULL		0	0	mutex is free (fast acquire possible)
- NULL		0	1	invalid state
- NULL		1	0	Transitional state*
- NULL		1	1	invalid state
- taskpointer	0	0	mutex is held (fast release possible)
- taskpointer	0	1	task is pending owner
- taskpointer	1	0	mutex is held and has waiters
- taskpointer	1	1	task is pending owner and mutex has waiters
+ owner        bit0
+ NULL         0       lock is free (fast acquire possible)
+ NULL         1       lock is free and has waiters and the top waiter
+			is going to take the lock*
+ taskpointer  0       lock is held (fast release possible)
+ taskpointer  1       lock is held and has waiters**
 
-Pending-ownership handling is a performance optimization:
-pending-ownership is assigned to the first (highest priority) waiter of
-the mutex, when the mutex is released. The thread is woken up and once
-it starts executing it can acquire the mutex. Until the mutex is taken
-by it (bit 0 is cleared) a competing higher priority thread can "steal"
-the mutex which puts the woken up thread back on the waiters list.
+The fast atomic compare exchange based acquire and release is only
+possible when bit 0 of lock->owner is 0.
 
-The pending-ownership optimization is especially important for the
-uninterrupted workflow of high-prio tasks which repeatedly
-takes/releases locks that have lower-prio waiters. Without this
-optimization the higher-prio thread would ping-pong to the lower-prio
-task [because at unlock time we always assign a new owner].
+(*) It can also be a transitional state when grabbing the lock
+with ->wait_lock held. To prevent any fast path cmpxchg on the lock,
+we need to set bit 0 before looking at the lock, and the owner may be
+NULL during this short window, hence this can be a transitional state.
 
-(*) The "mutex has waiters" bit gets set to take the lock. If the lock
-doesn't already have an owner, this bit is quickly cleared if there are
-no waiters.  So this is a transitional state to synchronize with looking
-at the owner field of the mutex and the mutex owner releasing the lock.
+(**) There is a short window when bit 0 is set but there are no
+waiters. This can happen when grabbing the lock in the slow path.
+To prevent a cmpxchg of the owner releasing the lock, we need to
+set this bit before looking at the lock.
+
+BTW, there is still technically a "Pending Owner", it's just not called
+that anymore. The pending owner happens to be the top_waiter of a lock
+that has no owner and has been woken up to grab the lock.
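
The owner-field encoding in the updated rt-mutex.txt table above (task pointer
plus bit 0 as the "has waiters" flag) can be sketched in plain C as follows.
This is a userspace model under stated assumptions, not the kernel's code: the
struct and helper names are hypothetical, and the real implementation updates
the field with atomic operations that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the kernel's real structures differ. */
struct task {
	int prio;
};

struct lock {
	uintptr_t owner;	/* task pointer with bit 0 used as a flag */
};

#define HAS_WAITERS ((uintptr_t)1)

/* Pack a (possibly NULL) owner pointer and the "has waiters" flag. */
static void lock_set_owner(struct lock *l, struct task *t, bool waiters)
{
	/* Bit 0 is only free because task pointers are at least 2-byte aligned. */
	assert(((uintptr_t)t & HAS_WAITERS) == 0);
	l->owner = (uintptr_t)t | (waiters ? HAS_WAITERS : 0);
}

/* Strip the flag bit to recover the owner pointer. */
static struct task *lock_owner(const struct lock *l)
{
	return (struct task *)(l->owner & ~HAS_WAITERS);
}

static bool lock_has_waiters(const struct lock *l)
{
	return (l->owner & HAS_WAITERS) != 0;
}

/* The cmpxchg based fast paths are only possible while bit 0 is clear. */
static bool lock_fastpath_possible(const struct lock *l)
{
	return !lock_has_waiters(l);
}
```

The four rows of the table correspond to the four combinations of a NULL or
non-NULL lock_owner() result with lock_has_waiters().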
diff --git a/Documentation/media/uapi/cec/cec-funcs.rst b/Documentation/media/uapi/cec/cec-funcs.rst
index 5b7630f..6d696ce 100644
--- a/Documentation/media/uapi/cec/cec-funcs.rst
+++ b/Documentation/media/uapi/cec/cec-funcs.rst
@@ -7,7 +7,6 @@
 
 .. toctree::
     :maxdepth: 1
-    :numbered:
 
     cec-func-open
     cec-func-close
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index c4ddfcd..e2ee0a1 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -594,29 +594,6 @@
 This enforces the occurrence of one of the two implications, and prevents the
 third possibility from arising.
 
-A data-dependency barrier must also order against dependent writes:
-
-	CPU 1		      CPU 2
-	===============	      ===============
-	{ A == 1, B == 2, C = 3, P == &A, Q == &C }
-	B = 4;
-	<write barrier>
-	WRITE_ONCE(P, &B);
-			      Q = READ_ONCE(P);
-			      <data dependency barrier>
-			      *Q = 5;
-
-The data-dependency barrier must order the read into Q with the store
-into *Q.  This prohibits this outcome:
-
-	(Q == &B) && (B == 4)
-
-Please note that this pattern should be rare.  After all, the whole point
-of dependency ordering is to -prevent- writes to the data structure, along
-with the expensive cache misses associated with those writes.  This pattern
-can be used to record rare error conditions and the like, and the ordering
-prevents such records from being lost.
-
 
 [!] Note that this extremely counterintuitive situation arises most easily on
 machines with split caches, so that, for example, one cache bank processes
@@ -628,6 +605,36 @@
 but the old value of the variable B (2).
 
 
+A data-dependency barrier is not required to order dependent writes
+because the CPUs that the Linux kernel supports don't do writes
+until they are certain (1) that the write will actually happen, (2)
+of the location of the write, and (3) of the value to be written.
+But please carefully read the "CONTROL DEPENDENCIES" section and the
+Documentation/RCU/rcu_dereference.txt file:  The compiler can and does
+break dependencies in a great many highly creative ways.
+
+	CPU 1		      CPU 2
+	===============	      ===============
+	{ A == 1, B == 2, C = 3, P == &A, Q == &C }
+	B = 4;
+	<write barrier>
+	WRITE_ONCE(P, &B);
+			      Q = READ_ONCE(P);
+			      WRITE_ONCE(*Q, 5);
+
+Therefore, no data-dependency barrier is required to order the read into
+Q with the store into *Q.  In other words, this outcome is prohibited,
+even without a data-dependency barrier:
+
+	(Q == &B) && (B == 4)
+
+Please note that this pattern should be rare.  After all, the whole point
+of dependency ordering is to -prevent- writes to the data structure, along
+with the expensive cache misses associated with those writes.  This pattern
+can be used to record rare error conditions and the like, and the CPUs'
+naturally occurring ordering prevents such records from being lost.
+
+
 The data dependency barrier is very important to the RCU system,
 for example.  See rcu_assign_pointer() and rcu_dereference() in
 include/linux/rcupdate.h.  This permits the current target of an RCU'd
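
The dependent-write pattern from the memory-barriers.txt hunk above can be
written out as compilable C. This is a userspace sketch only: the READ_ONCE()
and WRITE_ONCE() macros below are simplified volatile-cast stand-ins (using the
GCC/Clang __typeof__ extension), the C11 release fence stands in for
<write barrier>, and the two functions are meant to run on different CPUs even
though nothing here creates threads.

```c
#include <stdatomic.h>

/* Simplified stand-ins for the kernel macros; assumptions for illustration. */
#define READ_ONCE(x)       (*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

static int A = 1, B = 2, C = 3;
static int *P = &A;
static int *Q = &C;

/* CPU 1: update B, then publish its address through P. */
static void cpu1(void)
{
	B = 4;
	atomic_thread_fence(memory_order_release);	/* stands in for <write barrier> */
	WRITE_ONCE(P, &B);
}

/* CPU 2: the store through Q depends on the value loaded from P. */
static void cpu2(void)
{
	Q = READ_ONCE(P);
	WRITE_ONCE(*Q, 5);	/* no data-dependency barrier before this store */
}

/* The outcome the text says is prohibited once both functions have run. */
static int prohibited_outcome(void)
{
	return Q == &B && B == 4;
}
```

As the text warns, the guarantee comes from the CPUs and can still be lost if
the compiler breaks the dependency, which is why both accesses on CPU 2 go
through the ONCE macros.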
diff --git a/Documentation/networking/ieee802154.txt b/Documentation/networking/ieee802154.txt
index c411434..057e9fd 100644
--- a/Documentation/networking/ieee802154.txt
+++ b/Documentation/networking/ieee802154.txt
@@ -84,17 +84,17 @@
 ==================
 
 The include/net/mac802154.h defines following functions:
- - struct ieee802154_dev *ieee802154_alloc_device
-   (size_t priv_size, struct ieee802154_ops *ops):
-   allocation of IEEE 802.15.4 compatible device
+ - struct ieee802154_hw *
+   ieee802154_alloc_hw(size_t priv_data_len, const struct ieee802154_ops *ops):
+   allocation of IEEE 802.15.4 compatible hardware device
 
- - void ieee802154_free_device(struct ieee802154_dev *dev):
-   freeing allocated device
+ - void ieee802154_free_hw(struct ieee802154_hw *hw):
+   freeing allocated hardware device
 
- - int ieee802154_register_device(struct ieee802154_dev *dev):
-   register PHY in the system
+ - int ieee802154_register_hw(struct ieee802154_hw *hw):
+   register the PHY, which is the allocated hardware device, in the system
 
- - void ieee802154_unregister_device(struct ieee802154_dev *dev):
+ - void ieee802154_unregister_hw(struct ieee802154_hw *hw):
    freeing registered PHY
 
 Moreover IEEE 802.15.4 device operations structure should be filled.
diff --git a/Documentation/nvmem/nvmem.txt b/Documentation/nvmem/nvmem.txt
index dbd40d8..8d8d8f5 100644
--- a/Documentation/nvmem/nvmem.txt
+++ b/Documentation/nvmem/nvmem.txt
@@ -112,7 +112,7 @@
 5. Releasing a reference to the NVMEM
 =====================================
 
-When a consumers no longer needs the NVMEM, it has to release the reference
+When a consumer no longer needs the NVMEM, it has to release the reference
 to the NVMEM it has obtained using the APIs mentioned in the above section.
 The NVMEM framework provides 2 APIs to release a reference to the NVMEM.
 
diff --git a/Documentation/process/applying-patches.rst b/Documentation/process/applying-patches.rst
index a0d058c..dc2ddc3 100644
--- a/Documentation/process/applying-patches.rst
+++ b/Documentation/process/applying-patches.rst
@@ -6,9 +6,6 @@
 Original by:
 	Jesper Juhl, August 2005
 
-Last update:
-	2016-09-14
-
 .. note::
 
    This document is obsolete.  In most cases, rather than using ``patch``
@@ -344,7 +341,7 @@
 
 This is a good branch to run for people who want to help out testing
 development kernels but do not want to run some of the really experimental
-stuff (such people should see the sections about -git and -mm kernels below).
+stuff (such people should see the sections about -next and -mm kernels below).
 
 The -rc patches are not incremental, they apply to a base 4.x kernel, just
 like the 4.x.y patches described above. The kernel version before the -rcN
@@ -380,44 +377,6 @@
 	$ mv linux-4.7.3 linux-4.8-rc5		# rename the kernel source dir
 
 
-The -git kernels
-================
-
-These are daily snapshots of Linus' kernel tree (managed in a git
-repository, hence the name).
-
-These patches are usually released daily and represent the current state of
-Linus's tree. They are more experimental than -rc kernels since they are
-generated automatically without even a cursory glance to see if they are
-sane.
-
--git patches are not incremental and apply either to a base 4.x kernel or
-a base 4.x-rc kernel -- you can see which from their name.
-A patch named 4.7-git1 applies to the 4.7 kernel source and a patch
-named 4.8-rc3-git2 applies to the source of the 4.8-rc3 kernel.
-
-Here are some examples of how to apply these patches::
-
-	# moving from 4.7 to 4.7-git1
-
-	$ cd ~/linux-4.7			# change to the kernel source dir
-	$ patch -p1 < ../patch-4.7-git1		# apply the 4.7-git1 patch
-	$ cd ..
-	$ mv linux-4.7 linux-4.7-git1		# rename the kernel source dir
-
-	# moving from 4.7-git1 to 4.8-rc2-git3
-
-	$ cd ~/linux-4.7-git1			# change to the kernel source dir
-	$ patch -p1 -R < ../patch-4.7-git1	# revert the 4.7-git1 patch
-						# we now have a 4.7 kernel
-	$ patch -p1 < ../patch-4.8-rc2		# apply the 4.8-rc2 patch
-						# the kernel is now 4.8-rc2
-	$ patch -p1 < ../patch-4.8-rc2-git3	# apply the 4.8-rc2-git3 patch
-						# the kernel is now 4.8-rc2-git3
-	$ cd ..
-	$ mv linux-4.7-git1 linux-4.8-rc2-git3	# rename source dir
-
-
 The -mm patches and the linux-next tree
 =======================================
 
diff --git a/Documentation/process/changes.rst b/Documentation/process/changes.rst
index adbb50a..560beae 100644
--- a/Documentation/process/changes.rst
+++ b/Documentation/process/changes.rst
@@ -53,7 +53,7 @@
 iptables               1.4.2            iptables -V
 openssl & libcrypto    1.0.0            openssl version
 bc                     1.06.95          bc --version
-Sphinx\ [#f1]_	       1.2		sphinx-build --version
+Sphinx\ [#f1]_	       1.3		sphinx-build --version
 ====================== ===============  ========================================
 
 .. [#f1] Sphinx is needed only to build the Kernel documentation
@@ -309,18 +309,8 @@
 Sphinx
 ------
 
-The ReST markups currently used by the Documentation/ files are meant to be
-built with ``Sphinx`` version 1.2 or upper. If you're desiring to build
-PDF outputs, it is recommended to use version 1.4.6.
-
-.. note::
-
-  Please notice that, for PDF and LaTeX output, you'll also need ``XeLaTeX``
-  version 3.14159265. Depending on the distribution, you may also need to
-  install a series of ``texlive`` packages that provide the minimal set of
-  functionalities required for ``XeLaTex`` to work. For PDF output you'll also
-  need ``convert(1)`` from ImageMagick (https://www.imagemagick.org).
-
+Please see :ref:`sphinx_install` in ``Documentation/doc-guide/sphinx.rst``
+for details about Sphinx requirements.
 
 Getting updated software
 ========================
diff --git a/Documentation/process/stable-kernel-rules.rst b/Documentation/process/stable-kernel-rules.rst
index 61e9c78..36a2dde 100644
--- a/Documentation/process/stable-kernel-rules.rst
+++ b/Documentation/process/stable-kernel-rules.rst
@@ -166,12 +166,12 @@
  - The queues of patches, for both completed versions and in progress
    versions can be found at:
 
-	http://git.kernel.org/?p=linux/kernel/git/stable/stable-queue.git
+	https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git
 
  - The finalized and tagged releases of all stable kernels can be found
    in separate branches per version at:
 
-	http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git
+	https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
 
 
 Review committee
diff --git a/Documentation/process/submitting-patches.rst b/Documentation/process/submitting-patches.rst
index 3e10719..733478a 100644
--- a/Documentation/process/submitting-patches.rst
+++ b/Documentation/process/submitting-patches.rst
@@ -413,7 +413,7 @@
 
 
 
-11) Sign your work — the Developer's Certificate of Origin
+11) Sign your work - the Developer's Certificate of Origin
 ----------------------------------------------------------
 
 To improve tracking of who did what, especially with patches that can
diff --git a/Documentation/security/keys/core.rst b/Documentation/security/keys/core.rst
index 1648fa8..1266eea 100644
--- a/Documentation/security/keys/core.rst
+++ b/Documentation/security/keys/core.rst
@@ -16,17 +16,7 @@
 
 This document has the following sections:
 
-	- Key overview
-	- Key service overview
-	- Key access permissions
-	- SELinux support
-	- New procfs files
-	- Userspace system call interface
-	- Kernel services
-	- Notes on accessing payload contents
-	- Defining a key type
-	- Request-key callback service
-	- Garbage collection
+.. contents:: :local:
 
 
 Key Overview
@@ -443,7 +433,7 @@
      /sbin/request-key will be invoked in an attempt to obtain a key. The
      callout_info string will be passed as an argument to the program.
 
-     See also Documentation/security/keys-request-key.txt.
+     See also Documentation/security/keys/request-key.rst.
 
 
 The keyctl syscall functions are:
@@ -973,7 +963,7 @@
     If successful, the key will have been attached to the default keyring for
     implicitly obtained request-key keys, as set by KEYCTL_SET_REQKEY_KEYRING.
 
-    See also Documentation/security/keys-request-key.txt.
+    See also Documentation/security/keys/request-key.rst.
 
 
  *  To search for a key, passing auxiliary data to the upcaller, call::
diff --git a/Documentation/security/keys/request-key.rst b/Documentation/security/keys/request-key.rst
index aba3278..b2d16ab 100644
--- a/Documentation/security/keys/request-key.rst
+++ b/Documentation/security/keys/request-key.rst
@@ -3,7 +3,7 @@
 ===================
 
 The key request service is part of the key retention service (refer to
-Documentation/security/keys.txt).  This document explains more fully how
+Documentation/security/keys/core.rst).  This document explains more fully how
 the requesting algorithm works.
 
 The process starts by either the kernel requesting a service by calling
diff --git a/Documentation/security/keys/trusted-encrypted.rst b/Documentation/security/keys/trusted-encrypted.rst
index 7b50383..3bb24e0 100644
--- a/Documentation/security/keys/trusted-encrypted.rst
+++ b/Documentation/security/keys/trusted-encrypted.rst
@@ -172,4 +172,4 @@
 are anticipated.  In particular the new format 'ecryptfs' has been defined in
 in order to use encrypted keys to mount an eCryptfs filesystem.  More details
 about the usage can be found in the file
-``Documentation/security/keys-ecryptfs.txt``.
+``Documentation/security/keys/ecryptfs.rst``.
diff --git a/Documentation/sphinx-static/theme_overrides.css b/Documentation/sphinx-static/theme_overrides.css
index d5764a4..522b6d4 100644
--- a/Documentation/sphinx-static/theme_overrides.css
+++ b/Documentation/sphinx-static/theme_overrides.css
@@ -4,6 +4,17 @@
  *
  */
 
+/* Interim: Code-blocks with line nos - lines and line numbers don't line up.
+ * see: https://github.com/rtfd/sphinx_rtd_theme/issues/419
+ */
+
+div[class^="highlight"] pre {
+    line-height: normal;
+}
+.rst-content .highlight > pre {
+    line-height: normal;
+}
+
 @media screen {
 
     /* content column
@@ -56,6 +67,12 @@
 	font-family: "Courier New", Courier, monospace
     }
 
+    /* fix bottom margin of list items */
+
+    .rst-content .section ul li:last-child, .rst-content .section ul li p:last-child {
+          margin-bottom: 12px;
+    }
+
     /* inline literal: drop the borderbox, padding and red color */
 
     code, .rst-content tt, .rst-content code {
diff --git a/Documentation/sphinx/kerneldoc.py b/Documentation/sphinx/kerneldoc.py
index d15e07f3..39aa9e8 100644
--- a/Documentation/sphinx/kerneldoc.py
+++ b/Documentation/sphinx/kerneldoc.py
@@ -27,6 +27,7 @@
 # Please make sure this works on both python2 and python3.
 #
 
+import codecs
 import os
 import subprocess
 import sys
@@ -88,13 +89,10 @@
         try:
             env.app.verbose('calling kernel-doc \'%s\'' % (" ".join(cmd)))
 
-            p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
+            p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
             out, err = p.communicate()
 
-            # python2 needs conversion to unicode.
-            # python3 with universal_newlines=True returns strings.
-            if sys.version_info.major < 3:
-                out, err = unicode(out, 'utf-8'), unicode(err, 'utf-8')
+            out, err = codecs.decode(out, 'utf-8'), codecs.decode(err, 'utf-8')
 
             if p.returncode != 0:
                 sys.stderr.write(err)
diff --git a/Documentation/sphinx/requirements.txt b/Documentation/sphinx/requirements.txt
new file mode 100644
index 0000000..742be3e
--- /dev/null
+++ b/Documentation/sphinx/requirements.txt
@@ -0,0 +1,3 @@
+docutils==0.12
+Sphinx==1.4.9
+sphinx_rtd_theme
diff --git a/Documentation/translations/zh_CN/HOWTO b/Documentation/translations/zh_CN/HOWTO
index 11be075..5f6d09e 100644
--- a/Documentation/translations/zh_CN/HOWTO
+++ b/Documentation/translations/zh_CN/HOWTO
@@ -149,9 +149,7 @@
 核源码的主目录中使用以下不同命令将会分别生成PDF、Postscript、HTML和手册
 页等不同格式的文档:
     make pdfdocs
-    make psdocs
     make htmldocs
-    make mandocs
 
 
 如何成为内核开发者
diff --git a/MAINTAINERS b/MAINTAINERS
index 1c3feff..f1c9195 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4359,6 +4359,12 @@
 F:	drivers/gpu/drm/qxl/
 F:	include/uapi/drm/qxl_drm.h
 
+DRM DRIVER FOR PERVASIVE DISPLAYS REPAPER PANELS
+M:	Noralf Trønnes <noralf@tronnes.org>
+S:	Maintained
+F:	drivers/gpu/drm/tinydrm/repaper.c
+F:	Documentation/devicetree/bindings/display/repaper.txt
+
 DRM DRIVER FOR RAGE 128 VIDEO CARDS
 S:	Orphan / Obsolete
 F:	drivers/gpu/drm/r128/
@@ -4374,6 +4380,12 @@
 F:	drivers/gpu/drm/sis/
 F:	include/uapi/drm/sis_drm.h
 
+DRM DRIVER FOR SITRONIX ST7586 PANELS
+M:	David Lechner <david@lechnology.com>
+S:	Maintained
+F:	drivers/gpu/drm/tinydrm/st7586.c
+F:	Documentation/devicetree/bindings/display/st7586.txt
+
 DRM DRIVER FOR TDFX VIDEO CARDS
 S:	Orphan / Obsolete
 F:	drivers/gpu/drm/tdfx/
@@ -4622,6 +4634,14 @@
 F:	include/drm/drm_panel.h
 F:	Documentation/devicetree/bindings/display/panel/
 
+DRM TINYDRM DRIVERS
+M:	Noralf Trønnes <noralf@tronnes.org>
+W:	https://github.com/notro/tinydrm/wiki/Development
+T:	git git://anongit.freedesktop.org/drm/drm-misc
+S:	Maintained
+F:	drivers/gpu/drm/tinydrm/
+F:	include/drm/tinydrm/
+
 DSBR100 USB FM RADIO DRIVER
 M:	Alexey Klimov <klimov.linux@gmail.com>
 L:	linux-media@vger.kernel.org
@@ -6744,8 +6764,9 @@
 F:	drivers/scsi/isci/
 
 INTEL DRM DRIVERS (excluding Poulsbo, Moorestown and derivative chipsets)
-M:	Daniel Vetter <daniel.vetter@intel.com>
 M:	Jani Nikula <jani.nikula@linux.intel.com>
+M:	Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
+M:	Rodrigo Vivi <rodrigo.vivi@intel.com>
 L:	intel-gfx@lists.freedesktop.org
 W:	https://01.org/linuxgraphics/
 B:	https://01.org/linuxgraphics/documentation/how-report-bugs
@@ -8628,7 +8649,7 @@
 M:	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
 L:	linux-kernel@vger.kernel.org
 S:	Supported
-F:	kernel/membarrier.c
+F:	kernel/sched/membarrier.c
 F:	include/uapi/linux/membarrier.h
 
 MEMORY MANAGEMENT
diff --git a/Makefile b/Makefile
index 8db6be7..ab067d5 100644
--- a/Makefile
+++ b/Makefile
@@ -1,7 +1,7 @@
 VERSION = 4
 PATCHLEVEL = 13
 SUBLEVEL = 0
-EXTRAVERSION = -rc7
+EXTRAVERSION =
 NAME = Fearless Coyote
 
 # *DOCUMENTATION*
@@ -1468,7 +1468,7 @@
 
 # Documentation targets
 # ---------------------------------------------------------------------------
-DOC_TARGETS := xmldocs sgmldocs psdocs latexdocs pdfdocs htmldocs mandocs installmandocs epubdocs cleandocs linkcheckdocs
+DOC_TARGETS := xmldocs latexdocs pdfdocs htmldocs epubdocs cleandocs linkcheckdocs
 PHONY += $(DOC_TARGETS)
 $(DOC_TARGETS): scripts_basic FORCE
 	$(Q)$(MAKE) $(build)=Documentation $@
diff --git a/arch/alpha/include/asm/io.h b/arch/alpha/include/asm/io.h
index ff40491..4d61d2a 100644
--- a/arch/alpha/include/asm/io.h
+++ b/arch/alpha/include/asm/io.h
@@ -299,6 +299,7 @@ static inline void __iomem * ioremap_nocache(unsigned long offset,
 	return ioremap(offset, size);
 }
 
+#define ioremap_wc ioremap_nocache
 #define ioremap_uc ioremap_nocache
 
 static inline void iounmap(volatile void __iomem *addr)
diff --git a/arch/alpha/include/asm/spinlock.h b/arch/alpha/include/asm/spinlock.h
index a40b9fc..718ac0b 100644
--- a/arch/alpha/include/asm/spinlock.h
+++ b/arch/alpha/include/asm/spinlock.h
@@ -16,11 +16,6 @@
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 #define arch_spin_is_locked(x)	((x)->lock != 0)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->lock, !VAL);
-}
-
 static inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 {
         return lock.lock == 0;
diff --git a/arch/alpha/include/asm/types.h b/arch/alpha/include/asm/types.h
index 4cb4b6d..0bc66e1 100644
--- a/arch/alpha/include/asm/types.h
+++ b/arch/alpha/include/asm/types.h
@@ -1,6 +1,6 @@
 #ifndef _ALPHA_TYPES_H
 #define _ALPHA_TYPES_H
 
-#include <asm-generic/int-ll64.h>
+#include <uapi/asm/types.h>
 
 #endif /* _ALPHA_TYPES_H */
diff --git a/arch/alpha/include/asm/unistd.h b/arch/alpha/include/asm/unistd.h
index b37153e..db7fc0f 100644
--- a/arch/alpha/include/asm/unistd.h
+++ b/arch/alpha/include/asm/unistd.h
@@ -3,7 +3,7 @@
 
 #include <uapi/asm/unistd.h>
 
-#define NR_SYSCALLS			514
+#define NR_SYSCALLS			523
 
 #define __ARCH_WANT_OLD_READDIR
 #define __ARCH_WANT_STAT64
diff --git a/arch/alpha/include/uapi/asm/types.h b/arch/alpha/include/uapi/asm/types.h
index 9fd3cd4..8d1024d 100644
--- a/arch/alpha/include/uapi/asm/types.h
+++ b/arch/alpha/include/uapi/asm/types.h
@@ -9,8 +9,18 @@
  * need to be careful to avoid a name clashes.
  */
 
-#ifndef __KERNEL__
+/*
+ * This is here because we used to use l64 for alpha
+ * and we don't want to impact user mode with our change to ll64
+ * in the kernel.
+ *
+ * However, some user programs are fine with this.  They can
+ * flag __SANE_USERSPACE_TYPES__ to get int-ll64.h here.
+ */
+#if !defined(__SANE_USERSPACE_TYPES__) && !defined(__KERNEL__)
 #include <asm-generic/int-l64.h>
+#else
+#include <asm-generic/int-ll64.h>
 #endif
 
 #endif /* _UAPI_ALPHA_TYPES_H */
diff --git a/arch/alpha/include/uapi/asm/unistd.h b/arch/alpha/include/uapi/asm/unistd.h
index aa33bf5..a2945fe 100644
--- a/arch/alpha/include/uapi/asm/unistd.h
+++ b/arch/alpha/include/uapi/asm/unistd.h
@@ -475,5 +475,19 @@
 #define __NR_getrandom			511
 #define __NR_memfd_create		512
 #define __NR_execveat			513
+#define __NR_seccomp			514
+#define __NR_bpf			515
+#define __NR_userfaultfd		516
+#define __NR_membarrier			517
+#define __NR_mlock2			518
+#define __NR_copy_file_range		519
+#define __NR_preadv2			520
+#define __NR_pwritev2			521
+#define __NR_statx			522
+
+/* Alpha doesn't have protection keys. */
+#define __IGNORE_pkey_mprotect
+#define __IGNORE_pkey_alloc
+#define __IGNORE_pkey_free
 
 #endif /* _UAPI_ALPHA_UNISTD_H */
diff --git a/arch/alpha/kernel/core_marvel.c b/arch/alpha/kernel/core_marvel.c
index d5f0580..03ff832 100644
--- a/arch/alpha/kernel/core_marvel.c
+++ b/arch/alpha/kernel/core_marvel.c
@@ -351,7 +351,7 @@ marvel_init_io7(struct io7 *io7)
 	}
 }
 
-void
+void __init
 marvel_io7_present(gct6_node *node)
 {
 	int pe;
@@ -369,6 +369,7 @@ marvel_io7_present(gct6_node *node)
 static void __init
 marvel_find_console_vga_hose(void)
 {
+#ifdef CONFIG_VGA_HOSE
 	u64 *pu64 = (u64 *)((u64)hwrpb + hwrpb->ctbt_offset);
 
 	if (pu64[7] == 3) {	/* TERM_TYPE == graphics */
@@ -402,9 +403,10 @@ marvel_find_console_vga_hose(void)
 			pci_vga_hose = hose;
 		}
 	}
+#endif
 }
 
-gct6_search_struct gct_wanted_node_list[] = {
+gct6_search_struct gct_wanted_node_list[] __initdata = {
 	{ GCT_TYPE_HOSE, GCT_SUBTYPE_IO_PORT_MODULE, marvel_io7_present },
 	{ 0, 0, NULL }
 };
diff --git a/arch/alpha/kernel/core_titan.c b/arch/alpha/kernel/core_titan.c
index 219bf27..b532d92 100644
--- a/arch/alpha/kernel/core_titan.c
+++ b/arch/alpha/kernel/core_titan.c
@@ -461,6 +461,7 @@ titan_ioremap(unsigned long addr, unsigned long size)
 	unsigned long *ptes;
 	unsigned long pfn;
 
+#ifdef CONFIG_VGA_HOSE
 	/*
 	 * Adjust the address and hose, if necessary.
 	 */ 
@@ -468,6 +469,7 @@ titan_ioremap(unsigned long addr, unsigned long size)
 		h = pci_vga_hose->index;
 		addr += pci_vga_hose->mem_space->start;
 	}
+#endif
 
 	/*
 	 * Find the hose.
diff --git a/arch/alpha/kernel/module.c b/arch/alpha/kernel/module.c
index 936bc8f..47632fa 100644
--- a/arch/alpha/kernel/module.c
+++ b/arch/alpha/kernel/module.c
@@ -181,6 +181,9 @@ apply_relocate_add(Elf64_Shdr *sechdrs, const char *strtab,
 		switch (r_type) {
 		case R_ALPHA_NONE:
 			break;
+		case R_ALPHA_REFLONG:
+			*(u32 *)location = value;
+			break;
 		case R_ALPHA_REFQUAD:
 			/* BUG() can produce misaligned relocations. */
 			((u32 *)location)[0] = value;
diff --git a/arch/alpha/kernel/smp.c b/arch/alpha/kernel/smp.c
index 9fc5604..f6726a7 100644
--- a/arch/alpha/kernel/smp.c
+++ b/arch/alpha/kernel/smp.c
@@ -115,7 +115,7 @@ wait_boot_cpu_to_stop(int cpuid)
 /*
  * Where secondaries begin a life of C.
  */
-void
+void __init
 smp_callin(void)
 {
 	int cpuid = hard_smp_processor_id();
diff --git a/arch/alpha/kernel/systbls.S b/arch/alpha/kernel/systbls.S
index 9b62e3f..5b4514a 100644
--- a/arch/alpha/kernel/systbls.S
+++ b/arch/alpha/kernel/systbls.S
@@ -532,6 +532,15 @@
 	.quad sys_getrandom
 	.quad sys_memfd_create
 	.quad sys_execveat
+	.quad sys_seccomp
+	.quad sys_bpf				/* 515 */
+	.quad sys_userfaultfd
+	.quad sys_membarrier
+	.quad sys_mlock2
+	.quad sys_copy_file_range
+	.quad sys_preadv2			/* 520 */
+	.quad sys_pwritev2
+	.quad sys_statx
 
 	.size sys_call_table, . - sys_call_table
 	.type sys_call_table, @object
diff --git a/arch/alpha/lib/Makefile b/arch/alpha/lib/Makefile
index 7083434..a808159 100644
--- a/arch/alpha/lib/Makefile
+++ b/arch/alpha/lib/Makefile
@@ -20,12 +20,8 @@
 	checksum.o \
 	csum_partial_copy.o \
 	$(ev67-y)strlen.o \
-	$(ev67-y)strcat.o \
-	strcpy.o \
-	$(ev67-y)strncat.o \
-	strncpy.o \
-	$(ev6-y)stxcpy.o \
-	$(ev6-y)stxncpy.o \
+	stycpy.o \
+	styncpy.o \
 	$(ev67-y)strchr.o \
 	$(ev67-y)strrchr.o \
 	$(ev6-y)memchr.o \
@@ -49,3 +45,17 @@
 $(addprefix $(obj)/,__divqu.o __remqu.o __divlu.o __remlu.o): \
 						$(src)/$(ev6-y)divide.S FORCE
 	$(call if_changed_rule,as_o_S)
+
+# There are direct branches between {str*cpy,str*cat} and stx*cpy.
+# Ensure the branches are within range by merging these objects.
+
+LDFLAGS_stycpy.o := -r
+LDFLAGS_styncpy.o := -r
+
+$(obj)/stycpy.o: $(obj)/strcpy.o $(obj)/$(ev67-y)strcat.o \
+		 $(obj)/$(ev6-y)stxcpy.o FORCE
+	$(call if_changed,ld)
+
+$(obj)/styncpy.o: $(obj)/strncpy.o $(obj)/$(ev67-y)strncat.o \
+		 $(obj)/$(ev6-y)stxncpy.o FORCE
+	$(call if_changed,ld)
diff --git a/arch/alpha/lib/copy_user.S b/arch/alpha/lib/copy_user.S
index 159f1b7..c277a1a 100644
--- a/arch/alpha/lib/copy_user.S
+++ b/arch/alpha/lib/copy_user.S
@@ -34,7 +34,7 @@
 	.ent __copy_user
 __copy_user:
 	.prologue 0
-	and $18,$18,$0
+	mov $18,$0
 	and $16,7,$3
 	beq $0,$35
 	beq $3,$36
diff --git a/arch/alpha/lib/ev6-copy_user.S b/arch/alpha/lib/ev6-copy_user.S
index 35e6710..954ca03 100644
--- a/arch/alpha/lib/ev6-copy_user.S
+++ b/arch/alpha/lib/ev6-copy_user.S
@@ -45,9 +45,10 @@
 				# Pipeline info: Slotting & Comments
 __copy_user:
 	.prologue 0
-	andq $18, $18, $0
-	subq $18, 32, $1	# .. E  .. ..	: Is this going to be a small copy?
-	beq $0, $zerolength	# U  .. .. ..	: U L U L
+	mov $18, $0		# .. .. .. E
+	subq $18, 32, $1	# .. .. E. ..	: Is this going to be a small copy?
+	nop			# .. E  .. ..
+	beq $18, $zerolength	# U  .. .. ..	: U L U L
 
 	and $16,7,$3		# .. .. .. E	: is leading dest misalignment
 	ble $1, $onebyteloop	# .. .. U  ..	: 1st branch : small amount of data
diff --git a/arch/arc/include/asm/spinlock.h b/arch/arc/include/asm/spinlock.h
index 233d5ff..a325e6a 100644
--- a/arch/arc/include/asm/spinlock.h
+++ b/arch/arc/include/asm/spinlock.h
@@ -16,11 +16,6 @@
 #define arch_spin_is_locked(x)	((x)->slock != __ARCH_SPIN_LOCK_UNLOCKED__)
 #define arch_spin_lock_flags(lock, flags)	arch_spin_lock(lock)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->slock, !VAL);
-}
-
 #ifdef CONFIG_ARC_HAS_LLSC
 
 static inline void arch_spin_lock(arch_spinlock_t *lock)
diff --git a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
index 6713d0f..b1502df 100644
--- a/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
+++ b/arch/arm/boot/dts/sun8i-h2-plus-orangepi-zero.dts
@@ -56,8 +56,6 @@
 
 	aliases {
 		serial0 = &uart0;
-		/* ethernet0 is the H3 emac, defined in sun8i-h3.dtsi */
-		ethernet0 = &emac;
 		ethernet1 = &xr819;
 	};
 
@@ -104,13 +102,6 @@
 	status = "okay";
 };
 
-&emac {
-	phy-handle = <&int_mii_phy>;
-	phy-mode = "mii";
-	allwinner,leds-active-low;
-	status = "okay";
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins_a>;
diff --git a/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts b/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
index d756ff8..a337af1 100644
--- a/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
+++ b/arch/arm/boot/dts/sun8i-h3-bananapi-m2-plus.dts
@@ -52,7 +52,6 @@
 	compatible = "sinovoip,bpi-m2-plus", "allwinner,sun8i-h3";
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 		serial1 = &uart1;
 	};
@@ -115,30 +114,12 @@
 	status = "okay";
 };
 
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&emac_rgmii_pins>;
-	phy-supply = <&reg_gmac_3v3>;
-	phy-handle = <&ext_rgmii_phy>;
-	phy-mode = "rgmii";
-
-	allwinner,leds-active-low;
-	status = "okay";
-};
-
 &ir {
 	pinctrl-names = "default";
 	pinctrl-0 = <&ir_pins_a>;
 	status = "okay";
 };
 
-&mdio {
-	ext_rgmii_phy: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <0>;
-	};
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins_a>, <&mmc0_cd_pin>;
diff --git a/arch/arm/boot/dts/sun8i-h3-nanopi-neo.dts b/arch/arm/boot/dts/sun8i-h3-nanopi-neo.dts
index 78f6c24..8d2cc6e 100644
--- a/arch/arm/boot/dts/sun8i-h3-nanopi-neo.dts
+++ b/arch/arm/boot/dts/sun8i-h3-nanopi-neo.dts
@@ -46,10 +46,3 @@
 	model = "FriendlyARM NanoPi NEO";
 	compatible = "friendlyarm,nanopi-neo", "allwinner,sun8i-h3";
 };
-
-&emac {
-	phy-handle = <&int_mii_phy>;
-	phy-mode = "mii";
-	allwinner,leds-active-low;
-	status = "okay";
-};
diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-2.dts b/arch/arm/boot/dts/sun8i-h3-orangepi-2.dts
index 17cdeae..8ff71b1 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-2.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-2.dts
@@ -54,7 +54,6 @@
 	aliases {
 		serial0 = &uart0;
 		/* ethernet0 is the H3 emac, defined in sun8i-h3.dtsi */
-		ethernet0 = &emac;
 		ethernet1 = &rtl8189;
 	};
 
@@ -118,13 +117,6 @@
 	status = "okay";
 };
 
-&emac {
-	phy-handle = <&int_mii_phy>;
-	phy-mode = "mii";
-	allwinner,leds-active-low;
-	status = "okay";
-};
-
 &ir {
 	pinctrl-names = "default";
 	pinctrl-0 = <&ir_pins_a>;
diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts b/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
index 6880268..5fea430 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-one.dts
@@ -52,7 +52,6 @@
 	compatible = "xunlong,orangepi-one", "allwinner,sun8i-h3";
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 	};
 
@@ -98,13 +97,6 @@
 	status = "okay";
 };
 
-&emac {
-	phy-handle = <&int_mii_phy>;
-	phy-mode = "mii";
-	allwinner,leds-active-low;
-	status = "okay";
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins_a>, <&mmc0_cd_pin>;
diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts b/arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts
index a10281b..8b93f5c 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-pc-plus.dts
@@ -53,11 +53,6 @@
 	};
 };
 
-&emac {
-	/* LEDs changed to active high on the plus */
-	/delete-property/ allwinner,leds-active-low;
-};
-
 &mmc1 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc1_pins_a>;
diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts b/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
index 998b60f..1a044b1 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-pc.dts
@@ -52,7 +52,6 @@
 	compatible = "xunlong,orangepi-pc", "allwinner,sun8i-h3";
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 	};
 
@@ -114,13 +113,6 @@
 	status = "okay";
 };
 
-&emac {
-	phy-handle = <&int_mii_phy>;
-	phy-mode = "mii";
-	allwinner,leds-active-low;
-	status = "okay";
-};
-
 &ir {
 	pinctrl-names = "default";
 	pinctrl-0 = <&ir_pins_a>;
diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts b/arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts
index 331ed68..828ae7a5 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-plus.dts
@@ -47,10 +47,6 @@
 	model = "Xunlong Orange Pi Plus / Plus 2";
 	compatible = "xunlong,orangepi-plus", "allwinner,sun8i-h3";
 
-	aliases {
-		ethernet0 = &emac;
-	};
-
 	reg_gmac_3v3: gmac-3v3 {
 		compatible = "regulator-fixed";
 		regulator-name = "gmac-3v3";
@@ -78,24 +74,6 @@
 	status = "okay";
 };
 
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&emac_rgmii_pins>;
-	phy-supply = <&reg_gmac_3v3>;
-	phy-handle = <&ext_rgmii_phy>;
-	phy-mode = "rgmii";
-
-	allwinner,leds-active-low;
-	status = "okay";
-};
-
-&mdio {
-	ext_rgmii_phy: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <0>;
-	};
-};
-
 &mmc2 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc2_8bit_pins>;
diff --git a/arch/arm/boot/dts/sun8i-h3-orangepi-plus2e.dts b/arch/arm/boot/dts/sun8i-h3-orangepi-plus2e.dts
index 80026f3..97920b1 100644
--- a/arch/arm/boot/dts/sun8i-h3-orangepi-plus2e.dts
+++ b/arch/arm/boot/dts/sun8i-h3-orangepi-plus2e.dts
@@ -61,19 +61,3 @@
 		gpio = <&pio 3 6 GPIO_ACTIVE_HIGH>; /* PD6 */
 	};
 };
-
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&emac_rgmii_pins>;
-	phy-supply = <&reg_gmac_3v3>;
-	phy-handle = <&ext_rgmii_phy>;
-	phy-mode = "rgmii";
-	status = "okay";
-};
-
-&mdio {
-	ext_rgmii_phy: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <1>;
-	};
-};
diff --git a/arch/arm/boot/dts/sunxi-h3-h5.dtsi b/arch/arm/boot/dts/sunxi-h3-h5.dtsi
index d38282b..11240a8 100644
--- a/arch/arm/boot/dts/sunxi-h3-h5.dtsi
+++ b/arch/arm/boot/dts/sunxi-h3-h5.dtsi
@@ -391,32 +391,6 @@
 			clocks = <&osc24M>;
 		};
 
-		emac: ethernet@1c30000 {
-			compatible = "allwinner,sun8i-h3-emac";
-			syscon = <&syscon>;
-			reg = <0x01c30000 0x10000>;
-			interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_HIGH>;
-			interrupt-names = "macirq";
-			resets = <&ccu RST_BUS_EMAC>;
-			reset-names = "stmmaceth";
-			clocks = <&ccu CLK_BUS_EMAC>;
-			clock-names = "stmmaceth";
-			#address-cells = <1>;
-			#size-cells = <0>;
-			status = "disabled";
-
-			mdio: mdio {
-				#address-cells = <1>;
-				#size-cells = <0>;
-				int_mii_phy: ethernet-phy@1 {
-					compatible = "ethernet-phy-ieee802.3-c22";
-					reg = <1>;
-					clocks = <&ccu CLK_BUS_EPHY>;
-					resets = <&ccu RST_BUS_EPHY>;
-				};
-			};
-		};
-
 		spi0: spi@01c68000 {
 			compatible = "allwinner,sun8i-h3-spi";
 			reg = <0x01c68000 0x1000>;
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 127e2dd..4a879f6 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -225,12 +225,6 @@ int kvm_arm_copy_reg_indices(struct kvm_vcpu *vcpu, u64 __user *indices);
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end);
 int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
 
-/* We do not have shadow page tables, hence the empty hooks */
-static inline void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
-							 unsigned long address)
-{
-}
-
 struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu __percpu **kvm_get_running_vcpus(void);
 void kvm_arm_halt_guest(struct kvm *kvm);
diff --git a/arch/arm/include/asm/spinlock.h b/arch/arm/include/asm/spinlock.h
index 4bec454..c030143 100644
--- a/arch/arm/include/asm/spinlock.h
+++ b/arch/arm/include/asm/spinlock.h
@@ -52,22 +52,6 @@ static inline void dsb_sev(void)
  * memory.
  */
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	u16 owner = READ_ONCE(lock->tickets.owner);
-
-	for (;;) {
-		arch_spinlock_t tmp = READ_ONCE(*lock);
-
-		if (tmp.tickets.owner == tmp.tickets.next ||
-		    tmp.tickets.owner != owner)
-			break;
-
-		wfe();
-	}
-	smp_acquire__after_ctrl_dep();
-}
-
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 
 static inline void arch_spin_lock(arch_spinlock_t *lock)
diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
index 779fb1f..b3b3b3a 100644
--- a/arch/arm/mach-omap2/Makefile
+++ b/arch/arm/mach-omap2/Makefile
@@ -8,7 +8,7 @@
 # Common support
 obj-y := id.o io.o control.o devices.o fb.o timer.o pm.o \
 	 common.o dma.o wd_timer.o display.o i2c.o hdq1w.o omap_hwmod.o \
-	 omap_device.o omap-headsmp.o sram.o drm.o
+	 omap_device.o omap-headsmp.o sram.o
 
 hwmod-common				= omap_hwmod.o omap_hwmod_reset.o \
 					  omap_hwmod_common_data.o
diff --git a/arch/arm/mach-omap2/board-generic.c b/arch/arm/mach-omap2/board-generic.c
index b1e661b..583fc39 100644
--- a/arch/arm/mach-omap2/board-generic.c
+++ b/arch/arm/mach-omap2/board-generic.c
@@ -33,6 +33,7 @@ static void __init __maybe_unused omap_generic_init(void)
 	pdata_quirks_init(omap_dt_match_table);
 
 	omapdss_init_of();
+	omap_soc_device_init();
 }
 
 #ifdef CONFIG_SOC_OMAP2420
diff --git a/arch/arm/mach-omap2/display.c b/arch/arm/mach-omap2/display.c
index 8fa01c0..b3f6eb5 100644
--- a/arch/arm/mach-omap2/display.c
+++ b/arch/arm/mach-omap2/display.c
@@ -66,6 +66,7 @@
  */
 #define FRAMEDONE_IRQ_TIMEOUT		100
 
+#if defined(CONFIG_FB_OMAP2)
 static struct platform_device omap_display_device = {
 	.name          = "omapdss",
 	.id            = -1,
@@ -163,6 +164,65 @@ static enum omapdss_version __init omap_display_get_version(void)
 		return OMAPDSS_VER_UNKNOWN;
 }
 
+static int __init omapdss_init_fbdev(void)
+{
+	static struct omap_dss_board_info board_data = {
+		.dsi_enable_pads = omap_dsi_enable_pads,
+		.dsi_disable_pads = omap_dsi_disable_pads,
+		.set_min_bus_tput = omap_dss_set_min_bus_tput,
+	};
+	struct device_node *node;
+	int r;
+
+	board_data.version = omap_display_get_version();
+	if (board_data.version == OMAPDSS_VER_UNKNOWN) {
+		pr_err("DSS not supported on this SoC\n");
+		return -ENODEV;
+	}
+
+	omap_display_device.dev.platform_data = &board_data;
+
+	r = platform_device_register(&omap_display_device);
+	if (r < 0) {
+		pr_err("Unable to register omapdss device\n");
+		return r;
+	}
+
+	/* create vrfb device */
+	r = omap_init_vrfb();
+	if (r < 0) {
+		pr_err("Unable to register omapvrfb device\n");
+		return r;
+	}
+
+	/* create FB device */
+	r = omap_init_fb();
+	if (r < 0) {
+		pr_err("Unable to register omapfb device\n");
+		return r;
+	}
+
+	/* create V4L2 display device */
+	r = omap_init_vout();
+	if (r < 0) {
+		pr_err("Unable to register omap_vout device\n");
+		return r;
+	}
+
+	/* add DSI info for omap4 */
+	node = of_find_node_by_name(NULL, "omap4_padconf_global");
+	if (node)
+		omap4_dsi_mux_syscon = syscon_node_to_regmap(node);
+
+	return 0;
+}
+#else
+static inline int omapdss_init_fbdev(void)
+{
+	return 0;
+}
+#endif /* CONFIG_FB_OMAP2 */
+
 static void dispc_disable_outputs(void)
 {
 	u32 v, irq_mask = 0;
@@ -335,16 +395,9 @@ static struct device_node * __init omapdss_find_dss_of_node(void)
 int __init omapdss_init_of(void)
 {
 	int r;
-	enum omapdss_version ver;
 	struct device_node *node;
 	struct platform_device *pdev;
 
-	static struct omap_dss_board_info board_data = {
-		.dsi_enable_pads = omap_dsi_enable_pads,
-		.dsi_disable_pads = omap_dsi_disable_pads,
-		.set_min_bus_tput = omap_dss_set_min_bus_tput,
-	};
-
 	/* only create dss helper devices if dss is enabled in the .dts */
 
 	node = omapdss_find_dss_of_node();
@@ -354,13 +407,6 @@ int __init omapdss_init_of(void)
 	if (!of_device_is_available(node))
 		return 0;
 
-	ver = omap_display_get_version();
-
-	if (ver == OMAPDSS_VER_UNKNOWN) {
-		pr_err("DSS not supported on this SoC\n");
-		return -ENODEV;
-	}
-
 	pdev = of_find_device_by_node(node);
 
 	if (!pdev) {
@@ -374,48 +420,5 @@ int __init omapdss_init_of(void)
 		return r;
 	}
 
-	board_data.version = ver;
-
-	omap_display_device.dev.platform_data = &board_data;
-
-	r = platform_device_register(&omap_display_device);
-	if (r < 0) {
-		pr_err("Unable to register omapdss device\n");
-		return r;
-	}
-
-	/* create DRM device */
-	r = omap_init_drm();
-	if (r < 0) {
-		pr_err("Unable to register omapdrm device\n");
-		return r;
-	}
-
-	/* create vrfb device */
-	r = omap_init_vrfb();
-	if (r < 0) {
-		pr_err("Unable to register omapvrfb device\n");
-		return r;
-	}
-
-	/* create FB device */
-	r = omap_init_fb();
-	if (r < 0) {
-		pr_err("Unable to register omapfb device\n");
-		return r;
-	}
-
-	/* create V4L2 display device */
-	r = omap_init_vout();
-	if (r < 0) {
-		pr_err("Unable to register omap_vout device\n");
-		return r;
-	}
-
-	/* add DSI info for omap4 */
-	node = of_find_node_by_name(NULL, "omap4_padconf_global");
-	if (node)
-		omap4_dsi_mux_syscon = syscon_node_to_regmap(node);
-
-	return 0;
+	return omapdss_init_fbdev();
 }
diff --git a/arch/arm/mach-omap2/display.h b/arch/arm/mach-omap2/display.h
index 9a39646..42ec2e9 100644
--- a/arch/arm/mach-omap2/display.h
+++ b/arch/arm/mach-omap2/display.h
@@ -26,7 +26,6 @@ struct omap_dss_dispc_dev_attr {
 	bool	has_framedonetv_irq;
 };
 
-int omap_init_drm(void);
 int omap_init_vrfb(void);
 int omap_init_fb(void);
 int omap_init_vout(void);
diff --git a/arch/arm/mach-omap2/drm.c b/arch/arm/mach-omap2/drm.c
deleted file mode 100644
index 44fef96..0000000
--- a/arch/arm/mach-omap2/drm.c
+++ /dev/null
@@ -1,53 +0,0 @@
-/*
- * DRM/KMS device registration for TI OMAP platforms
- *
- * Copyright (C) 2012 Texas Instruments
- * Author: Rob Clark <rob.clark@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License version 2 as published by
- * the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/mm.h>
-#include <linux/init.h>
-#include <linux/platform_device.h>
-#include <linux/dma-mapping.h>
-#include <linux/platform_data/omap_drm.h>
-
-#include "soc.h"
-#include "display.h"
-
-#if IS_ENABLED(CONFIG_DRM_OMAP)
-
-static struct omap_drm_platform_data platform_data;
-
-static struct platform_device omap_drm_device = {
-	.dev = {
-		.coherent_dma_mask = DMA_BIT_MASK(32),
-		.platform_data = &platform_data,
-	},
-	.name = "omapdrm",
-	.id = 0,
-};
-
-int __init omap_init_drm(void)
-{
-	platform_data.omaprev = GET_OMAP_TYPE;
-
-	return platform_device_register(&omap_drm_device);
-
-}
-#else
-int __init omap_init_drm(void) { return 0; }
-#endif
diff --git a/arch/arm/mach-omap2/io.c b/arch/arm/mach-omap2/io.c
index 1cd20e4..cb5d731 100644
--- a/arch/arm/mach-omap2/io.c
+++ b/arch/arm/mach-omap2/io.c
@@ -428,7 +428,6 @@ static void __init __maybe_unused omap_hwmod_init_postsetup(void)
 static void __init __maybe_unused omap_common_late_init(void)
 {
 	omap2_common_pm_late_init();
-	omap_soc_device_init();
 }
 
 #ifdef CONFIG_SOC_OMAP2420
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
index ba2fde2..6872135 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
@@ -51,7 +51,6 @@
 	compatible = "sinovoip,bananapi-m64", "allwinner,sun50i-a64";
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 		serial1 = &uart1;
 	};
@@ -68,14 +67,6 @@
 	};
 };
 
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&rgmii_pins>;
-	phy-mode = "rgmii";
-	phy-handle = <&ext_rgmii_phy>;
-	status = "okay";
-};
-
 &i2c1 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&i2c1_pins>;
@@ -86,13 +77,6 @@
 	bias-pull-up;
 };
 
-&mdio {
-	ext_rgmii_phy: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <1>;
-	};
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins>;
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts
index 24f1aac..f82ccf3 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-plus.dts
@@ -48,18 +48,3 @@
 
 	/* TODO: Camera, touchscreen, etc. */
 };
-
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&rgmii_pins>;
-	phy-mode = "rgmii";
-	phy-handle = <&ext_rgmii_phy>;
-	status = "okay";
-};
-
-&mdio {
-	ext_rgmii_phy: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <1>;
-	};
-};
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
index 827168b..7c533b6 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
@@ -51,7 +51,6 @@
 	compatible = "pine64,pine64", "allwinner,sun50i-a64";
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 		serial1 = &uart1;
 		serial2 = &uart2;
@@ -79,15 +78,6 @@
 	status = "okay";
 };
 
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&rmii_pins>;
-	phy-mode = "rmii";
-	phy-handle = <&ext_rmii_phy1>;
-	status = "okay";
-
-};
-
 &i2c1 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&i2c1_pins>;
@@ -98,13 +88,6 @@
 	bias-pull-up;
 };
 
-&mdio {
-	ext_rmii_phy1: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <1>;
-	};
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins>;
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
index 216e3a5..d891a1a 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
@@ -53,7 +53,6 @@
 		     "allwinner,sun50i-a64";
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 	};
 
@@ -77,21 +76,6 @@
 	status = "okay";
 };
 
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&rgmii_pins>;
-	phy-mode = "rgmii";
-	phy-handle = <&ext_rgmii_phy>;
-	status = "okay";
-};
-
-&mdio {
-	ext_rgmii_phy: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <1>;
-	};
-};
-
 &mmc2 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc2_pins>;
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
index bd0f33b..68aadc9 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
@@ -449,26 +449,6 @@
 			#size-cells = <0>;
 		};
 
-		emac: ethernet@1c30000 {
-			compatible = "allwinner,sun50i-a64-emac";
-			syscon = <&syscon>;
-			reg = <0x01c30000 0x10000>;
-			interrupts = <GIC_SPI 82 IRQ_TYPE_LEVEL_HIGH>;
-			interrupt-names = "macirq";
-			resets = <&ccu RST_BUS_EMAC>;
-			reset-names = "stmmaceth";
-			clocks = <&ccu CLK_BUS_EMAC>;
-			clock-names = "stmmaceth";
-			status = "disabled";
-			#address-cells = <1>;
-			#size-cells = <0>;
-
-			mdio: mdio {
-				#address-cells = <1>;
-				#size-cells = <0>;
-			};
-		};
-
 		gic: interrupt-controller@1c81000 {
 			compatible = "arm,gic-400";
 			reg = <0x01c81000 0x1000>,
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h5-nanopi-neo2.dts b/arch/arm64/boot/dts/allwinner/sun50i-h5-nanopi-neo2.dts
index 9689087..1c2387b 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h5-nanopi-neo2.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h5-nanopi-neo2.dts
@@ -50,7 +50,6 @@
 	compatible = "friendlyarm,nanopi-neo2", "allwinner,sun50i-h5";
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 	};
 
@@ -109,22 +108,6 @@
 	status = "okay";
 };
 
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&emac_rgmii_pins>;
-	phy-supply = <&reg_gmac_3v3>;
-	phy-handle = <&ext_rgmii_phy>;
-	phy-mode = "rgmii";
-	status = "okay";
-};
-
-&mdio {
-	ext_rgmii_phy: ethernet-phy@7 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <7>;
-	};
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins_a>, <&mmc0_cd_pin>;
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-pc2.dts b/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-pc2.dts
index a8296fe..4f77c84 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-pc2.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-pc2.dts
@@ -59,7 +59,6 @@
 	};
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 	};
 
@@ -137,28 +136,12 @@
 	status = "okay";
 };
 
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&emac_rgmii_pins>;
-	phy-supply = <&reg_gmac_3v3>;
-	phy-handle = <&ext_rgmii_phy>;
-	phy-mode = "rgmii";
-	status = "okay";
-};
-
 &ir {
 	pinctrl-names = "default";
 	pinctrl-0 = <&ir_pins_a>;
 	status = "okay";
 };
 
-&mdio {
-	ext_rgmii_phy: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <1>;
-	};
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins_a>, <&mmc0_cd_pin>;
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-prime.dts b/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-prime.dts
index d906b30..6be0687 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-prime.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-prime.dts
@@ -54,7 +54,6 @@
 	compatible = "xunlong,orangepi-prime", "allwinner,sun50i-h5";
 
 	aliases {
-		ethernet0 = &emac;
 		serial0 = &uart0;
 	};
 
@@ -144,28 +143,12 @@
 	status = "okay";
 };
 
-&emac {
-	pinctrl-names = "default";
-	pinctrl-0 = <&emac_rgmii_pins>;
-	phy-supply = <&reg_gmac_3v3>;
-	phy-handle = <&ext_rgmii_phy>;
-	phy-mode = "rgmii";
-	status = "okay";
-};
-
 &ir {
 	pinctrl-names = "default";
 	pinctrl-0 = <&ir_pins_a>;
 	status = "okay";
 };
 
-&mdio {
-	ext_rgmii_phy: ethernet-phy@1 {
-		compatible = "ethernet-phy-ieee802.3-c22";
-		reg = <1>;
-	};
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins_a>, <&mmc0_cd_pin>;
diff --git a/arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi b/arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi
index e2b0da2..105b293 100644
--- a/arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi
+++ b/arch/arm64/boot/dts/exynos/exynos5433-tm2-common.dtsi
@@ -280,9 +280,6 @@
 
 &decon {
 	status = "okay";
-
-	i80-if-timings {
-	};
 };
 
 &decon_tv {
@@ -1116,9 +1113,6 @@
 
 &mic {
 	status = "okay";
-
-	i80-if-timings {
-	};
 };
 
 &pmu_system_controller {
diff --git a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
index 1eb1f1e..4d36071 100644
--- a/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-ap806.dtsi
@@ -268,10 +268,10 @@
 				ap_gpio: gpio {
 					compatible = "marvell,armada-8k-gpio";
 					offset = <0x1040>;
-					ngpios = <19>;
+					ngpios = <20>;
 					gpio-controller;
 					#gpio-cells = <2>;
-					gpio-ranges = <&ap_pinctrl 0 0 19>;
+					gpio-ranges = <&ap_pinctrl 0 0 20>;
 				};
 			};
 		};
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index d686300..e923b58 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -326,12 +326,6 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end);
 int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
 
-/* We do not have shadow page tables, hence the empty hooks */
-static inline void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
-							 unsigned long address)
-{
-}
-
 struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 void kvm_arm_halt_guest(struct kvm *kvm);
diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index cae331d..f445bd7 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -26,58 +26,6 @@
  * The memory barriers are implicit with the load-acquire and store-release
  * instructions.
  */
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	unsigned int tmp;
-	arch_spinlock_t lockval;
-	u32 owner;
-
-	/*
-	 * Ensure prior spin_lock operations to other locks have completed
-	 * on this CPU before we test whether "lock" is locked.
-	 */
-	smp_mb();
-	owner = READ_ONCE(lock->owner) << 16;
-
-	asm volatile(
-"	sevl\n"
-"1:	wfe\n"
-"2:	ldaxr	%w0, %2\n"
-	/* Is the lock free? */
-"	eor	%w1, %w0, %w0, ror #16\n"
-"	cbz	%w1, 3f\n"
-	/* Lock taken -- has there been a subsequent unlock->lock transition? */
-"	eor	%w1, %w3, %w0, lsl #16\n"
-"	cbz	%w1, 1b\n"
-	/*
-	 * The owner has been updated, so there was an unlock->lock
-	 * transition that we missed. That means we can rely on the
-	 * store-release of the unlock operation paired with the
-	 * load-acquire of the lock operation to publish any of our
-	 * previous stores to the new lock owner and therefore don't
-	 * need to bother with the writeback below.
-	 */
-"	b	4f\n"
-"3:\n"
-	/*
-	 * Serialise against any concurrent lockers by writing back the
-	 * unlocked lock value
-	 */
-	ARM64_LSE_ATOMIC_INSN(
-	/* LL/SC */
-"	stxr	%w1, %w0, %2\n"
-	__nops(2),
-	/* LSE atomics */
-"	mov	%w1, %w0\n"
-"	cas	%w0, %w0, %2\n"
-"	eor	%w1, %w1, %w0\n")
-	/* Somebody else wrote to the lock, GOTO 10 and reload the value */
-"	cbnz	%w1, 2b\n"
-"4:"
-	: "=&r" (lockval), "=&r" (tmp), "+Q" (*lock)
-	: "r" (owner)
-	: "memory");
-}
 
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 
@@ -176,7 +124,11 @@ static inline int arch_spin_value_unlocked(arch_spinlock_t lock)
 
 static inline int arch_spin_is_locked(arch_spinlock_t *lock)
 {
-	smp_mb(); /* See arch_spin_unlock_wait */
+	/*
+	 * Ensure prior spin_lock operations to other locks have completed
+	 * on this CPU before we test whether "lock" is locked.
+	 */
+	smp_mb(); /* ^^^ */
 	return !arch_spin_value_unlocked(READ_ONCE(*lock));
 }
 
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 659ae80..c8f7d98 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -360,6 +360,8 @@ __notrace_funcgraph struct task_struct *__switch_to(struct task_struct *prev,
 	/*
 	 * Complete any pending TLB or cache maintenance on this CPU in case
 	 * the thread migrates to a different CPU.
+	 * This full barrier is also required by the membarrier system
+	 * call.
 	 */
 	dsb(ish);
 
diff --git a/arch/blackfin/include/asm/spinlock.h b/arch/blackfin/include/asm/spinlock.h
index c58f4a8..f643143 100644
--- a/arch/blackfin/include/asm/spinlock.h
+++ b/arch/blackfin/include/asm/spinlock.h
@@ -48,11 +48,6 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
 	__raw_spin_unlock_asm(&lock->lock);
 }
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->lock, !VAL);
-}
-
 static inline int arch_read_can_lock(arch_rwlock_t *rw)
 {
 	return __raw_uncached_fetch_asm(&rw->lock) > 0;
diff --git a/arch/blackfin/kernel/module.c b/arch/blackfin/kernel/module.c
index 0188c93..15af576 100644
--- a/arch/blackfin/kernel/module.c
+++ b/arch/blackfin/kernel/module.c
@@ -4,8 +4,6 @@
  * Licensed under the GPL-2 or later
  */
 
-#define pr_fmt(fmt) "module %s: " fmt, mod->name
-
 #include <linux/moduleloader.h>
 #include <linux/elf.h>
 #include <linux/vmalloc.h>
@@ -16,6 +14,11 @@
 #include <asm/cacheflush.h>
 #include <linux/uaccess.h>
 
+#define mod_err(mod, fmt, ...)						\
+	pr_err("module %s: " fmt, (mod)->name, ##__VA_ARGS__)
+#define mod_debug(mod, fmt, ...)					\
+	pr_debug("module %s: " fmt, (mod)->name, ##__VA_ARGS__)
+
 /* Transfer the section to the L1 memory */
 int
 module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
@@ -44,7 +47,7 @@ module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
 			dest = l1_inst_sram_alloc(s->sh_size);
 			mod->arch.text_l1 = dest;
 			if (dest == NULL) {
-				pr_err("L1 inst memory allocation failed\n");
+				mod_err(mod, "L1 inst memory allocation failed\n");
 				return -1;
 			}
 			dma_memcpy(dest, (void *)s->sh_addr, s->sh_size);
@@ -56,7 +59,7 @@ module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
 			dest = l1_data_sram_alloc(s->sh_size);
 			mod->arch.data_a_l1 = dest;
 			if (dest == NULL) {
-				pr_err("L1 data memory allocation failed\n");
+				mod_err(mod, "L1 data memory allocation failed\n");
 				return -1;
 			}
 			memcpy(dest, (void *)s->sh_addr, s->sh_size);
@@ -68,7 +71,7 @@ module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
 			dest = l1_data_sram_zalloc(s->sh_size);
 			mod->arch.bss_a_l1 = dest;
 			if (dest == NULL) {
-				pr_err("L1 data memory allocation failed\n");
+				mod_err(mod, "L1 data memory allocation failed\n");
 				return -1;
 			}
 
@@ -77,7 +80,7 @@ module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
 			dest = l1_data_B_sram_alloc(s->sh_size);
 			mod->arch.data_b_l1 = dest;
 			if (dest == NULL) {
-				pr_err("L1 data memory allocation failed\n");
+				mod_err(mod, "L1 data memory allocation failed\n");
 				return -1;
 			}
 			memcpy(dest, (void *)s->sh_addr, s->sh_size);
@@ -87,7 +90,7 @@ module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
 			dest = l1_data_B_sram_alloc(s->sh_size);
 			mod->arch.bss_b_l1 = dest;
 			if (dest == NULL) {
-				pr_err("L1 data memory allocation failed\n");
+				mod_err(mod, "L1 data memory allocation failed\n");
 				return -1;
 			}
 			memset(dest, 0, s->sh_size);
@@ -99,7 +102,7 @@ module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
 			dest = l2_sram_alloc(s->sh_size);
 			mod->arch.text_l2 = dest;
 			if (dest == NULL) {
-				pr_err("L2 SRAM allocation failed\n");
+				mod_err(mod, "L2 SRAM allocation failed\n");
 				return -1;
 			}
 			memcpy(dest, (void *)s->sh_addr, s->sh_size);
@@ -111,7 +114,7 @@ module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
 			dest = l2_sram_alloc(s->sh_size);
 			mod->arch.data_l2 = dest;
 			if (dest == NULL) {
-				pr_err("L2 SRAM allocation failed\n");
+				mod_err(mod, "L2 SRAM allocation failed\n");
 				return -1;
 			}
 			memcpy(dest, (void *)s->sh_addr, s->sh_size);
@@ -123,7 +126,7 @@ module_frob_arch_sections(Elf_Ehdr *hdr, Elf_Shdr *sechdrs,
 			dest = l2_sram_zalloc(s->sh_size);
 			mod->arch.bss_l2 = dest;
 			if (dest == NULL) {
-				pr_err("L2 SRAM allocation failed\n");
+				mod_err(mod, "L2 SRAM allocation failed\n");
 				return -1;
 			}
 
@@ -157,8 +160,8 @@ apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 	Elf32_Sym *sym;
 	unsigned long location, value, size;
 
-	pr_debug("applying relocate section %u to %u\n",
-		relsec, sechdrs[relsec].sh_info);
+	mod_debug(mod, "applying relocate section %u to %u\n",
+		  relsec, sechdrs[relsec].sh_info);
 
 	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
 		/* This is where to make the change */
@@ -174,14 +177,14 @@ apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 
 #ifdef CONFIG_SMP
 		if (location >= COREB_L1_DATA_A_START) {
-			pr_err("cannot relocate in L1: %u (SMP kernel)\n",
+			mod_err(mod, "cannot relocate in L1: %u (SMP kernel)\n",
 				ELF32_R_TYPE(rel[i].r_info));
 			return -ENOEXEC;
 		}
 #endif
 
-		pr_debug("location is %lx, value is %lx type is %d\n",
-			location, value, ELF32_R_TYPE(rel[i].r_info));
+		mod_debug(mod, "location is %lx, value is %lx type is %d\n",
+			  location, value, ELF32_R_TYPE(rel[i].r_info));
 
 		switch (ELF32_R_TYPE(rel[i].r_info)) {
 
@@ -200,12 +203,12 @@ apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 		case R_BFIN_PCREL12_JUMP:
 		case R_BFIN_PCREL12_JUMP_S:
 		case R_BFIN_PCREL10:
-			pr_err("unsupported relocation: %u (no -mlong-calls?)\n",
+			mod_err(mod, "unsupported relocation: %u (no -mlong-calls?)\n",
 				ELF32_R_TYPE(rel[i].r_info));
 			return -ENOEXEC;
 
 		default:
-			pr_err("unknown relocation: %u\n",
+			mod_err(mod, "unknown relocation: %u\n",
 				ELF32_R_TYPE(rel[i].r_info));
 			return -ENOEXEC;
 		}
@@ -222,7 +225,7 @@ apply_relocate_add(Elf_Shdr *sechdrs, const char *strtab,
 			isram_memcpy((void *)location, &value, size);
 			break;
 		default:
-			pr_err("invalid relocation for %#lx\n", location);
+			mod_err(mod, "invalid relocation for %#lx\n", location);
 			return -ENOEXEC;
 		}
 	}
diff --git a/arch/hexagon/include/asm/spinlock.h b/arch/hexagon/include/asm/spinlock.h
index a1c5578..53a8d58 100644
--- a/arch/hexagon/include/asm/spinlock.h
+++ b/arch/hexagon/include/asm/spinlock.h
@@ -179,11 +179,6 @@ static inline unsigned int arch_spin_trylock(arch_spinlock_t *lock)
  */
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->lock, !VAL);
-}
-
 #define arch_spin_is_locked(x) ((x)->lock != 0)
 
 #define arch_read_lock_flags(lock, flags) arch_read_lock(lock)
diff --git a/arch/ia64/include/asm/spinlock.h b/arch/ia64/include/asm/spinlock.h
index ca9e761..df2c121 100644
--- a/arch/ia64/include/asm/spinlock.h
+++ b/arch/ia64/include/asm/spinlock.h
@@ -76,22 +76,6 @@ static __always_inline void __ticket_spin_unlock(arch_spinlock_t *lock)
 	ACCESS_ONCE(*p) = (tmp + 2) & ~1;
 }
 
-static __always_inline void __ticket_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	int	*p = (int *)&lock->lock, ticket;
-
-	ia64_invala();
-
-	for (;;) {
-		asm volatile ("ld4.c.nc %0=[%1]" : "=r"(ticket) : "r"(p) : "memory");
-		if (!(((ticket >> TICKET_SHIFT) ^ ticket) & TICKET_MASK))
-			return;
-		cpu_relax();
-	}
-
-	smp_acquire__after_ctrl_dep();
-}
-
 static inline int __ticket_spin_is_locked(arch_spinlock_t *lock)
 {
 	long tmp = ACCESS_ONCE(lock->lock);
@@ -143,11 +127,6 @@ static __always_inline void arch_spin_lock_flags(arch_spinlock_t *lock,
 	arch_spin_lock(lock);
 }
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	__ticket_spin_unlock_wait(lock);
-}
-
 #define arch_read_can_lock(rw)		(*(volatile int *)(rw) >= 0)
 #define arch_write_can_lock(rw)	(*(volatile int *)(rw) == 0)
 
diff --git a/arch/m32r/include/asm/flat.h b/arch/m32r/include/asm/flat.h
index 455ce7d..dfcb0e4 100644
--- a/arch/m32r/include/asm/flat.h
+++ b/arch/m32r/include/asm/flat.h
@@ -95,7 +95,7 @@ static inline unsigned long m32r_flat_get_addr_from_rp (u32 *rp,
 	return ~0;      /* bogus value */
 }
 
-static inline void flat_put_addr_at_rp(u32 *rp, u32 addr, u32 relval)
+static inline int flat_put_addr_at_rp(u32 *rp, u32 addr, u32 relval)
 {
         unsigned int reloc = flat_m32r_get_reloc_type (relval);
 	if (reloc & 0xf0) {
@@ -133,6 +133,7 @@ static inline void flat_put_addr_at_rp(u32 *rp, u32 addr, u32 relval)
 			break;
 		}
 	}
+	return 0;
 }
 
 // kludge - text_len is a local variable in the only user.
diff --git a/arch/m32r/include/asm/spinlock.h b/arch/m32r/include/asm/spinlock.h
index 323c7fc..a568255 100644
--- a/arch/m32r/include/asm/spinlock.h
+++ b/arch/m32r/include/asm/spinlock.h
@@ -30,11 +30,6 @@
 #define arch_spin_is_locked(x)		(*(volatile int *)(&(x)->slock) <= 0)
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->slock, VAL > 0);
-}
-
 /**
  * arch_spin_trylock - Try spin lock and return a result
  * @lock: Pointer to the lock variable
diff --git a/arch/metag/include/asm/spinlock.h b/arch/metag/include/asm/spinlock.h
index c0c7a22..ddf7fe5 100644
--- a/arch/metag/include/asm/spinlock.h
+++ b/arch/metag/include/asm/spinlock.h
@@ -15,11 +15,6 @@
  * locked.
  */
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->lock, !VAL);
-}
-
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 
 #define	arch_read_lock_flags(lock, flags) arch_read_lock(lock)
diff --git a/arch/microblaze/include/asm/flat.h b/arch/microblaze/include/asm/flat.h
index f23c3d2..3d2747d 100644
--- a/arch/microblaze/include/asm/flat.h
+++ b/arch/microblaze/include/asm/flat.h
@@ -60,7 +60,7 @@ static inline int flat_get_addr_from_rp(u32 __user *rp, u32 relval, u32 flags,
  * unaligned.
  */
 
-static inline void
+static inline int
 flat_put_addr_at_rp(u32 __user *rp, u32 addr, u32 relval)
 {
 	u32 *p = (__force u32 *)rp;
diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h
index 2998479..a9af1d2 100644
--- a/arch/mips/include/asm/kvm_host.h
+++ b/arch/mips/include/asm/kvm_host.h
@@ -938,11 +938,6 @@ void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
 int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end);
 int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
 
-static inline void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
-							 unsigned long address)
-{
-}
-
 /* Emulation */
 int kvm_get_inst(u32 *opc, struct kvm_vcpu *vcpu, u32 *out);
 enum emulation_result update_pc(struct kvm_vcpu *vcpu, u32 cause);
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index 6dd1364..1395654 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -872,15 +872,13 @@ asmlinkage long syscall_trace_enter(struct pt_regs *regs, long syscall)
 	if (unlikely(test_thread_flag(TIF_SECCOMP))) {
 		int ret, i;
 		struct seccomp_data sd;
+		unsigned long args[6];
 
 		sd.nr = syscall;
 		sd.arch = syscall_get_arch();
-		for (i = 0; i < 6; i++) {
-			unsigned long v, r;
-
-			r = mips_get_syscall_arg(&v, current, regs, i);
-			sd.args[i] = r ? 0 : v;
-		}
+		syscall_get_arguments(current, regs, 0, 6, args);
+		for (i = 0; i < 6; i++)
+			sd.args[i] = args[i];
 		sd.instruction_pointer = KSTK_EIP(current);
 
 		ret = __secure_computing(&sd);
diff --git a/arch/mips/kernel/scall32-o32.S b/arch/mips/kernel/scall32-o32.S
index 27c2f90..a9a7d78 100644
--- a/arch/mips/kernel/scall32-o32.S
+++ b/arch/mips/kernel/scall32-o32.S
@@ -190,12 +190,6 @@
 	sll	t1, t0, 2
 	beqz	v0, einval
 	lw	t2, sys_call_table(t1)		# syscall routine
-	sw	a0, PT_R2(sp)			# call routine directly on restart
-
-	/* Some syscalls like execve get their arguments from struct pt_regs
-	   and claim zero arguments in the syscall table. Thus we have to
-	   assume the worst case and shuffle around all potential arguments.
-	   If you want performance, don't use indirect syscalls. */
 
 	move	a0, a1				# shift argument registers
 	move	a1, a2
@@ -207,11 +201,6 @@
 	sw	t4, 16(sp)
 	sw	t5, 20(sp)
 	sw	t6, 24(sp)
-	sw	a0, PT_R4(sp)			# .. and push back a0 - a3, some
-	sw	a1, PT_R5(sp)			# syscalls expect them there
-	sw	a2, PT_R6(sp)
-	sw	a3, PT_R7(sp)
-	sw	a3, PT_R26(sp)			# update a3 for syscall restarting
 	jr	t2
 	/* Unreached */
 
diff --git a/arch/mips/kernel/scall64-o32.S b/arch/mips/kernel/scall64-o32.S
index c30bc52..9ebe3e2 100644
--- a/arch/mips/kernel/scall64-o32.S
+++ b/arch/mips/kernel/scall64-o32.S
@@ -198,7 +198,6 @@
 	dsll	t1, t0, 3
 	beqz	v0, einval
 	ld	t2, sys32_call_table(t1)		# syscall routine
-	sd	a0, PT_R2(sp)		# call routine directly on restart
 
 	move	a0, a1			# shift argument registers
 	move	a1, a2
@@ -207,11 +206,6 @@
 	move	a4, a5
 	move	a5, a6
 	move	a6, a7
-	sd	a0, PT_R4(sp)		# ... and push back a0 - a3, some
-	sd	a1, PT_R5(sp)		# syscalls expect them there
-	sd	a2, PT_R6(sp)
-	sd	a3, PT_R7(sp)
-	sd	a3, PT_R26(sp)		# update a3 for syscall restarting
 	jr	t2
 	/* Unreached */
 
diff --git a/arch/mn10300/include/asm/spinlock.h b/arch/mn10300/include/asm/spinlock.h
index 9c7b8f7..fe413b4 100644
--- a/arch/mn10300/include/asm/spinlock.h
+++ b/arch/mn10300/include/asm/spinlock.h
@@ -26,11 +26,6 @@
 
 #define arch_spin_is_locked(x)	(*(volatile signed char *)(&(x)->slock) != 0)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->slock, !VAL);
-}
-
 static inline void arch_spin_unlock(arch_spinlock_t *lock)
 {
 	asm volatile(
diff --git a/arch/parisc/include/asm/spinlock.h b/arch/parisc/include/asm/spinlock.h
index e32936c..55bfe4a 100644
--- a/arch/parisc/include/asm/spinlock.h
+++ b/arch/parisc/include/asm/spinlock.h
@@ -14,13 +14,6 @@ static inline int arch_spin_is_locked(arch_spinlock_t *x)
 
 #define arch_spin_lock(lock) arch_spin_lock_flags(lock, 0)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *x)
-{
-	volatile unsigned int *a = __ldcw_align(x);
-
-	smp_cond_load_acquire(a, VAL);
-}
-
 static inline void arch_spin_lock_flags(arch_spinlock_t *x,
 					 unsigned long flags)
 {
diff --git a/arch/powerpc/include/asm/kvm_host.h b/arch/powerpc/include/asm/kvm_host.h
index 8b3f123..e372ed8 100644
--- a/arch/powerpc/include/asm/kvm_host.h
+++ b/arch/powerpc/include/asm/kvm_host.h
@@ -67,11 +67,6 @@ extern int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end);
 extern int kvm_test_age_hva(struct kvm *kvm, unsigned long hva);
 extern void kvm_set_spte_hva(struct kvm *kvm, unsigned long hva, pte_t pte);
 
-static inline void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
-							 unsigned long address)
-{
-}
-
 #define HPTEG_CACHE_NUM			(1 << 15)
 #define HPTEG_HASH_BITS_PTE		13
 #define HPTEG_HASH_BITS_PTE_LONG	12
diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h
index 8c1b913..d256e44 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -170,39 +170,6 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
 	lock->slock = 0;
 }
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	arch_spinlock_t lock_val;
-
-	smp_mb();
-
-	/*
-	 * Atomically load and store back the lock value (unchanged). This
-	 * ensures that our observation of the lock value is ordered with
-	 * respect to other lock operations.
-	 */
-	__asm__ __volatile__(
-"1:	" PPC_LWARX(%0, 0, %2, 0) "\n"
-"	stwcx. %0, 0, %2\n"
-"	bne- 1b\n"
-	: "=&r" (lock_val), "+m" (*lock)
-	: "r" (lock)
-	: "cr0", "xer");
-
-	if (arch_spin_value_unlocked(lock_val))
-		goto out;
-
-	while (lock->slock) {
-		HMT_low();
-		if (SHARED_PROCESSOR)
-			__spin_yield(lock);
-	}
-	HMT_medium();
-
-out:
-	smp_mb();
-}
-
 /*
  * Read-write spinlocks, allowing multiple readers
  * but only one writer.
diff --git a/arch/powerpc/platforms/powernv/npu-dma.c b/arch/powerpc/platforms/powernv/npu-dma.c
index b5d960d..4c7b859 100644
--- a/arch/powerpc/platforms/powernv/npu-dma.c
+++ b/arch/powerpc/platforms/powernv/npu-dma.c
@@ -614,15 +614,6 @@ static void pnv_npu2_mn_change_pte(struct mmu_notifier *mn,
 	mmio_invalidate(npu_context, 1, address, true);
 }
 
-static void pnv_npu2_mn_invalidate_page(struct mmu_notifier *mn,
-					struct mm_struct *mm,
-					unsigned long address)
-{
-	struct npu_context *npu_context = mn_to_npu_context(mn);
-
-	mmio_invalidate(npu_context, 1, address, true);
-}
-
 static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
 					struct mm_struct *mm,
 					unsigned long start, unsigned long end)
@@ -640,7 +631,6 @@ static void pnv_npu2_mn_invalidate_range(struct mmu_notifier *mn,
 static const struct mmu_notifier_ops nv_nmmu_notifier_ops = {
 	.release = pnv_npu2_mn_release,
 	.change_pte = pnv_npu2_mn_change_pte,
-	.invalidate_page = pnv_npu2_mn_invalidate_page,
 	.invalidate_range = pnv_npu2_mn_invalidate_range,
 };
 
diff --git a/arch/s390/include/asm/compat.h b/arch/s390/include/asm/compat.h
index b9300f8..07a82bc 100644
--- a/arch/s390/include/asm/compat.h
+++ b/arch/s390/include/asm/compat.h
@@ -8,11 +8,12 @@
 #include <linux/sched/task_stack.h>
 #include <linux/thread_info.h>
 
-#define __TYPE_IS_PTR(t) (!__builtin_types_compatible_p(typeof(0?(t)0:0ULL), u64))
+#define __TYPE_IS_PTR(t) (!__builtin_types_compatible_p( \
+				typeof(0?(__force t)0:0ULL), u64))
 
 #define __SC_DELOUSE(t,v) ({ \
 	BUILD_BUG_ON(sizeof(t) > 4 && !__TYPE_IS_PTR(t)); \
-	(t)(__TYPE_IS_PTR(t) ? ((v) & 0x7fffffff) : (v)); \
+	(__force t)(__TYPE_IS_PTR(t) ? ((v) & 0x7fffffff) : (v)); \
 })
 
 #define PSW32_MASK_PER		0x40000000UL
diff --git a/arch/s390/include/asm/mmu_context.h b/arch/s390/include/asm/mmu_context.h
index 4541ac4..24bc416 100644
--- a/arch/s390/include/asm/mmu_context.h
+++ b/arch/s390/include/asm/mmu_context.h
@@ -44,6 +44,11 @@ static inline int init_new_context(struct task_struct *tsk,
 		mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
 				   _ASCE_USER_BITS | _ASCE_TYPE_REGION3;
 		break;
+	case -PAGE_SIZE:
+		/* forked 5-level task, set new asce with new_mm->pgd */
+		mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
+			_ASCE_USER_BITS | _ASCE_TYPE_REGION1;
+		break;
 	case 1UL << 53:
 		/* forked 4-level task, set new asce with new mm->pgd */
 		mm->context.asce = __pa(mm->pgd) | _ASCE_TABLE_LENGTH |
diff --git a/arch/s390/include/asm/spinlock.h b/arch/s390/include/asm/spinlock.h
index f7838ec..217ee52 100644
--- a/arch/s390/include/asm/spinlock.h
+++ b/arch/s390/include/asm/spinlock.h
@@ -98,13 +98,6 @@ static inline void arch_spin_unlock(arch_spinlock_t *lp)
 		: "cc", "memory");
 }
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	while (arch_spin_is_locked(lock))
-		arch_spin_relax(lock);
-	smp_acquire__after_ctrl_dep();
-}
-
 /*
  * Read-write spinlocks, allowing multiple readers
  * but only one writer.
diff --git a/arch/s390/mm/mmap.c b/arch/s390/mm/mmap.c
index 2e10d2b..5bea139 100644
--- a/arch/s390/mm/mmap.c
+++ b/arch/s390/mm/mmap.c
@@ -119,7 +119,8 @@ arch_get_unmapped_area(struct file *filp, unsigned long addr,
 		return addr;
 
 check_asce_limit:
-	if (addr + len > current->mm->context.asce_limit) {
+	if (addr + len > current->mm->context.asce_limit &&
+	    addr + len <= TASK_SIZE) {
 		rc = crst_table_upgrade(mm, addr + len);
 		if (rc)
 			return (unsigned long) rc;
@@ -183,7 +184,8 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	}
 
 check_asce_limit:
-	if (addr + len > current->mm->context.asce_limit) {
+	if (addr + len > current->mm->context.asce_limit &&
+	    addr + len <= TASK_SIZE) {
 		rc = crst_table_upgrade(mm, addr + len);
 		if (rc)
 			return (unsigned long) rc;
diff --git a/arch/sh/include/asm/spinlock-cas.h b/arch/sh/include/asm/spinlock-cas.h
index c46e8cc..5ed7dbb 100644
--- a/arch/sh/include/asm/spinlock-cas.h
+++ b/arch/sh/include/asm/spinlock-cas.h
@@ -29,11 +29,6 @@ static inline unsigned __sl_cas(volatile unsigned *p, unsigned old, unsigned new
 #define arch_spin_is_locked(x)		((x)->lock <= 0)
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->lock, VAL > 0);
-}
-
 static inline void arch_spin_lock(arch_spinlock_t *lock)
 {
 	while (!__sl_cas(&lock->lock, 1, 0));
diff --git a/arch/sh/include/asm/spinlock-llsc.h b/arch/sh/include/asm/spinlock-llsc.h
index cec7814..f77263a 100644
--- a/arch/sh/include/asm/spinlock-llsc.h
+++ b/arch/sh/include/asm/spinlock-llsc.h
@@ -21,11 +21,6 @@
 #define arch_spin_is_locked(x)		((x)->lock <= 0)
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->lock, VAL > 0);
-}
-
 /*
  * Simple spin lock operations.  There are two variants, one clears IRQ's
  * on the local processor, one does not.
diff --git a/arch/sparc/include/asm/spinlock_32.h b/arch/sparc/include/asm/spinlock_32.h
index 8011e79..67345b2 100644
--- a/arch/sparc/include/asm/spinlock_32.h
+++ b/arch/sparc/include/asm/spinlock_32.h
@@ -14,11 +14,6 @@
 
 #define arch_spin_is_locked(lock) (*((volatile unsigned char *)(lock)) != 0)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->lock, !VAL);
-}
-
 static inline void arch_spin_lock(arch_spinlock_t *lock)
 {
 	__asm__ __volatile__(
diff --git a/arch/tile/include/asm/spinlock_32.h b/arch/tile/include/asm/spinlock_32.h
index b14b1ba..cba8ba9 100644
--- a/arch/tile/include/asm/spinlock_32.h
+++ b/arch/tile/include/asm/spinlock_32.h
@@ -64,8 +64,6 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
 	lock->current_ticket = old_ticket + TICKET_QUANTUM;
 }
 
-void arch_spin_unlock_wait(arch_spinlock_t *lock);
-
 /*
  * Read-write spinlocks, allowing multiple readers
  * but only one writer.
diff --git a/arch/tile/include/asm/spinlock_64.h b/arch/tile/include/asm/spinlock_64.h
index b9718fb..9a2c2d6 100644
--- a/arch/tile/include/asm/spinlock_64.h
+++ b/arch/tile/include/asm/spinlock_64.h
@@ -58,8 +58,6 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
 	__insn_fetchadd4(&lock->lock, 1U << __ARCH_SPIN_CURRENT_SHIFT);
 }
 
-void arch_spin_unlock_wait(arch_spinlock_t *lock);
-
 void arch_spin_lock_slow(arch_spinlock_t *lock, u32 val);
 
 /* Grab the "next" ticket number and bump it atomically.
diff --git a/arch/tile/lib/spinlock_32.c b/arch/tile/lib/spinlock_32.c
index 076c6cc..db9333f 100644
--- a/arch/tile/lib/spinlock_32.c
+++ b/arch/tile/lib/spinlock_32.c
@@ -62,29 +62,6 @@ int arch_spin_trylock(arch_spinlock_t *lock)
 }
 EXPORT_SYMBOL(arch_spin_trylock);
 
-void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	u32 iterations = 0;
-	int curr = READ_ONCE(lock->current_ticket);
-	int next = READ_ONCE(lock->next_ticket);
-
-	/* Return immediately if unlocked. */
-	if (next == curr)
-		return;
-
-	/* Wait until the current locker has released the lock. */
-	do {
-		delay_backoff(iterations++);
-	} while (READ_ONCE(lock->current_ticket) == curr);
-
-	/*
-	 * The TILE architecture doesn't do read speculation; therefore
-	 * a control dependency guarantees a LOAD->{LOAD,STORE} order.
-	 */
-	barrier();
-}
-EXPORT_SYMBOL(arch_spin_unlock_wait);
-
 /*
  * The low byte is always reserved to be the marker for a "tns" operation
  * since the low bit is set to "1" by a tns.  The next seven bits are
diff --git a/arch/tile/lib/spinlock_64.c b/arch/tile/lib/spinlock_64.c
index a4b5b2c..de414c2 100644
--- a/arch/tile/lib/spinlock_64.c
+++ b/arch/tile/lib/spinlock_64.c
@@ -62,28 +62,6 @@ int arch_spin_trylock(arch_spinlock_t *lock)
 }
 EXPORT_SYMBOL(arch_spin_trylock);
 
-void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	u32 iterations = 0;
-	u32 val = READ_ONCE(lock->lock);
-	u32 curr = arch_spin_current(val);
-
-	/* Return immediately if unlocked. */
-	if (arch_spin_next(val) == curr)
-		return;
-
-	/* Wait until the current locker has released the lock. */
-	do {
-		delay_backoff(iterations++);
-	} while (arch_spin_current(READ_ONCE(lock->lock)) == curr);
-
-	/*
-	 * The TILE architecture doesn't do read speculation; therefore
-	 * a control dependency guarantees a LOAD->{LOAD,STORE} order.
-	 */
-	barrier();
-}
-EXPORT_SYMBOL(arch_spin_unlock_wait);
 
 /*
  * If the read lock fails due to a writer, we retry periodically
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index a0838ab..c14217c 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -116,8 +116,7 @@ void __putstr(const char *s)
 		}
 	}
 
-	if (boot_params->screen_info.orig_video_mode == 0 &&
-	    lines == 0 && cols == 0)
+	if (lines == 0 || cols == 0)
 		return;
 
 	x = boot_params->screen_info.orig_x;
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index 2ed8f0c..1bb08ec 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -520,8 +520,14 @@
 # the description in lib/decompressor_xxx.c for specific information.
 #
 # extra_bytes = (uncompressed_size >> 12) + 65536 + 128
+#
+# LZ4 is even worse: data that cannot be further compressed grows by 0.4%,
+# or one byte per 256 bytes. OTOH, we can safely get rid of the +128 as
+# the size-dependent part now grows so fast.
+#
+# extra_bytes = (uncompressed_size >> 8) + 65536
 
-#define ZO_z_extra_bytes	((ZO_z_output_len >> 12) + 65536 + 128)
+#define ZO_z_extra_bytes	((ZO_z_output_len >> 8) + 65536)
 #if ZO_z_output_len > ZO_z_input_len
 # define ZO_z_extract_offset	(ZO_z_output_len + ZO_z_extra_bytes - \
 				 ZO_z_input_len)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 73a6311..80534d3 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2370,12 +2370,9 @@ static unsigned long get_segment_base(unsigned int segment)
 #ifdef CONFIG_MODIFY_LDT_SYSCALL
 		struct ldt_struct *ldt;
 
-		if (idx > LDT_ENTRIES)
-			return 0;
-
 		/* IRQs are off, so this synchronizes with smp_store_release */
 		ldt = lockless_dereference(current->active_mm->context.ldt);
-		if (!ldt || idx > ldt->nr_entries)
+		if (!ldt || idx >= ldt->nr_entries)
 			return 0;
 
 		desc = &ldt->entries[idx];
@@ -2383,7 +2380,7 @@ static unsigned long get_segment_base(unsigned int segment)
 		return 0;
 #endif
 	} else {
-		if (idx > GDT_ENTRIES)
+		if (idx >= GDT_ENTRIES)
 			return 0;
 
 		desc = raw_cpu_ptr(gdt_page.gdt) + idx;
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index f4d120a..92c9032 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1375,8 +1375,6 @@ int kvm_arch_interrupt_allowed(struct kvm_vcpu *vcpu);
 int kvm_cpu_get_interrupt(struct kvm_vcpu *v);
 void kvm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event);
 void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu);
-void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
-					   unsigned long address);
 
 void kvm_define_shared_msr(unsigned index, u32 msr);
 int kvm_set_shared_msr(unsigned index, u64 val, u64 mask);
diff --git a/arch/x86/kernel/early-quirks.c b/arch/x86/kernel/early-quirks.c
index d907c3d..a4516ca 100644
--- a/arch/x86/kernel/early-quirks.c
+++ b/arch/x86/kernel/early-quirks.c
@@ -527,6 +527,7 @@ static const struct pci_device_id intel_early_ids[] __initconst = {
 	INTEL_BXT_IDS(&gen9_early_ops),
 	INTEL_KBL_IDS(&gen9_early_ops),
 	INTEL_GLK_IDS(&gen9_early_ops),
+	INTEL_CNL_IDS(&gen9_early_ops),
 };
 
 static void __init
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 05a5e57..272320e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6734,17 +6734,6 @@ void kvm_vcpu_reload_apic_access_page(struct kvm_vcpu *vcpu)
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_reload_apic_access_page);
 
-void kvm_arch_mmu_notifier_invalidate_page(struct kvm *kvm,
-					   unsigned long address)
-{
-	/*
-	 * The physical address of apic access page is stored in the VMCS.
-	 * Update it when it becomes invalid.
-	 */
-	if (address == gfn_to_hva(kvm, APIC_DEFAULT_PHYS_BASE >> PAGE_SHIFT))
-		kvm_make_all_cpus_request(kvm, KVM_REQ_APIC_PAGE_RELOAD);
-}
-
 /*
  * Returns 1 to let vcpu_run() continue the guest execution loop without
  * exiting to the userspace.  Otherwise, the value will be returned to the
diff --git a/arch/x86/um/user-offsets.c b/arch/x86/um/user-offsets.c
index ae4cd58..02250b2 100644
--- a/arch/x86/um/user-offsets.c
+++ b/arch/x86/um/user-offsets.c
@@ -50,7 +50,7 @@ void foo(void)
 	DEFINE(HOST_GS, GS);
 	DEFINE(HOST_ORIG_AX, ORIG_EAX);
 #else
-#if defined(PTRACE_GETREGSET) && defined(PTRACE_SETREGSET)
+#ifdef FP_XSTATE_MAGIC1
 	DEFINE(HOST_FP_SIZE, sizeof(struct _xstate) / sizeof(unsigned long));
 #else
 	DEFINE(HOST_FP_SIZE, sizeof(struct _fpstate) / sizeof(unsigned long));
diff --git a/arch/xtensa/include/asm/spinlock.h b/arch/xtensa/include/asm/spinlock.h
index a36221cf..3bb4968 100644
--- a/arch/xtensa/include/asm/spinlock.h
+++ b/arch/xtensa/include/asm/spinlock.h
@@ -33,11 +33,6 @@
 
 #define arch_spin_is_locked(x) ((x)->slock != 0)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->slock, !VAL);
-}
-
 #define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
 
 static inline void arch_spin_lock(arch_spinlock_t *lock)
diff --git a/block/Kconfig b/block/Kconfig
index 89cd28f..3ab42bb 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -206,4 +206,9 @@
 	depends on BLOCK && VIRTIO
 	default y
 
+config BLK_MQ_RDMA
+	bool
+	depends on BLOCK && INFINIBAND
+	default y
+
 source block/Kconfig.iosched
diff --git a/block/Makefile b/block/Makefile
index 2b281cf2..9396ebc 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -29,6 +29,7 @@
 obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
 obj-$(CONFIG_BLK_MQ_PCI)	+= blk-mq-pci.o
 obj-$(CONFIG_BLK_MQ_VIRTIO)	+= blk-mq-virtio.o
+obj-$(CONFIG_BLK_MQ_RDMA)	+= blk-mq-rdma.o
 obj-$(CONFIG_BLK_DEV_ZONED)	+= blk-zoned.o
 obj-$(CONFIG_BLK_WBT)		+= blk-wbt.o
 obj-$(CONFIG_BLK_DEBUG_FS)	+= blk-mq-debugfs.o
diff --git a/block/blk-mq-rdma.c b/block/blk-mq-rdma.c
new file mode 100644
index 0000000..996167f
--- /dev/null
+++ b/block/blk-mq-rdma.c
@@ -0,0 +1,52 @@
+/*
+ * Copyright (c) 2017 Sagi Grimberg.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+#include <linux/blk-mq.h>
+#include <linux/blk-mq-rdma.h>
+#include <rdma/ib_verbs.h>
+
+/**
+ * blk_mq_rdma_map_queues - provide a default queue mapping for rdma device
+ * @set:	tagset to provide the mapping for
+ * @dev:	rdma device associated with @set.
+ * @first_vec:	first interrupt vector to use for queues (usually 0)
+ *
+ * This function assumes the rdma device @dev has at least as many available
+ * interrupt vectors as @set has queues.  It will then query its affinity mask
+ * and build a queue mapping that maps a queue to the CPUs that have irq
+ * affinity for the corresponding vector.
+ *
+ * In case either the driver passed a @dev with fewer vectors than
+ * @set->nr_hw_queues, or @dev does not provide an affinity mask for a
+ * vector, we fall back to the naive mapping.
+ */
+int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
+		struct ib_device *dev, int first_vec)
+{
+	const struct cpumask *mask;
+	unsigned int queue, cpu;
+
+	for (queue = 0; queue < set->nr_hw_queues; queue++) {
+		mask = ib_get_vector_affinity(dev, first_vec + queue);
+		if (!mask)
+			goto fallback;
+
+		for_each_cpu(cpu, mask)
+			set->mq_map[cpu] = queue;
+	}
+
+	return 0;
+
+fallback:
+	return blk_mq_map_queues(set);
+}
+EXPORT_SYMBOL_GPL(blk_mq_rdma_map_queues);
diff --git a/block/compat_ioctl.c b/block/compat_ioctl.c
index 38554c2..abaf9d7 100644
--- a/block/compat_ioctl.c
+++ b/block/compat_ioctl.c
@@ -79,7 +79,7 @@ static int compat_hdio_getgeo(struct gendisk *disk, struct block_device *bdev,
 static int compat_hdio_ioctl(struct block_device *bdev, fmode_t mode,
 		unsigned int cmd, unsigned long arg)
 {
-	unsigned long *__user p;
+	unsigned long __user *p;
 	int error;
 
 	p = compat_alloc_user_space(sizeof(unsigned long));
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 43839b0..903605d 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -87,8 +87,13 @@ static void skcipher_free_async_sgls(struct skcipher_async_req *sreq)
 	}
 	sgl = sreq->tsg;
 	n = sg_nents(sgl);
-	for_each_sg(sgl, sg, n, i)
-		put_page(sg_page(sg));
+	for_each_sg(sgl, sg, n, i) {
+		struct page *page = sg_page(sg);
+
+		/* some SGs may not have a page mapped */
+		if (page && page_ref_count(page))
+			put_page(page);
+	}
 
 	kfree(sreq->tsg);
 }
diff --git a/crypto/chacha20_generic.c b/crypto/chacha20_generic.c
index 8b3c04d..4a45fa4 100644
--- a/crypto/chacha20_generic.c
+++ b/crypto/chacha20_generic.c
@@ -91,9 +91,14 @@ int crypto_chacha20_crypt(struct skcipher_request *req)
 	crypto_chacha20_init(state, ctx, walk.iv);
 
 	while (walk.nbytes > 0) {
+		unsigned int nbytes = walk.nbytes;
+
+		if (nbytes < walk.total)
+			nbytes = round_down(nbytes, walk.stride);
+
 		chacha20_docrypt(state, walk.dst.virt.addr, walk.src.virt.addr,
-				 walk.nbytes);
-		err = skcipher_walk_done(&walk, 0);
+				 nbytes);
+		err = skcipher_walk_done(&walk, walk.nbytes - nbytes);
 	}
 
 	return err;
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 6ceb0e2..d54971d 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -32675,6 +32675,10 @@ static const struct cipher_testvec chacha20_enc_tv_template[] = {
 			  "\x5b\x86\x2f\x37\x30\xe3\x7c\xfd"
 			  "\xc4\xfd\x80\x6c\x22\xf2\x21",
 		.rlen	= 375,
+		.also_non_np = 1,
+		.np	= 3,
+		.tap	= { 375 - 20, 4, 16 },
+
 	}, { /* RFC7539 A.2. Test Vector #3 */
 		.key	= "\x1c\x92\x40\xa5\xeb\x55\xd3\x8a"
 			  "\xf3\x33\x88\x86\x04\xf6\xb5\xf0"
@@ -33049,6 +33053,9 @@ static const struct cipher_testvec chacha20_enc_tv_template[] = {
 			  "\xa1\xed\xad\xd5\x76\xfa\x24\x8f"
 			  "\x98",
 		.rlen	= 1281,
+		.also_non_np = 1,
+		.np	= 3,
+		.tap	= { 1200, 1, 80 },
 	},
 };
 
diff --git a/drivers/ata/ahci_da850.c b/drivers/ata/ahci_da850.c
index 1a50cd3b..9b34dff 100644
--- a/drivers/ata/ahci_da850.c
+++ b/drivers/ata/ahci_da850.c
@@ -216,12 +216,16 @@ static int ahci_da850_probe(struct platform_device *pdev)
 		return rc;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-	if (!res)
+	if (!res) {
+		rc = -ENODEV;
 		goto disable_resources;
+	}
 
 	pwrdn_reg = devm_ioremap(dev, res->start, resource_size(res));
-	if (!pwrdn_reg)
+	if (!pwrdn_reg) {
+		rc = -ENOMEM;
 		goto disable_resources;
+	}
 
 	da850_sata_init(dev, pwrdn_reg, hpriv->mmio, mpy);
 
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index fa7dd43..1945a8e 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -2411,6 +2411,9 @@ static void ata_dev_config_trusted(struct ata_device *dev)
 	u64 trusted_cap;
 	unsigned int err;
 
+	if (!ata_id_has_trusted(dev->id))
+		return;
+
 	if (!ata_identify_page_supported(dev, ATA_LOG_SECURITY)) {
 		ata_dev_warn(dev,
 			     "Security Log not supported\n");
diff --git a/drivers/ata/libata-eh.c b/drivers/ata/libata-eh.c
index 3dbd055..e4effef 100644
--- a/drivers/ata/libata-eh.c
+++ b/drivers/ata/libata-eh.c
@@ -645,12 +645,11 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap,
 	 * completions are honored.  A scmd is determined to have
 	 * timed out iff its associated qc is active and not failed.
 	 */
+	spin_lock_irqsave(ap->lock, flags);
 	if (ap->ops->error_handler) {
 		struct scsi_cmnd *scmd, *tmp;
 		int nr_timedout = 0;
 
-		spin_lock_irqsave(ap->lock, flags);
-
 		/* This must occur under the ap->lock as we don't want
 		   a polled recovery to race the real interrupt handler
 
@@ -700,12 +699,11 @@ void ata_scsi_cmd_error_handler(struct Scsi_Host *host, struct ata_port *ap,
 		if (nr_timedout)
 			__ata_port_freeze(ap);
 
-		spin_unlock_irqrestore(ap->lock, flags);
 
 		/* initialize eh_tries */
 		ap->eh_tries = ATA_EH_MAX_TRIES;
-	} else
-		spin_unlock_wait(ap->lock);
+	}
+	spin_unlock_irqrestore(ap->lock, flags);
 
 }
 EXPORT_SYMBOL(ata_scsi_cmd_error_handler);
diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
index 792da68..2adb859 100644
--- a/drivers/block/xen-blkback/xenbus.c
+++ b/drivers/block/xen-blkback/xenbus.c
@@ -244,6 +244,7 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
 {
 	struct pending_req *req, *n;
 	unsigned int j, r;
+	bool busy = false;
 
 	for (r = 0; r < blkif->nr_rings; r++) {
 		struct xen_blkif_ring *ring = &blkif->rings[r];
@@ -261,8 +262,10 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
 		 * don't have any discard_io or other_io requests. So, checking
 		 * for inflight IO is enough.
 		 */
-		if (atomic_read(&ring->inflight) > 0)
-			return -EBUSY;
+		if (atomic_read(&ring->inflight) > 0) {
+			busy = true;
+			continue;
+		}
 
 		if (ring->irq) {
 			unbind_from_irqhandler(ring->irq, ring);
@@ -300,6 +303,9 @@ static int xen_blkif_disconnect(struct xen_blkif *blkif)
 		WARN_ON(i != (XEN_BLKIF_REQS_PER_PAGE * blkif->nr_ring_pages));
 		ring->active = false;
 	}
+	if (busy)
+		return -EBUSY;
+
 	blkif->nr_ring_pages = 0;
 	/*
 	 * blkif->rings was allocated in connect_ring, so we should free it in
diff --git a/drivers/char/agp/ali-agp.c b/drivers/char/agp/ali-agp.c
index dcbbb4e..89527ba 100644
--- a/drivers/char/agp/ali-agp.c
+++ b/drivers/char/agp/ali-agp.c
@@ -381,7 +381,7 @@ static void agp_ali_remove(struct pci_dev *pdev)
 	agp_put_bridge(bridge);
 }
 
-static struct pci_device_id agp_ali_pci_table[] = {
+static const struct pci_device_id agp_ali_pci_table[] = {
 	{
 	.class		= (PCI_CLASS_BRIDGE_HOST << 8),
 	.class_mask	= ~0,
diff --git a/drivers/char/agp/amd-k7-agp.c b/drivers/char/agp/amd-k7-agp.c
index 5fbd333..b450544 100644
--- a/drivers/char/agp/amd-k7-agp.c
+++ b/drivers/char/agp/amd-k7-agp.c
@@ -21,7 +21,7 @@
 #define AMD_TLBFLUSH	0x0c	/* In mmio region (32-bit register) */
 #define AMD_CACHEENTRY	0x10	/* In mmio region (32-bit register) */
 
-static struct pci_device_id agp_amdk7_pci_table[];
+static const struct pci_device_id agp_amdk7_pci_table[];
 
 struct amd_page_map {
 	unsigned long *real;
@@ -508,7 +508,7 @@ static int agp_amdk7_resume(struct pci_dev *pdev)
 #endif /* CONFIG_PM */
 
 /* must be the same order as name table above */
-static struct pci_device_id agp_amdk7_pci_table[] = {
+static const struct pci_device_id agp_amdk7_pci_table[] = {
 	{
 	.class		= (PCI_CLASS_BRIDGE_HOST << 8),
 	.class_mask	= ~0,
diff --git a/drivers/char/agp/amd64-agp.c b/drivers/char/agp/amd64-agp.c
index c99cd19..e50c29c 100644
--- a/drivers/char/agp/amd64-agp.c
+++ b/drivers/char/agp/amd64-agp.c
@@ -610,7 +610,7 @@ static int agp_amd64_resume(struct pci_dev *pdev)
 
 #endif /* CONFIG_PM */
 
-static struct pci_device_id agp_amd64_pci_table[] = {
+static const struct pci_device_id agp_amd64_pci_table[] = {
 	{
 	.class		= (PCI_CLASS_BRIDGE_HOST << 8),
 	.class_mask	= ~0,
diff --git a/drivers/char/agp/ati-agp.c b/drivers/char/agp/ati-agp.c
index 0b5ec7a..88b4cbe 100644
--- a/drivers/char/agp/ati-agp.c
+++ b/drivers/char/agp/ati-agp.c
@@ -540,7 +540,7 @@ static void agp_ati_remove(struct pci_dev *pdev)
 	agp_put_bridge(bridge);
 }
 
-static struct pci_device_id agp_ati_pci_table[] = {
+static const struct pci_device_id agp_ati_pci_table[] = {
 	{
 	.class		= (PCI_CLASS_BRIDGE_HOST << 8),
 	.class_mask	= ~0,
diff --git a/drivers/char/agp/efficeon-agp.c b/drivers/char/agp/efficeon-agp.c
index 533cb6d..7f88490 100644
--- a/drivers/char/agp/efficeon-agp.c
+++ b/drivers/char/agp/efficeon-agp.c
@@ -427,7 +427,7 @@ static int agp_efficeon_resume(struct pci_dev *pdev)
 }
 #endif
 
-static struct pci_device_id agp_efficeon_pci_table[] = {
+static const struct pci_device_id agp_efficeon_pci_table[] = {
 	{
 	.class		= (PCI_CLASS_BRIDGE_HOST << 8),
 	.class_mask	= ~0,
diff --git a/drivers/char/agp/intel-agp.c b/drivers/char/agp/intel-agp.c
index 0a21dae..9e4f27a 100644
--- a/drivers/char/agp/intel-agp.c
+++ b/drivers/char/agp/intel-agp.c
@@ -828,7 +828,7 @@ static int agp_intel_resume(struct pci_dev *pdev)
 }
 #endif
 
-static struct pci_device_id agp_intel_pci_table[] = {
+static const struct pci_device_id agp_intel_pci_table[] = {
 #define ID(x)						\
 	{						\
 	.class		= (PCI_CLASS_BRIDGE_HOST << 8),	\
diff --git a/drivers/char/agp/nvidia-agp.c b/drivers/char/agp/nvidia-agp.c
index 6c8d39c..828b344 100644
--- a/drivers/char/agp/nvidia-agp.c
+++ b/drivers/char/agp/nvidia-agp.c
@@ -420,7 +420,7 @@ static int agp_nvidia_resume(struct pci_dev *pdev)
 #endif
 
 
-static struct pci_device_id agp_nvidia_pci_table[] = {
+static const struct pci_device_id agp_nvidia_pci_table[] = {
 	{
 	.class		= (PCI_CLASS_BRIDGE_HOST << 8),
 	.class_mask	= ~0,
diff --git a/drivers/char/agp/sis-agp.c b/drivers/char/agp/sis-agp.c
index 2c74038..14909fc 100644
--- a/drivers/char/agp/sis-agp.c
+++ b/drivers/char/agp/sis-agp.c
@@ -237,7 +237,7 @@ static int agp_sis_resume(struct pci_dev *pdev)
 
 #endif /* CONFIG_PM */
 
-static struct pci_device_id agp_sis_pci_table[] = {
+static const struct pci_device_id agp_sis_pci_table[] = {
 	{
 		.class		= (PCI_CLASS_BRIDGE_HOST << 8),
 		.class_mask	= ~0,
diff --git a/drivers/char/agp/uninorth-agp.c b/drivers/char/agp/uninorth-agp.c
index fdced54..c381c8e 100644
--- a/drivers/char/agp/uninorth-agp.c
+++ b/drivers/char/agp/uninorth-agp.c
@@ -679,7 +679,7 @@ static void agp_uninorth_remove(struct pci_dev *pdev)
 	agp_put_bridge(bridge);
 }
 
-static struct pci_device_id agp_uninorth_pci_table[] = {
+static const struct pci_device_id agp_uninorth_pci_table[] = {
 	{
 	.class		= (PCI_CLASS_BRIDGE_HOST << 8),
 	.class_mask	= ~0,
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 56e0a0e..9a30279 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -48,7 +48,7 @@ static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(0);
  */
 u64 dma_fence_context_alloc(unsigned num)
 {
-	BUG_ON(!num);
+	WARN_ON(!num);
 	return atomic64_add_return(num, &dma_fence_context_counter) - num;
 }
 EXPORT_SYMBOL(dma_fence_context_alloc);
@@ -172,7 +172,7 @@ void dma_fence_release(struct kref *kref)
 
 	trace_dma_fence_destroy(fence);
 
-	BUG_ON(!list_empty(&fence->cb_list));
+	WARN_ON(!list_empty(&fence->cb_list));
 
 	if (fence->ops->release)
 		fence->ops->release(fence);
diff --git a/drivers/dma-buf/reservation.c b/drivers/dma-buf/reservation.c
index 393817e..dec3a81 100644
--- a/drivers/dma-buf/reservation.c
+++ b/drivers/dma-buf/reservation.c
@@ -195,8 +195,7 @@ reservation_object_add_shared_replace(struct reservation_object *obj,
 	if (old)
 		kfree_rcu(old, rcu);
 
-	if (old_fence)
-		dma_fence_put(old_fence);
+	dma_fence_put(old_fence);
 }
 
 /**
@@ -258,12 +257,71 @@ void reservation_object_add_excl_fence(struct reservation_object *obj,
 		dma_fence_put(rcu_dereference_protected(old->shared[i],
 						reservation_object_held(obj)));
 
-	if (old_fence)
-		dma_fence_put(old_fence);
+	dma_fence_put(old_fence);
 }
 EXPORT_SYMBOL(reservation_object_add_excl_fence);
 
 /**
+* reservation_object_copy_fences - Copy all fences from src to dst.
+* @dst: the destination reservation object
+* @src: the source reservation object
+*
+* Copy all fences from src to dst. Both src->lock and dst->lock must be
+* held.
+*/
+int reservation_object_copy_fences(struct reservation_object *dst,
+				   struct reservation_object *src)
+{
+	struct reservation_object_list *src_list, *dst_list;
+	struct dma_fence *old, *new;
+	size_t size;
+	unsigned i;
+
+	src_list = reservation_object_get_list(src);
+
+	if (src_list) {
+		size = offsetof(typeof(*src_list),
+				shared[src_list->shared_count]);
+		dst_list = kmalloc(size, GFP_KERNEL);
+		if (!dst_list)
+			return -ENOMEM;
+
+		dst_list->shared_count = src_list->shared_count;
+		dst_list->shared_max = src_list->shared_count;
+		for (i = 0; i < src_list->shared_count; ++i)
+			dst_list->shared[i] =
+				dma_fence_get(src_list->shared[i]);
+	} else {
+		dst_list = NULL;
+	}
+
+	kfree(dst->staged);
+	dst->staged = NULL;
+
+	src_list = reservation_object_get_list(dst);
+
+	old = reservation_object_get_excl(dst);
+	new = reservation_object_get_excl(src);
+
+	dma_fence_get(new);
+
+	preempt_disable();
+	write_seqcount_begin(&dst->seq);
+	/* write_seqcount_begin provides the necessary memory barrier */
+	RCU_INIT_POINTER(dst->fence_excl, new);
+	RCU_INIT_POINTER(dst->fence, dst_list);
+	write_seqcount_end(&dst->seq);
+	preempt_enable();
+
+	if (src_list)
+		kfree_rcu(src_list, rcu);
+	dma_fence_put(old);
+
+	return 0;
+}
+EXPORT_SYMBOL(reservation_object_copy_fences);
+
+/**
  * reservation_object_get_fences_rcu - Get an object's shared and exclusive
  * fences without update side lock held
  * @obj: the reservation object
@@ -373,12 +431,25 @@ long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
 	long ret = timeout ? timeout : 1;
 
 retry:
-	fence = NULL;
 	shared_count = 0;
 	seq = read_seqcount_begin(&obj->seq);
 	rcu_read_lock();
 
-	if (wait_all) {
+	fence = rcu_dereference(obj->fence_excl);
+	if (fence && !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) {
+		if (!dma_fence_get_rcu(fence))
+			goto unlock_retry;
+
+		if (dma_fence_is_signaled(fence)) {
+			dma_fence_put(fence);
+			fence = NULL;
+		}
+
+	} else {
+		fence = NULL;
+	}
+
+	if (!fence && wait_all) {
 		struct reservation_object_list *fobj =
 						rcu_dereference(obj->fence);
 
@@ -405,22 +476,6 @@ long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
 		}
 	}
 
-	if (!shared_count) {
-		struct dma_fence *fence_excl = rcu_dereference(obj->fence_excl);
-
-		if (fence_excl &&
-		    !test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
-			      &fence_excl->flags)) {
-			if (!dma_fence_get_rcu(fence_excl))
-				goto unlock_retry;
-
-			if (dma_fence_is_signaled(fence_excl))
-				dma_fence_put(fence_excl);
-			else
-				fence = fence_excl;
-		}
-	}
-
 	rcu_read_unlock();
 	if (fence) {
 		if (read_seqcount_retry(&obj->seq, seq)) {
diff --git a/drivers/dma-buf/sw_sync.c b/drivers/dma-buf/sw_sync.c
index 69c5ff3..38cc738 100644
--- a/drivers/dma-buf/sw_sync.c
+++ b/drivers/dma-buf/sw_sync.c
@@ -96,9 +96,9 @@ static struct sync_timeline *sync_timeline_create(const char *name)
 	obj->context = dma_fence_context_alloc(1);
 	strlcpy(obj->name, name, sizeof(obj->name));
 
-	INIT_LIST_HEAD(&obj->child_list_head);
-	INIT_LIST_HEAD(&obj->active_list_head);
-	spin_lock_init(&obj->child_list_lock);
+	obj->pt_tree = RB_ROOT;
+	INIT_LIST_HEAD(&obj->pt_list);
+	spin_lock_init(&obj->lock);
 
 	sync_timeline_debug_add(obj);
 
@@ -125,68 +125,6 @@ static void sync_timeline_put(struct sync_timeline *obj)
 	kref_put(&obj->kref, sync_timeline_free);
 }
 
-/**
- * sync_timeline_signal() - signal a status change on a sync_timeline
- * @obj:	sync_timeline to signal
- * @inc:	num to increment on timeline->value
- *
- * A sync implementation should call this any time one of it's fences
- * has signaled or has an error condition.
- */
-static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
-{
-	unsigned long flags;
-	struct sync_pt *pt, *next;
-
-	trace_sync_timeline(obj);
-
-	spin_lock_irqsave(&obj->child_list_lock, flags);
-
-	obj->value += inc;
-
-	list_for_each_entry_safe(pt, next, &obj->active_list_head,
-				 active_list) {
-		if (dma_fence_is_signaled_locked(&pt->base))
-			list_del_init(&pt->active_list);
-	}
-
-	spin_unlock_irqrestore(&obj->child_list_lock, flags);
-}
-
-/**
- * sync_pt_create() - creates a sync pt
- * @parent:	fence's parent sync_timeline
- * @size:	size to allocate for this pt
- * @inc:	value of the fence
- *
- * Creates a new sync_pt as a child of @parent.  @size bytes will be
- * allocated allowing for implementation specific data to be kept after
- * the generic sync_timeline struct. Returns the sync_pt object or
- * NULL in case of error.
- */
-static struct sync_pt *sync_pt_create(struct sync_timeline *obj, int size,
-			     unsigned int value)
-{
-	unsigned long flags;
-	struct sync_pt *pt;
-
-	if (size < sizeof(*pt))
-		return NULL;
-
-	pt = kzalloc(size, GFP_KERNEL);
-	if (!pt)
-		return NULL;
-
-	spin_lock_irqsave(&obj->child_list_lock, flags);
-	sync_timeline_get(obj);
-	dma_fence_init(&pt->base, &timeline_fence_ops, &obj->child_list_lock,
-		       obj->context, value);
-	list_add_tail(&pt->child_list, &obj->child_list_head);
-	INIT_LIST_HEAD(&pt->active_list);
-	spin_unlock_irqrestore(&obj->child_list_lock, flags);
-	return pt;
-}
-
 static const char *timeline_fence_get_driver_name(struct dma_fence *fence)
 {
 	return "sw_sync";
@@ -203,13 +141,17 @@ static void timeline_fence_release(struct dma_fence *fence)
 {
 	struct sync_pt *pt = dma_fence_to_sync_pt(fence);
 	struct sync_timeline *parent = dma_fence_parent(fence);
-	unsigned long flags;
 
-	spin_lock_irqsave(fence->lock, flags);
-	list_del(&pt->child_list);
-	if (!list_empty(&pt->active_list))
-		list_del(&pt->active_list);
-	spin_unlock_irqrestore(fence->lock, flags);
+	if (!list_empty(&pt->link)) {
+		unsigned long flags;
+
+		spin_lock_irqsave(fence->lock, flags);
+		if (!list_empty(&pt->link)) {
+			list_del(&pt->link);
+			rb_erase(&pt->node, &parent->pt_tree);
+		}
+		spin_unlock_irqrestore(fence->lock, flags);
+	}
 
 	sync_timeline_put(parent);
 	dma_fence_free(fence);
@@ -219,18 +161,11 @@ static bool timeline_fence_signaled(struct dma_fence *fence)
 {
 	struct sync_timeline *parent = dma_fence_parent(fence);
 
-	return (fence->seqno > parent->value) ? false : true;
+	return !__dma_fence_is_later(fence->seqno, parent->value);
 }
 
 static bool timeline_fence_enable_signaling(struct dma_fence *fence)
 {
-	struct sync_pt *pt = dma_fence_to_sync_pt(fence);
-	struct sync_timeline *parent = dma_fence_parent(fence);
-
-	if (timeline_fence_signaled(fence))
-		return false;
-
-	list_add_tail(&pt->active_list, &parent->active_list_head);
 	return true;
 }
 
@@ -259,6 +194,107 @@ static const struct dma_fence_ops timeline_fence_ops = {
 	.timeline_value_str = timeline_fence_timeline_value_str,
 };
 
+/**
+ * sync_timeline_signal() - signal a status change on a sync_timeline
+ * @obj:	sync_timeline to signal
+ * @inc:	num to increment on timeline->value
+ *
+ * A sync implementation should call this any time one of its fences
+ * has signaled or has an error condition.
+ */
+static void sync_timeline_signal(struct sync_timeline *obj, unsigned int inc)
+{
+	struct sync_pt *pt, *next;
+
+	trace_sync_timeline(obj);
+
+	spin_lock_irq(&obj->lock);
+
+	obj->value += inc;
+
+	list_for_each_entry_safe(pt, next, &obj->pt_list, link) {
+		if (!timeline_fence_signaled(&pt->base))
+			break;
+
+		list_del_init(&pt->link);
+		rb_erase(&pt->node, &obj->pt_tree);
+
+		/*
+		 * A signal callback may release the last reference to this
+		 * fence, causing it to be freed. That operation has to be
+		 * last to avoid a use after free inside this loop, and must
+		 * be after we remove the fence from the timeline in order to
+		 * prevent deadlocking on timeline->lock inside
+		 * timeline_fence_release().
+		 */
+		dma_fence_signal_locked(&pt->base);
+	}
+
+	spin_unlock_irq(&obj->lock);
+}
+
+/**
+ * sync_pt_create() - creates a sync pt
+ * @obj:	fence's parent sync_timeline
+ * @value:	value of the fence
+ *
+ * Creates a new sync_pt as a child of @obj.  If an unsignaled fence with
+ * the same value already exists on the timeline and a reference to it can
+ * be taken, that fence is returned instead of the new one.  Returns the
+ * sync_pt object or NULL in case of error.
+ */
+static struct sync_pt *sync_pt_create(struct sync_timeline *obj,
+				      unsigned int value)
+{
+	struct sync_pt *pt;
+
+	pt = kzalloc(sizeof(*pt), GFP_KERNEL);
+	if (!pt)
+		return NULL;
+
+	sync_timeline_get(obj);
+	dma_fence_init(&pt->base, &timeline_fence_ops, &obj->lock,
+		       obj->context, value);
+	INIT_LIST_HEAD(&pt->link);
+
+	spin_lock_irq(&obj->lock);
+	if (!dma_fence_is_signaled_locked(&pt->base)) {
+		struct rb_node **p = &obj->pt_tree.rb_node;
+		struct rb_node *parent = NULL;
+
+		while (*p) {
+			struct sync_pt *other;
+			int cmp;
+
+			parent = *p;
+			other = rb_entry(parent, typeof(*pt), node);
+			cmp = value - other->base.seqno;
+			if (cmp > 0) {
+				p = &parent->rb_right;
+			} else if (cmp < 0) {
+				p = &parent->rb_left;
+			} else {
+				if (dma_fence_get_rcu(&other->base)) {
+					dma_fence_put(&pt->base);
+					pt = other;
+					goto unlock;
+				}
+				p = &parent->rb_left;
+			}
+		}
+		rb_link_node(&pt->node, parent, p);
+		rb_insert_color(&pt->node, &obj->pt_tree);
+
+		parent = rb_next(&pt->node);
+		list_add_tail(&pt->link,
+			      parent ? &rb_entry(parent, typeof(*pt), node)->link : &obj->pt_list);
+	}
+unlock:
+	spin_unlock_irq(&obj->lock);
+
+	return pt;
+}
+
 /*
  * *WARNING*
  *
@@ -309,7 +345,7 @@ static long sw_sync_ioctl_create_fence(struct sync_timeline *obj,
 		goto err;
 	}
 
-	pt = sync_pt_create(obj, sizeof(*pt), data.value);
+	pt = sync_pt_create(obj, data.value);
 	if (!pt) {
 		err = -ENOMEM;
 		goto err;
@@ -345,6 +381,11 @@ static long sw_sync_ioctl_inc(struct sync_timeline *obj, unsigned long arg)
 	if (copy_from_user(&value, (void __user *)arg, sizeof(value)))
 		return -EFAULT;
 
+	while (value > INT_MAX) {
+		sync_timeline_signal(obj, INT_MAX);
+		value -= INT_MAX;
+	}
+
 	sync_timeline_signal(obj, value);
 
 	return 0;
diff --git a/drivers/dma-buf/sync_debug.c b/drivers/dma-buf/sync_debug.c
index 59a3b2f8..c4c8ecb 100644
--- a/drivers/dma-buf/sync_debug.c
+++ b/drivers/dma-buf/sync_debug.c
@@ -116,17 +116,15 @@ static void sync_print_fence(struct seq_file *s,
 static void sync_print_obj(struct seq_file *s, struct sync_timeline *obj)
 {
 	struct list_head *pos;
-	unsigned long flags;
 
 	seq_printf(s, "%s: %d\n", obj->name, obj->value);
 
-	spin_lock_irqsave(&obj->child_list_lock, flags);
-	list_for_each(pos, &obj->child_list_head) {
-		struct sync_pt *pt =
-			container_of(pos, struct sync_pt, child_list);
+	spin_lock_irq(&obj->lock);
+	list_for_each(pos, &obj->pt_list) {
+		struct sync_pt *pt = container_of(pos, struct sync_pt, link);
 		sync_print_fence(s, &pt->base, false);
 	}
-	spin_unlock_irqrestore(&obj->child_list_lock, flags);
+	spin_unlock_irq(&obj->lock);
 }
 
 static void sync_print_sync_file(struct seq_file *s,
@@ -151,12 +149,11 @@ static void sync_print_sync_file(struct seq_file *s,
 
 static int sync_debugfs_show(struct seq_file *s, void *unused)
 {
-	unsigned long flags;
 	struct list_head *pos;
 
 	seq_puts(s, "objs:\n--------------\n");
 
-	spin_lock_irqsave(&sync_timeline_list_lock, flags);
+	spin_lock_irq(&sync_timeline_list_lock);
 	list_for_each(pos, &sync_timeline_list_head) {
 		struct sync_timeline *obj =
 			container_of(pos, struct sync_timeline,
@@ -165,11 +162,11 @@ static int sync_debugfs_show(struct seq_file *s, void *unused)
 		sync_print_obj(s, obj);
 		seq_putc(s, '\n');
 	}
-	spin_unlock_irqrestore(&sync_timeline_list_lock, flags);
+	spin_unlock_irq(&sync_timeline_list_lock);
 
 	seq_puts(s, "fences:\n--------------\n");
 
-	spin_lock_irqsave(&sync_file_list_lock, flags);
+	spin_lock_irq(&sync_file_list_lock);
 	list_for_each(pos, &sync_file_list_head) {
 		struct sync_file *sync_file =
 			container_of(pos, struct sync_file, sync_file_list);
@@ -177,7 +174,7 @@ static int sync_debugfs_show(struct seq_file *s, void *unused)
 		sync_print_sync_file(s, sync_file);
 		seq_putc(s, '\n');
 	}
-	spin_unlock_irqrestore(&sync_file_list_lock, flags);
+	spin_unlock_irq(&sync_file_list_lock);
 	return 0;
 }
 
diff --git a/drivers/dma-buf/sync_debug.h b/drivers/dma-buf/sync_debug.h
index 26fe8b9..d615a89 100644
--- a/drivers/dma-buf/sync_debug.h
+++ b/drivers/dma-buf/sync_debug.h
@@ -14,6 +14,7 @@
 #define _LINUX_SYNC_H
 
 #include <linux/list.h>
+#include <linux/rbtree.h>
 #include <linux/spinlock.h>
 #include <linux/dma-fence.h>
 
@@ -24,42 +25,41 @@
  * struct sync_timeline - sync object
  * @kref:		reference count on fence.
  * @name:		name of the sync_timeline. Useful for debugging
- * @child_list_head:	list of children sync_pts for this sync_timeline
- * @child_list_lock:	lock protecting @child_list_head and fence.status
- * @active_list_head:	list of active (unsignaled/errored) sync_pts
+ * @lock:		lock protecting @pt_list and @value
+ * @pt_tree:		rbtree of active (unsignaled/errored) sync_pts
+ * @pt_list:		list of active (unsignaled/errored) sync_pts
  * @sync_timeline_list:	membership in global sync_timeline_list
  */
 struct sync_timeline {
 	struct kref		kref;
 	char			name[32];
 
-	/* protected by child_list_lock */
+	/* protected by lock */
 	u64			context;
 	int			value;
 
-	struct list_head	child_list_head;
-	spinlock_t		child_list_lock;
-
-	struct list_head	active_list_head;
+	struct rb_root		pt_tree;
+	struct list_head	pt_list;
+	spinlock_t		lock;
 
 	struct list_head	sync_timeline_list;
 };
 
 static inline struct sync_timeline *dma_fence_parent(struct dma_fence *fence)
 {
-	return container_of(fence->lock, struct sync_timeline, child_list_lock);
+	return container_of(fence->lock, struct sync_timeline, lock);
 }
 
 /**
  * struct sync_pt - sync_pt object
  * @base: base fence object
- * @child_list: sync timeline child's list
- * @active_list: sync timeline active child's list
+ * @link: link on the sync timeline's list
+ * @node: node in the sync timeline's tree
  */
 struct sync_pt {
 	struct dma_fence base;
-	struct list_head child_list;
-	struct list_head active_list;
+	struct list_head link;
+	struct rb_node node;
 };
 
 #ifdef CONFIG_SW_SYNC
diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
index 24a066e..a8acc19 100644
--- a/drivers/gpu/drm/Makefile
+++ b/drivers/gpu/drm/Makefile
@@ -33,7 +33,7 @@
 		drm_plane_helper.o drm_dp_mst_topology.o drm_atomic_helper.o \
 		drm_kms_helper_common.o drm_dp_dual_mode_helper.o \
 		drm_simple_kms_helper.o drm_modeset_helper.o \
-		drm_scdc_helper.o
+		drm_scdc_helper.o drm_gem_framebuffer_helper.o
 
 drm_kms_helper-$(CONFIG_DRM_PANEL_BRIDGE) += bridge/panel.o
 drm_kms_helper-$(CONFIG_DRM_LOAD_EDID_FIRMWARE) += drm_edid_load.o
diff --git a/drivers/gpu/drm/amd/amdgpu/Makefile b/drivers/gpu/drm/amd/amdgpu/Makefile
index faea634..658bac0 100644
--- a/drivers/gpu/drm/amd/amdgpu/Makefile
+++ b/drivers/gpu/drm/amd/amdgpu/Makefile
@@ -25,7 +25,7 @@
 	amdgpu_prime.o amdgpu_vm.o amdgpu_ib.o amdgpu_pll.o \
 	amdgpu_ucode.o amdgpu_bo_list.o amdgpu_ctx.o amdgpu_sync.o \
 	amdgpu_gtt_mgr.o amdgpu_vram_mgr.o amdgpu_virt.o amdgpu_atomfirmware.o \
-	amdgpu_queue_mgr.o
+	amdgpu_queue_mgr.o amdgpu_vf_error.o
 
 # add asic specific block
 amdgpu-$(CONFIG_DRM_AMDGPU_CIK)+= cik.o cik_ih.o kv_smc.o kv_dpm.o \
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index ff7bf1a9..12e71bb 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -68,13 +68,16 @@
 
 #include "gpu_scheduler.h"
 #include "amdgpu_virt.h"
+#include "amdgpu_gart.h"
 
 /*
  * Modules parameters.
  */
 extern int amdgpu_modeset;
 extern int amdgpu_vram_limit;
-extern int amdgpu_gart_size;
+extern int amdgpu_vis_vram_limit;
+extern unsigned amdgpu_gart_size;
+extern int amdgpu_gtt_size;
 extern int amdgpu_moverate;
 extern int amdgpu_benchmarking;
 extern int amdgpu_testing;
@@ -93,6 +96,7 @@ extern int amdgpu_bapm;
 extern int amdgpu_deep_color;
 extern int amdgpu_vm_size;
 extern int amdgpu_vm_block_size;
+extern int amdgpu_vm_fragment_size;
 extern int amdgpu_vm_fault_stop;
 extern int amdgpu_vm_debug;
 extern int amdgpu_vm_update_mode;
@@ -104,6 +108,7 @@ extern unsigned amdgpu_pcie_gen_cap;
 extern unsigned amdgpu_pcie_lane_cap;
 extern unsigned amdgpu_cg_mask;
 extern unsigned amdgpu_pg_mask;
+extern unsigned amdgpu_sdma_phase_quantum;
 extern char *amdgpu_disable_cu;
 extern char *amdgpu_virtual_display;
 extern unsigned amdgpu_pp_feature_mask;
@@ -369,78 +374,10 @@ struct amdgpu_clock {
 };
 
 /*
- * BO.
+ * GEM.
  */
-struct amdgpu_bo_list_entry {
-	struct amdgpu_bo		*robj;
-	struct ttm_validate_buffer	tv;
-	struct amdgpu_bo_va		*bo_va;
-	uint32_t			priority;
-	struct page			**user_pages;
-	int				user_invalidated;
-};
-
-struct amdgpu_bo_va_mapping {
-	struct list_head		list;
-	struct rb_node			rb;
-	uint64_t			start;
-	uint64_t			last;
-	uint64_t			__subtree_last;
-	uint64_t			offset;
-	uint64_t			flags;
-};
-
-/* bo virtual addresses in a specific vm */
-struct amdgpu_bo_va {
-	/* protected by bo being reserved */
-	struct list_head		bo_list;
-	struct dma_fence	        *last_pt_update;
-	unsigned			ref_count;
-
-	/* protected by vm mutex and spinlock */
-	struct list_head		vm_status;
-
-	/* mappings for this bo_va */
-	struct list_head		invalids;
-	struct list_head		valids;
-
-	/* constant after initialization */
-	struct amdgpu_vm		*vm;
-	struct amdgpu_bo		*bo;
-};
 
 #define AMDGPU_GEM_DOMAIN_MAX		0x3
-
-struct amdgpu_bo {
-	/* Protected by tbo.reserved */
-	u32				prefered_domains;
-	u32				allowed_domains;
-	struct ttm_place		placements[AMDGPU_GEM_DOMAIN_MAX + 1];
-	struct ttm_placement		placement;
-	struct ttm_buffer_object	tbo;
-	struct ttm_bo_kmap_obj		kmap;
-	u64				flags;
-	unsigned			pin_count;
-	void				*kptr;
-	u64				tiling_flags;
-	u64				metadata_flags;
-	void				*metadata;
-	u32				metadata_size;
-	unsigned			prime_shared_count;
-	/* list of all virtual address to which this bo
-	 * is associated to
-	 */
-	struct list_head		va;
-	/* Constant after initialization */
-	struct drm_gem_object		gem_base;
-	struct amdgpu_bo		*parent;
-	struct amdgpu_bo		*shadow;
-
-	struct ttm_bo_kmap_obj		dma_buf_vmap;
-	struct amdgpu_mn		*mn;
-	struct list_head		mn_list;
-	struct list_head		shadow_list;
-};
 #define gem_to_amdgpu_bo(gobj) container_of((gobj), struct amdgpu_bo, gem_base)
 
 void amdgpu_gem_object_free(struct drm_gem_object *obj);
@@ -532,49 +469,6 @@ int amdgpu_fence_slab_init(void);
 void amdgpu_fence_slab_fini(void);
 
 /*
- * GART structures, functions & helpers
- */
-struct amdgpu_mc;
-
-#define AMDGPU_GPU_PAGE_SIZE 4096
-#define AMDGPU_GPU_PAGE_MASK (AMDGPU_GPU_PAGE_SIZE - 1)
-#define AMDGPU_GPU_PAGE_SHIFT 12
-#define AMDGPU_GPU_PAGE_ALIGN(a) (((a) + AMDGPU_GPU_PAGE_MASK) & ~AMDGPU_GPU_PAGE_MASK)
-
-struct amdgpu_gart {
-	dma_addr_t			table_addr;
-	struct amdgpu_bo		*robj;
-	void				*ptr;
-	unsigned			num_gpu_pages;
-	unsigned			num_cpu_pages;
-	unsigned			table_size;
-#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
-	struct page			**pages;
-#endif
-	bool				ready;
-
-	/* Asic default pte flags */
-	uint64_t			gart_pte_flags;
-
-	const struct amdgpu_gart_funcs *gart_funcs;
-};
-
-int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev);
-void amdgpu_gart_table_ram_free(struct amdgpu_device *adev);
-int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);
-void amdgpu_gart_table_vram_free(struct amdgpu_device *adev);
-int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
-void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
-int amdgpu_gart_init(struct amdgpu_device *adev);
-void amdgpu_gart_fini(struct amdgpu_device *adev);
-int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
-			int pages);
-int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset,
-		     int pages, struct page **pagelist,
-		     dma_addr_t *dma_addr, uint64_t flags);
-int amdgpu_ttm_recover_gart(struct amdgpu_device *adev);
-
-/*
  * VMHUB structures, functions & helpers
  */
 struct amdgpu_vmhub {
@@ -598,22 +492,20 @@ struct amdgpu_mc {
 	 * about vram size near mc fb location */
 	u64			mc_vram_size;
 	u64			visible_vram_size;
-	u64			gtt_size;
-	u64			gtt_start;
-	u64			gtt_end;
+	u64			gart_size;
+	u64			gart_start;
+	u64			gart_end;
 	u64			vram_start;
 	u64			vram_end;
 	unsigned		vram_width;
 	u64			real_vram_size;
 	int			vram_mtrr;
-	u64                     gtt_base_align;
 	u64                     mc_mask;
 	const struct firmware   *fw;	/* MC firmware */
 	uint32_t                fw_version;
 	struct amdgpu_irq_src	vm_fault;
 	uint32_t		vram_type;
 	uint32_t                srbm_soft_reset;
-	struct amdgpu_mode_mc_save save;
 	bool			prt_warning;
 	uint64_t		stolen_size;
 	/* apertures */
@@ -719,15 +611,15 @@ typedef enum _AMDGPU_DOORBELL64_ASSIGNMENT
 	/* overlap the doorbell assignment with VCN as they are  mutually exclusive
 	 * VCE engine's doorbell is 32 bit and two VCE ring share one QWORD
 	 */
-	AMDGPU_DOORBELL64_RING0_1                 = 0xF8,
-	AMDGPU_DOORBELL64_RING2_3                 = 0xF9,
-	AMDGPU_DOORBELL64_RING4_5                 = 0xFA,
-	AMDGPU_DOORBELL64_RING6_7                 = 0xFB,
+	AMDGPU_DOORBELL64_UVD_RING0_1             = 0xF8,
+	AMDGPU_DOORBELL64_UVD_RING2_3             = 0xF9,
+	AMDGPU_DOORBELL64_UVD_RING4_5             = 0xFA,
+	AMDGPU_DOORBELL64_UVD_RING6_7             = 0xFB,
 
-	AMDGPU_DOORBELL64_UVD_RING0_1             = 0xFC,
-	AMDGPU_DOORBELL64_UVD_RING2_3             = 0xFD,
-	AMDGPU_DOORBELL64_UVD_RING4_5             = 0xFE,
-	AMDGPU_DOORBELL64_UVD_RING6_7             = 0xFF,
+	AMDGPU_DOORBELL64_VCE_RING0_1             = 0xFC,
+	AMDGPU_DOORBELL64_VCE_RING2_3             = 0xFD,
+	AMDGPU_DOORBELL64_VCE_RING4_5             = 0xFE,
+	AMDGPU_DOORBELL64_VCE_RING6_7             = 0xFF,
 
 	AMDGPU_DOORBELL64_MAX_ASSIGNMENT          = 0xFF,
 	AMDGPU_DOORBELL64_INVALID                 = 0xFFFF
@@ -857,6 +749,7 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr);
 struct amdgpu_fpriv {
 	struct amdgpu_vm	vm;
 	struct amdgpu_bo_va	*prt_va;
+	struct amdgpu_bo_va	*csa_va;
 	struct mutex		bo_list_lock;
 	struct idr		bo_list_handles;
 	struct amdgpu_ctx_mgr	ctx_mgr;
@@ -866,6 +759,14 @@ struct amdgpu_fpriv {
 /*
  * residency list
  */
+struct amdgpu_bo_list_entry {
+	struct amdgpu_bo		*robj;
+	struct ttm_validate_buffer	tv;
+	struct amdgpu_bo_va		*bo_va;
+	uint32_t			priority;
+	struct page			**user_pages;
+	int				user_invalidated;
+};
 
 struct amdgpu_bo_list {
 	struct mutex lock;
@@ -1159,7 +1060,9 @@ struct amdgpu_cs_parser {
 	struct list_head		validated;
 	struct dma_fence		*fence;
 	uint64_t			bytes_moved_threshold;
+	uint64_t			bytes_moved_vis_threshold;
 	uint64_t			bytes_moved;
+	uint64_t			bytes_moved_vis;
 	struct amdgpu_bo_list_entry	*evictable;
 
 	/* user fence */
@@ -1230,8 +1133,6 @@ struct amdgpu_wb {
 
 int amdgpu_wb_get(struct amdgpu_device *adev, u32 *wb);
 void amdgpu_wb_free(struct amdgpu_device *adev, u32 wb);
-int amdgpu_wb_get_64bit(struct amdgpu_device *adev, u32 *wb);
-void amdgpu_wb_free_64bit(struct amdgpu_device *adev, u32 wb);
 
 void amdgpu_get_pcie_info(struct amdgpu_device *adev);
 
@@ -1525,7 +1426,7 @@ struct amdgpu_device {
 	bool				is_atom_fw;
 	uint8_t				*bios;
 	uint32_t			bios_size;
-	struct amdgpu_bo		*stollen_vga_memory;
+	struct amdgpu_bo		*stolen_vga_memory;
 	uint32_t			bios_scratch_reg_offset;
 	uint32_t			bios_scratch[AMDGPU_BIOS_NUM_SCRATCH];
 
@@ -1557,6 +1458,10 @@ struct amdgpu_device {
 	spinlock_t gc_cac_idx_lock;
 	amdgpu_rreg_t			gc_cac_rreg;
 	amdgpu_wreg_t			gc_cac_wreg;
+	/* protects concurrent se_cac register access */
+	spinlock_t se_cac_idx_lock;
+	amdgpu_rreg_t			se_cac_rreg;
+	amdgpu_wreg_t			se_cac_wreg;
 	/* protects concurrent ENDPOINT (audio) register access */
 	spinlock_t audio_endpt_idx_lock;
 	amdgpu_block_rreg_t		audio_endpt_rreg;
@@ -1579,9 +1484,6 @@ struct amdgpu_device {
 	struct amdgpu_mman		mman;
 	struct amdgpu_vram_scratch	vram_scratch;
 	struct amdgpu_wb		wb;
-	atomic64_t			vram_usage;
-	atomic64_t			vram_vis_usage;
-	atomic64_t			gtt_usage;
 	atomic64_t			num_bytes_moved;
 	atomic64_t			num_evictions;
 	atomic64_t			num_vram_cpu_page_faults;
@@ -1593,6 +1495,7 @@ struct amdgpu_device {
 		spinlock_t		lock;
 		s64			last_update_us;
 		s64			accum_us; /* accumulated microseconds */
+		s64			accum_us_vis; /* for visible VRAM */
 		u32			log2_max_MBps;
 	} mm_stats;
 
@@ -1687,6 +1590,8 @@ struct amdgpu_device {
 	bool has_hw_reset;
 	u8				reset_magic[AMDGPU_RESET_MAGIC_NUM];
 
+	/* record last mm index being written through WREG32 */
+	unsigned long last_mm_index;
 };
 
 static inline struct amdgpu_device *amdgpu_ttm_adev(struct ttm_bo_device *bdev)
@@ -1742,6 +1647,8 @@ void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v);
 #define WREG32_DIDT(reg, v) adev->didt_wreg(adev, (reg), (v))
 #define RREG32_GC_CAC(reg) adev->gc_cac_rreg(adev, (reg))
 #define WREG32_GC_CAC(reg, v) adev->gc_cac_wreg(adev, (reg), (v))
+#define RREG32_SE_CAC(reg) adev->se_cac_rreg(adev, (reg))
+#define WREG32_SE_CAC(reg, v) adev->se_cac_wreg(adev, (reg), (v))
 #define RREG32_AUDIO_ENDPT(block, reg) adev->audio_endpt_rreg(adev, (block), (reg))
 #define WREG32_AUDIO_ENDPT(block, reg, v) adev->audio_endpt_wreg(adev, (block), (reg), (v))
 #define WREG32_P(reg, val, mask)				\
@@ -1792,50 +1699,6 @@ void amdgpu_mm_wdoorbell64(struct amdgpu_device *adev, u32 index, u64 v);
 #define RBIOS16(i) (RBIOS8(i) | (RBIOS8((i)+1) << 8))
 #define RBIOS32(i) ((RBIOS16(i)) | (RBIOS16((i)+2) << 16))
 
-/*
- * RING helpers.
- */
-static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
-{
-	if (ring->count_dw <= 0)
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-	ring->ring[ring->wptr++ & ring->buf_mask] = v;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw--;
-}
-
-static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring, void *src, int count_dw)
-{
-	unsigned occupied, chunk1, chunk2;
-	void *dst;
-
-	if (unlikely(ring->count_dw < count_dw)) {
-		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
-		return;
-	}
-
-	occupied = ring->wptr & ring->buf_mask;
-	dst = (void *)&ring->ring[occupied];
-	chunk1 = ring->buf_mask + 1 - occupied;
-	chunk1 = (chunk1 >= count_dw) ? count_dw: chunk1;
-	chunk2 = count_dw - chunk1;
-	chunk1 <<= 2;
-	chunk2 <<= 2;
-
-	if (chunk1)
-		memcpy(dst, src, chunk1);
-
-	if (chunk2) {
-		src += chunk1;
-		dst = (void *)ring->ring;
-		memcpy(dst, src, chunk2);
-	}
-
-	ring->wptr += count_dw;
-	ring->wptr &= ring->ptr_mask;
-	ring->count_dw -= count_dw;
-}
-
 static inline struct amdgpu_sdma_instance *
 amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 {
@@ -1898,7 +1761,6 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_ih_get_wptr(adev) (adev)->irq.ih_funcs->get_wptr((adev))
 #define amdgpu_ih_decode_iv(adev, iv) (adev)->irq.ih_funcs->decode_iv((adev), (iv))
 #define amdgpu_ih_set_rptr(adev) (adev)->irq.ih_funcs->set_rptr((adev))
-#define amdgpu_display_set_vga_render_state(adev, r) (adev)->mode_info.funcs->set_vga_render_state((adev), (r))
 #define amdgpu_display_vblank_get_counter(adev, crtc) (adev)->mode_info.funcs->vblank_get_counter((adev), (crtc))
 #define amdgpu_display_vblank_wait(adev, crtc) (adev)->mode_info.funcs->vblank_wait((adev), (crtc))
 #define amdgpu_display_backlight_set_level(adev, e, l) (adev)->mode_info.funcs->backlight_set_level((e), (l))
@@ -1911,8 +1773,6 @@ amdgpu_get_sdma_instance(struct amdgpu_ring *ring)
 #define amdgpu_display_page_flip_get_scanoutpos(adev, crtc, vbl, pos) (adev)->mode_info.funcs->page_flip_get_scanoutpos((adev), (crtc), (vbl), (pos))
 #define amdgpu_display_add_encoder(adev, e, s, c) (adev)->mode_info.funcs->add_encoder((adev), (e), (s), (c))
 #define amdgpu_display_add_connector(adev, ci, sd, ct, ib, coi, h, r) (adev)->mode_info.funcs->add_connector((adev), (ci), (sd), (ct), (ib), (coi), (h), (r))
-#define amdgpu_display_stop_mc_access(adev, s) (adev)->mode_info.funcs->stop_mc_access((adev), (s))
-#define amdgpu_display_resume_mc_access(adev, s) (adev)->mode_info.funcs->resume_mc_access((adev), (s))
 #define amdgpu_emit_copy_buffer(adev, ib, s, d, b) (adev)->mman.buffer_funcs->emit_copy_buffer((ib),  (s), (d), (b))
 #define amdgpu_emit_fill_buffer(adev, ib, s, d, b) (adev)->mman.buffer_funcs->emit_fill_buffer((ib), (s), (d), (b))
 #define amdgpu_gfx_get_gpu_clock_counter(adev) (adev)->gfx.funcs->get_gpu_clock_counter((adev))
@@ -1927,7 +1787,8 @@ void amdgpu_pci_config_reset(struct amdgpu_device *adev);
 bool amdgpu_need_post(struct amdgpu_device *adev);
 void amdgpu_update_display_priority(struct amdgpu_device *adev);
 
-void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes);
+void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
+				  u64 num_vis_bytes);
 void amdgpu_ttm_placement_from_domain(struct amdgpu_bo *abo, u32 domain);
 bool amdgpu_ttm_bo_is_amdgpu_bo(struct ttm_buffer_object *bo);
 int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages);
@@ -1943,7 +1804,7 @@ bool amdgpu_ttm_tt_is_readonly(struct ttm_tt *ttm);
 uint64_t amdgpu_ttm_tt_pte_flags(struct amdgpu_device *adev, struct ttm_tt *ttm,
 				 struct ttm_mem_reg *mem);
 void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, u64 base);
-void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
+void amdgpu_gart_location(struct amdgpu_device *adev, struct amdgpu_mc *mc);
 void amdgpu_ttm_set_active_vram_size(struct amdgpu_device *adev, u64 size);
 int amdgpu_ttm_init(struct amdgpu_device *adev);
 void amdgpu_ttm_fini(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
index 06879d1..a52795d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acp.c
@@ -285,19 +285,20 @@ static int acp_hw_init(void *handle)
 		return 0;
 	else if (r)
 		return r;
+	if (adev->asic_type != CHIP_STONEY) {
+		adev->acp.acp_genpd = kzalloc(sizeof(struct acp_pm_domain), GFP_KERNEL);
+		if (adev->acp.acp_genpd == NULL)
+			return -ENOMEM;
 
-	adev->acp.acp_genpd = kzalloc(sizeof(struct acp_pm_domain), GFP_KERNEL);
-	if (adev->acp.acp_genpd == NULL)
-		return -ENOMEM;
-
-	adev->acp.acp_genpd->gpd.name = "ACP_AUDIO";
-	adev->acp.acp_genpd->gpd.power_off = acp_poweroff;
-	adev->acp.acp_genpd->gpd.power_on = acp_poweron;
+		adev->acp.acp_genpd->gpd.name = "ACP_AUDIO";
+		adev->acp.acp_genpd->gpd.power_off = acp_poweroff;
+		adev->acp.acp_genpd->gpd.power_on = acp_poweron;
 
 
-	adev->acp.acp_genpd->cgs_dev = adev->acp.cgs_device;
+		adev->acp.acp_genpd->cgs_dev = adev->acp.cgs_device;
 
-	pm_genpd_init(&adev->acp.acp_genpd->gpd, NULL, false);
+		pm_genpd_init(&adev->acp.acp_genpd->gpd, NULL, false);
+	}
 
 	adev->acp.acp_cell = kzalloc(sizeof(struct mfd_cell) * ACP_DEVS,
 							GFP_KERNEL);
@@ -319,14 +320,29 @@ static int acp_hw_init(void *handle)
 		return -ENOMEM;
 	}
 
-	i2s_pdata[0].quirks = DW_I2S_QUIRK_COMP_REG_OFFSET;
+	switch (adev->asic_type) {
+	case CHIP_STONEY:
+		i2s_pdata[0].quirks = DW_I2S_QUIRK_COMP_REG_OFFSET |
+			DW_I2S_QUIRK_16BIT_IDX_OVERRIDE;
+		break;
+	default:
+		i2s_pdata[0].quirks = DW_I2S_QUIRK_COMP_REG_OFFSET;
+	}
 	i2s_pdata[0].cap = DWC_I2S_PLAY;
 	i2s_pdata[0].snd_rates = SNDRV_PCM_RATE_8000_96000;
 	i2s_pdata[0].i2s_reg_comp1 = ACP_I2S_COMP1_PLAY_REG_OFFSET;
 	i2s_pdata[0].i2s_reg_comp2 = ACP_I2S_COMP2_PLAY_REG_OFFSET;
+	switch (adev->asic_type) {
+	case CHIP_STONEY:
+		i2s_pdata[1].quirks = DW_I2S_QUIRK_COMP_REG_OFFSET |
+			DW_I2S_QUIRK_COMP_PARAM1 |
+			DW_I2S_QUIRK_16BIT_IDX_OVERRIDE;
+		break;
+	default:
+		i2s_pdata[1].quirks = DW_I2S_QUIRK_COMP_REG_OFFSET |
+			DW_I2S_QUIRK_COMP_PARAM1;
+	}
 
-	i2s_pdata[1].quirks = DW_I2S_QUIRK_COMP_REG_OFFSET |
-				DW_I2S_QUIRK_COMP_PARAM1;
 	i2s_pdata[1].cap = DWC_I2S_RECORD;
 	i2s_pdata[1].snd_rates = SNDRV_PCM_RATE_8000_96000;
 	i2s_pdata[1].i2s_reg_comp1 = ACP_I2S_COMP1_CAP_REG_OFFSET;
@@ -373,12 +389,14 @@ static int acp_hw_init(void *handle)
 	if (r)
 		return r;
 
-	for (i = 0; i < ACP_DEVS ; i++) {
-		dev = get_mfd_cell_dev(adev->acp.acp_cell[i].name, i);
-		r = pm_genpd_add_device(&adev->acp.acp_genpd->gpd, dev);
-		if (r) {
-			dev_err(dev, "Failed to add dev to genpd\n");
-			return r;
+	if (adev->asic_type != CHIP_STONEY) {
+		for (i = 0; i < ACP_DEVS ; i++) {
+			dev = get_mfd_cell_dev(adev->acp.acp_cell[i].name, i);
+			r = pm_genpd_add_device(&adev->acp.acp_genpd->gpd, dev);
+			if (r) {
+				dev_err(dev, "Failed to add dev to genpd\n");
+				return r;
+			}
 		}
 	}
 
@@ -398,20 +416,22 @@ static int acp_hw_fini(void *handle)
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
 	/* return early if no ACP */
-	if (!adev->acp.acp_genpd)
+	if (!adev->acp.acp_cell)
 		return 0;
 
-	for (i = 0; i < ACP_DEVS ; i++) {
-		dev = get_mfd_cell_dev(adev->acp.acp_cell[i].name, i);
-		ret = pm_genpd_remove_device(&adev->acp.acp_genpd->gpd, dev);
-		/* If removal fails, dont giveup and try rest */
-		if (ret)
-			dev_err(dev, "remove dev from genpd failed\n");
+	if (adev->acp.acp_genpd) {
+		for (i = 0; i < ACP_DEVS ; i++) {
+			dev = get_mfd_cell_dev(adev->acp.acp_cell[i].name, i);
+			ret = pm_genpd_remove_device(&adev->acp.acp_genpd->gpd, dev);
+			/* If removal fails, don't give up; try the rest */
+			if (ret)
+				dev_err(dev, "remove dev from genpd failed\n");
+		}
+		kfree(adev->acp.acp_genpd);
 	}
 
 	mfd_remove_devices(adev->acp.parent);
 	kfree(adev->acp.acp_res);
-	kfree(adev->acp.acp_genpd);
 	kfree(adev->acp.acp_cell);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index ef79551..57afad7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -30,10 +30,10 @@
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
 #include "amdgpu.h"
+#include "amdgpu_pm.h"
 #include "amd_acpi.h"
 #include "atom.h"
 
-extern void amdgpu_pm_acpi_event_handler(struct amdgpu_device *adev);
 /* Call the ATIF method
  */
 /**
@@ -289,7 +289,7 @@ static int amdgpu_atif_get_sbios_requests(acpi_handle handle,
  * handles it.
  * Returns NOTIFY code
  */
-int amdgpu_atif_handler(struct amdgpu_device *adev,
+static int amdgpu_atif_handler(struct amdgpu_device *adev,
 			struct acpi_bus_event *event)
 {
 	struct amdgpu_atif *atif = &adev->atif;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
index 37971d9..5432af3 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c
@@ -27,16 +27,15 @@
 #include "amdgpu_gfx.h"
 #include <linux/module.h>
 
-const struct kfd2kgd_calls *kfd2kgd;
 const struct kgd2kfd_calls *kgd2kfd;
-bool (*kgd2kfd_init_p)(unsigned, const struct kgd2kfd_calls**);
+bool (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
 
 int amdgpu_amdkfd_init(void)
 {
 	int ret;
 
 #if defined(CONFIG_HSA_AMD_MODULE)
-	int (*kgd2kfd_init_p)(unsigned, const struct kgd2kfd_calls**);
+	int (*kgd2kfd_init_p)(unsigned int, const struct kgd2kfd_calls**);
 
 	kgd2kfd_init_p = symbol_request(kgd2kfd_init);
 
@@ -61,24 +60,6 @@ int amdgpu_amdkfd_init(void)
 	return ret;
 }
 
-bool amdgpu_amdkfd_load_interface(struct amdgpu_device *adev)
-{
-	switch (adev->asic_type) {
-#ifdef CONFIG_DRM_AMDGPU_CIK
-	case CHIP_KAVERI:
-		kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
-		break;
-#endif
-	case CHIP_CARRIZO:
-		kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
-		break;
-	default:
-		return false;
-	}
-
-	return true;
-}
-
 void amdgpu_amdkfd_fini(void)
 {
 	if (kgd2kfd) {
@@ -89,9 +70,27 @@ void amdgpu_amdkfd_fini(void)
 
 void amdgpu_amdkfd_device_probe(struct amdgpu_device *adev)
 {
-	if (kgd2kfd)
-		adev->kfd = kgd2kfd->probe((struct kgd_dev *)adev,
-					adev->pdev, kfd2kgd);
+	const struct kfd2kgd_calls *kfd2kgd;
+
+	if (!kgd2kfd)
+		return;
+
+	switch (adev->asic_type) {
+#ifdef CONFIG_DRM_AMDGPU_CIK
+	case CHIP_KAVERI:
+		kfd2kgd = amdgpu_amdkfd_gfx_7_get_functions();
+		break;
+#endif
+	case CHIP_CARRIZO:
+		kfd2kgd = amdgpu_amdkfd_gfx_8_0_get_functions();
+		break;
+	default:
+		dev_info(adev->dev, "kfd not supported on this ASIC\n");
+		return;
+	}
+
+	adev->kfd = kgd2kfd->probe((struct kgd_dev *)adev,
+				   adev->pdev, kfd2kgd);
 }
 
 void amdgpu_amdkfd_device_init(struct amdgpu_device *adev)
@@ -184,7 +183,8 @@ int alloc_gtt_mem(struct kgd_dev *kgd, size_t size,
 		return -ENOMEM;
 
 	r = amdgpu_bo_create(adev, size, PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_GTT,
-			     AMDGPU_GEM_CREATE_CPU_GTT_USWC, NULL, NULL, &(*mem)->bo);
+			     AMDGPU_GEM_CREATE_CPU_GTT_USWC, NULL, NULL, 0,
+			     &(*mem)->bo);
 	if (r) {
 		dev_err(adev->dev,
 			"failed to allocate BO for amdkfd (%d)\n", r);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
index 73f83a1..8d689ab 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h
@@ -26,6 +26,7 @@
 #define AMDGPU_AMDKFD_H_INCLUDED
 
 #include <linux/types.h>
+#include <linux/mmu_context.h>
 #include <kgd_kfd_interface.h>
 
 struct amdgpu_device;
@@ -39,8 +40,6 @@ struct kgd_mem {
 int amdgpu_amdkfd_init(void);
 void amdgpu_amdkfd_fini(void);
 
-bool amdgpu_amdkfd_load_interface(struct amdgpu_device *adev);
-
 void amdgpu_amdkfd_suspend(struct amdgpu_device *adev);
 int amdgpu_amdkfd_resume(struct amdgpu_device *adev);
 void amdgpu_amdkfd_interrupt(struct amdgpu_device *adev,
@@ -62,4 +61,19 @@ uint64_t get_gpu_clock_counter(struct kgd_dev *kgd);
 
 uint32_t get_max_engine_clock_in_mhz(struct kgd_dev *kgd);
 
+#define read_user_wptr(mmptr, wptr, dst)				\
+	({								\
+		bool valid = false;					\
+		if ((mmptr) && (wptr)) {				\
+			if ((mmptr) == current->mm) {			\
+				valid = !get_user((dst), (wptr));	\
+			} else if (current->mm == NULL) {		\
+				use_mm(mmptr);				\
+				valid = !get_user((dst), (wptr));	\
+				unuse_mm(mmptr);			\
+			}						\
+		}							\
+		valid;							\
+	})
+
 #endif /* AMDGPU_AMDKFD_H_INCLUDED */
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
index 5254562..b9dbbf9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c
@@ -39,6 +39,12 @@
 #include "gmc/gmc_7_1_sh_mask.h"
 #include "cik_structs.h"
 
+enum hqd_dequeue_request_type {
+	NO_ACTION = 0,
+	DRAIN_PIPE,
+	RESET_WAVES
+};
+
 enum {
 	MAX_TRAPID = 8,		/* 3 bits in the bitfield. */
 	MAX_WATCH_ADDRESSES = 4
@@ -96,12 +102,15 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
 				uint32_t hpd_size, uint64_t hpd_gpu_addr);
 static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr);
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id);
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+				enum kfd_preempt_type reset_type,
 				unsigned int utimeout, uint32_t pipe_id,
 				uint32_t queue_id);
 static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
@@ -126,6 +135,33 @@ static uint16_t get_atc_vmid_pasid_mapping_pasid(struct kgd_dev *kgd,
 static void write_vmid_invalidate_request(struct kgd_dev *kgd, uint8_t vmid);
 
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type);
+static void set_scratch_backing_va(struct kgd_dev *kgd,
+					uint64_t va, uint32_t vmid);
+
+/* Because of REG_GET_FIELD() being used, we put this function in the
+ * asic specific file.
+ */
+static int get_tile_config(struct kgd_dev *kgd,
+		struct tile_config *config)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
+
+	config->gb_addr_config = adev->gfx.config.gb_addr_config;
+	config->num_banks = REG_GET_FIELD(adev->gfx.config.mc_arb_ramcfg,
+				MC_ARB_RAMCFG, NOOFBANK);
+	config->num_ranks = REG_GET_FIELD(adev->gfx.config.mc_arb_ramcfg,
+				MC_ARB_RAMCFG, NOOFRANKS);
+
+	config->tile_config_ptr = adev->gfx.config.tile_mode_array;
+	config->num_tile_configs =
+			ARRAY_SIZE(adev->gfx.config.tile_mode_array);
+	config->macro_tile_config_ptr =
+			adev->gfx.config.macrotile_mode_array;
+	config->num_macro_tile_configs =
+			ARRAY_SIZE(adev->gfx.config.macrotile_mode_array);
+
+	return 0;
+}
 
 static const struct kfd2kgd_calls kfd2kgd = {
 	.init_gtt_mem_allocation = alloc_gtt_mem,
@@ -150,7 +186,9 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_atc_vmid_pasid_mapping_pasid = get_atc_vmid_pasid_mapping_pasid,
 	.get_atc_vmid_pasid_mapping_valid = get_atc_vmid_pasid_mapping_valid,
 	.write_vmid_invalidate_request = write_vmid_invalidate_request,
-	.get_fw_version = get_fw_version
+	.get_fw_version = get_fw_version,
+	.set_scratch_backing_va = set_scratch_backing_va,
+	.get_tile_config = get_tile_config,
 };
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_7_get_functions(void)
@@ -186,7 +224,7 @@ static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-	uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+	uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
 	uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
 	lock_srbm(kgd, mec, pipe, queue_id, 0);
@@ -290,20 +328,38 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 }
 
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr)
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm)
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
-	uint32_t wptr_shadow, is_wptr_shadow_valid;
 	struct cik_mqd *m;
+	uint32_t *mqd_hqd;
+	uint32_t reg, wptr_val, data;
 
 	m = get_mqd(mqd);
 
-	is_wptr_shadow_valid = !get_user(wptr_shadow, wptr);
-	if (is_wptr_shadow_valid)
-		m->cp_hqd_pq_wptr = wptr_shadow;
-
 	acquire_queue(kgd, pipe_id, queue_id);
-	gfx_v7_0_mqd_commit(adev, m);
+
+	/* HQD registers extend from CP_MQD_BASE_ADDR to CP_MQD_CONTROL. */
+	mqd_hqd = &m->cp_mqd_base_addr_lo;
+
+	for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_MQD_CONTROL; reg++)
+		WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
+
+	/* Copy userspace write pointer value to register.
+	 * Activate doorbell logic to monitor subsequent changes.
+	 */
+	data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control,
+			     CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1);
+	WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, data);
+
+	if (read_user_wptr(mm, wptr, wptr_val))
+		WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << wptr_shift) & wptr_mask);
+
+	data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
+	WREG32(mmCP_HQD_ACTIVE, data);
+
 	release_queue(kgd);
 
 	return 0;
@@ -382,30 +438,99 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
 	return false;
 }
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+				enum kfd_preempt_type reset_type,
 				unsigned int utimeout, uint32_t pipe_id,
 				uint32_t queue_id)
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
 	uint32_t temp;
-	int timeout = utimeout;
+	enum hqd_dequeue_request_type type;
+	unsigned long flags, end_jiffies;
+	int retry;
 
 	acquire_queue(kgd, pipe_id, queue_id);
 	WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, 0);
 
-	WREG32(mmCP_HQD_DEQUEUE_REQUEST, reset_type);
+	switch (reset_type) {
+	case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
+		type = DRAIN_PIPE;
+		break;
+	case KFD_PREEMPT_TYPE_WAVEFRONT_RESET:
+		type = RESET_WAVES;
+		break;
+	default:
+		type = DRAIN_PIPE;
+		break;
+	}
 
+	/* Workaround: If IQ timer is active and the wait time is close to or
+	 * equal to 0, dequeueing is not safe. Wait until either the wait time
+	 * is larger or timer is cleared. Also, ensure that IQ_REQ_PEND is
+	 * cleared before continuing. Also, ensure wait times are set to at
+	 * least 0x3.
+	 */
+	local_irq_save(flags);
+	preempt_disable();
+	retry = 5000; /* wait for 500 usecs at maximum */
+	while (true) {
+		temp = RREG32(mmCP_HQD_IQ_TIMER);
+		if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, PROCESSING_IQ)) {
+			pr_debug("HW is processing IQ\n");
+			goto loop;
+		}
+		if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, ACTIVE)) {
+			if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, RETRY_TYPE)
+					== 3) /* SEM-rearm is safe */
+				break;
+			/* Wait time 3 is safe for CP, but our MMIO read/write
+			 * time is close to 1 microsecond, so check for 10 to
+			 * leave more buffer room
+			 */
+			if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, WAIT_TIME)
+					>= 10)
+				break;
+			pr_debug("IQ timer is active\n");
+		} else
+			break;
+loop:
+		if (!retry) {
+			pr_err("CP HQD IQ timer status time out\n");
+			break;
+		}
+		ndelay(100);
+		--retry;
+	}
+	retry = 1000;
+	while (true) {
+		temp = RREG32(mmCP_HQD_DEQUEUE_REQUEST);
+		if (!(temp & CP_HQD_DEQUEUE_REQUEST__IQ_REQ_PEND_MASK))
+			break;
+		pr_debug("Dequeue request is pending\n");
+
+		if (!retry) {
+			pr_err("CP HQD dequeue request time out\n");
+			break;
+		}
+		ndelay(100);
+		--retry;
+	}
+	local_irq_restore(flags);
+	preempt_enable();
+
+	WREG32(mmCP_HQD_DEQUEUE_REQUEST, type);
+
+	end_jiffies = (utimeout * HZ / 1000) + jiffies;
 	while (true) {
 		temp = RREG32(mmCP_HQD_ACTIVE);
-		if (temp & CP_HQD_ACTIVE__ACTIVE_MASK)
+		if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK))
 			break;
-		if (timeout <= 0) {
-			pr_err("kfd: cp queue preemption time out.\n");
+		if (time_after(jiffies, end_jiffies)) {
+			pr_err("cp queue preemption time out\n");
 			release_queue(kgd);
 			return -ETIME;
 		}
-		msleep(20);
-		timeout -= 20;
+		usleep_range(500, 1000);
 	}
 
 	release_queue(kgd);
@@ -556,6 +681,16 @@ static void write_vmid_invalidate_request(struct kgd_dev *kgd, uint8_t vmid)
 	WREG32(mmVM_INVALIDATE_REQUEST, 1 << vmid);
 }
 
+static void set_scratch_backing_va(struct kgd_dev *kgd,
+					uint64_t va, uint32_t vmid)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
+
+	lock_srbm(kgd, 0, 0, 0, vmid);
+	WREG32(mmSH_HIDDEN_PRIVATE_BASE_VMID, va);
+	unlock_srbm(kgd);
+}
+
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
@@ -566,42 +701,42 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
 	switch (type) {
 	case KGD_ENGINE_PFP:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.pfp_fw->data;
+						adev->gfx.pfp_fw->data;
 		break;
 
 	case KGD_ENGINE_ME:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.me_fw->data;
+						adev->gfx.me_fw->data;
 		break;
 
 	case KGD_ENGINE_CE:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.ce_fw->data;
+						adev->gfx.ce_fw->data;
 		break;
 
 	case KGD_ENGINE_MEC1:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.mec_fw->data;
+						adev->gfx.mec_fw->data;
 		break;
 
 	case KGD_ENGINE_MEC2:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.mec2_fw->data;
+						adev->gfx.mec2_fw->data;
 		break;
 
 	case KGD_ENGINE_RLC:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.rlc_fw->data;
+						adev->gfx.rlc_fw->data;
 		break;
 
 	case KGD_ENGINE_SDMA1:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->sdma.instance[0].fw->data;
+						adev->sdma.instance[0].fw->data;
 		break;
 
 	case KGD_ENGINE_SDMA2:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->sdma.instance[1].fw->data;
+						adev->sdma.instance[1].fw->data;
 		break;
 
 	default:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
index 133d066..fb6e5db 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v8.c
@@ -39,6 +39,12 @@
 #include "vi_structs.h"
 #include "vid.h"
 
+enum hqd_dequeue_request_type {
+	NO_ACTION = 0,
+	DRAIN_PIPE,
+	RESET_WAVES
+};
+
 struct cik_sdma_rlc_registers;
 
 /*
@@ -55,12 +61,15 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
 		uint32_t hpd_size, uint64_t hpd_gpu_addr);
 static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-		uint32_t queue_id, uint32_t __user *wptr);
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 		uint32_t pipe_id, uint32_t queue_id);
 static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+				enum kfd_preempt_type reset_type,
 				unsigned int utimeout, uint32_t pipe_id,
 				uint32_t queue_id);
 static int kgd_hqd_sdma_destroy(struct kgd_dev *kgd, void *mqd,
@@ -85,6 +94,33 @@ static uint16_t get_atc_vmid_pasid_mapping_pasid(struct kgd_dev *kgd,
 		uint8_t vmid);
 static void write_vmid_invalidate_request(struct kgd_dev *kgd, uint8_t vmid);
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type);
+static void set_scratch_backing_va(struct kgd_dev *kgd,
+					uint64_t va, uint32_t vmid);
+
+/* Because of REG_GET_FIELD() being used, we put this function in the
+ * asic specific file.
+ */
+static int get_tile_config(struct kgd_dev *kgd,
+		struct tile_config *config)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *)kgd;
+
+	config->gb_addr_config = adev->gfx.config.gb_addr_config;
+	config->num_banks = REG_GET_FIELD(adev->gfx.config.mc_arb_ramcfg,
+				MC_ARB_RAMCFG, NOOFBANK);
+	config->num_ranks = REG_GET_FIELD(adev->gfx.config.mc_arb_ramcfg,
+				MC_ARB_RAMCFG, NOOFRANKS);
+
+	config->tile_config_ptr = adev->gfx.config.tile_mode_array;
+	config->num_tile_configs =
+			ARRAY_SIZE(adev->gfx.config.tile_mode_array);
+	config->macro_tile_config_ptr =
+			adev->gfx.config.macrotile_mode_array;
+	config->num_macro_tile_configs =
+			ARRAY_SIZE(adev->gfx.config.macrotile_mode_array);
+
+	return 0;
+}
 
 static const struct kfd2kgd_calls kfd2kgd = {
 	.init_gtt_mem_allocation = alloc_gtt_mem,
@@ -111,12 +147,15 @@ static const struct kfd2kgd_calls kfd2kgd = {
 	.get_atc_vmid_pasid_mapping_valid =
 			get_atc_vmid_pasid_mapping_valid,
 	.write_vmid_invalidate_request = write_vmid_invalidate_request,
-	.get_fw_version = get_fw_version
+	.get_fw_version = get_fw_version,
+	.set_scratch_backing_va = set_scratch_backing_va,
+	.get_tile_config = get_tile_config,
 };
 
 struct kfd2kgd_calls *amdgpu_amdkfd_gfx_8_0_get_functions(void)
 {
 	return (struct kfd2kgd_calls *)&kfd2kgd;
+	return (struct kfd2kgd_calls *)&kfd2kgd;
 }
 
 static inline struct amdgpu_device *get_amdgpu_device(struct kgd_dev *kgd)
@@ -147,7 +186,7 @@ static void acquire_queue(struct kgd_dev *kgd, uint32_t pipe_id,
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
 
-	uint32_t mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+	uint32_t mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
 	uint32_t pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
 	lock_srbm(kgd, mec, pipe, queue_id, 0);
@@ -216,7 +255,7 @@ static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id)
 	uint32_t mec;
 	uint32_t pipe;
 
-	mec = (++pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+	mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
 	pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
 
 	lock_srbm(kgd, mec, pipe, 0, 0);
@@ -244,20 +283,67 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 }
 
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr)
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm)
 {
-	struct vi_mqd *m;
-	uint32_t shadow_wptr, valid_wptr;
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
+	struct vi_mqd *m;
+	uint32_t *mqd_hqd;
+	uint32_t reg, wptr_val, data;
 
 	m = get_mqd(mqd);
 
-	valid_wptr = copy_from_user(&shadow_wptr, wptr, sizeof(shadow_wptr));
-	if (valid_wptr == 0)
-		m->cp_hqd_pq_wptr = shadow_wptr;
-
 	acquire_queue(kgd, pipe_id, queue_id);
-	gfx_v8_0_mqd_commit(adev, mqd);
+
+	/* HIQ is set during driver init period with vmid set to 0*/
+	if (m->cp_hqd_vmid == 0) {
+		uint32_t value, mec, pipe;
+
+		mec = (pipe_id / adev->gfx.mec.num_pipe_per_mec) + 1;
+		pipe = (pipe_id % adev->gfx.mec.num_pipe_per_mec);
+
+		pr_debug("kfd: set HIQ, mec:%d, pipe:%d, queue:%d.\n",
+			mec, pipe, queue_id);
+		value = RREG32(mmRLC_CP_SCHEDULERS);
+		value = REG_SET_FIELD(value, RLC_CP_SCHEDULERS, scheduler1,
+			((mec << 5) | (pipe << 3) | queue_id | 0x80));
+		WREG32(mmRLC_CP_SCHEDULERS, value);
+	}
+
+	/* HQD registers extend from CP_MQD_BASE_ADDR to CP_HQD_EOP_WPTR_MEM. */
+	mqd_hqd = &m->cp_mqd_base_addr_lo;
+
+	for (reg = mmCP_MQD_BASE_ADDR; reg <= mmCP_HQD_EOP_CONTROL; reg++)
+		WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
+
+	/* Tonga errata: EOP RPTR/WPTR should be left unmodified.
+	 * This is safe since EOP RPTR==WPTR for any inactive HQD
+	 * on ASICs that do not support context-save.
+	 * EOP writes/reads can start anywhere in the ring.
+	 */
+	if (get_amdgpu_device(kgd)->asic_type != CHIP_TONGA) {
+		WREG32(mmCP_HQD_EOP_RPTR, m->cp_hqd_eop_rptr);
+		WREG32(mmCP_HQD_EOP_WPTR, m->cp_hqd_eop_wptr);
+		WREG32(mmCP_HQD_EOP_WPTR_MEM, m->cp_hqd_eop_wptr_mem);
+	}
+
+	for (reg = mmCP_HQD_EOP_EVENTS; reg <= mmCP_HQD_ERROR; reg++)
+		WREG32(reg, mqd_hqd[reg - mmCP_MQD_BASE_ADDR]);
+
+	/* Copy userspace write pointer value to register.
+	 * Activate doorbell logic to monitor subsequent changes.
+	 */
+	data = REG_SET_FIELD(m->cp_hqd_pq_doorbell_control,
+			     CP_HQD_PQ_DOORBELL_CONTROL, DOORBELL_EN, 1);
+	WREG32(mmCP_HQD_PQ_DOORBELL_CONTROL, data);
+
+	if (read_user_wptr(mm, wptr, wptr_val))
+		WREG32(mmCP_HQD_PQ_WPTR, (wptr_val << wptr_shift) & wptr_mask);
+
+	data = REG_SET_FIELD(m->cp_hqd_active, CP_HQD_ACTIVE, ACTIVE, 1);
+	WREG32(mmCP_HQD_ACTIVE, data);
+
 	release_queue(kgd);
 
 	return 0;
@@ -308,29 +394,102 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
 	return false;
 }
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd,
+				enum kfd_preempt_type reset_type,
 				unsigned int utimeout, uint32_t pipe_id,
 				uint32_t queue_id)
 {
 	struct amdgpu_device *adev = get_amdgpu_device(kgd);
 	uint32_t temp;
-	int timeout = utimeout;
+	enum hqd_dequeue_request_type type;
+	unsigned long flags, end_jiffies;
+	int retry;
+	struct vi_mqd *m = get_mqd(mqd);
 
 	acquire_queue(kgd, pipe_id, queue_id);
 
-	WREG32(mmCP_HQD_DEQUEUE_REQUEST, reset_type);
+	if (m->cp_hqd_vmid == 0)
+		WREG32_FIELD(RLC_CP_SCHEDULERS, scheduler1, 0);
 
+	switch (reset_type) {
+	case KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN:
+		type = DRAIN_PIPE;
+		break;
+	case KFD_PREEMPT_TYPE_WAVEFRONT_RESET:
+		type = RESET_WAVES;
+		break;
+	default:
+		type = DRAIN_PIPE;
+		break;
+	}
+
+	/* Workaround: If IQ timer is active and the wait time is close to or
+	 * equal to 0, dequeueing is not safe. Wait until either the wait time
+	 * is larger or timer is cleared. Also, ensure that IQ_REQ_PEND is
+	 * cleared before continuing. Also, ensure wait times are set to at
+	 * least 0x3.
+	 */
+	local_irq_save(flags);
+	preempt_disable();
+	retry = 5000; /* wait for 500 usecs at maximum */
+	while (true) {
+		temp = RREG32(mmCP_HQD_IQ_TIMER);
+		if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, PROCESSING_IQ)) {
+			pr_debug("HW is processing IQ\n");
+			goto loop;
+		}
+		if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, ACTIVE)) {
+			if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, RETRY_TYPE)
+					== 3) /* SEM-rearm is safe */
+				break;
+			/* Wait time 3 is safe for CP, but our MMIO read/write
+			 * time is close to 1 microsecond, so check for 10 to
+			 * leave more buffer room
+			 */
+			if (REG_GET_FIELD(temp, CP_HQD_IQ_TIMER, WAIT_TIME)
+					>= 10)
+				break;
+			pr_debug("IQ timer is active\n");
+		} else
+			break;
+loop:
+		if (!retry) {
+			pr_err("CP HQD IQ timer status time out\n");
+			break;
+		}
+		ndelay(100);
+		--retry;
+	}
+	retry = 1000;
+	while (true) {
+		temp = RREG32(mmCP_HQD_DEQUEUE_REQUEST);
+		if (!(temp & CP_HQD_DEQUEUE_REQUEST__IQ_REQ_PEND_MASK))
+			break;
+		pr_debug("Dequeue request is pending\n");
+
+		if (!retry) {
+			pr_err("CP HQD dequeue request time out\n");
+			break;
+		}
+		ndelay(100);
+		--retry;
+	}
+	local_irq_restore(flags);
+	preempt_enable();
+
+	WREG32(mmCP_HQD_DEQUEUE_REQUEST, type);
+
+	end_jiffies = (utimeout * HZ / 1000) + jiffies;
 	while (true) {
 		temp = RREG32(mmCP_HQD_ACTIVE);
-		if (temp & CP_HQD_ACTIVE__ACTIVE_MASK)
+		if (!(temp & CP_HQD_ACTIVE__ACTIVE_MASK))
 			break;
-		if (timeout <= 0) {
-			pr_err("kfd: cp queue preemption time out.\n");
+		if (time_after(jiffies, end_jiffies)) {
+			pr_err("cp queue preemption time out.\n");
 			release_queue(kgd);
 			return -ETIME;
 		}
-		msleep(20);
-		timeout -= 20;
+		usleep_range(500, 1000);
 	}
 
 	release_queue(kgd);
@@ -444,6 +603,16 @@ static uint32_t kgd_address_watch_get_offset(struct kgd_dev *kgd,
 	return 0;
 }
 
+static void set_scratch_backing_va(struct kgd_dev *kgd,
+					uint64_t va, uint32_t vmid)
+{
+	struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
+
+	lock_srbm(kgd, 0, 0, 0, vmid);
+	WREG32(mmSH_HIDDEN_PRIVATE_BASE_VMID, va);
+	unlock_srbm(kgd);
+}
+
 static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *) kgd;
@@ -454,42 +623,42 @@ static uint16_t get_fw_version(struct kgd_dev *kgd, enum kgd_engine_type type)
 	switch (type) {
 	case KGD_ENGINE_PFP:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.pfp_fw->data;
+						adev->gfx.pfp_fw->data;
 		break;
 
 	case KGD_ENGINE_ME:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.me_fw->data;
+						adev->gfx.me_fw->data;
 		break;
 
 	case KGD_ENGINE_CE:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.ce_fw->data;
+						adev->gfx.ce_fw->data;
 		break;
 
 	case KGD_ENGINE_MEC1:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.mec_fw->data;
+						adev->gfx.mec_fw->data;
 		break;
 
 	case KGD_ENGINE_MEC2:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.mec2_fw->data;
+						adev->gfx.mec2_fw->data;
 		break;
 
 	case KGD_ENGINE_RLC:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->gfx.rlc_fw->data;
+						adev->gfx.rlc_fw->data;
 		break;
 
 	case KGD_ENGINE_SDMA1:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->sdma.instance[0].fw->data;
+						adev->sdma.instance[0].fw->data;
 		break;
 
 	case KGD_ENGINE_SDMA2:
 		hdr = (const union amdgpu_firmware_header *)
-							adev->sdma.instance[1].fw->data;
+						adev->sdma.instance[1].fw->data;
 		break;
 
 	default:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
index 1e8e112..ce44358 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.c
@@ -1686,7 +1686,7 @@ void amdgpu_atombios_scratch_regs_lock(struct amdgpu_device *adev, bool lock)
 {
 	uint32_t bios_6_scratch;
 
-	bios_6_scratch = RREG32(mmBIOS_SCRATCH_6);
+	bios_6_scratch = RREG32(adev->bios_scratch_reg_offset + 6);
 
 	if (lock) {
 		bios_6_scratch |= ATOM_S6_CRITICAL_STATE;
@@ -1696,15 +1696,17 @@ void amdgpu_atombios_scratch_regs_lock(struct amdgpu_device *adev, bool lock)
 		bios_6_scratch |= ATOM_S6_ACC_MODE;
 	}
 
-	WREG32(mmBIOS_SCRATCH_6, bios_6_scratch);
+	WREG32(adev->bios_scratch_reg_offset + 6, bios_6_scratch);
 }
 
 void amdgpu_atombios_scratch_regs_init(struct amdgpu_device *adev)
 {
 	uint32_t bios_2_scratch, bios_6_scratch;
 
-	bios_2_scratch = RREG32(mmBIOS_SCRATCH_2);
-	bios_6_scratch = RREG32(mmBIOS_SCRATCH_6);
+	adev->bios_scratch_reg_offset = mmBIOS_SCRATCH_0;
+
+	bios_2_scratch = RREG32(adev->bios_scratch_reg_offset + 2);
+	bios_6_scratch = RREG32(adev->bios_scratch_reg_offset + 6);
 
 	/* let the bios control the backlight */
 	bios_2_scratch &= ~ATOM_S2_VRI_BRIGHT_ENABLE;
@@ -1715,8 +1717,8 @@ void amdgpu_atombios_scratch_regs_init(struct amdgpu_device *adev)
 	/* clear the vbios dpms state */
 	bios_2_scratch &= ~ATOM_S2_DEVICE_DPMS_STATE;
 
-	WREG32(mmBIOS_SCRATCH_2, bios_2_scratch);
-	WREG32(mmBIOS_SCRATCH_6, bios_6_scratch);
+	WREG32(adev->bios_scratch_reg_offset + 2, bios_2_scratch);
+	WREG32(adev->bios_scratch_reg_offset + 6, bios_6_scratch);
 }
 
 void amdgpu_atombios_scratch_regs_save(struct amdgpu_device *adev)
@@ -1724,7 +1726,7 @@ void amdgpu_atombios_scratch_regs_save(struct amdgpu_device *adev)
 	int i;
 
 	for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++)
-		adev->bios_scratch[i] = RREG32(mmBIOS_SCRATCH_0 + i);
+		adev->bios_scratch[i] = RREG32(adev->bios_scratch_reg_offset + i);
 }
 
 void amdgpu_atombios_scratch_regs_restore(struct amdgpu_device *adev)
@@ -1738,20 +1740,30 @@ void amdgpu_atombios_scratch_regs_restore(struct amdgpu_device *adev)
 	adev->bios_scratch[7] &= ~ATOM_S7_ASIC_INIT_COMPLETE_MASK;
 
 	for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++)
-		WREG32(mmBIOS_SCRATCH_0 + i, adev->bios_scratch[i]);
+		WREG32(adev->bios_scratch_reg_offset + i, adev->bios_scratch[i]);
 }
 
 void amdgpu_atombios_scratch_regs_engine_hung(struct amdgpu_device *adev,
 					      bool hung)
 {
-	u32 tmp = RREG32(mmBIOS_SCRATCH_3);
+	u32 tmp = RREG32(adev->bios_scratch_reg_offset + 3);
 
 	if (hung)
 		tmp |= ATOM_S3_ASIC_GUI_ENGINE_HUNG;
 	else
 		tmp &= ~ATOM_S3_ASIC_GUI_ENGINE_HUNG;
 
-	WREG32(mmBIOS_SCRATCH_3, tmp);
+	WREG32(adev->bios_scratch_reg_offset + 3, tmp);
+}
+
+bool amdgpu_atombios_scratch_need_asic_init(struct amdgpu_device *adev)
+{
+	u32 tmp = RREG32(adev->bios_scratch_reg_offset + 7);
+
+	if (tmp & ATOM_S7_ASIC_INIT_COMPLETE_MASK)
+		return false;
+	else
+		return true;
 }
 
 /* Atom needs data in little endian format
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h
index 38d0fe3..b0d5d1d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atombios.h
@@ -200,6 +200,7 @@ void amdgpu_atombios_scratch_regs_save(struct amdgpu_device *adev);
 void amdgpu_atombios_scratch_regs_restore(struct amdgpu_device *adev);
 void amdgpu_atombios_scratch_regs_engine_hung(struct amdgpu_device *adev,
 					      bool hung);
+bool amdgpu_atombios_scratch_need_asic_init(struct amdgpu_device *adev);
 
 void amdgpu_atombios_copy_swap(u8 *dst, u8 *src, u8 num_bytes, bool to_le);
 int amdgpu_atombios_get_max_vddc(struct amdgpu_device *adev, u8 voltage_type,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
index 4bdda56..f9ffe8e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.c
@@ -66,41 +66,6 @@ void amdgpu_atomfirmware_scratch_regs_init(struct amdgpu_device *adev)
 	}
 }
 
-void amdgpu_atomfirmware_scratch_regs_save(struct amdgpu_device *adev)
-{
-	int i;
-
-	for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++)
-		adev->bios_scratch[i] = RREG32(adev->bios_scratch_reg_offset + i);
-}
-
-void amdgpu_atomfirmware_scratch_regs_restore(struct amdgpu_device *adev)
-{
-	int i;
-
-	/*
-	 * VBIOS will check ASIC_INIT_COMPLETE bit to decide if
-	 * execute ASIC_Init posting via driver
-	 */
-	adev->bios_scratch[7] &= ~ATOM_S7_ASIC_INIT_COMPLETE_MASK;
-
-	for (i = 0; i < AMDGPU_BIOS_NUM_SCRATCH; i++)
-		WREG32(adev->bios_scratch_reg_offset + i, adev->bios_scratch[i]);
-}
-
-void amdgpu_atomfirmware_scratch_regs_engine_hung(struct amdgpu_device *adev,
-						  bool hung)
-{
-	u32 tmp = RREG32(adev->bios_scratch_reg_offset + 3);
-
-	if (hung)
-		tmp |= ATOM_S3_ASIC_GUI_ENGINE_HUNG;
-	else
-		tmp &= ~ATOM_S3_ASIC_GUI_ENGINE_HUNG;
-
-	WREG32(adev->bios_scratch_reg_offset + 3, tmp);
-}
-
 int amdgpu_atomfirmware_allocate_fb_scratch(struct amdgpu_device *adev)
 {
 	struct atom_context *ctx = adev->mode_info.atom_context;
@@ -130,3 +95,129 @@ int amdgpu_atomfirmware_allocate_fb_scratch(struct amdgpu_device *adev)
 	ctx->scratch_size_bytes = usage_bytes;
 	return 0;
 }
+
+union igp_info {
+	struct atom_integrated_system_info_v1_11 v11;
+};
+
+/*
+ * Return vram width from integrated system info table, if available,
+ * or 0 if not.
+ */
+int amdgpu_atomfirmware_get_vram_width(struct amdgpu_device *adev)
+{
+	struct amdgpu_mode_info *mode_info = &adev->mode_info;
+	int index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1,
+						integratedsysteminfo);
+	u16 data_offset, size;
+	union igp_info *igp_info;
+	u8 frev, crev;
+
+	/* get any igp specific overrides */
+	if (amdgpu_atom_parse_data_header(mode_info->atom_context, index, &size,
+				   &frev, &crev, &data_offset)) {
+		igp_info = (union igp_info *)
+			(mode_info->atom_context->bios + data_offset);
+		switch (crev) {
+		case 11:
+			return igp_info->v11.umachannelnumber * 64;
+		default:
+			return 0;
+		}
+	}
+
+	return 0;
+}
+
+union firmware_info {
+	struct atom_firmware_info_v3_1 v31;
+};
+
+union smu_info {
+	struct atom_smu_info_v3_1 v31;
+};
+
+union umc_info {
+	struct atom_umc_info_v3_1 v31;
+};
+
+int amdgpu_atomfirmware_get_clock_info(struct amdgpu_device *adev)
+{
+	struct amdgpu_mode_info *mode_info = &adev->mode_info;
+	struct amdgpu_pll *spll = &adev->clock.spll;
+	struct amdgpu_pll *mpll = &adev->clock.mpll;
+	uint8_t frev, crev;
+	uint16_t data_offset;
+	int ret = -EINVAL, index;
+
+	index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1,
+					    firmwareinfo);
+	if (amdgpu_atom_parse_data_header(mode_info->atom_context, index, NULL,
+				   &frev, &crev, &data_offset)) {
+		union firmware_info *firmware_info =
+			(union firmware_info *)(mode_info->atom_context->bios +
+						data_offset);
+
+		adev->clock.default_sclk =
+			le32_to_cpu(firmware_info->v31.bootup_sclk_in10khz);
+		adev->clock.default_mclk =
+			le32_to_cpu(firmware_info->v31.bootup_mclk_in10khz);
+
+		adev->pm.current_sclk = adev->clock.default_sclk;
+		adev->pm.current_mclk = adev->clock.default_mclk;
+
+		/* not technically a clock, but... */
+		adev->mode_info.firmware_flags =
+			le32_to_cpu(firmware_info->v31.firmware_capability);
+
+		ret = 0;
+	}
+
+	index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1,
+					    smu_info);
+	if (amdgpu_atom_parse_data_header(mode_info->atom_context, index, NULL,
+				   &frev, &crev, &data_offset)) {
+		union smu_info *smu_info =
+			(union smu_info *)(mode_info->atom_context->bios +
+					   data_offset);
+
+		/* system clock */
+		spll->reference_freq = le32_to_cpu(smu_info->v31.core_refclk_10khz);
+
+		spll->reference_div = 0;
+		spll->min_post_div = 1;
+		spll->max_post_div = 1;
+		spll->min_ref_div = 2;
+		spll->max_ref_div = 0xff;
+		spll->min_feedback_div = 4;
+		spll->max_feedback_div = 0xff;
+		spll->best_vco = 0;
+
+		ret = 0;
+	}
+
+	index = get_index_into_master_table(atom_master_list_of_data_tables_v2_1,
+					    umc_info);
+	if (amdgpu_atom_parse_data_header(mode_info->atom_context, index, NULL,
+				   &frev, &crev, &data_offset)) {
+		union umc_info *umc_info =
+			(union umc_info *)(mode_info->atom_context->bios +
+					   data_offset);
+
+		/* memory clock */
+		mpll->reference_freq = le32_to_cpu(umc_info->v31.mem_refclk_10khz);
+
+		mpll->reference_div = 0;
+		mpll->min_post_div = 1;
+		mpll->max_post_div = 1;
+		mpll->min_ref_div = 2;
+		mpll->max_ref_div = 0xff;
+		mpll->min_feedback_div = 4;
+		mpll->max_feedback_div = 0xff;
+		mpll->best_vco = 0;
+
+		ret = 0;
+	}
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h
index a2c3ebe2..288b97e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_atomfirmware.h
@@ -26,10 +26,8 @@
 
 bool amdgpu_atomfirmware_gpu_supports_virtualization(struct amdgpu_device *adev);
 void amdgpu_atomfirmware_scratch_regs_init(struct amdgpu_device *adev);
-void amdgpu_atomfirmware_scratch_regs_save(struct amdgpu_device *adev);
-void amdgpu_atomfirmware_scratch_regs_restore(struct amdgpu_device *adev);
-void amdgpu_atomfirmware_scratch_regs_engine_hung(struct amdgpu_device *adev,
-						  bool hung);
 int amdgpu_atomfirmware_allocate_fb_scratch(struct amdgpu_device *adev);
+int amdgpu_atomfirmware_get_vram_width(struct amdgpu_device *adev);
+int amdgpu_atomfirmware_get_clock_info(struct amdgpu_device *adev);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
index 1beae5b..63ec1e1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_benchmark.c
@@ -40,7 +40,7 @@ static int amdgpu_benchmark_do_move(struct amdgpu_device *adev, unsigned size,
 	for (i = 0; i < n; i++) {
 		struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
 		r = amdgpu_copy_buffer(ring, saddr, daddr, size, NULL, &fence,
-				       false);
+				       false, false);
 		if (r)
 			goto exit_do_move;
 		r = dma_fence_wait(fence, false);
@@ -81,7 +81,7 @@ static void amdgpu_benchmark_move(struct amdgpu_device *adev, unsigned size,
 
 	n = AMDGPU_BENCHMARK_ITERATIONS;
 	r = amdgpu_bo_create(adev, size, PAGE_SIZE, true, sdomain, 0, NULL,
-			     NULL, &sobj);
+			     NULL, 0, &sobj);
 	if (r) {
 		goto out_cleanup;
 	}
@@ -94,7 +94,7 @@ static void amdgpu_benchmark_move(struct amdgpu_device *adev, unsigned size,
 		goto out_cleanup;
 	}
 	r = amdgpu_bo_create(adev, size, PAGE_SIZE, true, ddomain, 0, NULL,
-			     NULL, &dobj);
+			     NULL, 0, &dobj);
 	if (r) {
 		goto out_cleanup;
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
index 365e735..c21adf6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c
@@ -86,19 +86,6 @@ static bool check_atom_bios(uint8_t *bios, size_t size)
 	return false;
 }
 
-static bool is_atom_fw(uint8_t *bios)
-{
-	uint16_t bios_header_start = bios[0x48] | (bios[0x49] << 8);
-	uint8_t frev = bios[bios_header_start + 2];
-	uint8_t crev = bios[bios_header_start + 3];
-
-	if ((frev < 3) ||
-	    ((frev == 3) && (crev < 3)))
-		return false;
-
-	return true;
-}
-
 /* If you boot an IGP board with a discrete card as the primary,
  * the IGP rom is not accessible via the rom bar as the IGP rom is
  * part of the system bios.  On boot, the system bios puts a
@@ -117,7 +104,7 @@ static bool igp_read_bios_from_vram(struct amdgpu_device *adev)
 
 	adev->bios = NULL;
 	vram_base = pci_resource_start(adev->pdev, 0);
-	bios = ioremap(vram_base, size);
+	bios = ioremap_wc(vram_base, size);
 	if (!bios) {
 		return false;
 	}
@@ -455,6 +442,6 @@ bool amdgpu_get_bios(struct amdgpu_device *adev)
 	return false;
 
 success:
-	adev->is_atom_fw = is_atom_fw(adev->bios);
+	adev->is_atom_fw = (adev->asic_type >= CHIP_VEGA10) ? true : false;
 	return true;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
index 5e771bc11b..59089e0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c
@@ -83,7 +83,7 @@ static int amdgpu_bo_list_create(struct amdgpu_device *adev,
 	r = idr_alloc(&fpriv->bo_list_handles, list, 1, 0, GFP_KERNEL);
 	mutex_unlock(&fpriv->bo_list_lock);
 	if (r < 0) {
-		kfree(list);
+		amdgpu_bo_list_free(list);
 		return r;
 	}
 	*id = r;
@@ -136,7 +136,7 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev,
 		}
 
 		bo = amdgpu_bo_ref(gem_to_amdgpu_bo(gobj));
-		drm_gem_object_unreference_unlocked(gobj);
+		drm_gem_object_put_unlocked(gobj);
 
 		usermm = amdgpu_ttm_tt_get_usermm(bo->tbo.ttm);
 		if (usermm) {
@@ -156,11 +156,11 @@ static int amdgpu_bo_list_set(struct amdgpu_device *adev,
 		entry->tv.bo = &entry->robj->tbo;
 		entry->tv.shared = !entry->robj->prime_shared_count;
 
-		if (entry->robj->prefered_domains == AMDGPU_GEM_DOMAIN_GDS)
+		if (entry->robj->preferred_domains == AMDGPU_GEM_DOMAIN_GDS)
 			gds_obj = entry->robj;
-		if (entry->robj->prefered_domains == AMDGPU_GEM_DOMAIN_GWS)
+		if (entry->robj->preferred_domains == AMDGPU_GEM_DOMAIN_GWS)
 			gws_obj = entry->robj;
-		if (entry->robj->prefered_domains == AMDGPU_GEM_DOMAIN_OA)
+		if (entry->robj->preferred_domains == AMDGPU_GEM_DOMAIN_OA)
 			oa_obj = entry->robj;
 
 		total_size += amdgpu_bo_size(entry->robj);
@@ -270,7 +270,7 @@ int amdgpu_bo_list_ioctl(struct drm_device *dev, void *data,
 	struct amdgpu_fpriv *fpriv = filp->driver_priv;
 	union drm_amdgpu_bo_list *args = data;
 	uint32_t handle = args->in.list_handle;
-	const void __user *uptr = (const void*)(uintptr_t)args->in.bo_info_ptr;
+	const void __user *uptr = u64_to_user_ptr(args->in.bo_info_ptr);
 
 	struct drm_amdgpu_bo_list_entry *info;
 	struct amdgpu_bo_list *list;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
index c0a8062..fd435a9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cgs.c
@@ -124,7 +124,7 @@ static int amdgpu_cgs_alloc_gpu_mem(struct cgs_device *cgs_device,
 	ret = amdgpu_bo_create_restricted(adev, size, PAGE_SIZE,
 					  true, domain, flags,
 					  NULL, &placement, NULL,
-					  &obj);
+					  0, &obj);
 	if (ret) {
 		DRM_ERROR("(%d) bo create failed\n", ret);
 		return ret;
@@ -166,7 +166,7 @@ static int amdgpu_cgs_gmap_gpu_mem(struct cgs_device *cgs_device, cgs_handle_t h
 	r = amdgpu_bo_reserve(obj, true);
 	if (unlikely(r != 0))
 		return r;
-	r = amdgpu_bo_pin_restricted(obj, obj->prefered_domains,
+	r = amdgpu_bo_pin_restricted(obj, obj->preferred_domains,
 				     min_offset, max_offset, mcaddr);
 	amdgpu_bo_unreserve(obj);
 	return r;
@@ -240,6 +240,8 @@ static uint32_t amdgpu_cgs_read_ind_register(struct cgs_device *cgs_device,
 		return RREG32_DIDT(index);
 	case CGS_IND_REG_GC_CAC:
 		return RREG32_GC_CAC(index);
+	case CGS_IND_REG_SE_CAC:
+		return RREG32_SE_CAC(index);
 	case CGS_IND_REG__AUDIO_ENDPT:
 		DRM_ERROR("audio endpt register access not implemented.\n");
 		return 0;
@@ -266,6 +268,8 @@ static void amdgpu_cgs_write_ind_register(struct cgs_device *cgs_device,
 		return WREG32_DIDT(index, value);
 	case CGS_IND_REG_GC_CAC:
 		return WREG32_GC_CAC(index, value);
+	case CGS_IND_REG_SE_CAC:
+		return WREG32_SE_CAC(index, value);
 	case CGS_IND_REG__AUDIO_ENDPT:
 		DRM_ERROR("audio endpt register access not implemented.\n");
 		return;
@@ -610,6 +614,17 @@ static int amdgpu_cgs_enter_safe_mode(struct cgs_device *cgs_device,
 	return 0;
 }
 
+static void amdgpu_cgs_lock_grbm_idx(struct cgs_device *cgs_device,
+					bool lock)
+{
+	CGS_FUNC_ADEV;
+
+	if (lock)
+		mutex_lock(&adev->grbm_idx_mutex);
+	else
+		mutex_unlock(&adev->grbm_idx_mutex);
+}
+
 static int amdgpu_cgs_get_firmware_info(struct cgs_device *cgs_device,
 					enum cgs_ucode_id type,
 					struct cgs_firmware_info *info)
@@ -644,7 +659,7 @@ static int amdgpu_cgs_get_firmware_info(struct cgs_device *cgs_device,
 		info->version = (uint16_t)le32_to_cpu(header->header.ucode_version);
 
 		if (CGS_UCODE_ID_CP_MEC == type)
-			info->image_size = (header->jt_offset) << 2;
+			info->image_size = le32_to_cpu(header->jt_offset) << 2;
 
 		info->fw_version = amdgpu_get_firmware_version(cgs_device, type);
 		info->feature_version = (uint16_t)le32_to_cpu(header->ucode_feature_version);
@@ -719,7 +734,13 @@ static int amdgpu_cgs_get_firmware_info(struct cgs_device *cgs_device,
 				strcpy(fw_name, "amdgpu/polaris12_smc.bin");
 				break;
 			case CHIP_VEGA10:
-				strcpy(fw_name, "amdgpu/vega10_smc.bin");
+				if ((adev->pdev->device == 0x687f) &&
+					((adev->pdev->revision == 0xc0) ||
+					(adev->pdev->revision == 0xc1) ||
+					(adev->pdev->revision == 0xc3)))
+					strcpy(fw_name, "amdgpu/vega10_acg_smc.bin");
+				else
+					strcpy(fw_name, "amdgpu/vega10_smc.bin");
 				break;
 			default:
 				DRM_ERROR("SMC firmware not supported\n");
@@ -1117,6 +1138,7 @@ static const struct cgs_ops amdgpu_cgs_ops = {
 	.query_system_info = amdgpu_cgs_query_system_info,
 	.is_virtualization_enabled = amdgpu_cgs_is_virtualization_enabled,
 	.enter_safe_mode = amdgpu_cgs_enter_safe_mode,
+	.lock_grbm_idx = amdgpu_cgs_lock_grbm_idx,
 };
 
 static const struct cgs_os_ops amdgpu_cgs_os_ops = {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
index 5599c01..269b835 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c
@@ -54,7 +54,7 @@ static int amdgpu_cs_user_fence_chunk(struct amdgpu_cs_parser *p,
 
 	*offset = data->offset;
 
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 
 	if (amdgpu_ttm_tt_get_usermm(p->uf_entry.robj->tbo.ttm)) {
 		amdgpu_bo_unref(&p->uf_entry.robj);
@@ -90,7 +90,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data)
 	}
 
 	/* get chunks */
-	chunk_array_user = (uint64_t __user *)(uintptr_t)(cs->in.chunks);
+	chunk_array_user = u64_to_user_ptr(cs->in.chunks);
 	if (copy_from_user(chunk_array, chunk_array_user,
 			   sizeof(uint64_t)*cs->in.num_chunks)) {
 		ret = -EFAULT;
@@ -110,7 +110,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data)
 		struct drm_amdgpu_cs_chunk user_chunk;
 		uint32_t __user *cdata;
 
-		chunk_ptr = (void __user *)(uintptr_t)chunk_array[i];
+		chunk_ptr = u64_to_user_ptr(chunk_array[i]);
 		if (copy_from_user(&user_chunk, chunk_ptr,
 				       sizeof(struct drm_amdgpu_cs_chunk))) {
 			ret = -EFAULT;
@@ -121,7 +121,7 @@ static int amdgpu_cs_parser_init(struct amdgpu_cs_parser *p, void *data)
 		p->chunks[i].length_dw = user_chunk.length_dw;
 
 		size = p->chunks[i].length_dw;
-		cdata = (void __user *)(uintptr_t)user_chunk.chunk_data;
+		cdata = u64_to_user_ptr(user_chunk.chunk_data);
 
 		p->chunks[i].kdata = kvmalloc_array(size, sizeof(uint32_t), GFP_KERNEL);
 		if (p->chunks[i].kdata == NULL) {
@@ -223,10 +223,11 @@ static s64 bytes_to_us(struct amdgpu_device *adev, u64 bytes)
  * ticks. The accumulated microseconds (us) are converted to bytes and
  * returned.
  */
-static u64 amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev)
+static void amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev,
+					      u64 *max_bytes,
+					      u64 *max_vis_bytes)
 {
 	s64 time_us, increment_us;
-	u64 max_bytes;
 	u64 free_vram, total_vram, used_vram;
 
 	/* Allow a maximum of 200 accumulated ms. This is basically per-IB
@@ -238,11 +239,14 @@ static u64 amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev)
 	 */
 	const s64 us_upper_bound = 200000;
 
-	if (!adev->mm_stats.log2_max_MBps)
-		return 0;
+	if (!adev->mm_stats.log2_max_MBps) {
+		*max_bytes = 0;
+		*max_vis_bytes = 0;
+		return;
+	}
 
 	total_vram = adev->mc.real_vram_size - adev->vram_pin_size;
-	used_vram = atomic64_read(&adev->vram_usage);
+	used_vram = amdgpu_vram_mgr_usage(&adev->mman.bdev.man[TTM_PL_VRAM]);
 	free_vram = used_vram >= total_vram ? 0 : total_vram - used_vram;
 
 	spin_lock(&adev->mm_stats.lock);
@@ -280,23 +284,46 @@ static u64 amdgpu_cs_get_threshold_for_moves(struct amdgpu_device *adev)
 		adev->mm_stats.accum_us = max(min_us, adev->mm_stats.accum_us);
 	}
 
-	/* This returns 0 if the driver is in debt to disallow (optional)
+	/* This is set to 0 if the driver is in debt to disallow (optional)
 	 * buffer moves.
 	 */
-	max_bytes = us_to_bytes(adev, adev->mm_stats.accum_us);
+	*max_bytes = us_to_bytes(adev, adev->mm_stats.accum_us);
+
+	/* Do the same for visible VRAM if half of it is free */
+	if (adev->mc.visible_vram_size < adev->mc.real_vram_size) {
+		u64 total_vis_vram = adev->mc.visible_vram_size;
+		u64 used_vis_vram =
+			amdgpu_vram_mgr_vis_usage(&adev->mman.bdev.man[TTM_PL_VRAM]);
+
+		if (used_vis_vram < total_vis_vram) {
+			u64 free_vis_vram = total_vis_vram - used_vis_vram;
+			adev->mm_stats.accum_us_vis = min(adev->mm_stats.accum_us_vis +
+							  increment_us, us_upper_bound);
+
+			if (free_vis_vram >= total_vis_vram / 2)
+				adev->mm_stats.accum_us_vis =
+					max(bytes_to_us(adev, free_vis_vram / 2),
+					    adev->mm_stats.accum_us_vis);
+		}
+
+		*max_vis_bytes = us_to_bytes(adev, adev->mm_stats.accum_us_vis);
+	} else {
+		*max_vis_bytes = 0;
+	}
 
 	spin_unlock(&adev->mm_stats.lock);
-	return max_bytes;
 }
 
 /* Report how many bytes have really been moved for the last command
  * submission. This can result in a debt that can stop buffer migrations
  * temporarily.
  */
-void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes)
+void amdgpu_cs_report_moved_bytes(struct amdgpu_device *adev, u64 num_bytes,
+				  u64 num_vis_bytes)
 {
 	spin_lock(&adev->mm_stats.lock);
 	adev->mm_stats.accum_us -= bytes_to_us(adev, num_bytes);
+	adev->mm_stats.accum_us_vis -= bytes_to_us(adev, num_vis_bytes);
 	spin_unlock(&adev->mm_stats.lock);
 }
 
@@ -304,7 +331,7 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 				 struct amdgpu_bo *bo)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
-	u64 initial_bytes_moved;
+	u64 initial_bytes_moved, bytes_moved;
 	uint32_t domain;
 	int r;
 
@@ -314,17 +341,35 @@ static int amdgpu_cs_bo_validate(struct amdgpu_cs_parser *p,
 	/* Don't move this buffer if we have depleted our allowance
 	 * to move it. Don't move anything if the threshold is zero.
 	 */
-	if (p->bytes_moved < p->bytes_moved_threshold)
-		domain = bo->prefered_domains;
-	else
+	if (p->bytes_moved < p->bytes_moved_threshold) {
+		if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
+		    (bo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) {
+			/* And don't move a CPU_ACCESS_REQUIRED BO to limited
+			 * visible VRAM if we've depleted our allowance to do
+			 * that.
+			 */
+			if (p->bytes_moved_vis < p->bytes_moved_vis_threshold)
+				domain = bo->preferred_domains;
+			else
+				domain = bo->allowed_domains;
+		} else {
+			domain = bo->preferred_domains;
+		}
+	} else {
 		domain = bo->allowed_domains;
+	}
 
 retry:
 	amdgpu_ttm_placement_from_domain(bo, domain);
 	initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
 	r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
-	p->bytes_moved += atomic64_read(&adev->num_bytes_moved) -
-		initial_bytes_moved;
+	bytes_moved = atomic64_read(&adev->num_bytes_moved) -
+		      initial_bytes_moved;
+	p->bytes_moved += bytes_moved;
+	if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
+	    bo->tbo.mem.mem_type == TTM_PL_VRAM &&
+	    bo->tbo.mem.start < adev->mc.visible_vram_size >> PAGE_SHIFT)
+		p->bytes_moved_vis += bytes_moved;
 
 	if (unlikely(r == -ENOMEM) && domain != bo->allowed_domains) {
 		domain = bo->allowed_domains;
@@ -350,7 +395,8 @@ static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
 		struct amdgpu_bo_list_entry *candidate = p->evictable;
 		struct amdgpu_bo *bo = candidate->robj;
 		struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
-		u64 initial_bytes_moved;
+		u64 initial_bytes_moved, bytes_moved;
+		bool update_bytes_moved_vis;
 		uint32_t other;
 
 		/* If we reached our current BO we can forget it */
@@ -370,10 +416,17 @@ static bool amdgpu_cs_try_evict(struct amdgpu_cs_parser *p,
 
 		/* Good we can try to move this BO somewhere else */
 		amdgpu_ttm_placement_from_domain(bo, other);
+		update_bytes_moved_vis =
+			adev->mc.visible_vram_size < adev->mc.real_vram_size &&
+			bo->tbo.mem.mem_type == TTM_PL_VRAM &&
+			bo->tbo.mem.start < adev->mc.visible_vram_size >> PAGE_SHIFT;
 		initial_bytes_moved = atomic64_read(&adev->num_bytes_moved);
 		r = ttm_bo_validate(&bo->tbo, &bo->placement, true, false);
-		p->bytes_moved += atomic64_read(&adev->num_bytes_moved) -
+		bytes_moved = atomic64_read(&adev->num_bytes_moved) -
 			initial_bytes_moved;
+		p->bytes_moved += bytes_moved;
+		if (update_bytes_moved_vis)
+			p->bytes_moved_vis += bytes_moved;
 
 		if (unlikely(r))
 			break;
@@ -554,8 +607,10 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 		list_splice(&need_pages, &p->validated);
 	}
 
-	p->bytes_moved_threshold = amdgpu_cs_get_threshold_for_moves(p->adev);
+	amdgpu_cs_get_threshold_for_moves(p->adev, &p->bytes_moved_threshold,
+					  &p->bytes_moved_vis_threshold);
 	p->bytes_moved = 0;
+	p->bytes_moved_vis = 0;
 	p->evictable = list_last_entry(&p->validated,
 				       struct amdgpu_bo_list_entry,
 				       tv.head);
@@ -579,8 +634,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 		goto error_validate;
 	}
 
-	amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved);
-
+	amdgpu_cs_report_moved_bytes(p->adev, p->bytes_moved,
+				     p->bytes_moved_vis);
 	fpriv->vm.last_eviction_counter =
 		atomic64_read(&p->adev->num_evictions);
 
@@ -619,10 +674,8 @@ static int amdgpu_cs_parser_bos(struct amdgpu_cs_parser *p,
 	}
 
 error_validate:
-	if (r) {
-		amdgpu_vm_move_pt_bos_in_lru(p->adev, &fpriv->vm);
+	if (r)
 		ttm_eu_backoff_reservation(&p->ticket, &p->validated);
-	}
 
 error_free_pages:
 
@@ -670,21 +723,18 @@ static int amdgpu_cs_sync_rings(struct amdgpu_cs_parser *p)
  * If error is set than unvalidate buffer, otherwise just free memory
  * used by parsing context.
  **/
-static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error, bool backoff)
+static void amdgpu_cs_parser_fini(struct amdgpu_cs_parser *parser, int error,
+				  bool backoff)
 {
-	struct amdgpu_fpriv *fpriv = parser->filp->driver_priv;
 	unsigned i;
 
-	if (!error) {
-		amdgpu_vm_move_pt_bos_in_lru(parser->adev, &fpriv->vm);
-
+	if (!error)
 		ttm_eu_fence_buffer_objects(&parser->ticket,
 					    &parser->validated,
 					    parser->fence);
-	} else if (backoff) {
+	else if (backoff)
 		ttm_eu_backoff_reservation(&parser->ticket,
 					   &parser->validated);
-	}
 
 	for (i = 0; i < parser->num_post_dep_syncobjs; i++)
 		drm_syncobj_put(parser->post_dep_syncobjs[i]);
@@ -737,7 +787,8 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
 
 	if (amdgpu_sriov_vf(adev)) {
 		struct dma_fence *f;
-		bo_va = vm->csa_bo_va;
+
+		bo_va = fpriv->csa_va;
 		BUG_ON(!bo_va);
 		r = amdgpu_vm_bo_update(adev, bo_va, false);
 		if (r)
@@ -774,7 +825,7 @@ static int amdgpu_bo_vm_update_pte(struct amdgpu_cs_parser *p)
 
 	}
 
-	r = amdgpu_vm_clear_invalids(adev, vm, &p->job->sync);
+	r = amdgpu_vm_clear_moved(adev, vm, &p->job->sync);
 
 	if (amdgpu_vm_debug && p->bo_list) {
 		/* Invalidate all BOs to test for userspace bugs */
@@ -984,7 +1035,7 @@ static int amdgpu_syncobj_lookup_and_add_to_sync(struct amdgpu_cs_parser *p,
 {
 	int r;
 	struct dma_fence *fence;
-	r = drm_syncobj_fence_get(p->filp, handle, &fence);
+	r = drm_syncobj_find_fence(p->filp, handle, &fence);
 	if (r)
 		return r;
 
@@ -1383,7 +1434,7 @@ int amdgpu_cs_wait_fences_ioctl(struct drm_device *dev, void *data,
 	if (fences == NULL)
 		return -ENOMEM;
 
-	fences_user = (void __user *)(uintptr_t)(wait->in.fences);
+	fences_user = u64_to_user_ptr(wait->in.fences);
 	if (copy_from_user(fences, fences_user,
 		sizeof(struct drm_amdgpu_fence) * fence_count)) {
 		r = -EFAULT;
@@ -1436,7 +1487,7 @@ amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
 			    addr > mapping->last)
 				continue;
 
-			*bo = lobj->bo_va->bo;
+			*bo = lobj->bo_va->base.bo;
 			return mapping;
 		}
 
@@ -1445,7 +1496,7 @@ amdgpu_cs_find_mapping(struct amdgpu_cs_parser *parser,
 			    addr > mapping->last)
 				continue;
 
-			*bo = lobj->bo_va->bo;
+			*bo = lobj->bo_va->base.bo;
 			return mapping;
 		}
 	}
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 4a8fc15..1a459ac 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -53,6 +53,9 @@
 #include "bif/bif_4_1_d.h"
 #include <linux/pci.h>
 #include <linux/firmware.h>
+#include "amdgpu_vf_error.h"
+
+#include "amdgpu_amdkfd.h"
 
 MODULE_FIRMWARE("amdgpu/vega10_gpu_info.bin");
 MODULE_FIRMWARE("amdgpu/raven_gpu_info.bin");
@@ -128,6 +131,10 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
 {
 	trace_amdgpu_mm_wreg(adev->pdev->device, reg, v);
 
+	if (adev->asic_type >= CHIP_VEGA10 && reg == 0) {
+		adev->last_mm_index = v;
+	}
+
 	if (!(acc_flags & AMDGPU_REGS_NO_KIQ) && amdgpu_sriov_runtime(adev)) {
 		BUG_ON(in_interrupt());
 		return amdgpu_virt_kiq_wreg(adev, reg, v);
@@ -143,6 +150,10 @@ void amdgpu_mm_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v,
 		writel(v, ((void __iomem *)adev->rmmio) + (mmMM_DATA * 4));
 		spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
 	}
+
+	if (adev->asic_type >= CHIP_VEGA10 && reg == 1 && adev->last_mm_index == 0x5702C) {
+		udelay(500);
+	}
 }
 
 u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
@@ -157,6 +168,9 @@ u32 amdgpu_io_rreg(struct amdgpu_device *adev, u32 reg)
 
 void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
 {
+	if (adev->asic_type >= CHIP_VEGA10 && reg == 0) {
+		adev->last_mm_index = v;
+	}
 
 	if ((reg * 4) < adev->rio_mem_size)
 		iowrite32(v, adev->rio_mem + (reg * 4));
@@ -164,6 +178,10 @@ void amdgpu_io_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
 		iowrite32((reg * 4), adev->rio_mem + (mmMM_INDEX * 4));
 		iowrite32(v, adev->rio_mem + (mmMM_DATA * 4));
 	}
+
+	if (adev->asic_type >= CHIP_VEGA10 && reg == 1 && adev->last_mm_index == 0x5702C) {
+		udelay(500);
+	}
 }
 
 /**
@@ -318,51 +336,16 @@ static void amdgpu_block_invalid_wreg(struct amdgpu_device *adev,
 
 static int amdgpu_vram_scratch_init(struct amdgpu_device *adev)
 {
-	int r;
-
-	if (adev->vram_scratch.robj == NULL) {
-		r = amdgpu_bo_create(adev, AMDGPU_GPU_PAGE_SIZE,
-				     PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_VRAM,
-				     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-				     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-				     NULL, NULL, &adev->vram_scratch.robj);
-		if (r) {
-			return r;
-		}
-	}
-
-	r = amdgpu_bo_reserve(adev->vram_scratch.robj, false);
-	if (unlikely(r != 0))
-		return r;
-	r = amdgpu_bo_pin(adev->vram_scratch.robj,
-			  AMDGPU_GEM_DOMAIN_VRAM, &adev->vram_scratch.gpu_addr);
-	if (r) {
-		amdgpu_bo_unreserve(adev->vram_scratch.robj);
-		return r;
-	}
-	r = amdgpu_bo_kmap(adev->vram_scratch.robj,
-				(void **)&adev->vram_scratch.ptr);
-	if (r)
-		amdgpu_bo_unpin(adev->vram_scratch.robj);
-	amdgpu_bo_unreserve(adev->vram_scratch.robj);
-
-	return r;
+	return amdgpu_bo_create_kernel(adev, AMDGPU_GPU_PAGE_SIZE,
+				       PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
+				       &adev->vram_scratch.robj,
+				       &adev->vram_scratch.gpu_addr,
+				       (void **)&adev->vram_scratch.ptr);
 }
 
 static void amdgpu_vram_scratch_fini(struct amdgpu_device *adev)
 {
-	int r;
-
-	if (adev->vram_scratch.robj == NULL) {
-		return;
-	}
-	r = amdgpu_bo_reserve(adev->vram_scratch.robj, true);
-	if (likely(r == 0)) {
-		amdgpu_bo_kunmap(adev->vram_scratch.robj);
-		amdgpu_bo_unpin(adev->vram_scratch.robj);
-		amdgpu_bo_unreserve(adev->vram_scratch.robj);
-	}
-	amdgpu_bo_unref(&adev->vram_scratch.robj);
+	amdgpu_bo_free_kernel(&adev->vram_scratch.robj, NULL, NULL);
 }
 
 /**
@@ -521,7 +504,8 @@ static int amdgpu_wb_init(struct amdgpu_device *adev)
 	int r;
 
 	if (adev->wb.wb_obj == NULL) {
-		r = amdgpu_bo_create_kernel(adev, AMDGPU_MAX_WB * sizeof(uint32_t),
+		/* AMDGPU_MAX_WB * sizeof(uint32_t) * 8 = AMDGPU_MAX_WB 256bit slots */
+		r = amdgpu_bo_create_kernel(adev, AMDGPU_MAX_WB * sizeof(uint32_t) * 8,
 					    PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
 					    &adev->wb.wb_obj, &adev->wb.gpu_addr,
 					    (void **)&adev->wb.wb);
@@ -552,32 +536,10 @@ static int amdgpu_wb_init(struct amdgpu_device *adev)
 int amdgpu_wb_get(struct amdgpu_device *adev, u32 *wb)
 {
 	unsigned long offset = find_first_zero_bit(adev->wb.used, adev->wb.num_wb);
+
 	if (offset < adev->wb.num_wb) {
 		__set_bit(offset, adev->wb.used);
-		*wb = offset;
-		return 0;
-	} else {
-		return -EINVAL;
-	}
-}
-
-/**
- * amdgpu_wb_get_64bit - Allocate a wb entry
- *
- * @adev: amdgpu_device pointer
- * @wb: wb index
- *
- * Allocate a wb slot for use by the driver (all asics).
- * Returns 0 on success or -EINVAL on failure.
- */
-int amdgpu_wb_get_64bit(struct amdgpu_device *adev, u32 *wb)
-{
-	unsigned long offset = bitmap_find_next_zero_area_off(adev->wb.used,
-				adev->wb.num_wb, 0, 2, 7, 0);
-	if ((offset + 1) < adev->wb.num_wb) {
-		__set_bit(offset, adev->wb.used);
-		__set_bit(offset + 1, adev->wb.used);
-		*wb = offset;
+		*wb = offset * 8; /* convert to dw offset */
 		return 0;
 	} else {
 		return -EINVAL;
@@ -599,22 +561,6 @@ void amdgpu_wb_free(struct amdgpu_device *adev, u32 wb)
 }
 
 /**
- * amdgpu_wb_free_64bit - Free a wb entry
- *
- * @adev: amdgpu_device pointer
- * @wb: wb index
- *
- * Free a wb slot allocated for use by the driver (all asics)
- */
-void amdgpu_wb_free_64bit(struct amdgpu_device *adev, u32 wb)
-{
-	if ((wb + 1) < adev->wb.num_wb) {
-		__clear_bit(wb, adev->wb.used);
-		__clear_bit(wb + 1, adev->wb.used);
-	}
-}
-
-/**
  * amdgpu_vram_location - try to find VRAM location
  * @adev: amdgpu device structure holding all necessary informations
  * @mc: memory controller structure holding memory informations
@@ -665,7 +611,7 @@ void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, u64
 }
 
 /**
- * amdgpu_gtt_location - try to find GTT location
+ * amdgpu_gart_location - try to find GTT location
  * @adev: amdgpu device structure holding all necessary informations
  * @mc: memory controller structure holding memory informations
  *
@@ -676,28 +622,28 @@ void amdgpu_vram_location(struct amdgpu_device *adev, struct amdgpu_mc *mc, u64
  *
  * FIXME: when reducing GTT size align new size on power of 2.
  */
-void amdgpu_gtt_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
+void amdgpu_gart_location(struct amdgpu_device *adev, struct amdgpu_mc *mc)
 {
 	u64 size_af, size_bf;
 
-	size_af = ((adev->mc.mc_mask - mc->vram_end) + mc->gtt_base_align) & ~mc->gtt_base_align;
-	size_bf = mc->vram_start & ~mc->gtt_base_align;
+	size_af = adev->mc.mc_mask - mc->vram_end;
+	size_bf = mc->vram_start;
 	if (size_bf > size_af) {
-		if (mc->gtt_size > size_bf) {
+		if (mc->gart_size > size_bf) {
 			dev_warn(adev->dev, "limiting GTT\n");
-			mc->gtt_size = size_bf;
+			mc->gart_size = size_bf;
 		}
-		mc->gtt_start = 0;
+		mc->gart_start = 0;
 	} else {
-		if (mc->gtt_size > size_af) {
+		if (mc->gart_size > size_af) {
 			dev_warn(adev->dev, "limiting GTT\n");
-			mc->gtt_size = size_af;
+			mc->gart_size = size_af;
 		}
-		mc->gtt_start = (mc->vram_end + 1 + mc->gtt_base_align) & ~mc->gtt_base_align;
+		mc->gart_start = mc->vram_end + 1;
 	}
-	mc->gtt_end = mc->gtt_start + mc->gtt_size - 1;
+	mc->gart_end = mc->gart_start + mc->gart_size - 1;
 	dev_info(adev->dev, "GTT: %lluM 0x%016llX - 0x%016llX\n",
-			mc->gtt_size >> 20, mc->gtt_start, mc->gtt_end);
+			mc->gart_size >> 20, mc->gart_start, mc->gart_end);
 }
 
 /*
@@ -720,7 +666,12 @@ bool amdgpu_need_post(struct amdgpu_device *adev)
 		adev->has_hw_reset = false;
 		return true;
 	}
-	/* then check MEM_SIZE, in case the crtcs are off */
+
+	/* bios scratch used on CIK+ */
+	if (adev->asic_type >= CHIP_BONAIRE)
+		return amdgpu_atombios_scratch_need_asic_init(adev);
+
+	/* check MEM_SIZE for older asics */
 	reg = amdgpu_asic_get_config_memsize(adev);
 
 	if ((reg != 0) && (reg != 0xffffffff))
@@ -1031,19 +982,6 @@ static unsigned int amdgpu_vga_set_decode(void *cookie, bool state)
 		return VGA_RSRC_NORMAL_IO | VGA_RSRC_NORMAL_MEM;
 }
 
-/**
- * amdgpu_check_pot_argument - check that argument is a power of two
- *
- * @arg: value to check
- *
- * Validates that a certain argument is a power of two (all asics).
- * Returns true if argument is valid.
- */
-static bool amdgpu_check_pot_argument(int arg)
-{
-	return (arg & (arg - 1)) == 0;
-}
-
 static void amdgpu_check_block_size(struct amdgpu_device *adev)
 {
 	/* defines number of bits in page table versus page directory,
@@ -1077,7 +1015,7 @@ static void amdgpu_check_vm_size(struct amdgpu_device *adev)
 	if (amdgpu_vm_size == -1)
 		return;
 
-	if (!amdgpu_check_pot_argument(amdgpu_vm_size)) {
+	if (!is_power_of_2(amdgpu_vm_size)) {
 		dev_warn(adev->dev, "VM size (%d) must be a power of 2\n",
 			 amdgpu_vm_size);
 		goto def_value;
@@ -1118,19 +1056,31 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev)
 		dev_warn(adev->dev, "sched jobs (%d) must be at least 4\n",
 			 amdgpu_sched_jobs);
 		amdgpu_sched_jobs = 4;
-	} else if (!amdgpu_check_pot_argument(amdgpu_sched_jobs)){
+	} else if (!is_power_of_2(amdgpu_sched_jobs)){
 		dev_warn(adev->dev, "sched jobs (%d) must be a power of 2\n",
 			 amdgpu_sched_jobs);
 		amdgpu_sched_jobs = roundup_pow_of_two(amdgpu_sched_jobs);
 	}
 
-	if (amdgpu_gart_size != -1) {
+	if (amdgpu_gart_size < 32) {
+		/* gart size must be greater or equal to 32M */
+		dev_warn(adev->dev, "gart size (%d) too small\n",
+			 amdgpu_gart_size);
+		amdgpu_gart_size = 32;
+	}
+
+	if (amdgpu_gtt_size != -1 && amdgpu_gtt_size < 32) {
 		/* gtt size must be greater or equal to 32M */
-		if (amdgpu_gart_size < 32) {
-			dev_warn(adev->dev, "gart size (%d) too small\n",
-				 amdgpu_gart_size);
-			amdgpu_gart_size = -1;
-		}
+		dev_warn(adev->dev, "gtt size (%d) too small\n",
+				 amdgpu_gtt_size);
+		amdgpu_gtt_size = -1;
+	}
+
+	/* valid range is between 4 and 9 inclusive */
+	if (amdgpu_vm_fragment_size != -1 &&
+	    (amdgpu_vm_fragment_size > 9 || amdgpu_vm_fragment_size < 4)) {
+		dev_warn(adev->dev, "valid range is between 4 and 9\n");
+		amdgpu_vm_fragment_size = -1;
 	}
 
 	amdgpu_check_vm_size(adev);
@@ -1138,7 +1088,7 @@ static void amdgpu_check_arguments(struct amdgpu_device *adev)
 	amdgpu_check_block_size(adev);
 
 	if (amdgpu_vram_page_split != -1 && (amdgpu_vram_page_split < 16 ||
-	    !amdgpu_check_pot_argument(amdgpu_vram_page_split))) {
+	    !is_power_of_2(amdgpu_vram_page_split))) {
 		dev_warn(adev->dev, "invalid VRAM page split (%d)\n",
 			 amdgpu_vram_page_split);
 		amdgpu_vram_page_split = 1024;
@@ -1901,7 +1851,8 @@ static int amdgpu_sriov_reinit_late(struct amdgpu_device *adev)
 		AMD_IP_BLOCK_TYPE_DCE,
 		AMD_IP_BLOCK_TYPE_GFX,
 		AMD_IP_BLOCK_TYPE_SDMA,
-		AMD_IP_BLOCK_TYPE_VCE,
+		AMD_IP_BLOCK_TYPE_UVD,
+		AMD_IP_BLOCK_TYPE_VCE
 	};
 
 	for (i = 0; i < ARRAY_SIZE(ip_order); i++) {
@@ -2019,7 +1970,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	adev->flags = flags;
 	adev->asic_type = flags & AMD_ASIC_MASK;
 	adev->usec_timeout = AMDGPU_MAX_USEC_TIMEOUT;
-	adev->mc.gtt_size = 512 * 1024 * 1024;
+	adev->mc.gart_size = 512 * 1024 * 1024;
 	adev->accel_working = false;
 	adev->num_rings = 0;
 	adev->mman.buffer_funcs = NULL;
@@ -2068,6 +2019,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	spin_lock_init(&adev->uvd_ctx_idx_lock);
 	spin_lock_init(&adev->didt_idx_lock);
 	spin_lock_init(&adev->gc_cac_idx_lock);
+	spin_lock_init(&adev->se_cac_idx_lock);
 	spin_lock_init(&adev->audio_endpt_idx_lock);
 	spin_lock_init(&adev->mm_stats.lock);
 
@@ -2143,6 +2095,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	r = amdgpu_atombios_init(adev);
 	if (r) {
 		dev_err(adev->dev, "amdgpu_atombios_init failed\n");
+		amdgpu_vf_error_put(AMDGIM_ERROR_VF_ATOMBIOS_INIT_FAIL, 0, 0);
 		goto failed;
 	}
 
@@ -2153,6 +2106,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	if (amdgpu_vpost_needed(adev)) {
 		if (!adev->bios) {
 			dev_err(adev->dev, "no vBIOS found\n");
+			amdgpu_vf_error_put(AMDGIM_ERROR_VF_NO_VBIOS, 0, 0);
 			r = -EINVAL;
 			goto failed;
 		}
@@ -2160,18 +2114,28 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 		r = amdgpu_atom_asic_init(adev->mode_info.atom_context);
 		if (r) {
 			dev_err(adev->dev, "gpu post error!\n");
+			amdgpu_vf_error_put(AMDGIM_ERROR_VF_GPU_POST_ERROR, 0, 0);
 			goto failed;
 		}
 	} else {
 		DRM_INFO("GPU post is not needed\n");
 	}
 
-	if (!adev->is_atom_fw) {
+	if (adev->is_atom_fw) {
+		/* Initialize clocks */
+		r = amdgpu_atomfirmware_get_clock_info(adev);
+		if (r) {
+			dev_err(adev->dev, "amdgpu_atomfirmware_get_clock_info failed\n");
+			amdgpu_vf_error_put(AMDGIM_ERROR_VF_ATOMBIOS_GET_CLOCK_FAIL, 0, 0);
+			goto failed;
+		}
+	} else {
 		/* Initialize clocks */
 		r = amdgpu_atombios_get_clock_info(adev);
 		if (r) {
 			dev_err(adev->dev, "amdgpu_atombios_get_clock_info failed\n");
-			return r;
+			amdgpu_vf_error_put(AMDGIM_ERROR_VF_ATOMBIOS_GET_CLOCK_FAIL, 0, 0);
+			goto failed;
 		}
 		/* init i2c buses */
 		amdgpu_atombios_i2c_init(adev);
@@ -2181,6 +2145,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	r = amdgpu_fence_driver_init(adev);
 	if (r) {
 		dev_err(adev->dev, "amdgpu_fence_driver_init failed\n");
+		amdgpu_vf_error_put(AMDGIM_ERROR_VF_FENCE_INIT_FAIL, 0, 0);
 		goto failed;
 	}
 
@@ -2190,6 +2155,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	r = amdgpu_init(adev);
 	if (r) {
 		dev_err(adev->dev, "amdgpu_init failed\n");
+		amdgpu_vf_error_put(AMDGIM_ERROR_VF_AMDGPU_INIT_FAIL, 0, 0);
 		amdgpu_fini(adev);
 		goto failed;
 	}
@@ -2209,6 +2175,7 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	r = amdgpu_ib_pool_init(adev);
 	if (r) {
 		dev_err(adev->dev, "IB initialization failed (%d).\n", r);
+		amdgpu_vf_error_put(AMDGIM_ERROR_VF_IB_INIT_FAIL, 0, r);
 		goto failed;
 	}
 
@@ -2253,12 +2220,14 @@ int amdgpu_device_init(struct amdgpu_device *adev,
 	r = amdgpu_late_init(adev);
 	if (r) {
 		dev_err(adev->dev, "amdgpu_late_init failed\n");
+		amdgpu_vf_error_put(AMDGIM_ERROR_VF_AMDGPU_LATE_INIT_FAIL, 0, r);
 		goto failed;
 	}
 
 	return 0;
 
 failed:
+	amdgpu_vf_error_trans_all(adev);
 	if (runtime)
 		vga_switcheroo_fini_domain_pm_ops(adev->dev);
 	return r;
@@ -2351,6 +2320,8 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
 	}
 	drm_modeset_unlock_all(dev);
 
+	amdgpu_amdkfd_suspend(adev);
+
 	/* unpin the front buffers and cursors */
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
 		struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
@@ -2392,10 +2363,7 @@ int amdgpu_device_suspend(struct drm_device *dev, bool suspend, bool fbcon)
 	 */
 	amdgpu_bo_evict_vram(adev);
 
-	if (adev->is_atom_fw)
-		amdgpu_atomfirmware_scratch_regs_save(adev);
-	else
-		amdgpu_atombios_scratch_regs_save(adev);
+	amdgpu_atombios_scratch_regs_save(adev);
 	pci_save_state(dev->pdev);
 	if (suspend) {
 		/* Shut down the device */
@@ -2444,10 +2412,7 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon)
 		if (r)
 			goto unlock;
 	}
-	if (adev->is_atom_fw)
-		amdgpu_atomfirmware_scratch_regs_restore(adev);
-	else
-		amdgpu_atombios_scratch_regs_restore(adev);
+	amdgpu_atombios_scratch_regs_restore(adev);
 
 	/* post card */
 	if (amdgpu_need_post(adev)) {
@@ -2490,6 +2455,9 @@ int amdgpu_device_resume(struct drm_device *dev, bool resume, bool fbcon)
 			}
 		}
 	}
+	r = amdgpu_amdkfd_resume(adev);
+	if (r)
+		return r;
 
 	/* blat the mode back in */
 	if (fbcon) {
@@ -2860,21 +2828,9 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
 		r = amdgpu_suspend(adev);
 
 retry:
-		/* Disable fb access */
-		if (adev->mode_info.num_crtc) {
-			struct amdgpu_mode_mc_save save;
-			amdgpu_display_stop_mc_access(adev, &save);
-			amdgpu_wait_for_idle(adev, AMD_IP_BLOCK_TYPE_GMC);
-		}
-		if (adev->is_atom_fw)
-			amdgpu_atomfirmware_scratch_regs_save(adev);
-		else
-			amdgpu_atombios_scratch_regs_save(adev);
+		amdgpu_atombios_scratch_regs_save(adev);
 		r = amdgpu_asic_reset(adev);
-		if (adev->is_atom_fw)
-			amdgpu_atomfirmware_scratch_regs_restore(adev);
-		else
-			amdgpu_atombios_scratch_regs_restore(adev);
+		amdgpu_atombios_scratch_regs_restore(adev);
 		/* post card */
 		amdgpu_atom_asic_init(adev->mode_info.atom_context);
 
@@ -2952,6 +2908,7 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
 		}
 	} else {
 		dev_err(adev->dev, "asic resume failed (%d).\n", r);
+		amdgpu_vf_error_put(AMDGIM_ERROR_VF_ASIC_RESUME_FAIL, 0, r);
 		for (i = 0; i < AMDGPU_MAX_RINGS; ++i) {
 			if (adev->rings[i] && adev->rings[i]->sched.thread) {
 				kthread_unpark(adev->rings[i]->sched.thread);
@@ -2962,12 +2919,16 @@ int amdgpu_gpu_reset(struct amdgpu_device *adev)
 	drm_helper_resume_force_mode(adev->ddev);
 
 	ttm_bo_unlock_delayed_workqueue(&adev->mman.bdev, resched);
-	if (r)
+	if (r) {
 		/* bad news, how to tell it to userspace ? */
 		dev_info(adev->dev, "GPU reset failed\n");
-	else
+		amdgpu_vf_error_put(AMDGIM_ERROR_VF_GPU_RESET_FAIL, 0, r);
+	}
+	else {
 		dev_info(adev->dev, "GPU reset successed!\n");
+	}
 
+	amdgpu_vf_error_trans_all(adev);
 	return r;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
index cdf2ab2..6ad2432 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c
@@ -482,7 +482,7 @@ static void amdgpu_user_framebuffer_destroy(struct drm_framebuffer *fb)
 {
 	struct amdgpu_framebuffer *amdgpu_fb = to_amdgpu_framebuffer(fb);
 
-	drm_gem_object_unreference_unlocked(amdgpu_fb->obj);
+	drm_gem_object_put_unlocked(amdgpu_fb->obj);
 	drm_framebuffer_cleanup(fb);
 	kfree(amdgpu_fb);
 }
@@ -542,14 +542,14 @@ amdgpu_user_framebuffer_create(struct drm_device *dev,
 
 	amdgpu_fb = kzalloc(sizeof(*amdgpu_fb), GFP_KERNEL);
 	if (amdgpu_fb == NULL) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR(-ENOMEM);
 	}
 
 	ret = amdgpu_framebuffer_init(dev, amdgpu_fb, mode_cmd, obj);
 	if (ret) {
 		kfree(amdgpu_fb);
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR(ret);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index b59f37c..e39ec98 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -68,13 +68,16 @@
  * - 3.16.0 - Add reserved vmid support
  * - 3.17.0 - Add AMDGPU_NUM_VRAM_CPU_PAGE_FAULTS.
  * - 3.18.0 - Export gpu always on cu bitmap
+ * - 3.19.0 - Add support for UVD MJPEG decode
  */
 #define KMS_DRIVER_MAJOR	3
-#define KMS_DRIVER_MINOR	18
+#define KMS_DRIVER_MINOR	19
 #define KMS_DRIVER_PATCHLEVEL	0
 
 int amdgpu_vram_limit = 0;
-int amdgpu_gart_size = -1; /* auto */
+int amdgpu_vis_vram_limit = 0;
+unsigned amdgpu_gart_size = 256;
+int amdgpu_gtt_size = -1; /* auto */
 int amdgpu_moverate = -1; /* auto */
 int amdgpu_benchmarking = 0;
 int amdgpu_testing = 0;
@@ -92,6 +95,7 @@ unsigned amdgpu_ip_block_mask = 0xffffffff;
 int amdgpu_bapm = -1;
 int amdgpu_deep_color = 0;
 int amdgpu_vm_size = -1;
+int amdgpu_vm_fragment_size = -1;
 int amdgpu_vm_block_size = -1;
 int amdgpu_vm_fault_stop = 0;
 int amdgpu_vm_debug = 0;
@@ -106,6 +110,7 @@ unsigned amdgpu_pcie_gen_cap = 0;
 unsigned amdgpu_pcie_lane_cap = 0;
 unsigned amdgpu_cg_mask = 0xffffffff;
 unsigned amdgpu_pg_mask = 0xffffffff;
+unsigned amdgpu_sdma_phase_quantum = 32;
 char *amdgpu_disable_cu = NULL;
 char *amdgpu_virtual_display = NULL;
 unsigned amdgpu_pp_feature_mask = 0xffffffff;
@@ -120,8 +125,14 @@ int amdgpu_lbpw = -1;
 MODULE_PARM_DESC(vramlimit, "Restrict VRAM for testing, in megabytes");
 module_param_named(vramlimit, amdgpu_vram_limit, int, 0600);
 
-MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 64, etc., -1 = auto)");
-module_param_named(gartsize, amdgpu_gart_size, int, 0600);
+MODULE_PARM_DESC(vis_vramlimit, "Restrict visible VRAM for testing, in megabytes");
+module_param_named(vis_vramlimit, amdgpu_vis_vram_limit, int, 0444);
+
+MODULE_PARM_DESC(gartsize, "Size of PCIE/IGP gart to setup in megabytes (32, 64, etc.)");
+module_param_named(gartsize, amdgpu_gart_size, uint, 0600);
+
+MODULE_PARM_DESC(gttsize, "Size of the GTT domain in megabytes (-1 = auto)");
+module_param_named(gttsize, amdgpu_gtt_size, int, 0600);
 
 MODULE_PARM_DESC(moverate, "Maximum buffer migration rate in MB/s. (32, 64, etc., -1=auto, 0=1=disabled)");
 module_param_named(moverate, amdgpu_moverate, int, 0600);
@@ -174,6 +185,9 @@ module_param_named(deep_color, amdgpu_deep_color, int, 0444);
 MODULE_PARM_DESC(vm_size, "VM address space size in gigabytes (default 64GB)");
 module_param_named(vm_size, amdgpu_vm_size, int, 0444);
 
+MODULE_PARM_DESC(vm_fragment_size, "VM fragment size in bits (4, 5, etc. 4 = 64K (default), Max 9 = 2M)");
+module_param_named(vm_fragment_size, amdgpu_vm_fragment_size, int, 0444);
+
 MODULE_PARM_DESC(vm_block_size, "VM page table size in bits (default depending on vm_size)");
 module_param_named(vm_block_size, amdgpu_vm_block_size, int, 0444);
 
@@ -186,7 +200,7 @@ module_param_named(vm_debug, amdgpu_vm_debug, int, 0644);
 MODULE_PARM_DESC(vm_update_mode, "VM update using CPU (0 = never (default except for large BAR(LB)), 1 = Graphics only, 2 = Compute only (default for LB), 3 = Both");
 module_param_named(vm_update_mode, amdgpu_vm_update_mode, int, 0444);
 
-MODULE_PARM_DESC(vram_page_split, "Number of pages after we split VRAM allocations (default 1024, -1 = disable)");
+MODULE_PARM_DESC(vram_page_split, "Number of pages after we split VRAM allocations (default 512, -1 = disable)");
 module_param_named(vram_page_split, amdgpu_vram_page_split, int, 0444);
 
 MODULE_PARM_DESC(exp_hw_support, "experimental hw support (1 = enable, 0 = disable (default))");
@@ -199,7 +213,7 @@ MODULE_PARM_DESC(sched_hw_submission, "the max number of HW submissions (default
 module_param_named(sched_hw_submission, amdgpu_sched_hw_submission, int, 0444);
 
 MODULE_PARM_DESC(ppfeaturemask, "all power features enabled (default))");
-module_param_named(ppfeaturemask, amdgpu_pp_feature_mask, int, 0444);
+module_param_named(ppfeaturemask, amdgpu_pp_feature_mask, uint, 0444);
 
 MODULE_PARM_DESC(no_evict, "Support pinning request from user space (1 = enable, 0 = disable (default))");
 module_param_named(no_evict, amdgpu_no_evict, int, 0444);
@@ -219,6 +233,9 @@ module_param_named(cg_mask, amdgpu_cg_mask, uint, 0444);
 MODULE_PARM_DESC(pg_mask, "Powergating flags mask (0 = disable power gating)");
 module_param_named(pg_mask, amdgpu_pg_mask, uint, 0444);
 
+MODULE_PARM_DESC(sdma_phase_quantum, "SDMA context switch phase quantum (x 1K GPU clock cycles, 0 = no change (default 32))");
+module_param_named(sdma_phase_quantum, amdgpu_sdma_phase_quantum, uint, 0444);
+
 MODULE_PARM_DESC(disable_cu, "Disable CUs (se.sh.cu,...)");
 module_param_named(disable_cu, amdgpu_disable_cu, charp, 0444);
 
@@ -803,7 +820,6 @@ static struct drm_driver kms_driver = {
 	.open = amdgpu_driver_open_kms,
 	.postclose = amdgpu_driver_postclose_kms,
 	.lastclose = amdgpu_driver_lastclose_kms,
-	.set_busid = drm_pci_set_busid,
 	.unload = amdgpu_driver_unload_kms,
 	.get_vblank_counter = amdgpu_get_vblank_counter_kms,
 	.enable_vblank = amdgpu_enable_vblank_kms,
@@ -823,7 +839,6 @@ static struct drm_driver kms_driver = {
 	.gem_close_object = amdgpu_gem_object_close,
 	.dumb_create = amdgpu_mode_dumb_create,
 	.dumb_map_offset = amdgpu_mode_dumb_mmap,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.fops = &amdgpu_driver_kms_fops,
 
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
index c0d8c6f..9afa9c0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fb.c
@@ -118,7 +118,7 @@ static void amdgpufb_destroy_pinned_object(struct drm_gem_object *gobj)
 		amdgpu_bo_unpin(abo);
 		amdgpu_bo_unreserve(abo);
 	}
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 }
 
 static int amdgpufb_create_pinned_object(struct amdgpu_fbdev *rfbdev,
@@ -245,13 +245,12 @@ static int amdgpufb_create(struct drm_fb_helper *helper,
 
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
 
-	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &amdgpufb_ops;
 
 	tmp = amdgpu_bo_gpu_offset(abo) - adev->mc.vram_start;
 	info->fix.smem_start = adev->mc.aper_base + tmp;
 	info->fix.smem_len = amdgpu_bo_size(abo);
-	info->screen_base = abo->kptr;
+	info->screen_base = amdgpu_bo_kptr(abo);
 	info->screen_size = amdgpu_bo_size(abo);
 
 	drm_fb_helper_fill_var(info, &rfbdev->helper, sizes->fb_width, sizes->fb_height);
@@ -281,7 +280,7 @@ static int amdgpufb_create(struct drm_fb_helper *helper,
 
 	}
 	if (fb && ret) {
-		drm_gem_object_unreference_unlocked(gobj);
+		drm_gem_object_put_unlocked(gobj);
 		drm_framebuffer_unregister_private(fb);
 		drm_framebuffer_cleanup(fb);
 		kfree(fb);
@@ -312,31 +311,7 @@ static int amdgpu_fbdev_destroy(struct drm_device *dev, struct amdgpu_fbdev *rfb
 	return 0;
 }
 
-/** Sets the color ramps on behalf of fbcon */
-static void amdgpu_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-				      u16 blue, int regno)
-{
-	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-
-	amdgpu_crtc->lut_r[regno] = red >> 6;
-	amdgpu_crtc->lut_g[regno] = green >> 6;
-	amdgpu_crtc->lut_b[regno] = blue >> 6;
-}
-
-/** Gets the color ramps on behalf of fbcon */
-static void amdgpu_crtc_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-				      u16 *blue, int regno)
-{
-	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-
-	*red = amdgpu_crtc->lut_r[regno] << 6;
-	*green = amdgpu_crtc->lut_g[regno] << 6;
-	*blue = amdgpu_crtc->lut_b[regno] << 6;
-}
-
 static const struct drm_fb_helper_funcs amdgpu_fb_helper_funcs = {
-	.gamma_set = amdgpu_crtc_fb_gamma_set,
-	.gamma_get = amdgpu_crtc_fb_gamma_get,
 	.fb_probe = amdgpufb_create,
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
index a57abc1..94c1e2e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.c
@@ -55,6 +55,19 @@
 /*
  * Common GART table functions.
  */
+
+/**
+ * amdgpu_gart_set_defaults - set the default gart_size
+ *
+ * @adev: amdgpu_device pointer
+ *
+ * Set the default gart_size from the gartsize module parameter (in MB).
+ */
+void amdgpu_gart_set_defaults(struct amdgpu_device *adev)
+{
+	adev->mc.gart_size = (uint64_t)amdgpu_gart_size << 20;
+}
+
 /**
  * amdgpu_gart_table_ram_alloc - allocate system ram for gart page table
  *
@@ -131,7 +144,7 @@ int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev)
 				     PAGE_SIZE, true, AMDGPU_GEM_DOMAIN_VRAM,
 				     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
 				     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-				     NULL, NULL, &adev->gart.robj);
+				     NULL, NULL, 0, &adev->gart.robj);
 		if (r) {
 			return r;
 		}
@@ -263,6 +276,41 @@ int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
 }
 
 /**
+ * amdgpu_gart_map - map dma_addresses into GART entries
+ *
+ * @adev: amdgpu_device pointer
+ * @offset: offset into the GPU's gart aperture
+ * @pages: number of pages to bind
+ * @dma_addr: DMA addresses of pages
+ *
+ * Map the dma_addresses into GART entries (all asics).
+ * Returns 0 for success, -EINVAL for failure.
+ */
+int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
+		    int pages, dma_addr_t *dma_addr, uint64_t flags,
+		    void *dst)
+{
+	uint64_t page_base;
+	unsigned i, j, t;
+
+	if (!adev->gart.ready) {
+		WARN(1, "trying to bind memory to uninitialized GART !\n");
+		return -EINVAL;
+	}
+
+	t = offset / AMDGPU_GPU_PAGE_SIZE;
+
+	for (i = 0; i < pages; i++) {
+		page_base = dma_addr[i];
+		for (j = 0; j < (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); j++, t++) {
+			amdgpu_gart_set_pte_pde(adev, dst, t, page_base, flags);
+			page_base += AMDGPU_GPU_PAGE_SIZE;
+		}
+	}
+	return 0;
+}
+
+/**
  * amdgpu_gart_bind - bind pages into the gart page table
  *
  * @adev: amdgpu_device pointer
@@ -279,31 +327,30 @@ int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset,
 		     int pages, struct page **pagelist, dma_addr_t *dma_addr,
 		     uint64_t flags)
 {
-	unsigned t;
-	unsigned p;
-	uint64_t page_base;
-	int i, j;
+#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
+	unsigned i,t,p;
+#endif
+	int r;
 
 	if (!adev->gart.ready) {
 		WARN(1, "trying to bind memory to uninitialized GART !\n");
 		return -EINVAL;
 	}
 
+#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
 	t = offset / AMDGPU_GPU_PAGE_SIZE;
 	p = t / (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE);
-
-	for (i = 0; i < pages; i++, p++) {
-#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
+	for (i = 0; i < pages; i++, p++)
 		adev->gart.pages[p] = pagelist[i];
 #endif
-		if (adev->gart.ptr) {
-			page_base = dma_addr[i];
-			for (j = 0; j < (PAGE_SIZE / AMDGPU_GPU_PAGE_SIZE); j++, t++) {
-				amdgpu_gart_set_pte_pde(adev, adev->gart.ptr, t, page_base, flags);
-				page_base += AMDGPU_GPU_PAGE_SIZE;
-			}
-		}
+
+	if (adev->gart.ptr) {
+		r = amdgpu_gart_map(adev, offset, pages, dma_addr, flags,
+			    adev->gart.ptr);
+		if (r)
+			return r;
 	}
+
 	mb();
 	amdgpu_gart_flush_gpu_tlb(adev, 0);
 	return 0;
@@ -333,8 +380,8 @@ int amdgpu_gart_init(struct amdgpu_device *adev)
 	if (r)
 		return r;
 	/* Compute table size */
-	adev->gart.num_cpu_pages = adev->mc.gtt_size / PAGE_SIZE;
-	adev->gart.num_gpu_pages = adev->mc.gtt_size / AMDGPU_GPU_PAGE_SIZE;
+	adev->gart.num_cpu_pages = adev->mc.gart_size / PAGE_SIZE;
+	adev->gart.num_gpu_pages = adev->mc.gart_size / AMDGPU_GPU_PAGE_SIZE;
 	DRM_INFO("GART: num cpu pages %u, num gpu pages %u\n",
 		 adev->gart.num_cpu_pages, adev->gart.num_gpu_pages);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
new file mode 100644
index 0000000..d4cce69
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gart.h
@@ -0,0 +1,77 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __AMDGPU_GART_H__
+#define __AMDGPU_GART_H__
+
+#include <linux/types.h>
+
+/*
+ * GART structures, functions & helpers
+ */
+struct amdgpu_device;
+struct amdgpu_bo;
+struct amdgpu_gart_funcs;
+
+#define AMDGPU_GPU_PAGE_SIZE 4096
+#define AMDGPU_GPU_PAGE_MASK (AMDGPU_GPU_PAGE_SIZE - 1)
+#define AMDGPU_GPU_PAGE_SHIFT 12
+#define AMDGPU_GPU_PAGE_ALIGN(a) (((a) + AMDGPU_GPU_PAGE_MASK) & ~AMDGPU_GPU_PAGE_MASK)
+
+struct amdgpu_gart {
+	dma_addr_t			table_addr;
+	struct amdgpu_bo		*robj;
+	void				*ptr;
+	unsigned			num_gpu_pages;
+	unsigned			num_cpu_pages;
+	unsigned			table_size;
+#ifdef CONFIG_DRM_AMDGPU_GART_DEBUGFS
+	struct page			**pages;
+#endif
+	bool				ready;
+
+	/* Asic default pte flags */
+	uint64_t			gart_pte_flags;
+
+	const struct amdgpu_gart_funcs *gart_funcs;
+};
+
+void amdgpu_gart_set_defaults(struct amdgpu_device *adev);
+int amdgpu_gart_table_ram_alloc(struct amdgpu_device *adev);
+void amdgpu_gart_table_ram_free(struct amdgpu_device *adev);
+int amdgpu_gart_table_vram_alloc(struct amdgpu_device *adev);
+void amdgpu_gart_table_vram_free(struct amdgpu_device *adev);
+int amdgpu_gart_table_vram_pin(struct amdgpu_device *adev);
+void amdgpu_gart_table_vram_unpin(struct amdgpu_device *adev);
+int amdgpu_gart_init(struct amdgpu_device *adev);
+void amdgpu_gart_fini(struct amdgpu_device *adev);
+int amdgpu_gart_unbind(struct amdgpu_device *adev, uint64_t offset,
+		       int pages);
+int amdgpu_gart_map(struct amdgpu_device *adev, uint64_t offset,
+		    int pages, dma_addr_t *dma_addr, uint64_t flags,
+		    void *dst);
+int amdgpu_gart_bind(struct amdgpu_device *adev, uint64_t offset,
+		     int pages, struct page **pagelist,
+		     dma_addr_t *dma_addr, uint64_t flags);
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 621f739..7171968 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -49,7 +49,6 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
 				struct drm_gem_object **obj)
 {
 	struct amdgpu_bo *robj;
-	unsigned long max_size;
 	int r;
 
 	*obj = NULL;
@@ -58,20 +57,9 @@ int amdgpu_gem_object_create(struct amdgpu_device *adev, unsigned long size,
 		alignment = PAGE_SIZE;
 	}
 
-	if (!(initial_domain & (AMDGPU_GEM_DOMAIN_GDS | AMDGPU_GEM_DOMAIN_GWS | AMDGPU_GEM_DOMAIN_OA))) {
-		/* Maximum bo size is the unpinned gtt size since we use the gtt to
-		 * handle vram to system pool migrations.
-		 */
-		max_size = adev->mc.gtt_size - adev->gart_pin_size;
-		if (size > max_size) {
-			DRM_DEBUG("Allocation size %ldMb bigger than %ldMb limit\n",
-				  size >> 20, max_size >> 20);
-			return -ENOMEM;
-		}
-	}
 retry:
 	r = amdgpu_bo_create(adev, size, alignment, kernel, initial_domain,
-			     flags, NULL, NULL, &robj);
+			     flags, NULL, NULL, 0, &robj);
 	if (r) {
 		if (r != -ERESTARTSYS) {
 			if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
@@ -103,7 +91,7 @@ void amdgpu_gem_force_release(struct amdgpu_device *adev)
 		spin_lock(&file->table_lock);
 		idr_for_each_entry(&file->object_idr, gobj, handle) {
 			WARN_ONCE(1, "And also active allocations!\n");
-			drm_gem_object_unreference_unlocked(gobj);
+			drm_gem_object_put_unlocked(gobj);
 		}
 		idr_destroy(&file->object_idr);
 		spin_unlock(&file->table_lock);
@@ -237,9 +225,7 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
 	if (args->in.domain_flags & ~(AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
 				      AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
 				      AMDGPU_GEM_CREATE_CPU_GTT_USWC |
-				      AMDGPU_GEM_CREATE_VRAM_CLEARED|
-				      AMDGPU_GEM_CREATE_SHADOW |
-				      AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS))
+				      AMDGPU_GEM_CREATE_VRAM_CLEARED))
 		return -EINVAL;
 
 	/* reject invalid gem domains */
@@ -275,7 +261,7 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
 
 	r = drm_gem_handle_create(filp, gobj, &handle);
 	/* drop reference from allocate - handle holds it now */
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (r)
 		return r;
 
@@ -318,7 +304,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
 		return r;
 
 	bo = gem_to_amdgpu_bo(gobj);
-	bo->prefered_domains = AMDGPU_GEM_DOMAIN_GTT;
+	bo->preferred_domains = AMDGPU_GEM_DOMAIN_GTT;
 	bo->allowed_domains = AMDGPU_GEM_DOMAIN_GTT;
 	r = amdgpu_ttm_tt_set_userptr(bo->tbo.ttm, args->addr, args->flags);
 	if (r)
@@ -353,7 +339,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
 
 	r = drm_gem_handle_create(filp, gobj, &handle);
 	/* drop reference from allocate - handle holds it now */
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (r)
 		return r;
 
@@ -367,7 +353,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
 	up_read(&current->mm->mmap_sem);
 
 release_object:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 
 	return r;
 }
@@ -386,11 +372,11 @@ int amdgpu_mode_dumb_mmap(struct drm_file *filp,
 	robj = gem_to_amdgpu_bo(gobj);
 	if (amdgpu_ttm_tt_get_usermm(robj->tbo.ttm) ||
 	    (robj->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)) {
-		drm_gem_object_unreference_unlocked(gobj);
+		drm_gem_object_put_unlocked(gobj);
 		return -EPERM;
 	}
 	*offset_p = amdgpu_bo_mmap_offset(robj);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return 0;
 }
 
@@ -460,7 +446,7 @@ int amdgpu_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
 	} else
 		r = ret;
 
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -503,7 +489,7 @@ int amdgpu_gem_metadata_ioctl(struct drm_device *dev, void *data,
 unreserve:
 	amdgpu_bo_unreserve(robj);
 out:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -635,7 +621,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 
 	switch (args->operation) {
 	case AMDGPU_VA_OP_MAP:
-		r = amdgpu_vm_alloc_pts(adev, bo_va->vm, args->va_address,
+		r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args->va_address,
 					args->map_size);
 		if (r)
 			goto error_backoff;
@@ -655,7 +641,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 						args->map_size);
 		break;
 	case AMDGPU_VA_OP_REPLACE:
-		r = amdgpu_vm_alloc_pts(adev, bo_va->vm, args->va_address,
+		r = amdgpu_vm_alloc_pts(adev, bo_va->base.vm, args->va_address,
 					args->map_size);
 		if (r)
 			goto error_backoff;
@@ -676,7 +662,7 @@ int amdgpu_gem_va_ioctl(struct drm_device *dev, void *data,
 	ttm_eu_backoff_reservation(&ticket, &list);
 
 error_unref:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -701,11 +687,11 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data,
 	switch (args->op) {
 	case AMDGPU_GEM_OP_GET_GEM_CREATE_INFO: {
 		struct drm_amdgpu_gem_create_in info;
-		void __user *out = (void __user *)(uintptr_t)args->value;
+		void __user *out = u64_to_user_ptr(args->value);
 
 		info.bo_size = robj->gem_base.size;
 		info.alignment = robj->tbo.mem.page_alignment << PAGE_SHIFT;
-		info.domains = robj->prefered_domains;
+		info.domains = robj->preferred_domains;
 		info.domain_flags = robj->flags;
 		amdgpu_bo_unreserve(robj);
 		if (copy_to_user(out, &info, sizeof(info)))
@@ -723,10 +709,10 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data,
 			amdgpu_bo_unreserve(robj);
 			break;
 		}
-		robj->prefered_domains = args->value & (AMDGPU_GEM_DOMAIN_VRAM |
+		robj->preferred_domains = args->value & (AMDGPU_GEM_DOMAIN_VRAM |
 							AMDGPU_GEM_DOMAIN_GTT |
 							AMDGPU_GEM_DOMAIN_CPU);
-		robj->allowed_domains = robj->prefered_domains;
+		robj->allowed_domains = robj->preferred_domains;
 		if (robj->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
 			robj->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
 
@@ -738,7 +724,7 @@ int amdgpu_gem_op_ioctl(struct drm_device *dev, void *data,
 	}
 
 out:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -766,7 +752,7 @@ int amdgpu_mode_dumb_create(struct drm_file *file_priv,
 
 	r = drm_gem_handle_create(file_priv, gobj, &handle);
 	/* drop reference from allocate - handle holds it now */
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (r) {
 		return r;
 	}
@@ -784,6 +770,7 @@ static int amdgpu_debugfs_gem_bo_info(int id, void *ptr, void *data)
 	unsigned domain;
 	const char *placement;
 	unsigned pin_count;
+	uint64_t offset;
 
 	domain = amdgpu_mem_type_to_domain(bo->tbo.mem.mem_type);
 	switch (domain) {
@@ -798,9 +785,12 @@ static int amdgpu_debugfs_gem_bo_info(int id, void *ptr, void *data)
 		placement = " CPU";
 		break;
 	}
-	seq_printf(m, "\t0x%08x: %12ld byte %s @ 0x%010Lx",
-		   id, amdgpu_bo_size(bo), placement,
-		   amdgpu_bo_gpu_offset(bo));
+	seq_printf(m, "\t0x%08x: %12ld byte %s",
+		   id, amdgpu_bo_size(bo), placement);
+
+	offset = ACCESS_ONCE(bo->tbo.mem.start);
+	if (offset != AMDGPU_BO_INVALID_OFFSET)
+		seq_printf(m, " @ 0x%010Lx", offset);
 
 	pin_count = ACCESS_ONCE(bo->pin_count);
 	if (pin_count)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
index e26108a..4f6c68f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gfx.c
@@ -125,7 +125,8 @@ void amdgpu_gfx_compute_queue_acquire(struct amdgpu_device *adev)
 		if (mec >= adev->gfx.mec.num_mec)
 			break;
 
-		if (adev->gfx.mec.num_mec > 1) {
+		/* FIXME: spreading the queues across pipes causes perf regressions */
+		if (0) {
 			/* policy: amdgpu owns the first two queues of the first MEC */
 			if (mec == 0 && queue < 2)
 				set_bit(i, adev->gfx.mec.queue_bitmap);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
index f7d22c4..9e05e25 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gtt_mgr.c
@@ -28,7 +28,7 @@
 struct amdgpu_gtt_mgr {
 	struct drm_mm mm;
 	spinlock_t lock;
-	uint64_t available;
+	atomic64_t available;
 };
 
 /**
@@ -42,15 +42,19 @@ struct amdgpu_gtt_mgr {
 static int amdgpu_gtt_mgr_init(struct ttm_mem_type_manager *man,
 			       unsigned long p_size)
 {
+	struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
 	struct amdgpu_gtt_mgr *mgr;
+	uint64_t start, size;
 
 	mgr = kzalloc(sizeof(*mgr), GFP_KERNEL);
 	if (!mgr)
 		return -ENOMEM;
 
-	drm_mm_init(&mgr->mm, 0, p_size);
+	start = AMDGPU_GTT_MAX_TRANSFER_SIZE * AMDGPU_GTT_NUM_TRANSFER_WINDOWS;
+	size = (adev->mc.gart_size >> PAGE_SHIFT) - start;
+	drm_mm_init(&mgr->mm, start, size);
 	spin_lock_init(&mgr->lock);
-	mgr->available = p_size;
+	atomic64_set(&mgr->available, p_size);
 	man->priv = mgr;
 	return 0;
 }
@@ -81,6 +85,20 @@ static int amdgpu_gtt_mgr_fini(struct ttm_mem_type_manager *man)
 }
 
 /**
+ * amdgpu_gtt_mgr_is_allocated - Check if mem has address space
+ *
+ * @mem: the mem object to check
+ *
+ * Check if a mem object already has address space allocated.
+ */
+bool amdgpu_gtt_mgr_is_allocated(struct ttm_mem_reg *mem)
+{
+	struct drm_mm_node *node = mem->mm_node;
+
+	return (node->start != AMDGPU_BO_INVALID_OFFSET);
+}
+
+/**
  * amdgpu_gtt_mgr_alloc - allocate new ranges
  *
  * @man: TTM memory type manager
@@ -95,13 +113,14 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
 			 const struct ttm_place *place,
 			 struct ttm_mem_reg *mem)
 {
+	struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
 	struct amdgpu_gtt_mgr *mgr = man->priv;
 	struct drm_mm_node *node = mem->mm_node;
 	enum drm_mm_insert_mode mode;
 	unsigned long fpfn, lpfn;
 	int r;
 
-	if (node->start != AMDGPU_BO_INVALID_OFFSET)
+	if (amdgpu_gtt_mgr_is_allocated(mem))
 		return 0;
 
 	if (place)
@@ -112,7 +131,7 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
 	if (place && place->lpfn)
 		lpfn = place->lpfn;
 	else
-		lpfn = man->size;
+		lpfn = adev->gart.num_cpu_pages;
 
 	mode = DRM_MM_INSERT_BEST;
 	if (place && place->flags & TTM_PL_FLAG_TOPDOWN)
@@ -134,15 +153,6 @@ int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
 	return r;
 }
 
-void amdgpu_gtt_mgr_print(struct seq_file *m, struct ttm_mem_type_manager *man)
-{
-	struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
-	struct amdgpu_gtt_mgr *mgr = man->priv;
-
-	seq_printf(m, "man size:%llu pages, gtt available:%llu pages, usage:%lluMB\n",
-		   man->size, mgr->available, (u64)atomic64_read(&adev->gtt_usage) >> 20);
-
-}
 /**
  * amdgpu_gtt_mgr_new - allocate a new node
  *
@@ -163,11 +173,11 @@ static int amdgpu_gtt_mgr_new(struct ttm_mem_type_manager *man,
 	int r;
 
 	spin_lock(&mgr->lock);
-	if (mgr->available < mem->num_pages) {
+	if (atomic64_read(&mgr->available) < mem->num_pages) {
 		spin_unlock(&mgr->lock);
 		return 0;
 	}
-	mgr->available -= mem->num_pages;
+	atomic64_sub(mem->num_pages, &mgr->available);
 	spin_unlock(&mgr->lock);
 
 	node = kzalloc(sizeof(*node), GFP_KERNEL);
@@ -194,9 +204,7 @@ static int amdgpu_gtt_mgr_new(struct ttm_mem_type_manager *man,
 
 	return 0;
 err_out:
-	spin_lock(&mgr->lock);
-	mgr->available += mem->num_pages;
-	spin_unlock(&mgr->lock);
+	atomic64_add(mem->num_pages, &mgr->available);
 
 	return r;
 }
@@ -223,30 +231,47 @@ static void amdgpu_gtt_mgr_del(struct ttm_mem_type_manager *man,
 	spin_lock(&mgr->lock);
 	if (node->start != AMDGPU_BO_INVALID_OFFSET)
 		drm_mm_remove_node(node);
-	mgr->available += mem->num_pages;
 	spin_unlock(&mgr->lock);
+	atomic64_add(mem->num_pages, &mgr->available);
 
 	kfree(node);
 	mem->mm_node = NULL;
 }
 
 /**
+ * amdgpu_gtt_mgr_usage - return usage of GTT domain
+ *
+ * @man: TTM memory type manager
+ *
+ * Return how many bytes are used in the GTT domain
+ */
+uint64_t amdgpu_gtt_mgr_usage(struct ttm_mem_type_manager *man)
+{
+	struct amdgpu_gtt_mgr *mgr = man->priv;
+
+	return (u64)(man->size - atomic64_read(&mgr->available)) * PAGE_SIZE;
+}
+
+/**
  * amdgpu_gtt_mgr_debug - dump VRAM table
  *
  * @man: TTM memory type manager
- * @prefix: text prefix
+ * @printer: DRM printer to use
  *
  * Dump the table content using printk.
  */
 static void amdgpu_gtt_mgr_debug(struct ttm_mem_type_manager *man,
-				  const char *prefix)
+				 struct drm_printer *printer)
 {
 	struct amdgpu_gtt_mgr *mgr = man->priv;
-	struct drm_printer p = drm_debug_printer(prefix);
 
 	spin_lock(&mgr->lock);
-	drm_mm_print(&mgr->mm, &p);
+	drm_mm_print(&mgr->mm, printer);
 	spin_unlock(&mgr->lock);
+
+	drm_printf(printer, "man size:%llu pages, gtt available:%llu pages, usage:%lluMB\n",
+		   man->size, (u64)atomic64_read(&mgr->available),
+		   amdgpu_gtt_mgr_usage(man) >> 20);
 }
 
 const struct ttm_mem_type_manager_func amdgpu_gtt_mgr_func = {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
index f774b3f..659997b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ib.c
@@ -130,6 +130,7 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 
 	unsigned i;
 	int r = 0;
+	bool need_pipe_sync = false;
 
 	if (num_ibs == 0)
 		return -EINVAL;
@@ -165,15 +166,15 @@ int amdgpu_ib_schedule(struct amdgpu_ring *ring, unsigned num_ibs,
 	if (ring->funcs->emit_pipeline_sync && job &&
 	    ((tmp = amdgpu_sync_get_fence(&job->sched_sync)) ||
 	     amdgpu_vm_need_pipeline_sync(ring, job))) {
-		amdgpu_ring_emit_pipeline_sync(ring);
+		need_pipe_sync = true;
 		dma_fence_put(tmp);
 	}
 
 	if (ring->funcs->insert_start)
 		ring->funcs->insert_start(ring);
 
-	if (vm) {
-		r = amdgpu_vm_flush(ring, job);
+	if (job) {
+		r = amdgpu_vm_flush(ring, job, need_pipe_sync);
 		if (r) {
 			amdgpu_ring_undo(ring);
 			return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
index 62da6c5..4bdd851f 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -220,6 +220,10 @@ int amdgpu_irq_init(struct amdgpu_device *adev)
 	int r = 0;
 
 	spin_lock_init(&adev->irq.lock);
+
+	/* Disable vblank irqs aggressively for power-saving */
+	adev->ddev->vblank_disable_immediate = true;
+
 	r = drm_vblank_init(adev->ddev, adev->mode_info.num_crtc);
 	if (r) {
 		return r;
@@ -263,7 +267,6 @@ void amdgpu_irq_fini(struct amdgpu_device *adev)
 {
 	unsigned i, j;
 
-	drm_vblank_cleanup(adev->ddev);
 	if (adev->irq.installed) {
 		drm_irq_uninstall(adev->ddev);
 		adev->irq.installed = false;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
index 3d641e1..4510627 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c
@@ -81,6 +81,8 @@ int amdgpu_job_alloc_with_ib(struct amdgpu_device *adev, unsigned size,
 	r = amdgpu_ib_get(adev, NULL, size, &(*job)->ibs[0]);
 	if (r)
 		kfree(*job);
+	else
+		(*job)->vm_pd_addr = adev->gart.table_addr;
 
 	return r;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
index b0b2310..e162290 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_kms.c
@@ -158,7 +158,6 @@ int amdgpu_driver_load_kms(struct drm_device *dev, unsigned long flags)
 				"Error during ACPI methods call\n");
 	}
 
-	amdgpu_amdkfd_load_interface(adev);
 	amdgpu_amdkfd_device_probe(adev);
 	amdgpu_amdkfd_device_init(adev);
 
@@ -456,13 +455,13 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 		ui64 = atomic64_read(&adev->num_vram_cpu_page_faults);
 		return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
 	case AMDGPU_INFO_VRAM_USAGE:
-		ui64 = atomic64_read(&adev->vram_usage);
+		ui64 = amdgpu_vram_mgr_usage(&adev->mman.bdev.man[TTM_PL_VRAM]);
 		return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
 	case AMDGPU_INFO_VIS_VRAM_USAGE:
-		ui64 = atomic64_read(&adev->vram_vis_usage);
+		ui64 = amdgpu_vram_mgr_vis_usage(&adev->mman.bdev.man[TTM_PL_VRAM]);
 		return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
 	case AMDGPU_INFO_GTT_USAGE:
-		ui64 = atomic64_read(&adev->gtt_usage);
+		ui64 = amdgpu_gtt_mgr_usage(&adev->mman.bdev.man[TTM_PL_TT]);
 		return copy_to_user(out, &ui64, min(size, 8u)) ? -EFAULT : 0;
 	case AMDGPU_INFO_GDS_CONFIG: {
 		struct drm_amdgpu_info_gds gds_info;
@@ -485,7 +484,8 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 		vram_gtt.vram_size -= adev->vram_pin_size;
 		vram_gtt.vram_cpu_accessible_size = adev->mc.visible_vram_size;
 		vram_gtt.vram_cpu_accessible_size -= (adev->vram_pin_size - adev->invisible_pin_size);
-		vram_gtt.gtt_size  = adev->mc.gtt_size;
+		vram_gtt.gtt_size = adev->mman.bdev.man[TTM_PL_TT].size;
+		vram_gtt.gtt_size *= PAGE_SIZE;
 		vram_gtt.gtt_size -= adev->gart_pin_size;
 		return copy_to_user(out, &vram_gtt,
 				    min((size_t)size, sizeof(vram_gtt))) ? -EFAULT : 0;
@@ -497,7 +497,8 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 		mem.vram.total_heap_size = adev->mc.real_vram_size;
 		mem.vram.usable_heap_size =
 			adev->mc.real_vram_size - adev->vram_pin_size;
-		mem.vram.heap_usage = atomic64_read(&adev->vram_usage);
+		mem.vram.heap_usage =
+			amdgpu_vram_mgr_usage(&adev->mman.bdev.man[TTM_PL_VRAM]);
 		mem.vram.max_allocation = mem.vram.usable_heap_size * 3 / 4;
 
 		mem.cpu_accessible_vram.total_heap_size =
@@ -506,14 +507,16 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 			adev->mc.visible_vram_size -
 			(adev->vram_pin_size - adev->invisible_pin_size);
 		mem.cpu_accessible_vram.heap_usage =
-			atomic64_read(&adev->vram_vis_usage);
+			amdgpu_vram_mgr_vis_usage(&adev->mman.bdev.man[TTM_PL_VRAM]);
 		mem.cpu_accessible_vram.max_allocation =
 			mem.cpu_accessible_vram.usable_heap_size * 3 / 4;
 
-		mem.gtt.total_heap_size = adev->mc.gtt_size;
-		mem.gtt.usable_heap_size =
-			adev->mc.gtt_size - adev->gart_pin_size;
-		mem.gtt.heap_usage = atomic64_read(&adev->gtt_usage);
+		mem.gtt.total_heap_size = adev->mman.bdev.man[TTM_PL_TT].size;
+		mem.gtt.total_heap_size *= PAGE_SIZE;
+		mem.gtt.usable_heap_size = mem.gtt.total_heap_size
+			- adev->gart_pin_size;
+		mem.gtt.heap_usage =
+			amdgpu_gtt_mgr_usage(&adev->mman.bdev.man[TTM_PL_TT]);
 		mem.gtt.max_allocation = mem.gtt.usable_heap_size * 3 / 4;
 
 		return copy_to_user(out, &mem,
@@ -571,8 +574,8 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 			dev_info.max_engine_clock = amdgpu_dpm_get_sclk(adev, false) * 10;
 			dev_info.max_memory_clock = amdgpu_dpm_get_mclk(adev, false) * 10;
 		} else {
-			dev_info.max_engine_clock = adev->pm.default_sclk * 10;
-			dev_info.max_memory_clock = adev->pm.default_mclk * 10;
+			dev_info.max_engine_clock = adev->clock.default_sclk * 10;
+			dev_info.max_memory_clock = adev->clock.default_mclk * 10;
 		}
 		dev_info.enabled_rb_pipes_mask = adev->gfx.config.backend_enable_mask;
 		dev_info.num_rb_pipes = adev->gfx.config.max_backends_per_se *
@@ -587,10 +590,8 @@ static int amdgpu_info_ioctl(struct drm_device *dev, void *data, struct drm_file
 		dev_info.virtual_address_offset = AMDGPU_VA_RESERVED_SIZE;
 		dev_info.virtual_address_max = (uint64_t)adev->vm_manager.max_pfn * AMDGPU_GPU_PAGE_SIZE;
 		dev_info.virtual_address_alignment = max((int)PAGE_SIZE, AMDGPU_GPU_PAGE_SIZE);
-		dev_info.pte_fragment_size = (1 << AMDGPU_LOG2_PAGES_PER_FRAG) *
-					     AMDGPU_GPU_PAGE_SIZE;
+		dev_info.pte_fragment_size = (1 << adev->vm_manager.fragment_size) * AMDGPU_GPU_PAGE_SIZE;
 		dev_info.gart_page_size = AMDGPU_GPU_PAGE_SIZE;
-
 		dev_info.cu_active_number = adev->gfx.cu_info.number;
 		dev_info.cu_ao_mask = adev->gfx.cu_info.ao_cu_mask;
 		dev_info.ce_ram_size = adev->gfx.ce_ram_size;
@@ -839,7 +840,7 @@ int amdgpu_driver_open_kms(struct drm_device *dev, struct drm_file *file_priv)
 	}
 
 	if (amdgpu_sriov_vf(adev)) {
-		r = amdgpu_map_static_csa(adev, &fpriv->vm);
+		r = amdgpu_map_static_csa(adev, &fpriv->vm, &fpriv->csa_va);
 		if (r)
 			goto out_suspend;
 	}
@@ -892,8 +893,8 @@ void amdgpu_driver_postclose_kms(struct drm_device *dev,
 	if (amdgpu_sriov_vf(adev)) {
 		/* TODO: how to handle reserve failure */
 		BUG_ON(amdgpu_bo_reserve(adev->virt.csa_obj, true));
-		amdgpu_vm_bo_rmv(adev, fpriv->vm.csa_bo_va);
-		fpriv->vm.csa_bo_va = NULL;
+		amdgpu_vm_bo_rmv(adev, fpriv->csa_va);
+		fpriv->csa_va = NULL;
 		amdgpu_bo_unreserve(adev->virt.csa_obj);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
index 6558a3e..e1cde6b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mn.c
@@ -147,36 +147,6 @@ static void amdgpu_mn_invalidate_node(struct amdgpu_mn_node *node,
 }
 
 /**
- * amdgpu_mn_invalidate_page - callback to notify about mm change
- *
- * @mn: our notifier
- * @mn: the mm this callback is about
- * @address: address of invalidate page
- *
- * Invalidation of a single page. Blocks for all BOs mapping it
- * and unmap them by move them into system domain again.
- */
-static void amdgpu_mn_invalidate_page(struct mmu_notifier *mn,
-				      struct mm_struct *mm,
-				      unsigned long address)
-{
-	struct amdgpu_mn *rmn = container_of(mn, struct amdgpu_mn, mn);
-	struct interval_tree_node *it;
-
-	mutex_lock(&rmn->lock);
-
-	it = interval_tree_iter_first(&rmn->objects, address, address);
-	if (it) {
-		struct amdgpu_mn_node *node;
-
-		node = container_of(it, struct amdgpu_mn_node, it);
-		amdgpu_mn_invalidate_node(node, address, address);
-	}
-
-	mutex_unlock(&rmn->lock);
-}
-
-/**
  * amdgpu_mn_invalidate_range_start - callback to notify about mm change
  *
  * @mn: our notifier
@@ -215,7 +185,6 @@ static void amdgpu_mn_invalidate_range_start(struct mmu_notifier *mn,
 
 static const struct mmu_notifier_ops amdgpu_mn_ops = {
 	.release = amdgpu_mn_release,
-	.invalidate_page = amdgpu_mn_invalidate_page,
 	.invalidate_range_start = amdgpu_mn_invalidate_range_start,
 };
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
index 43a9d3a..2af2678 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_mode.h
@@ -257,15 +257,7 @@ struct amdgpu_audio {
 	int num_pins;
 };
 
-struct amdgpu_mode_mc_save {
-	u32 vga_render_control;
-	u32 vga_hdp_control;
-	bool crtc_enabled[AMDGPU_MAX_CRTCS];
-};
-
 struct amdgpu_display_funcs {
-	/* vga render */
-	void (*set_vga_render_state)(struct amdgpu_device *adev, bool render);
 	/* display watermarks */
 	void (*bandwidth_update)(struct amdgpu_device *adev);
 	/* get frame count */
@@ -300,10 +292,6 @@ struct amdgpu_display_funcs {
 			      uint16_t connector_object_id,
 			      struct amdgpu_hpd *hpd,
 			      struct amdgpu_router *router);
-	void (*stop_mc_access)(struct amdgpu_device *adev,
-			       struct amdgpu_mode_mc_save *save);
-	void (*resume_mc_access)(struct amdgpu_device *adev,
-				 struct amdgpu_mode_mc_save *save);
 };
 
 struct amdgpu_mode_info {
@@ -369,7 +357,6 @@ struct amdgpu_atom_ss {
 struct amdgpu_crtc {
 	struct drm_crtc base;
 	int crtc_id;
-	u16 lut_r[256], lut_g[256], lut_b[256];
 	bool enabled;
 	bool can_tile;
 	uint32_t crtc_offset;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
index 8ee6965..e7e8991 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.c
@@ -37,55 +37,6 @@
 #include "amdgpu.h"
 #include "amdgpu_trace.h"
 
-
-
-static u64 amdgpu_get_vis_part_size(struct amdgpu_device *adev,
-						struct ttm_mem_reg *mem)
-{
-	if (mem->start << PAGE_SHIFT >= adev->mc.visible_vram_size)
-		return 0;
-
-	return ((mem->start << PAGE_SHIFT) + mem->size) >
-		adev->mc.visible_vram_size ?
-		adev->mc.visible_vram_size - (mem->start << PAGE_SHIFT) :
-		mem->size;
-}
-
-static void amdgpu_update_memory_usage(struct amdgpu_device *adev,
-		       struct ttm_mem_reg *old_mem,
-		       struct ttm_mem_reg *new_mem)
-{
-	u64 vis_size;
-	if (!adev)
-		return;
-
-	if (new_mem) {
-		switch (new_mem->mem_type) {
-		case TTM_PL_TT:
-			atomic64_add(new_mem->size, &adev->gtt_usage);
-			break;
-		case TTM_PL_VRAM:
-			atomic64_add(new_mem->size, &adev->vram_usage);
-			vis_size = amdgpu_get_vis_part_size(adev, new_mem);
-			atomic64_add(vis_size, &adev->vram_vis_usage);
-			break;
-		}
-	}
-
-	if (old_mem) {
-		switch (old_mem->mem_type) {
-		case TTM_PL_TT:
-			atomic64_sub(old_mem->size, &adev->gtt_usage);
-			break;
-		case TTM_PL_VRAM:
-			atomic64_sub(old_mem->size, &adev->vram_usage);
-			vis_size = amdgpu_get_vis_part_size(adev, old_mem);
-			atomic64_sub(vis_size, &adev->vram_vis_usage);
-			break;
-		}
-	}
-}
-
 static void amdgpu_ttm_bo_destroy(struct ttm_buffer_object *tbo)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(tbo->bdev);
@@ -93,7 +44,7 @@ static void amdgpu_ttm_bo_destroy(struct ttm_buffer_object *tbo)
 
 	bo = container_of(tbo, struct amdgpu_bo, tbo);
 
-	amdgpu_update_memory_usage(adev, &bo->tbo.mem, NULL);
+	amdgpu_bo_kunmap(bo);
 
 	drm_gem_object_release(&bo->gem_base);
 	amdgpu_bo_unref(&bo->parent);
@@ -219,7 +170,7 @@ static void amdgpu_fill_placement_to_bo(struct amdgpu_bo *bo,
 }
 
 /**
- * amdgpu_bo_create_kernel - create BO for kernel use
+ * amdgpu_bo_create_reserved - create reserved BO for kernel use
  *
  * @adev: amdgpu device object
  * @size: size for the new BO
@@ -229,24 +180,30 @@ static void amdgpu_fill_placement_to_bo(struct amdgpu_bo *bo,
  * @gpu_addr: GPU addr of the pinned BO
  * @cpu_addr: optional CPU address mapping
  *
- * Allocates and pins a BO for kernel internal use.
+ * Allocates and pins a BO for kernel internal use, and returns it still
+ * reserved.
  *
  * Returns 0 on success, negative error code otherwise.
  */
-int amdgpu_bo_create_kernel(struct amdgpu_device *adev,
-			    unsigned long size, int align,
-			    u32 domain, struct amdgpu_bo **bo_ptr,
-			    u64 *gpu_addr, void **cpu_addr)
+int amdgpu_bo_create_reserved(struct amdgpu_device *adev,
+			      unsigned long size, int align,
+			      u32 domain, struct amdgpu_bo **bo_ptr,
+			      u64 *gpu_addr, void **cpu_addr)
 {
+	bool free = false;
 	int r;
 
-	r = amdgpu_bo_create(adev, size, align, true, domain,
-			     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-			     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-			     NULL, NULL, bo_ptr);
-	if (r) {
-		dev_err(adev->dev, "(%d) failed to allocate kernel bo\n", r);
-		return r;
+	if (!*bo_ptr) {
+		r = amdgpu_bo_create(adev, size, align, true, domain,
+				     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
+				     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
+				     NULL, NULL, 0, bo_ptr);
+		if (r) {
+			dev_err(adev->dev, "(%d) failed to allocate kernel bo\n",
+				r);
+			return r;
+		}
+		free = true;
 	}
 
 	r = amdgpu_bo_reserve(*bo_ptr, false);
@@ -269,20 +226,52 @@ int amdgpu_bo_create_kernel(struct amdgpu_device *adev,
 		}
 	}
 
-	amdgpu_bo_unreserve(*bo_ptr);
-
 	return 0;
 
 error_unreserve:
 	amdgpu_bo_unreserve(*bo_ptr);
 
 error_free:
-	amdgpu_bo_unref(bo_ptr);
+	if (free)
+		amdgpu_bo_unref(bo_ptr);
 
 	return r;
 }
 
 /**
+ * amdgpu_bo_create_kernel - create BO for kernel use
+ *
+ * @adev: amdgpu device object
+ * @size: size for the new BO
+ * @align: alignment for the new BO
+ * @domain: where to place it
+ * @bo_ptr: resulting BO
+ * @gpu_addr: GPU addr of the pinned BO
+ * @cpu_addr: optional CPU address mapping
+ *
+ * Allocates and pins a BO for kernel internal use.
+ *
+ * Returns 0 on success, negative error code otherwise.
+ */
+int amdgpu_bo_create_kernel(struct amdgpu_device *adev,
+			    unsigned long size, int align,
+			    u32 domain, struct amdgpu_bo **bo_ptr,
+			    u64 *gpu_addr, void **cpu_addr)
+{
+	int r;
+
+	r = amdgpu_bo_create_reserved(adev, size, align, domain, bo_ptr,
+				      gpu_addr, cpu_addr);
+
+	if (r)
+		return r;
+
+	amdgpu_bo_unreserve(*bo_ptr);
+
+	return 0;
+}
+
+/**
  * amdgpu_bo_free_kernel - free BO for kernel use
  *
  * @bo: amdgpu BO to free
@@ -317,12 +306,13 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 				struct sg_table *sg,
 				struct ttm_placement *placement,
 				struct reservation_object *resv,
+				uint64_t init_value,
 				struct amdgpu_bo **bo_ptr)
 {
 	struct amdgpu_bo *bo;
 	enum ttm_bo_type type;
 	unsigned long page_align;
-	u64 initial_bytes_moved;
+	u64 initial_bytes_moved, bytes_moved;
 	size_t acc_size;
 	int r;
 
@@ -351,13 +341,13 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	}
 	INIT_LIST_HEAD(&bo->shadow_list);
 	INIT_LIST_HEAD(&bo->va);
-	bo->prefered_domains = domain & (AMDGPU_GEM_DOMAIN_VRAM |
+	bo->preferred_domains = domain & (AMDGPU_GEM_DOMAIN_VRAM |
 					 AMDGPU_GEM_DOMAIN_GTT |
 					 AMDGPU_GEM_DOMAIN_CPU |
 					 AMDGPU_GEM_DOMAIN_GDS |
 					 AMDGPU_GEM_DOMAIN_GWS |
 					 AMDGPU_GEM_DOMAIN_OA);
-	bo->allowed_domains = bo->prefered_domains;
+	bo->allowed_domains = bo->preferred_domains;
 	if (!kernel && bo->allowed_domains == AMDGPU_GEM_DOMAIN_VRAM)
 		bo->allowed_domains |= AMDGPU_GEM_DOMAIN_GTT;
 
@@ -398,8 +388,14 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	r = ttm_bo_init_reserved(&adev->mman.bdev, &bo->tbo, size, type,
 				 &bo->placement, page_align, !kernel, NULL,
 				 acc_size, sg, resv, &amdgpu_ttm_bo_destroy);
-	amdgpu_cs_report_moved_bytes(adev,
-		atomic64_read(&adev->num_bytes_moved) - initial_bytes_moved);
+	bytes_moved = atomic64_read(&adev->num_bytes_moved) -
+		      initial_bytes_moved;
+	if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
+	    bo->tbo.mem.mem_type == TTM_PL_VRAM &&
+	    bo->tbo.mem.start < adev->mc.visible_vram_size >> PAGE_SHIFT)
+		amdgpu_cs_report_moved_bytes(adev, bytes_moved, bytes_moved);
+	else
+		amdgpu_cs_report_moved_bytes(adev, bytes_moved, 0);
 
 	if (unlikely(r != 0))
 		return r;
@@ -411,7 +407,7 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 	    bo->tbo.mem.placement & TTM_PL_FLAG_VRAM) {
 		struct dma_fence *fence;
 
-		r = amdgpu_fill_buffer(bo, 0, bo->tbo.resv, &fence);
+		r = amdgpu_fill_buffer(bo, init_value, bo->tbo.resv, &fence);
 		if (unlikely(r))
 			goto fail_unreserve;
 
@@ -426,6 +422,10 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 
 	trace_amdgpu_bo_create(bo);
 
+	/* Treat CPU_ACCESS_REQUIRED only as a hint if given by UMD */
+	if (type == ttm_bo_type_device)
+		bo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+
 	return 0;
 
 fail_unreserve:
@@ -459,6 +459,7 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
 					AMDGPU_GEM_CREATE_CPU_GTT_USWC,
 					NULL, &placement,
 					bo->tbo.resv,
+					0,
 					&bo->shadow);
 	if (!r) {
 		bo->shadow->parent = amdgpu_bo_ref(bo);
@@ -470,11 +471,15 @@ static int amdgpu_bo_create_shadow(struct amdgpu_device *adev,
 	return r;
 }
 
+/* init_value will only take effect when flags contains
+ * AMDGPU_GEM_CREATE_VRAM_CLEARED.
+ */
 int amdgpu_bo_create(struct amdgpu_device *adev,
 		     unsigned long size, int byte_align,
 		     bool kernel, u32 domain, u64 flags,
 		     struct sg_table *sg,
 		     struct reservation_object *resv,
+		     uint64_t init_value,
 		     struct amdgpu_bo **bo_ptr)
 {
 	struct ttm_placement placement = {0};
@@ -489,7 +494,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 
 	r = amdgpu_bo_create_restricted(adev, size, byte_align, kernel,
 					domain, flags, sg, &placement,
-					resv, bo_ptr);
+					resv, init_value, bo_ptr);
 	if (r)
 		return r;
 
@@ -535,7 +540,7 @@ int amdgpu_bo_backup_to_shadow(struct amdgpu_device *adev,
 
 	r = amdgpu_copy_buffer(ring, bo_addr, shadow_addr,
 			       amdgpu_bo_size(bo), resv, fence,
-			       direct);
+			       direct, false);
 	if (!r)
 		amdgpu_bo_fence(bo, *fence, true);
 
@@ -551,7 +556,7 @@ int amdgpu_bo_validate(struct amdgpu_bo *bo)
 	if (bo->pin_count)
 		return 0;
 
-	domain = bo->prefered_domains;
+	domain = bo->preferred_domains;
 
 retry:
 	amdgpu_ttm_placement_from_domain(bo, domain);
@@ -588,7 +593,7 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
 
 	r = amdgpu_copy_buffer(ring, shadow_addr, bo_addr,
 			       amdgpu_bo_size(bo), resv, fence,
-			       direct);
+			       direct, false);
 	if (!r)
 		amdgpu_bo_fence(bo, *fence, true);
 
@@ -598,16 +603,16 @@ int amdgpu_bo_restore_from_shadow(struct amdgpu_device *adev,
 
 int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
 {
-	bool is_iomem;
+	void *kptr;
 	long r;
 
 	if (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)
 		return -EPERM;
 
-	if (bo->kptr) {
-		if (ptr) {
-			*ptr = bo->kptr;
-		}
+	kptr = amdgpu_bo_kptr(bo);
+	if (kptr) {
+		if (ptr)
+			*ptr = kptr;
 		return 0;
 	}
 
@@ -620,19 +625,23 @@ int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr)
 	if (r)
 		return r;
 
-	bo->kptr = ttm_kmap_obj_virtual(&bo->kmap, &is_iomem);
 	if (ptr)
-		*ptr = bo->kptr;
+		*ptr = amdgpu_bo_kptr(bo);
 
 	return 0;
 }
 
+void *amdgpu_bo_kptr(struct amdgpu_bo *bo)
+{
+	bool is_iomem;
+
+	return ttm_kmap_obj_virtual(&bo->kmap, &is_iomem);
+}
+
 void amdgpu_bo_kunmap(struct amdgpu_bo *bo)
 {
-	if (bo->kptr == NULL)
-		return;
-	bo->kptr = NULL;
-	ttm_bo_kunmap(&bo->kmap);
+	if (bo->kmap.bo)
+		ttm_bo_kunmap(&bo->kmap);
 }
 
 struct amdgpu_bo *amdgpu_bo_ref(struct amdgpu_bo *bo)
@@ -724,15 +733,16 @@ int amdgpu_bo_pin_restricted(struct amdgpu_bo *bo, u32 domain,
 		dev_err(adev->dev, "%p pin failed\n", bo);
 		goto error;
 	}
-	r = amdgpu_ttm_bind(&bo->tbo, &bo->tbo.mem);
-	if (unlikely(r)) {
-		dev_err(adev->dev, "%p bind failed\n", bo);
-		goto error;
-	}
 
 	bo->pin_count = 1;
-	if (gpu_addr != NULL)
+	if (gpu_addr != NULL) {
+		r = amdgpu_ttm_bind(&bo->tbo, &bo->tbo.mem);
+		if (unlikely(r)) {
+			dev_err(adev->dev, "%p bind failed\n", bo);
+			goto error;
+		}
 		*gpu_addr = amdgpu_bo_gpu_offset(bo);
+	}
 	if (domain == AMDGPU_GEM_DOMAIN_VRAM) {
 		adev->vram_pin_size += amdgpu_bo_size(bo);
 		if (bo->flags & AMDGPU_GEM_CREATE_NO_CPU_ACCESS)
@@ -921,6 +931,8 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 	abo = container_of(bo, struct amdgpu_bo, tbo);
 	amdgpu_vm_bo_invalidate(adev, abo);
 
+	amdgpu_bo_kunmap(abo);
+
 	/* remember the eviction */
 	if (evict)
 		atomic64_inc(&adev->num_evictions);
@@ -930,8 +942,6 @@ void amdgpu_bo_move_notify(struct ttm_buffer_object *bo,
 		return;
 
 	/* move_notify is called before move happens */
-	amdgpu_update_memory_usage(adev, &bo->mem, new_mem);
-
 	trace_amdgpu_ttm_bo_move(abo, new_mem->mem_type, old_mem->mem_type);
 }
 
@@ -939,19 +949,22 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
 	struct amdgpu_bo *abo;
-	unsigned long offset, size, lpfn;
-	int i, r;
+	unsigned long offset, size;
+	int r;
 
 	if (!amdgpu_ttm_bo_is_amdgpu_bo(bo))
 		return 0;
 
 	abo = container_of(bo, struct amdgpu_bo, tbo);
+
+	/* Remember that this BO was accessed by the CPU */
+	abo->flags |= AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+
 	if (bo->mem.mem_type != TTM_PL_VRAM)
 		return 0;
 
 	size = bo->mem.num_pages << PAGE_SHIFT;
 	offset = bo->mem.start << PAGE_SHIFT;
-	/* TODO: figure out how to map scattered VRAM to the CPU */
 	if ((offset + size) <= adev->mc.visible_vram_size)
 		return 0;
 
@@ -961,26 +974,21 @@ int amdgpu_bo_fault_reserve_notify(struct ttm_buffer_object *bo)
 
 	/* hurrah the memory is not visible ! */
 	atomic64_inc(&adev->num_vram_cpu_page_faults);
-	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM);
-	lpfn =	adev->mc.visible_vram_size >> PAGE_SHIFT;
-	for (i = 0; i < abo->placement.num_placement; i++) {
-		/* Force into visible VRAM */
-		if ((abo->placements[i].flags & TTM_PL_FLAG_VRAM) &&
-		    (!abo->placements[i].lpfn ||
-		     abo->placements[i].lpfn > lpfn))
-			abo->placements[i].lpfn = lpfn;
-	}
+	amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
+					 AMDGPU_GEM_DOMAIN_GTT);
+
+	/* Avoid costly evictions; only set GTT as a busy placement */
+	abo->placement.num_busy_placement = 1;
+	abo->placement.busy_placement = &abo->placements[1];
+
 	r = ttm_bo_validate(bo, &abo->placement, false, false);
-	if (unlikely(r == -ENOMEM)) {
-		amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
-		return ttm_bo_validate(bo, &abo->placement, false, false);
-	} else if (unlikely(r != 0)) {
+	if (unlikely(r != 0))
 		return r;
-	}
 
 	offset = bo->mem.start << PAGE_SHIFT;
 	/* this should never happen */
-	if ((offset + size) > adev->mc.visible_vram_size)
+	if (bo->mem.mem_type == TTM_PL_VRAM &&
+	    (offset + size) > adev->mc.visible_vram_size)
 		return -EINVAL;
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
index 3824851..a288fa6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_object.h
@@ -33,6 +33,61 @@
 
 #define AMDGPU_BO_INVALID_OFFSET	LONG_MAX
 
+/* bo virtual addresses in a vm */
+struct amdgpu_bo_va_mapping {
+	struct list_head		list;
+	struct rb_node			rb;
+	uint64_t			start;
+	uint64_t			last;
+	uint64_t			__subtree_last;
+	uint64_t			offset;
+	uint64_t			flags;
+};
+
+/* User space allocated BO in a VM */
+struct amdgpu_bo_va {
+	struct amdgpu_vm_bo_base	base;
+
+	/* protected by bo being reserved */
+	struct dma_fence	        *last_pt_update;
+	unsigned			ref_count;
+
+	/* mappings for this bo_va */
+	struct list_head		invalids;
+	struct list_head		valids;
+};
+
+struct amdgpu_bo {
+	/* Protected by tbo.reserved */
+	u32				preferred_domains;
+	u32				allowed_domains;
+	struct ttm_place		placements[AMDGPU_GEM_DOMAIN_MAX + 1];
+	struct ttm_placement		placement;
+	struct ttm_buffer_object	tbo;
+	struct ttm_bo_kmap_obj		kmap;
+	u64				flags;
+	unsigned			pin_count;
+	u64				tiling_flags;
+	u64				metadata_flags;
+	void				*metadata;
+	u32				metadata_size;
+	unsigned			prime_shared_count;
+	/* list of all virtual address to which this bo is associated to */
+	struct list_head		va;
+	/* Constant after initialization */
+	struct drm_gem_object		gem_base;
+	struct amdgpu_bo		*parent;
+	struct amdgpu_bo		*shadow;
+
+	struct ttm_bo_kmap_obj		dma_buf_vmap;
+	struct amdgpu_mn		*mn;
+
+	union {
+		struct list_head	mn_list;
+		struct list_head	shadow_list;
+	};
+};
+
 /**
  * amdgpu_mem_type_to_domain - return domain corresponding to mem_type
  * @mem_type:	ttm memory type
@@ -120,7 +175,11 @@ static inline u64 amdgpu_bo_mmap_offset(struct amdgpu_bo *bo)
  */
 static inline bool amdgpu_bo_gpu_accessible(struct amdgpu_bo *bo)
 {
-	return bo->tbo.mem.mem_type != TTM_PL_SYSTEM;
+	switch (bo->tbo.mem.mem_type) {
+	case TTM_PL_TT: return amdgpu_ttm_is_bound(bo->tbo.ttm);
+	case TTM_PL_VRAM: return true;
+	default: return false;
+	}
 }
 
 int amdgpu_bo_create(struct amdgpu_device *adev,
@@ -128,6 +187,7 @@ int amdgpu_bo_create(struct amdgpu_device *adev,
 			    bool kernel, u32 domain, u64 flags,
 			    struct sg_table *sg,
 			    struct reservation_object *resv,
+			    uint64_t init_value,
 			    struct amdgpu_bo **bo_ptr);
 int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 				unsigned long size, int byte_align,
@@ -135,7 +195,12 @@ int amdgpu_bo_create_restricted(struct amdgpu_device *adev,
 				struct sg_table *sg,
 				struct ttm_placement *placement,
 			        struct reservation_object *resv,
+				uint64_t init_value,
 				struct amdgpu_bo **bo_ptr);
+int amdgpu_bo_create_reserved(struct amdgpu_device *adev,
+			      unsigned long size, int align,
+			      u32 domain, struct amdgpu_bo **bo_ptr,
+			      u64 *gpu_addr, void **cpu_addr);
 int amdgpu_bo_create_kernel(struct amdgpu_device *adev,
 			    unsigned long size, int align,
 			    u32 domain, struct amdgpu_bo **bo_ptr,
@@ -143,6 +208,7 @@ int amdgpu_bo_create_kernel(struct amdgpu_device *adev,
 void amdgpu_bo_free_kernel(struct amdgpu_bo **bo, u64 *gpu_addr,
 			   void **cpu_addr);
 int amdgpu_bo_kmap(struct amdgpu_bo *bo, void **ptr);
+void *amdgpu_bo_kptr(struct amdgpu_bo *bo);
 void amdgpu_bo_kunmap(struct amdgpu_bo *bo);
 struct amdgpu_bo *amdgpu_bo_ref(struct amdgpu_bo *bo);
 void amdgpu_bo_unref(struct amdgpu_bo **bo);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.h
index c19c4d1..f21a771 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_pm.h
@@ -30,6 +30,7 @@ struct cg_flag_name
 	const char *name;
 };
 
+void amdgpu_pm_acpi_event_handler(struct amdgpu_device *adev);
 int amdgpu_pm_sysfs_init(struct amdgpu_device *adev);
 void amdgpu_pm_sysfs_fini(struct amdgpu_device *adev);
 void amdgpu_pm_print_power_states(struct amdgpu_device *adev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
index 6bdc866..5b3f928 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_prime.c
@@ -69,7 +69,7 @@ amdgpu_gem_prime_import_sg_table(struct drm_device *dev,
 
 	ww_mutex_lock(&resv->lock, NULL);
 	ret = amdgpu_bo_create(adev, attach->dmabuf->size, PAGE_SIZE, false,
-			       AMDGPU_GEM_DOMAIN_GTT, 0, sg, resv, &bo);
+			       AMDGPU_GEM_DOMAIN_GTT, 0, sg, resv, 0, &bo);
 	ww_mutex_unlock(&resv->lock);
 	if (ret)
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
index 4083be6..8c2204c 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.c
@@ -63,8 +63,13 @@ static int psp_sw_init(void *handle)
 		psp->smu_reload_quirk = psp_v3_1_smu_reload_quirk;
 		break;
 	case CHIP_RAVEN:
+#if 0
+		psp->init_microcode = psp_v10_0_init_microcode;
+#endif
 		psp->prep_cmd_buf = psp_v10_0_prep_cmd_buf;
 		psp->ring_init = psp_v10_0_ring_init;
+		psp->ring_create = psp_v10_0_ring_create;
+		psp->ring_destroy = psp_v10_0_ring_destroy;
 		psp->cmd_submit = psp_v10_0_cmd_submit;
 		psp->compare_sram_data = psp_v10_0_compare_sram_data;
 		break;
@@ -95,9 +100,8 @@ int psp_wait_for(struct psp_context *psp, uint32_t reg_index,
 	int i;
 	struct amdgpu_device *adev = psp->adev;
 
-	val = RREG32(reg_index);
-
 	for (i = 0; i < adev->usec_timeout; i++) {
+		val = RREG32(reg_index);
 		if (check_changed) {
 			if (val != reg_val)
 				return 0;
@@ -118,33 +122,18 @@ psp_cmd_submit_buf(struct psp_context *psp,
 		   int index)
 {
 	int ret;
-	struct amdgpu_bo *cmd_buf_bo;
-	uint64_t cmd_buf_mc_addr;
-	struct psp_gfx_cmd_resp *cmd_buf_mem;
-	struct amdgpu_device *adev = psp->adev;
 
-	ret = amdgpu_bo_create_kernel(adev, PSP_CMD_BUFFER_SIZE, PAGE_SIZE,
-				      AMDGPU_GEM_DOMAIN_VRAM,
-				      &cmd_buf_bo, &cmd_buf_mc_addr,
-				      (void **)&cmd_buf_mem);
-	if (ret)
-		return ret;
+	memset(psp->cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
 
-	memset(cmd_buf_mem, 0, PSP_CMD_BUFFER_SIZE);
+	memcpy(psp->cmd_buf_mem, cmd, sizeof(struct psp_gfx_cmd_resp));
 
-	memcpy(cmd_buf_mem, cmd, sizeof(struct psp_gfx_cmd_resp));
-
-	ret = psp_cmd_submit(psp, ucode, cmd_buf_mc_addr,
+	ret = psp_cmd_submit(psp, ucode, psp->cmd_buf_mc_addr,
 			     fence_mc_addr, index);
 
 	while (*((unsigned int *)psp->fence_buf) != index) {
 		msleep(1);
 	}
 
-	amdgpu_bo_free_kernel(&cmd_buf_bo,
-			      &cmd_buf_mc_addr,
-			      (void **)&cmd_buf_mem);
-
 	return ret;
 }
 
@@ -352,13 +341,20 @@ static int psp_load_fw(struct amdgpu_device *adev)
 				      &psp->fence_buf_mc_addr,
 				      &psp->fence_buf);
 	if (ret)
+		goto failed_mem2;
+
+	ret = amdgpu_bo_create_kernel(adev, PSP_CMD_BUFFER_SIZE, PAGE_SIZE,
+				      AMDGPU_GEM_DOMAIN_VRAM,
+				      &psp->cmd_buf_bo, &psp->cmd_buf_mc_addr,
+				      (void **)&psp->cmd_buf_mem);
+	if (ret)
 		goto failed_mem1;
 
 	memset(psp->fence_buf, 0, PSP_FENCE_BUFFER_SIZE);
 
 	ret = psp_ring_init(psp, PSP_RING_TYPE__KM);
 	if (ret)
-		goto failed_mem1;
+		goto failed_mem;
 
 	ret = psp_tmr_init(psp);
 	if (ret)
@@ -379,9 +375,13 @@ static int psp_load_fw(struct amdgpu_device *adev)
 	return 0;
 
 failed_mem:
+	amdgpu_bo_free_kernel(&psp->cmd_buf_bo,
+			      &psp->cmd_buf_mc_addr,
+			      (void **)&psp->cmd_buf_mem);
+failed_mem1:
 	amdgpu_bo_free_kernel(&psp->fence_buf_bo,
 			      &psp->fence_buf_mc_addr, &psp->fence_buf);
-failed_mem1:
+failed_mem2:
 	amdgpu_bo_free_kernel(&psp->fw_pri_bo,
 			      &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
 failed:
@@ -435,16 +435,15 @@ static int psp_hw_fini(void *handle)
 
 	psp_ring_destroy(psp, PSP_RING_TYPE__KM);
 
-	if (psp->tmr_buf)
-		amdgpu_bo_free_kernel(&psp->tmr_bo, &psp->tmr_mc_addr, &psp->tmr_buf);
-
-	if (psp->fw_pri_buf)
-		amdgpu_bo_free_kernel(&psp->fw_pri_bo,
-				      &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
-
-	if (psp->fence_buf_bo)
-		amdgpu_bo_free_kernel(&psp->fence_buf_bo,
-				      &psp->fence_buf_mc_addr, &psp->fence_buf);
+	amdgpu_bo_free_kernel(&psp->tmr_bo, &psp->tmr_mc_addr, &psp->tmr_buf);
+	amdgpu_bo_free_kernel(&psp->fw_pri_bo,
+			      &psp->fw_pri_mc_addr, &psp->fw_pri_buf);
+	amdgpu_bo_free_kernel(&psp->fence_buf_bo,
+			      &psp->fence_buf_mc_addr, &psp->fence_buf);
+	amdgpu_bo_free_kernel(&psp->asd_shared_bo, &psp->asd_shared_mc_addr,
+			      &psp->asd_shared_buf);
+	amdgpu_bo_free_kernel(&psp->cmd_buf_bo, &psp->cmd_buf_mc_addr,
+			      (void **)&psp->cmd_buf_mem);
 
 	kfree(psp->cmd);
 	psp->cmd = NULL;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
index 1a1c8b4..538fa9d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_psp.h
@@ -108,6 +108,11 @@ struct psp_context
 	struct amdgpu_bo 		*fence_buf_bo;
 	uint64_t 			fence_buf_mc_addr;
 	void				*fence_buf;
+
+	/* cmd buffer */
+	struct amdgpu_bo		*cmd_buf_bo;
+	uint64_t			cmd_buf_mc_addr;
+	struct psp_gfx_cmd_resp		*cmd_buf_mem;
 };
 
 struct amdgpu_psp_funcs {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
index 75165e0..6c5646b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c
@@ -184,32 +184,16 @@ int amdgpu_ring_init(struct amdgpu_device *adev, struct amdgpu_ring *ring,
 			return r;
 	}
 
-	if (ring->funcs->support_64bit_ptrs) {
-		r = amdgpu_wb_get_64bit(adev, &ring->rptr_offs);
-		if (r) {
-			dev_err(adev->dev, "(%d) ring rptr_offs wb alloc failed\n", r);
-			return r;
-		}
+	r = amdgpu_wb_get(adev, &ring->rptr_offs);
+	if (r) {
+		dev_err(adev->dev, "(%d) ring rptr_offs wb alloc failed\n", r);
+		return r;
+	}
 
-		r = amdgpu_wb_get_64bit(adev, &ring->wptr_offs);
-		if (r) {
-			dev_err(adev->dev, "(%d) ring wptr_offs wb alloc failed\n", r);
-			return r;
-		}
-
-	} else {
-		r = amdgpu_wb_get(adev, &ring->rptr_offs);
-		if (r) {
-			dev_err(adev->dev, "(%d) ring rptr_offs wb alloc failed\n", r);
-			return r;
-		}
-
-		r = amdgpu_wb_get(adev, &ring->wptr_offs);
-		if (r) {
-			dev_err(adev->dev, "(%d) ring wptr_offs wb alloc failed\n", r);
-			return r;
-		}
-
+	r = amdgpu_wb_get(adev, &ring->wptr_offs);
+	if (r) {
+		dev_err(adev->dev, "(%d) ring wptr_offs wb alloc failed\n", r);
+		return r;
 	}
 
 	r = amdgpu_wb_get(adev, &ring->fence_offs);
@@ -277,18 +261,15 @@ void amdgpu_ring_fini(struct amdgpu_ring *ring)
 {
 	ring->ready = false;
 
-	if (ring->funcs->support_64bit_ptrs) {
-		amdgpu_wb_free_64bit(ring->adev, ring->cond_exe_offs);
-		amdgpu_wb_free_64bit(ring->adev, ring->fence_offs);
-		amdgpu_wb_free_64bit(ring->adev, ring->rptr_offs);
-		amdgpu_wb_free_64bit(ring->adev, ring->wptr_offs);
-	} else {
-		amdgpu_wb_free(ring->adev, ring->cond_exe_offs);
-		amdgpu_wb_free(ring->adev, ring->fence_offs);
-		amdgpu_wb_free(ring->adev, ring->rptr_offs);
-		amdgpu_wb_free(ring->adev, ring->wptr_offs);
-	}
+	/* Don't tear down a ring that was never initialized */
+	if (!(ring->adev) || !(ring->adev->rings[ring->idx]))
+		return;
 
+	amdgpu_wb_free(ring->adev, ring->rptr_offs);
+	amdgpu_wb_free(ring->adev, ring->wptr_offs);
+
+	amdgpu_wb_free(ring->adev, ring->cond_exe_offs);
+	amdgpu_wb_free(ring->adev, ring->fence_offs);
 
 	amdgpu_bo_free_kernel(&ring->ring_obj,
 			      &ring->gpu_addr,
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
index bc8dec9..322d2529 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h
@@ -212,4 +212,44 @@ static inline void amdgpu_ring_clear_ring(struct amdgpu_ring *ring)
 
 }
 
+static inline void amdgpu_ring_write(struct amdgpu_ring *ring, uint32_t v)
+{
+	if (ring->count_dw <= 0)
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+	ring->ring[ring->wptr++ & ring->buf_mask] = v;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw--;
+}
+
+static inline void amdgpu_ring_write_multiple(struct amdgpu_ring *ring,
+					      void *src, int count_dw)
+{
+	unsigned occupied, chunk1, chunk2;
+	void *dst;
+
+	if (unlikely(ring->count_dw < count_dw))
+		DRM_ERROR("amdgpu: writing more dwords to the ring than expected!\n");
+
+	occupied = ring->wptr & ring->buf_mask;
+	dst = (void *)&ring->ring[occupied];
+	chunk1 = ring->buf_mask + 1 - occupied;
+	chunk1 = (chunk1 >= count_dw) ? count_dw : chunk1;
+	chunk2 = count_dw - chunk1;
+	chunk1 <<= 2;
+	chunk2 <<= 2;
+
+	if (chunk1)
+		memcpy(dst, src, chunk1);
+
+	if (chunk2) {
+		src += chunk1;
+		dst = (void *)ring->ring;
+		memcpy(dst, src, chunk2);
+	}
+
+	ring->wptr += count_dw;
+	ring->wptr &= ring->ptr_mask;
+	ring->count_dw -= count_dw;
+}
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
index 5ca75a4..3144400 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sa.c
@@ -64,7 +64,7 @@ int amdgpu_sa_bo_manager_init(struct amdgpu_device *adev,
 		INIT_LIST_HEAD(&sa_manager->flist[i]);
 
 	r = amdgpu_bo_create(adev, size, align, true, domain,
-			     0, NULL, NULL, &sa_manager->bo);
+			     0, NULL, NULL, 0, &sa_manager->bo);
 	if (r) {
 		dev_err(adev->dev, "(%d) failed to allocate bo for manager\n", r);
 		return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c
index 15510da..ed8c373 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_test.c
@@ -33,7 +33,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
 	struct amdgpu_bo *vram_obj = NULL;
 	struct amdgpu_bo **gtt_obj = NULL;
-	uint64_t gtt_addr, vram_addr;
+	uint64_t gart_addr, vram_addr;
 	unsigned n, size;
 	int i, r;
 
@@ -42,7 +42,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 	/* Number of tests =
 	 * (Total GTT - IB pool - writeback page - ring buffers) / test size
 	 */
-	n = adev->mc.gtt_size - AMDGPU_IB_POOL_SIZE*64*1024;
+	n = adev->mc.gart_size - AMDGPU_IB_POOL_SIZE*64*1024;
 	for (i = 0; i < AMDGPU_MAX_RINGS; ++i)
 		if (adev->rings[i])
 			n -= adev->rings[i]->ring_size;
@@ -61,7 +61,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 
 	r = amdgpu_bo_create(adev, size, PAGE_SIZE, true,
 			     AMDGPU_GEM_DOMAIN_VRAM, 0,
-			     NULL, NULL, &vram_obj);
+			     NULL, NULL, 0, &vram_obj);
 	if (r) {
 		DRM_ERROR("Failed to create VRAM object\n");
 		goto out_cleanup;
@@ -76,13 +76,13 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 	}
 	for (i = 0; i < n; i++) {
 		void *gtt_map, *vram_map;
-		void **gtt_start, **gtt_end;
+		void **gart_start, **gart_end;
 		void **vram_start, **vram_end;
 		struct dma_fence *fence = NULL;
 
 		r = amdgpu_bo_create(adev, size, PAGE_SIZE, true,
 				     AMDGPU_GEM_DOMAIN_GTT, 0, NULL,
-				     NULL, gtt_obj + i);
+				     NULL, 0, gtt_obj + i);
 		if (r) {
 			DRM_ERROR("Failed to create GTT object %d\n", i);
 			goto out_lclean;
@@ -91,7 +91,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 		r = amdgpu_bo_reserve(gtt_obj[i], false);
 		if (unlikely(r != 0))
 			goto out_lclean_unref;
-		r = amdgpu_bo_pin(gtt_obj[i], AMDGPU_GEM_DOMAIN_GTT, &gtt_addr);
+		r = amdgpu_bo_pin(gtt_obj[i], AMDGPU_GEM_DOMAIN_GTT, &gart_addr);
 		if (r) {
 			DRM_ERROR("Failed to pin GTT object %d\n", i);
 			goto out_lclean_unres;
@@ -103,15 +103,15 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 			goto out_lclean_unpin;
 		}
 
-		for (gtt_start = gtt_map, gtt_end = gtt_map + size;
-		     gtt_start < gtt_end;
-		     gtt_start++)
-			*gtt_start = gtt_start;
+		for (gart_start = gtt_map, gart_end = gtt_map + size;
+		     gart_start < gart_end;
+		     gart_start++)
+			*gart_start = gart_start;
 
 		amdgpu_bo_kunmap(gtt_obj[i]);
 
-		r = amdgpu_copy_buffer(ring, gtt_addr, vram_addr,
-				       size, NULL, &fence, false);
+		r = amdgpu_copy_buffer(ring, gart_addr, vram_addr,
+				       size, NULL, &fence, false, false);
 
 		if (r) {
 			DRM_ERROR("Failed GTT->VRAM copy %d\n", i);
@@ -132,21 +132,21 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 			goto out_lclean_unpin;
 		}
 
-		for (gtt_start = gtt_map, gtt_end = gtt_map + size,
+		for (gart_start = gtt_map, gart_end = gtt_map + size,
 		     vram_start = vram_map, vram_end = vram_map + size;
 		     vram_start < vram_end;
-		     gtt_start++, vram_start++) {
-			if (*vram_start != gtt_start) {
+		     gart_start++, vram_start++) {
+			if (*vram_start != gart_start) {
 				DRM_ERROR("Incorrect GTT->VRAM copy %d: Got 0x%p, "
 					  "expected 0x%p (GTT/VRAM offset "
 					  "0x%16llx/0x%16llx)\n",
-					  i, *vram_start, gtt_start,
+					  i, *vram_start, gart_start,
 					  (unsigned long long)
-					  (gtt_addr - adev->mc.gtt_start +
-					   (void*)gtt_start - gtt_map),
+					  (gart_addr - adev->mc.gart_start +
+					   (void*)gart_start - gtt_map),
 					  (unsigned long long)
 					  (vram_addr - adev->mc.vram_start +
-					   (void*)gtt_start - gtt_map));
+					   (void*)gart_start - gtt_map));
 				amdgpu_bo_kunmap(vram_obj);
 				goto out_lclean_unpin;
 			}
@@ -155,8 +155,8 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 
 		amdgpu_bo_kunmap(vram_obj);
 
-		r = amdgpu_copy_buffer(ring, vram_addr, gtt_addr,
-				       size, NULL, &fence, false);
+		r = amdgpu_copy_buffer(ring, vram_addr, gart_addr,
+				       size, NULL, &fence, false, false);
 
 		if (r) {
 			DRM_ERROR("Failed VRAM->GTT copy %d\n", i);
@@ -177,20 +177,20 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 			goto out_lclean_unpin;
 		}
 
-		for (gtt_start = gtt_map, gtt_end = gtt_map + size,
+		for (gart_start = gtt_map, gart_end = gtt_map + size,
 		     vram_start = vram_map, vram_end = vram_map + size;
-		     gtt_start < gtt_end;
-		     gtt_start++, vram_start++) {
-			if (*gtt_start != vram_start) {
+		     gart_start < gart_end;
+		     gart_start++, vram_start++) {
+			if (*gart_start != vram_start) {
 				DRM_ERROR("Incorrect VRAM->GTT copy %d: Got 0x%p, "
 					  "expected 0x%p (VRAM/GTT offset "
 					  "0x%16llx/0x%16llx)\n",
-					  i, *gtt_start, vram_start,
+					  i, *gart_start, vram_start,
 					  (unsigned long long)
 					  (vram_addr - adev->mc.vram_start +
 					   (void*)vram_start - vram_map),
 					  (unsigned long long)
-					  (gtt_addr - adev->mc.gtt_start +
+					  (gart_addr - adev->mc.gart_start +
 					   (void*)vram_start - vram_map));
 				amdgpu_bo_kunmap(gtt_obj[i]);
 				goto out_lclean_unpin;
@@ -200,7 +200,7 @@ static void amdgpu_do_test_moves(struct amdgpu_device *adev)
 		amdgpu_bo_kunmap(gtt_obj[i]);
 
 		DRM_INFO("Tested GTT->VRAM and VRAM->GTT copy for GTT offset 0x%llx\n",
-			 gtt_addr - adev->mc.gtt_start);
+			 gart_addr - adev->mc.gart_start);
 		continue;
 
 out_lclean_unpin:
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
index 8601904..1c88bd5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_trace.h
@@ -14,6 +14,62 @@
 #define AMDGPU_JOB_GET_TIMELINE_NAME(job) \
 	 job->base.s_fence->finished.ops->get_timeline_name(&job->base.s_fence->finished)
 
+TRACE_EVENT(amdgpu_ttm_tt_populate,
+	    TP_PROTO(struct amdgpu_device *adev, uint64_t dma_address, uint64_t phys_address),
+	    TP_ARGS(adev, dma_address, phys_address),
+	    TP_STRUCT__entry(
+				__field(uint16_t, domain)
+				__field(uint8_t, bus)
+				__field(uint8_t, slot)
+				__field(uint8_t, func)
+				__field(uint64_t, dma)
+				__field(uint64_t, phys)
+			    ),
+	    TP_fast_assign(
+			   __entry->domain = pci_domain_nr(adev->pdev->bus);
+			   __entry->bus = adev->pdev->bus->number;
+			   __entry->slot = PCI_SLOT(adev->pdev->devfn);
+			   __entry->func = PCI_FUNC(adev->pdev->devfn);
+			   __entry->dma = dma_address;
+			   __entry->phys = phys_address;
+			   ),
+	    TP_printk("%04x:%02x:%02x.%x: 0x%llx => 0x%llx",
+		      (unsigned)__entry->domain,
+		      (unsigned)__entry->bus,
+		      (unsigned)__entry->slot,
+		      (unsigned)__entry->func,
+		      (unsigned long long)__entry->dma,
+		      (unsigned long long)__entry->phys)
+);
+
+TRACE_EVENT(amdgpu_ttm_tt_unpopulate,
+	    TP_PROTO(struct amdgpu_device *adev, uint64_t dma_address, uint64_t phys_address),
+	    TP_ARGS(adev, dma_address, phys_address),
+	    TP_STRUCT__entry(
+				__field(uint16_t, domain)
+				__field(uint8_t, bus)
+				__field(uint8_t, slot)
+				__field(uint8_t, func)
+				__field(uint64_t, dma)
+				__field(uint64_t, phys)
+			    ),
+	    TP_fast_assign(
+			   __entry->domain = pci_domain_nr(adev->pdev->bus);
+			   __entry->bus = adev->pdev->bus->number;
+			   __entry->slot = PCI_SLOT(adev->pdev->devfn);
+			   __entry->func = PCI_FUNC(adev->pdev->devfn);
+			   __entry->dma = dma_address;
+			   __entry->phys = phys_address;
+			   ),
+	    TP_printk("%04x:%02x:%02x.%x: 0x%llx => 0x%llx",
+		      (unsigned)__entry->domain,
+		      (unsigned)__entry->bus,
+		      (unsigned)__entry->slot,
+		      (unsigned)__entry->func,
+		      (unsigned long long)__entry->dma,
+		      (unsigned long long)__entry->phys)
+);
+
 TRACE_EVENT(amdgpu_mm_rreg,
 	    TP_PROTO(unsigned did, uint32_t reg, uint32_t value),
 	    TP_ARGS(did, reg, value),
@@ -105,12 +161,12 @@ TRACE_EVENT(amdgpu_bo_create,
 			   __entry->bo = bo;
 			   __entry->pages = bo->tbo.num_pages;
 			   __entry->type = bo->tbo.mem.mem_type;
-			   __entry->prefer = bo->prefered_domains;
+			   __entry->prefer = bo->preferred_domains;
 			   __entry->allow = bo->allowed_domains;
 			   __entry->visible = bo->flags;
 			   ),
 
-	    TP_printk("bo=%p, pages=%u, type=%d, prefered=%d, allowed=%d, visible=%d",
+	    TP_printk("bo=%p, pages=%u, type=%d, preferred=%d, allowed=%d, visible=%d",
 		       __entry->bo, __entry->pages, __entry->type,
 		       __entry->prefer, __entry->allow, __entry->visible)
 );
@@ -224,17 +280,17 @@ TRACE_EVENT(amdgpu_vm_bo_map,
 			     __field(long, start)
 			     __field(long, last)
 			     __field(u64, offset)
-			     __field(u32, flags)
+			     __field(u64, flags)
 			     ),
 
 	    TP_fast_assign(
-			   __entry->bo = bo_va ? bo_va->bo : NULL;
+			   __entry->bo = bo_va ? bo_va->base.bo : NULL;
 			   __entry->start = mapping->start;
 			   __entry->last = mapping->last;
 			   __entry->offset = mapping->offset;
 			   __entry->flags = mapping->flags;
 			   ),
-	    TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%08x",
+	    TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
 		      __entry->bo, __entry->start, __entry->last,
 		      __entry->offset, __entry->flags)
 );
@@ -248,17 +304,17 @@ TRACE_EVENT(amdgpu_vm_bo_unmap,
 			     __field(long, start)
 			     __field(long, last)
 			     __field(u64, offset)
-			     __field(u32, flags)
+			     __field(u64, flags)
 			     ),
 
 	    TP_fast_assign(
-			   __entry->bo = bo_va->bo;
+			   __entry->bo = bo_va->base.bo;
 			   __entry->start = mapping->start;
 			   __entry->last = mapping->last;
 			   __entry->offset = mapping->offset;
 			   __entry->flags = mapping->flags;
 			   ),
-	    TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%08x",
+	    TP_printk("bo=%p, start=%lx, last=%lx, offset=%010llx, flags=%llx",
 		      __entry->bo, __entry->start, __entry->last,
 		      __entry->offset, __entry->flags)
 );
@@ -269,7 +325,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
 	    TP_STRUCT__entry(
 			     __field(u64, soffset)
 			     __field(u64, eoffset)
-			     __field(u32, flags)
+			     __field(u64, flags)
 			     ),
 
 	    TP_fast_assign(
@@ -277,7 +333,7 @@ DECLARE_EVENT_CLASS(amdgpu_vm_mapping,
 			   __entry->eoffset = mapping->last + 1;
 			   __entry->flags = mapping->flags;
 			   ),
-	    TP_printk("soffs=%010llx, eoffs=%010llx, flags=%08x",
+	    TP_printk("soffs=%010llx, eoffs=%010llx, flags=%llx",
 		      __entry->soffset, __entry->eoffset, __entry->flags)
 );
 
@@ -293,14 +349,14 @@ DEFINE_EVENT(amdgpu_vm_mapping, amdgpu_vm_bo_mapping,
 
 TRACE_EVENT(amdgpu_vm_set_ptes,
 	    TP_PROTO(uint64_t pe, uint64_t addr, unsigned count,
-		     uint32_t incr, uint32_t flags),
+		     uint32_t incr, uint64_t flags),
 	    TP_ARGS(pe, addr, count, incr, flags),
 	    TP_STRUCT__entry(
 			     __field(u64, pe)
 			     __field(u64, addr)
 			     __field(u32, count)
 			     __field(u32, incr)
-			     __field(u32, flags)
+			     __field(u64, flags)
 			     ),
 
 	    TP_fast_assign(
@@ -310,7 +366,7 @@ TRACE_EVENT(amdgpu_vm_set_ptes,
 			   __entry->incr = incr;
 			   __entry->flags = flags;
 			   ),
-	    TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%08x, count=%u",
+	    TP_printk("pe=%010Lx, addr=%010Lx, incr=%u, flags=%llx, count=%u",
 		      __entry->pe, __entry->addr, __entry->incr,
 		      __entry->flags, __entry->count)
 );
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
index c9b131b..8b2c294 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c
@@ -43,14 +43,20 @@
 #include <linux/pagemap.h>
 #include <linux/debugfs.h>
 #include "amdgpu.h"
+#include "amdgpu_trace.h"
 #include "bif/bif_4_1_d.h"
 
 #define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
 
+static int amdgpu_map_buffer(struct ttm_buffer_object *bo,
+			     struct ttm_mem_reg *mem, unsigned num_pages,
+			     uint64_t offset, unsigned window,
+			     struct amdgpu_ring *ring,
+			     uint64_t *addr);
+
 static int amdgpu_ttm_debugfs_init(struct amdgpu_device *adev);
 static void amdgpu_ttm_debugfs_fini(struct amdgpu_device *adev);
 
-
 /*
  * Global memory.
  */
@@ -97,6 +103,8 @@ static int amdgpu_ttm_global_init(struct amdgpu_device *adev)
 		goto error_bo;
 	}
 
+	mutex_init(&adev->mman.gtt_window_lock);
+
 	ring = adev->mman.buffer_funcs_ring;
 	rq = &ring->sched.sched_rq[AMD_SCHED_PRIORITY_KERNEL];
 	r = amd_sched_entity_init(&ring->sched, &adev->mman.entity,
@@ -123,6 +131,7 @@ static void amdgpu_ttm_global_fini(struct amdgpu_device *adev)
 	if (adev->mman.mem_global_referenced) {
 		amd_sched_entity_fini(adev->mman.entity.sched,
 				      &adev->mman.entity);
+		mutex_destroy(&adev->mman.gtt_window_lock);
 		drm_global_item_unref(&adev->mman.bo_global_ref.ref);
 		drm_global_item_unref(&adev->mman.mem_global_ref);
 		adev->mman.mem_global_referenced = false;
@@ -150,7 +159,7 @@ static int amdgpu_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
 		break;
 	case TTM_PL_TT:
 		man->func = &amdgpu_gtt_mgr_func;
-		man->gpu_offset = adev->mc.gtt_start;
+		man->gpu_offset = adev->mc.gart_start;
 		man->available_caching = TTM_PL_MASK_CACHING;
 		man->default_caching = TTM_PL_FLAG_CACHED;
 		man->flags = TTM_MEMTYPE_FLAG_MAPPABLE | TTM_MEMTYPE_FLAG_CMA;
@@ -186,12 +195,11 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->bdev);
 	struct amdgpu_bo *abo;
-	static struct ttm_place placements = {
+	static const struct ttm_place placements = {
 		.fpfn = 0,
 		.lpfn = 0,
 		.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM
 	};
-	unsigned i;
 
 	if (!amdgpu_ttm_bo_is_amdgpu_bo(bo)) {
 		placement->placement = &placements;
@@ -207,22 +215,36 @@ static void amdgpu_evict_flags(struct ttm_buffer_object *bo,
 		    adev->mman.buffer_funcs_ring &&
 		    adev->mman.buffer_funcs_ring->ready == false) {
 			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_CPU);
-		} else {
-			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
-			for (i = 0; i < abo->placement.num_placement; ++i) {
-				if (!(abo->placements[i].flags &
-				      TTM_PL_FLAG_TT))
-					continue;
+		} else if (adev->mc.visible_vram_size < adev->mc.real_vram_size &&
+			   !(abo->flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED)) {
+			unsigned fpfn = adev->mc.visible_vram_size >> PAGE_SHIFT;
+			struct drm_mm_node *node = bo->mem.mm_node;
+			unsigned long pages_left;
 
-				if (abo->placements[i].lpfn)
-					continue;
-
-				/* set an upper limit to force directly
-				 * allocating address space for the BO.
-				 */
-				abo->placements[i].lpfn =
-					adev->mc.gtt_size >> PAGE_SHIFT;
+			for (pages_left = bo->mem.num_pages;
+			     pages_left;
+			     pages_left -= node->size, node++) {
+				if (node->start < fpfn)
+					break;
 			}
+
+			if (!pages_left)
+				goto gtt;
+
+			/* Try evicting to the CPU inaccessible part of VRAM
+			 * first, but only set GTT as busy placement, so this
+			 * BO will be evicted to GTT rather than causing other
+			 * BOs to be evicted from VRAM
+			 */
+			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_VRAM |
+							 AMDGPU_GEM_DOMAIN_GTT);
+			abo->placements[0].fpfn = fpfn;
+			abo->placements[0].lpfn = 0;
+			abo->placement.busy_placement = &abo->placements[1];
+			abo->placement.num_busy_placement = 1;
+		} else {
+gtt:
+			amdgpu_ttm_placement_from_domain(abo, AMDGPU_GEM_DOMAIN_GTT);
 		}
 		break;
 	case TTM_PL_TT:
@@ -252,29 +274,18 @@ static void amdgpu_move_null(struct ttm_buffer_object *bo,
 	new_mem->mm_node = NULL;
 }
 
-static int amdgpu_mm_node_addr(struct ttm_buffer_object *bo,
-			       struct drm_mm_node *mm_node,
-			       struct ttm_mem_reg *mem,
-			       uint64_t *addr)
+static uint64_t amdgpu_mm_node_addr(struct ttm_buffer_object *bo,
+				    struct drm_mm_node *mm_node,
+				    struct ttm_mem_reg *mem)
 {
-	int r;
+	uint64_t addr = 0;
 
-	switch (mem->mem_type) {
-	case TTM_PL_TT:
-		r = amdgpu_ttm_bind(bo, mem);
-		if (r)
-			return r;
-
-	case TTM_PL_VRAM:
-		*addr = mm_node->start << PAGE_SHIFT;
-		*addr += bo->bdev->man[mem->mem_type].gpu_offset;
-		break;
-	default:
-		DRM_ERROR("Unknown placement %d\n", mem->mem_type);
-		return -EINVAL;
+	if (mem->mem_type != TTM_PL_TT ||
+	    amdgpu_gtt_mgr_is_allocated(mem)) {
+		addr = mm_node->start << PAGE_SHIFT;
+		addr += bo->bdev->man[mem->mem_type].gpu_offset;
 	}
-
-	return 0;
+	return addr;
 }
 
 static int amdgpu_move_blit(struct ttm_buffer_object *bo,
@@ -299,26 +310,40 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
 	}
 
 	old_mm = old_mem->mm_node;
-	r = amdgpu_mm_node_addr(bo, old_mm, old_mem, &old_start);
-	if (r)
-		return r;
 	old_size = old_mm->size;
-
+	old_start = amdgpu_mm_node_addr(bo, old_mm, old_mem);
 
 	new_mm = new_mem->mm_node;
-	r = amdgpu_mm_node_addr(bo, new_mm, new_mem, &new_start);
-	if (r)
-		return r;
 	new_size = new_mm->size;
+	new_start = amdgpu_mm_node_addr(bo, new_mm, new_mem);
 
 	num_pages = new_mem->num_pages;
+	mutex_lock(&adev->mman.gtt_window_lock);
 	while (num_pages) {
-		unsigned long cur_pages = min(old_size, new_size);
+		unsigned long cur_pages = min(min(old_size, new_size),
+					      (u64)AMDGPU_GTT_MAX_TRANSFER_SIZE);
+		uint64_t from = old_start, to = new_start;
 		struct dma_fence *next;
 
-		r = amdgpu_copy_buffer(ring, old_start, new_start,
+		if (old_mem->mem_type == TTM_PL_TT &&
+		    !amdgpu_gtt_mgr_is_allocated(old_mem)) {
+			r = amdgpu_map_buffer(bo, old_mem, cur_pages,
+					      old_start, 0, ring, &from);
+			if (r)
+				goto error;
+		}
+
+		if (new_mem->mem_type == TTM_PL_TT &&
+		    !amdgpu_gtt_mgr_is_allocated(new_mem)) {
+			r = amdgpu_map_buffer(bo, new_mem, cur_pages,
+					      new_start, 1, ring, &to);
+			if (r)
+				goto error;
+		}
+
+		r = amdgpu_copy_buffer(ring, from, to,
 				       cur_pages * PAGE_SIZE,
-				       bo->resv, &next, false);
+				       bo->resv, &next, false, true);
 		if (r)
 			goto error;
 
@@ -331,10 +356,7 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
 
 		old_size -= cur_pages;
 		if (!old_size) {
-			r = amdgpu_mm_node_addr(bo, ++old_mm, old_mem,
-						&old_start);
-			if (r)
-				goto error;
+			old_start = amdgpu_mm_node_addr(bo, ++old_mm, old_mem);
 			old_size = old_mm->size;
 		} else {
 			old_start += cur_pages * PAGE_SIZE;
@@ -342,22 +364,21 @@ static int amdgpu_move_blit(struct ttm_buffer_object *bo,
 
 		new_size -= cur_pages;
 		if (!new_size) {
-			r = amdgpu_mm_node_addr(bo, ++new_mm, new_mem,
-						&new_start);
-			if (r)
-				goto error;
-
+			new_start = amdgpu_mm_node_addr(bo, ++new_mm, new_mem);
 			new_size = new_mm->size;
 		} else {
 			new_start += cur_pages * PAGE_SIZE;
 		}
 	}
+	mutex_unlock(&adev->mman.gtt_window_lock);
 
 	r = ttm_bo_pipeline_move(bo, fence, evict, new_mem);
 	dma_fence_put(fence);
 	return r;
 
 error:
+	mutex_unlock(&adev->mman.gtt_window_lock);
+
 	if (fence)
 		dma_fence_wait(fence, false);
 	dma_fence_put(fence);
@@ -384,7 +405,7 @@ static int amdgpu_move_vram_ram(struct ttm_buffer_object *bo,
 	placement.num_busy_placement = 1;
 	placement.busy_placement = &placements;
 	placements.fpfn = 0;
-	placements.lpfn = adev->mc.gtt_size >> PAGE_SHIFT;
+	placements.lpfn = 0;
 	placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT;
 	r = ttm_bo_mem_space(bo, &placement, &tmp_mem,
 			     interruptible, no_wait_gpu);
@@ -431,7 +452,7 @@ static int amdgpu_move_ram_vram(struct ttm_buffer_object *bo,
 	placement.num_busy_placement = 1;
 	placement.busy_placement = &placements;
 	placements.fpfn = 0;
-	placements.lpfn = adev->mc.gtt_size >> PAGE_SHIFT;
+	placements.lpfn = 0;
 	placements.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_TT;
 	r = ttm_bo_mem_space(bo, &placement, &tmp_mem,
 			     interruptible, no_wait_gpu);
@@ -507,6 +528,15 @@ static int amdgpu_bo_move(struct ttm_buffer_object *bo,
 		}
 	}
 
+	if (bo->type == ttm_bo_type_device &&
+	    new_mem->mem_type == TTM_PL_VRAM &&
+	    old_mem->mem_type != TTM_PL_VRAM) {
+		/* amdgpu_bo_fault_reserve_notify will re-set this if the CPU
+		 * accesses the BO after it's moved.
+		 */
+		abo->flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+	}
+
 	/* update statistics */
 	atomic64_add((u64)bo->num_pages << PAGE_SHIFT, &adev->num_bytes_moved);
 	return 0;
@@ -633,6 +663,38 @@ int amdgpu_ttm_tt_get_user_pages(struct ttm_tt *ttm, struct page **pages)
 	return r;
 }
 
+static void amdgpu_trace_dma_map(struct ttm_tt *ttm)
+{
+	struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
+	struct amdgpu_ttm_tt *gtt = (void *)ttm;
+	unsigned i;
+
+	if (unlikely(trace_amdgpu_ttm_tt_populate_enabled())) {
+		for (i = 0; i < ttm->num_pages; i++) {
+			trace_amdgpu_ttm_tt_populate(
+				adev,
+				gtt->ttm.dma_address[i],
+				page_to_phys(ttm->pages[i]));
+		}
+	}
+}
+
+static void amdgpu_trace_dma_unmap(struct ttm_tt *ttm)
+{
+	struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
+	struct amdgpu_ttm_tt *gtt = (void *)ttm;
+	unsigned i;
+
+	if (unlikely(trace_amdgpu_ttm_tt_unpopulate_enabled())) {
+		for (i = 0; i < ttm->num_pages; i++) {
+			trace_amdgpu_ttm_tt_unpopulate(
+				adev,
+				gtt->ttm.dma_address[i],
+				page_to_phys(ttm->pages[i]));
+		}
+	}
+}
+
 /* prepare the sg table with the user pages */
 static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 {
@@ -659,6 +721,8 @@ static int amdgpu_ttm_tt_pin_userptr(struct ttm_tt *ttm)
 	drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
 					 gtt->ttm.dma_address, ttm->num_pages);
 
+	amdgpu_trace_dma_map(ttm);
+
 	return 0;
 
 release_sg:
@@ -692,14 +756,41 @@ static void amdgpu_ttm_tt_unpin_userptr(struct ttm_tt *ttm)
 		put_page(page);
 	}
 
+	amdgpu_trace_dma_unmap(ttm);
+
 	sg_free_table(ttm->sg);
 }
 
+static int amdgpu_ttm_do_bind(struct ttm_tt *ttm, struct ttm_mem_reg *mem)
+{
+	struct amdgpu_ttm_tt *gtt = (void *)ttm;
+	uint64_t flags;
+	int r;
+
+	spin_lock(&gtt->adev->gtt_list_lock);
+	flags = amdgpu_ttm_tt_pte_flags(gtt->adev, ttm, mem);
+	gtt->offset = (u64)mem->start << PAGE_SHIFT;
+	r = amdgpu_gart_bind(gtt->adev, gtt->offset, ttm->num_pages,
+		ttm->pages, gtt->ttm.dma_address, flags);
+
+	if (r) {
+		DRM_ERROR("failed to bind %lu pages at 0x%08llX\n",
+			  ttm->num_pages, gtt->offset);
+		goto error_gart_bind;
+	}
+
+	list_add_tail(&gtt->list, &gtt->adev->gtt_list);
+error_gart_bind:
+	spin_unlock(&gtt->adev->gtt_list_lock);
+	return r;
+
+}
+
 static int amdgpu_ttm_backend_bind(struct ttm_tt *ttm,
 				   struct ttm_mem_reg *bo_mem)
 {
 	struct amdgpu_ttm_tt *gtt = (void*)ttm;
-	int r;
+	int r = 0;
 
 	if (gtt->userptr) {
 		r = amdgpu_ttm_tt_pin_userptr(ttm);
@@ -718,7 +809,10 @@ static int amdgpu_ttm_backend_bind(struct ttm_tt *ttm,
 	    bo_mem->mem_type == AMDGPU_PL_OA)
 		return -EINVAL;
 
-	return 0;
+	if (amdgpu_gtt_mgr_is_allocated(bo_mem))
+		r = amdgpu_ttm_do_bind(ttm, bo_mem);
+
+	return r;
 }
 
 bool amdgpu_ttm_is_bound(struct ttm_tt *ttm)
@@ -731,8 +825,6 @@ bool amdgpu_ttm_is_bound(struct ttm_tt *ttm)
 int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem)
 {
 	struct ttm_tt *ttm = bo->ttm;
-	struct amdgpu_ttm_tt *gtt = (void *)bo->ttm;
-	uint64_t flags;
 	int r;
 
 	if (!ttm || amdgpu_ttm_is_bound(ttm))
@@ -745,22 +837,7 @@ int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem)
 		return r;
 	}
 
-	spin_lock(&gtt->adev->gtt_list_lock);
-	flags = amdgpu_ttm_tt_pte_flags(gtt->adev, ttm, bo_mem);
-	gtt->offset = (u64)bo_mem->start << PAGE_SHIFT;
-	r = amdgpu_gart_bind(gtt->adev, gtt->offset, ttm->num_pages,
-		ttm->pages, gtt->ttm.dma_address, flags);
-
-	if (r) {
-		DRM_ERROR("failed to bind %lu pages at 0x%08llX\n",
-			  ttm->num_pages, gtt->offset);
-		goto error_gart_bind;
-	}
-
-	list_add_tail(&gtt->list, &gtt->adev->gtt_list);
-error_gart_bind:
-	spin_unlock(&gtt->adev->gtt_list_lock);
-	return r;
+	return amdgpu_ttm_do_bind(ttm, bo_mem);
 }
 
 int amdgpu_ttm_recover_gart(struct amdgpu_device *adev)
@@ -852,7 +929,7 @@ static struct ttm_tt *amdgpu_ttm_tt_create(struct ttm_bo_device *bdev,
 
 static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
 {
-	struct amdgpu_device *adev;
+	struct amdgpu_device *adev = amdgpu_ttm_adev(ttm->bdev);
 	struct amdgpu_ttm_tt *gtt = (void *)ttm;
 	unsigned i;
 	int r;
@@ -875,14 +952,14 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
 		drm_prime_sg_to_page_addr_arrays(ttm->sg, ttm->pages,
 						 gtt->ttm.dma_address, ttm->num_pages);
 		ttm->state = tt_unbound;
-		return 0;
+		r = 0;
+		goto trace_mappings;
 	}
 
-	adev = amdgpu_ttm_adev(ttm->bdev);
-
 #ifdef CONFIG_SWIOTLB
 	if (swiotlb_nr_tbl()) {
-		return ttm_dma_populate(&gtt->ttm, adev->dev);
+		r = ttm_dma_populate(&gtt->ttm, adev->dev);
+		goto trace_mappings;
 	}
 #endif
 
@@ -905,7 +982,12 @@ static int amdgpu_ttm_tt_populate(struct ttm_tt *ttm)
 			return -EFAULT;
 		}
 	}
-	return 0;
+
+	r = 0;
+trace_mappings:
+	if (likely(!r))
+		amdgpu_trace_dma_map(ttm);
+	return r;
 }
 
 static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm)
@@ -926,6 +1008,8 @@ static void amdgpu_ttm_tt_unpopulate(struct ttm_tt *ttm)
 
 	adev = amdgpu_ttm_adev(ttm->bdev);
 
+	amdgpu_trace_dma_unmap(ttm);
+
 #ifdef CONFIG_SWIOTLB
 	if (swiotlb_nr_tbl()) {
 		ttm_dma_unpopulate(&gtt->ttm, adev->dev);
@@ -1075,6 +1159,67 @@ static bool amdgpu_ttm_bo_eviction_valuable(struct ttm_buffer_object *bo,
 	return ttm_bo_eviction_valuable(bo, place);
 }
 
+static int amdgpu_ttm_access_memory(struct ttm_buffer_object *bo,
+				    unsigned long offset,
+				    void *buf, int len, int write)
+{
+	struct amdgpu_bo *abo = container_of(bo, struct amdgpu_bo, tbo);
+	struct amdgpu_device *adev = amdgpu_ttm_adev(abo->tbo.bdev);
+	struct drm_mm_node *nodes = abo->tbo.mem.mm_node;
+	uint32_t value = 0;
+	int ret = 0;
+	uint64_t pos;
+	unsigned long flags;
+
+	if (bo->mem.mem_type != TTM_PL_VRAM)
+		return -EIO;
+
+	while (offset >= (nodes->size << PAGE_SHIFT)) {
+		offset -= nodes->size << PAGE_SHIFT;
+		++nodes;
+	}
+	pos = (nodes->start << PAGE_SHIFT) + offset;
+
+	while (len && pos < adev->mc.mc_vram_size) {
+		uint64_t aligned_pos = pos & ~(uint64_t)3;
+		uint32_t bytes = 4 - (pos & 3);
+		uint32_t shift = (pos & 3) * 8;
+		uint32_t mask = 0xffffffff << shift;
+
+		if (len < bytes) {
+			mask &= 0xffffffff >> (bytes - len) * 8;
+			bytes = len;
+		}
+
+		spin_lock_irqsave(&adev->mmio_idx_lock, flags);
+		WREG32(mmMM_INDEX, ((uint32_t)aligned_pos) | 0x80000000);
+		WREG32(mmMM_INDEX_HI, aligned_pos >> 31);
+		if (!write || mask != 0xffffffff)
+			value = RREG32(mmMM_DATA);
+		if (write) {
+			value &= ~mask;
+			value |= (*(uint32_t *)buf << shift) & mask;
+			WREG32(mmMM_DATA, value);
+		}
+		spin_unlock_irqrestore(&adev->mmio_idx_lock, flags);
+		if (!write) {
+			value = (value & mask) >> shift;
+			memcpy(buf, &value, bytes);
+		}
+
+		ret += bytes;
+		buf = (uint8_t *)buf + bytes;
+		pos += bytes;
+		len -= bytes;
+		if (pos >= (nodes->start + nodes->size) << PAGE_SHIFT) {
+			++nodes;
+			pos = (nodes->start << PAGE_SHIFT);
+		}
+	}
+
+	return ret;
+}
+
 static struct ttm_bo_driver amdgpu_bo_driver = {
 	.ttm_tt_create = &amdgpu_ttm_tt_create,
 	.ttm_tt_populate = &amdgpu_ttm_tt_populate,
@@ -1090,11 +1235,14 @@ static struct ttm_bo_driver amdgpu_bo_driver = {
 	.io_mem_reserve = &amdgpu_ttm_io_mem_reserve,
 	.io_mem_free = &amdgpu_ttm_io_mem_free,
 	.io_mem_pfn = amdgpu_ttm_io_mem_pfn,
+	.access_memory = &amdgpu_ttm_access_memory
 };
 
 int amdgpu_ttm_init(struct amdgpu_device *adev)
 {
+	uint64_t gtt_size;
 	int r;
+	u64 vis_vram_limit;
 
 	r = amdgpu_ttm_global_init(adev);
 	if (r) {
@@ -1118,36 +1266,37 @@ int amdgpu_ttm_init(struct amdgpu_device *adev)
 		DRM_ERROR("Failed initializing VRAM heap.\n");
 		return r;
 	}
+
+	/* Reduce size of CPU-visible VRAM if requested */
+	vis_vram_limit = (u64)amdgpu_vis_vram_limit * 1024 * 1024;
+	if (amdgpu_vis_vram_limit > 0 &&
+	    vis_vram_limit <= adev->mc.visible_vram_size)
+		adev->mc.visible_vram_size = vis_vram_limit;
+
 	/* Change the size here instead of the init above so only lpfn is affected */
 	amdgpu_ttm_set_active_vram_size(adev, adev->mc.visible_vram_size);
 
-	r = amdgpu_bo_create(adev, adev->mc.stolen_size, PAGE_SIZE, true,
-			     AMDGPU_GEM_DOMAIN_VRAM,
-			     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-			     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-			     NULL, NULL, &adev->stollen_vga_memory);
-	if (r) {
-		return r;
-	}
-	r = amdgpu_bo_reserve(adev->stollen_vga_memory, false);
+	r = amdgpu_bo_create_kernel(adev, adev->mc.stolen_size, PAGE_SIZE,
+				    AMDGPU_GEM_DOMAIN_VRAM,
+				    &adev->stolen_vga_memory,
+				    NULL, NULL);
 	if (r)
 		return r;
-	r = amdgpu_bo_pin(adev->stollen_vga_memory, AMDGPU_GEM_DOMAIN_VRAM, NULL);
-	amdgpu_bo_unreserve(adev->stollen_vga_memory);
-	if (r) {
-		amdgpu_bo_unref(&adev->stollen_vga_memory);
-		return r;
-	}
 	DRM_INFO("amdgpu: %uM of VRAM memory ready\n",
 		 (unsigned) (adev->mc.real_vram_size / (1024 * 1024)));
-	r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_TT,
-				adev->mc.gtt_size >> PAGE_SHIFT);
+
+	if (amdgpu_gtt_size == -1)
+		gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
+			       adev->mc.mc_vram_size);
+	else
+		gtt_size = (uint64_t)amdgpu_gtt_size << 20;
+	r = ttm_bo_init_mm(&adev->mman.bdev, TTM_PL_TT, gtt_size >> PAGE_SHIFT);
 	if (r) {
 		DRM_ERROR("Failed initializing GTT heap.\n");
 		return r;
 	}
 	DRM_INFO("amdgpu: %uM of GTT memory ready.\n",
-		 (unsigned)(adev->mc.gtt_size / (1024 * 1024)));
+		 (unsigned)(gtt_size / (1024 * 1024)));
 
 	adev->gds.mem.total_size = adev->gds.mem.total_size << AMDGPU_GDS_SHIFT;
 	adev->gds.mem.gfx_partition_size = adev->gds.mem.gfx_partition_size << AMDGPU_GDS_SHIFT;
@@ -1203,13 +1352,13 @@ void amdgpu_ttm_fini(struct amdgpu_device *adev)
 	if (!adev->mman.initialized)
 		return;
 	amdgpu_ttm_debugfs_fini(adev);
-	if (adev->stollen_vga_memory) {
-		r = amdgpu_bo_reserve(adev->stollen_vga_memory, true);
+	if (adev->stolen_vga_memory) {
+		r = amdgpu_bo_reserve(adev->stolen_vga_memory, true);
 		if (r == 0) {
-			amdgpu_bo_unpin(adev->stollen_vga_memory);
-			amdgpu_bo_unreserve(adev->stollen_vga_memory);
+			amdgpu_bo_unpin(adev->stolen_vga_memory);
+			amdgpu_bo_unreserve(adev->stolen_vga_memory);
 		}
-		amdgpu_bo_unref(&adev->stollen_vga_memory);
+		amdgpu_bo_unref(&adev->stolen_vga_memory);
 	}
 	ttm_bo_clean_mm(&adev->mman.bdev, TTM_PL_VRAM);
 	ttm_bo_clean_mm(&adev->mman.bdev, TTM_PL_TT);
@@ -1256,12 +1405,77 @@ int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma)
 	return ttm_bo_mmap(filp, vma, &adev->mman.bdev);
 }
 
-int amdgpu_copy_buffer(struct amdgpu_ring *ring,
-		       uint64_t src_offset,
-		       uint64_t dst_offset,
-		       uint32_t byte_count,
+static int amdgpu_map_buffer(struct ttm_buffer_object *bo,
+			     struct ttm_mem_reg *mem, unsigned num_pages,
+			     uint64_t offset, unsigned window,
+			     struct amdgpu_ring *ring,
+			     uint64_t *addr)
+{
+	struct amdgpu_ttm_tt *gtt = (void *)bo->ttm;
+	struct amdgpu_device *adev = ring->adev;
+	struct ttm_tt *ttm = bo->ttm;
+	struct amdgpu_job *job;
+	unsigned num_dw, num_bytes;
+	dma_addr_t *dma_address;
+	struct dma_fence *fence;
+	uint64_t src_addr, dst_addr;
+	uint64_t flags;
+	int r;
+
+	BUG_ON(adev->mman.buffer_funcs->copy_max_bytes <
+	       AMDGPU_GTT_MAX_TRANSFER_SIZE * 8);
+
+	*addr = adev->mc.gart_start;
+	*addr += (u64)window * AMDGPU_GTT_MAX_TRANSFER_SIZE *
+		AMDGPU_GPU_PAGE_SIZE;
+
+	num_dw = adev->mman.buffer_funcs->copy_num_dw;
+	while (num_dw & 0x7)
+		num_dw++;
+
+	num_bytes = num_pages * 8;
+
+	r = amdgpu_job_alloc_with_ib(adev, num_dw * 4 + num_bytes, &job);
+	if (r)
+		return r;
+
+	src_addr = num_dw * 4;
+	src_addr += job->ibs[0].gpu_addr;
+
+	dst_addr = adev->gart.table_addr;
+	dst_addr += window * AMDGPU_GTT_MAX_TRANSFER_SIZE * 8;
+	amdgpu_emit_copy_buffer(adev, &job->ibs[0], src_addr,
+				dst_addr, num_bytes);
+
+	amdgpu_ring_pad_ib(ring, &job->ibs[0]);
+	WARN_ON(job->ibs[0].length_dw > num_dw);
+
+	dma_address = &gtt->ttm.dma_address[offset >> PAGE_SHIFT];
+	flags = amdgpu_ttm_tt_pte_flags(adev, ttm, mem);
+	r = amdgpu_gart_map(adev, 0, num_pages, dma_address, flags,
+			    &job->ibs[0].ptr[num_dw]);
+	if (r)
+		goto error_free;
+
+	r = amdgpu_job_submit(job, ring, &adev->mman.entity,
+			      AMDGPU_FENCE_OWNER_UNDEFINED, &fence);
+	if (r)
+		goto error_free;
+
+	dma_fence_put(fence);
+
+	return r;
+
+error_free:
+	amdgpu_job_free(job);
+	return r;
+}
+
+int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset,
+		       uint64_t dst_offset, uint32_t byte_count,
 		       struct reservation_object *resv,
-		       struct dma_fence **fence, bool direct_submit)
+		       struct dma_fence **fence, bool direct_submit,
+		       bool vm_needs_flush)
 {
 	struct amdgpu_device *adev = ring->adev;
 	struct amdgpu_job *job;
@@ -1283,6 +1497,7 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring,
 	if (r)
 		return r;
 
+	job->vm_needs_flush = vm_needs_flush;
 	if (resv) {
 		r = amdgpu_sync_resv(adev, &job->sync, resv,
 				     AMDGPU_FENCE_OWNER_UNDEFINED);
@@ -1327,11 +1542,12 @@ int amdgpu_copy_buffer(struct amdgpu_ring *ring,
 }
 
 int amdgpu_fill_buffer(struct amdgpu_bo *bo,
-		       uint32_t src_data,
+		       uint64_t src_data,
 		       struct reservation_object *resv,
 		       struct dma_fence **fence)
 {
 	struct amdgpu_device *adev = amdgpu_ttm_adev(bo->tbo.bdev);
+	/* max_bytes applies to SDMA_OP_PTEPDE as well as SDMA_OP_CONST_FILL */
 	uint32_t max_bytes = adev->mman.buffer_funcs->fill_max_bytes;
 	struct amdgpu_ring *ring = adev->mman.buffer_funcs_ring;
 
@@ -1347,6 +1563,12 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
 		return -EINVAL;
 	}
 
+	if (bo->tbo.mem.mem_type == TTM_PL_TT) {
+		r = amdgpu_ttm_bind(&bo->tbo, &bo->tbo.mem);
+		if (r)
+			return r;
+	}
+
 	num_pages = bo->tbo.num_pages;
 	mm_node = bo->tbo.mem.mm_node;
 	num_loops = 0;
@@ -1357,7 +1579,9 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
 		num_pages -= mm_node->size;
 		++mm_node;
 	}
-	num_dw = num_loops * adev->mman.buffer_funcs->fill_num_dw;
+
+	/* 10 double words for each SDMA_OP_PTEPDE cmd */
+	num_dw = num_loops * 10;
 
 	/* for IB padding */
 	num_dw += 64;
@@ -1382,16 +1606,16 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
 		uint32_t byte_count = mm_node->size << PAGE_SHIFT;
 		uint64_t dst_addr;
 
-		r = amdgpu_mm_node_addr(&bo->tbo, mm_node,
-					&bo->tbo.mem, &dst_addr);
-		if (r)
-			return r;
+		WARN_ONCE(byte_count & 0x7, "size should be a multiple of 8");
 
+		dst_addr = amdgpu_mm_node_addr(&bo->tbo, mm_node, &bo->tbo.mem);
 		while (byte_count) {
 			uint32_t cur_size_in_bytes = min(byte_count, max_bytes);
 
-			amdgpu_emit_fill_buffer(adev, &job->ibs[0], src_data,
-						dst_addr, cur_size_in_bytes);
+			amdgpu_vm_set_pte_pde(adev, &job->ibs[0],
+					dst_addr, 0,
+					cur_size_in_bytes >> 3, 0,
+					src_data);
 
 			dst_addr += cur_size_in_bytes;
 			byte_count -= cur_size_in_bytes;
@@ -1417,32 +1641,16 @@ int amdgpu_fill_buffer(struct amdgpu_bo *bo,
 
 #if defined(CONFIG_DEBUG_FS)
 
-extern void amdgpu_gtt_mgr_print(struct seq_file *m, struct ttm_mem_type_manager
-				 *man);
 static int amdgpu_mm_dump_table(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *)m->private;
 	unsigned ttm_pl = *(int *)node->info_ent->data;
 	struct drm_device *dev = node->minor->dev;
 	struct amdgpu_device *adev = dev->dev_private;
-	struct drm_mm *mm = (struct drm_mm *)adev->mman.bdev.man[ttm_pl].priv;
-	struct ttm_bo_global *glob = adev->mman.bdev.glob;
+	struct ttm_mem_type_manager *man = &adev->mman.bdev.man[ttm_pl];
 	struct drm_printer p = drm_seq_file_printer(m);
 
-	spin_lock(&glob->lru_lock);
-	drm_mm_print(mm, &p);
-	spin_unlock(&glob->lru_lock);
-	switch (ttm_pl) {
-	case TTM_PL_VRAM:
-		seq_printf(m, "man size:%llu pages, ram usage:%lluMB, vis usage:%lluMB\n",
-			   adev->mman.bdev.man[ttm_pl].size,
-			   (u64)atomic64_read(&adev->vram_usage) >> 20,
-			   (u64)atomic64_read(&adev->vram_vis_usage) >> 20);
-		break;
-	case TTM_PL_TT:
-		amdgpu_gtt_mgr_print(m, &adev->mman.bdev.man[TTM_PL_TT]);
-		break;
-	}
+	man->func->debug(man, &p);
 	return 0;
 }
 
@@ -1574,7 +1782,7 @@ static int amdgpu_ttm_debugfs_init(struct amdgpu_device *adev)
 				  adev, &amdgpu_ttm_gtt_fops);
 	if (IS_ERR(ent))
 		return PTR_ERR(ent);
-	i_size_write(ent->d_inode, adev->mc.gtt_size);
+	i_size_write(ent->d_inode, adev->mc.gart_size);
 	adev->mman.gtt = ent;
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
index 6bdede8..f22a475 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.h
@@ -34,6 +34,9 @@
 #define AMDGPU_PL_FLAG_GWS		(TTM_PL_FLAG_PRIV << 1)
 #define AMDGPU_PL_FLAG_OA		(TTM_PL_FLAG_PRIV << 2)
 
+#define AMDGPU_GTT_MAX_TRANSFER_SIZE	512
+#define AMDGPU_GTT_NUM_TRANSFER_WINDOWS	2
+
 struct amdgpu_mman {
 	struct ttm_bo_global_ref        bo_global_ref;
 	struct drm_global_reference	mem_global_ref;
@@ -49,6 +52,8 @@ struct amdgpu_mman {
 	/* buffer handling */
 	const struct amdgpu_buffer_funcs	*buffer_funcs;
 	struct amdgpu_ring			*buffer_funcs_ring;
+
+	struct mutex				gtt_window_lock;
 	/* Scheduler entity for buffer moves */
 	struct amd_sched_entity			entity;
 };
@@ -56,24 +61,29 @@ struct amdgpu_mman {
 extern const struct ttm_mem_type_manager_func amdgpu_gtt_mgr_func;
 extern const struct ttm_mem_type_manager_func amdgpu_vram_mgr_func;
 
+bool amdgpu_gtt_mgr_is_allocated(struct ttm_mem_reg *mem);
 int amdgpu_gtt_mgr_alloc(struct ttm_mem_type_manager *man,
 			 struct ttm_buffer_object *tbo,
 			 const struct ttm_place *place,
 			 struct ttm_mem_reg *mem);
+uint64_t amdgpu_gtt_mgr_usage(struct ttm_mem_type_manager *man);
 
-int amdgpu_copy_buffer(struct amdgpu_ring *ring,
-		       uint64_t src_offset,
-		       uint64_t dst_offset,
-		       uint32_t byte_count,
+uint64_t amdgpu_vram_mgr_usage(struct ttm_mem_type_manager *man);
+uint64_t amdgpu_vram_mgr_vis_usage(struct ttm_mem_type_manager *man);
+
+int amdgpu_copy_buffer(struct amdgpu_ring *ring, uint64_t src_offset,
+		       uint64_t dst_offset, uint32_t byte_count,
 		       struct reservation_object *resv,
-		       struct dma_fence **fence, bool direct_submit);
+		       struct dma_fence **fence, bool direct_submit,
+		       bool vm_needs_flush);
 int amdgpu_fill_buffer(struct amdgpu_bo *bo,
-			uint32_t src_data,
+			uint64_t src_data,
 			struct reservation_object *resv,
 			struct dma_fence **fence);
 
 int amdgpu_mmap(struct file *filp, struct vm_area_struct *vma);
 bool amdgpu_ttm_is_bound(struct ttm_tt *ttm);
 int amdgpu_ttm_bind(struct ttm_buffer_object *bo, struct ttm_mem_reg *bo_mem);
+int amdgpu_ttm_recover_gart(struct amdgpu_device *adev);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index 4f50eeb..36c7633 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -275,14 +275,10 @@ amdgpu_ucode_get_load_type(struct amdgpu_device *adev, int load_type)
 		else
 			return AMDGPU_FW_LOAD_PSP;
 	case CHIP_RAVEN:
-#if 0
-		if (!load_type)
+		if (load_type != 2)
 			return AMDGPU_FW_LOAD_DIRECT;
 		else
 			return AMDGPU_FW_LOAD_PSP;
-#else
-		return AMDGPU_FW_LOAD_DIRECT;
-#endif
 	default:
 		DRM_ERROR("Unknow firmware load type\n");
 	}
@@ -362,8 +358,6 @@ static int amdgpu_ucode_patch_jt(struct amdgpu_firmware_info *ucode,
 			   (le32_to_cpu(header->jt_offset) * 4);
 	memcpy(dst_addr, src_addr, le32_to_cpu(header->jt_size) * 4);
 
-	ucode->ucode_size += le32_to_cpu(header->jt_size) * 4;
-
 	return 0;
 }
 
@@ -377,10 +371,15 @@ int amdgpu_ucode_init_bo(struct amdgpu_device *adev)
 	struct amdgpu_firmware_info *ucode = NULL;
 	const struct common_firmware_header *header = NULL;
 
+	if (!adev->firmware.fw_size) {
+		dev_warn(adev->dev, "No ip firmware need to load\n");
+		return 0;
+	}
+
 	err = amdgpu_bo_create(adev, adev->firmware.fw_size, PAGE_SIZE, true,
 				amdgpu_sriov_vf(adev) ? AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT,
 				AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-				NULL, NULL, bo);
+				NULL, NULL, 0, bo);
 	if (err) {
 		dev_err(adev->dev, "(%d) Firmware buffer allocate failed\n", err);
 		goto failed;
@@ -459,6 +458,9 @@ int amdgpu_ucode_fini_bo(struct amdgpu_device *adev)
 	int i;
 	struct amdgpu_firmware_info *ucode = NULL;
 
+	if (!adev->firmware.fw_size)
+		return 0;
+
 	for (i = 0; i < adev->firmware.max_ucodes; i++) {
 		ucode = &adev->firmware.ucode[i];
 		if (ucode->fw) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
index 2ca09f1..e19928d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_uvd.c
@@ -588,6 +588,10 @@ static int amdgpu_uvd_cs_msg_decode(struct amdgpu_device *adev, uint32_t *msg,
 		}
 		break;
 
+	case 8: /* MJPEG */
+		min_dpb_size = 0;
+		break;
+
 	case 16: /* H265 */
 		image_size = (ALIGN(width, 16) * ALIGN(height, 16) * 3) / 2;
 		image_size = ALIGN(image_size, 256);
@@ -1051,7 +1055,7 @@ int amdgpu_uvd_get_create_msg(struct amdgpu_ring *ring, uint32_t handle,
 			     AMDGPU_GEM_DOMAIN_VRAM,
 			     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
 			     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-			     NULL, NULL, &bo);
+			     NULL, NULL, 0, &bo);
 	if (r)
 		return r;
 
@@ -1101,7 +1105,7 @@ int amdgpu_uvd_get_destroy_msg(struct amdgpu_ring *ring, uint32_t handle,
 			     AMDGPU_GEM_DOMAIN_VRAM,
 			     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
 			     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-			     NULL, NULL, &bo);
+			     NULL, NULL, 0, &bo);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
index b692ad4..c855366 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c
@@ -937,9 +937,9 @@ int amdgpu_vce_ring_test_ring(struct amdgpu_ring *ring)
 	unsigned i;
 	int r, timeout = adev->usec_timeout;
 
-	/* workaround VCE ring test slow issue for sriov*/
+	/* skip ring test for sriov */
 	if (amdgpu_sriov_vf(adev))
-		timeout *= 10;
+		return 0;
 
 	r = amdgpu_ring_alloc(ring, 16);
 	if (r) {
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
index 09190fa..041e012 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vcn.c
@@ -209,9 +209,9 @@ static void amdgpu_vcn_idle_work_handler(struct work_struct *work)
 
 	if (fences == 0) {
 		if (adev->pm.dpm_enabled) {
+			/* might be used when with pg/cg
 			amdgpu_dpm_enable_uvd(adev, false);
-		} else {
-			amdgpu_asic_set_uvd_clocks(adev, 0, 0);
+			*/
 		}
 	} else {
 		schedule_delayed_work(&adev->vcn.idle_work, VCN_IDLE_TIMEOUT);
@@ -223,12 +223,10 @@ void amdgpu_vcn_ring_begin_use(struct amdgpu_ring *ring)
 	struct amdgpu_device *adev = ring->adev;
 	bool set_clocks = !cancel_delayed_work_sync(&adev->vcn.idle_work);
 
-	if (set_clocks) {
-		if (adev->pm.dpm_enabled) {
-			amdgpu_dpm_enable_uvd(adev, true);
-		} else {
-			amdgpu_asic_set_uvd_clocks(adev, 53300, 40000);
-		}
+	if (set_clocks && adev->pm.dpm_enabled) {
+		/* might be used when with pg/cg
+		amdgpu_dpm_enable_uvd(adev, true);
+		*/
 	}
 }
 
@@ -361,7 +359,7 @@ static int amdgpu_vcn_dec_get_create_msg(struct amdgpu_ring *ring, uint32_t hand
 			     AMDGPU_GEM_DOMAIN_VRAM,
 			     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
 			     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-			     NULL, NULL, &bo);
+			     NULL, NULL, 0, &bo);
 	if (r)
 		return r;
 
@@ -413,7 +411,7 @@ static int amdgpu_vcn_dec_get_destroy_msg(struct amdgpu_ring *ring, uint32_t han
 			     AMDGPU_GEM_DOMAIN_VRAM,
 			     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
 			     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-			     NULL, NULL, &bo);
+			     NULL, NULL, 0, &bo);
 	if (r)
 		return r;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vf_error.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vf_error.c
new file mode 100644
index 0000000..45ac918
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vf_error.c
@@ -0,0 +1,85 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#include "amdgpu.h"
+#include "amdgpu_vf_error.h"
+#include "mxgpu_ai.h"
+
+#define AMDGPU_VF_ERROR_ENTRY_SIZE    16
+
+/* struct amdgpu_vf_error_buffer - amdgpu VF error information. */
+struct amdgpu_vf_error_buffer {
+	int read_count;
+	int write_count;
+	uint16_t code[AMDGPU_VF_ERROR_ENTRY_SIZE];
+	uint16_t flags[AMDGPU_VF_ERROR_ENTRY_SIZE];
+	uint64_t data[AMDGPU_VF_ERROR_ENTRY_SIZE];
+};
+
+struct amdgpu_vf_error_buffer admgpu_vf_errors;
+
+
+void amdgpu_vf_error_put(uint16_t sub_error_code, uint16_t error_flags, uint64_t error_data)
+{
+	int index;
+	uint16_t error_code = AMDGIM_ERROR_CODE(AMDGIM_ERROR_CATEGORY_VF, sub_error_code);
+
+	index = admgpu_vf_errors.write_count % AMDGPU_VF_ERROR_ENTRY_SIZE;
+	admgpu_vf_errors.code[index] = error_code;
+	admgpu_vf_errors.flags[index] = error_flags;
+	admgpu_vf_errors.data[index] = error_data;
+	admgpu_vf_errors.write_count++;
+}
+
+
+void amdgpu_vf_error_trans_all(struct amdgpu_device *adev)
+{
+	/* u32 pf2vf_flags = 0; */
+	u32 data1, data2, data3;
+	int index;
+
+	if ((NULL == adev) || (!amdgpu_sriov_vf(adev)) || (!adev->virt.ops) || (!adev->virt.ops->trans_msg)) {
+		return;
+	}
+/*
+	TODO: Enable this code when pv2vf_info is merged
+	AMDGPU_FW_VRAM_PF2VF_READ (adev, feature_flags, &pf2vf_flags);
+	if (!(pf2vf_flags & AMDGIM_FEATURE_ERROR_LOG_COLLECT)) {
+		return;
+	}
+*/
+	/* Entries wrap around the array; if some were overwritten, skip read_count ahead to the oldest one left. */
+	if (admgpu_vf_errors.write_count - admgpu_vf_errors.read_count > AMDGPU_VF_ERROR_ENTRY_SIZE) {
+		admgpu_vf_errors.read_count = admgpu_vf_errors.write_count - AMDGPU_VF_ERROR_ENTRY_SIZE;
+	}
+
+	while (admgpu_vf_errors.read_count < admgpu_vf_errors.write_count) {
+		index = admgpu_vf_errors.read_count % AMDGPU_VF_ERROR_ENTRY_SIZE;
+		data1 = AMDGIM_ERROR_CODE_FLAGS_TO_MAILBOX (admgpu_vf_errors.code[index], admgpu_vf_errors.flags[index]);
+		data2 = admgpu_vf_errors.data[index] & 0xFFFFFFFF;
+		data3 = (admgpu_vf_errors.data[index] >> 32) & 0xFFFFFFFF;
+
+		adev->virt.ops->trans_msg(adev, IDH_LOG_VF_ERROR, data1, data2, data3);
+		admgpu_vf_errors.read_count++;
+	}
+}
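The VF error log added above behaves as a fixed-size ring buffer: writes wrap modulo AMDGPU_VF_ERROR_ENTRY_SIZE, and before draining, read_count is moved forward so that only the newest 16 entries are sent. The following is a minimal standalone sketch of that bookkeeping, not the driver code itself. The names vf_ring, vf_ring_put and vf_ring_drain are hypothetical, and a plain output array stands in for the trans_msg mailbox callback.

```c
#include <stdint.h>

#define RING_SIZE 16 /* mirrors AMDGPU_VF_ERROR_ENTRY_SIZE */

/* Hypothetical stand-in for struct amdgpu_vf_error_buffer (data field only). */
struct vf_ring {
	int read_count;
	int write_count;
	uint64_t data[RING_SIZE];
};

/* Store one entry; the slot index wraps, the counter keeps growing. */
static void vf_ring_put(struct vf_ring *r, uint64_t v)
{
	r->data[r->write_count % RING_SIZE] = v;
	r->write_count++;
}

/*
 * Copy all unread entries to out[] (at most RING_SIZE of them) and
 * return how many were copied. Entries that were overwritten before
 * being read are skipped, as in amdgpu_vf_error_trans_all().
 */
static int vf_ring_drain(struct vf_ring *r, uint64_t *out)
{
	int n = 0;

	if (r->write_count - r->read_count > RING_SIZE)
		r->read_count = r->write_count - RING_SIZE;

	while (r->read_count < r->write_count) {
		out[n++] = r->data[r->read_count % RING_SIZE];
		r->read_count++;
	}
	return n;
}
```

With 20 puts and no intermediate drain, the first four entries are lost and the drain returns entries 4 through 19 in order.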
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vf_error.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vf_error.h
new file mode 100644
index 0000000..2a3278e
--- /dev/null
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vf_error.h
@@ -0,0 +1,62 @@
+/*
+ * Copyright 2017 Advanced Micro Devices, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ */
+
+#ifndef __VF_ERROR_H__
+#define __VF_ERROR_H__
+
+#define AMDGIM_ERROR_CODE_FLAGS_TO_MAILBOX(c,f)    ((((c) & 0xFFFF) << 16) | ((f) & 0xFFFF))
+#define AMDGIM_ERROR_CODE(t,c)       ((((t) & 0xF) << 12) | ((c) & 0xFFF))
+
+/* Please keep enum same as AMD GIM driver */
+enum AMDGIM_ERROR_VF {
+	AMDGIM_ERROR_VF_ATOMBIOS_INIT_FAIL = 0,
+	AMDGIM_ERROR_VF_NO_VBIOS,
+	AMDGIM_ERROR_VF_GPU_POST_ERROR,
+	AMDGIM_ERROR_VF_ATOMBIOS_GET_CLOCK_FAIL,
+	AMDGIM_ERROR_VF_FENCE_INIT_FAIL,
+
+	AMDGIM_ERROR_VF_AMDGPU_INIT_FAIL,
+	AMDGIM_ERROR_VF_IB_INIT_FAIL,
+	AMDGIM_ERROR_VF_AMDGPU_LATE_INIT_FAIL,
+	AMDGIM_ERROR_VF_ASIC_RESUME_FAIL,
+	AMDGIM_ERROR_VF_GPU_RESET_FAIL,
+
+	AMDGIM_ERROR_VF_TEST,
+	AMDGIM_ERROR_VF_MAX
+};
+
+enum AMDGIM_ERROR_CATEGORY {
+	AMDGIM_ERROR_CATEGORY_NON_USED = 0,
+	AMDGIM_ERROR_CATEGORY_GIM,
+	AMDGIM_ERROR_CATEGORY_PF,
+	AMDGIM_ERROR_CATEGORY_VF,
+	AMDGIM_ERROR_CATEGORY_VBIOS,
+	AMDGIM_ERROR_CATEGORY_MONITOR,
+
+	AMDGIM_ERROR_CATEGORY_MAX
+};
+
+void amdgpu_vf_error_put(uint16_t sub_error_code, uint16_t error_flags, uint64_t error_data);
+void amdgpu_vf_error_trans_all(struct amdgpu_device *adev);
+
+#endif /* __VF_ERROR_H__ */
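The two macros in this header pack an error into mailbox words: AMDGIM_ERROR_CODE puts the category in bits 15:12 and the sub-code in bits 11:0, and AMDGIM_ERROR_CODE_FLAGS_TO_MAILBOX puts that 16-bit code in bits 31:16 with the flags in bits 15:0. As a rough illustration of the same layout, here is a sketch using typed helper functions. The names vf_error_code and vf_error_mailbox are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Same bit layout as AMDGIM_ERROR_CODE: category in 15:12, sub-code in 11:0. */
static uint16_t vf_error_code(unsigned int category, unsigned int sub_code)
{
	return (uint16_t)(((category & 0xF) << 12) | (sub_code & 0xFFF));
}

/* Same layout as AMDGIM_ERROR_CODE_FLAGS_TO_MAILBOX: code in 31:16, flags in 15:0. */
static uint32_t vf_error_mailbox(uint16_t code, uint16_t flags)
{
	/* Widen before shifting so the shift is done in unsigned 32-bit. */
	return ((uint32_t)code << 16) | flags;
}
```

For example, AMDGIM_ERROR_CATEGORY_VF is 3 and AMDGIM_ERROR_VF_GPU_RESET_FAIL is 9 in the enums above, so the code is 0x3009, and with flags of 1 the first mailbox word is 0x30090001.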
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
index 8a081e1..ab05121 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c
@@ -46,14 +46,14 @@ int amdgpu_allocate_static_csa(struct amdgpu_device *adev)
  * address within META_DATA init package to support SRIOV gfx preemption.
  */
 
-int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm)
+int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
+			  struct amdgpu_bo_va **bo_va)
 {
-	int r;
-	struct amdgpu_bo_va *bo_va;
 	struct ww_acquire_ctx ticket;
 	struct list_head list;
 	struct amdgpu_bo_list_entry pd;
 	struct ttm_validate_buffer csa_tv;
+	int r;
 
 	INIT_LIST_HEAD(&list);
 	INIT_LIST_HEAD(&csa_tv.head);
@@ -69,34 +69,33 @@ int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm)
 		return r;
 	}
 
-	bo_va = amdgpu_vm_bo_add(adev, vm, adev->virt.csa_obj);
-	if (!bo_va) {
+	*bo_va = amdgpu_vm_bo_add(adev, vm, adev->virt.csa_obj);
+	if (!*bo_va) {
 		ttm_eu_backoff_reservation(&ticket, &list);
 		DRM_ERROR("failed to create bo_va for static CSA\n");
 		return -ENOMEM;
 	}
 
-	r = amdgpu_vm_alloc_pts(adev, bo_va->vm, AMDGPU_CSA_VADDR,
-				   AMDGPU_CSA_SIZE);
+	r = amdgpu_vm_alloc_pts(adev, (*bo_va)->base.vm, AMDGPU_CSA_VADDR,
+				AMDGPU_CSA_SIZE);
 	if (r) {
 		DRM_ERROR("failed to allocate pts for static CSA, err=%d\n", r);
-		amdgpu_vm_bo_rmv(adev, bo_va);
+		amdgpu_vm_bo_rmv(adev, *bo_va);
 		ttm_eu_backoff_reservation(&ticket, &list);
 		return r;
 	}
 
-	r = amdgpu_vm_bo_map(adev, bo_va, AMDGPU_CSA_VADDR, 0,AMDGPU_CSA_SIZE,
-						AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE |
-						AMDGPU_PTE_EXECUTABLE);
+	r = amdgpu_vm_bo_map(adev, *bo_va, AMDGPU_CSA_VADDR, 0, AMDGPU_CSA_SIZE,
+			     AMDGPU_PTE_READABLE | AMDGPU_PTE_WRITEABLE |
+			     AMDGPU_PTE_EXECUTABLE);
 
 	if (r) {
 		DRM_ERROR("failed to do bo_map on static CSA, err=%d\n", r);
-		amdgpu_vm_bo_rmv(adev, bo_va);
+		amdgpu_vm_bo_rmv(adev, *bo_va);
 		ttm_eu_backoff_reservation(&ticket, &list);
 		return r;
 	}
 
-	vm->csa_bo_va = bo_va;
 	ttm_eu_backoff_reservation(&ticket, &list);
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
index 9e1062e..afcfb8b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h
@@ -43,6 +43,7 @@ struct amdgpu_virt_ops {
 	int (*req_full_gpu)(struct amdgpu_device *adev, bool init);
 	int (*rel_full_gpu)(struct amdgpu_device *adev, bool init);
 	int (*reset_gpu)(struct amdgpu_device *adev);
+	void (*trans_msg)(struct amdgpu_device *adev, u32 req, u32 data1, u32 data2, u32 data3);
 };
 
 /* GPU virtualization */
@@ -89,7 +90,8 @@ static inline bool is_virtual_machine(void)
 
 struct amdgpu_vm;
 int amdgpu_allocate_static_csa(struct amdgpu_device *adev);
-int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm);
+int amdgpu_map_static_csa(struct amdgpu_device *adev, struct amdgpu_vm *vm,
+			  struct amdgpu_bo_va **bo_va);
 void amdgpu_virt_init_setting(struct amdgpu_device *adev);
 uint32_t amdgpu_virt_kiq_rreg(struct amdgpu_device *adev, uint32_t reg);
 void amdgpu_virt_kiq_wreg(struct amdgpu_device *adev, uint32_t reg, uint32_t v);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
index 5795f81..6b1343e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c
@@ -77,8 +77,6 @@ struct amdgpu_pte_update_params {
 	void (*func)(struct amdgpu_pte_update_params *params, uint64_t pe,
 		     uint64_t addr, unsigned count, uint32_t incr,
 		     uint64_t flags);
-	/* indicate update pt or its shadow */
-	bool shadow;
 	/* The next two are used during VM update by CPU
 	 *  DMA addresses to use for mapping
 	 *  Kernel pointer of PD/PT BO that needs to be updated
@@ -161,11 +159,26 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
  */
 static int amdgpu_vm_validate_level(struct amdgpu_vm_pt *parent,
 				    int (*validate)(void *, struct amdgpu_bo *),
-				    void *param)
+				    void *param, bool use_cpu_for_update,
+				    struct ttm_bo_global *glob)
 {
 	unsigned i;
 	int r;
 
+	if (parent->bo->shadow) {
+		struct amdgpu_bo *shadow = parent->bo->shadow;
+
+		r = amdgpu_ttm_bind(&shadow->tbo, &shadow->tbo.mem);
+		if (r)
+			return r;
+	}
+
+	if (use_cpu_for_update) {
+		r = amdgpu_bo_kmap(parent->bo, NULL);
+		if (r)
+			return r;
+	}
+
 	if (!parent->entries)
 		return 0;
 
@@ -179,11 +192,18 @@ static int amdgpu_vm_validate_level(struct amdgpu_vm_pt *parent,
 		if (r)
 			return r;
 
+		spin_lock(&glob->lru_lock);
+		ttm_bo_move_to_lru_tail(&entry->bo->tbo);
+		if (entry->bo->shadow)
+			ttm_bo_move_to_lru_tail(&entry->bo->shadow->tbo);
+		spin_unlock(&glob->lru_lock);
+
 		/*
 		 * Recurse into the sub directory. This is harmless because we
 		 * have only a maximum of 5 layers.
 		 */
-		r = amdgpu_vm_validate_level(entry, validate, param);
+		r = amdgpu_vm_validate_level(entry, validate, param,
+					     use_cpu_for_update, glob);
 		if (r)
 			return r;
 	}
@@ -214,54 +234,12 @@ int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	if (num_evictions == vm->last_eviction_counter)
 		return 0;
 
-	return amdgpu_vm_validate_level(&vm->root, validate, param);
+	return amdgpu_vm_validate_level(&vm->root, validate, param,
+					vm->use_cpu_for_update,
+					adev->mman.bdev.glob);
 }
 
 /**
- * amdgpu_vm_move_level_in_lru - move one level of PT BOs to the LRU tail
- *
- * @adev: amdgpu device instance
- * @vm: vm providing the BOs
- *
- * Move the PT BOs to the tail of the LRU.
- */
-static void amdgpu_vm_move_level_in_lru(struct amdgpu_vm_pt *parent)
-{
-	unsigned i;
-
-	if (!parent->entries)
-		return;
-
-	for (i = 0; i <= parent->last_entry_used; ++i) {
-		struct amdgpu_vm_pt *entry = &parent->entries[i];
-
-		if (!entry->bo)
-			continue;
-
-		ttm_bo_move_to_lru_tail(&entry->bo->tbo);
-		amdgpu_vm_move_level_in_lru(entry);
-	}
-}
-
-/**
- * amdgpu_vm_move_pt_bos_in_lru - move the PT BOs to the LRU tail
- *
- * @adev: amdgpu device instance
- * @vm: vm providing the BOs
- *
- * Move the PT BOs to the tail of the LRU.
- */
-void amdgpu_vm_move_pt_bos_in_lru(struct amdgpu_device *adev,
-				  struct amdgpu_vm *vm)
-{
-	struct ttm_bo_global *glob = adev->mman.bdev.glob;
-
-	spin_lock(&glob->lru_lock);
-	amdgpu_vm_move_level_in_lru(&vm->root);
-	spin_unlock(&glob->lru_lock);
-}
-
- /**
  * amdgpu_vm_alloc_levels - allocate the PD/PT levels
  *
  * @adev: amdgpu_device pointer
@@ -282,6 +260,7 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
 	unsigned pt_idx, from, to;
 	int r;
 	u64 flags;
+	uint64_t init_value = 0;
 
 	if (!parent->entries) {
 		unsigned num_entries = amdgpu_vm_num_entries(adev, level);
@@ -315,6 +294,12 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
 		flags |= (AMDGPU_GEM_CREATE_NO_CPU_ACCESS |
 				AMDGPU_GEM_CREATE_SHADOW);
 
+	if (vm->pte_support_ats) {
+		init_value = AMDGPU_PTE_SYSTEM;
+		if (level != adev->vm_manager.num_level - 1)
+			init_value |= AMDGPU_PDE_PTE;
+	}
+
 	/* walk over the address space and allocate the page tables */
 	for (pt_idx = from; pt_idx <= to; ++pt_idx) {
 		struct reservation_object *resv = vm->root.bo->tbo.resv;
@@ -327,10 +312,18 @@ static int amdgpu_vm_alloc_levels(struct amdgpu_device *adev,
 					     AMDGPU_GPU_PAGE_SIZE, true,
 					     AMDGPU_GEM_DOMAIN_VRAM,
 					     flags,
-					     NULL, resv, &pt);
+					     NULL, resv, init_value, &pt);
 			if (r)
 				return r;
 
+			if (vm->use_cpu_for_update) {
+				r = amdgpu_bo_kmap(pt, NULL);
+				if (r) {
+					amdgpu_bo_unref(&pt);
+					return r;
+				}
+			}
+
 			/* Keep a reference to the root directory to avoid
 			* freeing them up in the wrong order.
 			*/
@@ -424,7 +417,7 @@ static int amdgpu_vm_grab_reserved_vmid_locked(struct amdgpu_vm *vm,
 	struct dma_fence *updates = sync->last_vm_update;
 	int r = 0;
 	struct dma_fence *flushed, *tmp;
-	bool needs_flush = false;
+	bool needs_flush = vm->use_cpu_for_update;
 
 	flushed  = id->flushed_updates;
 	if ((amdgpu_vm_had_gpu_reset(adev, id)) ||
@@ -545,11 +538,11 @@ int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 	}
 	kfree(fences);
 
-	job->vm_needs_flush = false;
+	job->vm_needs_flush = vm->use_cpu_for_update;
 	/* Check if we can use a VMID already assigned to this VM */
 	list_for_each_entry_reverse(id, &id_mgr->ids_lru, list) {
 		struct dma_fence *flushed;
-		bool needs_flush = false;
+		bool needs_flush = vm->use_cpu_for_update;
 
 		/* Check all the prerequisites to using this VMID */
 		if (amdgpu_vm_had_gpu_reset(adev, id))
@@ -745,7 +738,7 @@ static bool amdgpu_vm_is_large_bar(struct amdgpu_device *adev)
  *
  * Emit a VM flush when it is necessary.
  */
-int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job)
+int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, bool need_pipe_sync)
 {
 	struct amdgpu_device *adev = ring->adev;
 	unsigned vmhub = ring->funcs->vmhub;
@@ -767,12 +760,15 @@ int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job)
 		vm_flush_needed = true;
 	}
 
-	if (!vm_flush_needed && !gds_switch_needed)
+	if (!vm_flush_needed && !gds_switch_needed && !need_pipe_sync)
 		return 0;
 
 	if (ring->funcs->init_cond_exec)
 		patch_offset = amdgpu_ring_init_cond_exec(ring);
 
+	if (need_pipe_sync)
+		amdgpu_ring_emit_pipeline_sync(ring);
+
 	if (ring->funcs->emit_vm_flush && vm_flush_needed) {
 		struct dma_fence *fence;
 
@@ -874,8 +870,8 @@ struct amdgpu_bo_va *amdgpu_vm_bo_find(struct amdgpu_vm *vm,
 {
 	struct amdgpu_bo_va *bo_va;
 
-	list_for_each_entry(bo_va, &bo->va, bo_list) {
-		if (bo_va->vm == vm) {
+	list_for_each_entry(bo_va, &bo->va, base.bo_list) {
+		if (bo_va->base.vm == vm) {
 			return bo_va;
 		}
 	}
@@ -981,6 +977,8 @@ static void amdgpu_vm_cpu_set_ptes(struct amdgpu_pte_update_params *params,
 	unsigned int i;
 	uint64_t value;
 
+	trace_amdgpu_vm_set_ptes(pe, addr, count, incr, flags);
+
 	for (i = 0; i < count; i++) {
 		value = params->pages_addr ?
 			amdgpu_vm_map_gart(params->pages_addr, addr) :
@@ -989,19 +987,16 @@ static void amdgpu_vm_cpu_set_ptes(struct amdgpu_pte_update_params *params,
 					i, value, flags);
 		addr += incr;
 	}
-
-	/* Flush HDP */
-	mb();
-	amdgpu_gart_flush_gpu_tlb(params->adev, 0);
 }
 
-static int amdgpu_vm_bo_wait(struct amdgpu_device *adev, struct amdgpu_bo *bo)
+static int amdgpu_vm_wait_pd(struct amdgpu_device *adev, struct amdgpu_vm *vm,
+			     void *owner)
 {
 	struct amdgpu_sync sync;
 	int r;
 
 	amdgpu_sync_create(&sync);
-	amdgpu_sync_resv(adev, &sync, bo->tbo.resv, AMDGPU_FENCE_OWNER_VM);
+	amdgpu_sync_resv(adev, &sync, vm->root.bo->tbo.resv, owner);
 	r = amdgpu_sync_wait(&sync, true);
 	amdgpu_sync_free(&sync);
 
@@ -1042,23 +1037,14 @@ static int amdgpu_vm_update_level(struct amdgpu_device *adev,
 	params.adev = adev;
 	shadow = parent->bo->shadow;
 
-	WARN_ON(vm->use_cpu_for_update && shadow);
-	if (vm->use_cpu_for_update && !shadow) {
-		r = amdgpu_bo_kmap(parent->bo, (void **)&pd_addr);
-		if (r)
+	if (vm->use_cpu_for_update) {
+		pd_addr = (unsigned long)amdgpu_bo_kptr(parent->bo);
+		r = amdgpu_vm_wait_pd(adev, vm, AMDGPU_FENCE_OWNER_VM);
+		if (unlikely(r))
 			return r;
-		r = amdgpu_vm_bo_wait(adev, parent->bo);
-		if (unlikely(r)) {
-			amdgpu_bo_kunmap(parent->bo);
-			return r;
-		}
+
 		params.func = amdgpu_vm_cpu_set_ptes;
 	} else {
-		if (shadow) {
-			r = amdgpu_ttm_bind(&shadow->tbo, &shadow->tbo.mem);
-			if (r)
-				return r;
-		}
 		ring = container_of(vm->entity.sched, struct amdgpu_ring,
 				    sched);
 
@@ -1094,21 +1080,14 @@ static int amdgpu_vm_update_level(struct amdgpu_device *adev,
 		if (bo == NULL)
 			continue;
 
-		if (bo->shadow) {
-			struct amdgpu_bo *pt_shadow = bo->shadow;
-
-			r = amdgpu_ttm_bind(&pt_shadow->tbo,
-					    &pt_shadow->tbo.mem);
-			if (r)
-				return r;
-		}
-
 		pt = amdgpu_bo_gpu_offset(bo);
 		pt = amdgpu_gart_get_vm_pde(adev, pt);
-		if (parent->entries[pt_idx].addr == pt)
+		/* Don't update huge pages here */
+		if ((parent->entries[pt_idx].addr & AMDGPU_PDE_PTE) ||
+		    parent->entries[pt_idx].addr == (pt | AMDGPU_PTE_VALID))
 			continue;
 
-		parent->entries[pt_idx].addr = pt;
+		parent->entries[pt_idx].addr = pt | AMDGPU_PTE_VALID;
 
 		pde = pd_addr + pt_idx * 8;
 		if (((last_pde + 8 * count) != pde) ||
@@ -1146,28 +1125,29 @@ static int amdgpu_vm_update_level(struct amdgpu_device *adev,
 			    count, incr, AMDGPU_PTE_VALID);
 	}
 
-	if (params.func == amdgpu_vm_cpu_set_ptes)
-		amdgpu_bo_kunmap(parent->bo);
-	else if (params.ib->length_dw == 0) {
-		amdgpu_job_free(job);
-	} else {
-		amdgpu_ring_pad_ib(ring, params.ib);
-		amdgpu_sync_resv(adev, &job->sync, parent->bo->tbo.resv,
-				 AMDGPU_FENCE_OWNER_VM);
-		if (shadow)
-			amdgpu_sync_resv(adev, &job->sync, shadow->tbo.resv,
+	if (!vm->use_cpu_for_update) {
+		if (params.ib->length_dw == 0) {
+			amdgpu_job_free(job);
+		} else {
+			amdgpu_ring_pad_ib(ring, params.ib);
+			amdgpu_sync_resv(adev, &job->sync, parent->bo->tbo.resv,
 					 AMDGPU_FENCE_OWNER_VM);
+			if (shadow)
+				amdgpu_sync_resv(adev, &job->sync,
+						 shadow->tbo.resv,
+						 AMDGPU_FENCE_OWNER_VM);
 
-		WARN_ON(params.ib->length_dw > ndw);
-		r = amdgpu_job_submit(job, ring, &vm->entity,
-				AMDGPU_FENCE_OWNER_VM, &fence);
-		if (r)
-			goto error_free;
+			WARN_ON(params.ib->length_dw > ndw);
+			r = amdgpu_job_submit(job, ring, &vm->entity,
+					AMDGPU_FENCE_OWNER_VM, &fence);
+			if (r)
+				goto error_free;
 
-		amdgpu_bo_fence(parent->bo, fence, true);
-		dma_fence_put(vm->last_dir_update);
-		vm->last_dir_update = dma_fence_get(fence);
-		dma_fence_put(fence);
+			amdgpu_bo_fence(parent->bo, fence, true);
+			dma_fence_put(vm->last_dir_update);
+			vm->last_dir_update = dma_fence_get(fence);
+			dma_fence_put(fence);
+		}
 	}
 	/*
 	 * Recurse into the subdirectories. This recursion is harmless because
@@ -1235,33 +1215,98 @@ int amdgpu_vm_update_directories(struct amdgpu_device *adev,
 	if (r)
 		amdgpu_vm_invalidate_level(&vm->root);
 
+	if (vm->use_cpu_for_update) {
+		/* Flush HDP */
+		mb();
+		amdgpu_gart_flush_gpu_tlb(adev, 0);
+	}
+
 	return r;
 }
 
 /**
- * amdgpu_vm_find_pt - find the page table for an address
+ * amdgpu_vm_get_entry - find the entry for an address
  *
  * @p: see amdgpu_pte_update_params definition
  * @addr: virtual address in question
+ * @entry: resulting entry or NULL
+ * @parent: parent entry
  *
- * Find the page table BO for a virtual address, return NULL when none found.
+ * Find the vm_pt entry and its parent for the given address.
  */
-static struct amdgpu_bo *amdgpu_vm_get_pt(struct amdgpu_pte_update_params *p,
-					  uint64_t addr)
+void amdgpu_vm_get_entry(struct amdgpu_pte_update_params *p, uint64_t addr,
+			 struct amdgpu_vm_pt **entry,
+			 struct amdgpu_vm_pt **parent)
 {
-	struct amdgpu_vm_pt *entry = &p->vm->root;
 	unsigned idx, level = p->adev->vm_manager.num_level;
 
-	while (entry->entries) {
+	*parent = NULL;
+	*entry = &p->vm->root;
+	while ((*entry)->entries) {
 		idx = addr >> (p->adev->vm_manager.block_size * level--);
-		idx %= amdgpu_bo_size(entry->bo) / 8;
-		entry = &entry->entries[idx];
+		idx %= amdgpu_bo_size((*entry)->bo) / 8;
+		*parent = *entry;
+		*entry = &(*entry)->entries[idx];
 	}
 
 	if (level)
-		return NULL;
+		*entry = NULL;
+}
 
-	return entry->bo;
+/**
+ * amdgpu_vm_handle_huge_pages - handle updating the PD with huge pages
+ *
+ * @p: see amdgpu_pte_update_params definition
+ * @entry: vm_pt entry to check
+ * @parent: parent entry
+ * @nptes: number of PTEs updated with this operation
+ * @dst: destination address where the PTEs should point to
+ * @flags: access flags for the PTEs
+ *
+ * Check if we can update the PD with a huge page.
+ */
+static void amdgpu_vm_handle_huge_pages(struct amdgpu_pte_update_params *p,
+					struct amdgpu_vm_pt *entry,
+					struct amdgpu_vm_pt *parent,
+					unsigned nptes, uint64_t dst,
+					uint64_t flags)
+{
+	bool use_cpu_update = (p->func == amdgpu_vm_cpu_set_ptes);
+	uint64_t pd_addr, pde;
+
+	/* In the case of a mixed PT the PDE must point to it */
+	if (p->adev->asic_type < CHIP_VEGA10 ||
+	    nptes != AMDGPU_VM_PTE_COUNT(p->adev) ||
+	    p->func == amdgpu_vm_do_copy_ptes ||
+	    !(flags & AMDGPU_PTE_VALID)) {
+
+		dst = amdgpu_bo_gpu_offset(entry->bo);
+		dst = amdgpu_gart_get_vm_pde(p->adev, dst);
+		flags = AMDGPU_PTE_VALID;
+	} else {
+		/* Set the huge page flag to stop scanning at this PDE */
+		flags |= AMDGPU_PDE_PTE;
+	}
+
+	if (entry->addr == (dst | flags))
+		return;
+
+	entry->addr = (dst | flags);
+
+	if (use_cpu_update) {
+		pd_addr = (unsigned long)amdgpu_bo_kptr(parent->bo);
+		pde = pd_addr + (entry - parent->entries) * 8;
+		amdgpu_vm_cpu_set_ptes(p, pde, dst, 1, 0, flags);
+	} else {
+		if (parent->bo->shadow) {
+			pd_addr = amdgpu_bo_gpu_offset(parent->bo->shadow);
+			pde = pd_addr + (entry - parent->entries) * 8;
+			amdgpu_vm_do_set_ptes(p, pde, dst, 1, 0, flags);
+		}
+		pd_addr = amdgpu_bo_gpu_offset(parent->bo);
+		pde = pd_addr + (entry - parent->entries) * 8;
+		amdgpu_vm_do_set_ptes(p, pde, dst, 1, 0, flags);
+	}
 }
 
 /**
@@ -1287,49 +1332,44 @@ static int amdgpu_vm_update_ptes(struct amdgpu_pte_update_params *params,
 	uint64_t addr, pe_start;
 	struct amdgpu_bo *pt;
 	unsigned nptes;
-	int r;
 	bool use_cpu_update = (params->func == amdgpu_vm_cpu_set_ptes);
 
-
 	/* walk over the address space and update the page tables */
-	for (addr = start; addr < end; addr += nptes) {
-		pt = amdgpu_vm_get_pt(params, addr);
-		if (!pt) {
-			pr_err("PT not found, aborting update_ptes\n");
-			return -EINVAL;
-		}
+	for (addr = start; addr < end; addr += nptes,
+	     dst += nptes * AMDGPU_GPU_PAGE_SIZE) {
+		struct amdgpu_vm_pt *entry, *parent;
 
-		if (params->shadow) {
-			if (WARN_ONCE(use_cpu_update,
-				"CPU VM update doesn't suuport shadow pages"))
-				return 0;
-
-			if (!pt->shadow)
-				return 0;
-			pt = pt->shadow;
-		}
+		amdgpu_vm_get_entry(params, addr, &entry, &parent);
+		if (!entry)
+			return -ENOENT;
 
 		if ((addr & ~mask) == (end & ~mask))
 			nptes = end - addr;
 		else
 			nptes = AMDGPU_VM_PTE_COUNT(adev) - (addr & mask);
 
+		amdgpu_vm_handle_huge_pages(params, entry, parent,
+					    nptes, dst, flags);
+		/* We don't need to update PTEs for huge pages */
+		if (entry->addr & AMDGPU_PDE_PTE)
+			continue;
+
+		pt = entry->bo;
 		if (use_cpu_update) {
-			r = amdgpu_bo_kmap(pt, (void *)&pe_start);
-			if (r)
-				return r;
-		} else
+			pe_start = (unsigned long)amdgpu_bo_kptr(pt);
+		} else {
+			if (pt->shadow) {
+				pe_start = amdgpu_bo_gpu_offset(pt->shadow);
+				pe_start += (addr & mask) * 8;
+				params->func(params, pe_start, dst, nptes,
+					     AMDGPU_GPU_PAGE_SIZE, flags);
+			}
 			pe_start = amdgpu_bo_gpu_offset(pt);
+		}
 
 		pe_start += (addr & mask) * 8;
-
 		params->func(params, pe_start, dst, nptes,
 			     AMDGPU_GPU_PAGE_SIZE, flags);
-
-		dst += nptes * AMDGPU_GPU_PAGE_SIZE;
-
-		if (use_cpu_update)
-			amdgpu_bo_kunmap(pt);
 	}
 
 	return 0;
@@ -1370,10 +1410,9 @@ static int amdgpu_vm_frag_ptes(struct amdgpu_pte_update_params	*params,
 	 * Userspace can support this by aligning virtual base address and
 	 * allocation size to the fragment size.
 	 */
-
-	/* SI and newer are optimized for 64KB */
-	uint64_t frag_flags = AMDGPU_PTE_FRAG(AMDGPU_LOG2_PAGES_PER_FRAG);
-	uint64_t frag_align = 1 << AMDGPU_LOG2_PAGES_PER_FRAG;
+	unsigned pages_per_frag = params->adev->vm_manager.fragment_size;
+	uint64_t frag_flags = AMDGPU_PTE_FRAG(pages_per_frag);
+	uint64_t frag_align = 1 << pages_per_frag;
 
 	uint64_t frag_start = ALIGN(start, frag_align);
 	uint64_t frag_end = end & ~(frag_align - 1);
@@ -1445,6 +1484,10 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	params.vm = vm;
 	params.src = src;
 
+	/* sync to everything on unmapping */
+	if (!(flags & AMDGPU_PTE_VALID))
+		owner = AMDGPU_FENCE_OWNER_UNDEFINED;
+
 	if (vm->use_cpu_for_update) {
 		/* params.src is used as flag to indicate system Memory */
 		if (pages_addr)
@@ -1453,23 +1496,18 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 		/* Wait for PT BOs to be free. PTs share the same resv. object
 		 * as the root PD BO
 		 */
-		r = amdgpu_vm_bo_wait(adev, vm->root.bo);
+		r = amdgpu_vm_wait_pd(adev, vm, owner);
 		if (unlikely(r))
 			return r;
 
 		params.func = amdgpu_vm_cpu_set_ptes;
 		params.pages_addr = pages_addr;
-		params.shadow = false;
 		return amdgpu_vm_frag_ptes(&params, start, last + 1,
 					   addr, flags);
 	}
 
 	ring = container_of(vm->entity.sched, struct amdgpu_ring, sched);
 
-	/* sync to everything on unmapping */
-	if (!(flags & AMDGPU_PTE_VALID))
-		owner = AMDGPU_FENCE_OWNER_UNDEFINED;
-
 	nptes = last - start + 1;
 
 	/*
@@ -1481,6 +1519,9 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	/* padding, etc. */
 	ndw = 64;
 
+	/* one PDE write for each huge page */
+	ndw += ((nptes >> adev->vm_manager.block_size) + 1) * 6;
+
 	if (src) {
 		/* only copy commands needed */
 		ndw += ncmds * 7;
@@ -1542,11 +1583,6 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 	if (r)
 		goto error_free;
 
-	params.shadow = true;
-	r = amdgpu_vm_frag_ptes(&params, start, last + 1, addr, flags);
-	if (r)
-		goto error_free;
-	params.shadow = false;
 	r = amdgpu_vm_frag_ptes(&params, start, last + 1, addr, flags);
 	if (r)
 		goto error_free;
@@ -1565,6 +1601,7 @@ static int amdgpu_vm_bo_update_mapping(struct amdgpu_device *adev,
 
 error_free:
 	amdgpu_job_free(job);
+	amdgpu_vm_invalidate_level(&vm->root);
 	return r;
 }
 
@@ -1687,7 +1724,8 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 			struct amdgpu_bo_va *bo_va,
 			bool clear)
 {
-	struct amdgpu_vm *vm = bo_va->vm;
+	struct amdgpu_bo *bo = bo_va->base.bo;
+	struct amdgpu_vm *vm = bo_va->base.vm;
 	struct amdgpu_bo_va_mapping *mapping;
 	dma_addr_t *pages_addr = NULL;
 	uint64_t gtt_flags, flags;
@@ -1696,27 +1734,27 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 	struct dma_fence *exclusive;
 	int r;
 
-	if (clear || !bo_va->bo) {
+	if (clear || !bo_va->base.bo) {
 		mem = NULL;
 		nodes = NULL;
 		exclusive = NULL;
 	} else {
 		struct ttm_dma_tt *ttm;
 
-		mem = &bo_va->bo->tbo.mem;
+		mem = &bo_va->base.bo->tbo.mem;
 		nodes = mem->mm_node;
 		if (mem->mem_type == TTM_PL_TT) {
-			ttm = container_of(bo_va->bo->tbo.ttm, struct
-					   ttm_dma_tt, ttm);
+			ttm = container_of(bo_va->base.bo->tbo.ttm,
+					   struct ttm_dma_tt, ttm);
 			pages_addr = ttm->dma_address;
 		}
-		exclusive = reservation_object_get_excl(bo_va->bo->tbo.resv);
+		exclusive = reservation_object_get_excl(bo->tbo.resv);
 	}
 
-	if (bo_va->bo) {
-		flags = amdgpu_ttm_tt_pte_flags(adev, bo_va->bo->tbo.ttm, mem);
-		gtt_flags = (amdgpu_ttm_is_bound(bo_va->bo->tbo.ttm) &&
-			adev == amdgpu_ttm_adev(bo_va->bo->tbo.bdev)) ?
+	if (bo) {
+		flags = amdgpu_ttm_tt_pte_flags(adev, bo->tbo.ttm, mem);
+		gtt_flags = (amdgpu_ttm_is_bound(bo->tbo.ttm) &&
+			adev == amdgpu_ttm_adev(bo->tbo.bdev)) ?
 			flags : 0;
 	} else {
 		flags = 0x0;
@@ -1724,7 +1762,7 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 	}
 
 	spin_lock(&vm->status_lock);
-	if (!list_empty(&bo_va->vm_status))
+	if (!list_empty(&bo_va->base.vm_status))
 		list_splice_init(&bo_va->valids, &bo_va->invalids);
 	spin_unlock(&vm->status_lock);
 
@@ -1747,11 +1785,17 @@ int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 
 	spin_lock(&vm->status_lock);
 	list_splice_init(&bo_va->invalids, &bo_va->valids);
-	list_del_init(&bo_va->vm_status);
+	list_del_init(&bo_va->base.vm_status);
 	if (clear)
-		list_add(&bo_va->vm_status, &vm->cleared);
+		list_add(&bo_va->base.vm_status, &vm->cleared);
 	spin_unlock(&vm->status_lock);
 
+	if (vm->use_cpu_for_update) {
+		/* Flush HDP */
+		mb();
+		amdgpu_gart_flush_gpu_tlb(adev, 0);
+	}
+
 	return 0;
 }
 
@@ -1905,15 +1949,19 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 	struct amdgpu_bo_va_mapping *mapping;
 	struct dma_fence *f = NULL;
 	int r;
+	uint64_t init_pte_value = 0;
 
 	while (!list_empty(&vm->freed)) {
 		mapping = list_first_entry(&vm->freed,
 			struct amdgpu_bo_va_mapping, list);
 		list_del(&mapping->list);
 
+		if (vm->pte_support_ats)
+			init_pte_value = AMDGPU_PTE_SYSTEM;
+
 		r = amdgpu_vm_bo_update_mapping(adev, NULL, 0, NULL, vm,
 						mapping->start, mapping->last,
-						0, 0, &f);
+						init_pte_value, 0, &f);
 		amdgpu_vm_free_mapping(adev, vm, mapping, f);
 		if (r) {
 			dma_fence_put(f);
@@ -1933,26 +1981,26 @@ int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 }
 
 /**
- * amdgpu_vm_clear_invalids - clear invalidated BOs in the PT
+ * amdgpu_vm_clear_moved - clear moved BOs in the PT
  *
  * @adev: amdgpu_device pointer
  * @vm: requested vm
  *
- * Make sure all invalidated BOs are cleared in the PT.
+ * Make sure all moved BOs are cleared in the PT.
  * Returns 0 for success.
  *
  * PTs have to be reserved and mutex must be locked!
  */
-int amdgpu_vm_clear_invalids(struct amdgpu_device *adev,
-			     struct amdgpu_vm *vm, struct amdgpu_sync *sync)
+int amdgpu_vm_clear_moved(struct amdgpu_device *adev, struct amdgpu_vm *vm,
+			    struct amdgpu_sync *sync)
 {
 	struct amdgpu_bo_va *bo_va = NULL;
 	int r = 0;
 
 	spin_lock(&vm->status_lock);
-	while (!list_empty(&vm->invalidated)) {
-		bo_va = list_first_entry(&vm->invalidated,
-			struct amdgpu_bo_va, vm_status);
+	while (!list_empty(&vm->moved)) {
+		bo_va = list_first_entry(&vm->moved,
+			struct amdgpu_bo_va, base.vm_status);
 		spin_unlock(&vm->status_lock);
 
 		r = amdgpu_vm_bo_update(adev, bo_va, true);
@@ -1992,16 +2040,17 @@ struct amdgpu_bo_va *amdgpu_vm_bo_add(struct amdgpu_device *adev,
 	if (bo_va == NULL) {
 		return NULL;
 	}
-	bo_va->vm = vm;
-	bo_va->bo = bo;
+	bo_va->base.vm = vm;
+	bo_va->base.bo = bo;
+	INIT_LIST_HEAD(&bo_va->base.bo_list);
+	INIT_LIST_HEAD(&bo_va->base.vm_status);
+
 	bo_va->ref_count = 1;
-	INIT_LIST_HEAD(&bo_va->bo_list);
 	INIT_LIST_HEAD(&bo_va->valids);
 	INIT_LIST_HEAD(&bo_va->invalids);
-	INIT_LIST_HEAD(&bo_va->vm_status);
 
 	if (bo)
-		list_add_tail(&bo_va->bo_list, &bo->va);
+		list_add_tail(&bo_va->base.bo_list, &bo->va);
 
 	return bo_va;
 }
@@ -2026,7 +2075,8 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
 		     uint64_t size, uint64_t flags)
 {
 	struct amdgpu_bo_va_mapping *mapping, *tmp;
-	struct amdgpu_vm *vm = bo_va->vm;
+	struct amdgpu_bo *bo = bo_va->base.bo;
+	struct amdgpu_vm *vm = bo_va->base.vm;
 	uint64_t eaddr;
 
 	/* validate the parameters */
@@ -2037,7 +2087,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
 	/* make sure object fit at this offset */
 	eaddr = saddr + size - 1;
 	if (saddr >= eaddr ||
-	    (bo_va->bo && offset + size > amdgpu_bo_size(bo_va->bo)))
+	    (bo && offset + size > amdgpu_bo_size(bo)))
 		return -EINVAL;
 
 	saddr /= AMDGPU_GPU_PAGE_SIZE;
@@ -2047,7 +2097,7 @@ int amdgpu_vm_bo_map(struct amdgpu_device *adev,
 	if (tmp) {
 		/* bo and tmp overlap, invalid addr */
 		dev_err(adev->dev, "bo %p va 0x%010Lx-0x%010Lx conflict with "
-			"0x%010Lx-0x%010Lx\n", bo_va->bo, saddr, eaddr,
+			"0x%010Lx-0x%010Lx\n", bo, saddr, eaddr,
 			tmp->start, tmp->last + 1);
 		return -EINVAL;
 	}
@@ -2092,7 +2142,8 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
 			     uint64_t size, uint64_t flags)
 {
 	struct amdgpu_bo_va_mapping *mapping;
-	struct amdgpu_vm *vm = bo_va->vm;
+	struct amdgpu_bo *bo = bo_va->base.bo;
+	struct amdgpu_vm *vm = bo_va->base.vm;
 	uint64_t eaddr;
 	int r;
 
@@ -2104,7 +2155,7 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
 	/* make sure object fit at this offset */
 	eaddr = saddr + size - 1;
 	if (saddr >= eaddr ||
-	    (bo_va->bo && offset + size > amdgpu_bo_size(bo_va->bo)))
+	    (bo && offset + size > amdgpu_bo_size(bo)))
 		return -EINVAL;
 
 	/* Allocate all the needed memory */
@@ -2112,7 +2163,7 @@ int amdgpu_vm_bo_replace_map(struct amdgpu_device *adev,
 	if (!mapping)
 		return -ENOMEM;
 
-	r = amdgpu_vm_bo_clear_mappings(adev, bo_va->vm, saddr, size);
+	r = amdgpu_vm_bo_clear_mappings(adev, bo_va->base.vm, saddr, size);
 	if (r) {
 		kfree(mapping);
 		return r;
@@ -2152,7 +2203,7 @@ int amdgpu_vm_bo_unmap(struct amdgpu_device *adev,
 		       uint64_t saddr)
 {
 	struct amdgpu_bo_va_mapping *mapping;
-	struct amdgpu_vm *vm = bo_va->vm;
+	struct amdgpu_vm *vm = bo_va->base.vm;
 	bool valid = true;
 
 	saddr /= AMDGPU_GPU_PAGE_SIZE;
@@ -2300,12 +2351,12 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
 		      struct amdgpu_bo_va *bo_va)
 {
 	struct amdgpu_bo_va_mapping *mapping, *next;
-	struct amdgpu_vm *vm = bo_va->vm;
+	struct amdgpu_vm *vm = bo_va->base.vm;
 
-	list_del(&bo_va->bo_list);
+	list_del(&bo_va->base.bo_list);
 
 	spin_lock(&vm->status_lock);
-	list_del(&bo_va->vm_status);
+	list_del(&bo_va->base.vm_status);
 	spin_unlock(&vm->status_lock);
 
 	list_for_each_entry_safe(mapping, next, &bo_va->valids, list) {
@@ -2337,13 +2388,14 @@ void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
 void amdgpu_vm_bo_invalidate(struct amdgpu_device *adev,
 			     struct amdgpu_bo *bo)
 {
-	struct amdgpu_bo_va *bo_va;
+	struct amdgpu_vm_bo_base *bo_base;
 
-	list_for_each_entry(bo_va, &bo->va, bo_list) {
-		spin_lock(&bo_va->vm->status_lock);
-		if (list_empty(&bo_va->vm_status))
-			list_add(&bo_va->vm_status, &bo_va->vm->invalidated);
-		spin_unlock(&bo_va->vm->status_lock);
+	list_for_each_entry(bo_base, &bo->va, bo_list) {
+		spin_lock(&bo_base->vm->status_lock);
+		if (list_empty(&bo_base->vm_status))
+			list_add(&bo_base->vm_status,
+				 &bo_base->vm->moved);
+		spin_unlock(&bo_base->vm->status_lock);
 	}
 }
 
@@ -2361,12 +2413,26 @@ static uint32_t amdgpu_vm_get_block_size(uint64_t vm_size)
 }
 
 /**
- * amdgpu_vm_adjust_size - adjust vm size and block size
+ * amdgpu_vm_set_fragment_size - adjust fragment size in PTE
+ *
+ * @adev: amdgpu_device pointer
+ * @fragment_size_default: default fragment size, used when amdgpu_vm_fragment_size is -1 (auto)
+ */
+void amdgpu_vm_set_fragment_size(struct amdgpu_device *adev, uint32_t fragment_size_default)
+{
+	if (amdgpu_vm_fragment_size == -1)
+		adev->vm_manager.fragment_size = fragment_size_default;
+	else
+		adev->vm_manager.fragment_size = amdgpu_vm_fragment_size;
+}
+
+/**
+ * amdgpu_vm_adjust_size - adjust vm size, block size and fragment size
  *
  * @adev: amdgpu_device pointer
  * @vm_size: the default vm size if it's set auto
  */
-void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size)
+void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size, uint32_t fragment_size_default)
 {
 	/* adjust vm size firstly */
 	if (amdgpu_vm_size == -1)
@@ -2381,8 +2447,11 @@ void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size)
 	else
 		adev->vm_manager.block_size = amdgpu_vm_block_size;
 
-	DRM_INFO("vm size is %llu GB, block size is %u-bit\n",
-		adev->vm_manager.vm_size, adev->vm_manager.block_size);
+	amdgpu_vm_set_fragment_size(adev, fragment_size_default);
+
+	DRM_INFO("vm size is %llu GB, block size is %u-bit, fragment size is %u-bit\n",
+		adev->vm_manager.vm_size, adev->vm_manager.block_size,
+		adev->vm_manager.fragment_size);
 }
 
 /**
@@ -2404,13 +2473,14 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	struct amd_sched_rq *rq;
 	int r, i;
 	u64 flags;
+	uint64_t init_pde_value = 0;
 
 	vm->va = RB_ROOT;
 	vm->client_id = atomic64_inc_return(&adev->vm_manager.client_counter);
 	for (i = 0; i < AMDGPU_MAX_VMHUBS; i++)
 		vm->reserved_vmid[i] = NULL;
 	spin_lock_init(&vm->status_lock);
-	INIT_LIST_HEAD(&vm->invalidated);
+	INIT_LIST_HEAD(&vm->moved);
 	INIT_LIST_HEAD(&vm->cleared);
 	INIT_LIST_HEAD(&vm->freed);
 
@@ -2425,10 +2495,17 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	if (r)
 		return r;
 
-	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE)
+	vm->pte_support_ats = false;
+
+	if (vm_context == AMDGPU_VM_CONTEXT_COMPUTE) {
 		vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
 						AMDGPU_VM_USE_CPU_FOR_COMPUTE);
-	else
+
+		if (adev->asic_type == CHIP_RAVEN) {
+			vm->pte_support_ats = true;
+			init_pde_value = AMDGPU_PTE_SYSTEM | AMDGPU_PDE_PTE;
+		}
+	} else
 		vm->use_cpu_for_update = !!(adev->vm_manager.vm_update_mode &
 						AMDGPU_VM_USE_CPU_FOR_GFX);
 	DRM_DEBUG_DRIVER("VM update mode is %s\n",
@@ -2448,7 +2525,7 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 	r = amdgpu_bo_create(adev, amdgpu_vm_bo_size(adev, 0), align, true,
 			     AMDGPU_GEM_DOMAIN_VRAM,
 			     flags,
-			     NULL, NULL, &vm->root.bo);
+			     NULL, NULL, init_pde_value, &vm->root.bo);
 	if (r)
 		goto error_free_sched_entity;
 
@@ -2457,6 +2534,13 @@ int amdgpu_vm_init(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 		goto error_free_root;
 
 	vm->last_eviction_counter = atomic64_read(&adev->num_evictions);
+
+	if (vm->use_cpu_for_update) {
+		r = amdgpu_bo_kmap(vm->root.bo, NULL);
+		if (r)
+			goto error_free_root;
+	}
+
 	amdgpu_bo_unreserve(vm->root.bo);
 
 	return 0;
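
Note on the fragment handling in amdgpu_vm_frag_ptes() above: the hunk replaces the fixed AMDGPU_LOG2_PAGES_PER_FRAG shift with vm_manager.fragment_size, and the aligned window is still derived from that shift. The following is only a minimal standalone sketch of that arithmetic, not the driver code. The names demo_frag_range and demo_frag_split are hypothetical, and the sketch uses a 64-bit shift where the driver writes "1 << pages_per_frag".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the fragment-aligned window inside [start, end). */
struct demo_frag_range {
	uint64_t frag_start;	/* start rounded up to the fragment alignment */
	uint64_t frag_end;	/* end rounded down to the fragment alignment */
};

/*
 * Mirror of the alignment arithmetic shown in the hunk: pages_per_frag is a
 * log2 value, so the alignment in pages is 1 << pages_per_frag.
 */
static inline struct demo_frag_range demo_frag_split(uint64_t start,
						     uint64_t end,
						     unsigned pages_per_frag)
{
	struct demo_frag_range r;
	uint64_t frag_align;

	assert(pages_per_frag < 64);	/* keep the shift well defined */
	frag_align = 1ULL << pages_per_frag;

	/* Same result as the kernel's ALIGN(start, frag_align). */
	r.frag_start = (start + frag_align - 1) & ~(frag_align - 1);
	r.frag_end = end & ~(frag_align - 1);
	return r;
}
```

When frag_start >= frag_end, the range contains no fully aligned fragment. That follows from the arithmetic alone; how the driver handles that case is outside the lines shown here.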
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
index 936f158..ba6691b 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vm.h
@@ -50,9 +50,6 @@ struct amdgpu_bo_list_entry;
 /* PTBs (Page Table Blocks) need to be aligned to 32K */
 #define AMDGPU_VM_PTB_ALIGN_SIZE   32768
 
-/* LOG2 number of continuous pages for the fragment field */
-#define AMDGPU_LOG2_PAGES_PER_FRAG 4
-
 #define AMDGPU_PTE_VALID	(1ULL << 0)
 #define AMDGPU_PTE_SYSTEM	(1ULL << 1)
 #define AMDGPU_PTE_SNOOPED	(1ULL << 2)
@@ -68,6 +65,9 @@ struct amdgpu_bo_list_entry;
 /* TILED for VEGA10, reserved for older ASICs  */
 #define AMDGPU_PTE_PRT		(1ULL << 51)
 
+/* PDE is handled as PTE for VEGA10 */
+#define AMDGPU_PDE_PTE		(1ULL << 54)
+
 /* VEGA10 only */
 #define AMDGPU_PTE_MTYPE(a)    ((uint64_t)a << 57)
 #define AMDGPU_PTE_MTYPE_MASK	AMDGPU_PTE_MTYPE(3ULL)
@@ -94,6 +94,18 @@ struct amdgpu_bo_list_entry;
 #define AMDGPU_VM_USE_CPU_FOR_GFX (1 << 0)
 #define AMDGPU_VM_USE_CPU_FOR_COMPUTE (1 << 1)
 
+/* base structure for tracking BO usage in a VM */
+struct amdgpu_vm_bo_base {
+	/* constant after initialization */
+	struct amdgpu_vm		*vm;
+	struct amdgpu_bo		*bo;
+
+	/* protected by bo being reserved */
+	struct list_head		bo_list;
+
+	/* protected by spinlock */
+	struct list_head		vm_status;
+};
 
 struct amdgpu_vm_pt {
 	struct amdgpu_bo	*bo;
@@ -112,7 +124,7 @@ struct amdgpu_vm {
 	spinlock_t		status_lock;
 
 	/* BOs moved, but not yet updated in the PT */
-	struct list_head	invalidated;
+	struct list_head	moved;
 
 	/* BOs cleared in the PT because of a move */
 	struct list_head	cleared;
@@ -135,11 +147,12 @@ struct amdgpu_vm {
 	u64                     client_id;
 	/* dedicated to vm */
 	struct amdgpu_vm_id	*reserved_vmid[AMDGPU_MAX_VMHUBS];
-	/* each VM will map on CSA */
-	struct amdgpu_bo_va *csa_bo_va;
 
 	/* Flag to indicate if VM tables are updated by CPU or GPU (SDMA) */
 	bool                    use_cpu_for_update;
+
+	/* Flag to indicate ATS support from PTE for GFX9 */
+	bool			pte_support_ats;
 };
 
 struct amdgpu_vm_id {
@@ -182,6 +195,7 @@ struct amdgpu_vm_manager {
 	uint32_t				num_level;
 	uint64_t				vm_size;
 	uint32_t				block_size;
+	uint32_t				fragment_size;
 	/* vram base address for page table entry  */
 	u64					vram_base_offset;
 	/* vm pte handling */
@@ -214,15 +228,13 @@ void amdgpu_vm_get_pd_bo(struct amdgpu_vm *vm,
 int amdgpu_vm_validate_pt_bos(struct amdgpu_device *adev, struct amdgpu_vm *vm,
 			      int (*callback)(void *p, struct amdgpu_bo *bo),
 			      void *param);
-void amdgpu_vm_move_pt_bos_in_lru(struct amdgpu_device *adev,
-				  struct amdgpu_vm *vm);
 int amdgpu_vm_alloc_pts(struct amdgpu_device *adev,
 			struct amdgpu_vm *vm,
 			uint64_t saddr, uint64_t size);
 int amdgpu_vm_grab_id(struct amdgpu_vm *vm, struct amdgpu_ring *ring,
 		      struct amdgpu_sync *sync, struct dma_fence *fence,
 		      struct amdgpu_job *job);
-int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job);
+int amdgpu_vm_flush(struct amdgpu_ring *ring, struct amdgpu_job *job, bool need_pipe_sync);
 void amdgpu_vm_reset_id(struct amdgpu_device *adev, unsigned vmhub,
 			unsigned vmid);
 void amdgpu_vm_reset_all_ids(struct amdgpu_device *adev);
@@ -231,8 +243,8 @@ int amdgpu_vm_update_directories(struct amdgpu_device *adev,
 int amdgpu_vm_clear_freed(struct amdgpu_device *adev,
 			  struct amdgpu_vm *vm,
 			  struct dma_fence **fence);
-int amdgpu_vm_clear_invalids(struct amdgpu_device *adev, struct amdgpu_vm *vm,
-			     struct amdgpu_sync *sync);
+int amdgpu_vm_clear_moved(struct amdgpu_device *adev, struct amdgpu_vm *vm,
+			  struct amdgpu_sync *sync);
 int amdgpu_vm_bo_update(struct amdgpu_device *adev,
 			struct amdgpu_bo_va *bo_va,
 			bool clear);
@@ -259,7 +271,10 @@ int amdgpu_vm_bo_clear_mappings(struct amdgpu_device *adev,
 				uint64_t saddr, uint64_t size);
 void amdgpu_vm_bo_rmv(struct amdgpu_device *adev,
 		      struct amdgpu_bo_va *bo_va);
-void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size);
+void amdgpu_vm_set_fragment_size(struct amdgpu_device *adev,
+				uint32_t fragment_size_default);
+void amdgpu_vm_adjust_size(struct amdgpu_device *adev, uint64_t vm_size,
+				uint32_t fragment_size_default);
 int amdgpu_vm_ioctl(struct drm_device *dev, void *data, struct drm_file *filp);
 bool amdgpu_vm_need_pipeline_sync(struct amdgpu_ring *ring,
 				  struct amdgpu_job *job);
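
Note on AMDGPU_PDE_PTE: amdgpu_vm_handle_huge_pages() in amdgpu_vm.c picks between two values for a directory entry. Either the entry points at the page table, or it carries the destination address directly with the huge page flag set. The sketch below restates only that choice under simplifying assumptions. All demo_* names are hypothetical, the ASIC and copy-path checks are folded into one boolean, and the amdgpu_gart_get_vm_pde() address conversion is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions copied from the header hunk above. */
#define DEMO_PTE_VALID	(1ULL << 0)
#define DEMO_PDE_PTE	(1ULL << 54)

/*
 * huge_capable stands in for "asic_type >= CHIP_VEGA10 and not the copy
 * path" (an assumption made to keep the sketch self-contained).
 * pt_addr is the already-converted address of the page table.
 */
static inline uint64_t demo_pde_value(bool huge_capable, unsigned nptes,
				      unsigned ptes_per_table, uint64_t dst,
				      uint64_t flags, uint64_t pt_addr)
{
	assert(ptes_per_table > 0);

	/* Mixed or partial table, or an unmap: the PDE must point to the PT. */
	if (!huge_capable || nptes != ptes_per_table ||
	    !(flags & DEMO_PTE_VALID))
		return pt_addr | DEMO_PTE_VALID;

	/* The whole table maps one contiguous range: treat the PDE as a PTE. */
	return dst | flags | DEMO_PDE_PTE;
}
```

A caller can then test the returned value against DEMO_PDE_PTE to decide whether the individual PTEs still need to be written, which is what the "continue" in amdgpu_vm_update_ptes() does with entry->addr.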
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
index a2c59a0..26e9006 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vram_mgr.c
@@ -28,6 +28,8 @@
 struct amdgpu_vram_mgr {
 	struct drm_mm mm;
 	spinlock_t lock;
+	atomic64_t usage;
+	atomic64_t vis_usage;
 };
 
 /**
@@ -79,6 +81,27 @@ static int amdgpu_vram_mgr_fini(struct ttm_mem_type_manager *man)
 }
 
 /**
+ * amdgpu_vram_mgr_vis_size - Calculate visible node size
+ *
+ * @adev: amdgpu device structure
+ * @node: MM node structure
+ *
+ * Calculate how many bytes of the MM node are inside visible VRAM
+ */
+static u64 amdgpu_vram_mgr_vis_size(struct amdgpu_device *adev,
+				    struct drm_mm_node *node)
+{
+	uint64_t start = node->start << PAGE_SHIFT;
+	uint64_t end = (node->size + node->start) << PAGE_SHIFT;
+
+	if (start >= adev->mc.visible_vram_size)
+		return 0;
+
+	return (end > adev->mc.visible_vram_size ?
+		adev->mc.visible_vram_size : end) - start;
+}
+
+/**
  * amdgpu_vram_mgr_new - allocate new ranges
  *
  * @man: TTM memory type manager
@@ -93,11 +116,13 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 			       const struct ttm_place *place,
 			       struct ttm_mem_reg *mem)
 {
+	struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
 	struct amdgpu_vram_mgr *mgr = man->priv;
 	struct drm_mm *mm = &mgr->mm;
 	struct drm_mm_node *nodes;
 	enum drm_mm_insert_mode mode;
 	unsigned long lpfn, num_nodes, pages_per_node, pages_left;
+	uint64_t usage = 0, vis_usage = 0;
 	unsigned i;
 	int r;
 
@@ -142,6 +167,9 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 		if (unlikely(r))
 			goto error;
 
+		usage += nodes[i].size << PAGE_SHIFT;
+		vis_usage += amdgpu_vram_mgr_vis_size(adev, &nodes[i]);
+
 		/* Calculate a virtual BO start address to easily check if
 		 * everything is CPU accessible.
 		 */
@@ -155,6 +183,9 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 	}
 	spin_unlock(&mgr->lock);
 
+	atomic64_add(usage, &mgr->usage);
+	atomic64_add(vis_usage, &mgr->vis_usage);
+
 	mem->mm_node = nodes;
 
 	return 0;
@@ -181,8 +212,10 @@ static int amdgpu_vram_mgr_new(struct ttm_mem_type_manager *man,
 static void amdgpu_vram_mgr_del(struct ttm_mem_type_manager *man,
 				struct ttm_mem_reg *mem)
 {
+	struct amdgpu_device *adev = amdgpu_ttm_adev(man->bdev);
 	struct amdgpu_vram_mgr *mgr = man->priv;
 	struct drm_mm_node *nodes = mem->mm_node;
+	uint64_t usage = 0, vis_usage = 0;
 	unsigned pages = mem->num_pages;
 
 	if (!mem->mm_node)
@@ -192,31 +225,67 @@ static void amdgpu_vram_mgr_del(struct ttm_mem_type_manager *man,
 	while (pages) {
 		pages -= nodes->size;
 		drm_mm_remove_node(nodes);
+		usage += nodes->size << PAGE_SHIFT;
+		vis_usage += amdgpu_vram_mgr_vis_size(adev, nodes);
 		++nodes;
 	}
 	spin_unlock(&mgr->lock);
 
+	atomic64_sub(usage, &mgr->usage);
+	atomic64_sub(vis_usage, &mgr->vis_usage);
+
 	kfree(mem->mm_node);
 	mem->mm_node = NULL;
 }
 
 /**
+ * amdgpu_vram_mgr_usage - how many bytes are used in this domain
+ *
+ * @man: TTM memory type manager
+ *
+ * Returns how many bytes are used in this domain.
+ */
+uint64_t amdgpu_vram_mgr_usage(struct ttm_mem_type_manager *man)
+{
+	struct amdgpu_vram_mgr *mgr = man->priv;
+
+	return atomic64_read(&mgr->usage);
+}
+
+/**
+ * amdgpu_vram_mgr_vis_usage - how many bytes are used in the visible part
+ *
+ * @man: TTM memory type manager
+ *
+ * Returns how many bytes are used in the visible part of VRAM
+ */
+uint64_t amdgpu_vram_mgr_vis_usage(struct ttm_mem_type_manager *man)
+{
+	struct amdgpu_vram_mgr *mgr = man->priv;
+
+	return atomic64_read(&mgr->vis_usage);
+}
+
+/**
  * amdgpu_vram_mgr_debug - dump VRAM table
  *
  * @man: TTM memory type manager
- * @prefix: text prefix
+ * @printer: DRM printer to use
  *
  * Dump the table content using printk.
  */
 static void amdgpu_vram_mgr_debug(struct ttm_mem_type_manager *man,
-				  const char *prefix)
+				  struct drm_printer *printer)
 {
 	struct amdgpu_vram_mgr *mgr = man->priv;
-	struct drm_printer p = drm_debug_printer(prefix);
 
 	spin_lock(&mgr->lock);
-	drm_mm_print(&mgr->mm, &p);
+	drm_mm_print(&mgr->mm, printer);
 	spin_unlock(&mgr->lock);
+
+	drm_printf(printer, "man size:%llu pages, ram usage:%lluMB, vis usage:%lluMB\n",
+		   man->size, amdgpu_vram_mgr_usage(man) >> 20,
+		   amdgpu_vram_mgr_vis_usage(man) >> 20);
 }
 
 const struct ttm_mem_type_manager_func amdgpu_vram_mgr_func = {
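
Note on the visible-usage accounting above: amdgpu_vram_mgr_vis_size() clamps a node's byte range to the CPU-visible part of VRAM before the result is added to vis_usage. A minimal standalone sketch of that clamp follows. struct demo_node is a hypothetical stand-in for struct drm_mm_node, and DEMO_PAGE_SHIFT assumes 4 KiB pages.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct drm_mm_node; both fields are in pages. */
struct demo_node {
	uint64_t start;
	uint64_t size;
};

/* Number of bytes of the node that lie below visible_vram_size. */
static inline uint64_t demo_vis_size(uint64_t visible_vram_size,
				     const struct demo_node *node)
{
	uint64_t start = node->start << DEMO_PAGE_SHIFT;
	uint64_t end = (node->start + node->size) << DEMO_PAGE_SHIFT;

	assert(node->size > 0);

	/* Node starts beyond the visible window: nothing is visible. */
	if (start >= visible_vram_size)
		return 0;

	/* Otherwise clamp the end of the node to the window. */
	return (end > visible_vram_size ? visible_vram_size : end) - start;
}
```

For a 256 MiB visible window, a two-page node whose second page crosses the boundary contributes only its first page (4096 bytes) to the visible total.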
diff --git a/drivers/gpu/drm/amd/amdgpu/cik.c b/drivers/gpu/drm/amd/amdgpu/cik.c
index 37a499a..567c4a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik.c
@@ -1824,21 +1824,14 @@ static int cik_common_suspend(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	amdgpu_amdkfd_suspend(adev);
-
 	return cik_common_hw_fini(adev);
 }
 
 static int cik_common_resume(void *handle)
 {
-	int r;
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
-	r = cik_common_hw_init(adev);
-	if (r)
-		return r;
-
-	return amdgpu_amdkfd_resume(adev);
+	return cik_common_hw_init(adev);
 }
 
 static bool cik_common_is_idle(void *handle)
diff --git a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
index c216e16..f508f4d 100644
--- a/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
+++ b/drivers/gpu/drm/amd/amdgpu/cik_sdma.c
@@ -342,6 +342,63 @@ static void cik_sdma_rlc_stop(struct amdgpu_device *adev)
 }
 
 /**
+ * cik_ctx_switch_enable - enable/disable the async dma engines context switch
+ *
+ * @adev: amdgpu_device pointer
+ * @enable: enable/disable the DMA MEs context switch.
+ *
+ * Halt or unhalt the async dma engines context switch (CIK).
+ */
+static void cik_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
+{
+	u32 f32_cntl, phase_quantum = 0;
+	int i;
+
+	if (amdgpu_sdma_phase_quantum) {
+		unsigned value = amdgpu_sdma_phase_quantum;
+		unsigned unit = 0;
+
+		while (value > (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+				SDMA0_PHASE0_QUANTUM__VALUE__SHIFT)) {
+			value = (value + 1) >> 1;
+			unit++;
+		}
+		if (unit > (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+			    SDMA0_PHASE0_QUANTUM__UNIT__SHIFT)) {
+			value = (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+				 SDMA0_PHASE0_QUANTUM__VALUE__SHIFT);
+			unit = (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+				SDMA0_PHASE0_QUANTUM__UNIT__SHIFT);
+			WARN_ONCE(1,
+			"clamping sdma_phase_quantum to %uK clock cycles\n",
+				  value << unit);
+		}
+		phase_quantum =
+			value << SDMA0_PHASE0_QUANTUM__VALUE__SHIFT |
+			unit  << SDMA0_PHASE0_QUANTUM__UNIT__SHIFT;
+	}
+
+	for (i = 0; i < adev->sdma.num_instances; i++) {
+		f32_cntl = RREG32(mmSDMA0_CNTL + sdma_offsets[i]);
+		if (enable) {
+			f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+					AUTO_CTXSW_ENABLE, 1);
+			if (amdgpu_sdma_phase_quantum) {
+				WREG32(mmSDMA0_PHASE0_QUANTUM + sdma_offsets[i],
+				       phase_quantum);
+				WREG32(mmSDMA0_PHASE1_QUANTUM + sdma_offsets[i],
+				       phase_quantum);
+			}
+		} else {
+			f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+					AUTO_CTXSW_ENABLE, 0);
+		}
+
+		WREG32(mmSDMA0_CNTL + sdma_offsets[i], f32_cntl);
+	}
+}
+
+/**
  * cik_sdma_enable - stop the async dma engines
  *
  * @adev: amdgpu_device pointer
@@ -537,6 +594,8 @@ static int cik_sdma_start(struct amdgpu_device *adev)
 
 	/* halt the engine before programing */
 	cik_sdma_enable(adev, false);
+	/* enable sdma ring preemption */
+	cik_ctx_switch_enable(adev, true);
 
 	/* start the gfx rings and rlc compute queues */
 	r = cik_sdma_gfx_resume(adev);
@@ -984,6 +1043,7 @@ static int cik_sdma_hw_fini(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+	cik_ctx_switch_enable(adev, false);
 	cik_sdma_enable(adev, false);
 
 	return 0;
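
Note on the phase quantum encoding in cik_ctx_switch_enable() above: the
requested quantum is halved (rounding up) until it fits the VALUE field,
and the number of halvings goes into the UNIT field; if UNIT overflows,
both fields are clamped to their maximum. The user-space sketch below
reproduces that arithmetic only. The function name encode_phase_quantum()
is hypothetical, and the mask/shift constants are assumptions made for
illustration, since the real ones come from register headers that are not
part of this diff.

```c
#include <stdint.h>

/* Assumed field layout, for illustration only. */
#define VALUE_MASK  0x00ffff00u
#define VALUE_SHIFT 8
#define UNIT_MASK   0x0000000fu
#define UNIT_SHIFT  0

/* Hypothetical helper mirroring the value/unit loop in the patch. */
static uint32_t encode_phase_quantum(uint32_t value)
{
	const uint32_t max_value = VALUE_MASK >> VALUE_SHIFT;
	const uint32_t max_unit = UNIT_MASK >> UNIT_SHIFT;
	uint32_t unit = 0;

	/* Halve, rounding up, until the value fits the VALUE field. */
	while (value > max_value) {
		value = (value + 1) >> 1;
		unit++;
	}
	/* Too many halvings for the UNIT field: clamp both fields. */
	if (unit > max_unit) {
		value = max_value;
		unit = max_unit;
	}
	return value << VALUE_SHIFT | unit << UNIT_SHIFT;
}
```

Under these assumed constants, a request of 32 is stored unscaled (unit 0),
while 0x10000 is stored as 0x8000 with unit 1.
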
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
index 9f78c03..4e519dc 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v10_0.c
@@ -484,134 +484,6 @@ static bool dce_v10_0_is_display_hung(struct amdgpu_device *adev)
 	return true;
 }
 
-static void dce_v10_0_stop_mc_access(struct amdgpu_device *adev,
-				     struct amdgpu_mode_mc_save *save)
-{
-	u32 crtc_enabled, tmp;
-	int i;
-
-	save->vga_render_control = RREG32(mmVGA_RENDER_CONTROL);
-	save->vga_hdp_control = RREG32(mmVGA_HDP_CONTROL);
-
-	/* disable VGA render */
-	tmp = RREG32(mmVGA_RENDER_CONTROL);
-	tmp = REG_SET_FIELD(tmp, VGA_RENDER_CONTROL, VGA_VSTATUS_CNTL, 0);
-	WREG32(mmVGA_RENDER_CONTROL, tmp);
-
-	/* blank the display controllers */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		crtc_enabled = REG_GET_FIELD(RREG32(mmCRTC_CONTROL + crtc_offsets[i]),
-					     CRTC_CONTROL, CRTC_MASTER_EN);
-		if (crtc_enabled) {
-#if 0
-			u32 frame_count;
-			int j;
-
-			save->crtc_enabled[i] = true;
-			tmp = RREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i]);
-			if (REG_GET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN) == 0) {
-				amdgpu_display_vblank_wait(adev, i);
-				WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 1);
-				tmp = REG_SET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN, 1);
-				WREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i], tmp);
-				WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 0);
-			}
-			/* wait for the next frame */
-			frame_count = amdgpu_display_vblank_get_counter(adev, i);
-			for (j = 0; j < adev->usec_timeout; j++) {
-				if (amdgpu_display_vblank_get_counter(adev, i) != frame_count)
-					break;
-				udelay(1);
-			}
-			tmp = RREG32(mmGRPH_UPDATE + crtc_offsets[i]);
-			if (REG_GET_FIELD(tmp, GRPH_UPDATE, GRPH_UPDATE_LOCK) == 0) {
-				tmp = REG_SET_FIELD(tmp, GRPH_UPDATE, GRPH_UPDATE_LOCK, 1);
-				WREG32(mmGRPH_UPDATE + crtc_offsets[i], tmp);
-			}
-			tmp = RREG32(mmMASTER_UPDATE_LOCK + crtc_offsets[i]);
-			if (REG_GET_FIELD(tmp, MASTER_UPDATE_LOCK, MASTER_UPDATE_LOCK) == 0) {
-				tmp = REG_SET_FIELD(tmp, MASTER_UPDATE_LOCK, MASTER_UPDATE_LOCK, 1);
-				WREG32(mmMASTER_UPDATE_LOCK + crtc_offsets[i], tmp);
-			}
-#else
-			/* XXX this is a hack to avoid strange behavior with EFI on certain systems */
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 1);
-			tmp = RREG32(mmCRTC_CONTROL + crtc_offsets[i]);
-			tmp = REG_SET_FIELD(tmp, CRTC_CONTROL, CRTC_MASTER_EN, 0);
-			WREG32(mmCRTC_CONTROL + crtc_offsets[i], tmp);
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 0);
-			save->crtc_enabled[i] = false;
-			/* ***** */
-#endif
-		} else {
-			save->crtc_enabled[i] = false;
-		}
-	}
-}
-
-static void dce_v10_0_resume_mc_access(struct amdgpu_device *adev,
-				       struct amdgpu_mode_mc_save *save)
-{
-	u32 tmp, frame_count;
-	int i, j;
-
-	/* update crtc base addresses */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		WREG32(mmGRPH_PRIMARY_SURFACE_ADDRESS_HIGH + crtc_offsets[i],
-		       upper_32_bits(adev->mc.vram_start));
-		WREG32(mmGRPH_SECONDARY_SURFACE_ADDRESS_HIGH + crtc_offsets[i],
-		       upper_32_bits(adev->mc.vram_start));
-		WREG32(mmGRPH_PRIMARY_SURFACE_ADDRESS + crtc_offsets[i],
-		       (u32)adev->mc.vram_start);
-		WREG32(mmGRPH_SECONDARY_SURFACE_ADDRESS + crtc_offsets[i],
-		       (u32)adev->mc.vram_start);
-
-		if (save->crtc_enabled[i]) {
-			tmp = RREG32(mmMASTER_UPDATE_MODE + crtc_offsets[i]);
-			if (REG_GET_FIELD(tmp, MASTER_UPDATE_MODE, MASTER_UPDATE_MODE) != 0) {
-				tmp = REG_SET_FIELD(tmp, MASTER_UPDATE_MODE, MASTER_UPDATE_MODE, 0);
-				WREG32(mmMASTER_UPDATE_MODE + crtc_offsets[i], tmp);
-			}
-			tmp = RREG32(mmGRPH_UPDATE + crtc_offsets[i]);
-			if (REG_GET_FIELD(tmp, GRPH_UPDATE, GRPH_UPDATE_LOCK)) {
-				tmp = REG_SET_FIELD(tmp, GRPH_UPDATE, GRPH_UPDATE_LOCK, 0);
-				WREG32(mmGRPH_UPDATE + crtc_offsets[i], tmp);
-			}
-			tmp = RREG32(mmMASTER_UPDATE_LOCK + crtc_offsets[i]);
-			if (REG_GET_FIELD(tmp, MASTER_UPDATE_LOCK, MASTER_UPDATE_LOCK)) {
-				tmp = REG_SET_FIELD(tmp, MASTER_UPDATE_LOCK, MASTER_UPDATE_LOCK, 0);
-				WREG32(mmMASTER_UPDATE_LOCK + crtc_offsets[i], tmp);
-			}
-			for (j = 0; j < adev->usec_timeout; j++) {
-				tmp = RREG32(mmGRPH_UPDATE + crtc_offsets[i]);
-				if (REG_GET_FIELD(tmp, GRPH_UPDATE, GRPH_SURFACE_UPDATE_PENDING) == 0)
-					break;
-				udelay(1);
-			}
-			tmp = RREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i]);
-			tmp = REG_SET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN, 0);
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 1);
-			WREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i], tmp);
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 0);
-			/* wait for the next frame */
-			frame_count = amdgpu_display_vblank_get_counter(adev, i);
-			for (j = 0; j < adev->usec_timeout; j++) {
-				if (amdgpu_display_vblank_get_counter(adev, i) != frame_count)
-					break;
-				udelay(1);
-			}
-		}
-	}
-
-	WREG32(mmVGA_MEMORY_BASE_ADDRESS_HIGH, upper_32_bits(adev->mc.vram_start));
-	WREG32(mmVGA_MEMORY_BASE_ADDRESS, lower_32_bits(adev->mc.vram_start));
-
-	/* Unlock vga access */
-	WREG32(mmVGA_HDP_CONTROL, save->vga_hdp_control);
-	mdelay(1);
-	WREG32(mmVGA_RENDER_CONTROL, save->vga_render_control);
-}
-
 static void dce_v10_0_set_vga_render_state(struct amdgpu_device *adev,
 					   bool render)
 {
@@ -1867,7 +1739,7 @@ static void dce_v10_0_afmt_setmode(struct drm_encoder *encoder,
 	dce_v10_0_audio_write_sad_regs(encoder);
 	dce_v10_0_audio_write_latency_fields(encoder, mode);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
 		return;
@@ -2267,6 +2139,7 @@ static void dce_v10_0_crtc_load_lut(struct drm_crtc *crtc)
 	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct amdgpu_device *adev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 	u32 tmp;
 
@@ -2304,11 +2177,14 @@ static void dce_v10_0_crtc_load_lut(struct drm_crtc *crtc)
 	WREG32(mmDC_LUT_WRITE_EN_MASK + amdgpu_crtc->crtc_offset, 0x00000007);
 
 	WREG32(mmDC_LUT_RW_INDEX + amdgpu_crtc->crtc_offset, 0);
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 	for (i = 0; i < 256; i++) {
 		WREG32(mmDC_LUT_30_COLOR + amdgpu_crtc->crtc_offset,
-		       (amdgpu_crtc->lut_r[i] << 20) |
-		       (amdgpu_crtc->lut_g[i] << 10) |
-		       (amdgpu_crtc->lut_b[i] << 0));
+		       ((*r++ & 0xffc0) << 14) |
+		       ((*g++ & 0xffc0) << 4) |
+		       (*b++ >> 6));
 	}
 
 	tmp = RREG32(mmDEGAMMA_CONTROL + amdgpu_crtc->crtc_offset);
@@ -2555,7 +2431,7 @@ static int dce_v10_0_crtc_cursor_set2(struct drm_crtc *crtc,
 	aobj = gem_to_amdgpu_bo(obj);
 	ret = amdgpu_bo_reserve(aobj, false);
 	if (ret != 0) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -2563,7 +2439,7 @@ static int dce_v10_0_crtc_cursor_set2(struct drm_crtc *crtc,
 	amdgpu_bo_unreserve(aobj);
 	if (ret) {
 		DRM_ERROR("Failed to pin new cursor BO (%d)\n", ret);
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -2597,7 +2473,7 @@ static int dce_v10_0_crtc_cursor_set2(struct drm_crtc *crtc,
 			amdgpu_bo_unpin(aobj);
 			amdgpu_bo_unreserve(aobj);
 		}
-		drm_gem_object_unreference_unlocked(amdgpu_crtc->cursor_bo);
+		drm_gem_object_put_unlocked(amdgpu_crtc->cursor_bo);
 	}
 
 	amdgpu_crtc->cursor_bo = obj;
@@ -2624,15 +2500,6 @@ static int dce_v10_0_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
 				    u16 *blue, uint32_t size,
 				    struct drm_modeset_acquire_ctx *ctx)
 {
-	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-	int i;
-
-	/* userspace palettes are always correct as is */
-	for (i = 0; i < size; i++) {
-		amdgpu_crtc->lut_r[i] = red[i] >> 6;
-		amdgpu_crtc->lut_g[i] = green[i] >> 6;
-		amdgpu_crtc->lut_b[i] = blue[i] >> 6;
-	}
 	dce_v10_0_crtc_load_lut(crtc);
 
 	return 0;
@@ -2844,14 +2711,12 @@ static const struct drm_crtc_helper_funcs dce_v10_0_crtc_helper_funcs = {
 	.mode_set_base_atomic = dce_v10_0_crtc_set_base_atomic,
 	.prepare = dce_v10_0_crtc_prepare,
 	.commit = dce_v10_0_crtc_commit,
-	.load_lut = dce_v10_0_crtc_load_lut,
 	.disable = dce_v10_0_crtc_disable,
 };
 
 static int dce_v10_0_crtc_init(struct amdgpu_device *adev, int index)
 {
 	struct amdgpu_crtc *amdgpu_crtc;
-	int i;
 
 	amdgpu_crtc = kzalloc(sizeof(struct amdgpu_crtc) +
 			      (AMDGPUFB_CONN_LIMIT * sizeof(struct drm_connector *)), GFP_KERNEL);
@@ -2869,12 +2734,6 @@ static int dce_v10_0_crtc_init(struct amdgpu_device *adev, int index)
 	adev->ddev->mode_config.cursor_width = amdgpu_crtc->max_cursor_width;
 	adev->ddev->mode_config.cursor_height = amdgpu_crtc->max_cursor_height;
 
-	for (i = 0; i < 256; i++) {
-		amdgpu_crtc->lut_r[i] = i << 2;
-		amdgpu_crtc->lut_g[i] = i << 2;
-		amdgpu_crtc->lut_b[i] = i << 2;
-	}
-
 	switch (amdgpu_crtc->crtc_id) {
 	case 0:
 	default:
@@ -3025,6 +2884,8 @@ static int dce_v10_0_hw_init(void *handle)
 
 	dce_v10_0_init_golden_registers(adev);
 
+	/* disable vga render */
+	dce_v10_0_set_vga_render_state(adev, false);
 	/* init dig PHYs, disp eng pll */
 	amdgpu_atombios_encoder_init_dig(adev);
 	amdgpu_atombios_crtc_set_disp_eng_pll(adev, adev->clock.default_dispclk);
@@ -3737,7 +3598,6 @@ static void dce_v10_0_encoder_add(struct amdgpu_device *adev,
 }
 
 static const struct amdgpu_display_funcs dce_v10_0_display_funcs = {
-	.set_vga_render_state = &dce_v10_0_set_vga_render_state,
 	.bandwidth_update = &dce_v10_0_bandwidth_update,
 	.vblank_get_counter = &dce_v10_0_vblank_get_counter,
 	.vblank_wait = &dce_v10_0_vblank_wait,
@@ -3750,8 +3610,6 @@ static const struct amdgpu_display_funcs dce_v10_0_display_funcs = {
 	.page_flip_get_scanoutpos = &dce_v10_0_crtc_get_scanoutpos,
 	.add_encoder = &dce_v10_0_encoder_add,
 	.add_connector = &amdgpu_connector_add,
-	.stop_mc_access = &dce_v10_0_stop_mc_access,
-	.resume_mc_access = &dce_v10_0_resume_mc_access,
 };
 
 static void dce_v10_0_set_display_funcs(struct amdgpu_device *adev)
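
Note on the load_lut change above: the removed code cached each channel as
a 10-bit value (the 16-bit input shifted right by 6) and packed the cached
values, whereas the new code packs straight from the 16-bit entries in
crtc->gamma_store. The sketch below suggests why the two packings should
agree for any 16-bit input. The helper names pack_lut_old() and
pack_lut_new() are hypothetical and exist only for this comparison.

```c
#include <stdint.h>

/* Old scheme: reduce each channel to 10 bits first, then pack. */
static uint32_t pack_lut_old(uint16_t r, uint16_t g, uint16_t b)
{
	uint32_t lr = r >> 6, lg = g >> 6, lb = b >> 6;

	return (lr << 20) | (lg << 10) | (lb << 0);
}

/* New scheme: mask off the low 6 bits and shift into place directly. */
static uint32_t pack_lut_new(uint16_t r, uint16_t g, uint16_t b)
{
	return ((uint32_t)(r & 0xffc0) << 14) |
	       ((uint32_t)(g & 0xffc0) << 4) |
	       (uint32_t)(b >> 6);
}
```

In both versions the top 10 bits of red land in bits 29:20, green in bits
19:10, and blue in bits 9:0 of the 30-bit LUT word.
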
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
index 4bcf01d..11edc75 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v11_0.c
@@ -499,79 +499,6 @@ static bool dce_v11_0_is_display_hung(struct amdgpu_device *adev)
 	return true;
 }
 
-static void dce_v11_0_stop_mc_access(struct amdgpu_device *adev,
-				     struct amdgpu_mode_mc_save *save)
-{
-	u32 crtc_enabled, tmp;
-	int i;
-
-	save->vga_render_control = RREG32(mmVGA_RENDER_CONTROL);
-	save->vga_hdp_control = RREG32(mmVGA_HDP_CONTROL);
-
-	/* disable VGA render */
-	tmp = RREG32(mmVGA_RENDER_CONTROL);
-	tmp = REG_SET_FIELD(tmp, VGA_RENDER_CONTROL, VGA_VSTATUS_CNTL, 0);
-	WREG32(mmVGA_RENDER_CONTROL, tmp);
-
-	/* blank the display controllers */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		crtc_enabled = REG_GET_FIELD(RREG32(mmCRTC_CONTROL + crtc_offsets[i]),
-					     CRTC_CONTROL, CRTC_MASTER_EN);
-		if (crtc_enabled) {
-#if 1
-			save->crtc_enabled[i] = true;
-			tmp = RREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i]);
-			if (REG_GET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN) == 0) {
-				/*it is correct only for RGB ; black is 0*/
-				WREG32(mmCRTC_BLANK_DATA_COLOR + crtc_offsets[i], 0);
-				tmp = REG_SET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN, 1);
-				WREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i], tmp);
-			}
-#else
-			/* XXX this is a hack to avoid strange behavior with EFI on certain systems */
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 1);
-			tmp = RREG32(mmCRTC_CONTROL + crtc_offsets[i]);
-			tmp = REG_SET_FIELD(tmp, CRTC_CONTROL, CRTC_MASTER_EN, 0);
-			WREG32(mmCRTC_CONTROL + crtc_offsets[i], tmp);
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 0);
-			save->crtc_enabled[i] = false;
-			/* ***** */
-#endif
-		} else {
-			save->crtc_enabled[i] = false;
-		}
-	}
-}
-
-static void dce_v11_0_resume_mc_access(struct amdgpu_device *adev,
-				       struct amdgpu_mode_mc_save *save)
-{
-	u32 tmp;
-	int i;
-
-	/* update crtc base addresses */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		WREG32(mmGRPH_PRIMARY_SURFACE_ADDRESS_HIGH + crtc_offsets[i],
-		       upper_32_bits(adev->mc.vram_start));
-		WREG32(mmGRPH_PRIMARY_SURFACE_ADDRESS + crtc_offsets[i],
-		       (u32)adev->mc.vram_start);
-
-		if (save->crtc_enabled[i]) {
-			tmp = RREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i]);
-			tmp = REG_SET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN, 0);
-			WREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i], tmp);
-		}
-	}
-
-	WREG32(mmVGA_MEMORY_BASE_ADDRESS_HIGH, upper_32_bits(adev->mc.vram_start));
-	WREG32(mmVGA_MEMORY_BASE_ADDRESS, lower_32_bits(adev->mc.vram_start));
-
-	/* Unlock vga access */
-	WREG32(mmVGA_HDP_CONTROL, save->vga_hdp_control);
-	mdelay(1);
-	WREG32(mmVGA_RENDER_CONTROL, save->vga_render_control);
-}
-
 static void dce_v11_0_set_vga_render_state(struct amdgpu_device *adev,
 					   bool render)
 {
@@ -1851,7 +1778,7 @@ static void dce_v11_0_afmt_setmode(struct drm_encoder *encoder,
 	dce_v11_0_audio_write_sad_regs(encoder);
 	dce_v11_0_audio_write_latency_fields(encoder, mode);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
 		return;
@@ -2251,6 +2178,7 @@ static void dce_v11_0_crtc_load_lut(struct drm_crtc *crtc)
 	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct amdgpu_device *adev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 	u32 tmp;
 
@@ -2282,11 +2210,14 @@ static void dce_v11_0_crtc_load_lut(struct drm_crtc *crtc)
 	WREG32(mmDC_LUT_WRITE_EN_MASK + amdgpu_crtc->crtc_offset, 0x00000007);
 
 	WREG32(mmDC_LUT_RW_INDEX + amdgpu_crtc->crtc_offset, 0);
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 	for (i = 0; i < 256; i++) {
 		WREG32(mmDC_LUT_30_COLOR + amdgpu_crtc->crtc_offset,
-		       (amdgpu_crtc->lut_r[i] << 20) |
-		       (amdgpu_crtc->lut_g[i] << 10) |
-		       (amdgpu_crtc->lut_b[i] << 0));
+		       ((*r++ & 0xffc0) << 14) |
+		       ((*g++ & 0xffc0) << 4) |
+		       (*b++ >> 6));
 	}
 
 	tmp = RREG32(mmDEGAMMA_CONTROL + amdgpu_crtc->crtc_offset);
@@ -2575,7 +2506,7 @@ static int dce_v11_0_crtc_cursor_set2(struct drm_crtc *crtc,
 	aobj = gem_to_amdgpu_bo(obj);
 	ret = amdgpu_bo_reserve(aobj, false);
 	if (ret != 0) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -2583,7 +2514,7 @@ static int dce_v11_0_crtc_cursor_set2(struct drm_crtc *crtc,
 	amdgpu_bo_unreserve(aobj);
 	if (ret) {
 		DRM_ERROR("Failed to pin new cursor BO (%d)\n", ret);
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -2617,7 +2548,7 @@ static int dce_v11_0_crtc_cursor_set2(struct drm_crtc *crtc,
 			amdgpu_bo_unpin(aobj);
 			amdgpu_bo_unreserve(aobj);
 		}
-		drm_gem_object_unreference_unlocked(amdgpu_crtc->cursor_bo);
+		drm_gem_object_put_unlocked(amdgpu_crtc->cursor_bo);
 	}
 
 	amdgpu_crtc->cursor_bo = obj;
@@ -2644,15 +2575,6 @@ static int dce_v11_0_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
 				    u16 *blue, uint32_t size,
 				    struct drm_modeset_acquire_ctx *ctx)
 {
-	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-	int i;
-
-	/* userspace palettes are always correct as is */
-	for (i = 0; i < size; i++) {
-		amdgpu_crtc->lut_r[i] = red[i] >> 6;
-		amdgpu_crtc->lut_g[i] = green[i] >> 6;
-		amdgpu_crtc->lut_b[i] = blue[i] >> 6;
-	}
 	dce_v11_0_crtc_load_lut(crtc);
 
 	return 0;
@@ -2892,14 +2814,12 @@ static const struct drm_crtc_helper_funcs dce_v11_0_crtc_helper_funcs = {
 	.mode_set_base_atomic = dce_v11_0_crtc_set_base_atomic,
 	.prepare = dce_v11_0_crtc_prepare,
 	.commit = dce_v11_0_crtc_commit,
-	.load_lut = dce_v11_0_crtc_load_lut,
 	.disable = dce_v11_0_crtc_disable,
 };
 
 static int dce_v11_0_crtc_init(struct amdgpu_device *adev, int index)
 {
 	struct amdgpu_crtc *amdgpu_crtc;
-	int i;
 
 	amdgpu_crtc = kzalloc(sizeof(struct amdgpu_crtc) +
 			      (AMDGPUFB_CONN_LIMIT * sizeof(struct drm_connector *)), GFP_KERNEL);
@@ -2917,12 +2837,6 @@ static int dce_v11_0_crtc_init(struct amdgpu_device *adev, int index)
 	adev->ddev->mode_config.cursor_width = amdgpu_crtc->max_cursor_width;
 	adev->ddev->mode_config.cursor_height = amdgpu_crtc->max_cursor_height;
 
-	for (i = 0; i < 256; i++) {
-		amdgpu_crtc->lut_r[i] = i << 2;
-		amdgpu_crtc->lut_g[i] = i << 2;
-		amdgpu_crtc->lut_b[i] = i << 2;
-	}
-
 	switch (amdgpu_crtc->crtc_id) {
 	case 0:
 	default:
@@ -3086,6 +3000,8 @@ static int dce_v11_0_hw_init(void *handle)
 
 	dce_v11_0_init_golden_registers(adev);
 
+	/* disable vga render */
+	dce_v11_0_set_vga_render_state(adev, false);
 	/* init dig PHYs, disp eng pll */
 	amdgpu_atombios_crtc_powergate_init(adev);
 	amdgpu_atombios_encoder_init_dig(adev);
@@ -3806,7 +3722,6 @@ static void dce_v11_0_encoder_add(struct amdgpu_device *adev,
 }
 
 static const struct amdgpu_display_funcs dce_v11_0_display_funcs = {
-	.set_vga_render_state = &dce_v11_0_set_vga_render_state,
 	.bandwidth_update = &dce_v11_0_bandwidth_update,
 	.vblank_get_counter = &dce_v11_0_vblank_get_counter,
 	.vblank_wait = &dce_v11_0_vblank_wait,
@@ -3819,8 +3734,6 @@ static const struct amdgpu_display_funcs dce_v11_0_display_funcs = {
 	.page_flip_get_scanoutpos = &dce_v11_0_crtc_get_scanoutpos,
 	.add_encoder = &dce_v11_0_encoder_add,
 	.add_connector = &amdgpu_connector_add,
-	.stop_mc_access = &dce_v11_0_stop_mc_access,
-	.resume_mc_access = &dce_v11_0_resume_mc_access,
 };
 
 static void dce_v11_0_set_display_funcs(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
index fd134a4..a51e35f 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v6_0.c
@@ -42,6 +42,7 @@
 #include "dce/dce_6_0_d.h"
 #include "dce/dce_6_0_sh_mask.h"
 #include "gca/gfx_7_2_enum.h"
+#include "dce_v6_0.h"
 #include "si_enums.h"
 
 static void dce_v6_0_set_display_funcs(struct amdgpu_device *adev);
@@ -392,117 +393,6 @@ static u32 dce_v6_0_hpd_get_gpio_reg(struct amdgpu_device *adev)
 	return mmDC_GPIO_HPD_A;
 }
 
-static u32 evergreen_get_vblank_counter(struct amdgpu_device* adev, int crtc)
-{
-	if (crtc >= adev->mode_info.num_crtc)
-		return 0;
-	else
-		return RREG32(mmCRTC_STATUS_FRAME_COUNT + crtc_offsets[crtc]);
-}
-
-static void dce_v6_0_stop_mc_access(struct amdgpu_device *adev,
-				    struct amdgpu_mode_mc_save *save)
-{
-	u32 crtc_enabled, tmp, frame_count;
-	int i, j;
-
-	save->vga_render_control = RREG32(mmVGA_RENDER_CONTROL);
-	save->vga_hdp_control = RREG32(mmVGA_HDP_CONTROL);
-
-	/* disable VGA render */
-	WREG32(mmVGA_RENDER_CONTROL, 0);
-
-	/* blank the display controllers */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		crtc_enabled = RREG32(mmCRTC_CONTROL + crtc_offsets[i]) & CRTC_CONTROL__CRTC_MASTER_EN_MASK;
-		if (crtc_enabled) {
-			save->crtc_enabled[i] = true;
-			tmp = RREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i]);
-
-			if (!(tmp & CRTC_BLANK_CONTROL__CRTC_BLANK_DATA_EN_MASK)) {
-				dce_v6_0_vblank_wait(adev, i);
-				WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 1);
-				tmp |= CRTC_BLANK_CONTROL__CRTC_BLANK_DATA_EN_MASK;
-				WREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i], tmp);
-				WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 0);
-			}
-			/* wait for the next frame */
-			frame_count = evergreen_get_vblank_counter(adev, i);
-			for (j = 0; j < adev->usec_timeout; j++) {
-				if (evergreen_get_vblank_counter(adev, i) != frame_count)
-					break;
-				udelay(1);
-			}
-
-			/* XXX this is a hack to avoid strange behavior with EFI on certain systems */
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 1);
-			tmp = RREG32(mmCRTC_CONTROL + crtc_offsets[i]);
-			tmp &= ~CRTC_CONTROL__CRTC_MASTER_EN_MASK;
-			WREG32(mmCRTC_CONTROL + crtc_offsets[i], tmp);
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 0);
-			save->crtc_enabled[i] = false;
-			/* ***** */
-		} else {
-			save->crtc_enabled[i] = false;
-		}
-	}
-}
-
-static void dce_v6_0_resume_mc_access(struct amdgpu_device *adev,
-				      struct amdgpu_mode_mc_save *save)
-{
-	u32 tmp;
-	int i, j;
-
-	/* update crtc base addresses */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		WREG32(mmGRPH_PRIMARY_SURFACE_ADDRESS_HIGH + crtc_offsets[i],
-		       upper_32_bits(adev->mc.vram_start));
-		WREG32(mmGRPH_SECONDARY_SURFACE_ADDRESS_HIGH + crtc_offsets[i],
-		       upper_32_bits(adev->mc.vram_start));
-		WREG32(mmGRPH_PRIMARY_SURFACE_ADDRESS + crtc_offsets[i],
-		       (u32)adev->mc.vram_start);
-		WREG32(mmGRPH_SECONDARY_SURFACE_ADDRESS + crtc_offsets[i],
-		       (u32)adev->mc.vram_start);
-	}
-
-	WREG32(mmVGA_MEMORY_BASE_ADDRESS_HIGH, upper_32_bits(adev->mc.vram_start));
-	WREG32(mmVGA_MEMORY_BASE_ADDRESS, (u32)adev->mc.vram_start);
-
-	/* unlock regs and wait for update */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		if (save->crtc_enabled[i]) {
-			tmp = RREG32(mmMASTER_UPDATE_MODE + crtc_offsets[i]);
-			if ((tmp & 0x7) != 0) {
-				tmp &= ~0x7;
-				WREG32(mmMASTER_UPDATE_MODE + crtc_offsets[i], tmp);
-			}
-			tmp = RREG32(mmGRPH_UPDATE + crtc_offsets[i]);
-			if (tmp & GRPH_UPDATE__GRPH_UPDATE_LOCK_MASK) {
-				tmp &= ~GRPH_UPDATE__GRPH_UPDATE_LOCK_MASK;
-				WREG32(mmGRPH_UPDATE + crtc_offsets[i], tmp);
-			}
-			tmp = RREG32(mmMASTER_UPDATE_LOCK + crtc_offsets[i]);
-			if (tmp & 1) {
-				tmp &= ~1;
-				WREG32(mmMASTER_UPDATE_LOCK + crtc_offsets[i], tmp);
-			}
-			for (j = 0; j < adev->usec_timeout; j++) {
-				tmp = RREG32(mmGRPH_UPDATE + crtc_offsets[i]);
-				if ((tmp & GRPH_UPDATE__GRPH_SURFACE_UPDATE_PENDING_MASK) == 0)
-					break;
-				udelay(1);
-			}
-		}
-	}
-
-	/* Unlock vga access */
-	WREG32(mmVGA_HDP_CONTROL, save->vga_hdp_control);
-	mdelay(1);
-	WREG32(mmVGA_RENDER_CONTROL, save->vga_render_control);
-
-}
-
 static void dce_v6_0_set_vga_render_state(struct amdgpu_device *adev,
 					  bool render)
 {
@@ -1597,7 +1487,7 @@ static void dce_v6_0_audio_set_avi_infoframe(struct drm_encoder *encoder,
 	ssize_t err;
 	u32 tmp;
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
 		return;
@@ -2182,6 +2072,7 @@ static void dce_v6_0_crtc_load_lut(struct drm_crtc *crtc)
 	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct amdgpu_device *adev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 
 	DRM_DEBUG_KMS("%d\n", amdgpu_crtc->crtc_id);
@@ -2211,11 +2102,14 @@ static void dce_v6_0_crtc_load_lut(struct drm_crtc *crtc)
 	WREG32(mmDC_LUT_WRITE_EN_MASK + amdgpu_crtc->crtc_offset, 0x00000007);
 
 	WREG32(mmDC_LUT_RW_INDEX + amdgpu_crtc->crtc_offset, 0);
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 	for (i = 0; i < 256; i++) {
 		WREG32(mmDC_LUT_30_COLOR + amdgpu_crtc->crtc_offset,
-		       (amdgpu_crtc->lut_r[i] << 20) |
-		       (amdgpu_crtc->lut_g[i] << 10) |
-		       (amdgpu_crtc->lut_b[i] << 0));
+		       ((*r++ & 0xffc0) << 14) |
+		       ((*g++ & 0xffc0) << 4) |
+		       (*b++ >> 6));
 	}
 
 	WREG32(mmDEGAMMA_CONTROL + amdgpu_crtc->crtc_offset,
@@ -2428,7 +2322,7 @@ static int dce_v6_0_crtc_cursor_set2(struct drm_crtc *crtc,
 	aobj = gem_to_amdgpu_bo(obj);
 	ret = amdgpu_bo_reserve(aobj, false);
 	if (ret != 0) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -2436,7 +2330,7 @@ static int dce_v6_0_crtc_cursor_set2(struct drm_crtc *crtc,
 	amdgpu_bo_unreserve(aobj);
 	if (ret) {
 		DRM_ERROR("Failed to pin new cursor BO (%d)\n", ret);
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -2470,7 +2364,7 @@ static int dce_v6_0_crtc_cursor_set2(struct drm_crtc *crtc,
 			amdgpu_bo_unpin(aobj);
 			amdgpu_bo_unreserve(aobj);
 		}
-		drm_gem_object_unreference_unlocked(amdgpu_crtc->cursor_bo);
+		drm_gem_object_put_unlocked(amdgpu_crtc->cursor_bo);
 	}
 
 	amdgpu_crtc->cursor_bo = obj;
@@ -2496,15 +2390,6 @@ static int dce_v6_0_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
 				   u16 *blue, uint32_t size,
 				   struct drm_modeset_acquire_ctx *ctx)
 {
-	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-	int i;
-
-	/* userspace palettes are always correct as is */
-	for (i = 0; i < size; i++) {
-		amdgpu_crtc->lut_r[i] = red[i] >> 6;
-		amdgpu_crtc->lut_g[i] = green[i] >> 6;
-		amdgpu_crtc->lut_b[i] = blue[i] >> 6;
-	}
 	dce_v6_0_crtc_load_lut(crtc);
 
 	return 0;
@@ -2712,14 +2597,12 @@ static const struct drm_crtc_helper_funcs dce_v6_0_crtc_helper_funcs = {
 	.mode_set_base_atomic = dce_v6_0_crtc_set_base_atomic,
 	.prepare = dce_v6_0_crtc_prepare,
 	.commit = dce_v6_0_crtc_commit,
-	.load_lut = dce_v6_0_crtc_load_lut,
 	.disable = dce_v6_0_crtc_disable,
 };
 
 static int dce_v6_0_crtc_init(struct amdgpu_device *adev, int index)
 {
 	struct amdgpu_crtc *amdgpu_crtc;
-	int i;
 
 	amdgpu_crtc = kzalloc(sizeof(struct amdgpu_crtc) +
 			      (AMDGPUFB_CONN_LIMIT * sizeof(struct drm_connector *)), GFP_KERNEL);
@@ -2737,12 +2620,6 @@ static int dce_v6_0_crtc_init(struct amdgpu_device *adev, int index)
 	adev->ddev->mode_config.cursor_width = amdgpu_crtc->max_cursor_width;
 	adev->ddev->mode_config.cursor_height = amdgpu_crtc->max_cursor_height;
 
-	for (i = 0; i < 256; i++) {
-		amdgpu_crtc->lut_r[i] = i << 2;
-		amdgpu_crtc->lut_g[i] = i << 2;
-		amdgpu_crtc->lut_b[i] = i << 2;
-	}
-
 	amdgpu_crtc->crtc_offset = crtc_offsets[amdgpu_crtc->crtc_id];
 
 	amdgpu_crtc->pll_id = ATOM_PPLL_INVALID;
@@ -2873,6 +2750,8 @@ static int dce_v6_0_hw_init(void *handle)
 	int i;
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+	/* disable vga render */
+	dce_v6_0_set_vga_render_state(adev, false);
 	/* init dig PHYs, disp eng pll */
 	amdgpu_atombios_encoder_init_dig(adev);
 	amdgpu_atombios_crtc_set_disp_eng_pll(adev, adev->clock.default_dispclk);
@@ -3525,7 +3404,6 @@ static void dce_v6_0_encoder_add(struct amdgpu_device *adev,
 }
 
 static const struct amdgpu_display_funcs dce_v6_0_display_funcs = {
-	.set_vga_render_state = &dce_v6_0_set_vga_render_state,
 	.bandwidth_update = &dce_v6_0_bandwidth_update,
 	.vblank_get_counter = &dce_v6_0_vblank_get_counter,
 	.vblank_wait = &dce_v6_0_vblank_wait,
@@ -3538,8 +3416,6 @@ static const struct amdgpu_display_funcs dce_v6_0_display_funcs = {
 	.page_flip_get_scanoutpos = &dce_v6_0_crtc_get_scanoutpos,
 	.add_encoder = &dce_v6_0_encoder_add,
 	.add_connector = &amdgpu_connector_add,
-	.stop_mc_access = &dce_v6_0_stop_mc_access,
-	.resume_mc_access = &dce_v6_0_resume_mc_access,
 };
 
 static void dce_v6_0_set_display_funcs(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
index a9e8695..9cf14b8 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_v8_0.c
@@ -419,81 +419,6 @@ static bool dce_v8_0_is_display_hung(struct amdgpu_device *adev)
 	return true;
 }
 
-static void dce_v8_0_stop_mc_access(struct amdgpu_device *adev,
-				    struct amdgpu_mode_mc_save *save)
-{
-	u32 crtc_enabled, tmp;
-	int i;
-
-	save->vga_render_control = RREG32(mmVGA_RENDER_CONTROL);
-	save->vga_hdp_control = RREG32(mmVGA_HDP_CONTROL);
-
-	/* disable VGA render */
-	tmp = RREG32(mmVGA_RENDER_CONTROL);
-	tmp = REG_SET_FIELD(tmp, VGA_RENDER_CONTROL, VGA_VSTATUS_CNTL, 0);
-	WREG32(mmVGA_RENDER_CONTROL, tmp);
-
-	/* blank the display controllers */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		crtc_enabled = REG_GET_FIELD(RREG32(mmCRTC_CONTROL + crtc_offsets[i]),
-					     CRTC_CONTROL, CRTC_MASTER_EN);
-		if (crtc_enabled) {
-#if 1
-			save->crtc_enabled[i] = true;
-			tmp = RREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i]);
-			if (REG_GET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN) == 0) {
-				/*it is correct only for RGB ; black is 0*/
-				WREG32(mmCRTC_BLANK_DATA_COLOR + crtc_offsets[i], 0);
-				tmp = REG_SET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN, 1);
-				WREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i], tmp);
-			}
-			mdelay(20);
-#else
-			/* XXX this is a hack to avoid strange behavior with EFI on certain systems */
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 1);
-			tmp = RREG32(mmCRTC_CONTROL + crtc_offsets[i]);
-			tmp = REG_SET_FIELD(tmp, CRTC_CONTROL, CRTC_MASTER_EN, 0);
-			WREG32(mmCRTC_CONTROL + crtc_offsets[i], tmp);
-			WREG32(mmCRTC_UPDATE_LOCK + crtc_offsets[i], 0);
-			save->crtc_enabled[i] = false;
-			/* ***** */
-#endif
-		} else {
-			save->crtc_enabled[i] = false;
-		}
-	}
-}
-
-static void dce_v8_0_resume_mc_access(struct amdgpu_device *adev,
-				      struct amdgpu_mode_mc_save *save)
-{
-	u32 tmp;
-	int i;
-
-	/* update crtc base addresses */
-	for (i = 0; i < adev->mode_info.num_crtc; i++) {
-		WREG32(mmGRPH_PRIMARY_SURFACE_ADDRESS_HIGH + crtc_offsets[i],
-		       upper_32_bits(adev->mc.vram_start));
-		WREG32(mmGRPH_PRIMARY_SURFACE_ADDRESS + crtc_offsets[i],
-		       (u32)adev->mc.vram_start);
-
-		if (save->crtc_enabled[i]) {
-			tmp = RREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i]);
-			tmp = REG_SET_FIELD(tmp, CRTC_BLANK_CONTROL, CRTC_BLANK_DATA_EN, 0);
-			WREG32(mmCRTC_BLANK_CONTROL + crtc_offsets[i], tmp);
-		}
-		mdelay(20);
-	}
-
-	WREG32(mmVGA_MEMORY_BASE_ADDRESS_HIGH, upper_32_bits(adev->mc.vram_start));
-	WREG32(mmVGA_MEMORY_BASE_ADDRESS, lower_32_bits(adev->mc.vram_start));
-
-	/* Unlock vga access */
-	WREG32(mmVGA_HDP_CONTROL, save->vga_hdp_control);
-	mdelay(1);
-	WREG32(mmVGA_RENDER_CONTROL, save->vga_render_control);
-}
-
 static void dce_v8_0_set_vga_render_state(struct amdgpu_device *adev,
 					  bool render)
 {
@@ -1750,7 +1675,7 @@ static void dce_v8_0_afmt_setmode(struct drm_encoder *encoder,
 	dce_v8_0_audio_write_sad_regs(encoder);
 	dce_v8_0_audio_write_latency_fields(encoder, mode);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %zd\n", err);
 		return;
@@ -2124,6 +2049,7 @@ static void dce_v8_0_crtc_load_lut(struct drm_crtc *crtc)
 	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct amdgpu_device *adev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 
 	DRM_DEBUG_KMS("%d\n", amdgpu_crtc->crtc_id);
@@ -2153,11 +2079,14 @@ static void dce_v8_0_crtc_load_lut(struct drm_crtc *crtc)
 	WREG32(mmDC_LUT_WRITE_EN_MASK + amdgpu_crtc->crtc_offset, 0x00000007);
 
 	WREG32(mmDC_LUT_RW_INDEX + amdgpu_crtc->crtc_offset, 0);
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 	for (i = 0; i < 256; i++) {
 		WREG32(mmDC_LUT_30_COLOR + amdgpu_crtc->crtc_offset,
-		       (amdgpu_crtc->lut_r[i] << 20) |
-		       (amdgpu_crtc->lut_g[i] << 10) |
-		       (amdgpu_crtc->lut_b[i] << 0));
+		       ((*r++ & 0xffc0) << 14) |
+		       ((*g++ & 0xffc0) << 4) |
+		       (*b++ >> 6));
 	}
 
 	WREG32(mmDEGAMMA_CONTROL + amdgpu_crtc->crtc_offset,
@@ -2406,7 +2335,7 @@ static int dce_v8_0_crtc_cursor_set2(struct drm_crtc *crtc,
 	aobj = gem_to_amdgpu_bo(obj);
 	ret = amdgpu_bo_reserve(aobj, false);
 	if (ret != 0) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -2414,7 +2343,7 @@ static int dce_v8_0_crtc_cursor_set2(struct drm_crtc *crtc,
 	amdgpu_bo_unreserve(aobj);
 	if (ret) {
 		DRM_ERROR("Failed to pin new cursor BO (%d)\n", ret);
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -2448,7 +2377,7 @@ static int dce_v8_0_crtc_cursor_set2(struct drm_crtc *crtc,
 			amdgpu_bo_unpin(aobj);
 			amdgpu_bo_unreserve(aobj);
 		}
-		drm_gem_object_unreference_unlocked(amdgpu_crtc->cursor_bo);
+		drm_gem_object_put_unlocked(amdgpu_crtc->cursor_bo);
 	}
 
 	amdgpu_crtc->cursor_bo = obj;
@@ -2475,15 +2404,6 @@ static int dce_v8_0_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
 				   u16 *blue, uint32_t size,
 				   struct drm_modeset_acquire_ctx *ctx)
 {
-	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-	int i;
-
-	/* userspace palettes are always correct as is */
-	for (i = 0; i < size; i++) {
-		amdgpu_crtc->lut_r[i] = red[i] >> 6;
-		amdgpu_crtc->lut_g[i] = green[i] >> 6;
-		amdgpu_crtc->lut_b[i] = blue[i] >> 6;
-	}
 	dce_v8_0_crtc_load_lut(crtc);
 
 	return 0;
@@ -2702,14 +2622,12 @@ static const struct drm_crtc_helper_funcs dce_v8_0_crtc_helper_funcs = {
 	.mode_set_base_atomic = dce_v8_0_crtc_set_base_atomic,
 	.prepare = dce_v8_0_crtc_prepare,
 	.commit = dce_v8_0_crtc_commit,
-	.load_lut = dce_v8_0_crtc_load_lut,
 	.disable = dce_v8_0_crtc_disable,
 };
 
 static int dce_v8_0_crtc_init(struct amdgpu_device *adev, int index)
 {
 	struct amdgpu_crtc *amdgpu_crtc;
-	int i;
 
 	amdgpu_crtc = kzalloc(sizeof(struct amdgpu_crtc) +
 			      (AMDGPUFB_CONN_LIMIT * sizeof(struct drm_connector *)), GFP_KERNEL);
@@ -2727,12 +2645,6 @@ static int dce_v8_0_crtc_init(struct amdgpu_device *adev, int index)
 	adev->ddev->mode_config.cursor_width = amdgpu_crtc->max_cursor_width;
 	adev->ddev->mode_config.cursor_height = amdgpu_crtc->max_cursor_height;
 
-	for (i = 0; i < 256; i++) {
-		amdgpu_crtc->lut_r[i] = i << 2;
-		amdgpu_crtc->lut_g[i] = i << 2;
-		amdgpu_crtc->lut_b[i] = i << 2;
-	}
-
 	amdgpu_crtc->crtc_offset = crtc_offsets[amdgpu_crtc->crtc_id];
 
 	amdgpu_crtc->pll_id = ATOM_PPLL_INVALID;
@@ -2870,6 +2782,8 @@ static int dce_v8_0_hw_init(void *handle)
 	int i;
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
 
+	/* disable vga render */
+	dce_v8_0_set_vga_render_state(adev, false);
 	/* init dig PHYs, disp eng pll */
 	amdgpu_atombios_encoder_init_dig(adev);
 	amdgpu_atombios_crtc_set_disp_eng_pll(adev, adev->clock.default_dispclk);
@@ -3574,7 +3488,6 @@ static void dce_v8_0_encoder_add(struct amdgpu_device *adev,
 }
 
 static const struct amdgpu_display_funcs dce_v8_0_display_funcs = {
-	.set_vga_render_state = &dce_v8_0_set_vga_render_state,
 	.bandwidth_update = &dce_v8_0_bandwidth_update,
 	.vblank_get_counter = &dce_v8_0_vblank_get_counter,
 	.vblank_wait = &dce_v8_0_vblank_wait,
@@ -3587,8 +3500,6 @@ static const struct amdgpu_display_funcs dce_v8_0_display_funcs = {
 	.page_flip_get_scanoutpos = &dce_v8_0_crtc_get_scanoutpos,
 	.add_encoder = &dce_v8_0_encoder_add,
 	.add_connector = &amdgpu_connector_add,
-	.stop_mc_access = &dce_v8_0_stop_mc_access,
-	.resume_mc_access = &dce_v8_0_resume_mc_access,
 };
 
 static void dce_v8_0_set_display_funcs(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
index 90bb083..b9ee907 100644
--- a/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
+++ b/drivers/gpu/drm/amd/amdgpu/dce_virtual.c
@@ -95,62 +95,6 @@ static u32 dce_virtual_hpd_get_gpio_reg(struct amdgpu_device *adev)
 	return 0;
 }
 
-static void dce_virtual_stop_mc_access(struct amdgpu_device *adev,
-			      struct amdgpu_mode_mc_save *save)
-{
-	switch (adev->asic_type) {
-#ifdef CONFIG_DRM_AMDGPU_SI
-	case CHIP_TAHITI:
-	case CHIP_PITCAIRN:
-	case CHIP_VERDE:
-	case CHIP_OLAND:
-		dce_v6_0_disable_dce(adev);
-		break;
-#endif
-#ifdef CONFIG_DRM_AMDGPU_CIK
-	case CHIP_BONAIRE:
-	case CHIP_HAWAII:
-	case CHIP_KAVERI:
-	case CHIP_KABINI:
-	case CHIP_MULLINS:
-		dce_v8_0_disable_dce(adev);
-		break;
-#endif
-	case CHIP_FIJI:
-	case CHIP_TONGA:
-		dce_v10_0_disable_dce(adev);
-		break;
-	case CHIP_CARRIZO:
-	case CHIP_STONEY:
-	case CHIP_POLARIS10:
-	case CHIP_POLARIS11:
-	case CHIP_POLARIS12:
-		dce_v11_0_disable_dce(adev);
-		break;
-	case CHIP_TOPAZ:
-#ifdef CONFIG_DRM_AMDGPU_SI
-	case CHIP_HAINAN:
-#endif
-		/* no DCE */
-		return;
-	default:
-		DRM_ERROR("Virtual display unsupported ASIC type: 0x%X\n", adev->asic_type);
-	}
-
-	return;
-}
-static void dce_virtual_resume_mc_access(struct amdgpu_device *adev,
-				struct amdgpu_mode_mc_save *save)
-{
-	return;
-}
-
-static void dce_virtual_set_vga_render_state(struct amdgpu_device *adev,
-				    bool render)
-{
-	return;
-}
-
 /**
  * dce_virtual_bandwidth_update - program display watermarks
  *
@@ -168,16 +112,6 @@ static int dce_virtual_crtc_gamma_set(struct drm_crtc *crtc, u16 *red,
 				      u16 *green, u16 *blue, uint32_t size,
 				      struct drm_modeset_acquire_ctx *ctx)
 {
-	struct amdgpu_crtc *amdgpu_crtc = to_amdgpu_crtc(crtc);
-	int i;
-
-	/* userspace palettes are always correct as is */
-	for (i = 0; i < size; i++) {
-		amdgpu_crtc->lut_r[i] = red[i] >> 6;
-		amdgpu_crtc->lut_g[i] = green[i] >> 6;
-		amdgpu_crtc->lut_b[i] = blue[i] >> 6;
-	}
-
 	return 0;
 }
 
@@ -289,11 +223,6 @@ static int dce_virtual_crtc_set_base(struct drm_crtc *crtc, int x, int y,
 	return 0;
 }
 
-static void dce_virtual_crtc_load_lut(struct drm_crtc *crtc)
-{
-	return;
-}
-
 static int dce_virtual_crtc_set_base_atomic(struct drm_crtc *crtc,
 					 struct drm_framebuffer *fb,
 					 int x, int y, enum mode_set_atomic state)
@@ -309,14 +238,12 @@ static const struct drm_crtc_helper_funcs dce_virtual_crtc_helper_funcs = {
 	.mode_set_base_atomic = dce_virtual_crtc_set_base_atomic,
 	.prepare = dce_virtual_crtc_prepare,
 	.commit = dce_virtual_crtc_commit,
-	.load_lut = dce_virtual_crtc_load_lut,
 	.disable = dce_virtual_crtc_disable,
 };
 
 static int dce_virtual_crtc_init(struct amdgpu_device *adev, int index)
 {
 	struct amdgpu_crtc *amdgpu_crtc;
-	int i;
 
 	amdgpu_crtc = kzalloc(sizeof(struct amdgpu_crtc) +
 			      (AMDGPUFB_CONN_LIMIT * sizeof(struct drm_connector *)), GFP_KERNEL);
@@ -329,12 +256,6 @@ static int dce_virtual_crtc_init(struct amdgpu_device *adev, int index)
 	amdgpu_crtc->crtc_id = index;
 	adev->mode_info.crtcs[index] = amdgpu_crtc;
 
-	for (i = 0; i < 256; i++) {
-		amdgpu_crtc->lut_r[i] = i << 2;
-		amdgpu_crtc->lut_g[i] = i << 2;
-		amdgpu_crtc->lut_b[i] = i << 2;
-	}
-
 	amdgpu_crtc->pll_id = ATOM_PPLL_INVALID;
 	amdgpu_crtc->encoder = NULL;
 	amdgpu_crtc->connector = NULL;
@@ -522,6 +443,47 @@ static int dce_virtual_sw_fini(void *handle)
 
 static int dce_virtual_hw_init(void *handle)
 {
+	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
+
+	switch (adev->asic_type) {
+#ifdef CONFIG_DRM_AMDGPU_SI
+	case CHIP_TAHITI:
+	case CHIP_PITCAIRN:
+	case CHIP_VERDE:
+	case CHIP_OLAND:
+		dce_v6_0_disable_dce(adev);
+		break;
+#endif
+#ifdef CONFIG_DRM_AMDGPU_CIK
+	case CHIP_BONAIRE:
+	case CHIP_HAWAII:
+	case CHIP_KAVERI:
+	case CHIP_KABINI:
+	case CHIP_MULLINS:
+		dce_v8_0_disable_dce(adev);
+		break;
+#endif
+	case CHIP_FIJI:
+	case CHIP_TONGA:
+		dce_v10_0_disable_dce(adev);
+		break;
+	case CHIP_CARRIZO:
+	case CHIP_STONEY:
+	case CHIP_POLARIS11:
+	case CHIP_POLARIS10:
+		dce_v11_0_disable_dce(adev);
+		break;
+	case CHIP_TOPAZ:
+#ifdef CONFIG_DRM_AMDGPU_SI
+	case CHIP_HAINAN:
+#endif
+		/* no DCE */
+		break;
+	case CHIP_VEGA10:
+		break;
+	default:
+		DRM_ERROR("Virtual display unsupported ASIC type: 0x%X\n", adev->asic_type);
+	}
 	return 0;
 }
 
@@ -677,7 +639,6 @@ static int dce_virtual_connector_encoder_init(struct amdgpu_device *adev,
 }
 
 static const struct amdgpu_display_funcs dce_virtual_display_funcs = {
-	.set_vga_render_state = &dce_virtual_set_vga_render_state,
 	.bandwidth_update = &dce_virtual_bandwidth_update,
 	.vblank_get_counter = &dce_virtual_vblank_get_counter,
 	.vblank_wait = &dce_virtual_vblank_wait,
@@ -690,8 +651,6 @@ static const struct amdgpu_display_funcs dce_virtual_display_funcs = {
 	.page_flip_get_scanoutpos = &dce_virtual_crtc_get_scanoutpos,
 	.add_encoder = NULL,
 	.add_connector = NULL,
-	.stop_mc_access = &dce_virtual_stop_mc_access,
-	.resume_mc_access = &dce_virtual_resume_mc_access,
 };
 
 static void dce_virtual_set_display_funcs(struct amdgpu_device *adev)
@@ -809,7 +768,7 @@ static const struct amdgpu_irq_src_funcs dce_virtual_crtc_irq_funcs = {
 
 static void dce_virtual_set_irq_funcs(struct amdgpu_device *adev)
 {
-	adev->crtc_irq.num_types = AMDGPU_CRTC_IRQ_LAST;
+	adev->crtc_irq.num_types = AMDGPU_CRTC_IRQ_VBLANK6 + 1;
 	adev->crtc_irq.funcs = &dce_virtual_crtc_irq_funcs;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
index 5173ca1..d228f5a 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v6_0.c
@@ -1573,7 +1573,7 @@ static void gfx_v6_0_gpu_init(struct amdgpu_device *adev)
 
 static void gfx_v6_0_scratch_init(struct amdgpu_device *adev)
 {
-	adev->gfx.scratch.num_reg = 7;
+	adev->gfx.scratch.num_reg = 8;
 	adev->gfx.scratch.reg_base = mmSCRATCH_REG0;
 	adev->gfx.scratch.free_mask = (1u << adev->gfx.scratch.num_reg) - 1;
 }
@@ -2217,40 +2217,9 @@ static void gfx_v6_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
 
 static void gfx_v6_0_rlc_fini(struct amdgpu_device *adev)
 {
-	int r;
-
-	if (adev->gfx.rlc.save_restore_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.rlc.save_restore_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve RLC sr bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.rlc.save_restore_obj);
-		amdgpu_bo_unreserve(adev->gfx.rlc.save_restore_obj);
-
-		amdgpu_bo_unref(&adev->gfx.rlc.save_restore_obj);
-		adev->gfx.rlc.save_restore_obj = NULL;
-	}
-
-	if (adev->gfx.rlc.clear_state_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.rlc.clear_state_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve RLC c bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.rlc.clear_state_obj);
-		amdgpu_bo_unreserve(adev->gfx.rlc.clear_state_obj);
-
-		amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
-		adev->gfx.rlc.clear_state_obj = NULL;
-	}
-
-	if (adev->gfx.rlc.cp_table_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.rlc.cp_table_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve RLC cp table bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.rlc.cp_table_obj);
-		amdgpu_bo_unreserve(adev->gfx.rlc.cp_table_obj);
-
-		amdgpu_bo_unref(&adev->gfx.rlc.cp_table_obj);
-		adev->gfx.rlc.cp_table_obj = NULL;
-	}
+	amdgpu_bo_free_kernel(&adev->gfx.rlc.save_restore_obj, NULL, NULL);
+	amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj, NULL, NULL);
+	amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj, NULL, NULL);
 }
 
 static int gfx_v6_0_rlc_init(struct amdgpu_device *adev)
@@ -2273,43 +2242,23 @@ static int gfx_v6_0_rlc_init(struct amdgpu_device *adev)
 
 	if (src_ptr) {
 		/* save restore block */
-		if (adev->gfx.rlc.save_restore_obj == NULL) {
-			r = amdgpu_bo_create(adev, dws * 4, PAGE_SIZE, true,
-					     AMDGPU_GEM_DOMAIN_VRAM,
-					     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED,
-					     NULL, NULL,
-					     &adev->gfx.rlc.save_restore_obj);
-
-			if (r) {
-				dev_warn(adev->dev, "(%d) create RLC sr bo failed\n", r);
-				return r;
-			}
-		}
-
-		r = amdgpu_bo_reserve(adev->gfx.rlc.save_restore_obj, false);
-		if (unlikely(r != 0)) {
-			gfx_v6_0_rlc_fini(adev);
-			return r;
-		}
-		r = amdgpu_bo_pin(adev->gfx.rlc.save_restore_obj, AMDGPU_GEM_DOMAIN_VRAM,
-				  &adev->gfx.rlc.save_restore_gpu_addr);
+		r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE,
+					      AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.save_restore_obj,
+					      &adev->gfx.rlc.save_restore_gpu_addr,
+					      (void **)&adev->gfx.rlc.sr_ptr);
 		if (r) {
-			amdgpu_bo_unreserve(adev->gfx.rlc.save_restore_obj);
-			dev_warn(adev->dev, "(%d) pin RLC sr bo failed\n", r);
+			dev_warn(adev->dev, "(%d) create RLC sr bo failed\n",
+				 r);
 			gfx_v6_0_rlc_fini(adev);
 			return r;
 		}
 
-		r = amdgpu_bo_kmap(adev->gfx.rlc.save_restore_obj, (void **)&adev->gfx.rlc.sr_ptr);
-		if (r) {
-			dev_warn(adev->dev, "(%d) map RLC sr bo failed\n", r);
-			gfx_v6_0_rlc_fini(adev);
-			return r;
-		}
 		/* write the sr buffer */
 		dst_ptr = adev->gfx.rlc.sr_ptr;
 		for (i = 0; i < adev->gfx.rlc.reg_list_size; i++)
 			dst_ptr[i] = cpu_to_le32(src_ptr[i]);
+
 		amdgpu_bo_kunmap(adev->gfx.rlc.save_restore_obj);
 		amdgpu_bo_unreserve(adev->gfx.rlc.save_restore_obj);
 	}
@@ -2319,39 +2268,17 @@ static int gfx_v6_0_rlc_init(struct amdgpu_device *adev)
 		adev->gfx.rlc.clear_state_size = gfx_v6_0_get_csb_size(adev);
 		dws = adev->gfx.rlc.clear_state_size + (256 / 4);
 
-		if (adev->gfx.rlc.clear_state_obj == NULL) {
-			r = amdgpu_bo_create(adev, dws * 4, PAGE_SIZE, true,
-					     AMDGPU_GEM_DOMAIN_VRAM,
-					     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED,
-					     NULL, NULL,
-					     &adev->gfx.rlc.clear_state_obj);
-
-			if (r) {
-				dev_warn(adev->dev, "(%d) create RLC c bo failed\n", r);
-				gfx_v6_0_rlc_fini(adev);
-				return r;
-			}
-		}
-		r = amdgpu_bo_reserve(adev->gfx.rlc.clear_state_obj, false);
-		if (unlikely(r != 0)) {
-			gfx_v6_0_rlc_fini(adev);
-			return r;
-		}
-		r = amdgpu_bo_pin(adev->gfx.rlc.clear_state_obj, AMDGPU_GEM_DOMAIN_VRAM,
-				  &adev->gfx.rlc.clear_state_gpu_addr);
+		r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE,
+					      AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.clear_state_obj,
+					      &adev->gfx.rlc.clear_state_gpu_addr,
+					      (void **)&adev->gfx.rlc.cs_ptr);
 		if (r) {
-			amdgpu_bo_unreserve(adev->gfx.rlc.clear_state_obj);
-			dev_warn(adev->dev, "(%d) pin RLC c bo failed\n", r);
+			dev_warn(adev->dev, "(%d) create RLC c bo failed\n", r);
 			gfx_v6_0_rlc_fini(adev);
 			return r;
 		}
 
-		r = amdgpu_bo_kmap(adev->gfx.rlc.clear_state_obj, (void **)&adev->gfx.rlc.cs_ptr);
-		if (r) {
-			dev_warn(adev->dev, "(%d) map RLC c bo failed\n", r);
-			gfx_v6_0_rlc_fini(adev);
-			return r;
-		}
 		/* set up the cs buffer */
 		dst_ptr = adev->gfx.rlc.cs_ptr;
 		reg_list_mc_addr = adev->gfx.rlc.clear_state_gpu_addr + 256;
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
index 37b45e4..0086876 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v7_0.c
@@ -1823,7 +1823,7 @@ static void gfx_v7_0_setup_rb(struct amdgpu_device *adev)
 }
 
 /**
- * gmc_v7_0_init_compute_vmid - gart enable
+ * gfx_v7_0_init_compute_vmid - gart enable
  *
  * @adev: amdgpu_device pointer
  *
@@ -1833,7 +1833,7 @@ static void gfx_v7_0_setup_rb(struct amdgpu_device *adev)
 #define DEFAULT_SH_MEM_BASES	(0x6000)
 #define FIRST_COMPUTE_VMID	(8)
 #define LAST_COMPUTE_VMID	(16)
-static void gmc_v7_0_init_compute_vmid(struct amdgpu_device *adev)
+static void gfx_v7_0_init_compute_vmid(struct amdgpu_device *adev)
 {
 	int i;
 	uint32_t sh_mem_config;
@@ -1921,6 +1921,7 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
 				   ELEMENT_SIZE, 1);
 	sh_static_mem_cfg = REG_SET_FIELD(sh_static_mem_cfg, SH_STATIC_MEM_CONFIG,
 				   INDEX_STRIDE, 3);
+	WREG32(mmSH_STATIC_MEM_CONFIG, sh_static_mem_cfg);
 
 	mutex_lock(&adev->srbm_mutex);
 	for (i = 0; i < adev->vm_manager.id_mgr[0].num_ids; i++) {
@@ -1934,12 +1935,11 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
 		WREG32(mmSH_MEM_APE1_BASE, 1);
 		WREG32(mmSH_MEM_APE1_LIMIT, 0);
 		WREG32(mmSH_MEM_BASES, sh_mem_base);
-		WREG32(mmSH_STATIC_MEM_CONFIG, sh_static_mem_cfg);
 	}
 	cik_srbm_select(adev, 0, 0, 0, 0);
 	mutex_unlock(&adev->srbm_mutex);
 
-	gmc_v7_0_init_compute_vmid(adev);
+	gfx_v7_0_init_compute_vmid(adev);
 
 	WREG32(mmSX_DEBUG_1, 0x20);
 
@@ -2021,7 +2021,7 @@ static void gfx_v7_0_gpu_init(struct amdgpu_device *adev)
  */
 static void gfx_v7_0_scratch_init(struct amdgpu_device *adev)
 {
-	adev->gfx.scratch.num_reg = 7;
+	adev->gfx.scratch.num_reg = 8;
 	adev->gfx.scratch.reg_base = mmSCRATCH_REG0;
 	adev->gfx.scratch.free_mask = (1u << adev->gfx.scratch.num_reg) - 1;
 }
@@ -2774,39 +2774,18 @@ static int gfx_v7_0_cp_compute_load_microcode(struct amdgpu_device *adev)
  */
 static void gfx_v7_0_cp_compute_fini(struct amdgpu_device *adev)
 {
-	int i, r;
+	int i;
 
 	for (i = 0; i < adev->gfx.num_compute_rings; i++) {
 		struct amdgpu_ring *ring = &adev->gfx.compute_ring[i];
 
-		if (ring->mqd_obj) {
-			r = amdgpu_bo_reserve(ring->mqd_obj, true);
-			if (unlikely(r != 0))
-				dev_warn(adev->dev, "(%d) reserve MQD bo failed\n", r);
-
-			amdgpu_bo_unpin(ring->mqd_obj);
-			amdgpu_bo_unreserve(ring->mqd_obj);
-
-			amdgpu_bo_unref(&ring->mqd_obj);
-			ring->mqd_obj = NULL;
-		}
+		amdgpu_bo_free_kernel(&ring->mqd_obj, NULL, NULL);
 	}
 }
 
 static void gfx_v7_0_mec_fini(struct amdgpu_device *adev)
 {
-	int r;
-
-	if (adev->gfx.mec.hpd_eop_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.mec.hpd_eop_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve HPD EOP bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.mec.hpd_eop_obj);
-		amdgpu_bo_unreserve(adev->gfx.mec.hpd_eop_obj);
-
-		amdgpu_bo_unref(&adev->gfx.mec.hpd_eop_obj);
-		adev->gfx.mec.hpd_eop_obj = NULL;
-	}
+	amdgpu_bo_free_kernel(&adev->gfx.mec.hpd_eop_obj, NULL, NULL);
 }
 
 static int gfx_v7_0_mec_init(struct amdgpu_device *adev)
@@ -2823,33 +2802,14 @@ static int gfx_v7_0_mec_init(struct amdgpu_device *adev)
 	/* allocate space for ALL pipes (even the ones we don't own) */
 	mec_hpd_size = adev->gfx.mec.num_mec * adev->gfx.mec.num_pipe_per_mec
 		* GFX7_MEC_HPD_SIZE * 2;
-	if (adev->gfx.mec.hpd_eop_obj == NULL) {
-		r = amdgpu_bo_create(adev,
-				     mec_hpd_size,
-				     PAGE_SIZE, true,
-				     AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL,
-				     &adev->gfx.mec.hpd_eop_obj);
-		if (r) {
-			dev_warn(adev->dev, "(%d) create HDP EOP bo failed\n", r);
-			return r;
-		}
-	}
 
-	r = amdgpu_bo_reserve(adev->gfx.mec.hpd_eop_obj, false);
-	if (unlikely(r != 0)) {
-		gfx_v7_0_mec_fini(adev);
-		return r;
-	}
-	r = amdgpu_bo_pin(adev->gfx.mec.hpd_eop_obj, AMDGPU_GEM_DOMAIN_GTT,
-			  &adev->gfx.mec.hpd_eop_gpu_addr);
+	r = amdgpu_bo_create_reserved(adev, mec_hpd_size, PAGE_SIZE,
+				      AMDGPU_GEM_DOMAIN_GTT,
+				      &adev->gfx.mec.hpd_eop_obj,
+				      &adev->gfx.mec.hpd_eop_gpu_addr,
+				      (void **)&hpd);
 	if (r) {
-		dev_warn(adev->dev, "(%d) pin HDP EOP bo failed\n", r);
-		gfx_v7_0_mec_fini(adev);
-		return r;
-	}
-	r = amdgpu_bo_kmap(adev->gfx.mec.hpd_eop_obj, (void **)&hpd);
-	if (r) {
-		dev_warn(adev->dev, "(%d) map HDP EOP bo failed\n", r);
+		dev_warn(adev->dev, "(%d) create, pin or map of HDP EOP bo failed\n", r);
 		gfx_v7_0_mec_fini(adev);
 		return r;
 	}
@@ -3108,32 +3068,12 @@ static int gfx_v7_0_compute_queue_init(struct amdgpu_device *adev, int ring_id)
 	struct cik_mqd *mqd;
 	struct amdgpu_ring *ring = &adev->gfx.compute_ring[ring_id];
 
-	if (ring->mqd_obj == NULL) {
-		r = amdgpu_bo_create(adev,
-				sizeof(struct cik_mqd),
-				PAGE_SIZE, true,
-				AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL,
-				&ring->mqd_obj);
-		if (r) {
-			dev_warn(adev->dev, "(%d) create MQD bo failed\n", r);
-			return r;
-		}
-	}
-
-	r = amdgpu_bo_reserve(ring->mqd_obj, false);
-	if (unlikely(r != 0))
-		goto out;
-
-	r = amdgpu_bo_pin(ring->mqd_obj, AMDGPU_GEM_DOMAIN_GTT,
-			&mqd_gpu_addr);
+	r = amdgpu_bo_create_reserved(adev, sizeof(struct cik_mqd), PAGE_SIZE,
+				      AMDGPU_GEM_DOMAIN_GTT, &ring->mqd_obj,
+				      &mqd_gpu_addr, (void **)&mqd);
 	if (r) {
-		dev_warn(adev->dev, "(%d) pin MQD bo failed\n", r);
-		goto out_unreserve;
-	}
-	r = amdgpu_bo_kmap(ring->mqd_obj, (void **)&mqd);
-	if (r) {
-		dev_warn(adev->dev, "(%d) map MQD bo failed\n", r);
-		goto out_unreserve;
+		dev_warn(adev->dev, "(%d) create MQD bo failed\n", r);
+		return r;
 	}
 
 	mutex_lock(&adev->srbm_mutex);
@@ -3147,9 +3087,7 @@ static int gfx_v7_0_compute_queue_init(struct amdgpu_device *adev, int ring_id)
 	mutex_unlock(&adev->srbm_mutex);
 
 	amdgpu_bo_kunmap(ring->mqd_obj);
-out_unreserve:
 	amdgpu_bo_unreserve(ring->mqd_obj);
-out:
 	return 0;
 }
 
@@ -3361,43 +3299,9 @@ static void gfx_v7_0_ring_emit_vm_flush(struct amdgpu_ring *ring,
  */
 static void gfx_v7_0_rlc_fini(struct amdgpu_device *adev)
 {
-	int r;
-
-	/* save restore block */
-	if (adev->gfx.rlc.save_restore_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.rlc.save_restore_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve RLC sr bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.rlc.save_restore_obj);
-		amdgpu_bo_unreserve(adev->gfx.rlc.save_restore_obj);
-
-		amdgpu_bo_unref(&adev->gfx.rlc.save_restore_obj);
-		adev->gfx.rlc.save_restore_obj = NULL;
-	}
-
-	/* clear state block */
-	if (adev->gfx.rlc.clear_state_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.rlc.clear_state_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve RLC c bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.rlc.clear_state_obj);
-		amdgpu_bo_unreserve(adev->gfx.rlc.clear_state_obj);
-
-		amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
-		adev->gfx.rlc.clear_state_obj = NULL;
-	}
-
-	/* clear state block */
-	if (adev->gfx.rlc.cp_table_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.rlc.cp_table_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve RLC cp table bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.rlc.cp_table_obj);
-		amdgpu_bo_unreserve(adev->gfx.rlc.cp_table_obj);
-
-		amdgpu_bo_unref(&adev->gfx.rlc.cp_table_obj);
-		adev->gfx.rlc.cp_table_obj = NULL;
-	}
+	amdgpu_bo_free_kernel(&adev->gfx.rlc.save_restore_obj, NULL, NULL);
+	amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj, NULL, NULL);
+	amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj, NULL, NULL);
 }
 
 static int gfx_v7_0_rlc_init(struct amdgpu_device *adev)
@@ -3432,39 +3336,17 @@ static int gfx_v7_0_rlc_init(struct amdgpu_device *adev)
 
 	if (src_ptr) {
 		/* save restore block */
-		if (adev->gfx.rlc.save_restore_obj == NULL) {
-			r = amdgpu_bo_create(adev, dws * 4, PAGE_SIZE, true,
-					     AMDGPU_GEM_DOMAIN_VRAM,
-					     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-					     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-					     NULL, NULL,
-					     &adev->gfx.rlc.save_restore_obj);
-			if (r) {
-				dev_warn(adev->dev, "(%d) create RLC sr bo failed\n", r);
-				return r;
-			}
-		}
-
-		r = amdgpu_bo_reserve(adev->gfx.rlc.save_restore_obj, false);
-		if (unlikely(r != 0)) {
-			gfx_v7_0_rlc_fini(adev);
-			return r;
-		}
-		r = amdgpu_bo_pin(adev->gfx.rlc.save_restore_obj, AMDGPU_GEM_DOMAIN_VRAM,
-				  &adev->gfx.rlc.save_restore_gpu_addr);
+		r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE,
+					      AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.save_restore_obj,
+					      &adev->gfx.rlc.save_restore_gpu_addr,
+					      (void **)&adev->gfx.rlc.sr_ptr);
 		if (r) {
-			amdgpu_bo_unreserve(adev->gfx.rlc.save_restore_obj);
-			dev_warn(adev->dev, "(%d) pin RLC sr bo failed\n", r);
+			dev_warn(adev->dev, "(%d) create, pin or map of RLC sr bo failed\n", r);
 			gfx_v7_0_rlc_fini(adev);
 			return r;
 		}
 
-		r = amdgpu_bo_kmap(adev->gfx.rlc.save_restore_obj, (void **)&adev->gfx.rlc.sr_ptr);
-		if (r) {
-			dev_warn(adev->dev, "(%d) map RLC sr bo failed\n", r);
-			gfx_v7_0_rlc_fini(adev);
-			return r;
-		}
 		/* write the sr buffer */
 		dst_ptr = adev->gfx.rlc.sr_ptr;
 		for (i = 0; i < adev->gfx.rlc.reg_list_size; i++)
@@ -3477,39 +3359,17 @@ static int gfx_v7_0_rlc_init(struct amdgpu_device *adev)
 		/* clear state block */
 		adev->gfx.rlc.clear_state_size = dws = gfx_v7_0_get_csb_size(adev);
 
-		if (adev->gfx.rlc.clear_state_obj == NULL) {
-			r = amdgpu_bo_create(adev, dws * 4, PAGE_SIZE, true,
-					     AMDGPU_GEM_DOMAIN_VRAM,
-					     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-					     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-					     NULL, NULL,
-					     &adev->gfx.rlc.clear_state_obj);
-			if (r) {
-				dev_warn(adev->dev, "(%d) create RLC c bo failed\n", r);
-				gfx_v7_0_rlc_fini(adev);
-				return r;
-			}
-		}
-		r = amdgpu_bo_reserve(adev->gfx.rlc.clear_state_obj, false);
-		if (unlikely(r != 0)) {
-			gfx_v7_0_rlc_fini(adev);
-			return r;
-		}
-		r = amdgpu_bo_pin(adev->gfx.rlc.clear_state_obj, AMDGPU_GEM_DOMAIN_VRAM,
-				  &adev->gfx.rlc.clear_state_gpu_addr);
+		r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE,
+					      AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.clear_state_obj,
+					      &adev->gfx.rlc.clear_state_gpu_addr,
+					      (void **)&adev->gfx.rlc.cs_ptr);
 		if (r) {
-			amdgpu_bo_unreserve(adev->gfx.rlc.clear_state_obj);
-			dev_warn(adev->dev, "(%d) pin RLC c bo failed\n", r);
+			dev_warn(adev->dev, "(%d) create RLC c bo failed\n", r);
 			gfx_v7_0_rlc_fini(adev);
 			return r;
 		}
 
-		r = amdgpu_bo_kmap(adev->gfx.rlc.clear_state_obj, (void **)&adev->gfx.rlc.cs_ptr);
-		if (r) {
-			dev_warn(adev->dev, "(%d) map RLC c bo failed\n", r);
-			gfx_v7_0_rlc_fini(adev);
-			return r;
-		}
 		/* set up the cs buffer */
 		dst_ptr = adev->gfx.rlc.cs_ptr;
 		gfx_v7_0_get_csb_buffer(adev, dst_ptr);
@@ -3518,37 +3378,14 @@ static int gfx_v7_0_rlc_init(struct amdgpu_device *adev)
 	}
 
 	if (adev->gfx.rlc.cp_table_size) {
-		if (adev->gfx.rlc.cp_table_obj == NULL) {
-			r = amdgpu_bo_create(adev, adev->gfx.rlc.cp_table_size, PAGE_SIZE, true,
-					     AMDGPU_GEM_DOMAIN_VRAM,
-					     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-					     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-					     NULL, NULL,
-					     &adev->gfx.rlc.cp_table_obj);
-			if (r) {
-				dev_warn(adev->dev, "(%d) create RLC cp table bo failed\n", r);
-				gfx_v7_0_rlc_fini(adev);
-				return r;
-			}
-		}
 
-		r = amdgpu_bo_reserve(adev->gfx.rlc.cp_table_obj, false);
-		if (unlikely(r != 0)) {
-			dev_warn(adev->dev, "(%d) reserve RLC cp table bo failed\n", r);
-			gfx_v7_0_rlc_fini(adev);
-			return r;
-		}
-		r = amdgpu_bo_pin(adev->gfx.rlc.cp_table_obj, AMDGPU_GEM_DOMAIN_VRAM,
-				  &adev->gfx.rlc.cp_table_gpu_addr);
+		r = amdgpu_bo_create_reserved(adev, adev->gfx.rlc.cp_table_size,
+					      PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.cp_table_obj,
+					      &adev->gfx.rlc.cp_table_gpu_addr,
+					      (void **)&adev->gfx.rlc.cp_table_ptr);
 		if (r) {
-			amdgpu_bo_unreserve(adev->gfx.rlc.cp_table_obj);
-			dev_warn(adev->dev, "(%d) pin RLC cp_table bo failed\n", r);
-			gfx_v7_0_rlc_fini(adev);
-			return r;
-		}
-		r = amdgpu_bo_kmap(adev->gfx.rlc.cp_table_obj, (void **)&adev->gfx.rlc.cp_table_ptr);
-		if (r) {
-			dev_warn(adev->dev, "(%d) map RLC cp table bo failed\n", r);
+			dev_warn(adev->dev, "(%d) create RLC cp table bo failed\n", r);
 			gfx_v7_0_rlc_fini(adev);
 			return r;
 		}
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
index aa5a50f..832e592 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v8_0.c
@@ -193,8 +193,8 @@ static const u32 tonga_golden_common_all[] =
 	mmGB_ADDR_CONFIG, 0xffffffff, 0x22011003,
 	mmSPI_RESOURCE_RESERVE_CU_0, 0xffffffff, 0x00000800,
 	mmSPI_RESOURCE_RESERVE_CU_1, 0xffffffff, 0x00000800,
-	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
-	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00007FAF
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00FF7FBF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00FF7FAF
 };
 
 static const u32 tonga_mgcg_cgcg_init[] =
@@ -303,8 +303,8 @@ static const u32 polaris11_golden_common_all[] =
 	mmGB_ADDR_CONFIG, 0xffffffff, 0x22011002,
 	mmSPI_RESOURCE_RESERVE_CU_0, 0xffffffff, 0x00000800,
 	mmSPI_RESOURCE_RESERVE_CU_1, 0xffffffff, 0x00000800,
-	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
-	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00007FAF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00FF7FBF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00FF7FAF,
 };
 
 static const u32 golden_settings_polaris10_a11[] =
@@ -336,8 +336,8 @@ static const u32 polaris10_golden_common_all[] =
 	mmGB_ADDR_CONFIG, 0xffffffff, 0x22011003,
 	mmSPI_RESOURCE_RESERVE_CU_0, 0xffffffff, 0x00000800,
 	mmSPI_RESOURCE_RESERVE_CU_1, 0xffffffff, 0x00000800,
-	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
-	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00007FAF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00FF7FBF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00FF7FAF,
 };
 
 static const u32 fiji_golden_common_all[] =
@@ -348,8 +348,8 @@ static const u32 fiji_golden_common_all[] =
 	mmGB_ADDR_CONFIG, 0xffffffff, 0x22011003,
 	mmSPI_RESOURCE_RESERVE_CU_0, 0xffffffff, 0x00000800,
 	mmSPI_RESOURCE_RESERVE_CU_1, 0xffffffff, 0x00000800,
-	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
-	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00007FAF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00FF7FBF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00FF7FAF,
 	mmGRBM_GFX_INDEX, 0xffffffff, 0xe0000000,
 	mmSPI_CONFIG_CNTL_1, 0x0000000f, 0x00000009,
 };
@@ -436,8 +436,8 @@ static const u32 iceland_golden_common_all[] =
 	mmGB_ADDR_CONFIG, 0xffffffff, 0x22010001,
 	mmSPI_RESOURCE_RESERVE_CU_0, 0xffffffff, 0x00000800,
 	mmSPI_RESOURCE_RESERVE_CU_1, 0xffffffff, 0x00000800,
-	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
-	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00007FAF
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00FF7FBF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00FF7FAF
 };
 
 static const u32 iceland_mgcg_cgcg_init[] =
@@ -532,8 +532,8 @@ static const u32 cz_golden_common_all[] =
 	mmGB_ADDR_CONFIG, 0xffffffff, 0x22010001,
 	mmSPI_RESOURCE_RESERVE_CU_0, 0xffffffff, 0x00000800,
 	mmSPI_RESOURCE_RESERVE_CU_1, 0xffffffff, 0x00000800,
-	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
-	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00007FAF
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00FF7FBF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00FF7FAF
 };
 
 static const u32 cz_mgcg_cgcg_init[] =
@@ -637,8 +637,8 @@ static const u32 stoney_golden_common_all[] =
 	mmGB_ADDR_CONFIG, 0xffffffff, 0x12010001,
 	mmSPI_RESOURCE_RESERVE_CU_0, 0xffffffff, 0x00000800,
 	mmSPI_RESOURCE_RESERVE_CU_1, 0xffffffff, 0x00000800,
-	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00007FBF,
-	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00007FAF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_0, 0xffffffff, 0x00FF7FBF,
+	mmSPI_RESOURCE_RESERVE_EN_CU_1, 0xffffffff, 0x00FF7FAF,
 };
 
 static const u32 stoney_mgcg_cgcg_init[] =
@@ -750,7 +750,7 @@ static void gfx_v8_0_init_golden_registers(struct amdgpu_device *adev)
 
 static void gfx_v8_0_scratch_init(struct amdgpu_device *adev)
 {
-	adev->gfx.scratch.num_reg = 7;
+	adev->gfx.scratch.num_reg = 8;
 	adev->gfx.scratch.reg_base = mmSCRATCH_REG0;
 	adev->gfx.scratch.free_mask = (1u << adev->gfx.scratch.num_reg) - 1;
 }
@@ -1238,29 +1238,8 @@ static void cz_init_cp_jump_table(struct amdgpu_device *adev)
 
 static void gfx_v8_0_rlc_fini(struct amdgpu_device *adev)
 {
-	int r;
-
-	/* clear state block */
-	if (adev->gfx.rlc.clear_state_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.rlc.clear_state_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve RLC cbs bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.rlc.clear_state_obj);
-		amdgpu_bo_unreserve(adev->gfx.rlc.clear_state_obj);
-		amdgpu_bo_unref(&adev->gfx.rlc.clear_state_obj);
-		adev->gfx.rlc.clear_state_obj = NULL;
-	}
-
-	/* jump table block */
-	if (adev->gfx.rlc.cp_table_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.rlc.cp_table_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve RLC cp table bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.rlc.cp_table_obj);
-		amdgpu_bo_unreserve(adev->gfx.rlc.cp_table_obj);
-		amdgpu_bo_unref(&adev->gfx.rlc.cp_table_obj);
-		adev->gfx.rlc.cp_table_obj = NULL;
-	}
+	amdgpu_bo_free_kernel(&adev->gfx.rlc.clear_state_obj, NULL, NULL);
+	amdgpu_bo_free_kernel(&adev->gfx.rlc.cp_table_obj, NULL, NULL);
 }
 
 static int gfx_v8_0_rlc_init(struct amdgpu_device *adev)
@@ -1278,39 +1257,17 @@ static int gfx_v8_0_rlc_init(struct amdgpu_device *adev)
 		/* clear state block */
 		adev->gfx.rlc.clear_state_size = dws = gfx_v8_0_get_csb_size(adev);
 
-		if (adev->gfx.rlc.clear_state_obj == NULL) {
-			r = amdgpu_bo_create(adev, dws * 4, PAGE_SIZE, true,
-					     AMDGPU_GEM_DOMAIN_VRAM,
-					     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-					     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-					     NULL, NULL,
-					     &adev->gfx.rlc.clear_state_obj);
-			if (r) {
-				dev_warn(adev->dev, "(%d) create RLC c bo failed\n", r);
-				gfx_v8_0_rlc_fini(adev);
-				return r;
-			}
-		}
-		r = amdgpu_bo_reserve(adev->gfx.rlc.clear_state_obj, false);
-		if (unlikely(r != 0)) {
-			gfx_v8_0_rlc_fini(adev);
-			return r;
-		}
-		r = amdgpu_bo_pin(adev->gfx.rlc.clear_state_obj, AMDGPU_GEM_DOMAIN_VRAM,
-				  &adev->gfx.rlc.clear_state_gpu_addr);
+		r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE,
+					      AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.clear_state_obj,
+					      &adev->gfx.rlc.clear_state_gpu_addr,
+					      (void **)&adev->gfx.rlc.cs_ptr);
 		if (r) {
-			amdgpu_bo_unreserve(adev->gfx.rlc.clear_state_obj);
-			dev_warn(adev->dev, "(%d) pin RLC cbs bo failed\n", r);
+			dev_warn(adev->dev, "(%d) create RLC c bo failed\n", r);
 			gfx_v8_0_rlc_fini(adev);
 			return r;
 		}
 
-		r = amdgpu_bo_kmap(adev->gfx.rlc.clear_state_obj, (void **)&adev->gfx.rlc.cs_ptr);
-		if (r) {
-			dev_warn(adev->dev, "(%d) map RLC cbs bo failed\n", r);
-			gfx_v8_0_rlc_fini(adev);
-			return r;
-		}
 		/* set up the cs buffer */
 		dst_ptr = adev->gfx.rlc.cs_ptr;
 		gfx_v8_0_get_csb_buffer(adev, dst_ptr);
@@ -1321,34 +1278,13 @@ static int gfx_v8_0_rlc_init(struct amdgpu_device *adev)
 	if ((adev->asic_type == CHIP_CARRIZO) ||
 	    (adev->asic_type == CHIP_STONEY)) {
 		adev->gfx.rlc.cp_table_size = ALIGN(96 * 5 * 4, 2048) + (64 * 1024); /* JT + GDS */
-		if (adev->gfx.rlc.cp_table_obj == NULL) {
-			r = amdgpu_bo_create(adev, adev->gfx.rlc.cp_table_size, PAGE_SIZE, true,
-					     AMDGPU_GEM_DOMAIN_VRAM,
-					     AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
-					     AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS,
-					     NULL, NULL,
-					     &adev->gfx.rlc.cp_table_obj);
-			if (r) {
-				dev_warn(adev->dev, "(%d) create RLC cp table bo failed\n", r);
-				return r;
-			}
-		}
-
-		r = amdgpu_bo_reserve(adev->gfx.rlc.cp_table_obj, false);
-		if (unlikely(r != 0)) {
-			dev_warn(adev->dev, "(%d) reserve RLC cp table bo failed\n", r);
-			return r;
-		}
-		r = amdgpu_bo_pin(adev->gfx.rlc.cp_table_obj, AMDGPU_GEM_DOMAIN_VRAM,
-				  &adev->gfx.rlc.cp_table_gpu_addr);
+		r = amdgpu_bo_create_reserved(adev, adev->gfx.rlc.cp_table_size,
+					      PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.cp_table_obj,
+					      &adev->gfx.rlc.cp_table_gpu_addr,
+					      (void **)&adev->gfx.rlc.cp_table_ptr);
 		if (r) {
-			amdgpu_bo_unreserve(adev->gfx.rlc.cp_table_obj);
-			dev_warn(adev->dev, "(%d) pin RLC cp table bo failed\n", r);
-			return r;
-		}
-		r = amdgpu_bo_kmap(adev->gfx.rlc.cp_table_obj, (void **)&adev->gfx.rlc.cp_table_ptr);
-		if (r) {
-			dev_warn(adev->dev, "(%d) map RLC cp table bo failed\n", r);
+			dev_warn(adev->dev, "(%d) create RLC cp table bo failed\n", r);
 			return r;
 		}
 
@@ -1363,17 +1299,7 @@ static int gfx_v8_0_rlc_init(struct amdgpu_device *adev)
 
 static void gfx_v8_0_mec_fini(struct amdgpu_device *adev)
 {
-	int r;
-
-	if (adev->gfx.mec.hpd_eop_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.mec.hpd_eop_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve HPD EOP bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.mec.hpd_eop_obj);
-		amdgpu_bo_unreserve(adev->gfx.mec.hpd_eop_obj);
-		amdgpu_bo_unref(&adev->gfx.mec.hpd_eop_obj);
-		adev->gfx.mec.hpd_eop_obj = NULL;
-	}
+	amdgpu_bo_free_kernel(&adev->gfx.mec.hpd_eop_obj, NULL, NULL);
 }
 
 static int gfx_v8_0_mec_init(struct amdgpu_device *adev)
@@ -1389,34 +1315,13 @@ static int gfx_v8_0_mec_init(struct amdgpu_device *adev)
 
 	mec_hpd_size = adev->gfx.num_compute_rings * GFX8_MEC_HPD_SIZE;
 
-	if (adev->gfx.mec.hpd_eop_obj == NULL) {
-		r = amdgpu_bo_create(adev,
-				     mec_hpd_size,
-				     PAGE_SIZE, true,
-				     AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL,
-				     &adev->gfx.mec.hpd_eop_obj);
-		if (r) {
-			dev_warn(adev->dev, "(%d) create HDP EOP bo failed\n", r);
-			return r;
-		}
-	}
-
-	r = amdgpu_bo_reserve(adev->gfx.mec.hpd_eop_obj, false);
-	if (unlikely(r != 0)) {
-		gfx_v8_0_mec_fini(adev);
-		return r;
-	}
-	r = amdgpu_bo_pin(adev->gfx.mec.hpd_eop_obj, AMDGPU_GEM_DOMAIN_GTT,
-			  &adev->gfx.mec.hpd_eop_gpu_addr);
+	r = amdgpu_bo_create_reserved(adev, mec_hpd_size, PAGE_SIZE,
+				      AMDGPU_GEM_DOMAIN_GTT,
+				      &adev->gfx.mec.hpd_eop_obj,
+				      &adev->gfx.mec.hpd_eop_gpu_addr,
+				      (void **)&hpd);
 	if (r) {
-		dev_warn(adev->dev, "(%d) pin HDP EOP bo failed\n", r);
-		gfx_v8_0_mec_fini(adev);
-		return r;
-	}
-	r = amdgpu_bo_kmap(adev->gfx.mec.hpd_eop_obj, (void **)&hpd);
-	if (r) {
-		dev_warn(adev->dev, "(%d) map HDP EOP bo failed\n", r);
-		gfx_v8_0_mec_fini(adev);
+		dev_warn(adev->dev, "(%d) create HDP EOP bo failed\n", r);
 		return r;
 	}
 
@@ -3802,6 +3707,8 @@ static void gfx_v8_0_gpu_init(struct amdgpu_device *adev)
 				   ELEMENT_SIZE, 1);
 	sh_static_mem_cfg = REG_SET_FIELD(sh_static_mem_cfg, SH_STATIC_MEM_CONFIG,
 				   INDEX_STRIDE, 3);
+	WREG32(mmSH_STATIC_MEM_CONFIG, sh_static_mem_cfg);
+
 	mutex_lock(&adev->srbm_mutex);
 	for (i = 0; i < adev->vm_manager.id_mgr[0].num_ids; i++) {
 		vi_srbm_select(adev, 0, 0, 0, i);
@@ -3825,7 +3732,6 @@ static void gfx_v8_0_gpu_init(struct amdgpu_device *adev)
 
 		WREG32(mmSH_MEM_APE1_BASE, 1);
 		WREG32(mmSH_MEM_APE1_LIMIT, 0);
-		WREG32(mmSH_STATIC_MEM_CONFIG, sh_static_mem_cfg);
 	}
 	vi_srbm_select(adev, 0, 0, 0, 0);
 	mutex_unlock(&adev->srbm_mutex);
@@ -4564,7 +4470,7 @@ static int gfx_v8_0_kiq_kcq_enable(struct amdgpu_device *adev)
 		/* This situation may be hit in the future if a new HW
 		 * generation exposes more than 64 queues. If so, the
 		 * definition of queue_mask needs updating */
-		if (WARN_ON(i > (sizeof(queue_mask)*8))) {
+		if (WARN_ON(i >= (sizeof(queue_mask)*8))) {
 			DRM_ERROR("Invalid KCQ enabled: %d\n", i);
 			break;
 		}
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
index c9b9c88..69182ee 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c
@@ -116,7 +116,9 @@ static const u32 golden_settings_gc_9_0[] =
 	SOC15_REG_OFFSET(GC, 0, mmRLC_GPM_UTCL1_CNTL_2), 0x08000000, 0x08000080,
 	SOC15_REG_OFFSET(GC, 0, mmRLC_PREWALKER_UTCL1_CNTL), 0x08000000, 0x08000080,
 	SOC15_REG_OFFSET(GC, 0, mmRLC_SPM_UTCL1_CNTL), 0x08000000, 0x08000080,
+	SOC15_REG_OFFSET(GC, 0, mmSH_MEM_CONFIG), 0x00001000, 0x00001000,
 	SOC15_REG_OFFSET(GC, 0, mmSPI_CONFIG_CNTL_1), 0x0000000f, 0x01000107,
+	SOC15_REG_OFFSET(GC, 0, mmSQC_CONFIG), 0x03000000, 0x020a2000,
 	SOC15_REG_OFFSET(GC, 0, mmTA_CNTL_AUX), 0xfffffeef, 0x010b0000,
 	SOC15_REG_OFFSET(GC, 0, mmTCP_CHAN_STEER_HI), 0xffffffff, 0x4a2c0e68,
 	SOC15_REG_OFFSET(GC, 0, mmTCP_CHAN_STEER_LO), 0xffffffff, 0xb5d3f197,
@@ -211,7 +213,7 @@ static void gfx_v9_0_init_golden_registers(struct amdgpu_device *adev)
 
 static void gfx_v9_0_scratch_init(struct amdgpu_device *adev)
 {
-	adev->gfx.scratch.num_reg = 7;
+	adev->gfx.scratch.num_reg = 8;
 	adev->gfx.scratch.reg_base = SOC15_REG_OFFSET(GC, 0, mmSCRATCH_REG0);
 	adev->gfx.scratch.free_mask = (1u << adev->gfx.scratch.num_reg) - 1;
 }
@@ -772,18 +774,16 @@ static int gfx_v9_0_rlc_init(struct amdgpu_device *adev)
 	if (cs_data) {
 		/* clear state block */
 		adev->gfx.rlc.clear_state_size = dws = gfx_v9_0_get_csb_size(adev);
-		if (adev->gfx.rlc.clear_state_obj == NULL) {
-			r = amdgpu_bo_create_kernel(adev, dws * 4, PAGE_SIZE,
-						AMDGPU_GEM_DOMAIN_VRAM,
-						&adev->gfx.rlc.clear_state_obj,
-						&adev->gfx.rlc.clear_state_gpu_addr,
-						(void **)&adev->gfx.rlc.cs_ptr);
-			if (r) {
-				dev_err(adev->dev,
-					"(%d) failed to create rlc csb bo\n", r);
-				gfx_v9_0_rlc_fini(adev);
-				return r;
-			}
+		r = amdgpu_bo_create_reserved(adev, dws * 4, PAGE_SIZE,
+					      AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.clear_state_obj,
+					      &adev->gfx.rlc.clear_state_gpu_addr,
+					      (void **)&adev->gfx.rlc.cs_ptr);
+		if (r) {
+			dev_err(adev->dev, "(%d) failed to create rlc csb bo\n",
+				r);
+			gfx_v9_0_rlc_fini(adev);
+			return r;
 		}
 		/* set up the cs buffer */
 		dst_ptr = adev->gfx.rlc.cs_ptr;
@@ -795,18 +795,16 @@ static int gfx_v9_0_rlc_init(struct amdgpu_device *adev)
 	if (adev->asic_type == CHIP_RAVEN) {
 		/* TODO: double check the cp_table_size for RV */
 		adev->gfx.rlc.cp_table_size = ALIGN(96 * 5 * 4, 2048) + (64 * 1024); /* JT + GDS */
-		if (adev->gfx.rlc.cp_table_obj == NULL) {
-			r = amdgpu_bo_create_kernel(adev, adev->gfx.rlc.cp_table_size,
-						PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
-						&adev->gfx.rlc.cp_table_obj,
-						&adev->gfx.rlc.cp_table_gpu_addr,
-						(void **)&adev->gfx.rlc.cp_table_ptr);
-			if (r) {
-				dev_err(adev->dev,
-					"(%d) failed to create cp table bo\n", r);
-				gfx_v9_0_rlc_fini(adev);
-				return r;
-			}
+		r = amdgpu_bo_create_reserved(adev, adev->gfx.rlc.cp_table_size,
+					      PAGE_SIZE, AMDGPU_GEM_DOMAIN_VRAM,
+					      &adev->gfx.rlc.cp_table_obj,
+					      &adev->gfx.rlc.cp_table_gpu_addr,
+					      (void **)&adev->gfx.rlc.cp_table_ptr);
+		if (r) {
+			dev_err(adev->dev,
+				"(%d) failed to create cp table bo\n", r);
+			gfx_v9_0_rlc_fini(adev);
+			return r;
 		}
 
 		rv_init_cp_jump_table(adev);
@@ -821,28 +819,8 @@ static int gfx_v9_0_rlc_init(struct amdgpu_device *adev)
 
 static void gfx_v9_0_mec_fini(struct amdgpu_device *adev)
 {
-	int r;
-
-	if (adev->gfx.mec.hpd_eop_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.mec.hpd_eop_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve HPD EOP bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.mec.hpd_eop_obj);
-		amdgpu_bo_unreserve(adev->gfx.mec.hpd_eop_obj);
-
-		amdgpu_bo_unref(&adev->gfx.mec.hpd_eop_obj);
-		adev->gfx.mec.hpd_eop_obj = NULL;
-	}
-	if (adev->gfx.mec.mec_fw_obj) {
-		r = amdgpu_bo_reserve(adev->gfx.mec.mec_fw_obj, true);
-		if (unlikely(r != 0))
-			dev_warn(adev->dev, "(%d) reserve mec firmware bo failed\n", r);
-		amdgpu_bo_unpin(adev->gfx.mec.mec_fw_obj);
-		amdgpu_bo_unreserve(adev->gfx.mec.mec_fw_obj);
-
-		amdgpu_bo_unref(&adev->gfx.mec.mec_fw_obj);
-		adev->gfx.mec.mec_fw_obj = NULL;
-	}
+	amdgpu_bo_free_kernel(&adev->gfx.mec.hpd_eop_obj, NULL, NULL);
+	amdgpu_bo_free_kernel(&adev->gfx.mec.mec_fw_obj, NULL, NULL);
 }
 
 static int gfx_v9_0_mec_init(struct amdgpu_device *adev)
@@ -862,33 +840,13 @@ static int gfx_v9_0_mec_init(struct amdgpu_device *adev)
 	amdgpu_gfx_compute_queue_acquire(adev);
 	mec_hpd_size = adev->gfx.num_compute_rings * GFX9_MEC_HPD_SIZE;
 
-	if (adev->gfx.mec.hpd_eop_obj == NULL) {
-		r = amdgpu_bo_create(adev,
-				     mec_hpd_size,
-				     PAGE_SIZE, true,
-				     AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL,
-				     &adev->gfx.mec.hpd_eop_obj);
-		if (r) {
-			dev_warn(adev->dev, "(%d) create HDP EOP bo failed\n", r);
-			return r;
-		}
-	}
-
-	r = amdgpu_bo_reserve(adev->gfx.mec.hpd_eop_obj, false);
-	if (unlikely(r != 0)) {
-		gfx_v9_0_mec_fini(adev);
-		return r;
-	}
-	r = amdgpu_bo_pin(adev->gfx.mec.hpd_eop_obj, AMDGPU_GEM_DOMAIN_GTT,
-			  &adev->gfx.mec.hpd_eop_gpu_addr);
+	r = amdgpu_bo_create_reserved(adev, mec_hpd_size, PAGE_SIZE,
+				      AMDGPU_GEM_DOMAIN_GTT,
+				      &adev->gfx.mec.hpd_eop_obj,
+				      &adev->gfx.mec.hpd_eop_gpu_addr,
+				      (void **)&hpd);
 	if (r) {
-		dev_warn(adev->dev, "(%d) pin HDP EOP bo failed\n", r);
-		gfx_v9_0_mec_fini(adev);
-		return r;
-	}
-	r = amdgpu_bo_kmap(adev->gfx.mec.hpd_eop_obj, (void **)&hpd);
-	if (r) {
-		dev_warn(adev->dev, "(%d) map HDP EOP bo failed\n", r);
+		dev_warn(adev->dev, "(%d) create HDP EOP bo failed\n", r);
 		gfx_v9_0_mec_fini(adev);
 		return r;
 	}
@@ -905,42 +863,22 @@ static int gfx_v9_0_mec_init(struct amdgpu_device *adev)
 		 le32_to_cpu(mec_hdr->header.ucode_array_offset_bytes));
 	fw_size = le32_to_cpu(mec_hdr->header.ucode_size_bytes) / 4;
 
-	if (adev->gfx.mec.mec_fw_obj == NULL) {
-		r = amdgpu_bo_create(adev,
-			mec_hdr->header.ucode_size_bytes,
-			PAGE_SIZE, true,
-			AMDGPU_GEM_DOMAIN_GTT, 0, NULL, NULL,
-			&adev->gfx.mec.mec_fw_obj);
-		if (r) {
-			dev_warn(adev->dev, "(%d) create mec firmware bo failed\n", r);
-			return r;
-		}
+	r = amdgpu_bo_create_reserved(adev, mec_hdr->header.ucode_size_bytes,
+				      PAGE_SIZE, AMDGPU_GEM_DOMAIN_GTT,
+				      &adev->gfx.mec.mec_fw_obj,
+				      &adev->gfx.mec.mec_fw_gpu_addr,
+				      (void **)&fw);
+	if (r) {
+		dev_warn(adev->dev, "(%d) create mec firmware bo failed\n", r);
+		gfx_v9_0_mec_fini(adev);
+		return r;
 	}
 
-	r = amdgpu_bo_reserve(adev->gfx.mec.mec_fw_obj, false);
-	if (unlikely(r != 0)) {
-		gfx_v9_0_mec_fini(adev);
-		return r;
-	}
-	r = amdgpu_bo_pin(adev->gfx.mec.mec_fw_obj, AMDGPU_GEM_DOMAIN_GTT,
-			&adev->gfx.mec.mec_fw_gpu_addr);
-	if (r) {
-		dev_warn(adev->dev, "(%d) pin mec firmware bo failed\n", r);
-		gfx_v9_0_mec_fini(adev);
-		return r;
-	}
-	r = amdgpu_bo_kmap(adev->gfx.mec.mec_fw_obj, (void **)&fw);
-	if (r) {
-		dev_warn(adev->dev, "(%d) map firmware bo failed\n", r);
-		gfx_v9_0_mec_fini(adev);
-		return r;
-	}
 	memcpy(fw, fw_data, fw_size);
 
 	amdgpu_bo_kunmap(adev->gfx.mec.mec_fw_obj);
 	amdgpu_bo_unreserve(adev->gfx.mec.mec_fw_obj);
 
-
 	return 0;
 }
 
@@ -2219,7 +2157,7 @@ static int gfx_v9_0_cp_gfx_start(struct amdgpu_device *adev)
 	struct amdgpu_ring *ring = &adev->gfx.gfx_ring[0];
 	const struct cs_section_def *sect = NULL;
 	const struct cs_extent_def *ext = NULL;
-	int r, i;
+	int r, i, tmp;
 
 	/* init the CP */
 	WREG32_SOC15(GC, 0, mmCP_MAX_CONTEXT, adev->gfx.config.max_hw_contexts - 1);
@@ -2227,7 +2165,7 @@ static int gfx_v9_0_cp_gfx_start(struct amdgpu_device *adev)
 
 	gfx_v9_0_cp_gfx_enable(adev, true);
 
-	r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4);
+	r = amdgpu_ring_alloc(ring, gfx_v9_0_get_csb_size(adev) + 4 + 3);
 	if (r) {
 		DRM_ERROR("amdgpu: cp failed to lock ring (%d).\n", r);
 		return r;
@@ -2265,6 +2203,12 @@ static int gfx_v9_0_cp_gfx_start(struct amdgpu_device *adev)
 	amdgpu_ring_write(ring, 0x8000);
 	amdgpu_ring_write(ring, 0x8000);
 
+	amdgpu_ring_write(ring, PACKET3(PACKET3_SET_UCONFIG_REG,1));
+	tmp = (PACKET3_SET_UCONFIG_REG_INDEX_TYPE |
+		(SOC15_REG_OFFSET(GC, 0, mmVGT_INDEX_TYPE) - PACKET3_SET_UCONFIG_REG_START));
+	amdgpu_ring_write(ring, tmp);
+	amdgpu_ring_write(ring, 0);
+
 	amdgpu_ring_commit(ring);
 
 	return 0;
@@ -2427,7 +2371,7 @@ static int gfx_v9_0_kiq_kcq_enable(struct amdgpu_device *adev)
 		/* This situation may be hit in the future if a new HW
 		 * generation exposes more than 64 queues. If so, the
 		 * definition of queue_mask needs updating */
-		if (WARN_ON(i > (sizeof(queue_mask)*8))) {
+		if (WARN_ON(i >= (sizeof(queue_mask)*8))) {
 			DRM_ERROR("Invalid KCQ enabled: %d\n", i);
 			break;
 		}
@@ -4158,7 +4102,7 @@ static int gfx_v9_0_kiq_irq(struct amdgpu_device *adev,
 	return 0;
 }
 
-const struct amd_ip_funcs gfx_v9_0_ip_funcs = {
+static const struct amd_ip_funcs gfx_v9_0_ip_funcs = {
 	.name = "gfx_v9_0",
 	.early_init = gfx_v9_0_early_init,
 	.late_init = gfx_v9_0_late_init,
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h
index 56ef652..fa5a3fb 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.h
@@ -24,7 +24,6 @@
 #ifndef __GFX_V9_0_H__
 #define __GFX_V9_0_H__
 
-extern const struct amd_ip_funcs gfx_v9_0_ip_funcs;
 extern const struct amdgpu_ip_block_version gfx_v9_0_ip_block;
 
 void gfx_v9_0_select_se_sh(struct amdgpu_device *adev, u32 se_num, u32 sh_num);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
index a42f483..4f2788b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.c
@@ -58,14 +58,14 @@ static void gfxhub_v1_0_init_gart_aperture_regs(struct amdgpu_device *adev)
 	gfxhub_v1_0_init_gart_pt_regs(adev);
 
 	WREG32_SOC15(GC, 0, mmVM_CONTEXT0_PAGE_TABLE_START_ADDR_LO32,
-		     (u32)(adev->mc.gtt_start >> 12));
+		     (u32)(adev->mc.gart_start >> 12));
 	WREG32_SOC15(GC, 0, mmVM_CONTEXT0_PAGE_TABLE_START_ADDR_HI32,
-		     (u32)(adev->mc.gtt_start >> 44));
+		     (u32)(adev->mc.gart_start >> 44));
 
 	WREG32_SOC15(GC, 0, mmVM_CONTEXT0_PAGE_TABLE_END_ADDR_LO32,
-		     (u32)(adev->mc.gtt_end >> 12));
+		     (u32)(adev->mc.gart_end >> 12));
 	WREG32_SOC15(GC, 0, mmVM_CONTEXT0_PAGE_TABLE_END_ADDR_HI32,
-		     (u32)(adev->mc.gtt_end >> 44));
+		     (u32)(adev->mc.gart_end >> 44));
 }
 
 static void gfxhub_v1_0_init_system_aperture_regs(struct amdgpu_device *adev)
@@ -124,12 +124,12 @@ static void gfxhub_v1_0_init_tlb_regs(struct amdgpu_device *adev)
 
 static void gfxhub_v1_0_init_cache_regs(struct amdgpu_device *adev)
 {
-	uint32_t tmp;
+	uint32_t tmp, field;
 
 	/* Setup L2 cache */
 	tmp = RREG32_SOC15(GC, 0, mmVM_L2_CNTL);
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 1);
-	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_FRAGMENT_PROCESSING, 0);
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_FRAGMENT_PROCESSING, 1);
 	/* XXX for emulation, Refer to closed source code.*/
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, L2_PDE0_CACHE_TAG_GENERATION_MODE,
 			    0);
@@ -143,7 +143,10 @@ static void gfxhub_v1_0_init_cache_regs(struct amdgpu_device *adev)
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1);
 	WREG32_SOC15(GC, 0, mmVM_L2_CNTL2, tmp);
 
+	field = adev->vm_manager.fragment_size;
 	tmp = mmVM_L2_CNTL3_DEFAULT;
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, field);
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_FRAGMENT_SIZE, 6);
 	WREG32_SOC15(GC, 0, mmVM_L2_CNTL3, tmp);
 
 	tmp = mmVM_L2_CNTL4_DEFAULT;
@@ -206,6 +209,9 @@ static void gfxhub_v1_0_setup_vmid_config(struct amdgpu_device *adev)
 		tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
 				PAGE_TABLE_BLOCK_SIZE,
 				adev->vm_manager.block_size - 9);
+		/* Send no-retry XNACK on fault to suppress VM fault storm. */
+		tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
+				    RETRY_PERMISSION_OR_INVALID_PAGE_FAULT, 0);
 		WREG32_SOC15_OFFSET(GC, 0, mmVM_CONTEXT1_CNTL, i, tmp);
 		WREG32_SOC15_OFFSET(GC, 0, mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_LO32, i*2, 0);
 		WREG32_SOC15_OFFSET(GC, 0, mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_HI32, i*2, 0);
diff --git a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.h b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.h
index d2dbb08..206e29c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/gfxhub_v1_0.h
@@ -30,7 +30,5 @@ void gfxhub_v1_0_set_fault_enable_default(struct amdgpu_device *adev,
 					  bool value);
 void gfxhub_v1_0_init(struct amdgpu_device *adev);
 u64 gfxhub_v1_0_get_mc_fb_offset(struct amdgpu_device *adev);
-extern const struct amd_ip_funcs gfxhub_v1_0_ip_funcs;
-extern const struct amdgpu_ip_block_version gfxhub_v1_0_ip_block;
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
index d0214d9..12b0c4c 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c
@@ -66,14 +66,10 @@ static const u32 crtc_offsets[6] =
 	SI_CRTC5_REGISTER_OFFSET
 };
 
-static void gmc_v6_0_mc_stop(struct amdgpu_device *adev,
-			     struct amdgpu_mode_mc_save *save)
+static void gmc_v6_0_mc_stop(struct amdgpu_device *adev)
 {
 	u32 blackout;
 
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_stop_mc_access(adev, save);
-
 	gmc_v6_0_wait_for_idle((void *)adev);
 
 	blackout = RREG32(mmMC_SHARED_BLACKOUT_CNTL);
@@ -90,8 +86,7 @@ static void gmc_v6_0_mc_stop(struct amdgpu_device *adev,
 
 }
 
-static void gmc_v6_0_mc_resume(struct amdgpu_device *adev,
-			       struct amdgpu_mode_mc_save *save)
+static void gmc_v6_0_mc_resume(struct amdgpu_device *adev)
 {
 	u32 tmp;
 
@@ -103,10 +98,6 @@ static void gmc_v6_0_mc_resume(struct amdgpu_device *adev,
 	tmp = REG_SET_FIELD(0, BIF_FB_EN, FB_READ_EN, 1);
 	tmp = REG_SET_FIELD(tmp, BIF_FB_EN, FB_WRITE_EN, 1);
 	WREG32(mmBIF_FB_EN, tmp);
-
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_resume_mc_access(adev, save);
-
 }
 
 static int gmc_v6_0_init_microcode(struct amdgpu_device *adev)
@@ -228,20 +219,20 @@ static int gmc_v6_0_mc_load_microcode(struct amdgpu_device *adev)
 static void gmc_v6_0_vram_gtt_location(struct amdgpu_device *adev,
 				       struct amdgpu_mc *mc)
 {
+	u64 base = RREG32(mmMC_VM_FB_LOCATION) & 0xFFFF;
+	base <<= 24;
+
 	if (mc->mc_vram_size > 0xFFC0000000ULL) {
 		dev_warn(adev->dev, "limiting VRAM\n");
 		mc->real_vram_size = 0xFFC0000000ULL;
 		mc->mc_vram_size = 0xFFC0000000ULL;
 	}
-	amdgpu_vram_location(adev, &adev->mc, 0);
-	adev->mc.gtt_base_align = 0;
-	amdgpu_gtt_location(adev, mc);
+	amdgpu_vram_location(adev, &adev->mc, base);
+	amdgpu_gart_location(adev, mc);
 }
 
 static void gmc_v6_0_mc_program(struct amdgpu_device *adev)
 {
-	struct amdgpu_mode_mc_save save;
-	u32 tmp;
 	int i, j;
 
 	/* Initialize HDP */
@@ -254,16 +245,23 @@ static void gmc_v6_0_mc_program(struct amdgpu_device *adev)
 	}
 	WREG32(mmHDP_REG_COHERENCY_FLUSH_CNTL, 0);
 
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_set_vga_render_state(adev, false);
-
-	gmc_v6_0_mc_stop(adev, &save);
-
 	if (gmc_v6_0_wait_for_idle((void *)adev)) {
 		dev_warn(adev->dev, "Wait for MC idle timedout !\n");
 	}
 
-	WREG32(mmVGA_HDP_CONTROL, VGA_HDP_CONTROL__VGA_MEMORY_DISABLE_MASK);
+	if (adev->mode_info.num_crtc) {
+		u32 tmp;
+
+		/* Lockout access through VGA aperture*/
+		tmp = RREG32(mmVGA_HDP_CONTROL);
+		tmp |= VGA_HDP_CONTROL__VGA_MEMORY_DISABLE_MASK;
+		WREG32(mmVGA_HDP_CONTROL, tmp);
+
+		/* disable VGA render */
+		tmp = RREG32(mmVGA_RENDER_CONTROL);
+		tmp &= ~VGA_VSTATUS_CNTL;
+		WREG32(mmVGA_RENDER_CONTROL, tmp);
+	}
 	/* Update configuration */
 	WREG32(mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
 	       adev->mc.vram_start >> 12);
@@ -271,13 +269,6 @@ static void gmc_v6_0_mc_program(struct amdgpu_device *adev)
 	       adev->mc.vram_end >> 12);
 	WREG32(mmMC_VM_SYSTEM_APERTURE_DEFAULT_ADDR,
 	       adev->vram_scratch.gpu_addr >> 12);
-	tmp = ((adev->mc.vram_end >> 24) & 0xFFFF) << 16;
-	tmp |= ((adev->mc.vram_start >> 24) & 0xFFFF);
-	WREG32(mmMC_VM_FB_LOCATION, tmp);
-	/* XXX double check these! */
-	WREG32(mmHDP_NONSURFACE_BASE, (adev->mc.vram_start >> 8));
-	WREG32(mmHDP_NONSURFACE_INFO, (2 << 7) | (1 << 30));
-	WREG32(mmHDP_NONSURFACE_SIZE, 0x3FFFFFFF);
 	WREG32(mmMC_VM_AGP_BASE, 0);
 	WREG32(mmMC_VM_AGP_TOP, 0x0FFFFFFF);
 	WREG32(mmMC_VM_AGP_BOT, 0x0FFFFFFF);
@@ -285,7 +276,6 @@ static void gmc_v6_0_mc_program(struct amdgpu_device *adev)
 	if (gmc_v6_0_wait_for_idle((void *)adev)) {
 		dev_warn(adev->dev, "Wait for MC idle timedout !\n");
 	}
-	gmc_v6_0_mc_resume(adev, &save);
 }
 
 static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
@@ -342,15 +332,7 @@ static int gmc_v6_0_mc_init(struct amdgpu_device *adev)
 	adev->mc.real_vram_size = RREG32(mmCONFIG_MEMSIZE) * 1024ULL * 1024ULL;
 	adev->mc.visible_vram_size = adev->mc.aper_size;
 
-	/* unless the user had overridden it, set the gart
-	 * size equal to the 1024 or vram, whichever is larger.
-	 */
-	if (amdgpu_gart_size == -1)
-		adev->mc.gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-					adev->mc.mc_vram_size);
-	else
-		adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
-
+	amdgpu_gart_set_defaults(adev);
 	gmc_v6_0_vram_gtt_location(adev, &adev->mc);
 
 	return 0;
@@ -479,6 +461,7 @@ static void gmc_v6_0_set_prt(struct amdgpu_device *adev, bool enable)
 static int gmc_v6_0_gart_enable(struct amdgpu_device *adev)
 {
 	int r, i;
+	u32 field;
 
 	if (adev->gart.robj == NULL) {
 		dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
@@ -506,13 +489,15 @@ static int gmc_v6_0_gart_enable(struct amdgpu_device *adev)
 	WREG32(mmVM_L2_CNTL2,
 	       VM_L2_CNTL2__INVALIDATE_ALL_L1_TLBS_MASK |
 	       VM_L2_CNTL2__INVALIDATE_L2_CACHE_MASK);
+
+	field = adev->vm_manager.fragment_size;
 	WREG32(mmVM_L2_CNTL3,
 	       VM_L2_CNTL3__L2_CACHE_BIGK_ASSOCIATIVITY_MASK |
-	       (4UL << VM_L2_CNTL3__BANK_SELECT__SHIFT) |
-	       (4UL << VM_L2_CNTL3__L2_CACHE_BIGK_FRAGMENT_SIZE__SHIFT));
+	       (field << VM_L2_CNTL3__BANK_SELECT__SHIFT) |
+	       (field << VM_L2_CNTL3__L2_CACHE_BIGK_FRAGMENT_SIZE__SHIFT));
 	/* setup context0 */
-	WREG32(mmVM_CONTEXT0_PAGE_TABLE_START_ADDR, adev->mc.gtt_start >> 12);
-	WREG32(mmVM_CONTEXT0_PAGE_TABLE_END_ADDR, adev->mc.gtt_end >> 12);
+	WREG32(mmVM_CONTEXT0_PAGE_TABLE_START_ADDR, adev->mc.gart_start >> 12);
+	WREG32(mmVM_CONTEXT0_PAGE_TABLE_END_ADDR, adev->mc.gart_end >> 12);
 	WREG32(mmVM_CONTEXT0_PAGE_TABLE_BASE_ADDR, adev->gart.table_addr >> 12);
 	WREG32(mmVM_CONTEXT0_PROTECTION_FAULT_DEFAULT_ADDR,
 			(u32)(adev->dummy_page.addr >> 12));
@@ -559,7 +544,7 @@ static int gmc_v6_0_gart_enable(struct amdgpu_device *adev)
 
 	gmc_v6_0_gart_flush_gpu_tlb(adev, 0);
 	dev_info(adev->dev, "PCIE GART of %uM enabled (table at 0x%016llX).\n",
-		 (unsigned)(adev->mc.gtt_size >> 20),
+		 (unsigned)(adev->mc.gart_size >> 20),
 		 (unsigned long long)adev->gart.table_addr);
 	adev->gart.ready = true;
 	return 0;
@@ -829,7 +814,7 @@ static int gmc_v6_0_sw_init(void *handle)
 	if (r)
 		return r;
 
-	amdgpu_vm_adjust_size(adev, 64);
+	amdgpu_vm_adjust_size(adev, 64, 4);
 	adev->vm_manager.max_pfn = adev->vm_manager.vm_size << 18;
 
 	adev->mc.mc_mask = 0xffffffffffULL;
@@ -987,7 +972,6 @@ static int gmc_v6_0_wait_for_idle(void *handle)
 static int gmc_v6_0_soft_reset(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct amdgpu_mode_mc_save save;
 	u32 srbm_soft_reset = 0;
 	u32 tmp = RREG32(mmSRBM_STATUS);
 
@@ -1003,7 +987,7 @@ static int gmc_v6_0_soft_reset(void *handle)
 	}
 
 	if (srbm_soft_reset) {
-		gmc_v6_0_mc_stop(adev, &save);
+		gmc_v6_0_mc_stop(adev);
 		if (gmc_v6_0_wait_for_idle(adev)) {
 			dev_warn(adev->dev, "Wait for GMC idle timed out !\n");
 		}
@@ -1023,7 +1007,7 @@ static int gmc_v6_0_soft_reset(void *handle)
 
 		udelay(50);
 
-		gmc_v6_0_mc_resume(adev, &save);
+		gmc_v6_0_mc_resume(adev);
 		udelay(50);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
index 7e9ea53..e42c1ad 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c
@@ -37,6 +37,9 @@
 #include "oss/oss_2_0_d.h"
 #include "oss/oss_2_0_sh_mask.h"
 
+#include "dce/dce_8_0_d.h"
+#include "dce/dce_8_0_sh_mask.h"
+
 #include "amdgpu_atombios.h"
 
 static void gmc_v7_0_set_gart_funcs(struct amdgpu_device *adev);
@@ -76,14 +79,10 @@ static void gmc_v7_0_init_golden_registers(struct amdgpu_device *adev)
 	}
 }
 
-static void gmc_v7_0_mc_stop(struct amdgpu_device *adev,
-			     struct amdgpu_mode_mc_save *save)
+static void gmc_v7_0_mc_stop(struct amdgpu_device *adev)
 {
 	u32 blackout;
 
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_stop_mc_access(adev, save);
-
 	gmc_v7_0_wait_for_idle((void *)adev);
 
 	blackout = RREG32(mmMC_SHARED_BLACKOUT_CNTL);
@@ -99,8 +98,7 @@ static void gmc_v7_0_mc_stop(struct amdgpu_device *adev,
 	udelay(100);
 }
 
-static void gmc_v7_0_mc_resume(struct amdgpu_device *adev,
-			       struct amdgpu_mode_mc_save *save)
+static void gmc_v7_0_mc_resume(struct amdgpu_device *adev)
 {
 	u32 tmp;
 
@@ -112,9 +110,6 @@ static void gmc_v7_0_mc_resume(struct amdgpu_device *adev,
 	tmp = REG_SET_FIELD(0, BIF_FB_EN, FB_READ_EN, 1);
 	tmp = REG_SET_FIELD(tmp, BIF_FB_EN, FB_WRITE_EN, 1);
 	WREG32(mmBIF_FB_EN, tmp);
-
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_resume_mc_access(adev, save);
 }
 
 /**
@@ -242,15 +237,17 @@ static int gmc_v7_0_mc_load_microcode(struct amdgpu_device *adev)
 static void gmc_v7_0_vram_gtt_location(struct amdgpu_device *adev,
 				       struct amdgpu_mc *mc)
 {
+	u64 base = RREG32(mmMC_VM_FB_LOCATION) & 0xFFFF;
+	base <<= 24;
+
 	if (mc->mc_vram_size > 0xFFC0000000ULL) {
 		/* leave room for at least 1024M GTT */
 		dev_warn(adev->dev, "limiting VRAM\n");
 		mc->real_vram_size = 0xFFC0000000ULL;
 		mc->mc_vram_size = 0xFFC0000000ULL;
 	}
-	amdgpu_vram_location(adev, &adev->mc, 0);
-	adev->mc.gtt_base_align = 0;
-	amdgpu_gtt_location(adev, mc);
+	amdgpu_vram_location(adev, &adev->mc, base);
+	amdgpu_gart_location(adev, mc);
 }
 
 /**
@@ -263,7 +260,6 @@ static void gmc_v7_0_vram_gtt_location(struct amdgpu_device *adev,
  */
 static void gmc_v7_0_mc_program(struct amdgpu_device *adev)
 {
-	struct amdgpu_mode_mc_save save;
 	u32 tmp;
 	int i, j;
 
@@ -277,13 +273,20 @@ static void gmc_v7_0_mc_program(struct amdgpu_device *adev)
 	}
 	WREG32(mmHDP_REG_COHERENCY_FLUSH_CNTL, 0);
 
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_set_vga_render_state(adev, false);
-
-	gmc_v7_0_mc_stop(adev, &save);
 	if (gmc_v7_0_wait_for_idle((void *)adev)) {
 		dev_warn(adev->dev, "Wait for MC idle timedout !\n");
 	}
+	if (adev->mode_info.num_crtc) {
+		/* Lockout access through VGA aperture*/
+		tmp = RREG32(mmVGA_HDP_CONTROL);
+		tmp = REG_SET_FIELD(tmp, VGA_HDP_CONTROL, VGA_MEMORY_DISABLE, 1);
+		WREG32(mmVGA_HDP_CONTROL, tmp);
+
+		/* disable VGA render */
+		tmp = RREG32(mmVGA_RENDER_CONTROL);
+		tmp = REG_SET_FIELD(tmp, VGA_RENDER_CONTROL, VGA_VSTATUS_CNTL, 0);
+		WREG32(mmVGA_RENDER_CONTROL, tmp);
+	}
 	/* Update configuration */
 	WREG32(mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
 	       adev->mc.vram_start >> 12);
@@ -291,20 +294,12 @@ static void gmc_v7_0_mc_program(struct amdgpu_device *adev)
 	       adev->mc.vram_end >> 12);
 	WREG32(mmMC_VM_SYSTEM_APERTURE_DEFAULT_ADDR,
 	       adev->vram_scratch.gpu_addr >> 12);
-	tmp = ((adev->mc.vram_end >> 24) & 0xFFFF) << 16;
-	tmp |= ((adev->mc.vram_start >> 24) & 0xFFFF);
-	WREG32(mmMC_VM_FB_LOCATION, tmp);
-	/* XXX double check these! */
-	WREG32(mmHDP_NONSURFACE_BASE, (adev->mc.vram_start >> 8));
-	WREG32(mmHDP_NONSURFACE_INFO, (2 << 7) | (1 << 30));
-	WREG32(mmHDP_NONSURFACE_SIZE, 0x3FFFFFFF);
 	WREG32(mmMC_VM_AGP_BASE, 0);
 	WREG32(mmMC_VM_AGP_TOP, 0x0FFFFFFF);
 	WREG32(mmMC_VM_AGP_BOT, 0x0FFFFFFF);
 	if (gmc_v7_0_wait_for_idle((void *)adev)) {
 		dev_warn(adev->dev, "Wait for MC idle timedout !\n");
 	}
-	gmc_v7_0_mc_resume(adev, &save);
 
 	WREG32(mmBIF_FB_EN, BIF_FB_EN__FB_READ_EN_MASK | BIF_FB_EN__FB_WRITE_EN_MASK);
 
@@ -391,15 +386,7 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev)
 	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
 		adev->mc.visible_vram_size = adev->mc.real_vram_size;
 
-	/* unless the user had overridden it, set the gart
-	 * size equal to the 1024 or vram, whichever is larger.
-	 */
-	if (amdgpu_gart_size == -1)
-		adev->mc.gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-					adev->mc.mc_vram_size);
-	else
-		adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
-
+	amdgpu_gart_set_defaults(adev);
 	gmc_v7_0_vram_gtt_location(adev, &adev->mc);
 
 	return 0;
@@ -575,7 +562,7 @@ static void gmc_v7_0_set_prt(struct amdgpu_device *adev, bool enable)
 static int gmc_v7_0_gart_enable(struct amdgpu_device *adev)
 {
 	int r, i;
-	u32 tmp;
+	u32 tmp, field;
 
 	if (adev->gart.robj == NULL) {
 		dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
@@ -605,14 +592,16 @@ static int gmc_v7_0_gart_enable(struct amdgpu_device *adev)
 	tmp = REG_SET_FIELD(0, VM_L2_CNTL2, INVALIDATE_ALL_L1_TLBS, 1);
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1);
 	WREG32(mmVM_L2_CNTL2, tmp);
+
+	field = adev->vm_manager.fragment_size;
 	tmp = RREG32(mmVM_L2_CNTL3);
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_ASSOCIATIVITY, 1);
-	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, 4);
-	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_FRAGMENT_SIZE, 4);
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, field);
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_FRAGMENT_SIZE, field);
 	WREG32(mmVM_L2_CNTL3, tmp);
 	/* setup context0 */
-	WREG32(mmVM_CONTEXT0_PAGE_TABLE_START_ADDR, adev->mc.gtt_start >> 12);
-	WREG32(mmVM_CONTEXT0_PAGE_TABLE_END_ADDR, adev->mc.gtt_end >> 12);
+	WREG32(mmVM_CONTEXT0_PAGE_TABLE_START_ADDR, adev->mc.gart_start >> 12);
+	WREG32(mmVM_CONTEXT0_PAGE_TABLE_END_ADDR, adev->mc.gart_end >> 12);
 	WREG32(mmVM_CONTEXT0_PAGE_TABLE_BASE_ADDR, adev->gart.table_addr >> 12);
 	WREG32(mmVM_CONTEXT0_PROTECTION_FAULT_DEFAULT_ADDR,
 			(u32)(adev->dummy_page.addr >> 12));
@@ -666,7 +655,7 @@ static int gmc_v7_0_gart_enable(struct amdgpu_device *adev)
 
 	gmc_v7_0_gart_flush_gpu_tlb(adev, 0);
 	DRM_INFO("PCIE GART of %uM enabled (table at 0x%016llX).\n",
-		 (unsigned)(adev->mc.gtt_size >> 20),
+		 (unsigned)(adev->mc.gart_size >> 20),
 		 (unsigned long long)adev->gart.table_addr);
 	adev->gart.ready = true;
 	return 0;
@@ -961,7 +950,7 @@ static int gmc_v7_0_sw_init(void *handle)
 	 * Currently set to 4GB ((1 << 20) 4k pages).
 	 * Max GPUVM size for cayman and SI is 40 bits.
 	 */
-	amdgpu_vm_adjust_size(adev, 64);
+	amdgpu_vm_adjust_size(adev, 64, 4);
 	adev->vm_manager.max_pfn = adev->vm_manager.vm_size << 18;
 
 	/* Set the internal MC address mask
@@ -1138,7 +1127,6 @@ static int gmc_v7_0_wait_for_idle(void *handle)
 static int gmc_v7_0_soft_reset(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
-	struct amdgpu_mode_mc_save save;
 	u32 srbm_soft_reset = 0;
 	u32 tmp = RREG32(mmSRBM_STATUS);
 
@@ -1154,7 +1142,7 @@ static int gmc_v7_0_soft_reset(void *handle)
 	}
 
 	if (srbm_soft_reset) {
-		gmc_v7_0_mc_stop(adev, &save);
+		gmc_v7_0_mc_stop(adev);
 		if (gmc_v7_0_wait_for_idle((void *)adev)) {
 			dev_warn(adev->dev, "Wait for GMC idle timed out !\n");
 		}
@@ -1175,7 +1163,7 @@ static int gmc_v7_0_soft_reset(void *handle)
 		/* Wait a little for things to settle down */
 		udelay(50);
 
-		gmc_v7_0_mc_resume(adev, &save);
+		gmc_v7_0_mc_resume(adev);
 		udelay(50);
 	}
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
index cc9f880..7ca2dae 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c
@@ -35,6 +35,9 @@
 #include "oss/oss_3_0_d.h"
 #include "oss/oss_3_0_sh_mask.h"
 
+#include "dce/dce_10_0_d.h"
+#include "dce/dce_10_0_sh_mask.h"
+
 #include "vid.h"
 #include "vi.h"
 
@@ -161,14 +164,10 @@ static void gmc_v8_0_init_golden_registers(struct amdgpu_device *adev)
 	}
 }
 
-static void gmc_v8_0_mc_stop(struct amdgpu_device *adev,
-			     struct amdgpu_mode_mc_save *save)
+static void gmc_v8_0_mc_stop(struct amdgpu_device *adev)
 {
 	u32 blackout;
 
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_stop_mc_access(adev, save);
-
 	gmc_v8_0_wait_for_idle(adev);
 
 	blackout = RREG32(mmMC_SHARED_BLACKOUT_CNTL);
@@ -184,8 +183,7 @@ static void gmc_v8_0_mc_stop(struct amdgpu_device *adev,
 	udelay(100);
 }
 
-static void gmc_v8_0_mc_resume(struct amdgpu_device *adev,
-			       struct amdgpu_mode_mc_save *save)
+static void gmc_v8_0_mc_resume(struct amdgpu_device *adev)
 {
 	u32 tmp;
 
@@ -197,9 +195,6 @@ static void gmc_v8_0_mc_resume(struct amdgpu_device *adev,
 	tmp = REG_SET_FIELD(0, BIF_FB_EN, FB_READ_EN, 1);
 	tmp = REG_SET_FIELD(tmp, BIF_FB_EN, FB_WRITE_EN, 1);
 	WREG32(mmBIF_FB_EN, tmp);
-
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_resume_mc_access(adev, save);
 }
 
 /**
@@ -404,15 +399,20 @@ static int gmc_v8_0_polaris_mc_load_microcode(struct amdgpu_device *adev)
 static void gmc_v8_0_vram_gtt_location(struct amdgpu_device *adev,
 				       struct amdgpu_mc *mc)
 {
+	u64 base = 0;
+
+	if (!amdgpu_sriov_vf(adev))
+		base = RREG32(mmMC_VM_FB_LOCATION) & 0xFFFF;
+	base <<= 24;
+
 	if (mc->mc_vram_size > 0xFFC0000000ULL) {
 		/* leave room for at least 1024M GTT */
 		dev_warn(adev->dev, "limiting VRAM\n");
 		mc->real_vram_size = 0xFFC0000000ULL;
 		mc->mc_vram_size = 0xFFC0000000ULL;
 	}
-	amdgpu_vram_location(adev, &adev->mc, 0);
-	adev->mc.gtt_base_align = 0;
-	amdgpu_gtt_location(adev, mc);
+	amdgpu_vram_location(adev, &adev->mc, base);
+	amdgpu_gart_location(adev, mc);
 }
 
 /**
@@ -425,7 +425,6 @@ static void gmc_v8_0_vram_gtt_location(struct amdgpu_device *adev,
  */
 static void gmc_v8_0_mc_program(struct amdgpu_device *adev)
 {
-	struct amdgpu_mode_mc_save save;
 	u32 tmp;
 	int i, j;
 
@@ -439,13 +438,20 @@ static void gmc_v8_0_mc_program(struct amdgpu_device *adev)
 	}
 	WREG32(mmHDP_REG_COHERENCY_FLUSH_CNTL, 0);
 
-	if (adev->mode_info.num_crtc)
-		amdgpu_display_set_vga_render_state(adev, false);
-
-	gmc_v8_0_mc_stop(adev, &save);
 	if (gmc_v8_0_wait_for_idle((void *)adev)) {
 		dev_warn(adev->dev, "Wait for MC idle timedout !\n");
 	}
+	if (adev->mode_info.num_crtc) {
+		/* Lockout access through VGA aperture */
+		tmp = RREG32(mmVGA_HDP_CONTROL);
+		tmp = REG_SET_FIELD(tmp, VGA_HDP_CONTROL, VGA_MEMORY_DISABLE, 1);
+		WREG32(mmVGA_HDP_CONTROL, tmp);
+
+		/* disable VGA render */
+		tmp = RREG32(mmVGA_RENDER_CONTROL);
+		tmp = REG_SET_FIELD(tmp, VGA_RENDER_CONTROL, VGA_VSTATUS_CNTL, 0);
+		WREG32(mmVGA_RENDER_CONTROL, tmp);
+	}
 	/* Update configuration */
 	WREG32(mmMC_VM_SYSTEM_APERTURE_LOW_ADDR,
 	       adev->mc.vram_start >> 12);
@@ -453,20 +459,23 @@ static void gmc_v8_0_mc_program(struct amdgpu_device *adev)
 	       adev->mc.vram_end >> 12);
 	WREG32(mmMC_VM_SYSTEM_APERTURE_DEFAULT_ADDR,
 	       adev->vram_scratch.gpu_addr >> 12);
-	tmp = ((adev->mc.vram_end >> 24) & 0xFFFF) << 16;
-	tmp |= ((adev->mc.vram_start >> 24) & 0xFFFF);
-	WREG32(mmMC_VM_FB_LOCATION, tmp);
-	/* XXX double check these! */
-	WREG32(mmHDP_NONSURFACE_BASE, (adev->mc.vram_start >> 8));
-	WREG32(mmHDP_NONSURFACE_INFO, (2 << 7) | (1 << 30));
-	WREG32(mmHDP_NONSURFACE_SIZE, 0x3FFFFFFF);
+
+	if (amdgpu_sriov_vf(adev)) {
+		tmp = ((adev->mc.vram_end >> 24) & 0xFFFF) << 16;
+		tmp |= ((adev->mc.vram_start >> 24) & 0xFFFF);
+		WREG32(mmMC_VM_FB_LOCATION, tmp);
+		/* XXX double check these! */
+		WREG32(mmHDP_NONSURFACE_BASE, (adev->mc.vram_start >> 8));
+		WREG32(mmHDP_NONSURFACE_INFO, (2 << 7) | (1 << 30));
+		WREG32(mmHDP_NONSURFACE_SIZE, 0x3FFFFFFF);
+	}
+
 	WREG32(mmMC_VM_AGP_BASE, 0);
 	WREG32(mmMC_VM_AGP_TOP, 0x0FFFFFFF);
 	WREG32(mmMC_VM_AGP_BOT, 0x0FFFFFFF);
 	if (gmc_v8_0_wait_for_idle((void *)adev)) {
 		dev_warn(adev->dev, "Wait for MC idle timedout !\n");
 	}
-	gmc_v8_0_mc_resume(adev, &save);
 
 	WREG32(mmBIF_FB_EN, BIF_FB_EN__FB_READ_EN_MASK | BIF_FB_EN__FB_WRITE_EN_MASK);
 
@@ -553,15 +562,7 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev)
 	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
 		adev->mc.visible_vram_size = adev->mc.real_vram_size;
 
-	/* unless the user had overridden it, set the gart
-	 * size equal to the 1024 or vram, whichever is larger.
-	 */
-	if (amdgpu_gart_size == -1)
-		adev->mc.gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-					adev->mc.mc_vram_size);
-	else
-		adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
-
+	amdgpu_gart_set_defaults(adev);
 	gmc_v8_0_vram_gtt_location(adev, &adev->mc);
 
 	return 0;
@@ -761,7 +762,7 @@ static void gmc_v8_0_set_prt(struct amdgpu_device *adev, bool enable)
 static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
 {
 	int r, i;
-	u32 tmp;
+	u32 tmp, field;
 
 	if (adev->gart.robj == NULL) {
 		dev_err(adev->dev, "No VRAM object for PCIE GART.\n");
@@ -792,10 +793,12 @@ static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_ALL_L1_TLBS, 1);
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1);
 	WREG32(mmVM_L2_CNTL2, tmp);
+
+	field = adev->vm_manager.fragment_size;
 	tmp = RREG32(mmVM_L2_CNTL3);
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_ASSOCIATIVITY, 1);
-	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, 4);
-	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_FRAGMENT_SIZE, 4);
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, field);
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_FRAGMENT_SIZE, field);
 	WREG32(mmVM_L2_CNTL3, tmp);
 	/* XXX: set to enable PTE/PDE in system memory */
 	tmp = RREG32(mmVM_L2_CNTL4);
@@ -813,8 +816,8 @@ static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL4, VMC_TAP_CONTEXT1_PTE_REQUEST_SNOOP, 0);
 	WREG32(mmVM_L2_CNTL4, tmp);
 	/* setup context0 */
-	WREG32(mmVM_CONTEXT0_PAGE_TABLE_START_ADDR, adev->mc.gtt_start >> 12);
-	WREG32(mmVM_CONTEXT0_PAGE_TABLE_END_ADDR, adev->mc.gtt_end >> 12);
+	WREG32(mmVM_CONTEXT0_PAGE_TABLE_START_ADDR, adev->mc.gart_start >> 12);
+	WREG32(mmVM_CONTEXT0_PAGE_TABLE_END_ADDR, adev->mc.gart_end >> 12);
 	WREG32(mmVM_CONTEXT0_PAGE_TABLE_BASE_ADDR, adev->gart.table_addr >> 12);
 	WREG32(mmVM_CONTEXT0_PROTECTION_FAULT_DEFAULT_ADDR,
 			(u32)(adev->dummy_page.addr >> 12));
@@ -869,7 +872,7 @@ static int gmc_v8_0_gart_enable(struct amdgpu_device *adev)
 
 	gmc_v8_0_gart_flush_gpu_tlb(adev, 0);
 	DRM_INFO("PCIE GART of %uM enabled (table at 0x%016llX).\n",
-		 (unsigned)(adev->mc.gtt_size >> 20),
+		 (unsigned)(adev->mc.gart_size >> 20),
 		 (unsigned long long)adev->gart.table_addr);
 	adev->gart.ready = true;
 	return 0;
@@ -1045,7 +1048,7 @@ static int gmc_v8_0_sw_init(void *handle)
 	 * Currently set to 4GB ((1 << 20) 4k pages).
 	 * Max GPUVM size for cayman and SI is 40 bits.
 	 */
-	amdgpu_vm_adjust_size(adev, 64);
+	amdgpu_vm_adjust_size(adev, 64, 4);
 	adev->vm_manager.max_pfn = adev->vm_manager.vm_size << 18;
 
 	/* Set the internal MC address mask
@@ -1260,7 +1263,7 @@ static int gmc_v8_0_pre_soft_reset(void *handle)
 	if (!adev->mc.srbm_soft_reset)
 		return 0;
 
-	gmc_v8_0_mc_stop(adev, &adev->mc.save);
+	gmc_v8_0_mc_stop(adev);
 	if (gmc_v8_0_wait_for_idle(adev)) {
 		dev_warn(adev->dev, "Wait for GMC idle timed out !\n");
 	}
@@ -1306,7 +1309,7 @@ static int gmc_v8_0_post_soft_reset(void *handle)
 	if (!adev->mc.srbm_soft_reset)
 		return 0;
 
-	gmc_v8_0_mc_resume(adev, &adev->mc.save);
+	gmc_v8_0_mc_resume(adev);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
index 175ba5f..2769c2b 100644
--- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
@@ -23,11 +23,14 @@
 #include <linux/firmware.h>
 #include "amdgpu.h"
 #include "gmc_v9_0.h"
+#include "amdgpu_atomfirmware.h"
 
 #include "vega10/soc15ip.h"
 #include "vega10/HDP/hdp_4_0_offset.h"
 #include "vega10/HDP/hdp_4_0_sh_mask.h"
 #include "vega10/GC/gc_9_0_sh_mask.h"
+#include "vega10/DC/dce_12_0_offset.h"
+#include "vega10/DC/dce_12_0_sh_mask.h"
 #include "vega10/vega10_enum.h"
 
 #include "soc15_common.h"
@@ -419,8 +422,7 @@ static void gmc_v9_0_vram_gtt_location(struct amdgpu_device *adev,
 	if (!amdgpu_sriov_vf(adev))
 		base = mmhub_v1_0_get_fb_location(adev);
 	amdgpu_vram_location(adev, &adev->mc, base);
-	adev->mc.gtt_base_align = 0;
-	amdgpu_gtt_location(adev, mc);
+	amdgpu_gart_location(adev, mc);
 	/* base offset of vram pages */
 	if (adev->flags & AMD_IS_APU)
 		adev->vm_manager.vram_base_offset = gfxhub_v1_0_get_mc_fb_offset(adev);
@@ -442,43 +444,46 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
 	u32 tmp;
 	int chansize, numchan;
 
-	/* hbm memory channel size */
-	chansize = 128;
+	adev->mc.vram_width = amdgpu_atomfirmware_get_vram_width(adev);
+	if (!adev->mc.vram_width) {
+		/* hbm memory channel size */
+		chansize = 128;
 
-	tmp = RREG32_SOC15(DF, 0, mmDF_CS_AON0_DramBaseAddress0);
-	tmp &= DF_CS_AON0_DramBaseAddress0__IntLvNumChan_MASK;
-	tmp >>= DF_CS_AON0_DramBaseAddress0__IntLvNumChan__SHIFT;
-	switch (tmp) {
-	case 0:
-	default:
-		numchan = 1;
-		break;
-	case 1:
-		numchan = 2;
-		break;
-	case 2:
-		numchan = 0;
-		break;
-	case 3:
-		numchan = 4;
-		break;
-	case 4:
-		numchan = 0;
-		break;
-	case 5:
-		numchan = 8;
-		break;
-	case 6:
-		numchan = 0;
-		break;
-	case 7:
-		numchan = 16;
-		break;
-	case 8:
-		numchan = 2;
-		break;
+		tmp = RREG32_SOC15(DF, 0, mmDF_CS_AON0_DramBaseAddress0);
+		tmp &= DF_CS_AON0_DramBaseAddress0__IntLvNumChan_MASK;
+		tmp >>= DF_CS_AON0_DramBaseAddress0__IntLvNumChan__SHIFT;
+		switch (tmp) {
+		case 0:
+		default:
+			numchan = 1;
+			break;
+		case 1:
+			numchan = 2;
+			break;
+		case 2:
+			numchan = 0;
+			break;
+		case 3:
+			numchan = 4;
+			break;
+		case 4:
+			numchan = 0;
+			break;
+		case 5:
+			numchan = 8;
+			break;
+		case 6:
+			numchan = 0;
+			break;
+		case 7:
+			numchan = 16;
+			break;
+		case 8:
+			numchan = 2;
+			break;
+		}
+		adev->mc.vram_width = numchan * chansize;
 	}
-	adev->mc.vram_width = numchan * chansize;
 
 	/* Could aper size report 0 ? */
 	adev->mc.aper_base = pci_resource_start(adev->pdev, 0);
@@ -494,15 +499,7 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev)
 	if (adev->mc.visible_vram_size > adev->mc.real_vram_size)
 		adev->mc.visible_vram_size = adev->mc.real_vram_size;
 
-	/* unless the user had overridden it, set the gart
-	 * size equal to the 1024 or vram, whichever is larger.
-	 */
-	if (amdgpu_gart_size == -1)
-		adev->mc.gtt_size = max((AMDGPU_DEFAULT_GTT_SIZE_MB << 20),
-					adev->mc.mc_vram_size);
-	else
-		adev->mc.gtt_size = (uint64_t)amdgpu_gart_size << 20;
-
+	amdgpu_gart_set_defaults(adev);
 	gmc_v9_0_vram_gtt_location(adev, &adev->mc);
 
 	return 0;
@@ -537,10 +534,21 @@ static int gmc_v9_0_sw_init(void *handle)
 
 	spin_lock_init(&adev->mc.invalidate_lock);
 
-	if (adev->flags & AMD_IS_APU) {
+	switch (adev->asic_type) {
+	case CHIP_RAVEN:
 		adev->mc.vram_type = AMDGPU_VRAM_TYPE_UNKNOWN;
-		amdgpu_vm_adjust_size(adev, 64);
-	} else {
+		if (adev->rev_id == 0x0 || adev->rev_id == 0x1) {
+			adev->vm_manager.vm_size = 1U << 18;
+			adev->vm_manager.block_size = 9;
+			adev->vm_manager.num_level = 3;
+			amdgpu_vm_set_fragment_size(adev, 9);
+		} else {
+			/* vm_size is 64GB for legacy 2-level page support */
+			amdgpu_vm_adjust_size(adev, 64, 9);
+			adev->vm_manager.num_level = 1;
+		}
+		break;
+	case CHIP_VEGA10:
 		/* XXX Don't know how to get VRAM type yet. */
 		adev->mc.vram_type = AMDGPU_VRAM_TYPE_HBM;
 		/*
@@ -550,11 +558,18 @@ static int gmc_v9_0_sw_init(void *handle)
 		 */
 		adev->vm_manager.vm_size = 1U << 18;
 		adev->vm_manager.block_size = 9;
-		DRM_INFO("vm size is %llu GB, block size is %u-bit\n",
-				adev->vm_manager.vm_size,
-				adev->vm_manager.block_size);
+		adev->vm_manager.num_level = 3;
+		amdgpu_vm_set_fragment_size(adev, 9);
+		break;
+	default:
+		break;
 	}
 
+	DRM_INFO("vm size is %llu GB, block size is %u-bit, fragment size is %u-bit\n",
+			adev->vm_manager.vm_size,
+			adev->vm_manager.block_size,
+			adev->vm_manager.fragment_size);
+
 	/* This interrupt is VMC page fault.*/
 	r = amdgpu_irq_add_id(adev, AMDGPU_IH_CLIENTID_VMC, 0,
 				&adev->mc.vm_fault);
@@ -619,11 +634,6 @@ static int gmc_v9_0_sw_init(void *handle)
 	adev->vm_manager.id_mgr[AMDGPU_GFXHUB].num_ids = AMDGPU_NUM_OF_VMIDS;
 	adev->vm_manager.id_mgr[AMDGPU_MMHUB].num_ids = AMDGPU_NUM_OF_VMIDS;
 
-	/* TODO: fix num_level for APU when updating vm size and block size */
-	if (adev->flags & AMD_IS_APU)
-		adev->vm_manager.num_level = 1;
-	else
-		adev->vm_manager.num_level = 3;
 	amdgpu_vm_manager_init(adev);
 
 	return 0;
@@ -731,7 +741,7 @@ static int gmc_v9_0_gart_enable(struct amdgpu_device *adev)
 	gmc_v9_0_gart_flush_gpu_tlb(adev, 0);
 
 	DRM_INFO("PCIE GART of %uM enabled (table at 0x%016llX).\n",
-		 (unsigned)(adev->mc.gtt_size >> 20),
+		 (unsigned)(adev->mc.gart_size >> 20),
 		 (unsigned long long)adev->gart.table_addr);
 	adev->gart.ready = true;
 	return 0;
@@ -745,6 +755,20 @@ static int gmc_v9_0_hw_init(void *handle)
 	/* The sequence of these two function calls matters.*/
 	gmc_v9_0_init_golden_registers(adev);
 
+	if (adev->mode_info.num_crtc) {
+		u32 tmp;
+
+		/* Lockout access through VGA aperture */
+		tmp = RREG32_SOC15(DCE, 0, mmVGA_HDP_CONTROL);
+		tmp = REG_SET_FIELD(tmp, VGA_HDP_CONTROL, VGA_MEMORY_DISABLE, 1);
+		WREG32_SOC15(DCE, 0, mmVGA_HDP_CONTROL, tmp);
+
+		/* disable VGA render */
+		tmp = RREG32_SOC15(DCE, 0, mmVGA_RENDER_CONTROL);
+		tmp = REG_SET_FIELD(tmp, VGA_RENDER_CONTROL, VGA_VSTATUS_CNTL, 0);
+		WREG32_SOC15(DCE, 0, mmVGA_RENDER_CONTROL, tmp);
+	}
+
 	r = gmc_v9_0_gart_enable(adev);
 
 	return r;
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
index 9804318..4395a4f 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.c
@@ -69,14 +69,14 @@ static void mmhub_v1_0_init_gart_aperture_regs(struct amdgpu_device *adev)
 	mmhub_v1_0_init_gart_pt_regs(adev);
 
 	WREG32_SOC15(MMHUB, 0, mmVM_CONTEXT0_PAGE_TABLE_START_ADDR_LO32,
-		     (u32)(adev->mc.gtt_start >> 12));
+		     (u32)(adev->mc.gart_start >> 12));
 	WREG32_SOC15(MMHUB, 0, mmVM_CONTEXT0_PAGE_TABLE_START_ADDR_HI32,
-		     (u32)(adev->mc.gtt_start >> 44));
+		     (u32)(adev->mc.gart_start >> 44));
 
 	WREG32_SOC15(MMHUB, 0, mmVM_CONTEXT0_PAGE_TABLE_END_ADDR_LO32,
-		     (u32)(adev->mc.gtt_end >> 12));
+		     (u32)(adev->mc.gart_end >> 12));
 	WREG32_SOC15(MMHUB, 0, mmVM_CONTEXT0_PAGE_TABLE_END_ADDR_HI32,
-		     (u32)(adev->mc.gtt_end >> 44));
+		     (u32)(adev->mc.gart_end >> 44));
 }
 
 static void mmhub_v1_0_init_system_aperture_regs(struct amdgpu_device *adev)
@@ -138,12 +138,12 @@ static void mmhub_v1_0_init_tlb_regs(struct amdgpu_device *adev)
 
 static void mmhub_v1_0_init_cache_regs(struct amdgpu_device *adev)
 {
-	uint32_t tmp;
+	uint32_t tmp, field;
 
 	/* Setup L2 cache */
 	tmp = RREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL);
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_CACHE, 1);
-	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_FRAGMENT_PROCESSING, 0);
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, ENABLE_L2_FRAGMENT_PROCESSING, 1);
 	/* XXX for emulation, Refer to closed source code.*/
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL, L2_PDE0_CACHE_TAG_GENERATION_MODE,
 			    0);
@@ -157,7 +157,10 @@ static void mmhub_v1_0_init_cache_regs(struct amdgpu_device *adev)
 	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL2, INVALIDATE_L2_CACHE, 1);
 	WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL2, tmp);
 
+	field = adev->vm_manager.fragment_size;
 	tmp = mmVM_L2_CNTL3_DEFAULT;
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, BANK_SELECT, field);
+	tmp = REG_SET_FIELD(tmp, VM_L2_CNTL3, L2_CACHE_BIGK_FRAGMENT_SIZE, 6);
 	WREG32_SOC15(MMHUB, 0, mmVM_L2_CNTL3, tmp);
 
 	tmp = mmVM_L2_CNTL4_DEFAULT;
@@ -222,6 +225,9 @@ static void mmhub_v1_0_setup_vmid_config(struct amdgpu_device *adev)
 		tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
 				PAGE_TABLE_BLOCK_SIZE,
 				adev->vm_manager.block_size - 9);
+		/* Send no-retry XNACK on fault to suppress VM fault storm. */
+		tmp = REG_SET_FIELD(tmp, VM_CONTEXT1_CNTL,
+				    RETRY_PERMISSION_OR_INVALID_PAGE_FAULT, 0);
 		WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_CONTEXT1_CNTL, i, tmp);
 		WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_LO32, i*2, 0);
 		WREG32_SOC15_OFFSET(MMHUB, 0, mmVM_CONTEXT1_PAGE_TABLE_START_ADDR_HI32, i*2, 0);
@@ -245,28 +251,28 @@ static void mmhub_v1_0_program_invalidation(struct amdgpu_device *adev)
 }
 
 struct pctl_data {
-    uint32_t index;
-    uint32_t data;
+	uint32_t index;
+	uint32_t data;
 };
 
-const struct pctl_data pctl0_data[] = {
-    {0x0, 0x7a640},
-    {0x9, 0x2a64a},
-    {0xd, 0x2a680},
-    {0x11, 0x6a684},
-    {0x19, 0xea68e},
-    {0x29, 0xa69e},
-    {0x2b, 0x34a6c0},
-    {0x61, 0x83a707},
-    {0xe6, 0x8a7a4},
-    {0xf0, 0x1a7b8},
-    {0xf3, 0xfa7cc},
-    {0x104, 0x17a7dd},
-    {0x11d, 0xa7dc},
-    {0x11f, 0x12a7f5},
-    {0x133, 0xa808},
-    {0x135, 0x12a810},
-    {0x149, 0x7a82c}
+static const struct pctl_data pctl0_data[] = {
+	{0x0, 0x7a640},
+	{0x9, 0x2a64a},
+	{0xd, 0x2a680},
+	{0x11, 0x6a684},
+	{0x19, 0xea68e},
+	{0x29, 0xa69e},
+	{0x2b, 0x34a6c0},
+	{0x61, 0x83a707},
+	{0xe6, 0x8a7a4},
+	{0xf0, 0x1a7b8},
+	{0xf3, 0xfa7cc},
+	{0x104, 0x17a7dd},
+	{0x11d, 0xa7dc},
+	{0x11f, 0x12a7f5},
+	{0x133, 0xa808},
+	{0x135, 0x12a810},
+	{0x149, 0x7a82c}
 };
 #define PCTL0_DATA_LEN (sizeof(pctl0_data)/sizeof(pctl0_data[0]))
 
@@ -274,32 +280,39 @@ const struct pctl_data pctl0_data[] = {
 #define PCTL0_STCTRL_REG_SAVE_RANGE0_BASE  0xa640
 #define PCTL0_STCTRL_REG_SAVE_RANGE0_LIMIT 0xa833
 
-const struct pctl_data pctl1_data[] = {
-    {0x0, 0x39a000},
-    {0x3b, 0x44a040},
-    {0x81, 0x2a08d},
-    {0x85, 0x6ba094},
-    {0xf2, 0x18a100},
-    {0x10c, 0x4a132},
-    {0x112, 0xca141},
-    {0x120, 0x2fa158},
-    {0x151, 0x17a1d0},
-    {0x16a, 0x1a1e9},
-    {0x16d, 0x13a1ec},
-    {0x182, 0x7a201},
-    {0x18b, 0x3a20a},
-    {0x190, 0x7a580},
-    {0x199, 0xa590},
-    {0x19b, 0x4a594},
-    {0x1a1, 0x1a59c},
-    {0x1a4, 0x7a82c},
-    {0x1ad, 0xfa7cc},
-    {0x1be, 0x17a7dd},
-    {0x1d7, 0x12a810}
+static const struct pctl_data pctl1_data[] = {
+	{0x0, 0x39a000},
+	{0x3b, 0x44a040},
+	{0x81, 0x2a08d},
+	{0x85, 0x6ba094},
+	{0xf2, 0x18a100},
+	{0x10c, 0x4a132},
+	{0x112, 0xca141},
+	{0x120, 0x2fa158},
+	{0x151, 0x17a1d0},
+	{0x16a, 0x1a1e9},
+	{0x16d, 0x13a1ec},
+	{0x182, 0x7a201},
+	{0x18b, 0x3a20a},
+	{0x190, 0x7a580},
+	{0x199, 0xa590},
+	{0x19b, 0x4a594},
+	{0x1a1, 0x1a59c},
+	{0x1a4, 0x7a82c},
+	{0x1ad, 0xfa7cc},
+	{0x1be, 0x17a7dd},
+	{0x1d7, 0x12a810},
+	{0x1eb, 0x4000a7e1},
+	{0x1ec, 0x5000a7f5},
+	{0x1ed, 0x4000a7e2},
+	{0x1ee, 0x5000a7dc},
+	{0x1ef, 0x4000a7e3},
+	{0x1f0, 0x5000a7f6},
+	{0x1f1, 0x5000a7e4}
 };
 #define PCTL1_DATA_LEN (sizeof(pctl1_data)/sizeof(pctl1_data[0]))
 
-#define PCTL1_RENG_EXEC_END_PTR 0x1ea
+#define PCTL1_RENG_EXEC_END_PTR 0x1f1
 #define PCTL1_STCTRL_REG_SAVE_RANGE0_BASE  0xa000
 #define PCTL1_STCTRL_REG_SAVE_RANGE0_LIMIT 0xa20d
 #define PCTL1_STCTRL_REG_SAVE_RANGE1_BASE  0xa580
diff --git a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.h b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.h
index 57bb940..5d38229 100644
--- a/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/mmhub_v1_0.h
@@ -36,7 +36,4 @@ void mmhub_v1_0_initialize_power_gating(struct amdgpu_device *adev);
 void mmhub_v1_0_update_power_gating(struct amdgpu_device *adev,
                                 bool enable);
 
-extern const struct amd_ip_funcs mmhub_v1_0_ip_funcs;
-extern const struct amdgpu_ip_block_version mmhub_v1_0_ip_block;
-
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
index bde3ca3..2812d88 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c
@@ -72,21 +72,6 @@ static void xgpu_ai_mailbox_set_valid(struct amdgpu_device *adev, bool val)
 		      reg);
 }
 
-static void xgpu_ai_mailbox_trans_msg(struct amdgpu_device *adev,
-				      enum idh_request req)
-{
-	u32 reg;
-
-	reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
-					     mmBIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW0));
-	reg = REG_SET_FIELD(reg, BIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW0,
-			    MSGBUF_DATA, req);
-	WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, mmBIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW0),
-		      reg);
-
-	xgpu_ai_mailbox_set_valid(adev, true);
-}
-
 static int xgpu_ai_mailbox_rcv_msg(struct amdgpu_device *adev,
 				   enum idh_event event)
 {
@@ -154,13 +139,25 @@ static int xgpu_ai_poll_msg(struct amdgpu_device *adev, enum idh_event event)
 	return r;
 }
 
-
-static int xgpu_ai_send_access_requests(struct amdgpu_device *adev,
-					enum idh_request req)
-{
+static void xgpu_ai_mailbox_trans_msg(struct amdgpu_device *adev,
+	      enum idh_request req, u32 data1, u32 data2, u32 data3) {
+	u32 reg;
 	int r;
 
-	xgpu_ai_mailbox_trans_msg(adev, req);
+	reg = RREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0,
+					     mmBIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW0));
+	reg = REG_SET_FIELD(reg, BIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW0,
+			    MSGBUF_DATA, req);
+	WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, mmBIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW0),
+		      reg);
+	WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, mmBIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW1),
+				data1);
+	WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, mmBIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW2),
+				data2);
+	WREG32_NO_KIQ(SOC15_REG_OFFSET(NBIO, 0, mmBIF_BX_PF0_MAILBOX_MSGBUF_TRN_DW3),
+				data3);
+
+	xgpu_ai_mailbox_set_valid(adev, true);
 
 	/* start to poll ack */
 	r = xgpu_ai_poll_ack(adev);
@@ -168,6 +165,14 @@ static int xgpu_ai_send_access_requests(struct amdgpu_device *adev,
 		pr_err("Doesn't get ack from pf, continue\n");
 
 	xgpu_ai_mailbox_set_valid(adev, false);
+}
+
+static int xgpu_ai_send_access_requests(struct amdgpu_device *adev,
+					enum idh_request req)
+{
+	int r;
+
+	xgpu_ai_mailbox_trans_msg(adev, req, 0, 0, 0);
 
 	/* start to check msg if request is idh_req_gpu_init_access */
 	if (req == IDH_REQ_GPU_INIT_ACCESS ||
@@ -342,4 +347,5 @@ const struct amdgpu_virt_ops xgpu_ai_virt_ops = {
 	.req_full_gpu	= xgpu_ai_request_full_gpu_access,
 	.rel_full_gpu	= xgpu_ai_release_full_gpu_access,
 	.reset_gpu = xgpu_ai_request_reset,
+	.trans_msg = xgpu_ai_mailbox_trans_msg,
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
index 9aefc44..1e91b9a 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.h
@@ -31,7 +31,9 @@ enum idh_request {
 	IDH_REL_GPU_INIT_ACCESS,
 	IDH_REQ_GPU_FINI_ACCESS,
 	IDH_REL_GPU_FINI_ACCESS,
-	IDH_REQ_GPU_RESET_ACCESS
+	IDH_REQ_GPU_RESET_ACCESS,
+
+	IDH_LOG_VF_ERROR       = 200,
 };
 
 enum idh_event {
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
index 171a658..c25a831 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.c
@@ -613,4 +613,5 @@ const struct amdgpu_virt_ops xgpu_vi_virt_ops = {
 	.req_full_gpu		= xgpu_vi_request_full_gpu_access,
 	.rel_full_gpu		= xgpu_vi_release_full_gpu_access,
 	.reset_gpu		= xgpu_vi_request_reset,
+	.trans_msg		= NULL, /* No need to transmit VF errors to the host. */
 };
diff --git a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h
index 2db7411..c791d73 100644
--- a/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h
+++ b/drivers/gpu/drm/amd/amdgpu/mxgpu_vi.h
@@ -32,7 +32,9 @@ enum idh_request {
 	IDH_REL_GPU_INIT_ACCESS,
 	IDH_REQ_GPU_FINI_ACCESS,
 	IDH_REL_GPU_FINI_ACCESS,
-	IDH_REQ_GPU_RESET_ACCESS
+	IDH_REQ_GPU_RESET_ACCESS,
+
+	IDH_LOG_VF_ERROR       = 200,
 };
 
 /* VI mailbox messages data */
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
index 1e272f7..045988b 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.c
@@ -32,6 +32,7 @@
 
 #define smnCPM_CONTROL                                                                                  0x11180460
 #define smnPCIE_CNTL2                                                                                   0x11180070
+#define smnPCIE_CONFIG_CNTL                                                                             0x11180044
 
 u32 nbio_v6_1_get_rev_id(struct amdgpu_device *adev)
 {
@@ -67,7 +68,7 @@ void nbio_v6_1_mc_access_enable(struct amdgpu_device *adev, bool enable)
 
 void nbio_v6_1_hdp_flush(struct amdgpu_device *adev)
 {
-	WREG32_SOC15(NBIO, 0, mmBIF_BX_PF0_HDP_MEM_COHERENCY_FLUSH_CNTL, 0);
+	WREG32_SOC15_NO_KIQ(NBIO, 0, mmBIF_BX_PF0_HDP_MEM_COHERENCY_FLUSH_CNTL, 0);
 }
 
 u32 nbio_v6_1_get_memsize(struct amdgpu_device *adev)
@@ -256,3 +257,15 @@ void nbio_v6_1_detect_hw_virt(struct amdgpu_device *adev)
 			adev->virt.caps |= AMDGPU_PASSTHROUGH_MODE;
 	}
 }
+
+void nbio_v6_1_init_registers(struct amdgpu_device *adev)
+{
+	uint32_t def, data;
+
+	def = data = RREG32_PCIE(smnPCIE_CONFIG_CNTL);
+	data = REG_SET_FIELD(data, PCIE_CONFIG_CNTL, CI_SWUS_MAX_READ_REQUEST_SIZE_MODE, 1);
+	data = REG_SET_FIELD(data, PCIE_CONFIG_CNTL, CI_SWUS_MAX_READ_REQUEST_SIZE_PRIV, 1);
+
+	if (def != data)
+		WREG32_PCIE(smnPCIE_CONFIG_CNTL, data);
+}
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.h b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.h
index f6f8bc0..686e4b4 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.h
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v6_1.h
@@ -50,5 +50,6 @@ void nbio_v6_1_update_medium_grain_clock_gating(struct amdgpu_device *adev, bool
 void nbio_v6_1_update_medium_grain_light_sleep(struct amdgpu_device *adev, bool enable);
 void nbio_v6_1_get_clockgating_state(struct amdgpu_device *adev, u32 *flags);
 void nbio_v6_1_detect_hw_virt(struct amdgpu_device *adev);
+void nbio_v6_1_init_registers(struct amdgpu_device *adev);
 
 #endif
diff --git a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c
index aa04632..11b70d6 100644
--- a/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/nbio_v7_0.c
@@ -65,7 +65,7 @@ void nbio_v7_0_mc_access_enable(struct amdgpu_device *adev, bool enable)
 
 void nbio_v7_0_hdp_flush(struct amdgpu_device *adev)
 {
-	WREG32_SOC15(NBIO, 0, mmHDP_MEM_COHERENCY_FLUSH_CNTL, 0);
+	WREG32_SOC15_NO_KIQ(NBIO, 0, mmHDP_MEM_COHERENCY_FLUSH_CNTL, 0);
 }
 
 u32 nbio_v7_0_get_memsize(struct amdgpu_device *adev)
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c b/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
index 2258323..f7cf994 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v10_0.c
@@ -86,6 +86,52 @@ psp_v10_0_get_fw_type(struct amdgpu_firmware_info *ucode, enum psp_gfx_fw_type *
 	return 0;
 }
 
+int psp_v10_0_init_microcode(struct psp_context *psp)
+{
+	struct amdgpu_device *adev = psp->adev;
+	const char *chip_name;
+	char fw_name[30];
+	int err = 0;
+	const struct psp_firmware_header_v1_0 *hdr;
+
+	DRM_DEBUG("\n");
+
+	switch (adev->asic_type) {
+	case CHIP_RAVEN:
+		chip_name = "raven";
+		break;
+	default: BUG();
+	}
+
+	snprintf(fw_name, sizeof(fw_name), "amdgpu/%s_asd.bin", chip_name);
+	err = request_firmware(&adev->psp.asd_fw, fw_name, adev->dev);
+	if (err)
+		goto out;
+
+	err = amdgpu_ucode_validate(adev->psp.asd_fw);
+	if (err)
+		goto out;
+
+	hdr = (const struct psp_firmware_header_v1_0 *)adev->psp.asd_fw->data;
+	adev->psp.asd_fw_version = le32_to_cpu(hdr->header.ucode_version);
+	adev->psp.asd_feature_version = le32_to_cpu(hdr->ucode_feature_version);
+	adev->psp.asd_ucode_size = le32_to_cpu(hdr->header.ucode_size_bytes);
+	adev->psp.asd_start_addr = (uint8_t *)hdr +
+				le32_to_cpu(hdr->header.ucode_array_offset_bytes);
+
+	return 0;
+out:
+	if (err) {
+		dev_err(adev->dev,
+			"psp v10.0: Failed to load firmware \"%s\"\n",
+			fw_name);
+		release_firmware(adev->psp.asd_fw);
+		adev->psp.asd_fw = NULL;
+	}
+
+	return err;
+}
+
 int psp_v10_0_prep_cmd_buf(struct amdgpu_firmware_info *ucode, struct psp_gfx_cmd_resp *cmd)
 {
 	int ret;
@@ -110,7 +156,6 @@ int psp_v10_0_prep_cmd_buf(struct amdgpu_firmware_info *ucode, struct psp_gfx_cm
 int psp_v10_0_ring_init(struct psp_context *psp, enum psp_ring_type ring_type)
 {
 	int ret = 0;
-	unsigned int psp_ring_reg = 0;
 	struct psp_ring *ring;
 	struct amdgpu_device *adev = psp->adev;
 
@@ -130,6 +175,16 @@ int psp_v10_0_ring_init(struct psp_context *psp, enum psp_ring_type ring_type)
 		return ret;
 	}
 
+	return 0;
+}
+
+int psp_v10_0_ring_create(struct psp_context *psp, enum psp_ring_type ring_type)
+{
+	int ret = 0;
+	unsigned int psp_ring_reg = 0;
+	struct psp_ring *ring = &psp->km_ring;
+	struct amdgpu_device *adev = psp->adev;
+
 	/* Write low address of the ring to C2PMSG_69 */
 	psp_ring_reg = lower_32_bits(ring->ring_mem_mc_addr);
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_69, psp_ring_reg);
@@ -143,13 +198,42 @@ int psp_v10_0_ring_init(struct psp_context *psp, enum psp_ring_type ring_type)
 	psp_ring_reg = ring_type;
 	psp_ring_reg = psp_ring_reg << 16;
 	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_64, psp_ring_reg);
-	/* Wait for response flag (bit 31) in C2PMSG_64 */
-	psp_ring_reg = 0;
-	while ((psp_ring_reg & 0x80000000) == 0) {
-		psp_ring_reg = RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_64);
-	}
 
-	return 0;
+	/* The hardware handshake might need a delay */
+	mdelay(20);
+
+	/* Wait for response flag (bit 31) in C2PMSG_64 */
+	ret = psp_wait_for(psp, SOC15_REG_OFFSET(MP0, 0, mmMP0_SMN_C2PMSG_64),
+			   0x80000000, 0x8000FFFF, false);
+
+	return ret;
+}
+
+int psp_v10_0_ring_destroy(struct psp_context *psp, enum psp_ring_type ring_type)
+{
+	int ret = 0;
+	struct psp_ring *ring;
+	unsigned int psp_ring_reg = 0;
+	struct amdgpu_device *adev = psp->adev;
+
+	ring = &psp->km_ring;
+
+	/* Write the ring destroy command to C2PMSG_64 */
+	psp_ring_reg = 3 << 16;
+	WREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_64, psp_ring_reg);
+
+	/* The hardware handshake might need a delay */
+	mdelay(20);
+
+	/* Wait for response flag (bit 31) in C2PMSG_64 */
+	ret = psp_wait_for(psp, SOC15_REG_OFFSET(MP0, 0, mmMP0_SMN_C2PMSG_64),
+			   0x80000000, 0x80000000, false);
+
+	amdgpu_bo_free_kernel(&adev->firmware.rbuf,
+			      &ring->ring_mem_mc_addr,
+			      (void **)&ring->ring_mem);
+
+	return ret;
 }
 
 int psp_v10_0_cmd_submit(struct psp_context *psp,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v10_0.h b/drivers/gpu/drm/amd/amdgpu/psp_v10_0.h
index 2022b7b..e76cde2 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v10_0.h
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v10_0.h
@@ -27,10 +27,15 @@
 
 #include "amdgpu_psp.h"
 
+extern int psp_v10_0_init_microcode(struct psp_context *psp);
 extern int psp_v10_0_prep_cmd_buf(struct amdgpu_firmware_info *ucode,
 				 struct psp_gfx_cmd_resp *cmd);
 extern int psp_v10_0_ring_init(struct psp_context *psp,
 			      enum psp_ring_type ring_type);
+extern int psp_v10_0_ring_create(struct psp_context *psp,
+				 enum psp_ring_type ring_type);
+extern int psp_v10_0_ring_destroy(struct psp_context *psp,
+				  enum psp_ring_type ring_type);
 extern int psp_v10_0_cmd_submit(struct psp_context *psp,
 			       struct amdgpu_firmware_info *ucode,
 			       uint64_t cmd_buf_mc_addr, uint64_t fence_mc_addr,
diff --git a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
index c98d77d..2a535a4 100644
--- a/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
+++ b/drivers/gpu/drm/amd/amdgpu/psp_v3_1.c
@@ -237,11 +237,9 @@ int psp_v3_1_bootloader_load_sos(struct psp_context *psp)
 
 	/* there might be handshake issue with hardware which needs delay */
 	mdelay(20);
-#if 0
 	ret = psp_wait_for(psp, SOC15_REG_OFFSET(MP0, 0, mmMP0_SMN_C2PMSG_81),
 			   RREG32_SOC15(MP0, 0, mmMP0_SMN_C2PMSG_81),
 			   0, true);
-#endif
 
 	return ret;
 }
@@ -341,10 +339,10 @@ int psp_v3_1_ring_destroy(struct psp_context *psp, enum psp_ring_type ring_type)
 	ret = psp_wait_for(psp, SOC15_REG_OFFSET(MP0, 0, mmMP0_SMN_C2PMSG_64),
 			   0x80000000, 0x80000000, false);
 
-	if (ring->ring_mem)
-		amdgpu_bo_free_kernel(&adev->firmware.rbuf,
-				      &ring->ring_mem_mc_addr,
-				      (void **)&ring->ring_mem);
+	amdgpu_bo_free_kernel(&adev->firmware.rbuf,
+			      &ring->ring_mem_mc_addr,
+			      (void **)&ring->ring_mem);
+
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
index 1d766ae..b1de44f 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v3_0.c
@@ -551,17 +551,53 @@ static void sdma_v3_0_rlc_stop(struct amdgpu_device *adev)
  */
 static void sdma_v3_0_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
 {
-	u32 f32_cntl;
+	u32 f32_cntl, phase_quantum = 0;
 	int i;
 
+	if (amdgpu_sdma_phase_quantum) {
+		unsigned value = amdgpu_sdma_phase_quantum;
+		unsigned unit = 0;
+
+		while (value > (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+				SDMA0_PHASE0_QUANTUM__VALUE__SHIFT)) {
+			value = (value + 1) >> 1;
+			unit++;
+		}
+		if (unit > (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+			    SDMA0_PHASE0_QUANTUM__UNIT__SHIFT)) {
+			value = (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+				 SDMA0_PHASE0_QUANTUM__VALUE__SHIFT);
+			unit = (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+				SDMA0_PHASE0_QUANTUM__UNIT__SHIFT);
+			WARN_ONCE(1,
+			"clamping sdma_phase_quantum to %uK clock cycles\n",
+				  value << unit);
+		}
+		phase_quantum =
+			value << SDMA0_PHASE0_QUANTUM__VALUE__SHIFT |
+			unit  << SDMA0_PHASE0_QUANTUM__UNIT__SHIFT;
+	}
+
 	for (i = 0; i < adev->sdma.num_instances; i++) {
 		f32_cntl = RREG32(mmSDMA0_CNTL + sdma_offsets[i]);
-		if (enable)
+		if (enable) {
 			f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
 					AUTO_CTXSW_ENABLE, 1);
-		else
+			f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+					ATC_L1_ENABLE, 1);
+			if (amdgpu_sdma_phase_quantum) {
+				WREG32(mmSDMA0_PHASE0_QUANTUM + sdma_offsets[i],
+				       phase_quantum);
+				WREG32(mmSDMA0_PHASE1_QUANTUM + sdma_offsets[i],
+				       phase_quantum);
+			}
+		} else {
 			f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
 					AUTO_CTXSW_ENABLE, 0);
+			f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
+					ATC_L1_ENABLE, 1);
+		}
+
 		WREG32(mmSDMA0_CNTL + sdma_offsets[i], f32_cntl);
 	}
 }
diff --git a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
index 4a65697..fd7c72a 100644
--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -291,6 +291,8 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
 
 	DRM_DEBUG("Setting write pointer\n");
 	if (ring->use_doorbell) {
+		u64 *wb = (u64 *)&adev->wb.wb[ring->wptr_offs];
+
 		DRM_DEBUG("Using doorbell -- "
 				"wptr_offs == 0x%08x "
 				"lower_32_bits(ring->wptr) << 2 == 0x%08x "
@@ -299,8 +301,7 @@ static void sdma_v4_0_ring_set_wptr(struct amdgpu_ring *ring)
 				lower_32_bits(ring->wptr << 2),
 				upper_32_bits(ring->wptr << 2));
 		/* XXX check if swapping is necessary on BE */
-		adev->wb.wb[ring->wptr_offs] = lower_32_bits(ring->wptr << 2);
-		adev->wb.wb[ring->wptr_offs + 1] = upper_32_bits(ring->wptr << 2);
+		WRITE_ONCE(*wb, (ring->wptr << 2));
 		DRM_DEBUG("calling WDOORBELL64(0x%08x, 0x%016llx)\n",
 				ring->doorbell_index, ring->wptr << 2);
 		WDOORBELL64(ring->doorbell_index, ring->wptr << 2);
@@ -493,13 +494,45 @@ static void sdma_v4_0_rlc_stop(struct amdgpu_device *adev)
  */
 static void sdma_v4_0_ctx_switch_enable(struct amdgpu_device *adev, bool enable)
 {
-	u32 f32_cntl;
+	u32 f32_cntl, phase_quantum = 0;
 	int i;
 
+	if (amdgpu_sdma_phase_quantum) {
+		unsigned value = amdgpu_sdma_phase_quantum;
+		unsigned unit = 0;
+
+		while (value > (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+				SDMA0_PHASE0_QUANTUM__VALUE__SHIFT)) {
+			value = (value + 1) >> 1;
+			unit++;
+		}
+		if (unit > (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+			    SDMA0_PHASE0_QUANTUM__UNIT__SHIFT)) {
+			value = (SDMA0_PHASE0_QUANTUM__VALUE_MASK >>
+				 SDMA0_PHASE0_QUANTUM__VALUE__SHIFT);
+			unit = (SDMA0_PHASE0_QUANTUM__UNIT_MASK >>
+				SDMA0_PHASE0_QUANTUM__UNIT__SHIFT);
+			WARN_ONCE(1,
+			"clamping sdma_phase_quantum to %uK clock cycles\n",
+				  value << unit);
+		}
+		phase_quantum =
+			value << SDMA0_PHASE0_QUANTUM__VALUE__SHIFT |
+			unit  << SDMA0_PHASE0_QUANTUM__UNIT__SHIFT;
+	}
+
 	for (i = 0; i < adev->sdma.num_instances; i++) {
 		f32_cntl = RREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_CNTL));
 		f32_cntl = REG_SET_FIELD(f32_cntl, SDMA0_CNTL,
 				AUTO_CTXSW_ENABLE, enable ? 1 : 0);
+		if (enable && amdgpu_sdma_phase_quantum) {
+			WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_PHASE0_QUANTUM),
+			       phase_quantum);
+			WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_PHASE1_QUANTUM),
+			       phase_quantum);
+			WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_PHASE2_QUANTUM),
+			       phase_quantum);
+		}
 		WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_CNTL), f32_cntl);
 	}
 
@@ -541,12 +574,13 @@ static void sdma_v4_0_enable(struct amdgpu_device *adev, bool enable)
 static int sdma_v4_0_gfx_resume(struct amdgpu_device *adev)
 {
 	struct amdgpu_ring *ring;
-	u32 rb_cntl, ib_cntl;
+	u32 rb_cntl, ib_cntl, wptr_poll_cntl;
 	u32 rb_bufsz;
 	u32 wb_offset;
 	u32 doorbell;
 	u32 doorbell_offset;
 	u32 temp;
+	u64 wptr_gpu_addr;
 	int i, r;
 
 	for (i = 0; i < adev->sdma.num_instances; i++) {
@@ -628,6 +662,19 @@ static int sdma_v4_0_gfx_resume(struct amdgpu_device *adev)
 			WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_F32_CNTL), temp);
 		}
 
+		/* setup the wptr shadow polling */
+		wptr_gpu_addr = adev->wb.gpu_addr + (ring->wptr_offs * 4);
+		WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_LO),
+		       lower_32_bits(wptr_gpu_addr));
+		WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_GFX_RB_WPTR_POLL_ADDR_HI),
+		       upper_32_bits(wptr_gpu_addr));
+		wptr_poll_cntl = RREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_GFX_RB_WPTR_POLL_CNTL));
+		if (amdgpu_sriov_vf(adev))
+			wptr_poll_cntl = REG_SET_FIELD(wptr_poll_cntl, SDMA0_GFX_RB_WPTR_POLL_CNTL, F32_POLL_ENABLE, 1);
+		else
+			wptr_poll_cntl = REG_SET_FIELD(wptr_poll_cntl, SDMA0_GFX_RB_WPTR_POLL_CNTL, F32_POLL_ENABLE, 0);
+		WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_GFX_RB_WPTR_POLL_CNTL), wptr_poll_cntl);
+
 		/* enable DMA RB */
 		rb_cntl = REG_SET_FIELD(rb_cntl, SDMA0_GFX_RB_CNTL, RB_ENABLE, 1);
 		WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_GFX_RB_CNTL), rb_cntl);
@@ -655,6 +702,7 @@ static int sdma_v4_0_gfx_resume(struct amdgpu_device *adev)
 
 		if (adev->mman.buffer_funcs_ring == ring)
 			amdgpu_ttm_set_active_vram_size(adev, adev->mc.real_vram_size);
+
 	}
 
 	return 0;
@@ -751,15 +799,12 @@ static int sdma_v4_0_load_microcode(struct amdgpu_device *adev)
 	const struct sdma_firmware_header_v1_0 *hdr;
 	const __le32 *fw_data;
 	u32 fw_size;
-	u32 digest_size = 0;
 	int i, j;
 
 	/* halt the MEs */
 	sdma_v4_0_enable(adev, false);
 
 	for (i = 0; i < adev->sdma.num_instances; i++) {
-		uint16_t version_major;
-		uint16_t version_minor;
 		if (!adev->sdma.instance[i].fw)
 			return -EINVAL;
 
@@ -767,23 +812,12 @@ static int sdma_v4_0_load_microcode(struct amdgpu_device *adev)
 		amdgpu_ucode_print_sdma_hdr(&hdr->header);
 		fw_size = le32_to_cpu(hdr->header.ucode_size_bytes) / 4;
 
-		version_major = le16_to_cpu(hdr->header.header_version_major);
-		version_minor = le16_to_cpu(hdr->header.header_version_minor);
-
-		if (version_major == 1 && version_minor >= 1) {
-			const struct sdma_firmware_header_v1_1 *sdma_v1_1_hdr = (const struct sdma_firmware_header_v1_1 *) hdr;
-			digest_size = le32_to_cpu(sdma_v1_1_hdr->digest_size);
-		}
-
-		fw_size -= digest_size;
-
 		fw_data = (const __le32 *)
 			(adev->sdma.instance[i].fw->data +
 				le32_to_cpu(hdr->header.ucode_array_offset_bytes));
 
 		WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_UCODE_ADDR), 0);
 
-
 		for (j = 0; j < fw_size; j++)
 			WREG32(sdma_v4_0_get_reg_offset(i, mmSDMA0_UCODE_DATA), le32_to_cpup(fw_data++));
 
diff --git a/drivers/gpu/drm/amd/amdgpu/si.c b/drivers/gpu/drm/amd/amdgpu/si.c
index 4267fa4..8284d5d 100644
--- a/drivers/gpu/drm/amd/amdgpu/si.c
+++ b/drivers/gpu/drm/amd/amdgpu/si.c
@@ -1150,6 +1150,33 @@ static bool si_read_disabled_bios(struct amdgpu_device *adev)
 	return r;
 }
 
+#define mmROM_INDEX 0x2A
+#define mmROM_DATA  0x2B
+
+static bool si_read_bios_from_rom(struct amdgpu_device *adev,
+				  u8 *bios, u32 length_bytes)
+{
+	u32 *dw_ptr;
+	u32 i, length_dw;
+
+	if (bios == NULL)
+		return false;
+	if (length_bytes == 0)
+		return false;
+	/* APU vbios image is part of sbios image */
+	if (adev->flags & AMD_IS_APU)
+		return false;
+
+	dw_ptr = (u32 *)bios;
+	length_dw = ALIGN(length_bytes, 4) / 4;
+	/* set rom index to 0 */
+	WREG32(mmROM_INDEX, 0);
+	for (i = 0; i < length_dw; i++)
+		dw_ptr[i] = RREG32(mmROM_DATA);
+
+	return true;
+}
+
 //xxx: not implemented
 static int si_asic_reset(struct amdgpu_device *adev)
 {
@@ -1206,6 +1233,7 @@ static void si_detect_hw_virtualization(struct amdgpu_device *adev)
 static const struct amdgpu_asic_funcs si_asic_funcs =
 {
 	.read_disabled_bios = &si_read_disabled_bios,
+	.read_bios_from_rom = &si_read_bios_from_rom,
 	.read_register = &si_read_register,
 	.reset = &si_asic_reset,
 	.set_vga_state = &si_vga_set_state,
diff --git a/drivers/gpu/drm/amd/amdgpu/si_dpm.c b/drivers/gpu/drm/amd/amdgpu/si_dpm.c
index a7ad839..d63873f 100644
--- a/drivers/gpu/drm/amd/amdgpu/si_dpm.c
+++ b/drivers/gpu/drm/amd/amdgpu/si_dpm.c
@@ -2055,6 +2055,7 @@ static void si_initialize_powertune_defaults(struct amdgpu_device *adev)
 		case 0x682C:
 			si_pi->cac_weights = cac_weights_cape_verde_pro;
 			si_pi->dte_data = dte_data_sun_xt;
+			update_dte_from_pl2 = true;
 			break;
 		case 0x6825:
 		case 0x6827:
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15.c b/drivers/gpu/drm/amd/amdgpu/soc15.c
index a7341d8..f2c3a49 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -25,7 +25,7 @@
 #include <linux/module.h>
 #include <drm/drmP.h>
 #include "amdgpu.h"
-#include "amdgpu_atomfirmware.h"
+#include "amdgpu_atombios.h"
 #include "amdgpu_ih.h"
 #include "amdgpu_uvd.h"
 #include "amdgpu_vce.h"
@@ -62,8 +62,6 @@
 #include "dce_virtual.h"
 #include "mxgpu_ai.h"
 
-MODULE_FIRMWARE("amdgpu/vega10_smc.bin");
-
 #define mmFabricConfigAccessControl                                                                    0x0410
 #define mmFabricConfigAccessControl_BASE_IDX                                                           0
 #define mmFabricConfigAccessControl_DEFAULT                                      0x00000000
@@ -198,6 +196,50 @@ static void soc15_didt_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
 	spin_unlock_irqrestore(&adev->didt_idx_lock, flags);
 }
 
+static u32 soc15_gc_cac_rreg(struct amdgpu_device *adev, u32 reg)
+{
+	unsigned long flags;
+	u32 r;
+
+	spin_lock_irqsave(&adev->gc_cac_idx_lock, flags);
+	WREG32_SOC15(GC, 0, mmGC_CAC_IND_INDEX, (reg));
+	r = RREG32_SOC15(GC, 0, mmGC_CAC_IND_DATA);
+	spin_unlock_irqrestore(&adev->gc_cac_idx_lock, flags);
+	return r;
+}
+
+static void soc15_gc_cac_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&adev->gc_cac_idx_lock, flags);
+	WREG32_SOC15(GC, 0, mmGC_CAC_IND_INDEX, (reg));
+	WREG32_SOC15(GC, 0, mmGC_CAC_IND_DATA, (v));
+	spin_unlock_irqrestore(&adev->gc_cac_idx_lock, flags);
+}
+
+static u32 soc15_se_cac_rreg(struct amdgpu_device *adev, u32 reg)
+{
+	unsigned long flags;
+	u32 r;
+
+	spin_lock_irqsave(&adev->se_cac_idx_lock, flags);
+	WREG32_SOC15(GC, 0, mmSE_CAC_IND_INDEX, (reg));
+	r = RREG32_SOC15(GC, 0, mmSE_CAC_IND_DATA);
+	spin_unlock_irqrestore(&adev->se_cac_idx_lock, flags);
+	return r;
+}
+
+static void soc15_se_cac_wreg(struct amdgpu_device *adev, u32 reg, u32 v)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&adev->se_cac_idx_lock, flags);
+	WREG32_SOC15(GC, 0, mmSE_CAC_IND_INDEX, (reg));
+	WREG32_SOC15(GC, 0, mmSE_CAC_IND_DATA, (v));
+	spin_unlock_irqrestore(&adev->se_cac_idx_lock, flags);
+}
+
 static u32 soc15_get_config_memsize(struct amdgpu_device *adev)
 {
 	if (adev->flags & AMD_IS_APU)
@@ -392,11 +434,11 @@ static void soc15_gpu_pci_config_reset(struct amdgpu_device *adev)
 
 static int soc15_asic_reset(struct amdgpu_device *adev)
 {
-	amdgpu_atomfirmware_scratch_regs_engine_hung(adev, true);
+	amdgpu_atombios_scratch_regs_engine_hung(adev, true);
 
 	soc15_gpu_pci_config_reset(adev);
 
-	amdgpu_atomfirmware_scratch_regs_engine_hung(adev, false);
+	amdgpu_atombios_scratch_regs_engine_hung(adev, false);
 
 	return 0;
 }
@@ -524,13 +566,6 @@ static uint32_t soc15_get_rev_id(struct amdgpu_device *adev)
 		return nbio_v6_1_get_rev_id(adev);
 }
 
-
-int gmc_v9_0_mc_wait_for_idle(struct amdgpu_device *adev)
-{
-	/* to be implemented in MC IP*/
-	return 0;
-}
-
 static const struct amdgpu_asic_funcs soc15_asic_funcs =
 {
 	.read_disabled_bios = &soc15_read_disabled_bios,
@@ -557,6 +592,10 @@ static int soc15_common_early_init(void *handle)
 	adev->uvd_ctx_wreg = &soc15_uvd_ctx_wreg;
 	adev->didt_rreg = &soc15_didt_rreg;
 	adev->didt_wreg = &soc15_didt_wreg;
+	adev->gc_cac_rreg = &soc15_gc_cac_rreg;
+	adev->gc_cac_wreg = &soc15_gc_cac_wreg;
+	adev->se_cac_rreg = &soc15_se_cac_rreg;
+	adev->se_cac_wreg = &soc15_se_cac_wreg;
 
 	adev->asic_funcs = &soc15_asic_funcs;
 
@@ -681,6 +720,9 @@ static int soc15_common_hw_init(void *handle)
 	soc15_pcie_gen3_enable(adev);
 	/* enable aspm */
 	soc15_program_aspm(adev);
+	/* setup nbio registers */
+	if (!(adev->flags & AMD_IS_APU))
+		nbio_v6_1_init_registers(adev);
 	/* enable the doorbell aperture */
 	soc15_enable_doorbell_aperture(adev, true);
 
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15_common.h b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
index e2d330e..7a8e4e2 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15_common.h
+++ b/drivers/gpu/drm/amd/amdgpu/soc15_common.h
@@ -77,6 +77,13 @@ struct nbio_pcie_index_data {
 		(3 == reg##_BASE_IDX ? ip##_BASE__INST##inst##_SEG3 + reg : \
 		(ip##_BASE__INST##inst##_SEG4 + reg))))), value)
 
+#define WREG32_SOC15_NO_KIQ(ip, inst, reg, value) \
+	WREG32_NO_KIQ( (0 == reg##_BASE_IDX ? ip##_BASE__INST##inst##_SEG0 + reg : \
+		(1 == reg##_BASE_IDX ? ip##_BASE__INST##inst##_SEG1 + reg : \
+		(2 == reg##_BASE_IDX ? ip##_BASE__INST##inst##_SEG2 + reg : \
+		(3 == reg##_BASE_IDX ? ip##_BASE__INST##inst##_SEG3 + reg : \
+		(ip##_BASE__INST##inst##_SEG4 + reg))))), value)
+
 #define WREG32_SOC15_OFFSET(ip, inst, reg, offset, value) \
 	WREG32( (0 == reg##_BASE_IDX ? ip##_BASE__INST##inst##_SEG0 + reg : \
 		(1 == reg##_BASE_IDX ? ip##_BASE__INST##inst##_SEG1 + reg : \
diff --git a/drivers/gpu/drm/amd/amdgpu/soc15d.h b/drivers/gpu/drm/amd/amdgpu/soc15d.h
index e79befd..7f408f8 100644
--- a/drivers/gpu/drm/amd/amdgpu/soc15d.h
+++ b/drivers/gpu/drm/amd/amdgpu/soc15d.h
@@ -250,6 +250,7 @@
 #define	PACKET3_SET_UCONFIG_REG				0x79
 #define		PACKET3_SET_UCONFIG_REG_START			0x0000c000
 #define		PACKET3_SET_UCONFIG_REG_END			0x0000c400
+#define		PACKET3_SET_UCONFIG_REG_INDEX_TYPE		(2 << 28)
 #define	PACKET3_SCRATCH_RAM_WRITE			0x7D
 #define	PACKET3_SCRATCH_RAM_READ			0x7E
 #define	PACKET3_LOAD_CONST_RAM				0x80
diff --git a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
index 987b958..23a8575 100644
--- a/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/uvd_v7_0.c
@@ -165,6 +165,9 @@ static int uvd_v7_0_enc_ring_test_ring(struct amdgpu_ring *ring)
 	unsigned i;
 	int r;
 
+	if (amdgpu_sriov_vf(adev))
+		return 0;
+
 	r = amdgpu_ring_alloc(ring, 16);
 	if (r) {
 		DRM_ERROR("amdgpu: uvd enc failed to lock ring %d (%d).\n",
@@ -432,13 +435,19 @@ static int uvd_v7_0_sw_init(void *handle)
 			return r;
 	}
 
-
 	for (i = 0; i < adev->uvd.num_enc_rings; ++i) {
 		ring = &adev->uvd.ring_enc[i];
 		sprintf(ring->name, "uvd_enc%d", i);
 		if (amdgpu_sriov_vf(adev)) {
 			ring->use_doorbell = true;
-			ring->doorbell_index = AMDGPU_DOORBELL64_UVD_RING0_1 * 2;
+
+			/* currently only use the first encoding ring for
+			 * sriov, so set unused location for other unused rings.
+			 */
+			if (i == 0)
+				ring->doorbell_index = AMDGPU_DOORBELL64_UVD_RING0_1 * 2;
+			else
+				ring->doorbell_index = AMDGPU_DOORBELL64_UVD_RING2_3 * 2 + 1;
 		}
 		r = amdgpu_ring_init(adev, ring, 512, &adev->uvd.irq, 0);
 		if (r)
@@ -685,6 +694,11 @@ static int uvd_v7_0_mmsch_start(struct amdgpu_device *adev,
 	/* 4, set resp to zero */
 	WREG32_SOC15(VCE, 0, mmVCE_MMSCH_VF_MAILBOX_RESP, 0);
 
+	WDOORBELL32(adev->uvd.ring_enc[0].doorbell_index, 0);
+	adev->wb.wb[adev->uvd.ring_enc[0].wptr_offs] = 0;
+	adev->uvd.ring_enc[0].wptr = 0;
+	adev->uvd.ring_enc[0].wptr_old = 0;
+
 	/* 5, kick off the initialization and wait until VCE_MMSCH_VF_MAILBOX_RESP becomes non-zero */
 	WREG32_SOC15(VCE, 0, mmVCE_MMSCH_VF_MAILBOX_HOST, 0x10000001);
 
@@ -702,7 +716,6 @@ static int uvd_v7_0_mmsch_start(struct amdgpu_device *adev,
 		dev_err(adev->dev, "failed to init MMSCH, mmVCE_MMSCH_VF_MAILBOX_RESP = %x\n", data);
 		return -EBUSY;
 	}
-	WDOORBELL32(adev->uvd.ring_enc[0].doorbell_index, 0);
 
 	return 0;
 }
@@ -736,11 +749,9 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device *adev)
 		init_table += header->uvd_table_offset;
 
 		ring = &adev->uvd.ring;
+		ring->wptr = 0;
 		size = AMDGPU_GPU_PAGE_ALIGN(adev->uvd.fw->size + 4);
 
-		/* disable clock gating */
-		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_POWER_STATUS),
-						   ~UVD_POWER_STATUS__UVD_PG_MODE_MASK, 0);
 		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_STATUS),
 						   0xFFFFFFFF, 0x00000004);
 		/* mc resume*/
@@ -777,12 +788,6 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device *adev)
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_VCPU_CACHE_SIZE2),
 					    AMDGPU_UVD_STACK_SIZE + (AMDGPU_UVD_SESSION_SIZE * 40));
 
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_UDEC_ADDR_CONFIG),
-					    adev->gfx.config.gb_addr_config);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_UDEC_DB_ADDR_CONFIG),
-					    adev->gfx.config.gb_addr_config);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_UDEC_DBW_ADDR_CONFIG),
-					    adev->gfx.config.gb_addr_config);
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_GP_SCRATCH4), adev->uvd.max_handles);
 		/* mc resume end*/
 
@@ -819,17 +824,6 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device *adev)
 						       UVD_LMI_CTRL__REQ_MODE_MASK |
 						       0x00100000L));
 
-		/* disable byte swapping */
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_LMI_SWAP_CNTL), 0);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_MP_SWAP_CNTL), 0);
-
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_MPC_SET_MUXA0), 0x40c2040);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_MPC_SET_MUXA1), 0x0);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_MPC_SET_MUXB0), 0x40c2040);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_MPC_SET_MUXB1), 0x0);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_MPC_SET_ALU), 0);
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_MPC_SET_MUX), 0x88);
-
 		/* take all subblocks out of reset, except VCPU */
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_SOFT_RESET),
 					    UVD_SOFT_RESET__VCPU_SOFT_RESET_MASK);
@@ -838,15 +832,6 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device *adev)
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_VCPU_CNTL),
 					    UVD_VCPU_CNTL__CLK_EN_MASK);
 
-		/* enable UMC */
-		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_LMI_CTRL2),
-						   ~UVD_LMI_CTRL2__STALL_ARB_UMC_MASK, 0);
-
-		/* boot up the VCPU */
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_SOFT_RESET), 0);
-
-		MMSCH_V1_0_INSERT_DIRECT_POLL(SOC15_REG_OFFSET(UVD, 0, mmUVD_STATUS), 0x02, 0x02);
-
 		/* enable master interrupt */
 		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_MASTINT_EN),
 						   ~(UVD_MASTINT_EN__VCPU_EN_MASK|UVD_MASTINT_EN__SYS_EN_MASK),
@@ -859,40 +844,31 @@ static int uvd_v7_0_sriov_start(struct amdgpu_device *adev)
 		/* force RBC into idle state */
 		size = order_base_2(ring->ring_size);
 		tmp = REG_SET_FIELD(0, UVD_RBC_RB_CNTL, RB_BUFSZ, size);
-		tmp = REG_SET_FIELD(tmp, UVD_RBC_RB_CNTL, RB_BLKSZ, 1);
 		tmp = REG_SET_FIELD(tmp, UVD_RBC_RB_CNTL, RB_NO_FETCH, 1);
-		tmp = REG_SET_FIELD(tmp, UVD_RBC_RB_CNTL, RB_WPTR_POLL_EN, 0);
-		tmp = REG_SET_FIELD(tmp, UVD_RBC_RB_CNTL, RB_NO_UPDATE, 1);
-		tmp = REG_SET_FIELD(tmp, UVD_RBC_RB_CNTL, RB_RPTR_WR_EN, 1);
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_RBC_RB_CNTL), tmp);
 
-		/* set the write pointer delay */
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_RBC_RB_WPTR_CNTL), 0);
-
-		/* set the wb address */
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_RBC_RB_RPTR_ADDR),
-					    (upper_32_bits(ring->gpu_addr) >> 2));
-
-		/* programm the RB_BASE for ring buffer */
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_LMI_RBC_RB_64BIT_BAR_LOW),
-					    lower_32_bits(ring->gpu_addr));
-		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_LMI_RBC_RB_64BIT_BAR_HIGH),
-					    upper_32_bits(ring->gpu_addr));
-
-		ring->wptr = 0;
 		ring = &adev->uvd.ring_enc[0];
+		ring->wptr = 0;
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_RB_BASE_LO), ring->gpu_addr);
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_RB_BASE_HI), upper_32_bits(ring->gpu_addr));
 		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_RB_SIZE), ring->ring_size / 4);
 
+		/* boot up the VCPU */
+		MMSCH_V1_0_INSERT_DIRECT_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_SOFT_RESET), 0);
+
+		/* enable UMC */
+		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(UVD, 0, mmUVD_LMI_CTRL2),
+						   ~UVD_LMI_CTRL2__STALL_ARB_UMC_MASK, 0);
+
+		MMSCH_V1_0_INSERT_DIRECT_POLL(SOC15_REG_OFFSET(UVD, 0, mmUVD_STATUS), 0x02, 0x02);
+
 		/* add end packet */
 		memcpy((void *)init_table, &end, sizeof(struct mmsch_v1_0_cmd_end));
 		table_size += sizeof(struct mmsch_v1_0_cmd_end) / 4;
 		header->uvd_table_size = table_size;
 
-		return uvd_v7_0_mmsch_start(adev, &adev->virt.mm_table);
 	}
-	return -EINVAL; /* already initializaed ? */
+	return uvd_v7_0_mmsch_start(adev, &adev->virt.mm_table);
 }
 
 /**
diff --git a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
index 1ecd6bb..11134d5 100644
--- a/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/vce_v4_0.c
@@ -173,6 +173,11 @@ static int vce_v4_0_mmsch_start(struct amdgpu_device *adev,
 	/* 4, set resp to zero */
 	WREG32(SOC15_REG_OFFSET(VCE, 0, mmVCE_MMSCH_VF_MAILBOX_RESP), 0);
 
+	WDOORBELL32(adev->vce.ring[0].doorbell_index, 0);
+	adev->wb.wb[adev->vce.ring[0].wptr_offs] = 0;
+	adev->vce.ring[0].wptr = 0;
+	adev->vce.ring[0].wptr_old = 0;
+
 	/* 5, kick off the initialization and wait until VCE_MMSCH_VF_MAILBOX_RESP becomes non-zero */
 	WREG32(SOC15_REG_OFFSET(VCE, 0, mmVCE_MMSCH_VF_MAILBOX_HOST), 0x10000001);
 
@@ -190,7 +195,6 @@ static int vce_v4_0_mmsch_start(struct amdgpu_device *adev,
 		dev_err(adev->dev, "failed to init MMSCH, mmVCE_MMSCH_VF_MAILBOX_RESP = %x\n", data);
 		return -EBUSY;
 	}
-	WDOORBELL32(adev->vce.ring[0].doorbell_index, 0);
 
 	return 0;
 }
@@ -274,7 +278,8 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
 
 		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_LMI_CTRL2), ~0x100, 0);
 		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_SYS_INT_EN),
-						   0xffffffff, VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK);
+						   VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK,
+						   VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK);
 
 		/* end of MC_RESUME */
 		MMSCH_V1_0_INSERT_DIRECT_RD_MOD_WT(SOC15_REG_OFFSET(VCE, 0, mmVCE_STATUS),
@@ -296,11 +301,9 @@ static int vce_v4_0_sriov_start(struct amdgpu_device *adev)
 		memcpy((void *)init_table, &end, sizeof(struct mmsch_v1_0_cmd_end));
 		table_size += sizeof(struct mmsch_v1_0_cmd_end) / 4;
 		header->vce_table_size = table_size;
-
-		return vce_v4_0_mmsch_start(adev, &adev->virt.mm_table);
 	}
 
-	return -EINVAL; /* already initializaed ? */
+	return vce_v4_0_mmsch_start(adev, &adev->virt.mm_table);
 }
 
 /**
@@ -443,12 +446,14 @@ static int vce_v4_0_sw_init(void *handle)
 		if (amdgpu_sriov_vf(adev)) {
 			/* DOORBELL only works under SRIOV */
 			ring->use_doorbell = true;
+
+			/* currently only use the first encoding ring for sriov,
+			 * so set unused location for other unused rings.
+			 */
 			if (i == 0)
-				ring->doorbell_index = AMDGPU_DOORBELL64_RING0_1 * 2;
-			else if (i == 1)
-				ring->doorbell_index = AMDGPU_DOORBELL64_RING2_3 * 2;
+				ring->doorbell_index = AMDGPU_DOORBELL64_VCE_RING0_1 * 2;
 			else
-				ring->doorbell_index = AMDGPU_DOORBELL64_RING2_3 * 2 + 1;
+				ring->doorbell_index = AMDGPU_DOORBELL64_VCE_RING2_3 * 2 + 1;
 		}
 		r = amdgpu_ring_init(adev, ring, 512, &adev->vce.irq, 0);
 		if (r)
@@ -990,11 +995,13 @@ static int vce_v4_0_set_interrupt_state(struct amdgpu_device *adev,
 {
 	uint32_t val = 0;
 
-	if (state == AMDGPU_IRQ_STATE_ENABLE)
-		val |= VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK;
+	if (!amdgpu_sriov_vf(adev)) {
+		if (state == AMDGPU_IRQ_STATE_ENABLE)
+			val |= VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK;
 
-	WREG32_P(SOC15_REG_OFFSET(VCE, 0, mmVCE_SYS_INT_EN), val,
-			~VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK);
+		WREG32_P(SOC15_REG_OFFSET(VCE, 0, mmVCE_SYS_INT_EN), val,
+				~VCE_SYS_INT_EN__VCE_SYS_INT_TRAP_INTERRUPT_EN_MASK);
+	}
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdgpu/vi.c b/drivers/gpu/drm/amd/amdgpu/vi.c
index 6cac291..9ff69b90 100644
--- a/drivers/gpu/drm/amd/amdgpu/vi.c
+++ b/drivers/gpu/drm/amd/amdgpu/vi.c
@@ -1028,8 +1028,7 @@ static int vi_common_early_init(void *handle)
 		/* rev0 hardware requires workarounds to support PG */
 		adev->pg_flags = 0;
 		if (adev->rev_id != 0x00 || CZ_REV_BRISTOL(adev->pdev->revision)) {
-			adev->pg_flags |= AMD_PG_SUPPORT_GFX_PG |
-				AMD_PG_SUPPORT_GFX_SMG |
+			adev->pg_flags |= AMD_PG_SUPPORT_GFX_SMG |
 				AMD_PG_SUPPORT_GFX_PIPELINE |
 				AMD_PG_SUPPORT_CP |
 				AMD_PG_SUPPORT_UVD |
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
index 6316aad..e4a8c2e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c
@@ -142,12 +142,12 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 				struct kfd_ioctl_create_queue_args *args)
 {
 	if (args->queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
-		pr_err("kfd: queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
+		pr_err("Queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
 		return -EINVAL;
 	}
 
 	if (args->queue_priority > KFD_MAX_QUEUE_PRIORITY) {
-		pr_err("kfd: queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
+		pr_err("Queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
 		return -EINVAL;
 	}
 
@@ -155,26 +155,26 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 		(!access_ok(VERIFY_WRITE,
 			(const void __user *) args->ring_base_address,
 			sizeof(uint64_t)))) {
-		pr_err("kfd: can't access ring base address\n");
+		pr_err("Can't access ring base address\n");
 		return -EFAULT;
 	}
 
 	if (!is_power_of_2(args->ring_size) && (args->ring_size != 0)) {
-		pr_err("kfd: ring size must be a power of 2 or 0\n");
+		pr_err("Ring size must be a power of 2 or 0\n");
 		return -EINVAL;
 	}
 
 	if (!access_ok(VERIFY_WRITE,
 			(const void __user *) args->read_pointer_address,
 			sizeof(uint32_t))) {
-		pr_err("kfd: can't access read pointer\n");
+		pr_err("Can't access read pointer\n");
 		return -EFAULT;
 	}
 
 	if (!access_ok(VERIFY_WRITE,
 			(const void __user *) args->write_pointer_address,
 			sizeof(uint32_t))) {
-		pr_err("kfd: can't access write pointer\n");
+		pr_err("Can't access write pointer\n");
 		return -EFAULT;
 	}
 
@@ -182,7 +182,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 		!access_ok(VERIFY_WRITE,
 			(const void __user *) args->eop_buffer_address,
 			sizeof(uint32_t))) {
-		pr_debug("kfd: can't access eop buffer");
+		pr_debug("Can't access eop buffer");
 		return -EFAULT;
 	}
 
@@ -190,7 +190,7 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 		!access_ok(VERIFY_WRITE,
 			(const void __user *) args->ctx_save_restore_address,
 			sizeof(uint32_t))) {
-		pr_debug("kfd: can't access ctx save restore buffer");
+		pr_debug("Can't access ctx save restore buffer");
 		return -EFAULT;
 	}
 
@@ -219,27 +219,27 @@ static int set_queue_properties_from_user(struct queue_properties *q_properties,
 	else
 		q_properties->format = KFD_QUEUE_FORMAT_PM4;
 
-	pr_debug("Queue Percentage (%d, %d)\n",
+	pr_debug("Queue Percentage: %d, %d\n",
 			q_properties->queue_percent, args->queue_percentage);
 
-	pr_debug("Queue Priority (%d, %d)\n",
+	pr_debug("Queue Priority: %d, %d\n",
 			q_properties->priority, args->queue_priority);
 
-	pr_debug("Queue Address (0x%llX, 0x%llX)\n",
+	pr_debug("Queue Address: 0x%llX, 0x%llX\n",
 			q_properties->queue_address, args->ring_base_address);
 
-	pr_debug("Queue Size (0x%llX, %u)\n",
+	pr_debug("Queue Size: 0x%llX, %u\n",
 			q_properties->queue_size, args->ring_size);
 
-	pr_debug("Queue r/w Pointers (0x%llX, 0x%llX)\n",
-			(uint64_t) q_properties->read_ptr,
-			(uint64_t) q_properties->write_ptr);
+	pr_debug("Queue r/w Pointers: %p, %p\n",
+			q_properties->read_ptr,
+			q_properties->write_ptr);
 
-	pr_debug("Queue Format (%d)\n", q_properties->format);
+	pr_debug("Queue Format: %d\n", q_properties->format);
 
-	pr_debug("Queue EOP (0x%llX)\n", q_properties->eop_ring_buffer_address);
+	pr_debug("Queue EOP: 0x%llX\n", q_properties->eop_ring_buffer_address);
 
-	pr_debug("Queue CTX save arex (0x%llX)\n",
+	pr_debug("Queue CTX save area: 0x%llX\n",
 			q_properties->ctx_save_restore_area_address);
 
 	return 0;
@@ -257,16 +257,16 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 
 	memset(&q_properties, 0, sizeof(struct queue_properties));
 
-	pr_debug("kfd: creating queue ioctl\n");
+	pr_debug("Creating queue ioctl\n");
 
 	err = set_queue_properties_from_user(&q_properties, args);
 	if (err)
 		return err;
 
-	pr_debug("kfd: looking for gpu id 0x%x\n", args->gpu_id);
+	pr_debug("Looking for gpu id 0x%x\n", args->gpu_id);
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL) {
-		pr_debug("kfd: gpu id 0x%x was not found\n", args->gpu_id);
+	if (!dev) {
+		pr_debug("Could not find gpu id 0x%x\n", args->gpu_id);
 		return -EINVAL;
 	}
 
@@ -278,7 +278,7 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 		goto err_bind_process;
 	}
 
-	pr_debug("kfd: creating queue for PASID %d on GPU 0x%x\n",
+	pr_debug("Creating queue for PASID %d on gpu 0x%x\n",
 			p->pasid,
 			dev->id);
 
@@ -296,15 +296,15 @@ static int kfd_ioctl_create_queue(struct file *filep, struct kfd_process *p,
 
 	mutex_unlock(&p->mutex);
 
-	pr_debug("kfd: queue id %d was created successfully\n", args->queue_id);
+	pr_debug("Queue id %d was created successfully\n", args->queue_id);
 
-	pr_debug("ring buffer address == 0x%016llX\n",
+	pr_debug("Ring buffer address == 0x%016llX\n",
 			args->ring_base_address);
 
-	pr_debug("read ptr address    == 0x%016llX\n",
+	pr_debug("Read ptr address    == 0x%016llX\n",
 			args->read_pointer_address);
 
-	pr_debug("write ptr address   == 0x%016llX\n",
+	pr_debug("Write ptr address   == 0x%016llX\n",
 			args->write_pointer_address);
 
 	return 0;
@@ -321,7 +321,7 @@ static int kfd_ioctl_destroy_queue(struct file *filp, struct kfd_process *p,
 	int retval;
 	struct kfd_ioctl_destroy_queue_args *args = data;
 
-	pr_debug("kfd: destroying queue id %d for PASID %d\n",
+	pr_debug("Destroying queue id %d for pasid %d\n",
 				args->queue_id,
 				p->pasid);
 
@@ -341,12 +341,12 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
 	struct queue_properties properties;
 
 	if (args->queue_percentage > KFD_MAX_QUEUE_PERCENTAGE) {
-		pr_err("kfd: queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
+		pr_err("Queue percentage must be between 0 to KFD_MAX_QUEUE_PERCENTAGE\n");
 		return -EINVAL;
 	}
 
 	if (args->queue_priority > KFD_MAX_QUEUE_PRIORITY) {
-		pr_err("kfd: queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
+		pr_err("Queue priority must be between 0 to KFD_MAX_QUEUE_PRIORITY\n");
 		return -EINVAL;
 	}
 
@@ -354,12 +354,12 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
 		(!access_ok(VERIFY_WRITE,
 			(const void __user *) args->ring_base_address,
 			sizeof(uint64_t)))) {
-		pr_err("kfd: can't access ring base address\n");
+		pr_err("Can't access ring base address\n");
 		return -EFAULT;
 	}
 
 	if (!is_power_of_2(args->ring_size) && (args->ring_size != 0)) {
-		pr_err("kfd: ring size must be a power of 2 or 0\n");
+		pr_err("Ring size must be a power of 2 or 0\n");
 		return -EINVAL;
 	}
 
@@ -368,7 +368,7 @@ static int kfd_ioctl_update_queue(struct file *filp, struct kfd_process *p,
 	properties.queue_percent = args->queue_percentage;
 	properties.priority = args->queue_priority;
 
-	pr_debug("kfd: updating queue id %d for PASID %d\n",
+	pr_debug("Updating queue id %d for pasid %d\n",
 			args->queue_id, p->pasid);
 
 	mutex_lock(&p->mutex);
@@ -400,7 +400,7 @@ static int kfd_ioctl_set_memory_policy(struct file *filep,
 	}
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	mutex_lock(&p->mutex);
@@ -443,7 +443,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 	long status = 0;
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -460,12 +460,11 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 	 */
 	pdd = kfd_bind_process_to_device(dev, p);
 	if (IS_ERR(pdd)) {
-		mutex_unlock(&p->mutex);
-		mutex_unlock(kfd_get_dbgmgr_mutex());
-		return PTR_ERR(pdd);
+		status = PTR_ERR(pdd);
+		goto out;
 	}
 
-	if (dev->dbgmgr == NULL) {
+	if (!dev->dbgmgr) {
 		/* In case of a legal call, we have no dbgmgr yet */
 		create_ok = kfd_dbgmgr_create(&dbgmgr_ptr, dev);
 		if (create_ok) {
@@ -480,6 +479,7 @@ static int kfd_ioctl_dbg_register(struct file *filep,
 		status = -EINVAL;
 	}
 
+out:
 	mutex_unlock(&p->mutex);
 	mutex_unlock(kfd_get_dbgmgr_mutex());
 
@@ -494,7 +494,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
 	long status;
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -505,7 +505,7 @@ static int kfd_ioctl_dbg_unregister(struct file *filep,
 	mutex_lock(kfd_get_dbgmgr_mutex());
 
 	status = kfd_dbgmgr_unregister(dev->dbgmgr, p);
-	if (status == 0) {
+	if (!status) {
 		kfd_dbgmgr_destroy(dev->dbgmgr);
 		dev->dbgmgr = NULL;
 	}
@@ -539,7 +539,7 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
 	memset((void *) &aw_info, 0, sizeof(struct dbg_address_watch_info));
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -580,8 +580,8 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
 	args_idx += sizeof(aw_info.watch_address) * aw_info.num_watch_points;
 
 	if (args_idx >= args->buf_size_in_bytes - sizeof(*args)) {
-		kfree(args_buff);
-		return -EINVAL;
+		status = -EINVAL;
+		goto out;
 	}
 
 	watch_mask_value = (uint64_t) args_buff[args_idx];
@@ -604,8 +604,8 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
 	}
 
 	if (args_idx >= args->buf_size_in_bytes - sizeof(args)) {
-		kfree(args_buff);
-		return -EINVAL;
+		status = -EINVAL;
+		goto out;
 	}
 
 	/* Currently HSA Event is not supported for DBG */
@@ -617,6 +617,7 @@ static int kfd_ioctl_dbg_address_watch(struct file *filep,
 
 	mutex_unlock(kfd_get_dbgmgr_mutex());
 
+out:
 	kfree(args_buff);
 
 	return status;
@@ -646,7 +647,7 @@ static int kfd_ioctl_dbg_wave_control(struct file *filep,
 				sizeof(wac_info.trapId);
 
 	dev = kfd_device_by_id(args->gpu_id);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	if (dev->device_info->asic_family == CHIP_CARRIZO) {
@@ -782,8 +783,9 @@ static int kfd_ioctl_get_process_apertures(struct file *filp,
 				"scratch_limit %llX\n", pdd->scratch_limit);
 
 			args->num_of_nodes++;
-		} while ((pdd = kfd_get_next_process_device_data(p, pdd)) != NULL &&
-				(args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
+
+			pdd = kfd_get_next_process_device_data(p, pdd);
+		} while (pdd && (args->num_of_nodes < NUM_OF_SUPPORTED_GPUS));
 	}
 
 	mutex_unlock(&p->mutex);
@@ -846,9 +848,84 @@ static int kfd_ioctl_wait_events(struct file *filp, struct kfd_process *p,
 
 	return err;
 }
+static int kfd_ioctl_set_scratch_backing_va(struct file *filep,
+					struct kfd_process *p, void *data)
+{
+	struct kfd_ioctl_set_scratch_backing_va_args *args = data;
+	struct kfd_process_device *pdd;
+	struct kfd_dev *dev;
+	long err;
+
+	dev = kfd_device_by_id(args->gpu_id);
+	if (!dev)
+		return -EINVAL;
+
+	mutex_lock(&p->mutex);
+
+	pdd = kfd_bind_process_to_device(dev, p);
+	if (IS_ERR(pdd)) {
+		err = PTR_ERR(pdd);
+		goto bind_process_to_device_fail;
+	}
+
+	pdd->qpd.sh_hidden_private_base = args->va_addr;
+
+	mutex_unlock(&p->mutex);
+
+	if (sched_policy == KFD_SCHED_POLICY_NO_HWS && pdd->qpd.vmid != 0)
+		dev->kfd2kgd->set_scratch_backing_va(
+			dev->kgd, args->va_addr, pdd->qpd.vmid);
+
+	return 0;
+
+bind_process_to_device_fail:
+	mutex_unlock(&p->mutex);
+	return err;
+}
+
+static int kfd_ioctl_get_tile_config(struct file *filep,
+		struct kfd_process *p, void *data)
+{
+	struct kfd_ioctl_get_tile_config_args *args = data;
+	struct kfd_dev *dev;
+	struct tile_config config;
+	int err = 0;
+
+	dev = kfd_device_by_id(args->gpu_id);
+
+	dev->kfd2kgd->get_tile_config(dev->kgd, &config);
+
+	args->gb_addr_config = config.gb_addr_config;
+	args->num_banks = config.num_banks;
+	args->num_ranks = config.num_ranks;
+
+	if (args->num_tile_configs > config.num_tile_configs)
+		args->num_tile_configs = config.num_tile_configs;
+	err = copy_to_user((void __user *)args->tile_config_ptr,
+			config.tile_config_ptr,
+			args->num_tile_configs * sizeof(uint32_t));
+	if (err) {
+		args->num_tile_configs = 0;
+		return -EFAULT;
+	}
+
+	if (args->num_macro_tile_configs > config.num_macro_tile_configs)
+		args->num_macro_tile_configs =
+				config.num_macro_tile_configs;
+	err = copy_to_user((void __user *)args->macro_tile_config_ptr,
+			config.macro_tile_config_ptr,
+			args->num_macro_tile_configs * sizeof(uint32_t));
+	if (err) {
+		args->num_macro_tile_configs = 0;
+		return -EFAULT;
+	}
+
+	return 0;
+}
 
 #define AMDKFD_IOCTL_DEF(ioctl, _func, _flags) \
-	[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, .cmd_drv = 0, .name = #ioctl}
+	[_IOC_NR(ioctl)] = {.cmd = ioctl, .func = _func, .flags = _flags, \
+			    .cmd_drv = 0, .name = #ioctl}
 
 /** Ioctl table */
 static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
@@ -899,6 +976,12 @@ static const struct amdkfd_ioctl_desc amdkfd_ioctls[] = {
 
 	AMDKFD_IOCTL_DEF(AMDKFD_IOC_DBG_WAVE_CONTROL,
 			kfd_ioctl_dbg_wave_control, 0),
+
+	AMDKFD_IOCTL_DEF(AMDKFD_IOC_SET_SCRATCH_BACKING_VA,
+			kfd_ioctl_set_scratch_backing_va, 0),
+
+	AMDKFD_IOCTL_DEF(AMDKFD_IOC_GET_TILE_CONFIG,
+			kfd_ioctl_get_tile_config, 0)
 };
 
 #define AMDKFD_CORE_IOCTL_COUNT	ARRAY_SIZE(amdkfd_ioctls)
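
Note on the kfd_chardev.c hunks above: kfd_ioctl_dbg_register() and
kfd_ioctl_dbg_address_watch() replace duplicated unlock/kfree-and-return
sequences with a status variable and a jump to a shared "out:" label.
The following is a minimal userspace sketch of that single-exit shape,
not the driver code. The lock type, the function names and the failure
flags are all hypothetical stand-ins introduced for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a kernel mutex: counts how often it is held. */
struct sketch_lock {
	int depth;
};

static void sketch_lock_acquire(struct sketch_lock *l) { l->depth++; }
static void sketch_lock_release(struct sketch_lock *l) { l->depth--; }

/*
 * Single-exit cleanup: every failure after the lock is taken records a
 * status and jumps to "out", so the unlock and the free appear only once.
 */
static int sketch_single_exit(struct sketch_lock *l, int bind_fails,
			      int create_fails)
{
	int status = 0;
	char *buf;

	buf = malloc(16);
	if (!buf)
		return -ENOMEM;	/* nothing acquired yet, a plain return is safe */

	sketch_lock_acquire(l);

	if (bind_fails) {
		status = -EINVAL;
		goto out;
	}

	if (create_fails) {
		status = -EFAULT;
		goto out;
	}

	/* the real work would happen here */

out:
	sketch_lock_release(l);
	free(buf);
	return status;
}
```

The label has to sit at the first cleanup step that every jumping path
needs. In the address_watch hunk above, "out:" is placed after the
mutex_unlock() call because the early exits happen before the mutex is
taken.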
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
index d5e19b5..0aa021a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgdev.c
@@ -42,8 +42,6 @@
 
 static void dbgdev_address_watch_disable_nodiq(struct kfd_dev *dev)
 {
-	BUG_ON(!dev || !dev->kfd2kgd);
-
 	dev->kfd2kgd->address_watch_disable(dev->kgd);
 }
 
@@ -62,7 +60,8 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 	unsigned int *ib_packet_buff;
 	int status;
 
-	BUG_ON(!dbgdev || !dbgdev->kq || !packet_buff || !size_in_bytes);
+	if (WARN_ON(!size_in_bytes))
+		return -EINVAL;
 
 	kq = dbgdev->kq;
 
@@ -77,8 +76,8 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 	status = kq->ops.acquire_packet_buffer(kq,
 				pq_packets_size_in_bytes / sizeof(uint32_t),
 				&ib_packet_buff);
-	if (status != 0) {
-		pr_err("amdkfd: acquire_packet_buffer failed\n");
+	if (status) {
+		pr_err("acquire_packet_buffer failed\n");
 		return status;
 	}
 
@@ -115,8 +114,8 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 	status = kfd_gtt_sa_allocate(dbgdev->dev, sizeof(uint64_t),
 					&mem_obj);
 
-	if (status != 0) {
-		pr_err("amdkfd: Failed to allocate GART memory\n");
+	if (status) {
+		pr_err("Failed to allocate GART memory\n");
 		kq->ops.rollback_packet(kq);
 		return status;
 	}
@@ -168,8 +167,6 @@ static int dbgdev_diq_submit_ib(struct kfd_dbgdev *dbgdev,
 
 static int dbgdev_register_nodiq(struct kfd_dbgdev *dbgdev)
 {
-	BUG_ON(!dbgdev);
-
 	/*
 	 * no action is needed in this case,
 	 * just make sure diq will not be used
@@ -187,14 +184,12 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 	struct kernel_queue *kq = NULL;
 	int status;
 
-	BUG_ON(!dbgdev || !dbgdev->pqm || !dbgdev->dev);
-
 	status = pqm_create_queue(dbgdev->pqm, dbgdev->dev, NULL,
 				&properties, 0, KFD_QUEUE_TYPE_DIQ,
 				&qid);
 
 	if (status) {
-		pr_err("amdkfd: Failed to create DIQ\n");
+		pr_err("Failed to create DIQ\n");
 		return status;
 	}
 
@@ -202,8 +197,8 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 
 	kq = pqm_get_kernel_queue(dbgdev->pqm, qid);
 
-	if (kq == NULL) {
-		pr_err("amdkfd: Error getting DIQ\n");
+	if (!kq) {
+		pr_err("Error getting DIQ\n");
 		pqm_destroy_queue(dbgdev->pqm, qid);
 		return -EFAULT;
 	}
@@ -215,8 +210,6 @@ static int dbgdev_register_diq(struct kfd_dbgdev *dbgdev)
 
 static int dbgdev_unregister_nodiq(struct kfd_dbgdev *dbgdev)
 {
-	BUG_ON(!dbgdev || !dbgdev->dev);
-
 	/* disable watch address */
 	dbgdev_address_watch_disable_nodiq(dbgdev->dev);
 	return 0;
@@ -227,8 +220,6 @@ static int dbgdev_unregister_diq(struct kfd_dbgdev *dbgdev)
 	/* todo - disable address watch */
 	int status;
 
-	BUG_ON(!dbgdev || !dbgdev->pqm || !dbgdev->kq);
-
 	status = pqm_destroy_queue(dbgdev->pqm,
 			dbgdev->kq->queue->properties.queue_id);
 	dbgdev->kq = NULL;
@@ -245,14 +236,12 @@ static void dbgdev_address_watch_set_registers(
 {
 	union ULARGE_INTEGER addr;
 
-	BUG_ON(!adw_info || !addrHi || !addrLo || !cntl);
-
 	addr.quad_part = 0;
 	addrHi->u32All = 0;
 	addrLo->u32All = 0;
 	cntl->u32All = 0;
 
-	if (adw_info->watch_mask != NULL)
+	if (adw_info->watch_mask)
 		cntl->bitfields.mask =
 			(uint32_t) (adw_info->watch_mask[index] &
 					ADDRESS_WATCH_REG_CNTL_DEFAULT_MASK);
@@ -279,7 +268,7 @@ static void dbgdev_address_watch_set_registers(
 }
 
 static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
-					struct dbg_address_watch_info *adw_info)
+				      struct dbg_address_watch_info *adw_info)
 {
 	union TCP_WATCH_ADDR_H_BITS addrHi;
 	union TCP_WATCH_ADDR_L_BITS addrLo;
@@ -287,13 +276,11 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 	struct kfd_process_device *pdd;
 	unsigned int i;
 
-	BUG_ON(!dbgdev || !dbgdev->dev || !adw_info);
-
 	/* taking the vmid for that process on the safe way using pdd */
 	pdd = kfd_get_process_device_data(dbgdev->dev,
 					adw_info->process);
 	if (!pdd) {
-		pr_err("amdkfd: Failed to get pdd for wave control no DIQ\n");
+		pr_err("Failed to get pdd for wave control no DIQ\n");
 		return -EFAULT;
 	}
 
@@ -303,17 +290,16 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 
 	if ((adw_info->num_watch_points > MAX_WATCH_ADDRESSES) ||
 			(adw_info->num_watch_points == 0)) {
-		pr_err("amdkfd: num_watch_points is invalid\n");
+		pr_err("num_watch_points is invalid\n");
 		return -EINVAL;
 	}
 
-	if ((adw_info->watch_mode == NULL) ||
-		(adw_info->watch_address == NULL)) {
-		pr_err("amdkfd: adw_info fields are not valid\n");
+	if (!adw_info->watch_mode || !adw_info->watch_address) {
+		pr_err("adw_info fields are not valid\n");
 		return -EINVAL;
 	}
 
-	for (i = 0 ; i < adw_info->num_watch_points ; i++) {
+	for (i = 0; i < adw_info->num_watch_points; i++) {
 		dbgdev_address_watch_set_registers(adw_info, &addrHi, &addrLo,
 						&cntl, i, pdd->qpd.vmid);
 
@@ -348,7 +334,7 @@ static int dbgdev_address_watch_nodiq(struct kfd_dbgdev *dbgdev,
 }
 
 static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
-					struct dbg_address_watch_info *adw_info)
+				    struct dbg_address_watch_info *adw_info)
 {
 	struct pm4__set_config_reg *packets_vec;
 	union TCP_WATCH_ADDR_H_BITS addrHi;
@@ -363,28 +349,25 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 	/* we do not control the vmid in DIQ mode, just a place holder */
 	unsigned int vmid = 0;
 
-	BUG_ON(!dbgdev || !dbgdev->dev || !adw_info);
-
 	addrHi.u32All = 0;
 	addrLo.u32All = 0;
 	cntl.u32All = 0;
 
 	if ((adw_info->num_watch_points > MAX_WATCH_ADDRESSES) ||
 			(adw_info->num_watch_points == 0)) {
-		pr_err("amdkfd: num_watch_points is invalid\n");
+		pr_err("num_watch_points is invalid\n");
 		return -EINVAL;
 	}
 
-	if ((NULL == adw_info->watch_mode) ||
-			(NULL == adw_info->watch_address)) {
-		pr_err("amdkfd: adw_info fields are not valid\n");
+	if (!adw_info->watch_mode || !adw_info->watch_address) {
+		pr_err("adw_info fields are not valid\n");
 		return -EINVAL;
 	}
 
 	status = kfd_gtt_sa_allocate(dbgdev->dev, ib_size, &mem_obj);
 
-	if (status != 0) {
-		pr_err("amdkfd: Failed to allocate GART memory\n");
+	if (status) {
+		pr_err("Failed to allocate GART memory\n");
 		return status;
 	}
 
@@ -442,8 +425,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					i,
 					ADDRESS_WATCH_REG_CNTL);
 
-		aw_reg_add_dword /= sizeof(uint32_t);
-
 		packets_vec[0].bitfields2.reg_offset =
 					aw_reg_add_dword - AMD_CONFIG_REG_BASE;
 
@@ -455,8 +436,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					i,
 					ADDRESS_WATCH_REG_ADDR_HI);
 
-		aw_reg_add_dword /= sizeof(uint32_t);
-
 		packets_vec[1].bitfields2.reg_offset =
 					aw_reg_add_dword - AMD_CONFIG_REG_BASE;
 		packets_vec[1].reg_data[0] = addrHi.u32All;
@@ -467,8 +446,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					i,
 					ADDRESS_WATCH_REG_ADDR_LO);
 
-		aw_reg_add_dword /= sizeof(uint32_t);
-
 		packets_vec[2].bitfields2.reg_offset =
 				aw_reg_add_dword - AMD_CONFIG_REG_BASE;
 		packets_vec[2].reg_data[0] = addrLo.u32All;
@@ -485,8 +462,6 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					i,
 					ADDRESS_WATCH_REG_CNTL);
 
-		aw_reg_add_dword /= sizeof(uint32_t);
-
 		packets_vec[3].bitfields2.reg_offset =
 					aw_reg_add_dword - AMD_CONFIG_REG_BASE;
 		packets_vec[3].reg_data[0] = cntl.u32All;
@@ -498,8 +473,8 @@ static int dbgdev_address_watch_diq(struct kfd_dbgdev *dbgdev,
 					packet_buff_uint,
 					ib_size);
 
-		if (status != 0) {
-			pr_err("amdkfd: Failed to submit IB to DIQ\n");
+		if (status) {
+			pr_err("Failed to submit IB to DIQ\n");
 			break;
 		}
 	}
@@ -518,8 +493,6 @@ static int dbgdev_wave_control_set_registers(
 	union GRBM_GFX_INDEX_BITS reg_gfx_index;
 	struct HsaDbgWaveMsgAMDGen2 *pMsg;
 
-	BUG_ON(!wac_info || !in_reg_sq_cmd || !in_reg_gfx_index);
-
 	reg_sq_cmd.u32All = 0;
 	reg_gfx_index.u32All = 0;
 	pMsg = &wac_info->dbgWave_msg.DbgWaveMsg.WaveMsgInfoGen2;
@@ -620,18 +593,16 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 	struct pm4__set_config_reg *packets_vec;
 	size_t ib_size = sizeof(struct pm4__set_config_reg) * 3;
 
-	BUG_ON(!dbgdev || !wac_info);
-
 	reg_sq_cmd.u32All = 0;
 
 	status = dbgdev_wave_control_set_registers(wac_info, &reg_sq_cmd,
 							&reg_gfx_index);
 	if (status) {
-		pr_err("amdkfd: Failed to set wave control registers\n");
+		pr_err("Failed to set wave control registers\n");
 		return status;
 	}
 
-	/* we do not control the VMID in DIQ,so reset it to a known value */
+	/* we do not control the VMID in DIQ, so reset it to a known value */
 	reg_sq_cmd.bits.vm_id = 0;
 
 	pr_debug("\t\t %30s\n", "* * * * * * * * * * * * * * * * * *");
@@ -667,7 +638,7 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 	status = kfd_gtt_sa_allocate(dbgdev->dev, ib_size, &mem_obj);
 
 	if (status != 0) {
-		pr_err("amdkfd: Failed to allocate GART memory\n");
+		pr_err("Failed to allocate GART memory\n");
 		return status;
 	}
 
@@ -719,8 +690,8 @@ static int dbgdev_wave_control_diq(struct kfd_dbgdev *dbgdev,
 			packet_buff_uint,
 			ib_size);
 
-	if (status != 0)
-		pr_err("amdkfd: Failed to submit IB to DIQ\n");
+	if (status)
+		pr_err("Failed to submit IB to DIQ\n");
 
 	kfd_gtt_sa_free(dbgdev->dev, mem_obj);
 
@@ -735,21 +706,19 @@ static int dbgdev_wave_control_nodiq(struct kfd_dbgdev *dbgdev,
 	union GRBM_GFX_INDEX_BITS reg_gfx_index;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(!dbgdev || !dbgdev->dev || !wac_info);
-
 	reg_sq_cmd.u32All = 0;
 
 	/* taking the VMID for that process on the safe way using PDD */
 	pdd = kfd_get_process_device_data(dbgdev->dev, wac_info->process);
 
 	if (!pdd) {
-		pr_err("amdkfd: Failed to get pdd for wave control no DIQ\n");
+		pr_err("Failed to get pdd for wave control no DIQ\n");
 		return -EFAULT;
 	}
 	status = dbgdev_wave_control_set_registers(wac_info, &reg_sq_cmd,
 							&reg_gfx_index);
 	if (status) {
-		pr_err("amdkfd: Failed to set wave control registers\n");
+		pr_err("Failed to set wave control registers\n");
 		return status;
 	}
 
@@ -818,12 +787,13 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
 
 	/* Scan all registers in the range ATC_VMID8_PASID_MAPPING ..
 	 * ATC_VMID15_PASID_MAPPING
-	 * to check which VMID the current process is mapped to. */
+	 * to check which VMID the current process is mapped to.
+	 */
 
 	for (vmid = first_vmid_to_scan; vmid <= last_vmid_to_scan; vmid++) {
 		if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
 				(dev->kgd, vmid)) {
-			if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_valid
+			if (dev->kfd2kgd->get_atc_vmid_pasid_mapping_pasid
 					(dev->kgd, vmid) == p->pasid) {
 				pr_debug("Killing wave fronts of vmid %d and pasid %d\n",
 						vmid, p->pasid);
@@ -833,7 +803,7 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
 	}
 
 	if (vmid > last_vmid_to_scan) {
-		pr_err("amdkfd: didn't found vmid for pasid (%d)\n", p->pasid);
+		pr_err("Didn't find vmid for pasid %d\n", p->pasid);
 		return -EFAULT;
 	}
 
@@ -860,8 +830,6 @@ int dbgdev_wave_reset_wavefronts(struct kfd_dev *dev, struct kfd_process *p)
 void kfd_dbgdev_init(struct kfd_dbgdev *pdbgdev, struct kfd_dev *pdev,
 			enum DBGDEV_TYPE type)
 {
-	BUG_ON(!pdbgdev || !pdev);
-
 	pdbgdev->dev = pdev;
 	pdbgdev->kq = NULL;
 	pdbgdev->type = type;
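
Note on the kfd_dbgdev.c hunks above: most BUG_ON() pointer checks are
dropped outright, and the remaining size check in dbgdev_diq_submit_ib()
becomes "if (WARN_ON(!size_in_bytes)) return -EINVAL;", so a bad argument
is reported and refused instead of halting the kernel. Below is a rough
userspace sketch of that warn-and-return shape. SKETCH_WARN_ON() and
sketch_submit() are hypothetical names, and the macro only imitates the
one property used here: it evaluates to the truth value of its condition.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

static int sketch_warn_count;	/* number of warnings fired so far */

/*
 * Hypothetical stand-in for the kernel's WARN_ON(): report the failed
 * condition and hand its truth value back so the caller can recover.
 * It never aborts the program.
 */
static int sketch_warn_on(int cond, const char *expr)
{
	if (cond) {
		sketch_warn_count++;
		fprintf(stderr, "WARNING: %s\n", expr);
	}
	return cond;
}

#define SKETCH_WARN_ON(cond) sketch_warn_on(!!(cond), #cond)

/* Refuse an empty submission with an error code instead of crashing. */
static int sketch_submit(const unsigned int *packet, size_t size_in_bytes)
{
	if (SKETCH_WARN_ON(!size_in_bytes))
		return -EINVAL;

	(void)packet;	/* a real implementation would queue the packet */
	return 0;
}
```

The caller sees an ordinary error return and can unwind normally, which
is the behavioural difference from BUG_ON().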
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
index 56d6763..3da25f7b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.c
@@ -44,8 +44,6 @@ struct mutex *kfd_get_dbgmgr_mutex(void)
 
 static void kfd_dbgmgr_uninitialize(struct kfd_dbgmgr *pmgr)
 {
-	BUG_ON(!pmgr);
-
 	kfree(pmgr->dbgdev);
 
 	pmgr->dbgdev = NULL;
@@ -55,7 +53,7 @@ static void kfd_dbgmgr_uninitialize(struct kfd_dbgmgr *pmgr)
 
 void kfd_dbgmgr_destroy(struct kfd_dbgmgr *pmgr)
 {
-	if (pmgr != NULL) {
+	if (pmgr) {
 		kfd_dbgmgr_uninitialize(pmgr);
 		kfree(pmgr);
 	}
@@ -66,12 +64,12 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 	enum DBGDEV_TYPE type = DBGDEV_TYPE_DIQ;
 	struct kfd_dbgmgr *new_buff;
 
-	BUG_ON(pdev == NULL);
-	BUG_ON(!pdev->init_complete);
+	if (WARN_ON(!pdev->init_complete))
+		return false;
 
 	new_buff = kfd_alloc_struct(new_buff);
 	if (!new_buff) {
-		pr_err("amdkfd: Failed to allocate dbgmgr instance\n");
+		pr_err("Failed to allocate dbgmgr instance\n");
 		return false;
 	}
 
@@ -79,7 +77,7 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 	new_buff->dev = pdev;
 	new_buff->dbgdev = kfd_alloc_struct(new_buff->dbgdev);
 	if (!new_buff->dbgdev) {
-		pr_err("amdkfd: Failed to allocate dbgdev instance\n");
+		pr_err("Failed to allocate dbgdev instance\n");
 		kfree(new_buff);
 		return false;
 	}
@@ -96,8 +94,6 @@ bool kfd_dbgmgr_create(struct kfd_dbgmgr **ppmgr, struct kfd_dev *pdev)
 
 long kfd_dbgmgr_register(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
 {
-	BUG_ON(!p || !pmgr || !pmgr->dbgdev);
-
 	if (pmgr->pasid != 0) {
 		pr_debug("H/W debugger is already active using pasid %d\n",
 				pmgr->pasid);
@@ -118,8 +114,6 @@ long kfd_dbgmgr_register(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
 
 long kfd_dbgmgr_unregister(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
 {
-	BUG_ON(!p || !pmgr || !pmgr->dbgdev);
-
 	/* Is the requests coming from the already registered process? */
 	if (pmgr->pasid != p->pasid) {
 		pr_debug("H/W debugger is not registered by calling pasid %d\n",
@@ -137,8 +131,6 @@ long kfd_dbgmgr_unregister(struct kfd_dbgmgr *pmgr, struct kfd_process *p)
 long kfd_dbgmgr_wave_control(struct kfd_dbgmgr *pmgr,
 				struct dbg_wave_control_info *wac_info)
 {
-	BUG_ON(!pmgr || !pmgr->dbgdev || !wac_info);
-
 	/* Is the requests coming from the already registered process? */
 	if (pmgr->pasid != wac_info->process->pasid) {
 		pr_debug("H/W debugger support was not registered for requester pasid %d\n",
@@ -152,9 +144,6 @@ long kfd_dbgmgr_wave_control(struct kfd_dbgmgr *pmgr,
 long kfd_dbgmgr_address_watch(struct kfd_dbgmgr *pmgr,
 				struct dbg_address_watch_info *adw_info)
 {
-	BUG_ON(!pmgr || !pmgr->dbgdev || !adw_info);
-
-
 	/* Is the requests coming from the already registered process? */
 	if (pmgr->pasid != adw_info->process->pasid) {
 		pr_debug("H/W debugger support was not registered for requester pasid %d\n",
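
Note on the kfd_dbgmgr.c hunks above: with the BUG_ON() checks gone, the
visible guards are the pasid comparisons. Registration is refused while
another pasid is active, and unregister, wave control and address watch
are refused when the caller's pasid does not match the registered one.
The sketch below models only that ownership rule. The struct, the
function names and the specific error codes are assumptions made for
illustration, since the return statements are not part of these hunks.
It also assumes that pasid 0 means "no debugger registered".

```c
#include <errno.h>

/* Hypothetical reduced manager: pasid 0 means nobody is registered. */
struct sketch_dbgmgr {
	unsigned int pasid;
};

static long sketch_dbg_register(struct sketch_dbgmgr *mgr, unsigned int pasid)
{
	if (mgr->pasid != 0)
		return -EBUSY;	/* assumed code: debugger already active */

	mgr->pasid = pasid;
	return 0;
}

static long sketch_dbg_unregister(struct sketch_dbgmgr *mgr, unsigned int pasid)
{
	if (mgr->pasid != pasid)
		return -EINVAL;	/* assumed code: caller is not the owner */

	mgr->pasid = 0;
	return 0;
}
```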
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
index 257a745..a04a1fe 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_dbgmgr.h
@@ -30,13 +30,11 @@
 #pragma pack(push, 4)
 
 enum HSA_DBG_WAVEOP {
-	HSA_DBG_WAVEOP_HALT = 1,	/* Halts a wavefront		*/
-	HSA_DBG_WAVEOP_RESUME = 2,	/* Resumes a wavefront		*/
-	HSA_DBG_WAVEOP_KILL = 3,	/* Kills a wavefront		*/
-	HSA_DBG_WAVEOP_DEBUG = 4,	/* Causes wavefront to enter
-						debug mode		*/
-	HSA_DBG_WAVEOP_TRAP = 5,	/* Causes wavefront to take
-						a trap			*/
+	HSA_DBG_WAVEOP_HALT = 1,   /* Halts a wavefront */
+	HSA_DBG_WAVEOP_RESUME = 2, /* Resumes a wavefront */
+	HSA_DBG_WAVEOP_KILL = 3,   /* Kills a wavefront */
+	HSA_DBG_WAVEOP_DEBUG = 4,  /* Causes wavefront to enter dbg mode */
+	HSA_DBG_WAVEOP_TRAP = 5,   /* Causes wavefront to take a trap */
 	HSA_DBG_NUM_WAVEOP = 5,
 	HSA_DBG_MAX_WAVEOP = 0xFFFFFFFF
 };
@@ -81,15 +79,13 @@ struct HsaDbgWaveMsgAMDGen2 {
 			uint32_t UserData:8;	/* user data */
 			uint32_t ShaderArray:1;	/* Shader array */
 			uint32_t Priv:1;	/* Privileged */
-			uint32_t Reserved0:4;	/* This field is reserved,
-						   should be 0 */
+			uint32_t Reserved0:4;	/* Reserved, should be 0 */
 			uint32_t WaveId:4;	/* wave id */
 			uint32_t SIMD:2;	/* SIMD id */
 			uint32_t HSACU:4;	/* Compute unit */
 			uint32_t ShaderEngine:2;/* Shader engine */
 			uint32_t MessageType:2;	/* see HSA_DBG_WAVEMSG_TYPE */
-			uint32_t Reserved1:4;	/* This field is reserved,
-						   should be 0 */
+			uint32_t Reserved1:4;	/* Reserved, should be 0 */
 		} ui32;
 		uint32_t Value;
 	};
@@ -121,20 +117,23 @@ struct HsaDbgWaveMessage {
  * in the user mode instruction stream. The OS scheduler event is typically
  * associated and signaled by an interrupt issued by the GPU, but other HSA
  * system interrupt conditions from other HW (e.g. IOMMUv2) may be surfaced
- * by the KFD by this mechanism, too. */
+ * by the KFD by this mechanism, too.
+ */
 
 /* these are the new definitions for events */
 enum HSA_EVENTTYPE {
 	HSA_EVENTTYPE_SIGNAL = 0,	/* user-mode generated GPU signal */
 	HSA_EVENTTYPE_NODECHANGE = 1,	/* HSA node change (attach/detach) */
 	HSA_EVENTTYPE_DEVICESTATECHANGE = 2,	/* HSA device state change
-						   (start/stop) */
+						 * (start/stop)
+						 */
 	HSA_EVENTTYPE_HW_EXCEPTION = 3,	/* GPU shader exception event */
 	HSA_EVENTTYPE_SYSTEM_EVENT = 4,	/* GPU SYSCALL with parameter info */
 	HSA_EVENTTYPE_DEBUG_EVENT = 5,	/* GPU signal for debugging */
 	HSA_EVENTTYPE_PROFILE_EVENT = 6,/* GPU signal for profiling */
 	HSA_EVENTTYPE_QUEUE_EVENT = 7,	/* GPU signal queue idle state
-					   (EOP pm4) */
+					 * (EOP pm4)
+					 */
 	/* ...  */
 	HSA_EVENTTYPE_MAXID,
 	HSA_EVENTTYPE_TYPE_SIZE = 0xFFFFFFFF
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 3f95f7c..61fff25 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -26,7 +26,7 @@
 #include <linux/slab.h>
 #include "kfd_priv.h"
 #include "kfd_device_queue_manager.h"
-#include "kfd_pm4_headers.h"
+#include "kfd_pm4_headers_vi.h"
 
 #define MQD_SIZE_ALIGNED 768
 
@@ -98,11 +98,14 @@ static const struct kfd_device_info *lookup_device_info(unsigned short did)
 
 	for (i = 0; i < ARRAY_SIZE(supported_devices); i++) {
 		if (supported_devices[i].did == did) {
-			BUG_ON(supported_devices[i].device_info == NULL);
+			WARN_ON(!supported_devices[i].device_info);
 			return supported_devices[i].device_info;
 		}
 	}
 
+	dev_warn(kfd_device, "DID %04x is missing in supported_devices\n",
+		 did);
+
 	return NULL;
 }
 
@@ -114,8 +117,10 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 	const struct kfd_device_info *device_info =
 					lookup_device_info(pdev->device);
 
-	if (!device_info)
+	if (!device_info) {
+		dev_err(kfd_device, "kgd2kfd_probe failed\n");
 		return NULL;
+	}
 
 	kfd = kzalloc(sizeof(*kfd), GFP_KERNEL);
 	if (!kfd)
@@ -152,15 +157,16 @@ static bool device_iommu_pasid_init(struct kfd_dev *kfd)
 	}
 
 	if ((iommu_info.flags & required_iommu_flags) != required_iommu_flags) {
-		dev_err(kfd_device, "error required iommu flags ats(%i), pri(%i), pasid(%i)\n",
+		dev_err(kfd_device, "error required iommu flags ats %i, pri %i, pasid %i\n",
 		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_ATS_SUP) != 0,
 		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PRI_SUP) != 0,
-		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP) != 0);
+		       (iommu_info.flags & AMD_IOMMU_DEVICE_FLAG_PASID_SUP)
+									!= 0);
 		return false;
 	}
 
 	pasid_limit = min_t(unsigned int,
-			(unsigned int)1 << kfd->device_info->max_pasid_bits,
+			(unsigned int)(1 << kfd->device_info->max_pasid_bits),
 			iommu_info.max_pasids);
 	/*
 	 * last pasid is used for kernel queues doorbells
@@ -211,9 +217,8 @@ static int iommu_invalid_ppr_cb(struct pci_dev *pdev, int pasid,
 			flags);
 
 	dev = kfd_device_by_pci_dev(pdev);
-	BUG_ON(dev == NULL);
-
-	kfd_signal_iommu_event(dev, pasid, address,
+	if (!WARN_ON(!dev))
+		kfd_signal_iommu_event(dev, pasid, address,
 			flags & PPR_FAULT_WRITE, flags & PPR_FAULT_EXEC);
 
 	return AMD_IOMMU_INV_PRI_RSP_INVALID;
@@ -234,9 +239,9 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	 * calculate max size of runlist packet.
 	 * There can be only 2 packets at once
 	 */
-	size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct pm4_map_process) +
-		max_num_of_queues_per_device *
-		sizeof(struct pm4_map_queues) + sizeof(struct pm4_runlist)) * 2;
+	size += (KFD_MAX_NUM_OF_PROCESSES * sizeof(struct pm4_mes_map_process) +
+		max_num_of_queues_per_device * sizeof(struct pm4_mes_map_queues)
+		+ sizeof(struct pm4_mes_runlist)) * 2;
 
 	/* Add size of HIQ & DIQ */
 	size += KFD_KERNEL_QUEUE_SIZE * 2;
@@ -247,42 +252,37 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	if (kfd->kfd2kgd->init_gtt_mem_allocation(
 			kfd->kgd, size, &kfd->gtt_mem,
 			&kfd->gtt_start_gpu_addr, &kfd->gtt_start_cpu_ptr)){
-		dev_err(kfd_device,
-			"Could not allocate %d bytes for device (%x:%x)\n",
-			size, kfd->pdev->vendor, kfd->pdev->device);
+		dev_err(kfd_device, "Could not allocate %d bytes\n", size);
 		goto out;
 	}
 
-	dev_info(kfd_device,
-		"Allocated %d bytes on gart for device(%x:%x)\n",
-		size, kfd->pdev->vendor, kfd->pdev->device);
+	dev_info(kfd_device, "Allocated %d bytes on gart\n", size);
 
 	/* Initialize GTT sa with 512 byte chunk size */
 	if (kfd_gtt_sa_init(kfd, size, 512) != 0) {
-		dev_err(kfd_device,
-			"Error initializing gtt sub-allocator\n");
+		dev_err(kfd_device, "Error initializing gtt sub-allocator\n");
 		goto kfd_gtt_sa_init_error;
 	}
 
-	kfd_doorbell_init(kfd);
-
-	if (kfd_topology_add_device(kfd) != 0) {
+	if (kfd_doorbell_init(kfd)) {
 		dev_err(kfd_device,
-			"Error adding device (%x:%x) to topology\n",
-			kfd->pdev->vendor, kfd->pdev->device);
+			"Error initializing doorbell aperture\n");
+		goto kfd_doorbell_error;
+	}
+
+	if (kfd_topology_add_device(kfd)) {
+		dev_err(kfd_device, "Error adding device to topology\n");
 		goto kfd_topology_add_device_error;
 	}
 
 	if (kfd_interrupt_init(kfd)) {
-		dev_err(kfd_device,
-			"Error initializing interrupts for device (%x:%x)\n",
-			kfd->pdev->vendor, kfd->pdev->device);
+		dev_err(kfd_device, "Error initializing interrupts\n");
 		goto kfd_interrupt_error;
 	}
 
 	if (!device_iommu_pasid_init(kfd)) {
 		dev_err(kfd_device,
-			"Error initializing iommuv2 for device (%x:%x)\n",
+			"Error initializing iommuv2 for device %x:%x\n",
 			kfd->pdev->vendor, kfd->pdev->device);
 		goto device_iommu_pasid_error;
 	}
@@ -292,15 +292,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 
 	kfd->dqm = device_queue_manager_init(kfd);
 	if (!kfd->dqm) {
-		dev_err(kfd_device,
-			"Error initializing queue manager for device (%x:%x)\n",
-			kfd->pdev->vendor, kfd->pdev->device);
+		dev_err(kfd_device, "Error initializing queue manager\n");
 		goto device_queue_manager_error;
 	}
 
-	if (kfd->dqm->ops.start(kfd->dqm) != 0) {
+	if (kfd->dqm->ops.start(kfd->dqm)) {
 		dev_err(kfd_device,
-			"Error starting queuen manager for device (%x:%x)\n",
+			"Error starting queue manager for device %x:%x\n",
 			kfd->pdev->vendor, kfd->pdev->device);
 		goto dqm_start_error;
 	}
@@ -308,10 +306,10 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 	kfd->dbgmgr = NULL;
 
 	kfd->init_complete = true;
-	dev_info(kfd_device, "added device (%x:%x)\n", kfd->pdev->vendor,
+	dev_info(kfd_device, "added device %x:%x\n", kfd->pdev->vendor,
 		 kfd->pdev->device);
 
-	pr_debug("kfd: Starting kfd with the following scheduling policy %d\n",
+	pr_debug("Starting kfd with the following scheduling policy %d\n",
 		sched_policy);
 
 	goto out;
@@ -325,11 +323,13 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd,
 kfd_interrupt_error:
 	kfd_topology_remove_device(kfd);
 kfd_topology_add_device_error:
+	kfd_doorbell_fini(kfd);
+kfd_doorbell_error:
 	kfd_gtt_sa_fini(kfd);
 kfd_gtt_sa_init_error:
 	kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
 	dev_err(kfd_device,
-		"device (%x:%x) NOT added due to errors\n",
+		"device %x:%x NOT added due to errors\n",
 		kfd->pdev->vendor, kfd->pdev->device);
 out:
 	return kfd->init_complete;
@@ -342,6 +342,7 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 		amd_iommu_free_device(kfd->pdev);
 		kfd_interrupt_exit(kfd);
 		kfd_topology_remove_device(kfd);
+		kfd_doorbell_fini(kfd);
 		kfd_gtt_sa_fini(kfd);
 		kfd->kfd2kgd->free_gtt_mem(kfd->kgd, kfd->gtt_mem);
 	}
@@ -351,8 +352,6 @@ void kgd2kfd_device_exit(struct kfd_dev *kfd)
 
 void kgd2kfd_suspend(struct kfd_dev *kfd)
 {
-	BUG_ON(kfd == NULL);
-
 	if (kfd->init_complete) {
 		kfd->dqm->ops.stop(kfd->dqm);
 		amd_iommu_set_invalidate_ctx_cb(kfd->pdev, NULL);
@@ -366,14 +365,15 @@ int kgd2kfd_resume(struct kfd_dev *kfd)
 	unsigned int pasid_limit;
 	int err;
 
-	BUG_ON(kfd == NULL);
-
 	pasid_limit = kfd_get_pasid_limit();
 
 	if (kfd->init_complete) {
 		err = amd_iommu_init_device(kfd->pdev, pasid_limit);
-		if (err < 0)
+		if (err < 0) {
+			dev_err(kfd_device, "failed to initialize iommu\n");
 			return -ENXIO;
+		}
+
 		amd_iommu_set_invalidate_ctx_cb(kfd->pdev,
 						iommu_pasid_shutdown_callback);
 		amd_iommu_set_invalid_ppr_cb(kfd->pdev, iommu_invalid_ppr_cb);
@@ -402,26 +402,27 @@ void kgd2kfd_interrupt(struct kfd_dev *kfd, const void *ih_ring_entry)
 static int kfd_gtt_sa_init(struct kfd_dev *kfd, unsigned int buf_size,
 				unsigned int chunk_size)
 {
-	unsigned int num_of_bits;
+	unsigned int num_of_longs;
 
-	BUG_ON(!kfd);
-	BUG_ON(!kfd->gtt_mem);
-	BUG_ON(buf_size < chunk_size);
-	BUG_ON(buf_size == 0);
-	BUG_ON(chunk_size == 0);
+	if (WARN_ON(buf_size < chunk_size))
+		return -EINVAL;
+	if (WARN_ON(buf_size == 0))
+		return -EINVAL;
+	if (WARN_ON(chunk_size == 0))
+		return -EINVAL;
 
 	kfd->gtt_sa_chunk_size = chunk_size;
 	kfd->gtt_sa_num_of_chunks = buf_size / chunk_size;
 
-	num_of_bits = kfd->gtt_sa_num_of_chunks / BITS_PER_BYTE;
-	BUG_ON(num_of_bits == 0);
+	num_of_longs = (kfd->gtt_sa_num_of_chunks + BITS_PER_LONG - 1) /
+		BITS_PER_LONG;
 
-	kfd->gtt_sa_bitmap = kzalloc(num_of_bits, GFP_KERNEL);
+	kfd->gtt_sa_bitmap = kcalloc(num_of_longs, sizeof(long), GFP_KERNEL);
 
 	if (!kfd->gtt_sa_bitmap)
 		return -ENOMEM;
 
-	pr_debug("kfd: gtt_sa_num_of_chunks = %d, gtt_sa_bitmap = %p\n",
+	pr_debug("gtt_sa_num_of_chunks = %d, gtt_sa_bitmap = %p\n",
 			kfd->gtt_sa_num_of_chunks, kfd->gtt_sa_bitmap);
 
 	mutex_init(&kfd->gtt_sa_lock);
@@ -455,8 +456,6 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 {
 	unsigned int found, start_search, cur_size;
 
-	BUG_ON(!kfd);
-
 	if (size == 0)
 		return -EINVAL;
 
@@ -467,7 +466,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 	if ((*mem_obj) == NULL)
 		return -ENOMEM;
 
-	pr_debug("kfd: allocated mem_obj = %p for size = %d\n", *mem_obj, size);
+	pr_debug("Allocated mem_obj = %p for size = %d\n", *mem_obj, size);
 
 	start_search = 0;
 
@@ -479,7 +478,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 					kfd->gtt_sa_num_of_chunks,
 					start_search);
 
-	pr_debug("kfd: found = %d\n", found);
+	pr_debug("Found = %d\n", found);
 
 	/* If there wasn't any free chunk, bail out */
 	if (found == kfd->gtt_sa_num_of_chunks)
@@ -497,12 +496,12 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 					found,
 					kfd->gtt_sa_chunk_size);
 
-	pr_debug("kfd: gpu_addr = %p, cpu_addr = %p\n",
+	pr_debug("gpu_addr = %p, cpu_addr = %p\n",
 			(uint64_t *) (*mem_obj)->gpu_addr, (*mem_obj)->cpu_ptr);
 
 	/* If we need only one chunk, mark it as allocated and get out */
 	if (size <= kfd->gtt_sa_chunk_size) {
-		pr_debug("kfd: single bit\n");
+		pr_debug("Single bit\n");
 		set_bit(found, kfd->gtt_sa_bitmap);
 		goto kfd_gtt_out;
 	}
@@ -537,7 +536,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 
 	} while (cur_size > 0);
 
-	pr_debug("kfd: range_start = %d, range_end = %d\n",
+	pr_debug("range_start = %d, range_end = %d\n",
 		(*mem_obj)->range_start, (*mem_obj)->range_end);
 
 	/* Mark the chunks as allocated */
@@ -551,7 +550,7 @@ int kfd_gtt_sa_allocate(struct kfd_dev *kfd, unsigned int size,
 	return 0;
 
 kfd_gtt_no_free_chunk:
-	pr_debug("kfd: allocation failed with mem_obj = %p\n", mem_obj);
+	pr_debug("Allocation failed with mem_obj = %p\n", mem_obj);
 	mutex_unlock(&kfd->gtt_sa_lock);
 	kfree(mem_obj);
 	return -ENOMEM;
@@ -561,13 +560,11 @@ int kfd_gtt_sa_free(struct kfd_dev *kfd, struct kfd_mem_obj *mem_obj)
 {
 	unsigned int bit;
 
-	BUG_ON(!kfd);
-
 	/* Act like kfree when trying to free a NULL object */
 	if (!mem_obj)
 		return 0;
 
-	pr_debug("kfd: free mem_obj = %p, range_start = %d, range_end = %d\n",
+	pr_debug("Free mem_obj = %p, range_start = %d, range_end = %d\n",
 			mem_obj, mem_obj->range_start, mem_obj->range_end);
 
 	mutex_lock(&kfd->gtt_sa_lock);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
index 602769c..53a66e8 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager.c
@@ -79,20 +79,17 @@ static bool is_pipe_enabled(struct device_queue_manager *dqm, int mec, int pipe)
 
 unsigned int get_queues_num(struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm || !dqm->dev);
 	return bitmap_weight(dqm->dev->shared_resources.queue_bitmap,
 				KGD_MAX_QUEUES);
 }
 
 unsigned int get_queues_per_pipe(struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm || !dqm->dev);
 	return dqm->dev->shared_resources.num_queue_per_pipe;
 }
 
 unsigned int get_pipes_per_mec(struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm || !dqm->dev);
 	return dqm->dev->shared_resources.num_pipe_per_mec;
 }
 
@@ -121,7 +118,7 @@ static int allocate_vmid(struct device_queue_manager *dqm,
 
 	/* Kaveri kfd vmid's starts from vmid 8 */
 	allocated_vmid = bit + KFD_VMID_START_OFFSET;
-	pr_debug("kfd: vmid allocation %d\n", allocated_vmid);
+	pr_debug("vmid allocation %d\n", allocated_vmid);
 	qpd->vmid = allocated_vmid;
 	q->properties.vmid = allocated_vmid;
 
@@ -152,42 +149,38 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 {
 	int retval;
 
-	BUG_ON(!dqm || !q || !qpd || !allocated_vmid);
-
-	pr_debug("kfd: In func %s\n", __func__);
 	print_queue(q);
 
 	mutex_lock(&dqm->lock);
 
 	if (dqm->total_queue_count >= max_num_of_queues_per_device) {
-		pr_warn("amdkfd: Can't create new usermode queue because %d queues were already created\n",
+		pr_warn("Can't create new usermode queue because %d queues were already created\n",
 				dqm->total_queue_count);
-		mutex_unlock(&dqm->lock);
-		return -EPERM;
+		retval = -EPERM;
+		goto out_unlock;
 	}
 
 	if (list_empty(&qpd->queues_list)) {
 		retval = allocate_vmid(dqm, qpd, q);
-		if (retval != 0) {
-			mutex_unlock(&dqm->lock);
-			return retval;
-		}
+		if (retval)
+			goto out_unlock;
 	}
 	*allocated_vmid = qpd->vmid;
 	q->properties.vmid = qpd->vmid;
 
 	if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE)
 		retval = create_compute_queue_nocpsch(dqm, q, qpd);
-	if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
+	else if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
 		retval = create_sdma_queue_nocpsch(dqm, q, qpd);
+	else
+		retval = -EINVAL;
 
-	if (retval != 0) {
+	if (retval) {
 		if (list_empty(&qpd->queues_list)) {
 			deallocate_vmid(dqm, qpd, q);
 			*allocated_vmid = 0;
 		}
-		mutex_unlock(&dqm->lock);
-		return retval;
+		goto out_unlock;
 	}
 
 	list_add(&q->list, &qpd->queues_list);
@@ -205,8 +198,9 @@ static int create_queue_nocpsch(struct device_queue_manager *dqm,
 	pr_debug("Total of %d queues are accountable so far\n",
 			dqm->total_queue_count);
 
+out_unlock:
 	mutex_unlock(&dqm->lock);
-	return 0;
+	return retval;
 }
 
 static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
@@ -216,7 +210,8 @@ static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
 
 	set = false;
 
-	for (pipe = dqm->next_pipe_to_allocate, i = 0; i < get_pipes_per_mec(dqm);
+	for (pipe = dqm->next_pipe_to_allocate, i = 0;
+			i < get_pipes_per_mec(dqm);
 			pipe = ((pipe + 1) % get_pipes_per_mec(dqm)), ++i) {
 
 		if (!is_pipe_enabled(dqm, 0, pipe))
@@ -239,8 +234,7 @@ static int allocate_hqd(struct device_queue_manager *dqm, struct queue *q)
 	if (!set)
 		return -EBUSY;
 
-	pr_debug("kfd: DQM %s hqd slot - pipe (%d) queue(%d)\n",
-				__func__, q->pipe, q->queue);
+	pr_debug("hqd slot - pipe %d, queue %d\n", q->pipe, q->queue);
 	/* horizontal hqd allocation */
 	dqm->next_pipe_to_allocate = (pipe + 1) % get_pipes_per_mec(dqm);
 
@@ -260,36 +254,38 @@ static int create_compute_queue_nocpsch(struct device_queue_manager *dqm,
 	int retval;
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || !q || !qpd);
-
 	mqd = dqm->ops.get_mqd_manager(dqm, KFD_MQD_TYPE_COMPUTE);
-	if (mqd == NULL)
+	if (!mqd)
 		return -ENOMEM;
 
 	retval = allocate_hqd(dqm, q);
-	if (retval != 0)
+	if (retval)
 		return retval;
 
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
 				&q->gart_mqd_addr, &q->properties);
-	if (retval != 0) {
-		deallocate_hqd(dqm, q);
-		return retval;
-	}
+	if (retval)
+		goto out_deallocate_hqd;
 
-	pr_debug("kfd: loading mqd to hqd on pipe (%d) queue (%d)\n",
-			q->pipe,
-			q->queue);
+	pr_debug("Loading mqd to hqd on pipe %d, queue %d\n",
+			q->pipe, q->queue);
 
-	retval = mqd->load_mqd(mqd, q->mqd, q->pipe,
-			q->queue, (uint32_t __user *) q->properties.write_ptr);
-	if (retval != 0) {
-		deallocate_hqd(dqm, q);
-		mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
-		return retval;
-	}
+	dqm->dev->kfd2kgd->set_scratch_backing_va(
+			dqm->dev->kgd, qpd->sh_hidden_private_base, qpd->vmid);
+
+	retval = mqd->load_mqd(mqd, q->mqd, q->pipe, q->queue, &q->properties,
+			       q->process->mm);
+	if (retval)
+		goto out_uninit_mqd;
 
 	return 0;
+
+out_uninit_mqd:
+	mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
+out_deallocate_hqd:
+	deallocate_hqd(dqm, q);
+
+	return retval;
 }
 
 static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
@@ -299,12 +295,8 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 	int retval;
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || !q || !q->mqd || !qpd);
-
 	retval = 0;
 
-	pr_debug("kfd: In Func %s\n", __func__);
-
 	mutex_lock(&dqm->lock);
 
 	if (q->properties.type == KFD_QUEUE_TYPE_COMPUTE) {
@@ -323,7 +315,7 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 		dqm->sdma_queue_count--;
 		deallocate_sdma_queue(dqm, q->sdma_id);
 	} else {
-		pr_debug("q->properties.type is invalid (%d)\n",
+		pr_debug("q->properties.type %d is invalid\n",
 				q->properties.type);
 		retval = -EINVAL;
 		goto out;
@@ -334,7 +326,7 @@ static int destroy_queue_nocpsch(struct device_queue_manager *dqm,
 				QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS,
 				q->pipe, q->queue);
 
-	if (retval != 0)
+	if (retval)
 		goto out;
 
 	mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
@@ -364,14 +356,12 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	struct mqd_manager *mqd;
 	bool prev_active = false;
 
-	BUG_ON(!dqm || !q || !q->mqd);
-
 	mutex_lock(&dqm->lock);
 	mqd = dqm->ops.get_mqd_manager(dqm,
 			get_mqd_type_from_queue_type(q->properties.type));
-	if (mqd == NULL) {
-		mutex_unlock(&dqm->lock);
-		return -ENOMEM;
+	if (!mqd) {
+		retval = -ENOMEM;
+		goto out_unlock;
 	}
 
 	if (q->properties.is_active)
@@ -385,12 +375,13 @@ static int update_queue(struct device_queue_manager *dqm, struct queue *q)
 	retval = mqd->update_mqd(mqd, q->mqd, &q->properties);
 	if ((q->properties.is_active) && (!prev_active))
 		dqm->queue_count++;
-	else if ((!q->properties.is_active) && (prev_active))
+	else if (!q->properties.is_active && prev_active)
 		dqm->queue_count--;
 
 	if (sched_policy != KFD_SCHED_POLICY_NO_HWS)
 		retval = execute_queues_cpsch(dqm, false);
 
+out_unlock:
 	mutex_unlock(&dqm->lock);
 	return retval;
 }
@@ -400,15 +391,16 @@ static struct mqd_manager *get_mqd_manager_nocpsch(
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || type >= KFD_MQD_TYPE_MAX);
+	if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
+		return NULL;
 
-	pr_debug("kfd: In func %s mqd type %d\n", __func__, type);
+	pr_debug("mqd type %d\n", type);
 
 	mqd = dqm->mqds[type];
 	if (!mqd) {
 		mqd = mqd_manager_init(type, dqm->dev);
-		if (mqd == NULL)
-			pr_err("kfd: mqd manager is NULL");
+		if (!mqd)
+			pr_err("mqd manager is NULL");
 		dqm->mqds[type] = mqd;
 	}
 
@@ -421,11 +413,7 @@ static int register_process_nocpsch(struct device_queue_manager *dqm,
 	struct device_process_node *n;
 	int retval;
 
-	BUG_ON(!dqm || !qpd);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
-	n = kzalloc(sizeof(struct device_process_node), GFP_KERNEL);
+	n = kzalloc(sizeof(*n), GFP_KERNEL);
 	if (!n)
 		return -ENOMEM;
 
@@ -449,10 +437,6 @@ static int unregister_process_nocpsch(struct device_queue_manager *dqm,
 	int retval;
 	struct device_process_node *cur, *next;
 
-	BUG_ON(!dqm || !qpd);
-
-	pr_debug("In func %s\n", __func__);
-
 	pr_debug("qpd->queues_list is %s\n",
 			list_empty(&qpd->queues_list) ? "empty" : "not empty");
 
@@ -493,51 +477,39 @@ static void init_interrupts(struct device_queue_manager *dqm)
 {
 	unsigned int i;
 
-	BUG_ON(dqm == NULL);
-
 	for (i = 0 ; i < get_pipes_per_mec(dqm) ; i++)
 		if (is_pipe_enabled(dqm, 0, i))
 			dqm->dev->kfd2kgd->init_interrupts(dqm->dev->kgd, i);
 }
 
-static int init_scheduler(struct device_queue_manager *dqm)
-{
-	int retval = 0;
-
-	BUG_ON(!dqm);
-
-	pr_debug("kfd: In %s\n", __func__);
-
-	return retval;
-}
-
 static int initialize_nocpsch(struct device_queue_manager *dqm)
 {
-	int i;
+	int pipe, queue;
 
-	BUG_ON(!dqm);
+	pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
 
-	pr_debug("kfd: In func %s num of pipes: %d\n",
-			__func__, get_pipes_per_mec(dqm));
+	dqm->allocated_queues = kcalloc(get_pipes_per_mec(dqm),
+					sizeof(unsigned int), GFP_KERNEL);
+	if (!dqm->allocated_queues)
+		return -ENOMEM;
 
 	mutex_init(&dqm->lock);
 	INIT_LIST_HEAD(&dqm->queues);
 	dqm->queue_count = dqm->next_pipe_to_allocate = 0;
 	dqm->sdma_queue_count = 0;
-	dqm->allocated_queues = kcalloc(get_pipes_per_mec(dqm),
-					sizeof(unsigned int), GFP_KERNEL);
-	if (!dqm->allocated_queues) {
-		mutex_destroy(&dqm->lock);
-		return -ENOMEM;
-	}
 
-	for (i = 0; i < get_pipes_per_mec(dqm); i++)
-		dqm->allocated_queues[i] = (1 << get_queues_per_pipe(dqm)) - 1;
+	for (pipe = 0; pipe < get_pipes_per_mec(dqm); pipe++) {
+		int pipe_offset = pipe * get_queues_per_pipe(dqm);
+
+		for (queue = 0; queue < get_queues_per_pipe(dqm); queue++)
+			if (test_bit(pipe_offset + queue,
+				     dqm->dev->shared_resources.queue_bitmap))
+				dqm->allocated_queues[pipe] |= 1 << queue;
+	}
 
 	dqm->vmid_bitmap = (1 << VMID_PER_DEVICE) - 1;
 	dqm->sdma_bitmap = (1 << CIK_SDMA_QUEUES) - 1;
 
-	init_scheduler(dqm);
 	return 0;
 }
 
@@ -545,9 +517,7 @@ static void uninitialize_nocpsch(struct device_queue_manager *dqm)
 {
 	int i;
 
-	BUG_ON(!dqm);
-
-	BUG_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
+	WARN_ON(dqm->queue_count > 0 || dqm->processes_count > 0);
 
 	kfree(dqm->allocated_queues);
 	for (i = 0 ; i < KFD_MQD_TYPE_MAX ; i++)
@@ -604,33 +574,34 @@ static int create_sdma_queue_nocpsch(struct device_queue_manager *dqm,
 		return -ENOMEM;
 
 	retval = allocate_sdma_queue(dqm, &q->sdma_id);
-	if (retval != 0)
+	if (retval)
 		return retval;
 
 	q->properties.sdma_queue_id = q->sdma_id % CIK_SDMA_QUEUES_PER_ENGINE;
 	q->properties.sdma_engine_id = q->sdma_id / CIK_SDMA_ENGINE_NUM;
 
-	pr_debug("kfd: sdma id is:    %d\n", q->sdma_id);
-	pr_debug("     sdma queue id: %d\n", q->properties.sdma_queue_id);
-	pr_debug("     sdma engine id: %d\n", q->properties.sdma_engine_id);
+	pr_debug("SDMA id is:    %d\n", q->sdma_id);
+	pr_debug("SDMA queue id: %d\n", q->properties.sdma_queue_id);
+	pr_debug("SDMA engine id: %d\n", q->properties.sdma_engine_id);
 
 	dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
 				&q->gart_mqd_addr, &q->properties);
-	if (retval != 0) {
-		deallocate_sdma_queue(dqm, q->sdma_id);
-		return retval;
-	}
+	if (retval)
+		goto out_deallocate_sdma_queue;
 
-	retval = mqd->load_mqd(mqd, q->mqd, 0,
-				0, NULL);
-	if (retval != 0) {
-		deallocate_sdma_queue(dqm, q->sdma_id);
-		mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
-		return retval;
-	}
+	retval = mqd->load_mqd(mqd, q->mqd, 0, 0, &q->properties, NULL);
+	if (retval)
+		goto out_uninit_mqd;
 
 	return 0;
+
+out_uninit_mqd:
+	mqd->uninit_mqd(mqd, q->mqd, q->mqd_mem_obj);
+out_deallocate_sdma_queue:
+	deallocate_sdma_queue(dqm, q->sdma_id);
+
+	return retval;
 }
 
 /*
@@ -642,10 +613,6 @@ static int set_sched_resources(struct device_queue_manager *dqm)
 	int i, mec;
 	struct scheduling_resources res;
 
-	BUG_ON(!dqm);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
 	res.vmid_mask = (1 << VMID_PER_DEVICE) - 1;
 	res.vmid_mask <<= KFD_VMID_START_OFFSET;
 
@@ -663,8 +630,9 @@ static int set_sched_resources(struct device_queue_manager *dqm)
 
 		/* This situation may be hit in the future if a new HW
 		 * generation exposes more than 64 queues. If so, the
-		 * definition of res.queue_mask needs updating */
-		if (WARN_ON(i > (sizeof(res.queue_mask)*8))) {
+		 * definition of res.queue_mask needs updating
+		 */
+		if (WARN_ON(i >= (sizeof(res.queue_mask)*8))) {
 			pr_err("Invalid queue enabled by amdgpu: %d\n", i);
 			break;
 		}
@@ -674,9 +642,9 @@ static int set_sched_resources(struct device_queue_manager *dqm)
 	res.gws_mask = res.oac_mask = res.gds_heap_base =
 						res.gds_heap_size = 0;
 
-	pr_debug("kfd: scheduling resources:\n"
-			"      vmid mask: 0x%8X\n"
-			"      queue mask: 0x%8llX\n",
+	pr_debug("Scheduling resources:\n"
+			"vmid mask: 0x%8X\n"
+			"queue mask: 0x%8llX\n",
 			res.vmid_mask, res.queue_mask);
 
 	return pm_send_set_resources(&dqm->packets, &res);
@@ -686,10 +654,7 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
 {
 	int retval;
 
-	BUG_ON(!dqm);
-
-	pr_debug("kfd: In func %s num of pipes: %d\n",
-			__func__, get_pipes_per_mec(dqm));
+	pr_debug("num of pipes: %d\n", get_pipes_per_mec(dqm));
 
 	mutex_init(&dqm->lock);
 	INIT_LIST_HEAD(&dqm->queues);
@@ -697,13 +662,9 @@ static int initialize_cpsch(struct device_queue_manager *dqm)
 	dqm->sdma_queue_count = 0;
 	dqm->active_runlist = false;
 	retval = dqm->ops_asic_specific.initialize(dqm);
-	if (retval != 0)
-		goto fail_init_pipelines;
+	if (retval)
+		mutex_destroy(&dqm->lock);
 
-	return 0;
-
-fail_init_pipelines:
-	mutex_destroy(&dqm->lock);
 	return retval;
 }
 
@@ -712,25 +673,23 @@ static int start_cpsch(struct device_queue_manager *dqm)
 	struct device_process_node *node;
 	int retval;
 
-	BUG_ON(!dqm);
-
 	retval = 0;
 
 	retval = pm_init(&dqm->packets, dqm);
-	if (retval != 0)
+	if (retval)
 		goto fail_packet_manager_init;
 
 	retval = set_sched_resources(dqm);
-	if (retval != 0)
+	if (retval)
 		goto fail_set_sched_resources;
 
-	pr_debug("kfd: allocating fence memory\n");
+	pr_debug("Allocating fence memory\n");
 
 	/* allocate fence memory on the gart */
 	retval = kfd_gtt_sa_allocate(dqm->dev, sizeof(*dqm->fence_addr),
 					&dqm->fence_mem);
 
-	if (retval != 0)
+	if (retval)
 		goto fail_allocate_vidmem;
 
 	dqm->fence_addr = dqm->fence_mem->cpu_ptr;
@@ -758,8 +717,6 @@ static int stop_cpsch(struct device_queue_manager *dqm)
 	struct device_process_node *node;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(!dqm);
-
 	destroy_queues_cpsch(dqm, true, true);
 
 	list_for_each_entry(node, &dqm->queues, list) {
@@ -776,13 +733,9 @@ static int create_kernel_queue_cpsch(struct device_queue_manager *dqm,
 					struct kernel_queue *kq,
 					struct qcm_process_device *qpd)
 {
-	BUG_ON(!dqm || !kq || !qpd);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
 	mutex_lock(&dqm->lock);
 	if (dqm->total_queue_count >= max_num_of_queues_per_device) {
-		pr_warn("amdkfd: Can't create new kernel queue because %d queues were already created\n",
+		pr_warn("Can't create new kernel queue because %d queues were already created\n",
 				dqm->total_queue_count);
 		mutex_unlock(&dqm->lock);
 		return -EPERM;
@@ -809,10 +762,6 @@ static void destroy_kernel_queue_cpsch(struct device_queue_manager *dqm,
 					struct kernel_queue *kq,
 					struct qcm_process_device *qpd)
 {
-	BUG_ON(!dqm || !kq);
-
-	pr_debug("kfd: In %s\n", __func__);
-
 	mutex_lock(&dqm->lock);
 	/* here we actually preempt the DIQ */
 	destroy_queues_cpsch(dqm, true, false);
@@ -844,8 +793,6 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	int retval;
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dqm || !q || !qpd);
-
 	retval = 0;
 
 	if (allocate_vmid)
@@ -854,7 +801,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	mutex_lock(&dqm->lock);
 
 	if (dqm->total_queue_count >= max_num_of_queues_per_device) {
-		pr_warn("amdkfd: Can't create new usermode queue because %d queues were already created\n",
+		pr_warn("Can't create new usermode queue because %d queues were already created\n",
 				dqm->total_queue_count);
 		retval = -EPERM;
 		goto out;
@@ -866,15 +813,15 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	mqd = dqm->ops.get_mqd_manager(dqm,
 			get_mqd_type_from_queue_type(q->properties.type));
 
-	if (mqd == NULL) {
-		mutex_unlock(&dqm->lock);
-		return -ENOMEM;
+	if (!mqd) {
+		retval = -ENOMEM;
+		goto out;
 	}
 
 	dqm->ops_asic_specific.init_sdma_vm(dqm, q, qpd);
 	retval = mqd->init_mqd(mqd, &q->mqd, &q->mqd_mem_obj,
 				&q->gart_mqd_addr, &q->properties);
-	if (retval != 0)
+	if (retval)
 		goto out;
 
 	list_add(&q->list, &qpd->queues_list);
@@ -884,7 +831,7 @@ static int create_queue_cpsch(struct device_queue_manager *dqm, struct queue *q,
 	}
 
 	if (q->properties.type == KFD_QUEUE_TYPE_SDMA)
-			dqm->sdma_queue_count++;
+		dqm->sdma_queue_count++;
 	/*
 	 * Unconditionally increment this counter, regardless of the queue's
 	 * type or whether the queue is active.
@@ -903,12 +850,11 @@ int amdkfd_fence_wait_timeout(unsigned int *fence_addr,
 				unsigned int fence_value,
 				unsigned long timeout)
 {
-	BUG_ON(!fence_addr);
 	timeout += jiffies;
 
 	while (*fence_addr != fence_value) {
 		if (time_after(jiffies, timeout)) {
-			pr_err("kfd: qcm fence wait loop timeout expired\n");
+			pr_err("qcm fence wait loop timeout expired\n");
 			return -ETIME;
 		}
 		schedule();
@@ -932,8 +878,6 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
 	enum kfd_preempt_type_filter preempt_type;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(!dqm);
-
 	retval = 0;
 
 	if (lock)
@@ -941,7 +885,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
 	if (!dqm->active_runlist)
 		goto out;
 
-	pr_debug("kfd: Before destroying queues, sdma queue count is : %u\n",
+	pr_debug("Before destroying queues, sdma queue count is : %u\n",
 		dqm->sdma_queue_count);
 
 	if (dqm->sdma_queue_count > 0) {
@@ -955,7 +899,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
 
 	retval = pm_send_unmap_queue(&dqm->packets, KFD_QUEUE_TYPE_COMPUTE,
 			preempt_type, 0, false, 0);
-	if (retval != 0)
+	if (retval)
 		goto out;
 
 	*dqm->fence_addr = KFD_FENCE_INIT;
@@ -964,7 +908,7 @@ static int destroy_queues_cpsch(struct device_queue_manager *dqm,
 	/* should be timed out */
 	retval = amdkfd_fence_wait_timeout(dqm->fence_addr, KFD_FENCE_COMPLETED,
 				QUEUE_PREEMPT_DEFAULT_TIMEOUT_MS);
-	if (retval != 0) {
+	if (retval) {
 		pdd = kfd_get_process_device_data(dqm->dev,
 				kfd_get_process(current));
 		pdd->reset_wavefronts = true;
@@ -983,14 +927,12 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
 {
 	int retval;
 
-	BUG_ON(!dqm);
-
 	if (lock)
 		mutex_lock(&dqm->lock);
 
 	retval = destroy_queues_cpsch(dqm, false, false);
-	if (retval != 0) {
-		pr_err("kfd: the cp might be in an unrecoverable state due to an unsuccessful queues preemption");
+	if (retval) {
+		pr_err("The cp might be in an unrecoverable state due to an unsuccessful queues preemption");
 		goto out;
 	}
 
@@ -1005,8 +947,8 @@ static int execute_queues_cpsch(struct device_queue_manager *dqm, bool lock)
 	}
 
 	retval = pm_send_runlist(&dqm->packets, &dqm->queues);
-	if (retval != 0) {
-		pr_err("kfd: failed to execute runlist");
+	if (retval) {
+		pr_err("failed to execute runlist");
 		goto out;
 	}
 	dqm->active_runlist = true;
@@ -1025,8 +967,6 @@ static int destroy_queue_cpsch(struct device_queue_manager *dqm,
 	struct mqd_manager *mqd;
 	bool preempt_all_queues;
 
-	BUG_ON(!dqm || !qpd || !q);
-
 	preempt_all_queues = false;
 
 	retval = 0;
@@ -1098,8 +1038,6 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 {
 	bool retval;
 
-	pr_debug("kfd: In func %s\n", __func__);
-
 	mutex_lock(&dqm->lock);
 
 	if (alternate_aperture_size == 0) {
@@ -1120,14 +1058,11 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 		uint64_t base = (uintptr_t)alternate_aperture_base;
 		uint64_t limit = base + alternate_aperture_size - 1;
 
-		if (limit <= base)
+		if (limit <= base || (base & APE1_FIXED_BITS_MASK) != 0 ||
+		   (limit & APE1_FIXED_BITS_MASK) != APE1_LIMIT_ALIGNMENT) {
+			retval = false;
 			goto out;
-
-		if ((base & APE1_FIXED_BITS_MASK) != 0)
-			goto out;
-
-		if ((limit & APE1_FIXED_BITS_MASK) != APE1_LIMIT_ALIGNMENT)
-			goto out;
+		}
 
 		qpd->sh_mem_ape1_base = base >> 16;
 		qpd->sh_mem_ape1_limit = limit >> 16;
@@ -1144,27 +1079,22 @@ static bool set_cache_memory_policy(struct device_queue_manager *dqm,
 	if ((sched_policy == KFD_SCHED_POLICY_NO_HWS) && (qpd->vmid != 0))
 		program_sh_mem_settings(dqm, qpd);
 
-	pr_debug("kfd: sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
+	pr_debug("sh_mem_config: 0x%x, ape1_base: 0x%x, ape1_limit: 0x%x\n",
 		qpd->sh_mem_config, qpd->sh_mem_ape1_base,
 		qpd->sh_mem_ape1_limit);
 
-	mutex_unlock(&dqm->lock);
-	return retval;
-
 out:
 	mutex_unlock(&dqm->lock);
-	return false;
+	return retval;
 }
 
 struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 {
 	struct device_queue_manager *dqm;
 
-	BUG_ON(!dev);
+	pr_debug("Loading device queue manager\n");
 
-	pr_debug("kfd: loading device queue manager\n");
-
-	dqm = kzalloc(sizeof(struct device_queue_manager), GFP_KERNEL);
+	dqm = kzalloc(sizeof(*dqm), GFP_KERNEL);
 	if (!dqm)
 		return NULL;
 
@@ -1202,8 +1132,8 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 		dqm->ops.set_cache_memory_policy = set_cache_memory_policy;
 		break;
 	default:
-		BUG();
-		break;
+		pr_err("Invalid scheduling policy %d\n", sched_policy);
+		goto out_free;
 	}
 
 	switch (dev->device_info->asic_family) {
@@ -1216,18 +1146,16 @@ struct device_queue_manager *device_queue_manager_init(struct kfd_dev *dev)
 		break;
 	}
 
-	if (dqm->ops.initialize(dqm) != 0) {
-		kfree(dqm);
-		return NULL;
-	}
+	if (!dqm->ops.initialize(dqm))
+		return dqm;
 
-	return dqm;
+out_free:
+	kfree(dqm);
+	return NULL;
 }
 
 void device_queue_manager_uninit(struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm);
-
 	dqm->ops.uninitialize(dqm);
 	kfree(dqm);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
index 48dc056..72c3cba 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_cik.c
@@ -24,6 +24,7 @@
 #include "kfd_device_queue_manager.h"
 #include "cik_regs.h"
 #include "oss/oss_2_4_sh_mask.h"
+#include "gca/gfx_7_2_sh_mask.h"
 
 static bool set_cache_memory_policy_cik(struct device_queue_manager *dqm,
 				   struct qcm_process_device *qpd,
@@ -65,7 +66,7 @@ static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
 	 * for LDS/Scratch and GPUVM.
 	 */
 
-	BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
+	WARN_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
 		top_address_nybble == 0);
 
 	return PRIVATE_BASE(top_address_nybble << 12) |
@@ -104,8 +105,6 @@ static int register_process_cik(struct device_queue_manager *dqm,
 	struct kfd_process_device *pdd;
 	unsigned int temp;
 
-	BUG_ON(!dqm || !qpd);
-
 	pdd = qpd_to_pdd(qpd);
 
 	/* check if sh_mem_config register already configured */
@@ -125,9 +124,10 @@ static int register_process_cik(struct device_queue_manager *dqm,
 	} else {
 		temp = get_sh_mem_bases_nybble_64(pdd);
 		qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
+		qpd->sh_mem_config |= 1  << SH_MEM_CONFIG__PRIVATE_ATC__SHIFT;
 	}
 
-	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
+	pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
 		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
index 7e9cae9..40e9ddd 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device_queue_manager_vi.c
@@ -67,7 +67,7 @@ static uint32_t compute_sh_mem_bases_64bit(unsigned int top_address_nybble)
 	 * for LDS/Scratch and GPUVM.
 	 */
 
-	BUG_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
+	WARN_ON((top_address_nybble & 1) || top_address_nybble > 0xE ||
 		top_address_nybble == 0);
 
 	return top_address_nybble << 12 |
@@ -110,8 +110,6 @@ static int register_process_vi(struct device_queue_manager *dqm,
 	struct kfd_process_device *pdd;
 	unsigned int temp;
 
-	BUG_ON(!dqm || !qpd);
-
 	pdd = qpd_to_pdd(qpd);
 
 	/* check if sh_mem_config register already configured */
@@ -137,9 +135,11 @@ static int register_process_vi(struct device_queue_manager *dqm,
 		qpd->sh_mem_bases = compute_sh_mem_bases_64bit(temp);
 		qpd->sh_mem_config |= SH_MEM_ADDRESS_MODE_HSA64 <<
 			SH_MEM_CONFIG__ADDRESS_MODE__SHIFT;
+		qpd->sh_mem_config |= 1  <<
+			SH_MEM_CONFIG__PRIVATE_ATC__SHIFT;
 	}
 
-	pr_debug("kfd: is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
+	pr_debug("is32bit process: %d sh_mem_bases nybble: 0x%X and register 0x%X\n",
 		qpd->pqm->process->is_32bit_user_mode, temp, qpd->sh_mem_bases);
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
index 453c5d6..acf4d2a 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_doorbell.c
@@ -59,7 +59,7 @@ static inline size_t doorbell_process_allocation(void)
 }
 
 /* Doorbell calculations for device init. */
-void kfd_doorbell_init(struct kfd_dev *kfd)
+int kfd_doorbell_init(struct kfd_dev *kfd)
 {
 	size_t doorbell_start_offset;
 	size_t doorbell_aperture_size;
@@ -95,26 +95,35 @@ void kfd_doorbell_init(struct kfd_dev *kfd)
 	kfd->doorbell_kernel_ptr = ioremap(kfd->doorbell_base,
 						doorbell_process_allocation());
 
-	BUG_ON(!kfd->doorbell_kernel_ptr);
+	if (!kfd->doorbell_kernel_ptr)
+		return -ENOMEM;
 
-	pr_debug("kfd: doorbell initialization:\n");
-	pr_debug("kfd: doorbell base           == 0x%08lX\n",
+	pr_debug("Doorbell initialization:\n");
+	pr_debug("doorbell base           == 0x%08lX\n",
 			(uintptr_t)kfd->doorbell_base);
 
-	pr_debug("kfd: doorbell_id_offset      == 0x%08lX\n",
+	pr_debug("doorbell_id_offset      == 0x%08lX\n",
 			kfd->doorbell_id_offset);
 
-	pr_debug("kfd: doorbell_process_limit  == 0x%08lX\n",
+	pr_debug("doorbell_process_limit  == 0x%08lX\n",
 			doorbell_process_limit);
 
-	pr_debug("kfd: doorbell_kernel_offset  == 0x%08lX\n",
+	pr_debug("doorbell_kernel_offset  == 0x%08lX\n",
 			(uintptr_t)kfd->doorbell_base);
 
-	pr_debug("kfd: doorbell aperture size  == 0x%08lX\n",
+	pr_debug("doorbell aperture size  == 0x%08lX\n",
 			kfd->shared_resources.doorbell_aperture_size);
 
-	pr_debug("kfd: doorbell kernel address == 0x%08lX\n",
+	pr_debug("doorbell kernel address == 0x%08lX\n",
 			(uintptr_t)kfd->doorbell_kernel_ptr);
+
+	return 0;
+}
+
+void kfd_doorbell_fini(struct kfd_dev *kfd)
+{
+	if (kfd->doorbell_kernel_ptr)
+		iounmap(kfd->doorbell_kernel_ptr);
 }
 
 int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
@@ -131,7 +140,7 @@ int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
 
 	/* Find kfd device according to gpu id */
 	dev = kfd_device_by_id(vma->vm_pgoff);
-	if (dev == NULL)
+	if (!dev)
 		return -EINVAL;
 
 	/* Calculate physical address of doorbell */
@@ -142,12 +151,11 @@ int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma)
 
 	vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
 
-	pr_debug("kfd: mapping doorbell page in %s\n"
+	pr_debug("Mapping doorbell page\n"
 		 "     target user address == 0x%08llX\n"
 		 "     physical address    == 0x%08llX\n"
 		 "     vm_flags            == 0x%04lX\n"
 		 "     size                == 0x%04lX\n",
-		 __func__,
 		 (unsigned long long) vma->vm_start, address, vma->vm_flags,
 		 doorbell_process_allocation());
 
@@ -166,8 +174,6 @@ u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
 {
 	u32 inx;
 
-	BUG_ON(!kfd || !doorbell_off);
-
 	mutex_lock(&kfd->doorbell_mutex);
 	inx = find_first_zero_bit(kfd->doorbell_available_index,
 					KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
@@ -185,7 +191,7 @@ u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
 	*doorbell_off = KERNEL_DOORBELL_PASID * (doorbell_process_allocation() /
 							sizeof(u32)) + inx;
 
-	pr_debug("kfd: get kernel queue doorbell\n"
+	pr_debug("Get kernel queue doorbell\n"
 			 "     doorbell offset   == 0x%08X\n"
 			 "     kernel address    == 0x%08lX\n",
 		*doorbell_off, (uintptr_t)(kfd->doorbell_kernel_ptr + inx));
@@ -197,8 +203,6 @@ void kfd_release_kernel_doorbell(struct kfd_dev *kfd, u32 __iomem *db_addr)
 {
 	unsigned int inx;
 
-	BUG_ON(!kfd || !db_addr);
-
 	inx = (unsigned int)(db_addr - kfd->doorbell_kernel_ptr);
 
 	mutex_lock(&kfd->doorbell_mutex);
@@ -210,7 +214,7 @@ inline void write_kernel_doorbell(u32 __iomem *db, u32 value)
 {
 	if (db) {
 		writel(value, db);
-		pr_debug("writing %d to doorbell address 0x%p\n", value, db);
+		pr_debug("Writing %d to doorbell address 0x%p\n", value, db);
 	}
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
index d1ce83d..5979158 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c
@@ -110,7 +110,7 @@ static bool allocate_free_slot(struct kfd_process *process,
 			*out_page = page;
 			*out_slot_index = slot;
 
-			pr_debug("allocated event signal slot in page %p, slot %d\n",
+			pr_debug("Allocated event signal slot in page %p, slot %d\n",
 					page, slot);
 
 			return true;
@@ -155,9 +155,9 @@ static bool allocate_signal_page(struct file *devkfd, struct kfd_process *p)
 						   struct signal_page,
 						   event_pages)->page_index + 1;
 
-	pr_debug("allocated new event signal page at %p, for process %p\n",
+	pr_debug("Allocated new event signal page at %p, for process %p\n",
 			page, p);
-	pr_debug("page index is %d\n", page->page_index);
+	pr_debug("Page index is %d\n", page->page_index);
 
 	list_add(&page->event_pages, &p->signal_event_pages);
 
@@ -194,7 +194,8 @@ static void release_event_notification_slot(struct signal_page *page,
 	page->free_slots++;
 
 	/* We don't free signal pages, they are retained by the process
-	 * and reused until it exits. */
+	 * and reused until it exits.
+	 */
 }
 
 static struct signal_page *lookup_signal_page_by_index(struct kfd_process *p,
@@ -246,7 +247,7 @@ static u32 make_nonsignal_event_id(struct kfd_process *p)
 
 	for (id = p->next_nonsignal_event_id;
 		id < KFD_LAST_NONSIGNAL_EVENT_ID &&
-		lookup_event_by_id(p, id) != NULL;
+		lookup_event_by_id(p, id);
 		id++)
 		;
 
@@ -265,7 +266,7 @@ static u32 make_nonsignal_event_id(struct kfd_process *p)
 
 	for (id = KFD_FIRST_NONSIGNAL_EVENT_ID;
 		id < KFD_LAST_NONSIGNAL_EVENT_ID &&
-		lookup_event_by_id(p, id) != NULL;
+		lookup_event_by_id(p, id);
 		id++)
 		;
 
@@ -291,13 +292,13 @@ static int create_signal_event(struct file *devkfd,
 				struct kfd_event *ev)
 {
 	if (p->signal_event_count == KFD_SIGNAL_EVENT_LIMIT) {
-		pr_warn("amdkfd: Signal event wasn't created because limit was reached\n");
+		pr_warn("Signal event wasn't created because limit was reached\n");
 		return -ENOMEM;
 	}
 
 	if (!allocate_event_notification_slot(devkfd, p, &ev->signal_page,
 						&ev->signal_slot_index)) {
-		pr_warn("amdkfd: Signal event wasn't created because out of kernel memory\n");
+		pr_warn("Signal event wasn't created because out of kernel memory\n");
 		return -ENOMEM;
 	}
 
@@ -309,11 +310,7 @@ static int create_signal_event(struct file *devkfd,
 	ev->event_id = make_signal_event_id(ev->signal_page,
 						ev->signal_slot_index);
 
-	pr_debug("signal event number %zu created with id %d, address %p\n",
-			p->signal_event_count, ev->event_id,
-			ev->user_signal_address);
-
-	pr_debug("signal event number %zu created with id %d, address %p\n",
+	pr_debug("Signal event number %zu created with id %d, address %p\n",
 			p->signal_event_count, ev->event_id,
 			ev->user_signal_address);
 
@@ -345,7 +342,7 @@ void kfd_event_init_process(struct kfd_process *p)
 
 static void destroy_event(struct kfd_process *p, struct kfd_event *ev)
 {
-	if (ev->signal_page != NULL) {
+	if (ev->signal_page) {
 		release_event_notification_slot(ev->signal_page,
 						ev->signal_slot_index);
 		p->signal_event_count--;
@@ -584,7 +581,7 @@ void kfd_signal_event_interrupt(unsigned int pasid, uint32_t partial_id,
 		 * search faster.
 		 */
 		struct signal_page *page;
-		unsigned i;
+		unsigned int i;
 
 		list_for_each_entry(page, &p->signal_event_pages, event_pages)
 			for (i = 0; i < SLOTS_PER_PAGE; i++)
@@ -816,7 +813,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
 	/* check required size is logical */
 	if (get_order(KFD_SIGNAL_EVENT_LIMIT * 8) !=
 			get_order(vma->vm_end - vma->vm_start)) {
-		pr_err("amdkfd: event page mmap requested illegal size\n");
+		pr_err("Event page mmap requested illegal size\n");
 		return -EINVAL;
 	}
 
@@ -825,7 +822,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
 	page = lookup_signal_page_by_index(p, page_index);
 	if (!page) {
 		/* Probably KFD bug, but mmap is user-accessible. */
-		pr_debug("signal page could not be found for page_index %u\n",
+		pr_debug("Signal page could not be found for page_index %u\n",
 				page_index);
 		return -EINVAL;
 	}
@@ -836,7 +833,7 @@ int kfd_event_mmap(struct kfd_process *p, struct vm_area_struct *vma)
 	vma->vm_flags |= VM_IO | VM_DONTCOPY | VM_DONTEXPAND | VM_NORESERVE
 		       | VM_DONTDUMP | VM_PFNMAP;
 
-	pr_debug("mapping signal page\n");
+	pr_debug("Mapping signal page\n");
 	pr_debug("     start user address  == 0x%08lx\n", vma->vm_start);
 	pr_debug("     end user address    == 0x%08lx\n", vma->vm_end);
 	pr_debug("     pfn                 == 0x%016lX\n", pfn);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
index 2b65510..c59384b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_flat_memory.c
@@ -304,7 +304,7 @@ int kfd_init_apertures(struct kfd_process *process)
 		id < NUM_OF_SUPPORTED_GPUS) {
 
 		pdd = kfd_create_process_device_data(dev, process);
-		if (pdd == NULL) {
+		if (!pdd) {
 			pr_err("Failed to create process device data\n");
 			return -1;
 		}
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
index 7f134aa..70b3a99c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
@@ -179,7 +179,7 @@ static void interrupt_wq(struct work_struct *work)
 bool interrupt_is_wanted(struct kfd_dev *dev, const uint32_t *ih_ring_entry)
 {
 	/* integer and bitwise OR so there is no boolean short-circuiting */
-	unsigned wanted = 0;
+	unsigned int wanted = 0;
 
 	wanted |= dev->device_info->event_interrupt_class->interrupt_isr(dev,
 								ih_ring_entry);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
index d135cd0..681b639 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_kernel_queue.c
@@ -41,11 +41,11 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 	int retval;
 	union PM4_MES_TYPE_3_HEADER nop;
 
-	BUG_ON(!kq || !dev);
-	BUG_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ);
+	if (WARN_ON(type != KFD_QUEUE_TYPE_DIQ && type != KFD_QUEUE_TYPE_HIQ))
+		return false;
 
-	pr_debug("amdkfd: In func %s initializing queue type %d size %d\n",
-			__func__, KFD_QUEUE_TYPE_HIQ, queue_size);
+	pr_debug("Initializing queue type %d size %d\n", KFD_QUEUE_TYPE_HIQ,
+			queue_size);
 
 	memset(&prop, 0, sizeof(prop));
 	memset(&nop, 0, sizeof(nop));
@@ -63,23 +63,23 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 						KFD_MQD_TYPE_HIQ);
 		break;
 	default:
-		BUG();
-		break;
+		pr_err("Invalid queue type %d\n", type);
+		return false;
 	}
 
-	if (kq->mqd == NULL)
+	if (!kq->mqd)
 		return false;
 
 	prop.doorbell_ptr = kfd_get_kernel_doorbell(dev, &prop.doorbell_off);
 
-	if (prop.doorbell_ptr == NULL) {
-		pr_err("amdkfd: error init doorbell");
+	if (!prop.doorbell_ptr) {
+		pr_err("Failed to initialize doorbell");
 		goto err_get_kernel_doorbell;
 	}
 
 	retval = kfd_gtt_sa_allocate(dev, queue_size, &kq->pq);
 	if (retval != 0) {
-		pr_err("amdkfd: error init pq queues size (%d)\n", queue_size);
+		pr_err("Failed to init pq queues size %d\n", queue_size);
 		goto err_pq_allocate_vidmem;
 	}
 
@@ -87,7 +87,7 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 	kq->pq_gpu_addr = kq->pq->gpu_addr;
 
 	retval = kq->ops_asic_specific.initialize(kq, dev, type, queue_size);
-	if (retval == false)
+	if (!retval)
 		goto err_eop_allocate_vidmem;
 
 	retval = kfd_gtt_sa_allocate(dev, sizeof(*kq->rptr_kernel),
@@ -139,11 +139,12 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 
 	/* assign HIQ to HQD */
 	if (type == KFD_QUEUE_TYPE_HIQ) {
-		pr_debug("assigning hiq to hqd\n");
+		pr_debug("Assigning hiq to hqd\n");
 		kq->queue->pipe = KFD_CIK_HIQ_PIPE;
 		kq->queue->queue = KFD_CIK_HIQ_QUEUE;
 		kq->mqd->load_mqd(kq->mqd, kq->queue->mqd, kq->queue->pipe,
-					kq->queue->queue, NULL);
+				  kq->queue->queue, &kq->queue->properties,
+				  NULL);
 	} else {
 		/* allocate fence for DIQ */
 
@@ -180,8 +181,6 @@ static bool initialize(struct kernel_queue *kq, struct kfd_dev *dev,
 
 static void uninitialize(struct kernel_queue *kq)
 {
-	BUG_ON(!kq);
-
 	if (kq->queue->properties.type == KFD_QUEUE_TYPE_HIQ)
 		kq->mqd->destroy_mqd(kq->mqd,
 					NULL,
@@ -211,8 +210,6 @@ static int acquire_packet_buffer(struct kernel_queue *kq,
 	uint32_t wptr, rptr;
 	unsigned int *queue_address;
 
-	BUG_ON(!kq || !buffer_ptr);
-
 	rptr = *kq->rptr_kernel;
 	wptr = *kq->wptr_kernel;
 	queue_address = (unsigned int *)kq->pq_kernel_addr;
@@ -252,11 +249,7 @@ static void submit_packet(struct kernel_queue *kq)
 {
 #ifdef DEBUG
 	int i;
-#endif
 
-	BUG_ON(!kq);
-
-#ifdef DEBUG
 	for (i = *kq->wptr_kernel; i < kq->pending_wptr; i++) {
 		pr_debug("0x%2X ", kq->pq_kernel_addr[i]);
 		if (i % 15 == 0)
@@ -272,7 +265,6 @@ static void submit_packet(struct kernel_queue *kq)
 
 static void rollback_packet(struct kernel_queue *kq)
 {
-	BUG_ON(!kq);
 	kq->pending_wptr = *kq->queue->properties.write_ptr;
 }
 
@@ -281,9 +273,7 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
 {
 	struct kernel_queue *kq;
 
-	BUG_ON(!dev);
-
-	kq = kzalloc(sizeof(struct kernel_queue), GFP_KERNEL);
+	kq = kzalloc(sizeof(*kq), GFP_KERNEL);
 	if (!kq)
 		return NULL;
 
@@ -304,7 +294,7 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
 	}
 
 	if (!kq->ops.initialize(kq, dev, type, KFD_KERNEL_QUEUE_SIZE)) {
-		pr_err("amdkfd: failed to init kernel queue\n");
+		pr_err("Failed to init kernel queue\n");
 		kfree(kq);
 		return NULL;
 	}
@@ -313,32 +303,37 @@ struct kernel_queue *kernel_queue_init(struct kfd_dev *dev,
 
 void kernel_queue_uninit(struct kernel_queue *kq)
 {
-	BUG_ON(!kq);
-
 	kq->ops.uninitialize(kq);
 	kfree(kq);
 }
 
+/* FIXME: Can this test be removed? */
 static __attribute__((unused)) void test_kq(struct kfd_dev *dev)
 {
 	struct kernel_queue *kq;
 	uint32_t *buffer, i;
 	int retval;
 
-	BUG_ON(!dev);
-
-	pr_err("amdkfd: starting kernel queue test\n");
+	pr_err("Starting kernel queue test\n");
 
 	kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_HIQ);
-	BUG_ON(!kq);
+	if (unlikely(!kq)) {
+		pr_err("  Failed to initialize HIQ\n");
+		pr_err("Kernel queue test failed\n");
+		return;
+	}
 
 	retval = kq->ops.acquire_packet_buffer(kq, 5, &buffer);
-	BUG_ON(retval != 0);
+	if (unlikely(retval != 0)) {
+		pr_err("  Failed to acquire packet buffer\n");
+		pr_err("Kernel queue test failed\n");
+		return;
+	}
 	for (i = 0; i < 5; i++)
 		buffer[i] = kq->nop_packet;
 	kq->ops.submit_packet(kq);
 
-	pr_err("amdkfd: ending kernel queue test\n");
+	pr_err("Ending kernel queue test\n");
 }
 
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
index 850a562..0d73bea 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c
@@ -61,7 +61,8 @@ MODULE_PARM_DESC(send_sigterm,
 
 static int amdkfd_init_completed;
 
-int kgd2kfd_init(unsigned interface_version, const struct kgd2kfd_calls **g2f)
+int kgd2kfd_init(unsigned int interface_version,
+		const struct kgd2kfd_calls **g2f)
 {
 	if (!amdkfd_init_completed)
 		return -EPROBE_DEFER;
@@ -90,7 +91,7 @@ static int __init kfd_module_init(void)
 	/* Verify module parameters */
 	if ((sched_policy < KFD_SCHED_POLICY_HWS) ||
 		(sched_policy > KFD_SCHED_POLICY_NO_HWS)) {
-		pr_err("kfd: sched_policy has invalid value\n");
+		pr_err("sched_policy has invalid value\n");
 		return -1;
 	}
 
@@ -98,13 +99,13 @@ static int __init kfd_module_init(void)
 	if ((max_num_of_queues_per_device < 1) ||
 		(max_num_of_queues_per_device >
 			KFD_MAX_NUM_OF_QUEUES_PER_DEVICE)) {
-		pr_err("kfd: max_num_of_queues_per_device must be between 1 to KFD_MAX_NUM_OF_QUEUES_PER_DEVICE\n");
+		pr_err("max_num_of_queues_per_device must be between 1 to KFD_MAX_NUM_OF_QUEUES_PER_DEVICE\n");
 		return -1;
 	}
 
 	err = kfd_pasid_init();
 	if (err < 0)
-		goto err_pasid;
+		return err;
 
 	err = kfd_chardev_init();
 	if (err < 0)
@@ -126,7 +127,6 @@ static int __init kfd_module_init(void)
 	kfd_chardev_exit();
 err_ioctl:
 	kfd_pasid_exit();
-err_pasid:
 	return err;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
index 213a71e..1f3a6ba 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager.h
@@ -67,7 +67,8 @@ struct mqd_manager {
 
 	int	(*load_mqd)(struct mqd_manager *mm, void *mqd,
 				uint32_t pipe_id, uint32_t queue_id,
-				uint32_t __user *wptr);
+				struct queue_properties *p,
+				struct mm_struct *mms);
 
 	int	(*update_mqd)(struct mqd_manager *mm, void *mqd,
 				struct queue_properties *q);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
index 6acc431..44ffd23 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c
@@ -44,10 +44,6 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 	struct cik_mqd *m;
 	int retval;
 
-	BUG_ON(!mm || !q || !mqd);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
 	retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
 					mqd_mem_obj);
 
@@ -101,7 +97,7 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 		m->cp_hqd_iq_rptr = AQL_ENABLE;
 
 	*mqd = m;
-	if (gart_addr != NULL)
+	if (gart_addr)
 		*gart_addr = addr;
 	retval = mm->update_mqd(mm, m, q);
 
@@ -115,8 +111,6 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
 	int retval;
 	struct cik_sdma_rlc_registers *m;
 
-	BUG_ON(!mm || !mqd || !mqd_mem_obj);
-
 	retval = kfd_gtt_sa_allocate(mm->dev,
 					sizeof(struct cik_sdma_rlc_registers),
 					mqd_mem_obj);
@@ -129,7 +123,7 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
 	memset(m, 0, sizeof(struct cik_sdma_rlc_registers));
 
 	*mqd = m;
-	if (gart_addr != NULL)
+	if (gart_addr)
 		*gart_addr = (*mqd_mem_obj)->gpu_addr;
 
 	retval = mm->update_mqd(mm, m, q);
@@ -140,27 +134,31 @@ static int init_mqd_sdma(struct mqd_manager *mm, void **mqd,
 static void uninit_mqd(struct mqd_manager *mm, void *mqd,
 			struct kfd_mem_obj *mqd_mem_obj)
 {
-	BUG_ON(!mm || !mqd);
 	kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
 }
 
 static void uninit_mqd_sdma(struct mqd_manager *mm, void *mqd,
 				struct kfd_mem_obj *mqd_mem_obj)
 {
-	BUG_ON(!mm || !mqd);
 	kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
 }
 
 static int load_mqd(struct mqd_manager *mm, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr)
+		    uint32_t queue_id, struct queue_properties *p,
+		    struct mm_struct *mms)
 {
-	return mm->dev->kfd2kgd->hqd_load
-		(mm->dev->kgd, mqd, pipe_id, queue_id, wptr);
+	/* AQL write pointer counts in 64B packets, PM4/CP counts in dwords. */
+	uint32_t wptr_shift = (p->format == KFD_QUEUE_FORMAT_AQL ? 4 : 0);
+	uint32_t wptr_mask = (uint32_t)((p->queue_size / sizeof(uint32_t)) - 1);
+
+	return mm->dev->kfd2kgd->hqd_load(mm->dev->kgd, mqd, pipe_id, queue_id,
+					  (uint32_t __user *)p->write_ptr,
+					  wptr_shift, wptr_mask, mms);
 }
 
 static int load_mqd_sdma(struct mqd_manager *mm, void *mqd,
-			uint32_t pipe_id, uint32_t queue_id,
-			uint32_t __user *wptr)
+			 uint32_t pipe_id, uint32_t queue_id,
+			 struct queue_properties *p, struct mm_struct *mms)
 {
 	return mm->dev->kfd2kgd->hqd_sdma_load(mm->dev->kgd, mqd);
 }
@@ -170,10 +168,6 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
 {
 	struct cik_mqd *m;
 
-	BUG_ON(!mm || !q || !mqd);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
 	m = get_mqd(mqd);
 	m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
 				DEFAULT_MIN_AVAIL_SIZE | PQ_ATC_EN;
@@ -188,21 +182,17 @@ static int update_mqd(struct mqd_manager *mm, void *mqd,
 	m->cp_hqd_pq_base_hi = upper_32_bits((uint64_t)q->queue_address >> 8);
 	m->cp_hqd_pq_rptr_report_addr_lo = lower_32_bits((uint64_t)q->read_ptr);
 	m->cp_hqd_pq_rptr_report_addr_hi = upper_32_bits((uint64_t)q->read_ptr);
-	m->cp_hqd_pq_doorbell_control = DOORBELL_EN |
-					DOORBELL_OFFSET(q->doorbell_off);
+	m->cp_hqd_pq_doorbell_control = DOORBELL_OFFSET(q->doorbell_off);
 
 	m->cp_hqd_vmid = q->vmid;
 
-	if (q->format == KFD_QUEUE_FORMAT_AQL) {
+	if (q->format == KFD_QUEUE_FORMAT_AQL)
 		m->cp_hqd_pq_control |= NO_UPDATE_RPTR;
-	}
 
-	m->cp_hqd_active = 0;
 	q->is_active = false;
 	if (q->queue_size > 0 &&
 			q->queue_address != 0 &&
 			q->queue_percent > 0) {
-		m->cp_hqd_active = 1;
 		q->is_active = true;
 	}
 
@@ -214,8 +204,6 @@ static int update_mqd_sdma(struct mqd_manager *mm, void *mqd,
 {
 	struct cik_sdma_rlc_registers *m;
 
-	BUG_ON(!mm || !mqd || !q);
-
 	m = get_sdma_mqd(mqd);
 	m->sdma_rlc_rb_cntl = ffs(q->queue_size / sizeof(unsigned int)) <<
 			SDMA0_RLC0_RB_CNTL__RB_SIZE__SHIFT |
@@ -254,7 +242,7 @@ static int destroy_mqd(struct mqd_manager *mm, void *mqd,
 			unsigned int timeout, uint32_t pipe_id,
 			uint32_t queue_id)
 {
-	return mm->dev->kfd2kgd->hqd_destroy(mm->dev->kgd, type, timeout,
+	return mm->dev->kfd2kgd->hqd_destroy(mm->dev->kgd, mqd, type, timeout,
 					pipe_id, queue_id);
 }
 
@@ -301,10 +289,6 @@ static int init_mqd_hiq(struct mqd_manager *mm, void **mqd,
 	struct cik_mqd *m;
 	int retval;
 
-	BUG_ON(!mm || !q || !mqd || !mqd_mem_obj);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
 	retval = kfd_gtt_sa_allocate(mm->dev, sizeof(struct cik_mqd),
 					mqd_mem_obj);
 
@@ -359,10 +343,6 @@ static int update_mqd_hiq(struct mqd_manager *mm, void *mqd,
 {
 	struct cik_mqd *m;
 
-	BUG_ON(!mm || !q || !mqd);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
 	m = get_mqd(mqd);
 	m->cp_hqd_pq_control = DEFAULT_RPTR_BLOCK_SIZE |
 				DEFAULT_MIN_AVAIL_SIZE |
@@ -400,8 +380,6 @@ struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 {
 	struct cik_sdma_rlc_registers *m;
 
-	BUG_ON(!mqd);
-
 	m = (struct cik_sdma_rlc_registers *)mqd;
 
 	return m;
@@ -412,12 +390,10 @@ struct mqd_manager *mqd_manager_init_cik(enum KFD_MQD_TYPE type,
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dev);
-	BUG_ON(type >= KFD_MQD_TYPE_MAX);
+	if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
+		return NULL;
 
-	pr_debug("kfd: In func %s\n", __func__);
-
-	mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
+	mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
 	if (!mqd)
 		return NULL;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
index a9b9882..73cbfe1 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_vi.c
@@ -85,7 +85,7 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 		m->cp_hqd_iq_rptr = 1;
 
 	*mqd = m;
-	if (gart_addr != NULL)
+	if (gart_addr)
 		*gart_addr = addr;
 	retval = mm->update_mqd(mm, m, q);
 
@@ -94,10 +94,15 @@ static int init_mqd(struct mqd_manager *mm, void **mqd,
 
 static int load_mqd(struct mqd_manager *mm, void *mqd,
 			uint32_t pipe_id, uint32_t queue_id,
-			uint32_t __user *wptr)
+			struct queue_properties *p, struct mm_struct *mms)
 {
-	return mm->dev->kfd2kgd->hqd_load
-		(mm->dev->kgd, mqd, pipe_id, queue_id, wptr);
+	/* AQL write pointer counts in 64B packets, PM4/CP counts in dwords. */
+	uint32_t wptr_shift = (p->format == KFD_QUEUE_FORMAT_AQL ? 4 : 0);
+	uint32_t wptr_mask = (uint32_t)((p->queue_size / sizeof(uint32_t)) - 1);
+
+	return mm->dev->kfd2kgd->hqd_load(mm->dev->kgd, mqd, pipe_id, queue_id,
+					  (uint32_t __user *)p->write_ptr,
+					  wptr_shift, wptr_mask, mms);
 }
 
 static int __update_mqd(struct mqd_manager *mm, void *mqd,
@@ -106,10 +111,6 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 {
 	struct vi_mqd *m;
 
-	BUG_ON(!mm || !q || !mqd);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
 	m = get_mqd(mqd);
 
 	m->cp_hqd_pq_control = 5 << CP_HQD_PQ_CONTROL__RPTR_BLOCK_SIZE__SHIFT |
@@ -117,7 +118,7 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 			mtype << CP_HQD_PQ_CONTROL__MTYPE__SHIFT;
 	m->cp_hqd_pq_control |=
 			ffs(q->queue_size / sizeof(unsigned int)) - 1 - 1;
-	pr_debug("kfd: cp_hqd_pq_control 0x%x\n", m->cp_hqd_pq_control);
+	pr_debug("cp_hqd_pq_control 0x%x\n", m->cp_hqd_pq_control);
 
 	m->cp_hqd_pq_base_lo = lower_32_bits((uint64_t)q->queue_address >> 8);
 	m->cp_hqd_pq_base_hi = upper_32_bits((uint64_t)q->queue_address >> 8);
@@ -126,10 +127,9 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 	m->cp_hqd_pq_rptr_report_addr_hi = upper_32_bits((uint64_t)q->read_ptr);
 
 	m->cp_hqd_pq_doorbell_control =
-		1 << CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_EN__SHIFT |
 		q->doorbell_off <<
 			CP_HQD_PQ_DOORBELL_CONTROL__DOORBELL_OFFSET__SHIFT;
-	pr_debug("kfd: cp_hqd_pq_doorbell_control 0x%x\n",
+	pr_debug("cp_hqd_pq_doorbell_control 0x%x\n",
 			m->cp_hqd_pq_doorbell_control);
 
 	m->cp_hqd_eop_control = atc_bit << CP_HQD_EOP_CONTROL__EOP_ATC__SHIFT |
@@ -139,8 +139,15 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 			3 << CP_HQD_IB_CONTROL__MIN_IB_AVAIL_SIZE__SHIFT |
 			mtype << CP_HQD_IB_CONTROL__MTYPE__SHIFT;
 
-	m->cp_hqd_eop_control |=
-		ffs(q->eop_ring_buffer_size / sizeof(unsigned int)) - 1 - 1;
+	/*
+	 * HW does not clamp this field correctly. Maximum EOP queue size
+	 * is constrained by per-SE EOP done signal count, which is 8-bit.
+	 * Limit is 0xFF EOP entries (= 0x7F8 dwords). CP will not submit
+	 * more than (EOP entry count - 1) so a queue size of 0x800 dwords
+	 * is safe, giving a maximum field value of 0xA.
+	 */
+	m->cp_hqd_eop_control |= min(0xA,
+		ffs(q->eop_ring_buffer_size / sizeof(unsigned int)) - 1 - 1);
 	m->cp_hqd_eop_base_addr_lo =
 			lower_32_bits(q->eop_ring_buffer_address >> 8);
 	m->cp_hqd_eop_base_addr_hi =
@@ -156,12 +163,10 @@ static int __update_mqd(struct mqd_manager *mm, void *mqd,
 				2 << CP_HQD_PQ_CONTROL__SLOT_BASED_WPTR__SHIFT;
 	}
 
-	m->cp_hqd_active = 0;
 	q->is_active = false;
 	if (q->queue_size > 0 &&
 			q->queue_address != 0 &&
 			q->queue_percent > 0) {
-		m->cp_hqd_active = 1;
 		q->is_active = true;
 	}
 
@@ -181,14 +186,13 @@ static int destroy_mqd(struct mqd_manager *mm, void *mqd,
 			uint32_t queue_id)
 {
 	return mm->dev->kfd2kgd->hqd_destroy
-		(mm->dev->kgd, type, timeout,
+		(mm->dev->kgd, mqd, type, timeout,
 		pipe_id, queue_id);
 }
 
 static void uninit_mqd(struct mqd_manager *mm, void *mqd,
 			struct kfd_mem_obj *mqd_mem_obj)
 {
-	BUG_ON(!mm || !mqd);
 	kfd_gtt_sa_free(mm->dev, mqd_mem_obj);
 }
 
@@ -238,12 +242,10 @@ struct mqd_manager *mqd_manager_init_vi(enum KFD_MQD_TYPE type,
 {
 	struct mqd_manager *mqd;
 
-	BUG_ON(!dev);
-	BUG_ON(type >= KFD_MQD_TYPE_MAX);
+	if (WARN_ON(type >= KFD_MQD_TYPE_MAX))
+		return NULL;
 
-	pr_debug("kfd: In func %s\n", __func__);
-
-	mqd = kzalloc(sizeof(struct mqd_manager), GFP_KERNEL);
+	mqd = kzalloc(sizeof(*mqd), GFP_KERNEL);
 	if (!mqd)
 		return NULL;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
index 7131998..1d31260 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_packet_manager.c
@@ -26,7 +26,6 @@
 #include "kfd_device_queue_manager.h"
 #include "kfd_kernel_queue.h"
 #include "kfd_priv.h"
-#include "kfd_pm4_headers.h"
 #include "kfd_pm4_headers_vi.h"
 #include "kfd_pm4_opcodes.h"
 
@@ -35,7 +34,8 @@ static inline void inc_wptr(unsigned int *wptr, unsigned int increment_bytes,
 {
 	unsigned int temp = *wptr + increment_bytes / sizeof(uint32_t);
 
-	BUG_ON((temp * sizeof(uint32_t)) > buffer_size_bytes);
+	WARN((temp * sizeof(uint32_t)) > buffer_size_bytes,
+	     "Runlist IB overflow");
 	*wptr = temp;
 }
 
@@ -43,12 +43,12 @@ static unsigned int build_pm4_header(unsigned int opcode, size_t packet_size)
 {
 	union PM4_MES_TYPE_3_HEADER header;
 
-	header.u32all = 0;
+	header.u32All = 0;
 	header.opcode = opcode;
 	header.count = packet_size/sizeof(uint32_t) - 2;
 	header.type = PM4_TYPE_3;
 
-	return header.u32all;
+	return header.u32All;
 }
 
 static void pm_calc_rlib_size(struct packet_manager *pm,
@@ -58,8 +58,6 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 	unsigned int process_count, queue_count;
 	unsigned int map_queue_size;
 
-	BUG_ON(!pm || !rlib_size || !over_subscription);
-
 	process_count = pm->dqm->processes_count;
 	queue_count = pm->dqm->queue_count;
 
@@ -67,15 +65,12 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 	*over_subscription = false;
 	if ((process_count > 1) || queue_count > get_queues_num(pm->dqm)) {
 		*over_subscription = true;
-		pr_debug("kfd: over subscribed runlist\n");
+		pr_debug("Over subscribed runlist\n");
 	}
 
-	map_queue_size =
-		(pm->dqm->dev->device_info->asic_family == CHIP_CARRIZO) ?
-		sizeof(struct pm4_mes_map_queues) :
-		sizeof(struct pm4_map_queues);
+	map_queue_size = sizeof(struct pm4_mes_map_queues);
 	/* calculate run list ib allocation size */
-	*rlib_size = process_count * sizeof(struct pm4_map_process) +
+	*rlib_size = process_count * sizeof(struct pm4_mes_map_process) +
 		     queue_count * map_queue_size;
 
 	/*
@@ -83,9 +78,9 @@ static void pm_calc_rlib_size(struct packet_manager *pm,
 	 * when over subscription
 	 */
 	if (*over_subscription)
-		*rlib_size += sizeof(struct pm4_runlist);
+		*rlib_size += sizeof(struct pm4_mes_runlist);
 
-	pr_debug("kfd: runlist ib size %d\n", *rlib_size);
+	pr_debug("runlist ib size %d\n", *rlib_size);
 }
 
 static int pm_allocate_runlist_ib(struct packet_manager *pm,
@@ -96,17 +91,16 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 {
 	int retval;
 
-	BUG_ON(!pm);
-	BUG_ON(pm->allocated);
-	BUG_ON(is_over_subscription == NULL);
+	if (WARN_ON(pm->allocated))
+		return -EINVAL;
 
 	pm_calc_rlib_size(pm, rl_buffer_size, is_over_subscription);
 
 	retval = kfd_gtt_sa_allocate(pm->dqm->dev, *rl_buffer_size,
 					&pm->ib_buffer_obj);
 
-	if (retval != 0) {
-		pr_err("kfd: failed to allocate runlist IB\n");
+	if (retval) {
+		pr_err("Failed to allocate runlist IB\n");
 		return retval;
 	}
 
@@ -121,15 +115,16 @@ static int pm_allocate_runlist_ib(struct packet_manager *pm,
 static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
 			uint64_t ib, size_t ib_size_in_dwords, bool chain)
 {
-	struct pm4_runlist *packet;
+	struct pm4_mes_runlist *packet;
 
-	BUG_ON(!pm || !buffer || !ib);
+	if (WARN_ON(!ib))
+		return -EFAULT;
 
-	packet = (struct pm4_runlist *)buffer;
+	packet = (struct pm4_mes_runlist *)buffer;
 
-	memset(buffer, 0, sizeof(struct pm4_runlist));
-	packet->header.u32all = build_pm4_header(IT_RUN_LIST,
-						sizeof(struct pm4_runlist));
+	memset(buffer, 0, sizeof(struct pm4_mes_runlist));
+	packet->header.u32All = build_pm4_header(IT_RUN_LIST,
+						sizeof(struct pm4_mes_runlist));
 
 	packet->bitfields4.ib_size = ib_size_in_dwords;
 	packet->bitfields4.chain = chain ? 1 : 0;
@@ -144,20 +139,16 @@ static int pm_create_runlist(struct packet_manager *pm, uint32_t *buffer,
 static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
 				struct qcm_process_device *qpd)
 {
-	struct pm4_map_process *packet;
+	struct pm4_mes_map_process *packet;
 	struct queue *cur;
 	uint32_t num_queues;
 
-	BUG_ON(!pm || !buffer || !qpd);
+	packet = (struct pm4_mes_map_process *)buffer;
 
-	packet = (struct pm4_map_process *)buffer;
+	memset(buffer, 0, sizeof(struct pm4_mes_map_process));
 
-	pr_debug("kfd: In func %s\n", __func__);
-
-	memset(buffer, 0, sizeof(struct pm4_map_process));
-
-	packet->header.u32all = build_pm4_header(IT_MAP_PROCESS,
-					sizeof(struct pm4_map_process));
+	packet->header.u32All = build_pm4_header(IT_MAP_PROCESS,
+					sizeof(struct pm4_mes_map_process));
 	packet->bitfields2.diq_enable = (qpd->is_debug) ? 1 : 0;
 	packet->bitfields2.process_quantum = 1;
 	packet->bitfields2.pasid = qpd->pqm->process->pasid;
@@ -175,27 +166,26 @@ static int pm_create_map_process(struct packet_manager *pm, uint32_t *buffer,
 	packet->sh_mem_ape1_base = qpd->sh_mem_ape1_base;
 	packet->sh_mem_ape1_limit = qpd->sh_mem_ape1_limit;
 
+	/* TODO: scratch support */
+	packet->sh_hidden_private_base_vmid = 0;
+
 	packet->gds_addr_lo = lower_32_bits(qpd->gds_context_area);
 	packet->gds_addr_hi = upper_32_bits(qpd->gds_context_area);
 
 	return 0;
 }
 
-static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
+static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
 		struct queue *q, bool is_static)
 {
 	struct pm4_mes_map_queues *packet;
 	bool use_static = is_static;
 
-	BUG_ON(!pm || !buffer || !q);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
 	packet = (struct pm4_mes_map_queues *)buffer;
-	memset(buffer, 0, sizeof(struct pm4_map_queues));
+	memset(buffer, 0, sizeof(struct pm4_mes_map_queues));
 
-	packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
-						sizeof(struct pm4_map_queues));
+	packet->header.u32All = build_pm4_header(IT_MAP_QUEUES,
+						sizeof(struct pm4_mes_map_queues));
 	packet->bitfields2.alloc_format =
 		alloc_format__mes_map_queues__one_per_pipe_vi;
 	packet->bitfields2.num_queues = 1;
@@ -223,10 +213,8 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
 		use_static = false; /* no static queues under SDMA */
 		break;
 	default:
-		pr_err("kfd: in %s queue type %d\n", __func__,
-				q->properties.type);
-		BUG();
-		break;
+		WARN(1, "queue type %d", q->properties.type);
+		return -EINVAL;
 	}
 	packet->bitfields3.doorbell_offset =
 			q->properties.doorbell_off;
@@ -246,68 +234,6 @@ static int pm_create_map_queue_vi(struct packet_manager *pm, uint32_t *buffer,
 	return 0;
 }
 
-static int pm_create_map_queue(struct packet_manager *pm, uint32_t *buffer,
-				struct queue *q, bool is_static)
-{
-	struct pm4_map_queues *packet;
-	bool use_static = is_static;
-
-	BUG_ON(!pm || !buffer || !q);
-
-	pr_debug("kfd: In func %s\n", __func__);
-
-	packet = (struct pm4_map_queues *)buffer;
-	memset(buffer, 0, sizeof(struct pm4_map_queues));
-
-	packet->header.u32all = build_pm4_header(IT_MAP_QUEUES,
-						sizeof(struct pm4_map_queues));
-	packet->bitfields2.alloc_format =
-				alloc_format__mes_map_queues__one_per_pipe;
-	packet->bitfields2.num_queues = 1;
-	packet->bitfields2.queue_sel =
-		queue_sel__mes_map_queues__map_to_hws_determined_queue_slots;
-
-	packet->bitfields2.vidmem = (q->properties.is_interop) ?
-			vidmem__mes_map_queues__uses_video_memory :
-			vidmem__mes_map_queues__uses_no_video_memory;
-
-	switch (q->properties.type) {
-	case KFD_QUEUE_TYPE_COMPUTE:
-	case KFD_QUEUE_TYPE_DIQ:
-		packet->bitfields2.engine_sel =
-				engine_sel__mes_map_queues__compute;
-		break;
-	case KFD_QUEUE_TYPE_SDMA:
-		packet->bitfields2.engine_sel =
-				engine_sel__mes_map_queues__sdma0;
-		use_static = false; /* no static queues under SDMA */
-		break;
-	default:
-		BUG();
-		break;
-	}
-
-	packet->mes_map_queues_ordinals[0].bitfields3.doorbell_offset =
-			q->properties.doorbell_off;
-
-	packet->mes_map_queues_ordinals[0].bitfields3.is_static =
-			(use_static) ? 1 : 0;
-
-	packet->mes_map_queues_ordinals[0].mqd_addr_lo =
-			lower_32_bits(q->gart_mqd_addr);
-
-	packet->mes_map_queues_ordinals[0].mqd_addr_hi =
-			upper_32_bits(q->gart_mqd_addr);
-
-	packet->mes_map_queues_ordinals[0].wptr_addr_lo =
-			lower_32_bits((uint64_t)q->properties.write_ptr);
-
-	packet->mes_map_queues_ordinals[0].wptr_addr_hi =
-			upper_32_bits((uint64_t)q->properties.write_ptr);
-
-	return 0;
-}
-
 static int pm_create_runlist_ib(struct packet_manager *pm,
 				struct list_head *queues,
 				uint64_t *rl_gpu_addr,
@@ -322,19 +248,16 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 	struct kernel_queue *kq;
 	bool is_over_subscription;
 
-	BUG_ON(!pm || !queues || !rl_size_bytes || !rl_gpu_addr);
-
 	rl_wptr = retval = proccesses_mapped = 0;
 
 	retval = pm_allocate_runlist_ib(pm, &rl_buffer, rl_gpu_addr,
 				&alloc_size_bytes, &is_over_subscription);
-	if (retval != 0)
+	if (retval)
 		return retval;
 
 	*rl_size_bytes = alloc_size_bytes;
 
-	pr_debug("kfd: In func %s\n", __func__);
-	pr_debug("kfd: building runlist ib process count: %d queues count %d\n",
+	pr_debug("Building runlist ib process count: %d queues count %d\n",
 		pm->dqm->processes_count, pm->dqm->queue_count);
 
 	/* build the run list ib packet */
@@ -342,42 +265,35 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 		qpd = cur->qpd;
 		/* build map process packet */
 		if (proccesses_mapped >= pm->dqm->processes_count) {
-			pr_debug("kfd: not enough space left in runlist IB\n");
+			pr_debug("Not enough space left in runlist IB\n");
 			pm_release_ib(pm);
 			return -ENOMEM;
 		}
 
 		retval = pm_create_map_process(pm, &rl_buffer[rl_wptr], qpd);
-		if (retval != 0)
+		if (retval)
 			return retval;
 
 		proccesses_mapped++;
-		inc_wptr(&rl_wptr, sizeof(struct pm4_map_process),
+		inc_wptr(&rl_wptr, sizeof(struct pm4_mes_map_process),
 				alloc_size_bytes);
 
 		list_for_each_entry(kq, &qpd->priv_queue_list, list) {
 			if (!kq->queue->properties.is_active)
 				continue;
 
-			pr_debug("kfd: static_queue, mapping kernel q %d, is debug status %d\n",
+			pr_debug("static_queue, mapping kernel q %d, is debug status %d\n",
 				kq->queue->queue, qpd->is_debug);
 
-			if (pm->dqm->dev->device_info->asic_family ==
-					CHIP_CARRIZO)
-				retval = pm_create_map_queue_vi(pm,
+			retval = pm_create_map_queue(pm,
 						&rl_buffer[rl_wptr],
 						kq->queue,
 						qpd->is_debug);
-			else
-				retval = pm_create_map_queue(pm,
-						&rl_buffer[rl_wptr],
-						kq->queue,
-						qpd->is_debug);
-			if (retval != 0)
+			if (retval)
 				return retval;
 
 			inc_wptr(&rl_wptr,
-				sizeof(struct pm4_map_queues),
+				sizeof(struct pm4_mes_map_queues),
 				alloc_size_bytes);
 		}
 
@@ -385,51 +301,44 @@ static int pm_create_runlist_ib(struct packet_manager *pm,
 			if (!q->properties.is_active)
 				continue;
 
-			pr_debug("kfd: static_queue, mapping user queue %d, is debug status %d\n",
+			pr_debug("static_queue, mapping user queue %d, is debug status %d\n",
 				q->queue, qpd->is_debug);
 
-			if (pm->dqm->dev->device_info->asic_family ==
-					CHIP_CARRIZO)
-				retval = pm_create_map_queue_vi(pm,
-						&rl_buffer[rl_wptr],
-						q,
-						qpd->is_debug);
-			else
-				retval = pm_create_map_queue(pm,
+			retval = pm_create_map_queue(pm,
 						&rl_buffer[rl_wptr],
 						q,
 						qpd->is_debug);
 
-			if (retval != 0)
+			if (retval)
 				return retval;
 
 			inc_wptr(&rl_wptr,
-				sizeof(struct pm4_map_queues),
+				sizeof(struct pm4_mes_map_queues),
 				alloc_size_bytes);
 		}
 	}
 
-	pr_debug("kfd: finished map process and queues to runlist\n");
+	pr_debug("Finished map process and queues to runlist\n");
 
 	if (is_over_subscription)
-		pm_create_runlist(pm, &rl_buffer[rl_wptr], *rl_gpu_addr,
-				alloc_size_bytes / sizeof(uint32_t), true);
+		retval = pm_create_runlist(pm, &rl_buffer[rl_wptr],
+					*rl_gpu_addr,
+					alloc_size_bytes / sizeof(uint32_t),
+					true);
 
 	for (i = 0; i < alloc_size_bytes / sizeof(uint32_t); i++)
 		pr_debug("0x%2X ", rl_buffer[i]);
 	pr_debug("\n");
 
-	return 0;
+	return retval;
 }
 
 int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
 {
-	BUG_ON(!dqm);
-
 	pm->dqm = dqm;
 	mutex_init(&pm->lock);
 	pm->priv_queue = kernel_queue_init(dqm->dev, KFD_QUEUE_TYPE_HIQ);
-	if (pm->priv_queue == NULL) {
+	if (!pm->priv_queue) {
 		mutex_destroy(&pm->lock);
 		return -ENOMEM;
 	}
@@ -440,8 +349,6 @@ int pm_init(struct packet_manager *pm, struct device_queue_manager *dqm)
 
 void pm_uninit(struct packet_manager *pm)
 {
-	BUG_ON(!pm);
-
 	mutex_destroy(&pm->lock);
 	kernel_queue_uninit(pm->priv_queue);
 }
@@ -449,25 +356,22 @@ void pm_uninit(struct packet_manager *pm)
 int pm_send_set_resources(struct packet_manager *pm,
 				struct scheduling_resources *res)
 {
-	struct pm4_set_resources *packet;
-
-	BUG_ON(!pm || !res);
-
-	pr_debug("kfd: In func %s\n", __func__);
+	struct pm4_mes_set_resources *packet;
+	int retval = 0;
 
 	mutex_lock(&pm->lock);
 	pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
 					sizeof(*packet) / sizeof(uint32_t),
-			(unsigned int **)&packet);
-	if (packet == NULL) {
-		mutex_unlock(&pm->lock);
-		pr_err("kfd: failed to allocate buffer on kernel queue\n");
-		return -ENOMEM;
+					(unsigned int **)&packet);
+	if (!packet) {
+		pr_err("Failed to allocate buffer on kernel queue\n");
+		retval = -ENOMEM;
+		goto out;
 	}
 
-	memset(packet, 0, sizeof(struct pm4_set_resources));
-	packet->header.u32all = build_pm4_header(IT_SET_RESOURCES,
-					sizeof(struct pm4_set_resources));
+	memset(packet, 0, sizeof(struct pm4_mes_set_resources));
+	packet->header.u32All = build_pm4_header(IT_SET_RESOURCES,
+					sizeof(struct pm4_mes_set_resources));
 
 	packet->bitfields2.queue_type =
 			queue_type__mes_set_resources__hsa_interface_queue_hiq;
@@ -485,9 +389,10 @@ int pm_send_set_resources(struct packet_manager *pm,
 
 	pm->priv_queue->ops.submit_packet(pm->priv_queue);
 
+out:
 	mutex_unlock(&pm->lock);
 
-	return 0;
+	return retval;
 }
 
 int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
@@ -497,26 +402,24 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
 	size_t rl_ib_size, packet_size_dwords;
 	int retval;
 
-	BUG_ON(!pm || !dqm_queues);
-
 	retval = pm_create_runlist_ib(pm, dqm_queues, &rl_gpu_ib_addr,
 					&rl_ib_size);
-	if (retval != 0)
+	if (retval)
 		goto fail_create_runlist_ib;
 
-	pr_debug("kfd: runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
+	pr_debug("runlist IB address: 0x%llX\n", rl_gpu_ib_addr);
 
-	packet_size_dwords = sizeof(struct pm4_runlist) / sizeof(uint32_t);
+	packet_size_dwords = sizeof(struct pm4_mes_runlist) / sizeof(uint32_t);
 	mutex_lock(&pm->lock);
 
 	retval = pm->priv_queue->ops.acquire_packet_buffer(pm->priv_queue,
 					packet_size_dwords, &rl_buffer);
-	if (retval != 0)
+	if (retval)
 		goto fail_acquire_packet_buffer;
 
 	retval = pm_create_runlist(pm, rl_buffer, rl_gpu_ib_addr,
 					rl_ib_size / sizeof(uint32_t), false);
-	if (retval != 0)
+	if (retval)
 		goto fail_create_runlist;
 
 	pm->priv_queue->ops.submit_packet(pm->priv_queue);
@@ -530,8 +433,7 @@ int pm_send_runlist(struct packet_manager *pm, struct list_head *dqm_queues)
 fail_acquire_packet_buffer:
 	mutex_unlock(&pm->lock);
 fail_create_runlist_ib:
-	if (pm->allocated)
-		pm_release_ib(pm);
+	pm_release_ib(pm);
 	return retval;
 }
 
@@ -539,20 +441,21 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
 			uint32_t fence_value)
 {
 	int retval;
-	struct pm4_query_status *packet;
+	struct pm4_mes_query_status *packet;
 
-	BUG_ON(!pm || !fence_address);
+	if (WARN_ON(!fence_address))
+		return -EFAULT;
 
 	mutex_lock(&pm->lock);
 	retval = pm->priv_queue->ops.acquire_packet_buffer(
 			pm->priv_queue,
-			sizeof(struct pm4_query_status) / sizeof(uint32_t),
+			sizeof(struct pm4_mes_query_status) / sizeof(uint32_t),
 			(unsigned int **)&packet);
-	if (retval != 0)
+	if (retval)
 		goto fail_acquire_packet_buffer;
 
-	packet->header.u32all = build_pm4_header(IT_QUERY_STATUS,
-					sizeof(struct pm4_query_status));
+	packet->header.u32All = build_pm4_header(IT_QUERY_STATUS,
+					sizeof(struct pm4_mes_query_status));
 
 	packet->bitfields2.context_id = 0;
 	packet->bitfields2.interrupt_sel =
@@ -566,9 +469,6 @@ int pm_send_query_status(struct packet_manager *pm, uint64_t fence_address,
 	packet->data_lo = lower_32_bits((uint64_t)fence_value);
 
 	pm->priv_queue->ops.submit_packet(pm->priv_queue);
-	mutex_unlock(&pm->lock);
-
-	return 0;
 
 fail_acquire_packet_buffer:
 	mutex_unlock(&pm->lock);
@@ -582,24 +482,22 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 {
 	int retval;
 	uint32_t *buffer;
-	struct pm4_unmap_queues *packet;
-
-	BUG_ON(!pm);
+	struct pm4_mes_unmap_queues *packet;
 
 	mutex_lock(&pm->lock);
 	retval = pm->priv_queue->ops.acquire_packet_buffer(
 			pm->priv_queue,
-			sizeof(struct pm4_unmap_queues) / sizeof(uint32_t),
+			sizeof(struct pm4_mes_unmap_queues) / sizeof(uint32_t),
 			&buffer);
-	if (retval != 0)
+	if (retval)
 		goto err_acquire_packet_buffer;
 
-	packet = (struct pm4_unmap_queues *)buffer;
-	memset(buffer, 0, sizeof(struct pm4_unmap_queues));
-	pr_debug("kfd: static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
+	packet = (struct pm4_mes_unmap_queues *)buffer;
+	memset(buffer, 0, sizeof(struct pm4_mes_unmap_queues));
+	pr_debug("static_queue: unmapping queues: mode is %d , reset is %d , type is %d\n",
 		mode, reset, type);
-	packet->header.u32all = build_pm4_header(IT_UNMAP_QUEUES,
-					sizeof(struct pm4_unmap_queues));
+	packet->header.u32All = build_pm4_header(IT_UNMAP_QUEUES,
+					sizeof(struct pm4_mes_unmap_queues));
 	switch (type) {
 	case KFD_QUEUE_TYPE_COMPUTE:
 	case KFD_QUEUE_TYPE_DIQ:
@@ -611,8 +509,9 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 			engine_sel__mes_unmap_queues__sdma0 + sdma_engine;
 		break;
 	default:
-		BUG();
-		break;
+		WARN(1, "queue type %d", type);
+		retval = -EINVAL;
+		goto err_invalid;
 	}
 
 	if (reset)
@@ -636,16 +535,17 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 		break;
 	case KFD_PREEMPT_TYPE_FILTER_ALL_QUEUES:
 		packet->bitfields2.queue_sel =
-				queue_sel__mes_unmap_queues__perform_request_on_all_active_queues;
+				queue_sel__mes_unmap_queues__unmap_all_queues;
 		break;
 	case KFD_PREEMPT_TYPE_FILTER_DYNAMIC_QUEUES:
 		/* in this case, we do not preempt static queues */
 		packet->bitfields2.queue_sel =
-				queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only;
+				queue_sel__mes_unmap_queues__unmap_all_non_static_queues;
 		break;
 	default:
-		BUG();
-		break;
+		WARN(1, "filter %d", mode);
+		retval = -EINVAL;
+		goto err_invalid;
 	}
 
 	pm->priv_queue->ops.submit_packet(pm->priv_queue);
@@ -653,6 +553,8 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 	mutex_unlock(&pm->lock);
 	return 0;
 
+err_invalid:
+	pm->priv_queue->ops.rollback_packet(pm->priv_queue);
 err_acquire_packet_buffer:
 	mutex_unlock(&pm->lock);
 	return retval;
@@ -660,8 +562,6 @@ int pm_send_unmap_queue(struct packet_manager *pm, enum kfd_queue_type type,
 
 void pm_release_ib(struct packet_manager *pm)
 {
-	BUG_ON(!pm);
-
 	mutex_lock(&pm->lock);
 	if (pm->allocated) {
 		kfd_gtt_sa_free(pm->dqm->dev, pm->ib_buffer_obj);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
index 6cfe7f1..1e06de0 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pasid.c
@@ -32,7 +32,8 @@ int kfd_pasid_init(void)
 {
 	pasid_limit = KFD_MAX_NUM_OF_PROCESSES;
 
-	pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long), GFP_KERNEL);
+	pasid_bitmap = kcalloc(BITS_TO_LONGS(pasid_limit), sizeof(long),
+				GFP_KERNEL);
 	if (!pasid_bitmap)
 		return -ENOMEM;
 
@@ -91,6 +92,6 @@ unsigned int kfd_pasid_alloc(void)
 
 void kfd_pasid_free(unsigned int pasid)
 {
-	BUG_ON(pasid == 0 || pasid >= pasid_limit);
-	clear_bit(pasid, pasid_bitmap);
+	if (!WARN_ON(pasid == 0 || pasid >= pasid_limit))
+		clear_bit(pasid, pasid_bitmap);
 }
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
index 5b393f3..e50f73d 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers.h
@@ -28,112 +28,19 @@
 #define PM4_MES_HEADER_DEFINED
 union PM4_MES_TYPE_3_HEADER {
 	struct {
-		uint32_t reserved1:8;	/* < reserved */
-		uint32_t opcode:8;	/* < IT opcode */
-		uint32_t count:14;	/* < number of DWORDs - 1
-					 * in the information body.
-					 */
-		uint32_t type:2;	/* < packet identifier.
-					 * It should be 3 for type 3 packets
-					 */
+		/* reserved */
+		uint32_t reserved1:8;
+		/* IT opcode */
+		uint32_t opcode:8;
+		/* number of DWORDs - 1 in the information body */
+		uint32_t count:14;
+		/* packet identifier. It should be 3 for type 3 packets */
+		uint32_t type:2;
 	};
 	uint32_t u32all;
 };
 #endif /* PM4_MES_HEADER_DEFINED */
 
-/* --------------------MES_SET_RESOURCES-------------------- */
-
-#ifndef PM4_MES_SET_RESOURCES_DEFINED
-#define PM4_MES_SET_RESOURCES_DEFINED
-enum set_resources_queue_type_enum {
-	queue_type__mes_set_resources__kernel_interface_queue_kiq = 0,
-	queue_type__mes_set_resources__hsa_interface_queue_hiq = 1,
-	queue_type__mes_set_resources__hsa_debug_interface_queue = 4
-};
-
-struct pm4_set_resources {
-	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			uint32_t vmid_mask:16;
-			uint32_t unmap_latency:8;
-			uint32_t reserved1:5;
-			enum set_resources_queue_type_enum queue_type:3;
-		} bitfields2;
-		uint32_t ordinal2;
-	};
-
-	uint32_t queue_mask_lo;
-	uint32_t queue_mask_hi;
-	uint32_t gws_mask_lo;
-	uint32_t gws_mask_hi;
-
-	union {
-		struct {
-			uint32_t oac_mask:16;
-			uint32_t reserved2:16;
-		} bitfields7;
-		uint32_t ordinal7;
-	};
-
-	union {
-		struct {
-			uint32_t gds_heap_base:6;
-			uint32_t reserved3:5;
-			uint32_t gds_heap_size:6;
-			uint32_t reserved4:15;
-		} bitfields8;
-		uint32_t ordinal8;
-	};
-
-};
-#endif
-
-/*--------------------MES_RUN_LIST-------------------- */
-
-#ifndef PM4_MES_RUN_LIST_DEFINED
-#define PM4_MES_RUN_LIST_DEFINED
-
-struct pm4_runlist {
-	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			uint32_t reserved1:2;
-			uint32_t ib_base_lo:30;
-		} bitfields2;
-		uint32_t ordinal2;
-	};
-
-	union {
-		struct {
-			uint32_t ib_base_hi:16;
-			uint32_t reserved2:16;
-		} bitfields3;
-		uint32_t ordinal3;
-	};
-
-	union {
-		struct {
-			uint32_t ib_size:20;
-			uint32_t chain:1;
-			uint32_t offload_polling:1;
-			uint32_t reserved3:1;
-			uint32_t valid:1;
-			uint32_t reserved4:8;
-		} bitfields4;
-		uint32_t ordinal4;
-	};
-
-};
-#endif
 
 /*--------------------MES_MAP_PROCESS-------------------- */
 
@@ -186,217 +93,58 @@ struct pm4_map_process {
 };
 #endif
 
-/*--------------------MES_MAP_QUEUES--------------------*/
+#ifndef PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
+#define PM4_MES_MAP_PROCESS_DEFINED_KV_SCRATCH
 
-#ifndef PM4_MES_MAP_QUEUES_DEFINED
-#define PM4_MES_MAP_QUEUES_DEFINED
-enum map_queues_queue_sel_enum {
-	queue_sel__mes_map_queues__map_to_specified_queue_slots = 0,
-	queue_sel__mes_map_queues__map_to_hws_determined_queue_slots = 1,
-	queue_sel__mes_map_queues__enable_process_queues = 2
-};
-
-enum map_queues_vidmem_enum {
-	vidmem__mes_map_queues__uses_no_video_memory = 0,
-	vidmem__mes_map_queues__uses_video_memory = 1
-};
-
-enum map_queues_alloc_format_enum {
-	alloc_format__mes_map_queues__one_per_pipe = 0,
-	alloc_format__mes_map_queues__all_on_one_pipe = 1
-};
-
-enum map_queues_engine_sel_enum {
-	engine_sel__mes_map_queues__compute = 0,
-	engine_sel__mes_map_queues__sdma0 = 2,
-	engine_sel__mes_map_queues__sdma1 = 3
-};
-
-struct pm4_map_queues {
+struct pm4_map_process_scratch_kv {
 	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			uint32_t reserved1:4;
-			enum map_queues_queue_sel_enum queue_sel:2;
-			uint32_t reserved2:2;
-			uint32_t vmid:4;
-			uint32_t reserved3:4;
-			enum map_queues_vidmem_enum vidmem:2;
-			uint32_t reserved4:6;
-			enum map_queues_alloc_format_enum alloc_format:2;
-			enum map_queues_engine_sel_enum engine_sel:3;
-			uint32_t num_queues:3;
-		} bitfields2;
-		uint32_t ordinal2;
-	};
-
-	struct {
-		union {
-			struct {
-				uint32_t is_static:1;
-				uint32_t reserved5:1;
-				uint32_t doorbell_offset:21;
-				uint32_t reserved6:3;
-				uint32_t queue:6;
-			} bitfields3;
-			uint32_t ordinal3;
-		};
-
-		uint32_t mqd_addr_lo;
-		uint32_t mqd_addr_hi;
-		uint32_t wptr_addr_lo;
-		uint32_t wptr_addr_hi;
-
-	} mes_map_queues_ordinals[1];	/* 1..N of these ordinal groups */
-
-};
-#endif
-
-/*--------------------MES_QUERY_STATUS--------------------*/
-
-#ifndef PM4_MES_QUERY_STATUS_DEFINED
-#define PM4_MES_QUERY_STATUS_DEFINED
-enum query_status_interrupt_sel_enum {
-	interrupt_sel__mes_query_status__completion_status = 0,
-	interrupt_sel__mes_query_status__process_status = 1,
-	interrupt_sel__mes_query_status__queue_status = 2
-};
-
-enum query_status_command_enum {
-	command__mes_query_status__interrupt_only = 0,
-	command__mes_query_status__fence_only_immediate = 1,
-	command__mes_query_status__fence_only_after_write_ack = 2,
-	command__mes_query_status__fence_wait_for_write_ack_send_interrupt = 3
-};
-
-enum query_status_engine_sel_enum {
-	engine_sel__mes_query_status__compute = 0,
-	engine_sel__mes_query_status__sdma0_queue = 2,
-	engine_sel__mes_query_status__sdma1_queue = 3
-};
-
-struct pm4_query_status {
-	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			uint32_t context_id:28;
-			enum query_status_interrupt_sel_enum interrupt_sel:2;
-			enum query_status_command_enum command:2;
-		} bitfields2;
-		uint32_t ordinal2;
+		union PM4_MES_TYPE_3_HEADER   header; /* header */
+		uint32_t            ordinal1;
 	};
 
 	union {
 		struct {
 			uint32_t pasid:16;
-			uint32_t reserved1:16;
-		} bitfields3a;
-		struct {
-			uint32_t reserved2:2;
-			uint32_t doorbell_offset:21;
-			uint32_t reserved3:3;
-			enum query_status_engine_sel_enum engine_sel:3;
-			uint32_t reserved4:3;
-		} bitfields3b;
-		uint32_t ordinal3;
-	};
-
-	uint32_t addr_lo;
-	uint32_t addr_hi;
-	uint32_t data_lo;
-	uint32_t data_hi;
-};
-#endif
-
-/*--------------------MES_UNMAP_QUEUES--------------------*/
-
-#ifndef PM4_MES_UNMAP_QUEUES_DEFINED
-#define PM4_MES_UNMAP_QUEUES_DEFINED
-enum unmap_queues_action_enum {
-	action__mes_unmap_queues__preempt_queues = 0,
-	action__mes_unmap_queues__reset_queues = 1,
-	action__mes_unmap_queues__disable_process_queues = 2
-};
-
-enum unmap_queues_queue_sel_enum {
-	queue_sel__mes_unmap_queues__perform_request_on_specified_queues = 0,
-	queue_sel__mes_unmap_queues__perform_request_on_pasid_queues = 1,
-	queue_sel__mes_unmap_queues__perform_request_on_all_active_queues = 2,
-	queue_sel__mes_unmap_queues__perform_request_on_dynamic_queues_only = 3
-};
-
-enum unmap_queues_engine_sel_enum {
-	engine_sel__mes_unmap_queues__compute = 0,
-	engine_sel__mes_unmap_queues__sdma0 = 2,
-	engine_sel__mes_unmap_queues__sdma1 = 3
-};
-
-struct pm4_unmap_queues {
-	union {
-		union PM4_MES_TYPE_3_HEADER header;	/* header */
-		uint32_t ordinal1;
-	};
-
-	union {
-		struct {
-			enum unmap_queues_action_enum action:2;
-			uint32_t reserved1:2;
-			enum unmap_queues_queue_sel_enum queue_sel:2;
-			uint32_t reserved2:20;
-			enum unmap_queues_engine_sel_enum engine_sel:3;
-			uint32_t num_queues:3;
+			uint32_t reserved1:8;
+			uint32_t diq_enable:1;
+			uint32_t process_quantum:7;
 		} bitfields2;
 		uint32_t ordinal2;
 	};
 
 	union {
 		struct {
-			uint32_t pasid:16;
-			uint32_t reserved3:16;
-		} bitfields3a;
-		struct {
-			uint32_t reserved4:2;
-			uint32_t doorbell_offset0:21;
-			uint32_t reserved5:9;
-		} bitfields3b;
+			uint32_t page_table_base:28;
+			uint32_t reserved2:4;
+		} bitfields3;
 		uint32_t ordinal3;
 	};
 
+	uint32_t reserved3;
+	uint32_t sh_mem_bases;
+	uint32_t sh_mem_config;
+	uint32_t sh_mem_ape1_base;
+	uint32_t sh_mem_ape1_limit;
+	uint32_t sh_hidden_private_base_vmid;
+	uint32_t reserved4;
+	uint32_t reserved5;
+	uint32_t gds_addr_lo;
+	uint32_t gds_addr_hi;
+
 	union {
 		struct {
+			uint32_t num_gws:6;
 			uint32_t reserved6:2;
-			uint32_t doorbell_offset1:21;
-			uint32_t reserved7:9;
-		} bitfields4;
-		uint32_t ordinal4;
+			uint32_t num_oac:4;
+			uint32_t reserved7:4;
+			uint32_t gds_size:6;
+			uint32_t num_queues:10;
+		} bitfields14;
+		uint32_t ordinal14;
 	};
 
-	union {
-		struct {
-			uint32_t reserved8:2;
-			uint32_t doorbell_offset2:21;
-			uint32_t reserved9:9;
-		} bitfields5;
-		uint32_t ordinal5;
-	};
-
-	union {
-		struct {
-			uint32_t reserved10:2;
-			uint32_t doorbell_offset3:21;
-			uint32_t reserved11:9;
-		} bitfields6;
-		uint32_t ordinal6;
-	};
-
+	uint32_t completion_signal_lo32;
+uint32_t completion_signal_hi32;
 };
 #endif
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
index 08c7219..7c8d9b3 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_pm4_headers_vi.h
@@ -30,10 +30,12 @@ union PM4_MES_TYPE_3_HEADER {
 	struct {
 		uint32_t reserved1 : 8; /* < reserved */
 		uint32_t opcode    : 8; /* < IT opcode */
-		uint32_t count     : 14;/* < number of DWORDs - 1 in the
-		information body. */
-		uint32_t type      : 2; /* < packet identifier.
-					It should be 3 for type 3 packets */
+		uint32_t count     : 14;/* < Number of DWORDS - 1 in the
+					 *   information body
+					 */
+		uint32_t type      : 2; /* < packet identifier
+					 *   It should be 3 for type 3 packets
+					 */
 	};
 	uint32_t u32All;
 };
@@ -124,9 +126,10 @@ struct pm4_mes_runlist {
 			uint32_t ib_size:20;
 			uint32_t chain:1;
 			uint32_t offload_polling:1;
-			uint32_t reserved3:1;
+			uint32_t reserved2:1;
 			uint32_t valid:1;
-			uint32_t reserved4:8;
+			uint32_t process_cnt:4;
+			uint32_t reserved3:4;
 		} bitfields4;
 		uint32_t ordinal4;
 	};
@@ -141,8 +144,8 @@ struct pm4_mes_runlist {
 
 struct pm4_mes_map_process {
 	union {
-		union PM4_MES_TYPE_3_HEADER   header;            /* header */
-		uint32_t            ordinal1;
+		union PM4_MES_TYPE_3_HEADER header;	/* header */
+		uint32_t ordinal1;
 	};
 
 	union {
@@ -153,36 +156,48 @@ struct pm4_mes_map_process {
 			uint32_t process_quantum:7;
 		} bitfields2;
 		uint32_t ordinal2;
-};
+	};
 
 	union {
 		struct {
 			uint32_t page_table_base:28;
-			uint32_t reserved2:4;
+			uint32_t reserved3:4;
 		} bitfields3;
 		uint32_t ordinal3;
 	};
 
+	uint32_t reserved;
+
 	uint32_t sh_mem_bases;
+	uint32_t sh_mem_config;
 	uint32_t sh_mem_ape1_base;
 	uint32_t sh_mem_ape1_limit;
-	uint32_t sh_mem_config;
+
+	uint32_t sh_hidden_private_base_vmid;
+
+	uint32_t reserved2;
+	uint32_t reserved3;
+
 	uint32_t gds_addr_lo;
 	uint32_t gds_addr_hi;
 
 	union {
 		struct {
 			uint32_t num_gws:6;
-			uint32_t reserved3:2;
+			uint32_t reserved4:2;
 			uint32_t num_oac:4;
-			uint32_t reserved4:4;
+			uint32_t reserved5:4;
 			uint32_t gds_size:6;
 			uint32_t num_queues:10;
 		} bitfields10;
 		uint32_t ordinal10;
 	};
 
+	uint32_t completion_signal_lo;
+	uint32_t completion_signal_hi;
+
 };
+
 #endif
 
 /*--------------------MES_MAP_QUEUES--------------------*/
@@ -335,7 +350,7 @@ enum mes_unmap_queues_engine_sel_enum {
 	engine_sel__mes_unmap_queues__sdmal = 3
 };
 
-struct PM4_MES_UNMAP_QUEUES {
+struct pm4_mes_unmap_queues {
 	union {
 		union PM4_MES_TYPE_3_HEADER   header;            /* header */
 		uint32_t            ordinal1;
@@ -395,4 +410,101 @@ struct PM4_MES_UNMAP_QUEUES {
 };
 #endif
 
+#ifndef PM4_MEC_RELEASE_MEM_DEFINED
+#define PM4_MEC_RELEASE_MEM_DEFINED
+enum RELEASE_MEM_event_index_enum {
+	event_index___release_mem__end_of_pipe = 5,
+	event_index___release_mem__shader_done = 6
+};
+
+enum RELEASE_MEM_cache_policy_enum {
+	cache_policy___release_mem__lru = 0,
+	cache_policy___release_mem__stream = 1,
+	cache_policy___release_mem__bypass = 2
+};
+
+enum RELEASE_MEM_dst_sel_enum {
+	dst_sel___release_mem__memory_controller = 0,
+	dst_sel___release_mem__tc_l2 = 1,
+	dst_sel___release_mem__queue_write_pointer_register = 2,
+	dst_sel___release_mem__queue_write_pointer_poll_mask_bit = 3
+};
+
+enum RELEASE_MEM_int_sel_enum {
+	int_sel___release_mem__none = 0,
+	int_sel___release_mem__send_interrupt_only = 1,
+	int_sel___release_mem__send_interrupt_after_write_confirm = 2,
+	int_sel___release_mem__send_data_after_write_confirm = 3
+};
+
+enum RELEASE_MEM_data_sel_enum {
+	data_sel___release_mem__none = 0,
+	data_sel___release_mem__send_32_bit_low = 1,
+	data_sel___release_mem__send_64_bit_data = 2,
+	data_sel___release_mem__send_gpu_clock_counter = 3,
+	data_sel___release_mem__send_cp_perfcounter_hi_lo = 4,
+	data_sel___release_mem__store_gds_data_to_memory = 5
+};
+
+struct pm4_mec_release_mem {
+	union {
+		union PM4_MES_TYPE_3_HEADER header;     /*header */
+		unsigned int ordinal1;
+	};
+
+	union {
+		struct {
+			unsigned int event_type:6;
+			unsigned int reserved1:2;
+			enum RELEASE_MEM_event_index_enum event_index:4;
+			unsigned int tcl1_vol_action_ena:1;
+			unsigned int tc_vol_action_ena:1;
+			unsigned int reserved2:1;
+			unsigned int tc_wb_action_ena:1;
+			unsigned int tcl1_action_ena:1;
+			unsigned int tc_action_ena:1;
+			unsigned int reserved3:6;
+			unsigned int atc:1;
+			enum RELEASE_MEM_cache_policy_enum cache_policy:2;
+			unsigned int reserved4:5;
+		} bitfields2;
+		unsigned int ordinal2;
+	};
+
+	union {
+		struct {
+			unsigned int reserved5:16;
+			enum RELEASE_MEM_dst_sel_enum dst_sel:2;
+			unsigned int reserved6:6;
+			enum RELEASE_MEM_int_sel_enum int_sel:3;
+			unsigned int reserved7:2;
+			enum RELEASE_MEM_data_sel_enum data_sel:3;
+		} bitfields3;
+		unsigned int ordinal3;
+	};
+
+	union {
+		struct {
+			unsigned int reserved8:2;
+			unsigned int address_lo_32b:30;
+		} bitfields4;
+		struct {
+			unsigned int reserved9:3;
+			unsigned int address_lo_64b:29;
+		} bitfields5;
+		unsigned int ordinal4;
+	};
+
+	unsigned int address_hi;
+
+	unsigned int data_lo;
+
+	unsigned int data_hi;
+};
+#endif
+
+enum {
+	CACHE_FLUSH_AND_INV_TS_EVENT = 0x00000014
+};
+
 #endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 4750cabe4..b397ec7 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -239,11 +239,6 @@ enum kfd_preempt_type_filter {
 	KFD_PREEMPT_TYPE_FILTER_BY_PASID
 };
 
-enum kfd_preempt_type {
-	KFD_PREEMPT_TYPE_WAVEFRONT,
-	KFD_PREEMPT_TYPE_WAVEFRONT_RESET
-};
-
 /**
  * enum kfd_queue_type
  *
@@ -294,13 +289,13 @@ enum kfd_queue_format {
  * @write_ptr: Defines the number of dwords written to the ring buffer.
  *
  * @doorbell_ptr: This field aim is to notify the H/W of new packet written to
- * the queue ring buffer. This field should be similar to write_ptr and the user
- * should update this field after he updated the write_ptr.
+ * the queue ring buffer. This field should be similar to write_ptr and the
+ * user should update this field after he updated the write_ptr.
  *
  * @doorbell_off: The doorbell offset in the doorbell pci-bar.
  *
- * @is_interop: Defines if this is a interop queue. Interop queue means that the
- * queue can access both graphics and compute resources.
+ * @is_interop: Defines if this is an interop queue. Interop queue means that
+ * the queue can access both graphics and compute resources.
  *
  * @is_active: Defines if the queue is active or not.
  *
@@ -352,9 +347,10 @@ struct queue_properties {
  * @properties: The queue properties.
  *
  * @mec: Used only in no cp scheduling mode and identifies to micro engine id
- * that the queue should be execute on.
+ *	 that the queue should be executed on.
  *
- * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe id.
+ * @pipe: Used only in no cp scheduling mode and identifies the queue's pipe
+ *	  id.
  *
  * @queue: Used only in no cp scheduliong mode and identifies the queue's slot.
  *
@@ -436,6 +432,7 @@ struct qcm_process_device {
 	uint32_t gds_size;
 	uint32_t num_gws;
 	uint32_t num_oac;
+	uint32_t sh_hidden_private_base;
 };
 
 /* Data that is per-process-per device. */
@@ -520,8 +517,8 @@ struct kfd_process {
 	struct mutex event_mutex;
 	/* All events in process hashed by ID, linked on kfd_event.events. */
 	DECLARE_HASHTABLE(events, 4);
-	struct list_head signal_event_pages;	/* struct slot_page_header.
-								event_pages */
+	/* struct slot_page_header.event_pages */
+	struct list_head signal_event_pages;
 	u32 next_nonsignal_event_id;
 	size_t signal_event_count;
 };
@@ -559,8 +556,10 @@ struct kfd_process_device *kfd_create_process_device_data(struct kfd_dev *dev,
 							struct kfd_process *p);
 
 /* Process device data iterator */
-struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p);
-struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p,
+struct kfd_process_device *kfd_get_first_process_device_data(
+							struct kfd_process *p);
+struct kfd_process_device *kfd_get_next_process_device_data(
+						struct kfd_process *p,
 						struct kfd_process_device *pdd);
 bool kfd_has_process_device_data(struct kfd_process *p);
 
@@ -573,7 +572,8 @@ unsigned int kfd_pasid_alloc(void);
 void kfd_pasid_free(unsigned int pasid);
 
 /* Doorbells */
-void kfd_doorbell_init(struct kfd_dev *kfd);
+int kfd_doorbell_init(struct kfd_dev *kfd);
+void kfd_doorbell_fini(struct kfd_dev *kfd);
 int kfd_doorbell_mmap(struct kfd_process *process, struct vm_area_struct *vma);
 u32 __iomem *kfd_get_kernel_doorbell(struct kfd_dev *kfd,
 					unsigned int *doorbell_off);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 035bbc9..c74cf22 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -79,9 +79,7 @@ struct kfd_process *kfd_create_process(const struct task_struct *thread)
 {
 	struct kfd_process *process;
 
-	BUG_ON(!kfd_process_wq);
-
-	if (thread->mm == NULL)
+	if (!thread->mm)
 		return ERR_PTR(-EINVAL);
 
 	/* Only the pthreads threading model is supported. */
@@ -101,7 +99,7 @@ struct kfd_process *kfd_create_process(const struct task_struct *thread)
 	/* A prior open of /dev/kfd could have already created the process. */
 	process = find_process(thread);
 	if (process)
-		pr_debug("kfd: process already found\n");
+		pr_debug("Process already found\n");
 
 	if (!process)
 		process = create_process(thread);
@@ -117,7 +115,7 @@ struct kfd_process *kfd_get_process(const struct task_struct *thread)
 {
 	struct kfd_process *process;
 
-	if (thread->mm == NULL)
+	if (!thread->mm)
 		return ERR_PTR(-EINVAL);
 
 	/* Only the pthreads threading model is supported. */
@@ -202,10 +200,8 @@ static void kfd_process_destroy_delayed(struct rcu_head *rcu)
 	struct kfd_process_release_work *work;
 	struct kfd_process *p;
 
-	BUG_ON(!kfd_process_wq);
-
 	p = container_of(rcu, struct kfd_process, rcu);
-	BUG_ON(atomic_read(&p->mm->mm_count) <= 0);
+	WARN_ON(atomic_read(&p->mm->mm_count) <= 0);
 
 	mmdrop(p->mm);
 
@@ -229,7 +225,8 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
 	 * mmu_notifier srcu is read locked
 	 */
 	p = container_of(mn, struct kfd_process, mmu_notifier);
-	BUG_ON(p->mm != mm);
+	if (WARN_ON(p->mm != mm))
+		return;
 
 	mutex_lock(&kfd_processes_mutex);
 	hash_del_rcu(&p->kfd_processes);
@@ -250,7 +247,7 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
 			kfd_dbgmgr_destroy(pdd->dev->dbgmgr);
 
 		if (pdd->reset_wavefronts) {
-			pr_warn("amdkfd: Resetting all wave fronts\n");
+			pr_warn("Resetting all wave fronts\n");
 			dbgdev_wave_reset_wavefronts(pdd->dev, p);
 			pdd->reset_wavefronts = false;
 		}
@@ -407,8 +404,6 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
 	struct kfd_process *p;
 	struct kfd_process_device *pdd;
 
-	BUG_ON(dev == NULL);
-
 	/*
 	 * Look for the process that matches the pasid. If there is no such
 	 * process, we either released it in amdkfd's own notifier, or there
@@ -449,14 +444,16 @@ void kfd_unbind_process_from_device(struct kfd_dev *dev, unsigned int pasid)
 	mutex_unlock(&p->mutex);
 }
 
-struct kfd_process_device *kfd_get_first_process_device_data(struct kfd_process *p)
+struct kfd_process_device *kfd_get_first_process_device_data(
+						struct kfd_process *p)
 {
 	return list_first_entry(&p->per_device_data,
 				struct kfd_process_device,
 				per_device_list);
 }
 
-struct kfd_process_device *kfd_get_next_process_device_data(struct kfd_process *p,
+struct kfd_process_device *kfd_get_next_process_device_data(
+						struct kfd_process *p,
 						struct kfd_process_device *pdd)
 {
 	if (list_is_last(&pdd->per_device_list, &p->per_device_data))
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 32cdf2b..1cae95e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -32,12 +32,9 @@ static inline struct process_queue_node *get_queue_by_qid(
 {
 	struct process_queue_node *pqn;
 
-	BUG_ON(!pqm);
-
 	list_for_each_entry(pqn, &pqm->queues, process_queue_list) {
-		if (pqn->q && pqn->q->properties.queue_id == qid)
-			return pqn;
-		if (pqn->kq && pqn->kq->queue->properties.queue_id == qid)
+		if ((pqn->q && pqn->q->properties.queue_id == qid) ||
+		    (pqn->kq && pqn->kq->queue->properties.queue_id == qid))
 			return pqn;
 	}
 
@@ -49,17 +46,13 @@ static int find_available_queue_slot(struct process_queue_manager *pqm,
 {
 	unsigned long found;
 
-	BUG_ON(!pqm || !qid);
-
-	pr_debug("kfd: in %s\n", __func__);
-
 	found = find_first_zero_bit(pqm->queue_slot_bitmap,
 			KFD_MAX_NUM_OF_QUEUES_PER_PROCESS);
 
-	pr_debug("kfd: the new slot id %lu\n", found);
+	pr_debug("The new slot id %lu\n", found);
 
 	if (found >= KFD_MAX_NUM_OF_QUEUES_PER_PROCESS) {
-		pr_info("amdkfd: Can not open more queues for process with pasid %d\n",
+		pr_info("Cannot open more queues for process with pasid %d\n",
 				pqm->process->pasid);
 		return -ENOMEM;
 	}
@@ -72,13 +65,11 @@ static int find_available_queue_slot(struct process_queue_manager *pqm,
 
 int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p)
 {
-	BUG_ON(!pqm);
-
 	INIT_LIST_HEAD(&pqm->queues);
 	pqm->queue_slot_bitmap =
 			kzalloc(DIV_ROUND_UP(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
 					BITS_PER_BYTE), GFP_KERNEL);
-	if (pqm->queue_slot_bitmap == NULL)
+	if (!pqm->queue_slot_bitmap)
 		return -ENOMEM;
 	pqm->process = p;
 
@@ -90,10 +81,6 @@ void pqm_uninit(struct process_queue_manager *pqm)
 	int retval;
 	struct process_queue_node *pqn, *next;
 
-	BUG_ON(!pqm);
-
-	pr_debug("In func %s\n", __func__);
-
 	list_for_each_entry_safe(pqn, next, &pqm->queues, process_queue_list) {
 		retval = pqm_destroy_queue(
 				pqm,
@@ -102,7 +89,7 @@ void pqm_uninit(struct process_queue_manager *pqm)
 					pqn->kq->queue->properties.queue_id);
 
 		if (retval != 0) {
-			pr_err("kfd: failed to destroy queue\n");
+			pr_err("failed to destroy queue\n");
 			return;
 		}
 	}
@@ -117,8 +104,6 @@ static int create_cp_queue(struct process_queue_manager *pqm,
 {
 	int retval;
 
-	retval = 0;
-
 	/* Doorbell initialized in user space*/
 	q_properties->doorbell_ptr = NULL;
 
@@ -131,17 +116,14 @@ static int create_cp_queue(struct process_queue_manager *pqm,
 
 	retval = init_queue(q, q_properties);
 	if (retval != 0)
-		goto err_init_queue;
+		return retval;
 
 	(*q)->device = dev;
 	(*q)->process = pqm->process;
 
-	pr_debug("kfd: PQM After init queue");
+	pr_debug("PQM After init queue");
 
 	return retval;
-
-err_init_queue:
-	return retval;
 }
 
 int pqm_create_queue(struct process_queue_manager *pqm,
@@ -161,8 +143,6 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 	int num_queues = 0;
 	struct queue *cur;
 
-	BUG_ON(!pqm || !dev || !properties || !qid);
-
 	memset(&q_properties, 0, sizeof(struct queue_properties));
 	memcpy(&q_properties, properties, sizeof(struct queue_properties));
 	q = NULL;
@@ -185,7 +165,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 		list_for_each_entry(cur, &pdd->qpd.queues_list, list)
 			num_queues++;
 		if (num_queues >= dev->device_info->max_no_of_hqd/2)
-			return (-ENOSPC);
+			return -ENOSPC;
 	}
 
 	retval = find_available_queue_slot(pqm, qid);
@@ -197,7 +177,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 		dev->dqm->ops.register_process(dev->dqm, &pdd->qpd);
 	}
 
-	pqn = kzalloc(sizeof(struct process_queue_node), GFP_KERNEL);
+	pqn = kzalloc(sizeof(*pqn), GFP_KERNEL);
 	if (!pqn) {
 		retval = -ENOMEM;
 		goto err_allocate_pqn;
@@ -210,7 +190,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 		if ((sched_policy == KFD_SCHED_POLICY_HWS_NO_OVERSUBSCRIPTION) &&
 		((dev->dqm->processes_count >= VMID_PER_DEVICE) ||
 		(dev->dqm->queue_count >= get_queues_num(dev->dqm)))) {
-			pr_err("kfd: over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
+			pr_err("Over-subscription is not allowed in radeon_kfd.sched_policy == 1\n");
 			retval = -EPERM;
 			goto err_create_queue;
 		}
@@ -227,7 +207,7 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 		break;
 	case KFD_QUEUE_TYPE_DIQ:
 		kq = kernel_queue_init(dev, KFD_QUEUE_TYPE_DIQ);
-		if (kq == NULL) {
+		if (!kq) {
 			retval = -ENOMEM;
 			goto err_create_queue;
 		}
@@ -238,22 +218,22 @@ int pqm_create_queue(struct process_queue_manager *pqm,
 							kq, &pdd->qpd);
 		break;
 	default:
-		BUG();
-		break;
+		WARN(1, "Invalid queue type %d", type);
+		retval = -EINVAL;
 	}
 
 	if (retval != 0) {
-		pr_debug("Error dqm create queue\n");
+		pr_err("DQM create queue failed\n");
 		goto err_create_queue;
 	}
 
-	pr_debug("kfd: PQM After DQM create queue\n");
+	pr_debug("PQM After DQM create queue\n");
 
 	list_add(&pqn->process_queue_list, &pqm->queues);
 
 	if (q) {
 		*properties = q->properties;
-		pr_debug("kfd: PQM done creating queue\n");
+		pr_debug("PQM done creating queue\n");
 		print_queue_properties(properties);
 	}
 
@@ -279,14 +259,11 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 
 	dqm = NULL;
 
-	BUG_ON(!pqm);
 	retval = 0;
 
-	pr_debug("kfd: In Func %s\n", __func__);
-
 	pqn = get_queue_by_qid(pqm, qid);
-	if (pqn == NULL) {
-		pr_err("kfd: queue id does not match any known queue\n");
+	if (!pqn) {
+		pr_err("Queue id does not match any known queue\n");
 		return -EINVAL;
 	}
 
@@ -295,7 +272,8 @@ int pqm_destroy_queue(struct process_queue_manager *pqm, unsigned int qid)
 		dev = pqn->kq->dev;
 	if (pqn->q)
 		dev = pqn->q->device;
-	BUG_ON(!dev);
+	if (WARN_ON(!dev))
+		return -ENODEV;
 
 	pdd = kfd_get_process_device_data(dev, pqm->process);
 	if (!pdd) {
@@ -335,12 +313,9 @@ int pqm_update_queue(struct process_queue_manager *pqm, unsigned int qid,
 	int retval;
 	struct process_queue_node *pqn;
 
-	BUG_ON(!pqm);
-
 	pqn = get_queue_by_qid(pqm, qid);
 	if (!pqn) {
-		pr_debug("amdkfd: No queue %d exists for update operation\n",
-				qid);
+		pr_debug("No queue %d exists for update operation\n", qid);
 		return -EFAULT;
 	}
 
@@ -363,8 +338,6 @@ struct kernel_queue *pqm_get_kernel_queue(
 {
 	struct process_queue_node *pqn;
 
-	BUG_ON(!pqm);
-
 	pqn = get_queue_by_qid(pqm, qid);
 	if (pqn && pqn->kq)
 		return pqn->kq;
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
index 0ab1970..a5315d4 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_queue.c
@@ -65,17 +65,15 @@ void print_queue(struct queue *q)
 
 int init_queue(struct queue **q, const struct queue_properties *properties)
 {
-	struct queue *tmp;
+	struct queue *tmp_q;
 
-	BUG_ON(!q);
-
-	tmp = kzalloc(sizeof(struct queue), GFP_KERNEL);
-	if (!tmp)
+	tmp_q = kzalloc(sizeof(*tmp_q), GFP_KERNEL);
+	if (!tmp_q)
 		return -ENOMEM;
 
-	memcpy(&tmp->properties, properties, sizeof(struct queue_properties));
+	memcpy(&tmp_q->properties, properties, sizeof(*properties));
 
-	*q = tmp;
+	*q = tmp_q;
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
index 1e50647..19ce590 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_topology.c
@@ -108,9 +108,6 @@ static int kfd_topology_get_crat_acpi(void *crat_image, size_t *size)
 static void kfd_populated_cu_info_cpu(struct kfd_topology_device *dev,
 		struct crat_subtype_computeunit *cu)
 {
-	BUG_ON(!dev);
-	BUG_ON(!cu);
-
 	dev->node_props.cpu_cores_count = cu->num_cpu_cores;
 	dev->node_props.cpu_core_id_base = cu->processor_id_low;
 	if (cu->hsa_capability & CRAT_CU_FLAGS_IOMMU_PRESENT)
@@ -123,9 +120,6 @@ static void kfd_populated_cu_info_cpu(struct kfd_topology_device *dev,
 static void kfd_populated_cu_info_gpu(struct kfd_topology_device *dev,
 		struct crat_subtype_computeunit *cu)
 {
-	BUG_ON(!dev);
-	BUG_ON(!cu);
-
 	dev->node_props.simd_id_base = cu->processor_id_low;
 	dev->node_props.simd_count = cu->num_simd_cores;
 	dev->node_props.lds_size_in_kb = cu->lds_size_in_kb;
@@ -148,8 +142,6 @@ static int kfd_parse_subtype_cu(struct crat_subtype_computeunit *cu)
 	struct kfd_topology_device *dev;
 	int i = 0;
 
-	BUG_ON(!cu);
-
 	pr_info("Found CU entry in CRAT table with proximity_domain=%d caps=%x\n",
 			cu->proximity_domain, cu->hsa_capability);
 	list_for_each_entry(dev, &topology_device_list, list) {
@@ -177,8 +169,6 @@ static int kfd_parse_subtype_mem(struct crat_subtype_memory *mem)
 	struct kfd_topology_device *dev;
 	int i = 0;
 
-	BUG_ON(!mem);
-
 	pr_info("Found memory entry in CRAT table with proximity_domain=%d\n",
 			mem->promixity_domain);
 	list_for_each_entry(dev, &topology_device_list, list) {
@@ -223,8 +213,6 @@ static int kfd_parse_subtype_cache(struct crat_subtype_cache *cache)
 	struct kfd_topology_device *dev;
 	uint32_t id;
 
-	BUG_ON(!cache);
-
 	id = cache->processor_id_low;
 
 	pr_info("Found cache entry in CRAT table with processor_id=%d\n", id);
@@ -274,8 +262,6 @@ static int kfd_parse_subtype_iolink(struct crat_subtype_iolink *iolink)
 	uint32_t id_from;
 	uint32_t id_to;
 
-	BUG_ON(!iolink);
-
 	id_from = iolink->proximity_domain_from;
 	id_to = iolink->proximity_domain_to;
 
@@ -323,8 +309,6 @@ static int kfd_parse_subtype(struct crat_subtype_generic *sub_type_hdr)
 	struct crat_subtype_iolink *iolink;
 	int ret = 0;
 
-	BUG_ON(!sub_type_hdr);
-
 	switch (sub_type_hdr->type) {
 	case CRAT_SUBTYPE_COMPUTEUNIT_AFFINITY:
 		cu = (struct crat_subtype_computeunit *)sub_type_hdr;
@@ -368,8 +352,6 @@ static void kfd_release_topology_device(struct kfd_topology_device *dev)
 	struct kfd_cache_properties *cache;
 	struct kfd_iolink_properties *iolink;
 
-	BUG_ON(!dev);
-
 	list_del(&dev->list);
 
 	while (dev->mem_props.next != &dev->mem_props) {
@@ -416,7 +398,7 @@ static struct kfd_topology_device *kfd_create_topology_device(void)
 	struct kfd_topology_device *dev;
 
 	dev = kfd_alloc_struct(dev);
-	if (dev == NULL) {
+	if (!dev) {
 		pr_err("No memory to allocate a topology device");
 		return NULL;
 	}
@@ -666,7 +648,7 @@ static ssize_t node_show(struct kobject *kobj, struct attribute *attr,
 			dev->node_props.simd_count);
 
 	if (dev->mem_bank_count < dev->node_props.mem_banks_count) {
-		pr_info_once("kfd: mem_banks_count truncated from %d to %d\n",
+		pr_info_once("mem_banks_count truncated from %d to %d\n",
 				dev->node_props.mem_banks_count,
 				dev->mem_bank_count);
 		sysfs_show_32bit_prop(buffer, "mem_banks_count",
@@ -763,8 +745,6 @@ static void kfd_remove_sysfs_node_entry(struct kfd_topology_device *dev)
 	struct kfd_cache_properties *cache;
 	struct kfd_mem_properties *mem;
 
-	BUG_ON(!dev);
-
 	if (dev->kobj_iolink) {
 		list_for_each_entry(iolink, &dev->io_link_props, list)
 			if (iolink->kobj) {
@@ -819,12 +799,12 @@ static int kfd_build_sysfs_node_entry(struct kfd_topology_device *dev,
 	int ret;
 	uint32_t i;
 
-	BUG_ON(!dev);
+	if (WARN_ON(dev->kobj_node))
+		return -EEXIST;
 
 	/*
 	 * Creating the sysfs folders
 	 */
-	BUG_ON(dev->kobj_node);
 	dev->kobj_node = kfd_alloc_struct(dev->kobj_node);
 	if (!dev->kobj_node)
 		return -ENOMEM;
@@ -957,7 +937,7 @@ static int kfd_topology_update_sysfs(void)
 	int ret;
 
 	pr_info("Creating topology SYSFS entries\n");
-	if (sys_props.kobj_topology == NULL) {
+	if (!sys_props.kobj_topology) {
 		sys_props.kobj_topology =
 				kfd_alloc_struct(sys_props.kobj_topology);
 		if (!sys_props.kobj_topology)
@@ -1117,10 +1097,8 @@ static struct kfd_topology_device *kfd_assign_gpu(struct kfd_dev *gpu)
 	struct kfd_topology_device *dev;
 	struct kfd_topology_device *out_dev = NULL;
 
-	BUG_ON(!gpu);
-
 	list_for_each_entry(dev, &topology_device_list, list)
-		if (dev->gpu == NULL && dev->node_props.simd_count > 0) {
+		if (!dev->gpu && (dev->node_props.simd_count > 0)) {
 			dev->gpu = gpu;
 			out_dev = dev;
 			break;
@@ -1143,11 +1121,9 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
 	struct kfd_topology_device *dev;
 	int res;
 
-	BUG_ON(!gpu);
-
 	gpu_id = kfd_generate_gpu_id(gpu);
 
-	pr_debug("kfd: Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
+	pr_debug("Adding new GPU (ID: 0x%x) to topology\n", gpu_id);
 
 	down_write(&topology_lock);
 	/*
@@ -1170,8 +1146,8 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
 		 * GPU vBIOS
 		 */
 
-		/*
-		 * Update the SYSFS tree, since we added another topology device
+		/* Update the SYSFS tree, since we added another topology
+		 * device
 		 */
 		if (kfd_topology_update_sysfs() < 0)
 			kfd_topology_release_sysfs();
@@ -1190,7 +1166,7 @@ int kfd_topology_add_device(struct kfd_dev *gpu)
 
 	if (dev->gpu->device_info->asic_family == CHIP_CARRIZO) {
 		dev->node_props.capability |= HSA_CAP_DOORBELL_PACKET_TYPE;
-		pr_info("amdkfd: adding doorbell packet type capability\n");
+		pr_info("Adding doorbell packet type capability\n");
 	}
 
 	res = 0;
@@ -1210,8 +1186,6 @@ int kfd_topology_remove_device(struct kfd_dev *gpu)
 	uint32_t gpu_id;
 	int res = -ENODEV;
 
-	BUG_ON(!gpu);
-
 	down_write(&topology_lock);
 
 	list_for_each_entry(dev, &topology_device_list, list)
diff --git a/drivers/gpu/drm/amd/include/atomfirmware.h b/drivers/gpu/drm/amd/include/atomfirmware.h
index 0021a1c..837296d 100644
--- a/drivers/gpu/drm/amd/include/atomfirmware.h
+++ b/drivers/gpu/drm/amd/include/atomfirmware.h
@@ -1233,6 +1233,69 @@ struct  atom_asic_profiling_info_v4_1
   uint32_t  phyclk2gfxclk_c;
 };
 
+struct  atom_asic_profiling_info_v4_2 {
+	struct  atom_common_table_header  table_header;
+	uint32_t  maxvddc;
+	uint32_t  minvddc;
+	uint32_t  avfs_meannsigma_acontant0;
+	uint32_t  avfs_meannsigma_acontant1;
+	uint32_t  avfs_meannsigma_acontant2;
+	uint16_t  avfs_meannsigma_dc_tol_sigma;
+	uint16_t  avfs_meannsigma_platform_mean;
+	uint16_t  avfs_meannsigma_platform_sigma;
+	uint32_t  gb_vdroop_table_cksoff_a0;
+	uint32_t  gb_vdroop_table_cksoff_a1;
+	uint32_t  gb_vdroop_table_cksoff_a2;
+	uint32_t  gb_vdroop_table_ckson_a0;
+	uint32_t  gb_vdroop_table_ckson_a1;
+	uint32_t  gb_vdroop_table_ckson_a2;
+	uint32_t  avfsgb_fuse_table_cksoff_m1;
+	uint32_t  avfsgb_fuse_table_cksoff_m2;
+	uint32_t  avfsgb_fuse_table_cksoff_b;
+	uint32_t  avfsgb_fuse_table_ckson_m1;
+	uint32_t  avfsgb_fuse_table_ckson_m2;
+	uint32_t  avfsgb_fuse_table_ckson_b;
+	uint16_t  max_voltage_0_25mv;
+	uint8_t   enable_gb_vdroop_table_cksoff;
+	uint8_t   enable_gb_vdroop_table_ckson;
+	uint8_t   enable_gb_fuse_table_cksoff;
+	uint8_t   enable_gb_fuse_table_ckson;
+	uint16_t  psm_age_comfactor;
+	uint8_t   enable_apply_avfs_cksoff_voltage;
+	uint8_t   reserved;
+	uint32_t  dispclk2gfxclk_a;
+	uint32_t  dispclk2gfxclk_b;
+	uint32_t  dispclk2gfxclk_c;
+	uint32_t  pixclk2gfxclk_a;
+	uint32_t  pixclk2gfxclk_b;
+	uint32_t  pixclk2gfxclk_c;
+	uint32_t  dcefclk2gfxclk_a;
+	uint32_t  dcefclk2gfxclk_b;
+	uint32_t  dcefclk2gfxclk_c;
+	uint32_t  phyclk2gfxclk_a;
+	uint32_t  phyclk2gfxclk_b;
+	uint32_t  phyclk2gfxclk_c;
+	uint32_t  acg_gb_vdroop_table_a0;
+	uint32_t  acg_gb_vdroop_table_a1;
+	uint32_t  acg_gb_vdroop_table_a2;
+	uint32_t  acg_avfsgb_fuse_table_m1;
+	uint32_t  acg_avfsgb_fuse_table_m2;
+	uint32_t  acg_avfsgb_fuse_table_b;
+	uint8_t   enable_acg_gb_vdroop_table;
+	uint8_t   enable_acg_gb_fuse_table;
+	uint32_t  acg_dispclk2gfxclk_a;
+	uint32_t  acg_dispclk2gfxclk_b;
+	uint32_t  acg_dispclk2gfxclk_c;
+	uint32_t  acg_pixclk2gfxclk_a;
+	uint32_t  acg_pixclk2gfxclk_b;
+	uint32_t  acg_pixclk2gfxclk_c;
+	uint32_t  acg_dcefclk2gfxclk_a;
+	uint32_t  acg_dcefclk2gfxclk_b;
+	uint32_t  acg_dcefclk2gfxclk_c;
+	uint32_t  acg_phyclk2gfxclk_a;
+	uint32_t  acg_phyclk2gfxclk_b;
+	uint32_t  acg_phyclk2gfxclk_c;
+};
 
 /* 
   ***************************************************************************
diff --git a/drivers/gpu/drm/amd/include/cgs_common.h b/drivers/gpu/drm/amd/include/cgs_common.h
index 0a94f74..0214f63 100644
--- a/drivers/gpu/drm/amd/include/cgs_common.h
+++ b/drivers/gpu/drm/amd/include/cgs_common.h
@@ -50,6 +50,7 @@ enum cgs_ind_reg {
 	CGS_IND_REG__UVD_CTX,
 	CGS_IND_REG__DIDT,
 	CGS_IND_REG_GC_CAC,
+	CGS_IND_REG_SE_CAC,
 	CGS_IND_REG__AUDIO_ENDPT
 };
 
@@ -406,6 +407,8 @@ typedef int (*cgs_is_virtualization_enabled_t)(void *cgs_device);
 
 typedef int (*cgs_enter_safe_mode)(struct cgs_device *cgs_device, bool en);
 
+typedef void (*cgs_lock_grbm_idx)(struct cgs_device *cgs_device, bool lock);
+
 struct cgs_ops {
 	/* memory management calls (similar to KFD interface) */
 	cgs_alloc_gpu_mem_t alloc_gpu_mem;
@@ -441,6 +444,7 @@ struct cgs_ops {
 	cgs_query_system_info query_system_info;
 	cgs_is_virtualization_enabled_t is_virtualization_enabled;
 	cgs_enter_safe_mode enter_safe_mode;
+	cgs_lock_grbm_idx lock_grbm_idx;
 };
 
 struct cgs_os_ops; /* To be define in OS-specific CGS header */
@@ -517,4 +521,6 @@ struct cgs_device
 #define cgs_enter_safe_mode(cgs_device, en) \
 		CGS_CALL(enter_safe_mode, cgs_device, en)
 
+#define cgs_lock_grbm_idx(cgs_device, lock) \
+		CGS_CALL(lock_grbm_idx, cgs_device, lock)
 #endif /* _CGS_COMMON_H */
diff --git a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
index 36f3766..94277cb7 100644
--- a/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
+++ b/drivers/gpu/drm/amd/include/kgd_kfd_interface.h
@@ -41,6 +41,11 @@ struct kgd_dev;
 
 struct kgd_mem;
 
+enum kfd_preempt_type {
+	KFD_PREEMPT_TYPE_WAVEFRONT_DRAIN = 0,
+	KFD_PREEMPT_TYPE_WAVEFRONT_RESET,
+};
+
 enum kgd_memory_pool {
 	KGD_POOL_SYSTEM_CACHEABLE = 1,
 	KGD_POOL_SYSTEM_WRITECOMBINE = 2,
@@ -82,6 +87,17 @@ struct kgd2kfd_shared_resources {
 	size_t doorbell_start_offset;
 };
 
+struct tile_config {
+	uint32_t *tile_config_ptr;
+	uint32_t *macro_tile_config_ptr;
+	uint32_t num_tile_configs;
+	uint32_t num_macro_tile_configs;
+
+	uint32_t gb_addr_config;
+	uint32_t num_banks;
+	uint32_t num_ranks;
+};
+
 /**
  * struct kfd2kgd_calls
  *
@@ -123,6 +139,11 @@ struct kgd2kfd_shared_resources {
  *
  * @get_fw_version: Returns FW versions from the header
  *
+ * @set_scratch_backing_va: Sets VA for scratch backing memory of a VMID.
+ * Only used for no cp scheduling mode
+ *
+ * @get_tile_config: Returns GPU-specific tiling mode information
+ *
  * This structure contains function pointers to services that the kgd driver
  * provides to amdkfd driver.
  *
@@ -153,14 +174,16 @@ struct kfd2kgd_calls {
 	int (*init_interrupts)(struct kgd_dev *kgd, uint32_t pipe_id);
 
 	int (*hqd_load)(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr);
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm);
 
 	int (*hqd_sdma_load)(struct kgd_dev *kgd, void *mqd);
 
 	bool (*hqd_is_occupied)(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id);
 
-	int (*hqd_destroy)(struct kgd_dev *kgd, uint32_t reset_type,
+	int (*hqd_destroy)(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
 				unsigned int timeout, uint32_t pipe_id,
 				uint32_t queue_id);
 
@@ -192,6 +215,9 @@ struct kfd2kgd_calls {
 
 	uint16_t (*get_fw_version)(struct kgd_dev *kgd,
 				enum kgd_engine_type type);
+	void (*set_scratch_backing_va)(struct kgd_dev *kgd,
+				uint64_t va, uint32_t vmid);
+	int (*get_tile_config)(struct kgd_dev *kgd, struct tile_config *config);
 };
 
 /**
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
index 0b74da3..bc839ff 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/cz_hwmgr.c
@@ -1240,13 +1240,18 @@ static int cz_phm_force_dpm_highest(struct pp_hwmgr *hwmgr)
 {
 	struct cz_hwmgr *cz_hwmgr = (struct cz_hwmgr *)(hwmgr->backend);
 
-	if (cz_hwmgr->sclk_dpm.soft_min_clk !=
-				cz_hwmgr->sclk_dpm.soft_max_clk)
-		smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
-						PPSMC_MSG_SetSclkSoftMin,
-						cz_get_sclk_level(hwmgr,
-						cz_hwmgr->sclk_dpm.soft_max_clk,
-						PPSMC_MSG_SetSclkSoftMin));
+	smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
+					PPSMC_MSG_SetSclkSoftMin,
+					cz_get_sclk_level(hwmgr,
+					cz_hwmgr->sclk_dpm.soft_max_clk,
+					PPSMC_MSG_SetSclkSoftMin));
+
+	smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
+				PPSMC_MSG_SetSclkSoftMax,
+				cz_get_sclk_level(hwmgr,
+				cz_hwmgr->sclk_dpm.soft_max_clk,
+				PPSMC_MSG_SetSclkSoftMax));
+
 	return 0;
 }
 
@@ -1292,17 +1297,55 @@ static int cz_phm_force_dpm_lowest(struct pp_hwmgr *hwmgr)
 {
 	struct cz_hwmgr *cz_hwmgr = (struct cz_hwmgr *)(hwmgr->backend);
 
-	if (cz_hwmgr->sclk_dpm.soft_min_clk !=
-				cz_hwmgr->sclk_dpm.soft_max_clk) {
-		cz_hwmgr->sclk_dpm.soft_max_clk =
-			cz_hwmgr->sclk_dpm.soft_min_clk;
+	smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
+			PPSMC_MSG_SetSclkSoftMax,
+			cz_get_sclk_level(hwmgr,
+			cz_hwmgr->sclk_dpm.soft_min_clk,
+			PPSMC_MSG_SetSclkSoftMax));
 
-		smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
+	smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
+				PPSMC_MSG_SetSclkSoftMin,
+				cz_get_sclk_level(hwmgr,
+				cz_hwmgr->sclk_dpm.soft_min_clk,
+				PPSMC_MSG_SetSclkSoftMin));
+
+	return 0;
+}
+
+static int cz_phm_force_dpm_sclk(struct pp_hwmgr *hwmgr, uint32_t sclk)
+{
+	smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
+				PPSMC_MSG_SetSclkSoftMin,
+				cz_get_sclk_level(hwmgr,
+				sclk,
+				PPSMC_MSG_SetSclkSoftMin));
+
+	smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
 				PPSMC_MSG_SetSclkSoftMax,
 				cz_get_sclk_level(hwmgr,
-				cz_hwmgr->sclk_dpm.soft_max_clk,
+				sclk,
 				PPSMC_MSG_SetSclkSoftMax));
+	return 0;
+}
+
+static int cz_get_profiling_clk(struct pp_hwmgr *hwmgr, uint32_t *sclk)
+{
+	struct phm_clock_voltage_dependency_table *table =
+		hwmgr->dyn_state.vddc_dependency_on_sclk;
+	int32_t tmp_sclk;
+	int32_t count;
+
+	tmp_sclk = table->entries[table->count-1].clk * 70 / 100;
+
+	for (count = table->count-1; count >= 0; count--) {
+		if (tmp_sclk >= table->entries[count].clk) {
+			tmp_sclk = table->entries[count].clk;
+			*sclk = tmp_sclk;
+			break;
+		}
 	}
+	if (count < 0)
+		*sclk = table->entries[0].clk;
 
 	return 0;
 }
@@ -1310,30 +1353,70 @@ static int cz_phm_force_dpm_lowest(struct pp_hwmgr *hwmgr)
 static int cz_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
 				enum amd_dpm_forced_level level)
 {
+	uint32_t sclk = 0;
 	int ret = 0;
+	uint32_t profile_mode_mask = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD |
+					AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK |
+					AMD_DPM_FORCED_LEVEL_PROFILE_PEAK;
+
+	if (level == hwmgr->dpm_level)
+		return ret;
+
+	if (!(hwmgr->dpm_level & profile_mode_mask)) {
+		/* enter profile mode, save current level, disable gfx cg*/
+		if (level & profile_mode_mask) {
+			hwmgr->saved_dpm_level = hwmgr->dpm_level;
+			cgs_set_clockgating_state(hwmgr->device,
+						AMD_IP_BLOCK_TYPE_GFX,
+						AMD_CG_STATE_UNGATE);
+		}
+	} else {
+		/* exit profile mode, restore level, enable gfx cg*/
+		if (!(level & profile_mode_mask)) {
+			if (level == AMD_DPM_FORCED_LEVEL_PROFILE_EXIT)
+				level = hwmgr->saved_dpm_level;
+			cgs_set_clockgating_state(hwmgr->device,
+					AMD_IP_BLOCK_TYPE_GFX,
+					AMD_CG_STATE_GATE);
+		}
+	}
 
 	switch (level) {
 	case AMD_DPM_FORCED_LEVEL_HIGH:
+	case AMD_DPM_FORCED_LEVEL_PROFILE_PEAK:
 		ret = cz_phm_force_dpm_highest(hwmgr);
 		if (ret)
 			return ret;
+		hwmgr->dpm_level = level;
 		break;
 	case AMD_DPM_FORCED_LEVEL_LOW:
+	case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK:
 		ret = cz_phm_force_dpm_lowest(hwmgr);
 		if (ret)
 			return ret;
+		hwmgr->dpm_level = level;
 		break;
 	case AMD_DPM_FORCED_LEVEL_AUTO:
 		ret = cz_phm_unforce_dpm_levels(hwmgr);
 		if (ret)
 			return ret;
+		hwmgr->dpm_level = level;
 		break;
+	case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD:
+		ret = cz_get_profiling_clk(hwmgr, &sclk);
+		if (ret)
+			return ret;
+		hwmgr->dpm_level = level;
+		cz_phm_force_dpm_sclk(hwmgr, sclk);
+		break;
+	case AMD_DPM_FORCED_LEVEL_MANUAL:
+		hwmgr->dpm_level = level;
+		break;
+	case AMD_DPM_FORCED_LEVEL_PROFILE_EXIT:
 	default:
 		break;
 	}
 
-	hwmgr->dpm_level = level;
-
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
index d025653..9547f26 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/hwmgr.c
@@ -557,9 +557,8 @@ uint16_t phm_find_closest_vddci(struct pp_atomctrl_voltage_table *vddci_table, u
 			return vddci_table->entries[i].value;
 	}
 
-	PP_ASSERT_WITH_CODE(false,
-			"VDDCI is larger than max VDDCI in VDDCI Voltage Table!",
-			return vddci_table->entries[i-1].value);
+	pr_debug("vddci is larger than max value in vddci_table\n");
+	return vddci_table->entries[i-1].value;
 }
 
 int phm_find_boot_level(void *table,
@@ -583,26 +582,26 @@ int phm_get_sclk_for_voltage_evv(struct pp_hwmgr *hwmgr,
 	phm_ppt_v1_voltage_lookup_table *lookup_table,
 	uint16_t virtual_voltage_id, int32_t *sclk)
 {
-	uint8_t entryId;
-	uint8_t voltageId;
+	uint8_t entry_id;
+	uint8_t voltage_id;
 	struct phm_ppt_v1_information *table_info =
 			(struct phm_ppt_v1_information *)(hwmgr->pptable);
 
 	PP_ASSERT_WITH_CODE(lookup_table->count != 0, "Lookup table is empty", return -EINVAL);
 
 	/* search for leakage voltage ID 0xff01 ~ 0xff08 and sckl */
-	for (entryId = 0; entryId < table_info->vdd_dep_on_sclk->count; entryId++) {
-		voltageId = table_info->vdd_dep_on_sclk->entries[entryId].vddInd;
-		if (lookup_table->entries[voltageId].us_vdd == virtual_voltage_id)
+	for (entry_id = 0; entry_id < table_info->vdd_dep_on_sclk->count; entry_id++) {
+		voltage_id = table_info->vdd_dep_on_sclk->entries[entry_id].vddInd;
+		if (lookup_table->entries[voltage_id].us_vdd == virtual_voltage_id)
 			break;
 	}
 
-	PP_ASSERT_WITH_CODE(entryId < table_info->vdd_dep_on_sclk->count,
-			"Can't find requested voltage id in vdd_dep_on_sclk table!",
-			return -EINVAL;
-			);
+	if (entry_id >= table_info->vdd_dep_on_sclk->count) {
+		pr_debug("Can't find requested voltage id in vdd_dep_on_sclk table\n");
+		return -EINVAL;
+	}
 
-	*sclk = table_info->vdd_dep_on_sclk->entries[entryId].clk;
+	*sclk = table_info->vdd_dep_on_sclk->entries[entry_id].clk;
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/ppatomfwctrl.c b/drivers/gpu/drm/amd/powerplay/hwmgr/ppatomfwctrl.c
index 720d500..c062844 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/ppatomfwctrl.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/ppatomfwctrl.c
@@ -142,7 +142,7 @@ int pp_atomfwctrl_get_voltage_table_v4(struct pp_hwmgr *hwmgr,
 		}
 	} else if (voltage_mode == VOLTAGE_OBJ_SVID2) {
 		voltage_table->psi1_enable =
-			voltage_object->svid2_voltage_obj.loadline_psi1 & 0x1;
+			(voltage_object->svid2_voltage_obj.loadline_psi1 & 0x20) >> 5;
 		voltage_table->psi0_enable =
 			voltage_object->svid2_voltage_obj.psi0_enable & 0x1;
 		voltage_table->max_vid_step =
@@ -276,7 +276,10 @@ int pp_atomfwctrl_get_avfs_information(struct pp_hwmgr *hwmgr,
 		struct pp_atomfwctrl_avfs_parameters *param)
 {
 	uint16_t idx;
+	uint8_t format_revision, content_revision;
+
 	struct atom_asic_profiling_info_v4_1 *profile;
+	struct atom_asic_profiling_info_v4_2 *profile_v4_2;
 
 	idx = GetIndexIntoMasterDataTable(asic_profiling_info);
 	profile = (struct atom_asic_profiling_info_v4_1 *)
@@ -286,76 +289,172 @@ int pp_atomfwctrl_get_avfs_information(struct pp_hwmgr *hwmgr,
 	if (!profile)
 		return -1;
 
-	param->ulMaxVddc = le32_to_cpu(profile->maxvddc);
-	param->ulMinVddc = le32_to_cpu(profile->minvddc);
-	param->ulMeanNsigmaAcontant0 =
-			le32_to_cpu(profile->avfs_meannsigma_acontant0);
-	param->ulMeanNsigmaAcontant1 =
-			le32_to_cpu(profile->avfs_meannsigma_acontant1);
-	param->ulMeanNsigmaAcontant2 =
-			le32_to_cpu(profile->avfs_meannsigma_acontant2);
-	param->usMeanNsigmaDcTolSigma =
-			le16_to_cpu(profile->avfs_meannsigma_dc_tol_sigma);
-	param->usMeanNsigmaPlatformMean =
-			le16_to_cpu(profile->avfs_meannsigma_platform_mean);
-	param->usMeanNsigmaPlatformSigma =
-			le16_to_cpu(profile->avfs_meannsigma_platform_sigma);
-	param->ulGbVdroopTableCksoffA0 =
-			le32_to_cpu(profile->gb_vdroop_table_cksoff_a0);
-	param->ulGbVdroopTableCksoffA1 =
-			le32_to_cpu(profile->gb_vdroop_table_cksoff_a1);
-	param->ulGbVdroopTableCksoffA2 =
-			le32_to_cpu(profile->gb_vdroop_table_cksoff_a2);
-	param->ulGbVdroopTableCksonA0 =
-			le32_to_cpu(profile->gb_vdroop_table_ckson_a0);
-	param->ulGbVdroopTableCksonA1 =
-			le32_to_cpu(profile->gb_vdroop_table_ckson_a1);
-	param->ulGbVdroopTableCksonA2 =
-			le32_to_cpu(profile->gb_vdroop_table_ckson_a2);
-	param->ulGbFuseTableCksoffM1 =
-			le32_to_cpu(profile->avfsgb_fuse_table_cksoff_m1);
-	param->ulGbFuseTableCksoffM2 =
-			le32_to_cpu(profile->avfsgb_fuse_table_cksoff_m2);
-	param->ulGbFuseTableCksoffB =
-			le32_to_cpu(profile->avfsgb_fuse_table_cksoff_b);
-	param->ulGbFuseTableCksonM1 =
-			le32_to_cpu(profile->avfsgb_fuse_table_ckson_m1);
-	param->ulGbFuseTableCksonM2 =
-			le32_to_cpu(profile->avfsgb_fuse_table_ckson_m2);
-	param->ulGbFuseTableCksonB =
-			le32_to_cpu(profile->avfsgb_fuse_table_ckson_b);
+	format_revision = ((struct atom_common_table_header *)profile)->format_revision;
+	content_revision = ((struct atom_common_table_header *)profile)->content_revision;
 
-	param->ucEnableGbVdroopTableCkson =
-			profile->enable_gb_vdroop_table_ckson;
-	param->ucEnableGbFuseTableCkson =
-			profile->enable_gb_fuse_table_ckson;
-	param->usPsmAgeComfactor =
-			le16_to_cpu(profile->psm_age_comfactor);
+	if (format_revision == 4 && content_revision == 1) {
+		param->ulMaxVddc = le32_to_cpu(profile->maxvddc);
+		param->ulMinVddc = le32_to_cpu(profile->minvddc);
+		param->ulMeanNsigmaAcontant0 =
+				le32_to_cpu(profile->avfs_meannsigma_acontant0);
+		param->ulMeanNsigmaAcontant1 =
+				le32_to_cpu(profile->avfs_meannsigma_acontant1);
+		param->ulMeanNsigmaAcontant2 =
+				le32_to_cpu(profile->avfs_meannsigma_acontant2);
+		param->usMeanNsigmaDcTolSigma =
+				le16_to_cpu(profile->avfs_meannsigma_dc_tol_sigma);
+		param->usMeanNsigmaPlatformMean =
+				le16_to_cpu(profile->avfs_meannsigma_platform_mean);
+		param->usMeanNsigmaPlatformSigma =
+				le16_to_cpu(profile->avfs_meannsigma_platform_sigma);
+		param->ulGbVdroopTableCksoffA0 =
+				le32_to_cpu(profile->gb_vdroop_table_cksoff_a0);
+		param->ulGbVdroopTableCksoffA1 =
+				le32_to_cpu(profile->gb_vdroop_table_cksoff_a1);
+		param->ulGbVdroopTableCksoffA2 =
+				le32_to_cpu(profile->gb_vdroop_table_cksoff_a2);
+		param->ulGbVdroopTableCksonA0 =
+				le32_to_cpu(profile->gb_vdroop_table_ckson_a0);
+		param->ulGbVdroopTableCksonA1 =
+				le32_to_cpu(profile->gb_vdroop_table_ckson_a1);
+		param->ulGbVdroopTableCksonA2 =
+				le32_to_cpu(profile->gb_vdroop_table_ckson_a2);
+		param->ulGbFuseTableCksoffM1 =
+				le32_to_cpu(profile->avfsgb_fuse_table_cksoff_m1);
+		param->ulGbFuseTableCksoffM2 =
+				le32_to_cpu(profile->avfsgb_fuse_table_cksoff_m2);
+		param->ulGbFuseTableCksoffB =
+				le32_to_cpu(profile->avfsgb_fuse_table_cksoff_b);
+		param->ulGbFuseTableCksonM1 =
+				le32_to_cpu(profile->avfsgb_fuse_table_ckson_m1);
+		param->ulGbFuseTableCksonM2 =
+				le32_to_cpu(profile->avfsgb_fuse_table_ckson_m2);
+		param->ulGbFuseTableCksonB =
+				le32_to_cpu(profile->avfsgb_fuse_table_ckson_b);
 
-	param->ulDispclk2GfxclkM1 =
-			le32_to_cpu(profile->dispclk2gfxclk_a);
-	param->ulDispclk2GfxclkM2 =
-			le32_to_cpu(profile->dispclk2gfxclk_b);
-	param->ulDispclk2GfxclkB =
-			le32_to_cpu(profile->dispclk2gfxclk_c);
-	param->ulDcefclk2GfxclkM1 =
-			le32_to_cpu(profile->dcefclk2gfxclk_a);
-	param->ulDcefclk2GfxclkM2 =
-			le32_to_cpu(profile->dcefclk2gfxclk_b);
-	param->ulDcefclk2GfxclkB =
-			le32_to_cpu(profile->dcefclk2gfxclk_c);
-	param->ulPixelclk2GfxclkM1 =
-			le32_to_cpu(profile->pixclk2gfxclk_a);
-	param->ulPixelclk2GfxclkM2 =
-			le32_to_cpu(profile->pixclk2gfxclk_b);
-	param->ulPixelclk2GfxclkB =
-			le32_to_cpu(profile->pixclk2gfxclk_c);
-	param->ulPhyclk2GfxclkM1 =
-			le32_to_cpu(profile->phyclk2gfxclk_a);
-	param->ulPhyclk2GfxclkM2 =
-			le32_to_cpu(profile->phyclk2gfxclk_b);
-	param->ulPhyclk2GfxclkB =
-			le32_to_cpu(profile->phyclk2gfxclk_c);
+		param->ucEnableGbVdroopTableCkson =
+				profile->enable_gb_vdroop_table_ckson;
+		param->ucEnableGbFuseTableCkson =
+				profile->enable_gb_fuse_table_ckson;
+		param->usPsmAgeComfactor =
+				le16_to_cpu(profile->psm_age_comfactor);
+
+		param->ulDispclk2GfxclkM1 =
+				le32_to_cpu(profile->dispclk2gfxclk_a);
+		param->ulDispclk2GfxclkM2 =
+				le32_to_cpu(profile->dispclk2gfxclk_b);
+		param->ulDispclk2GfxclkB =
+				le32_to_cpu(profile->dispclk2gfxclk_c);
+		param->ulDcefclk2GfxclkM1 =
+				le32_to_cpu(profile->dcefclk2gfxclk_a);
+		param->ulDcefclk2GfxclkM2 =
+				le32_to_cpu(profile->dcefclk2gfxclk_b);
+		param->ulDcefclk2GfxclkB =
+				le32_to_cpu(profile->dcefclk2gfxclk_c);
+		param->ulPixelclk2GfxclkM1 =
+				le32_to_cpu(profile->pixclk2gfxclk_a);
+		param->ulPixelclk2GfxclkM2 =
+				le32_to_cpu(profile->pixclk2gfxclk_b);
+		param->ulPixelclk2GfxclkB =
+				le32_to_cpu(profile->pixclk2gfxclk_c);
+		param->ulPhyclk2GfxclkM1 =
+				le32_to_cpu(profile->phyclk2gfxclk_a);
+		param->ulPhyclk2GfxclkM2 =
+				le32_to_cpu(profile->phyclk2gfxclk_b);
+		param->ulPhyclk2GfxclkB =
+				le32_to_cpu(profile->phyclk2gfxclk_c);
+		param->ulAcgGbVdroopTableA0           = 0;
+		param->ulAcgGbVdroopTableA1           = 0;
+		param->ulAcgGbVdroopTableA2           = 0;
+		param->ulAcgGbFuseTableM1             = 0;
+		param->ulAcgGbFuseTableM2             = 0;
+		param->ulAcgGbFuseTableB              = 0;
+		param->ucAcgEnableGbVdroopTable       = 0;
+		param->ucAcgEnableGbFuseTable         = 0;
+	} else if (format_revision == 4 && content_revision == 2) {
+		profile_v4_2 = (struct atom_asic_profiling_info_v4_2 *)profile;
+		param->ulMaxVddc = le32_to_cpu(profile_v4_2->maxvddc);
+		param->ulMinVddc = le32_to_cpu(profile_v4_2->minvddc);
+		param->ulMeanNsigmaAcontant0 =
+				le32_to_cpu(profile_v4_2->avfs_meannsigma_acontant0);
+		param->ulMeanNsigmaAcontant1 =
+				le32_to_cpu(profile_v4_2->avfs_meannsigma_acontant1);
+		param->ulMeanNsigmaAcontant2 =
+				le32_to_cpu(profile_v4_2->avfs_meannsigma_acontant2);
+		param->usMeanNsigmaDcTolSigma =
+				le16_to_cpu(profile_v4_2->avfs_meannsigma_dc_tol_sigma);
+		param->usMeanNsigmaPlatformMean =
+				le16_to_cpu(profile_v4_2->avfs_meannsigma_platform_mean);
+		param->usMeanNsigmaPlatformSigma =
+				le16_to_cpu(profile_v4_2->avfs_meannsigma_platform_sigma);
+		param->ulGbVdroopTableCksoffA0 =
+				le32_to_cpu(profile_v4_2->gb_vdroop_table_cksoff_a0);
+		param->ulGbVdroopTableCksoffA1 =
+				le32_to_cpu(profile_v4_2->gb_vdroop_table_cksoff_a1);
+		param->ulGbVdroopTableCksoffA2 =
+				le32_to_cpu(profile_v4_2->gb_vdroop_table_cksoff_a2);
+		param->ulGbVdroopTableCksonA0 =
+				le32_to_cpu(profile_v4_2->gb_vdroop_table_ckson_a0);
+		param->ulGbVdroopTableCksonA1 =
+				le32_to_cpu(profile_v4_2->gb_vdroop_table_ckson_a1);
+		param->ulGbVdroopTableCksonA2 =
+				le32_to_cpu(profile_v4_2->gb_vdroop_table_ckson_a2);
+		param->ulGbFuseTableCksoffM1 =
+				le32_to_cpu(profile_v4_2->avfsgb_fuse_table_cksoff_m1);
+		param->ulGbFuseTableCksoffM2 =
+				le32_to_cpu(profile_v4_2->avfsgb_fuse_table_cksoff_m2);
+		param->ulGbFuseTableCksoffB =
+				le32_to_cpu(profile_v4_2->avfsgb_fuse_table_cksoff_b);
+		param->ulGbFuseTableCksonM1 =
+				le32_to_cpu(profile_v4_2->avfsgb_fuse_table_ckson_m1);
+		param->ulGbFuseTableCksonM2 =
+				le32_to_cpu(profile_v4_2->avfsgb_fuse_table_ckson_m2);
+		param->ulGbFuseTableCksonB =
+				le32_to_cpu(profile_v4_2->avfsgb_fuse_table_ckson_b);
+
+		param->ucEnableGbVdroopTableCkson =
+				profile_v4_2->enable_gb_vdroop_table_ckson;
+		param->ucEnableGbFuseTableCkson =
+				profile_v4_2->enable_gb_fuse_table_ckson;
+		param->usPsmAgeComfactor =
+				le16_to_cpu(profile_v4_2->psm_age_comfactor);
+
+		param->ulDispclk2GfxclkM1 =
+				le32_to_cpu(profile_v4_2->dispclk2gfxclk_a);
+		param->ulDispclk2GfxclkM2 =
+				le32_to_cpu(profile_v4_2->dispclk2gfxclk_b);
+		param->ulDispclk2GfxclkB =
+				le32_to_cpu(profile_v4_2->dispclk2gfxclk_c);
+		param->ulDcefclk2GfxclkM1 =
+				le32_to_cpu(profile_v4_2->dcefclk2gfxclk_a);
+		param->ulDcefclk2GfxclkM2 =
+				le32_to_cpu(profile_v4_2->dcefclk2gfxclk_b);
+		param->ulDcefclk2GfxclkB =
+				le32_to_cpu(profile_v4_2->dcefclk2gfxclk_c);
+		param->ulPixelclk2GfxclkM1 =
+				le32_to_cpu(profile_v4_2->pixclk2gfxclk_a);
+		param->ulPixelclk2GfxclkM2 =
+				le32_to_cpu(profile_v4_2->pixclk2gfxclk_b);
+		param->ulPixelclk2GfxclkB =
+				le32_to_cpu(profile_v4_2->pixclk2gfxclk_c);
+		param->ulPhyclk2GfxclkM1 =
+				le32_to_cpu(profile_v4_2->phyclk2gfxclk_a);
+		param->ulPhyclk2GfxclkM2 =
+				le32_to_cpu(profile_v4_2->phyclk2gfxclk_b);
+		param->ulPhyclk2GfxclkB =
+				le32_to_cpu(profile_v4_2->phyclk2gfxclk_c);
+		param->ulAcgGbVdroopTableA0 = le32_to_cpu(profile_v4_2->acg_gb_vdroop_table_a0);
+		param->ulAcgGbVdroopTableA1 = le32_to_cpu(profile_v4_2->acg_gb_vdroop_table_a1);
+		param->ulAcgGbVdroopTableA2 = le32_to_cpu(profile_v4_2->acg_gb_vdroop_table_a2);
+		param->ulAcgGbFuseTableM1 = le32_to_cpu(profile_v4_2->acg_avfsgb_fuse_table_m1);
+		param->ulAcgGbFuseTableM2 = le32_to_cpu(profile_v4_2->acg_avfsgb_fuse_table_m2);
+		param->ulAcgGbFuseTableB = le32_to_cpu(profile_v4_2->acg_avfsgb_fuse_table_b);
+		param->ucAcgEnableGbVdroopTable = le32_to_cpu(profile_v4_2->enable_acg_gb_vdroop_table);
+		param->ucAcgEnableGbFuseTable = le32_to_cpu(profile_v4_2->enable_acg_gb_fuse_table);
+	} else {
+		pr_info("Invalid VBIOS AVFS ProfilingInfo Revision!\n");
+		return -EINVAL;
+	}
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/ppatomfwctrl.h b/drivers/gpu/drm/amd/powerplay/hwmgr/ppatomfwctrl.h
index 81908b5..8e6b1f0 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/ppatomfwctrl.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/ppatomfwctrl.h
@@ -109,6 +109,14 @@ struct pp_atomfwctrl_avfs_parameters {
 	uint32_t   ulPhyclk2GfxclkM1;
 	uint32_t   ulPhyclk2GfxclkM2;
 	uint32_t   ulPhyclk2GfxclkB;
+	uint32_t   ulAcgGbVdroopTableA0;
+	uint32_t   ulAcgGbVdroopTableA1;
+	uint32_t   ulAcgGbVdroopTableA2;
+	uint32_t   ulAcgGbFuseTableM1;
+	uint32_t   ulAcgGbFuseTableM2;
+	uint32_t   ulAcgGbFuseTableB;
+	uint32_t   ucAcgEnableGbVdroopTable;
+	uint32_t   ucAcgEnableGbFuseTable;
 };
 
 struct pp_atomfwctrl_gpio_parameters {
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
index 4c7f430..edc5fb6 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.c
@@ -265,6 +265,15 @@ static int rv_tf_set_clock_limit(struct pp_hwmgr *hwmgr, void *input,
 		}
 	} */
 
+	if (((hwmgr->uvd_arbiter.vclk_soft_min / 100) != rv_data->vclk_soft_min) ||
+	    ((hwmgr->uvd_arbiter.dclk_soft_min / 100) != rv_data->dclk_soft_min)) {
+		rv_data->vclk_soft_min = hwmgr->uvd_arbiter.vclk_soft_min / 100;
+		rv_data->dclk_soft_min = hwmgr->uvd_arbiter.dclk_soft_min / 100;
+		smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
+			PPSMC_MSG_SetSoftMinVcn,
+			(rv_data->vclk_soft_min << 16) | rv_data->vclk_soft_min);
+	}
+
 	if((hwmgr->gfx_arbiter.sclk_hard_min != 0) &&
 		((hwmgr->gfx_arbiter.sclk_hard_min / 100) != rv_data->soc_actual_hard_min_freq)) {
 		smum_send_msg_to_smc_with_parameter(hwmgr->smumgr,
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.h b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.h
index afb8522..2472b50 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/rv_hwmgr.h
@@ -280,6 +280,8 @@ struct rv_hwmgr {
 
 	uint32_t                        f_actual_hard_min_freq;
 	uint32_t                        fabric_actual_soft_min_freq;
+	uint32_t                        vclk_soft_min;
+	uint32_t                        dclk_soft_min;
 	uint32_t                        gfx_actual_soft_min_freq;
 
 	bool                           vcn_power_gated;
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
index 1f01020..c274323 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/smu7_hwmgr.c
@@ -1962,9 +1962,6 @@ static int smu7_thermal_parameter_init(struct pp_hwmgr *hwmgr)
 			temp_reg = PHM_SET_FIELD(temp_reg, CNB_PWRMGT_CNTL, DPM_ENABLED, 0x1);
 			break;
 		default:
-			PP_ASSERT_WITH_CODE(0,
-			"Failed to setup PCC HW register! Wrong GPIO assigned for VDDC_PCC_GPIO_PINID!",
-			);
 			break;
 		}
 		cgs_write_ind_register(hwmgr->device, CGS_IND_REG__SMC, ixCNB_PWRMGT_CNTL, temp_reg);
@@ -4630,6 +4627,15 @@ static int smu7_set_power_profile_state(struct pp_hwmgr *hwmgr,
 
 static int smu7_avfs_control(struct pp_hwmgr *hwmgr, bool enable)
 {
+	struct pp_smumgr *smumgr = (struct pp_smumgr *)(hwmgr->smumgr);
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
+
+	if (smu_data == NULL)
+		return -EINVAL;
+
+	if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
+		return 0;
+
 	if (enable) {
 		if (!PHM_READ_VFPF_INDIRECT_FIELD(hwmgr->device,
 				CGS_IND_REG__SMC, FEATURE_STATUS, AVS_ON))
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
index 197174e..9d71a25 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.c
@@ -78,6 +78,8 @@ uint32_t channel_number[] = {1, 2, 0, 4, 0, 8, 0, 16, 2};
 #define DF_CS_AON0_DramBaseAddress0__IntLvNumChan_MASK                                                        0x000000F0L
 #define DF_CS_AON0_DramBaseAddress0__IntLvAddrSel_MASK                                                        0x00000700L
 #define DF_CS_AON0_DramBaseAddress0__DramBaseAddr_MASK                                                        0xFFFFF000L
+static int vega10_force_clock_level(struct pp_hwmgr *hwmgr,
+		enum pp_clock_type type, uint32_t mask);
 
 const ULONG PhwVega10_Magic = (ULONG)(PHM_VIslands_Magic);
 
@@ -146,6 +148,19 @@ static void vega10_set_default_registry_data(struct pp_hwmgr *hwmgr)
 	data->registry_data.vr1hot_enabled = 1;
 	data->registry_data.regulator_hot_gpio_support = 1;
 
+	data->registry_data.didt_support = 1;
+	if (data->registry_data.didt_support) {
+		data->registry_data.didt_mode = 6;
+		data->registry_data.sq_ramping_support = 1;
+		data->registry_data.db_ramping_support = 0;
+		data->registry_data.td_ramping_support = 0;
+		data->registry_data.tcp_ramping_support = 0;
+		data->registry_data.dbr_ramping_support = 0;
+		data->registry_data.edc_didt_support = 1;
+		data->registry_data.gc_didt_support = 0;
+		data->registry_data.psm_didt_support = 0;
+	}
+
 	data->display_voltage_mode = PPVEGA10_VEGA10DISPLAYVOLTAGEMODE_DFLT;
 	data->dcef_clk_quad_eqn_a = PPREGKEY_VEGA10QUADRATICEQUATION_DFLT;
 	data->dcef_clk_quad_eqn_b = PPREGKEY_VEGA10QUADRATICEQUATION_DFLT;
@@ -223,6 +238,8 @@ static int vega10_set_features_platform_caps(struct pp_hwmgr *hwmgr)
 	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
 			PHM_PlatformCaps_PowerContainment);
 	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
+			PHM_PlatformCaps_DiDtSupport);
+	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
 			PHM_PlatformCaps_SQRamping);
 	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
 			PHM_PlatformCaps_DBRamping);
@@ -230,6 +247,34 @@ static int vega10_set_features_platform_caps(struct pp_hwmgr *hwmgr)
 			PHM_PlatformCaps_TDRamping);
 	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
 			PHM_PlatformCaps_TCPRamping);
+	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
+			PHM_PlatformCaps_DBRRamping);
+	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
+			PHM_PlatformCaps_DiDtEDCEnable);
+	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
+			PHM_PlatformCaps_GCEDC);
+	phm_cap_unset(hwmgr->platform_descriptor.platformCaps,
+			PHM_PlatformCaps_PSM);
+
+	if (data->registry_data.didt_support) {
+		phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DiDtSupport);
+		if (data->registry_data.sq_ramping_support)
+			phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_SQRamping);
+		if (data->registry_data.db_ramping_support)
+			phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DBRamping);
+		if (data->registry_data.td_ramping_support)
+			phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_TDRamping);
+		if (data->registry_data.tcp_ramping_support)
+			phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_TCPRamping);
+		if (data->registry_data.dbr_ramping_support)
+			phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DBRRamping);
+		if (data->registry_data.edc_didt_support)
+			phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DiDtEDCEnable);
+		if (data->registry_data.gc_didt_support)
+			phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_GCEDC);
+		if (data->registry_data.psm_didt_support)
+			phm_cap_set(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_PSM);
+	}
 
 	if (data->registry_data.power_containment_support)
 		phm_cap_set(hwmgr->platform_descriptor.platformCaps,
@@ -321,8 +366,8 @@ static void vega10_init_dpm_defaults(struct pp_hwmgr *hwmgr)
 			FEATURE_LED_DISPLAY_BIT;
 	data->smu_features[GNLD_FAN_CONTROL].smu_feature_id =
 			FEATURE_FAN_CONTROL_BIT;
-	data->smu_features[GNLD_VOLTAGE_CONTROLLER].smu_feature_id =
-			FEATURE_VOLTAGE_CONTROLLER_BIT;
+	data->smu_features[GNLD_ACG].smu_feature_id = FEATURE_ACG_BIT;
+	data->smu_features[GNLD_DIDT].smu_feature_id = FEATURE_GFX_EDC_BIT;
 
 	if (!data->registry_data.prefetcher_dpm_key_disabled)
 		data->smu_features[GNLD_DPM_PREFETCHER].supported = true;
@@ -386,6 +431,15 @@ static void vega10_init_dpm_defaults(struct pp_hwmgr *hwmgr)
 	if (data->registry_data.vr0hot_enabled)
 		data->smu_features[GNLD_VR0HOT].supported = true;
 
+	smum_send_msg_to_smc(hwmgr->smumgr, PPSMC_MSG_GetSmuVersion);
+	vega10_read_arg_from_smc(hwmgr->smumgr, &(data->smu_version));
+	/* ACG firmware has major version 5 */
+	if ((data->smu_version & 0xff000000) == 0x5000000)
+		data->smu_features[GNLD_ACG].supported = true;
+
+	if (data->registry_data.didt_support)
+		data->smu_features[GNLD_DIDT].supported = true;
+
 }
 
 #ifdef PPLIB_VEGA10_EVV_SUPPORT
@@ -2222,6 +2276,21 @@ static int vega10_populate_avfs_parameters(struct pp_hwmgr *hwmgr)
 			pp_table->DisplayClock2Gfxclk[DSPCLK_PHYCLK].m1_shift = 24;
 			pp_table->DisplayClock2Gfxclk[DSPCLK_PHYCLK].m2_shift = 12;
 			pp_table->DisplayClock2Gfxclk[DSPCLK_PHYCLK].b_shift = 12;
+
+			pp_table->AcgBtcGbVdroopTable.a0       = avfs_params.ulAcgGbVdroopTableA0;
+			pp_table->AcgBtcGbVdroopTable.a0_shift = 20;
+			pp_table->AcgBtcGbVdroopTable.a1       = avfs_params.ulAcgGbVdroopTableA1;
+			pp_table->AcgBtcGbVdroopTable.a1_shift = 20;
+			pp_table->AcgBtcGbVdroopTable.a2       = avfs_params.ulAcgGbVdroopTableA2;
+			pp_table->AcgBtcGbVdroopTable.a2_shift = 20;
+
+			pp_table->AcgAvfsGb.m1                   = avfs_params.ulAcgGbFuseTableM1;
+			pp_table->AcgAvfsGb.m2                   = avfs_params.ulAcgGbFuseTableM2;
+			pp_table->AcgAvfsGb.b                    = avfs_params.ulAcgGbFuseTableB;
+			pp_table->AcgAvfsGb.m1_shift             = 0;
+			pp_table->AcgAvfsGb.m2_shift             = 0;
+			pp_table->AcgAvfsGb.b_shift              = 0;
+
 		} else {
 			data->smu_features[GNLD_AVFS].supported = false;
 		}
@@ -2230,6 +2299,55 @@ static int vega10_populate_avfs_parameters(struct pp_hwmgr *hwmgr)
 	return 0;
 }
 
+static int vega10_acg_enable(struct pp_hwmgr *hwmgr)
+{
+	struct vega10_hwmgr *data =
+			(struct vega10_hwmgr *)(hwmgr->backend);
+	uint32_t agc_btc_response;
+
+	if (data->smu_features[GNLD_ACG].supported) {
+		if (0 == vega10_enable_smc_features(hwmgr->smumgr, true,
+					data->smu_features[GNLD_DPM_PREFETCHER].smu_feature_bitmap))
+			data->smu_features[GNLD_DPM_PREFETCHER].enabled = true;
+
+		smum_send_msg_to_smc(hwmgr->smumgr, PPSMC_MSG_InitializeAcg);
+
+		smum_send_msg_to_smc(hwmgr->smumgr, PPSMC_MSG_RunAcgBtc);
+		vega10_read_arg_from_smc(hwmgr->smumgr, &agc_btc_response);
+
+		if (1 == agc_btc_response) {
+			if (1 == data->acg_loop_state)
+				smum_send_msg_to_smc(hwmgr->smumgr, PPSMC_MSG_RunAcgInClosedLoop);
+			else if (2 == data->acg_loop_state)
+				smum_send_msg_to_smc(hwmgr->smumgr, PPSMC_MSG_RunAcgInOpenLoop);
+			if (0 == vega10_enable_smc_features(hwmgr->smumgr, true,
+				data->smu_features[GNLD_ACG].smu_feature_bitmap))
+				data->smu_features[GNLD_ACG].enabled = true;
+		} else {
+			pr_info("[ACG_Enable] ACG BTC Returned Failed Status!\n");
+			data->smu_features[GNLD_ACG].enabled = false;
+		}
+	}
+
+	return 0;
+}
+
+static int vega10_acg_disable(struct pp_hwmgr *hwmgr)
+{
+	struct vega10_hwmgr *data =
+			(struct vega10_hwmgr *)(hwmgr->backend);
+
+	if (data->smu_features[GNLD_ACG].supported) {
+		if (data->smu_features[GNLD_ACG].enabled) {
+			if (0 == vega10_enable_smc_features(hwmgr->smumgr, false,
+					data->smu_features[GNLD_ACG].smu_feature_bitmap))
+				data->smu_features[GNLD_ACG].enabled = false;
+		}
+	}
+
+	return 0;
+}
+
 static int vega10_populate_gpio_parameters(struct pp_hwmgr *hwmgr)
 {
 	struct vega10_hwmgr *data =
@@ -2404,6 +2522,9 @@ static int vega10_init_smc_table(struct pp_hwmgr *hwmgr)
 	pp_table->DisplayDpmVoltageMode =
 			(uint8_t)(table_info->uc_dcef_dpm_voltage_mode);
 
+	data->vddc_voltage_table.psi0_enable = voltage_table.psi0_enable;
+	data->vddc_voltage_table.psi1_enable = voltage_table.psi1_enable;
+
 	if (data->registry_data.ulv_support &&
 			table_info->us_ulv_voltage_offset) {
 		result = vega10_populate_ulv_state(hwmgr);
@@ -2500,7 +2621,7 @@ static int vega10_init_smc_table(struct pp_hwmgr *hwmgr)
 	result = vega10_avfs_enable(hwmgr, true);
 	PP_ASSERT_WITH_CODE(!result, "Attempt to enable AVFS feature Failed!",
 					return result);
-
+	vega10_acg_enable(hwmgr);
 	vega10_save_default_power_profile(hwmgr);
 
 	return 0;
@@ -2832,6 +2953,11 @@ static int vega10_enable_dpm_tasks(struct pp_hwmgr *hwmgr)
 	PP_ASSERT_WITH_CODE(!tmp_result,
 			"Failed to start DPM!", result = tmp_result);
 
+	/* enable didt, do not abort if failed didt */
+	tmp_result = vega10_enable_didt_config(hwmgr);
+	PP_ASSERT(!tmp_result,
+			"Failed to enable didt config!");
+
 	tmp_result = vega10_enable_power_containment(hwmgr);
 	PP_ASSERT_WITH_CODE(!tmp_result,
 			"Failed to enable power containment!",
@@ -3578,10 +3704,22 @@ static void vega10_apply_dal_minimum_voltage_request(
 	return;
 }
 
+static int vega10_get_soc_index_for_max_uclk(struct pp_hwmgr *hwmgr)
+{
+	struct phm_ppt_v1_clock_voltage_dependency_table *vdd_dep_table_on_mclk;
+	struct phm_ppt_v2_information *table_info =
+			(struct phm_ppt_v2_information *)(hwmgr->pptable);
+
+	vdd_dep_table_on_mclk  = table_info->vdd_dep_on_mclk;
+
+	return vdd_dep_table_on_mclk->entries[NUM_UCLK_DPM_LEVELS - 1].vddInd + 1;
+}
+
 static int vega10_upload_dpm_bootup_level(struct pp_hwmgr *hwmgr)
 {
 	struct vega10_hwmgr *data =
 			(struct vega10_hwmgr *)(hwmgr->backend);
+	uint32_t socclk_idx;
 
 	vega10_apply_dal_minimum_voltage_request(hwmgr);
 
@@ -3602,13 +3740,22 @@ static int vega10_upload_dpm_bootup_level(struct pp_hwmgr *hwmgr)
 	if (!data->registry_data.mclk_dpm_key_disabled) {
 		if (data->smc_state_table.mem_boot_level !=
 				data->dpm_table.mem_table.dpm_state.soft_min_level) {
+			if (data->smc_state_table.mem_boot_level == NUM_UCLK_DPM_LEVELS - 1) {
+				socclk_idx = vega10_get_soc_index_for_max_uclk(hwmgr);
 				PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc_with_parameter(
-				hwmgr->smumgr,
-				 PPSMC_MSG_SetSoftMinUclkByIndex,
-				data->smc_state_table.mem_boot_level),
-				"Failed to set soft min mclk index!",
-				return -EINVAL);
-
+							hwmgr->smumgr,
+						PPSMC_MSG_SetSoftMinSocclkByIndex,
+						socclk_idx),
+						"Failed to set soft min uclk index!",
+						return -EINVAL);
+			} else {
+				PP_ASSERT_WITH_CODE(!smum_send_msg_to_smc_with_parameter(
+						hwmgr->smumgr,
+						PPSMC_MSG_SetSoftMinUclkByIndex,
+						data->smc_state_table.mem_boot_level),
+						"Failed to set soft min uclk index!",
+						return -EINVAL);
+			}
 			data->dpm_table.mem_table.dpm_state.soft_min_level =
 					data->smc_state_table.mem_boot_level;
 		}
@@ -4015,7 +4162,7 @@ static int vega10_notify_smc_display_config_after_ps_adjustment(
 			pr_info("Attempt to set Hard Min for DCEFCLK Failed!");
 		}
 	} else {
-		pr_info("Cannot find requested DCEFCLK!");
+		pr_debug("Cannot find requested DCEFCLK!\n");
 	}
 
 	if (min_clocks.memoryClock != 0) {
@@ -4097,34 +4244,30 @@ static int vega10_unforce_dpm_levels(struct pp_hwmgr *hwmgr)
 	return 0;
 }
 
-static int vega10_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
-				enum amd_dpm_forced_level level)
+static int vega10_get_profiling_clk_mask(struct pp_hwmgr *hwmgr, enum amd_dpm_forced_level level,
+				uint32_t *sclk_mask, uint32_t *mclk_mask, uint32_t *soc_mask)
 {
-	int ret = 0;
+	struct phm_ppt_v2_information *table_info =
+			(struct phm_ppt_v2_information *)(hwmgr->pptable);
 
-	switch (level) {
-	case AMD_DPM_FORCED_LEVEL_HIGH:
-		ret = vega10_force_dpm_highest(hwmgr);
-		if (ret)
-			return ret;
-		break;
-	case AMD_DPM_FORCED_LEVEL_LOW:
-		ret = vega10_force_dpm_lowest(hwmgr);
-		if (ret)
-			return ret;
-		break;
-	case AMD_DPM_FORCED_LEVEL_AUTO:
-		ret = vega10_unforce_dpm_levels(hwmgr);
-		if (ret)
-			return ret;
-		break;
-	default:
-		break;
+	if (table_info->vdd_dep_on_sclk->count > VEGA10_UMD_PSTATE_GFXCLK_LEVEL &&
+		table_info->vdd_dep_on_socclk->count > VEGA10_UMD_PSTATE_SOCCLK_LEVEL &&
+		table_info->vdd_dep_on_mclk->count > VEGA10_UMD_PSTATE_MCLK_LEVEL) {
+		*sclk_mask = VEGA10_UMD_PSTATE_GFXCLK_LEVEL;
+		*soc_mask = VEGA10_UMD_PSTATE_SOCCLK_LEVEL;
+		*mclk_mask = VEGA10_UMD_PSTATE_MCLK_LEVEL;
 	}
 
-	hwmgr->dpm_level = level;
-
-	return ret;
+	if (level == AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK) {
+		*sclk_mask = 0;
+	} else if (level == AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK) {
+		*mclk_mask = 0;
+	} else if (level == AMD_DPM_FORCED_LEVEL_PROFILE_PEAK) {
+		*sclk_mask = table_info->vdd_dep_on_sclk->count - 1;
+		*soc_mask = table_info->vdd_dep_on_socclk->count - 1;
+		*mclk_mask = table_info->vdd_dep_on_mclk->count - 1;
+	}
+	return 0;
 }
 
 static int vega10_set_fan_control_mode(struct pp_hwmgr *hwmgr, uint32_t mode)
@@ -4151,6 +4294,86 @@ static int vega10_set_fan_control_mode(struct pp_hwmgr *hwmgr, uint32_t mode)
 	return result;
 }
 
+static int vega10_dpm_force_dpm_level(struct pp_hwmgr *hwmgr,
+				enum amd_dpm_forced_level level)
+{
+	int ret = 0;
+	uint32_t sclk_mask = 0;
+	uint32_t mclk_mask = 0;
+	uint32_t soc_mask = 0;
+	uint32_t profile_mode_mask = AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD |
+					AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK |
+					AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK |
+					AMD_DPM_FORCED_LEVEL_PROFILE_PEAK;
+
+	if (level == hwmgr->dpm_level)
+		return ret;
+
+	if (!(hwmgr->dpm_level & profile_mode_mask)) {
+		/* enter profile mode, save current level, disable gfx cg*/
+		if (level & profile_mode_mask) {
+			hwmgr->saved_dpm_level = hwmgr->dpm_level;
+			cgs_set_clockgating_state(hwmgr->device,
+						AMD_IP_BLOCK_TYPE_GFX,
+						AMD_CG_STATE_UNGATE);
+		}
+	} else {
+		/* exit profile mode, restore level, enable gfx cg*/
+		if (!(level & profile_mode_mask)) {
+			if (level == AMD_DPM_FORCED_LEVEL_PROFILE_EXIT)
+				level = hwmgr->saved_dpm_level;
+			cgs_set_clockgating_state(hwmgr->device,
+					AMD_IP_BLOCK_TYPE_GFX,
+					AMD_CG_STATE_GATE);
+		}
+	}
+
+	switch (level) {
+	case AMD_DPM_FORCED_LEVEL_HIGH:
+		ret = vega10_force_dpm_highest(hwmgr);
+		if (ret)
+			return ret;
+		hwmgr->dpm_level = level;
+		break;
+	case AMD_DPM_FORCED_LEVEL_LOW:
+		ret = vega10_force_dpm_lowest(hwmgr);
+		if (ret)
+			return ret;
+		hwmgr->dpm_level = level;
+		break;
+	case AMD_DPM_FORCED_LEVEL_AUTO:
+		ret = vega10_unforce_dpm_levels(hwmgr);
+		if (ret)
+			return ret;
+		hwmgr->dpm_level = level;
+		break;
+	case AMD_DPM_FORCED_LEVEL_PROFILE_STANDARD:
+	case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_SCLK:
+	case AMD_DPM_FORCED_LEVEL_PROFILE_MIN_MCLK:
+	case AMD_DPM_FORCED_LEVEL_PROFILE_PEAK:
+		ret = vega10_get_profiling_clk_mask(hwmgr, level, &sclk_mask, &mclk_mask, &soc_mask);
+		if (ret)
+			return ret;
+		hwmgr->dpm_level = level;
+		vega10_force_clock_level(hwmgr, PP_SCLK, 1<<sclk_mask);
+		vega10_force_clock_level(hwmgr, PP_MCLK, 1<<mclk_mask);
+		break;
+	case AMD_DPM_FORCED_LEVEL_MANUAL:
+		hwmgr->dpm_level = level;
+		break;
+	case AMD_DPM_FORCED_LEVEL_PROFILE_EXIT:
+	default:
+		break;
+	}
+
+	if (level == AMD_DPM_FORCED_LEVEL_PROFILE_PEAK && hwmgr->saved_dpm_level != AMD_DPM_FORCED_LEVEL_PROFILE_PEAK)
+		vega10_set_fan_control_mode(hwmgr, AMD_FAN_CTRL_NONE);
+	else if (level != AMD_DPM_FORCED_LEVEL_PROFILE_PEAK && hwmgr->saved_dpm_level == AMD_DPM_FORCED_LEVEL_PROFILE_PEAK)
+		vega10_set_fan_control_mode(hwmgr, AMD_FAN_CTRL_AUTO);
+
+	return 0;
+}
+
 static int vega10_get_fan_control_mode(struct pp_hwmgr *hwmgr)
 {
 	struct vega10_hwmgr *data = (struct vega10_hwmgr *)(hwmgr->backend);
@@ -4396,7 +4619,9 @@ static int vega10_force_clock_level(struct pp_hwmgr *hwmgr,
 	struct vega10_hwmgr *data = (struct vega10_hwmgr *)(hwmgr->backend);
 	int i;
 
-	if (hwmgr->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL)
+	if (hwmgr->dpm_level & (AMD_DPM_FORCED_LEVEL_AUTO |
+				AMD_DPM_FORCED_LEVEL_LOW |
+				AMD_DPM_FORCED_LEVEL_HIGH))
 		return -EINVAL;
 
 	switch (type) {
@@ -4661,6 +4886,10 @@ static int vega10_disable_dpm_tasks(struct pp_hwmgr *hwmgr)
 	PP_ASSERT_WITH_CODE((tmp_result == 0),
 			"Failed to disable power containment!", result = tmp_result);
 
+	tmp_result = vega10_disable_didt_config(hwmgr);
+	PP_ASSERT_WITH_CODE((tmp_result == 0),
+			"Failed to disable didt config!", result = tmp_result);
+
 	tmp_result = vega10_avfs_enable(hwmgr, false);
 	PP_ASSERT_WITH_CODE((tmp_result == 0),
 			"Failed to disable AVFS!", result = tmp_result);
@@ -4677,6 +4906,9 @@ static int vega10_disable_dpm_tasks(struct pp_hwmgr *hwmgr)
 	PP_ASSERT_WITH_CODE((tmp_result == 0),
 			"Failed to disable ulv!", result = tmp_result);
 
+	tmp_result = vega10_acg_disable(hwmgr);
+	PP_ASSERT_WITH_CODE((tmp_result == 0),
+			"Failed to disable acg!", result = tmp_result);
 	return result;
 }
 
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h
index 6e5c5b9..676cd77 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_hwmgr.h
@@ -64,7 +64,9 @@ enum {
 	GNLD_FW_CTF,
 	GNLD_LED_DISPLAY,
 	GNLD_FAN_CONTROL,
-	GNLD_VOLTAGE_CONTROLLER,
+	GNLD_FEATURE_FAST_PPT_BIT,
+	GNLD_DIDT,
+	GNLD_ACG,
 	GNLD_FEATURES_MAX
 };
 
@@ -230,7 +232,9 @@ struct vega10_registry_data {
 	uint8_t   cac_support;
 	uint8_t   clock_stretcher_support;
 	uint8_t   db_ramping_support;
+	uint8_t   didt_mode;
 	uint8_t   didt_support;
+	uint8_t   edc_didt_support;
 	uint8_t   dynamic_state_patching_support;
 	uint8_t   enable_pkg_pwr_tracking_feature;
 	uint8_t   enable_tdc_limit_feature;
@@ -263,6 +267,9 @@ struct vega10_registry_data {
 	uint8_t   tcp_ramping_support;
 	uint8_t   tdc_support;
 	uint8_t   td_ramping_support;
+	uint8_t   dbr_ramping_support;
+	uint8_t   gc_didt_support;
+	uint8_t   psm_didt_support;
 	uint8_t   thermal_out_gpio_support;
 	uint8_t   thermal_support;
 	uint8_t   fw_ctf_enabled;
@@ -381,6 +388,8 @@ struct vega10_hwmgr {
 	struct vega10_smc_state_table  smc_state_table;
 
 	uint32_t                       config_telemetry;
+	uint32_t                       smu_version;
+	uint32_t                       acg_loop_state;
 };
 
 #define VEGA10_DPM2_NEAR_TDP_DEC                      10
@@ -425,6 +434,10 @@ struct vega10_hwmgr {
 #define PPVEGA10_VEGA10UCLKCLKAVERAGEALPHA_DFLT      25 /* 10% * 255 = 25 */
 #define PPVEGA10_VEGA10GFXACTIVITYAVERAGEALPHA_DFLT  25 /* 10% * 255 = 25 */
 
+#define VEGA10_UMD_PSTATE_GFXCLK_LEVEL         0x3
+#define VEGA10_UMD_PSTATE_SOCCLK_LEVEL         0x3
+#define VEGA10_UMD_PSTATE_MCLK_LEVEL           0x2
+
 extern int tonga_initializa_dynamic_state_adjustment_rule_settings(struct pp_hwmgr *hwmgr);
 extern int tonga_hwmgr_backend_fini(struct pp_hwmgr *hwmgr);
 extern int tonga_get_mc_microcode_version (struct pp_hwmgr *hwmgr);
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_powertune.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_powertune.c
index 3f72268..e7fa670 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_powertune.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_powertune.c
@@ -26,7 +26,1298 @@
 #include "vega10_powertune.h"
 #include "vega10_smumgr.h"
 #include "vega10_ppsmc.h"
+#include "vega10_inc.h"
 #include "pp_debug.h"
+#include "pp_soc15.h"
+
+static const struct vega10_didt_config_reg SEDiDtTuningCtrlConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* DIDT_SQ */
+	{   ixDIDT_SQ_TUNING_CTRL,             DIDT_SQ_TUNING_CTRL__MAX_POWER_DELTA_HI_MASK,        DIDT_SQ_TUNING_CTRL__MAX_POWER_DELTA_HI__SHIFT,        0x3853 },
+	{   ixDIDT_SQ_TUNING_CTRL,             DIDT_SQ_TUNING_CTRL__MAX_POWER_DELTA_LO_MASK,        DIDT_SQ_TUNING_CTRL__MAX_POWER_DELTA_LO__SHIFT,        0x3153 },
+
+	/* DIDT_TD */
+	{   ixDIDT_TD_TUNING_CTRL,             DIDT_TD_TUNING_CTRL__MAX_POWER_DELTA_HI_MASK,        DIDT_TD_TUNING_CTRL__MAX_POWER_DELTA_HI__SHIFT,        0x0dde },
+	{   ixDIDT_TD_TUNING_CTRL,             DIDT_TD_TUNING_CTRL__MAX_POWER_DELTA_LO_MASK,        DIDT_TD_TUNING_CTRL__MAX_POWER_DELTA_LO__SHIFT,        0x0dde },
+
+	/* DIDT_TCP */
+	{   ixDIDT_TCP_TUNING_CTRL,            DIDT_TCP_TUNING_CTRL__MAX_POWER_DELTA_HI_MASK,       DIDT_TCP_TUNING_CTRL__MAX_POWER_DELTA_HI__SHIFT,       0x3dde },
+	{   ixDIDT_TCP_TUNING_CTRL,            DIDT_TCP_TUNING_CTRL__MAX_POWER_DELTA_LO_MASK,       DIDT_TCP_TUNING_CTRL__MAX_POWER_DELTA_LO__SHIFT,       0x3dde },
+
+	/* DIDT_DB */
+	{   ixDIDT_DB_TUNING_CTRL,             DIDT_DB_TUNING_CTRL__MAX_POWER_DELTA_HI_MASK,        DIDT_DB_TUNING_CTRL__MAX_POWER_DELTA_HI__SHIFT,        0x3dde },
+	{   ixDIDT_DB_TUNING_CTRL,             DIDT_DB_TUNING_CTRL__MAX_POWER_DELTA_LO_MASK,        DIDT_DB_TUNING_CTRL__MAX_POWER_DELTA_LO__SHIFT,        0x3dde },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEDiDtCtrl3Config_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset               Mask                                                     Shift                                                            Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/*DIDT_SQ_CTRL3 */
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__GC_DIDT_ENABLE_MASK,       DIDT_SQ_CTRL3__GC_DIDT_ENABLE__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__GC_DIDT_CLK_EN_OVERRIDE_MASK,       DIDT_SQ_CTRL3__GC_DIDT_CLK_EN_OVERRIDE__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__THROTTLE_POLICY_MASK,       DIDT_SQ_CTRL3__THROTTLE_POLICY__SHIFT,             0x0003 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__DIDT_TRIGGER_THROTTLE_LOWBIT_MASK,       DIDT_SQ_CTRL3__DIDT_TRIGGER_THROTTLE_LOWBIT__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__DIDT_POWER_LEVEL_LOWBIT_MASK,       DIDT_SQ_CTRL3__DIDT_POWER_LEVEL_LOWBIT__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__DIDT_STALL_PATTERN_BIT_NUMS_MASK,       DIDT_SQ_CTRL3__DIDT_STALL_PATTERN_BIT_NUMS__SHIFT,             0x0003 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__GC_DIDT_LEVEL_COMB_EN_MASK,       DIDT_SQ_CTRL3__GC_DIDT_LEVEL_COMB_EN__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__SE_DIDT_LEVEL_COMB_EN_MASK,       DIDT_SQ_CTRL3__SE_DIDT_LEVEL_COMB_EN__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__QUALIFY_STALL_EN_MASK,       DIDT_SQ_CTRL3__QUALIFY_STALL_EN__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__DIDT_STALL_SEL_MASK,       DIDT_SQ_CTRL3__DIDT_STALL_SEL__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__DIDT_FORCE_STALL_MASK,       DIDT_SQ_CTRL3__DIDT_FORCE_STALL__SHIFT,             0x0000 },
+	{   ixDIDT_SQ_CTRL3,     DIDT_SQ_CTRL3__DIDT_STALL_DELAY_EN_MASK,       DIDT_SQ_CTRL3__DIDT_STALL_DELAY_EN__SHIFT,             0x0000 },
+
+	/*DIDT_TCP_CTRL3 */
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__GC_DIDT_ENABLE_MASK,      DIDT_TCP_CTRL3__GC_DIDT_ENABLE__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__GC_DIDT_CLK_EN_OVERRIDE_MASK,      DIDT_TCP_CTRL3__GC_DIDT_CLK_EN_OVERRIDE__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__THROTTLE_POLICY_MASK,      DIDT_TCP_CTRL3__THROTTLE_POLICY__SHIFT,            0x0003 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__DIDT_TRIGGER_THROTTLE_LOWBIT_MASK,      DIDT_TCP_CTRL3__DIDT_TRIGGER_THROTTLE_LOWBIT__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__DIDT_POWER_LEVEL_LOWBIT_MASK,      DIDT_TCP_CTRL3__DIDT_POWER_LEVEL_LOWBIT__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__DIDT_STALL_PATTERN_BIT_NUMS_MASK,      DIDT_TCP_CTRL3__DIDT_STALL_PATTERN_BIT_NUMS__SHIFT,            0x0003 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__GC_DIDT_LEVEL_COMB_EN_MASK,      DIDT_TCP_CTRL3__GC_DIDT_LEVEL_COMB_EN__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__SE_DIDT_LEVEL_COMB_EN_MASK,      DIDT_TCP_CTRL3__SE_DIDT_LEVEL_COMB_EN__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__QUALIFY_STALL_EN_MASK,      DIDT_TCP_CTRL3__QUALIFY_STALL_EN__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__DIDT_STALL_SEL_MASK,      DIDT_TCP_CTRL3__DIDT_STALL_SEL__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__DIDT_FORCE_STALL_MASK,      DIDT_TCP_CTRL3__DIDT_FORCE_STALL__SHIFT,            0x0000 },
+	{   ixDIDT_TCP_CTRL3,    DIDT_TCP_CTRL3__DIDT_STALL_DELAY_EN_MASK,      DIDT_TCP_CTRL3__DIDT_STALL_DELAY_EN__SHIFT,            0x0000 },
+
+	/*DIDT_TD_CTRL3 */
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__GC_DIDT_ENABLE_MASK,       DIDT_TD_CTRL3__GC_DIDT_ENABLE__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__GC_DIDT_CLK_EN_OVERRIDE_MASK,       DIDT_TD_CTRL3__GC_DIDT_CLK_EN_OVERRIDE__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__THROTTLE_POLICY_MASK,       DIDT_TD_CTRL3__THROTTLE_POLICY__SHIFT,             0x0003 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__DIDT_TRIGGER_THROTTLE_LOWBIT_MASK,       DIDT_TD_CTRL3__DIDT_TRIGGER_THROTTLE_LOWBIT__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__DIDT_POWER_LEVEL_LOWBIT_MASK,       DIDT_TD_CTRL3__DIDT_POWER_LEVEL_LOWBIT__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__DIDT_STALL_PATTERN_BIT_NUMS_MASK,       DIDT_TD_CTRL3__DIDT_STALL_PATTERN_BIT_NUMS__SHIFT,             0x0003 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__GC_DIDT_LEVEL_COMB_EN_MASK,       DIDT_TD_CTRL3__GC_DIDT_LEVEL_COMB_EN__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__SE_DIDT_LEVEL_COMB_EN_MASK,       DIDT_TD_CTRL3__SE_DIDT_LEVEL_COMB_EN__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__QUALIFY_STALL_EN_MASK,       DIDT_TD_CTRL3__QUALIFY_STALL_EN__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__DIDT_STALL_SEL_MASK,       DIDT_TD_CTRL3__DIDT_STALL_SEL__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__DIDT_FORCE_STALL_MASK,       DIDT_TD_CTRL3__DIDT_FORCE_STALL__SHIFT,             0x0000 },
+	{   ixDIDT_TD_CTRL3,     DIDT_TD_CTRL3__DIDT_STALL_DELAY_EN_MASK,       DIDT_TD_CTRL3__DIDT_STALL_DELAY_EN__SHIFT,             0x0000 },
+
+	/*DIDT_DB_CTRL3 */
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__GC_DIDT_ENABLE_MASK,       DIDT_DB_CTRL3__GC_DIDT_ENABLE__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__GC_DIDT_CLK_EN_OVERRIDE_MASK,       DIDT_DB_CTRL3__GC_DIDT_CLK_EN_OVERRIDE__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__THROTTLE_POLICY_MASK,       DIDT_DB_CTRL3__THROTTLE_POLICY__SHIFT,             0x0003 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__DIDT_TRIGGER_THROTTLE_LOWBIT_MASK,       DIDT_DB_CTRL3__DIDT_TRIGGER_THROTTLE_LOWBIT__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__DIDT_POWER_LEVEL_LOWBIT_MASK,       DIDT_DB_CTRL3__DIDT_POWER_LEVEL_LOWBIT__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__DIDT_STALL_PATTERN_BIT_NUMS_MASK,       DIDT_DB_CTRL3__DIDT_STALL_PATTERN_BIT_NUMS__SHIFT,             0x0003 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__GC_DIDT_LEVEL_COMB_EN_MASK,       DIDT_DB_CTRL3__GC_DIDT_LEVEL_COMB_EN__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__SE_DIDT_LEVEL_COMB_EN_MASK,       DIDT_DB_CTRL3__SE_DIDT_LEVEL_COMB_EN__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__QUALIFY_STALL_EN_MASK,       DIDT_DB_CTRL3__QUALIFY_STALL_EN__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__DIDT_STALL_SEL_MASK,       DIDT_DB_CTRL3__DIDT_STALL_SEL__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__DIDT_FORCE_STALL_MASK,       DIDT_DB_CTRL3__DIDT_FORCE_STALL__SHIFT,             0x0000 },
+	{   ixDIDT_DB_CTRL3,     DIDT_DB_CTRL3__DIDT_STALL_DELAY_EN_MASK,       DIDT_DB_CTRL3__DIDT_STALL_DELAY_EN__SHIFT,             0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEDiDtCtrl2Config_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                            Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* DIDT_SQ */
+	{   ixDIDT_SQ_CTRL2,                  DIDT_SQ_CTRL2__MAX_POWER_DELTA_MASK,                 DIDT_SQ_CTRL2__MAX_POWER_DELTA__SHIFT,                 0x3853 },
+	{   ixDIDT_SQ_CTRL2,                  DIDT_SQ_CTRL2__SHORT_TERM_INTERVAL_SIZE_MASK,        DIDT_SQ_CTRL2__SHORT_TERM_INTERVAL_SIZE__SHIFT,        0x00c0 },
+	{   ixDIDT_SQ_CTRL2,                  DIDT_SQ_CTRL2__LONG_TERM_INTERVAL_RATIO_MASK,        DIDT_SQ_CTRL2__LONG_TERM_INTERVAL_RATIO__SHIFT,        0x0000 },
+
+	/* DIDT_TD */
+	{   ixDIDT_TD_CTRL2,                  DIDT_TD_CTRL2__MAX_POWER_DELTA_MASK,                 DIDT_TD_CTRL2__MAX_POWER_DELTA__SHIFT,                 0x3fff },
+	{   ixDIDT_TD_CTRL2,                  DIDT_TD_CTRL2__SHORT_TERM_INTERVAL_SIZE_MASK,        DIDT_TD_CTRL2__SHORT_TERM_INTERVAL_SIZE__SHIFT,        0x00c0 },
+	{   ixDIDT_TD_CTRL2,                  DIDT_TD_CTRL2__LONG_TERM_INTERVAL_RATIO_MASK,        DIDT_TD_CTRL2__LONG_TERM_INTERVAL_RATIO__SHIFT,        0x0001 },
+
+	/* DIDT_TCP */
+	{   ixDIDT_TCP_CTRL2,                 DIDT_TCP_CTRL2__MAX_POWER_DELTA_MASK,                DIDT_TCP_CTRL2__MAX_POWER_DELTA__SHIFT,                0x3dde },
+	{   ixDIDT_TCP_CTRL2,                 DIDT_TCP_CTRL2__SHORT_TERM_INTERVAL_SIZE_MASK,       DIDT_TCP_CTRL2__SHORT_TERM_INTERVAL_SIZE__SHIFT,       0x00c0 },
+	{   ixDIDT_TCP_CTRL2,                 DIDT_TCP_CTRL2__LONG_TERM_INTERVAL_RATIO_MASK,       DIDT_TCP_CTRL2__LONG_TERM_INTERVAL_RATIO__SHIFT,       0x0001 },
+
+	/* DIDT_DB */
+	{   ixDIDT_DB_CTRL2,                  DIDT_DB_CTRL2__MAX_POWER_DELTA_MASK,                 DIDT_DB_CTRL2__MAX_POWER_DELTA__SHIFT,                 0x3dde },
+	{   ixDIDT_DB_CTRL2,                  DIDT_DB_CTRL2__SHORT_TERM_INTERVAL_SIZE_MASK,        DIDT_DB_CTRL2__SHORT_TERM_INTERVAL_SIZE__SHIFT,        0x00c0 },
+	{   ixDIDT_DB_CTRL2,                  DIDT_DB_CTRL2__LONG_TERM_INTERVAL_RATIO_MASK,        DIDT_DB_CTRL2__LONG_TERM_INTERVAL_RATIO__SHIFT,        0x0001 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEDiDtCtrl1Config_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* DIDT_SQ */
+	{   ixDIDT_SQ_CTRL1,                   DIDT_SQ_CTRL1__MIN_POWER_MASK,                       DIDT_SQ_CTRL1__MIN_POWER__SHIFT,                       0x0000 },
+	{   ixDIDT_SQ_CTRL1,                   DIDT_SQ_CTRL1__MAX_POWER_MASK,                       DIDT_SQ_CTRL1__MAX_POWER__SHIFT,                       0xffff },
+	/* DIDT_TD */
+	{   ixDIDT_TD_CTRL1,                   DIDT_TD_CTRL1__MIN_POWER_MASK,                       DIDT_TD_CTRL1__MIN_POWER__SHIFT,                       0x0000 },
+	{   ixDIDT_TD_CTRL1,                   DIDT_TD_CTRL1__MAX_POWER_MASK,                       DIDT_TD_CTRL1__MAX_POWER__SHIFT,                       0xffff },
+	/* DIDT_TCP */
+	{   ixDIDT_TCP_CTRL1,                  DIDT_TCP_CTRL1__MIN_POWER_MASK,                      DIDT_TCP_CTRL1__MIN_POWER__SHIFT,                      0x0000 },
+	{   ixDIDT_TCP_CTRL1,                  DIDT_TCP_CTRL1__MAX_POWER_MASK,                      DIDT_TCP_CTRL1__MAX_POWER__SHIFT,                      0xffff },
+	/* DIDT_DB */
+	{   ixDIDT_DB_CTRL1,                   DIDT_DB_CTRL1__MIN_POWER_MASK,                       DIDT_DB_CTRL1__MIN_POWER__SHIFT,                       0x0000 },
+	{   ixDIDT_DB_CTRL1,                   DIDT_DB_CTRL1__MAX_POWER_MASK,                       DIDT_DB_CTRL1__MAX_POWER__SHIFT,                       0xffff },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+
+static const struct vega10_didt_config_reg SEDiDtWeightConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                  Shift                                                 Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* DIDT_SQ */
+	{   ixDIDT_SQ_WEIGHT0_3,               0xFFFFFFFF,                                           0,                                                    0x2B363B1A },
+	{   ixDIDT_SQ_WEIGHT4_7,               0xFFFFFFFF,                                           0,                                                    0x270B2432 },
+	{   ixDIDT_SQ_WEIGHT8_11,              0xFFFFFFFF,                                           0,                                                    0x00000018 },
+
+	/* DIDT_TD */
+	{   ixDIDT_TD_WEIGHT0_3,               0xFFFFFFFF,                                           0,                                                    0x2B1D220F },
+	{   ixDIDT_TD_WEIGHT4_7,               0xFFFFFFFF,                                           0,                                                    0x00007558 },
+	{   ixDIDT_TD_WEIGHT8_11,              0xFFFFFFFF,                                           0,                                                    0x00000000 },
+
+	/* DIDT_TCP */
+	{   ixDIDT_TCP_WEIGHT0_3,               0xFFFFFFFF,                                          0,                                                    0x5ACE160D },
+	{   ixDIDT_TCP_WEIGHT4_7,               0xFFFFFFFF,                                          0,                                                    0x00000000 },
+	{   ixDIDT_TCP_WEIGHT8_11,              0xFFFFFFFF,                                          0,                                                    0x00000000 },
+
+	/* DIDT_DB */
+	{   ixDIDT_DB_WEIGHT0_3,                0xFFFFFFFF,                                          0,                                                    0x0E152A0F },
+	{   ixDIDT_DB_WEIGHT4_7,                0xFFFFFFFF,                                          0,                                                    0x09061813 },
+	{   ixDIDT_DB_WEIGHT8_11,               0xFFFFFFFF,                                          0,                                                    0x00000013 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEDiDtCtrl0Config_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* DIDT_SQ */
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_CTRL_EN_MASK,   DIDT_SQ_CTRL0__DIDT_CTRL_EN__SHIFT,  0x0000 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__PHASE_OFFSET_MASK,   DIDT_SQ_CTRL0__PHASE_OFFSET__SHIFT,  0x0000 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_CTRL_RST_MASK,   DIDT_SQ_CTRL0__DIDT_CTRL_RST__SHIFT,  0x0000 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_CLK_EN_OVERRIDE_MASK,   DIDT_SQ_CTRL0__DIDT_CLK_EN_OVERRIDE__SHIFT,  0x0000 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_STALL_CTRL_EN_MASK,   DIDT_SQ_CTRL0__DIDT_STALL_CTRL_EN__SHIFT,  0x0001 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_TUNING_CTRL_EN_MASK,   DIDT_SQ_CTRL0__DIDT_TUNING_CTRL_EN__SHIFT,  0x0001 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_STALL_AUTO_RELEASE_EN_MASK,   DIDT_SQ_CTRL0__DIDT_STALL_AUTO_RELEASE_EN__SHIFT,  0x0001 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_HI_POWER_THRESHOLD_MASK,   DIDT_SQ_CTRL0__DIDT_HI_POWER_THRESHOLD__SHIFT,  0xffff },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_AUTO_MPD_EN_MASK,   DIDT_SQ_CTRL0__DIDT_AUTO_MPD_EN__SHIFT,  0x0000 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_STALL_EVENT_EN_MASK,   DIDT_SQ_CTRL0__DIDT_STALL_EVENT_EN__SHIFT,  0x0000 },
+	{  ixDIDT_SQ_CTRL0,                   DIDT_SQ_CTRL0__DIDT_STALL_EVENT_COUNTER_CLEAR_MASK,   DIDT_SQ_CTRL0__DIDT_STALL_EVENT_COUNTER_CLEAR__SHIFT,  0x0000 },
+	/* DIDT_TD */
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_CTRL_EN_MASK,   DIDT_TD_CTRL0__DIDT_CTRL_EN__SHIFT,  0x0000 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__PHASE_OFFSET_MASK,   DIDT_TD_CTRL0__PHASE_OFFSET__SHIFT,  0x0000 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_CTRL_RST_MASK,   DIDT_TD_CTRL0__DIDT_CTRL_RST__SHIFT,  0x0000 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_CLK_EN_OVERRIDE_MASK,   DIDT_TD_CTRL0__DIDT_CLK_EN_OVERRIDE__SHIFT,  0x0000 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_STALL_CTRL_EN_MASK,   DIDT_TD_CTRL0__DIDT_STALL_CTRL_EN__SHIFT,  0x0001 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_TUNING_CTRL_EN_MASK,   DIDT_TD_CTRL0__DIDT_TUNING_CTRL_EN__SHIFT,  0x0001 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_STALL_AUTO_RELEASE_EN_MASK,   DIDT_TD_CTRL0__DIDT_STALL_AUTO_RELEASE_EN__SHIFT,  0x0001 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_HI_POWER_THRESHOLD_MASK,   DIDT_TD_CTRL0__DIDT_HI_POWER_THRESHOLD__SHIFT,  0xffff },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_AUTO_MPD_EN_MASK,   DIDT_TD_CTRL0__DIDT_AUTO_MPD_EN__SHIFT,  0x0000 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_STALL_EVENT_EN_MASK,   DIDT_TD_CTRL0__DIDT_STALL_EVENT_EN__SHIFT,  0x0000 },
+	{  ixDIDT_TD_CTRL0,                   DIDT_TD_CTRL0__DIDT_STALL_EVENT_COUNTER_CLEAR_MASK,   DIDT_TD_CTRL0__DIDT_STALL_EVENT_COUNTER_CLEAR__SHIFT,  0x0000 },
+	/* DIDT_TCP */
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_CTRL_EN_MASK,  DIDT_TCP_CTRL0__DIDT_CTRL_EN__SHIFT, 0x0000 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__PHASE_OFFSET_MASK,  DIDT_TCP_CTRL0__PHASE_OFFSET__SHIFT, 0x0000 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_CTRL_RST_MASK,  DIDT_TCP_CTRL0__DIDT_CTRL_RST__SHIFT, 0x0000 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_CLK_EN_OVERRIDE_MASK,  DIDT_TCP_CTRL0__DIDT_CLK_EN_OVERRIDE__SHIFT, 0x0000 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_STALL_CTRL_EN_MASK,  DIDT_TCP_CTRL0__DIDT_STALL_CTRL_EN__SHIFT, 0x0001 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_TUNING_CTRL_EN_MASK,  DIDT_TCP_CTRL0__DIDT_TUNING_CTRL_EN__SHIFT, 0x0001 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_STALL_AUTO_RELEASE_EN_MASK,  DIDT_TCP_CTRL0__DIDT_STALL_AUTO_RELEASE_EN__SHIFT, 0x0001 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_HI_POWER_THRESHOLD_MASK,  DIDT_TCP_CTRL0__DIDT_HI_POWER_THRESHOLD__SHIFT, 0xffff },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_AUTO_MPD_EN_MASK,  DIDT_TCP_CTRL0__DIDT_AUTO_MPD_EN__SHIFT, 0x0000 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_STALL_EVENT_EN_MASK,  DIDT_TCP_CTRL0__DIDT_STALL_EVENT_EN__SHIFT, 0x0000 },
+	{  ixDIDT_TCP_CTRL0,                  DIDT_TCP_CTRL0__DIDT_STALL_EVENT_COUNTER_CLEAR_MASK,  DIDT_TCP_CTRL0__DIDT_STALL_EVENT_COUNTER_CLEAR__SHIFT, 0x0000 },
+	/* DIDT_DB */
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_CTRL_EN_MASK,   DIDT_DB_CTRL0__DIDT_CTRL_EN__SHIFT,  0x0000 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__PHASE_OFFSET_MASK,   DIDT_DB_CTRL0__PHASE_OFFSET__SHIFT,  0x0000 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_CTRL_RST_MASK,   DIDT_DB_CTRL0__DIDT_CTRL_RST__SHIFT,  0x0000 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_CLK_EN_OVERRIDE_MASK,   DIDT_DB_CTRL0__DIDT_CLK_EN_OVERRIDE__SHIFT,  0x0000 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_STALL_CTRL_EN_MASK,   DIDT_DB_CTRL0__DIDT_STALL_CTRL_EN__SHIFT,  0x0001 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_TUNING_CTRL_EN_MASK,   DIDT_DB_CTRL0__DIDT_TUNING_CTRL_EN__SHIFT,  0x0001 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_STALL_AUTO_RELEASE_EN_MASK,   DIDT_DB_CTRL0__DIDT_STALL_AUTO_RELEASE_EN__SHIFT,  0x0001 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_HI_POWER_THRESHOLD_MASK,   DIDT_DB_CTRL0__DIDT_HI_POWER_THRESHOLD__SHIFT,  0xffff },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_AUTO_MPD_EN_MASK,   DIDT_DB_CTRL0__DIDT_AUTO_MPD_EN__SHIFT,  0x0000 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_STALL_EVENT_EN_MASK,   DIDT_DB_CTRL0__DIDT_STALL_EVENT_EN__SHIFT,  0x0000 },
+	{  ixDIDT_DB_CTRL0,                   DIDT_DB_CTRL0__DIDT_STALL_EVENT_COUNTER_CLEAR_MASK,   DIDT_DB_CTRL0__DIDT_STALL_EVENT_COUNTER_CLEAR__SHIFT,  0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+
+static const struct vega10_didt_config_reg SEDiDtStallCtrlConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                   Mask                                                     Shift                                                      Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* DIDT_SQ */
+	{   ixDIDT_SQ_STALL_CTRL,    DIDT_SQ_STALL_CTRL__DIDT_STALL_DELAY_HI_MASK,    DIDT_SQ_STALL_CTRL__DIDT_STALL_DELAY_HI__SHIFT,     0x0004 },
+	{   ixDIDT_SQ_STALL_CTRL,    DIDT_SQ_STALL_CTRL__DIDT_STALL_DELAY_LO_MASK,    DIDT_SQ_STALL_CTRL__DIDT_STALL_DELAY_LO__SHIFT,     0x0004 },
+	{   ixDIDT_SQ_STALL_CTRL,    DIDT_SQ_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_HI_MASK,    DIDT_SQ_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_HI__SHIFT,     0x000a },
+	{   ixDIDT_SQ_STALL_CTRL,    DIDT_SQ_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_LO_MASK,    DIDT_SQ_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_LO__SHIFT,     0x000a },
+
+	/* DIDT_TD */
+	{   ixDIDT_TD_STALL_CTRL,    DIDT_TD_STALL_CTRL__DIDT_STALL_DELAY_HI_MASK,    DIDT_TD_STALL_CTRL__DIDT_STALL_DELAY_HI__SHIFT,     0x0001 },
+	{   ixDIDT_TD_STALL_CTRL,    DIDT_TD_STALL_CTRL__DIDT_STALL_DELAY_LO_MASK,    DIDT_TD_STALL_CTRL__DIDT_STALL_DELAY_LO__SHIFT,     0x0001 },
+	{   ixDIDT_TD_STALL_CTRL,    DIDT_TD_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_HI_MASK,    DIDT_TD_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_HI__SHIFT,     0x000a },
+	{   ixDIDT_TD_STALL_CTRL,    DIDT_TD_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_LO_MASK,    DIDT_TD_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_LO__SHIFT,     0x000a },
+
+	/* DIDT_TCP */
+	{   ixDIDT_TCP_STALL_CTRL,   DIDT_TCP_STALL_CTRL__DIDT_STALL_DELAY_HI_MASK,   DIDT_TCP_STALL_CTRL__DIDT_STALL_DELAY_HI__SHIFT,    0x0001 },
+	{   ixDIDT_TCP_STALL_CTRL,   DIDT_TCP_STALL_CTRL__DIDT_STALL_DELAY_LO_MASK,   DIDT_TCP_STALL_CTRL__DIDT_STALL_DELAY_LO__SHIFT,    0x0001 },
+	{   ixDIDT_TCP_STALL_CTRL,   DIDT_TCP_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_HI_MASK,   DIDT_TCP_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_HI__SHIFT,    0x000a },
+	{   ixDIDT_TCP_STALL_CTRL,   DIDT_TCP_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_LO_MASK,   DIDT_TCP_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_LO__SHIFT,    0x000a },
+
+	/* DIDT_DB */
+	{   ixDIDT_DB_STALL_CTRL,    DIDT_DB_STALL_CTRL__DIDT_STALL_DELAY_HI_MASK,    DIDT_DB_STALL_CTRL__DIDT_STALL_DELAY_HI__SHIFT,     0x0004 },
+	{   ixDIDT_DB_STALL_CTRL,    DIDT_DB_STALL_CTRL__DIDT_STALL_DELAY_LO_MASK,    DIDT_DB_STALL_CTRL__DIDT_STALL_DELAY_LO__SHIFT,     0x0004 },
+	{   ixDIDT_DB_STALL_CTRL,    DIDT_DB_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_HI_MASK,    DIDT_DB_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_HI__SHIFT,     0x000a },
+	{   ixDIDT_DB_STALL_CTRL,    DIDT_DB_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_LO_MASK,    DIDT_DB_STALL_CTRL__DIDT_MAX_STALLS_ALLOWED_LO__SHIFT,     0x000a },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEDiDtStallPatternConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                        Mask                                                      Shift                                                    Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* DIDT_SQ_STALL_PATTERN_1_2 */
+	{   ixDIDT_SQ_STALL_PATTERN_1_2,  DIDT_SQ_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_1_MASK,    DIDT_SQ_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_1__SHIFT,  0x0001 },
+	{   ixDIDT_SQ_STALL_PATTERN_1_2,  DIDT_SQ_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_2_MASK,    DIDT_SQ_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_2__SHIFT,  0x0001 },
+
+	/* DIDT_SQ_STALL_PATTERN_3_4 */
+	{   ixDIDT_SQ_STALL_PATTERN_3_4,  DIDT_SQ_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_3_MASK,    DIDT_SQ_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_3__SHIFT,  0x0001 },
+	{   ixDIDT_SQ_STALL_PATTERN_3_4,  DIDT_SQ_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_4_MASK,    DIDT_SQ_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_4__SHIFT,  0x0001 },
+
+	/* DIDT_SQ_STALL_PATTERN_5_6 */
+	{   ixDIDT_SQ_STALL_PATTERN_5_6,  DIDT_SQ_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_5_MASK,    DIDT_SQ_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_5__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_STALL_PATTERN_5_6,  DIDT_SQ_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_6_MASK,    DIDT_SQ_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_6__SHIFT,  0x0000 },
+
+	/* DIDT_SQ_STALL_PATTERN_7 */
+	{   ixDIDT_SQ_STALL_PATTERN_7,    DIDT_SQ_STALL_PATTERN_7__DIDT_STALL_PATTERN_7_MASK,      DIDT_SQ_STALL_PATTERN_7__DIDT_STALL_PATTERN_7__SHIFT,    0x0000 },
+
+	/* DIDT_TCP_STALL_PATTERN_1_2 */
+	{   ixDIDT_TCP_STALL_PATTERN_1_2, DIDT_TCP_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_1_MASK,   DIDT_TCP_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_1__SHIFT, 0x0001 },
+	{   ixDIDT_TCP_STALL_PATTERN_1_2, DIDT_TCP_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_2_MASK,   DIDT_TCP_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_2__SHIFT, 0x0001 },
+
+	/* DIDT_TCP_STALL_PATTERN_3_4 */
+	{   ixDIDT_TCP_STALL_PATTERN_3_4, DIDT_TCP_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_3_MASK,   DIDT_TCP_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_3__SHIFT, 0x0001 },
+	{   ixDIDT_TCP_STALL_PATTERN_3_4, DIDT_TCP_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_4_MASK,   DIDT_TCP_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_4__SHIFT, 0x0001 },
+
+	/* DIDT_TCP_STALL_PATTERN_5_6 */
+	{   ixDIDT_TCP_STALL_PATTERN_5_6, DIDT_TCP_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_5_MASK,   DIDT_TCP_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_5__SHIFT, 0x0000 },
+	{   ixDIDT_TCP_STALL_PATTERN_5_6, DIDT_TCP_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_6_MASK,   DIDT_TCP_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_6__SHIFT, 0x0000 },
+
+	/* DIDT_TCP_STALL_PATTERN_7 */
+	{   ixDIDT_TCP_STALL_PATTERN_7,   DIDT_TCP_STALL_PATTERN_7__DIDT_STALL_PATTERN_7_MASK,     DIDT_TCP_STALL_PATTERN_7__DIDT_STALL_PATTERN_7__SHIFT,   0x0000 },
+
+	/* DIDT_TD_STALL_PATTERN_1_2 */
+	{   ixDIDT_TD_STALL_PATTERN_1_2,  DIDT_TD_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_1_MASK,    DIDT_TD_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_1__SHIFT,  0x0001 },
+	{   ixDIDT_TD_STALL_PATTERN_1_2,  DIDT_TD_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_2_MASK,    DIDT_TD_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_2__SHIFT,  0x0001 },
+
+	/* DIDT_TD_STALL_PATTERN_3_4 */
+	{   ixDIDT_TD_STALL_PATTERN_3_4,  DIDT_TD_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_3_MASK,    DIDT_TD_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_3__SHIFT,  0x0001 },
+	{   ixDIDT_TD_STALL_PATTERN_3_4,  DIDT_TD_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_4_MASK,    DIDT_TD_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_4__SHIFT,  0x0001 },
+
+	/* DIDT_TD_STALL_PATTERN_5_6 */
+	{   ixDIDT_TD_STALL_PATTERN_5_6,  DIDT_TD_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_5_MASK,    DIDT_TD_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_5__SHIFT,  0x0000 },
+	{   ixDIDT_TD_STALL_PATTERN_5_6,  DIDT_TD_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_6_MASK,    DIDT_TD_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_6__SHIFT,  0x0000 },
+
+	/* DIDT_TD_STALL_PATTERN_7 */
+	{   ixDIDT_TD_STALL_PATTERN_7,    DIDT_TD_STALL_PATTERN_7__DIDT_STALL_PATTERN_7_MASK,      DIDT_TD_STALL_PATTERN_7__DIDT_STALL_PATTERN_7__SHIFT,    0x0000 },
+
+	/* DIDT_DB_STALL_PATTERN_1_2 */
+	{   ixDIDT_DB_STALL_PATTERN_1_2,  DIDT_DB_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_1_MASK,    DIDT_DB_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_1__SHIFT,  0x0001 },
+	{   ixDIDT_DB_STALL_PATTERN_1_2,  DIDT_DB_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_2_MASK,    DIDT_DB_STALL_PATTERN_1_2__DIDT_STALL_PATTERN_2__SHIFT,  0x0001 },
+
+	/* DIDT_DB_STALL_PATTERN_3_4 */
+	{   ixDIDT_DB_STALL_PATTERN_3_4,  DIDT_DB_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_3_MASK,    DIDT_DB_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_3__SHIFT,  0x0001 },
+	{   ixDIDT_DB_STALL_PATTERN_3_4,  DIDT_DB_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_4_MASK,    DIDT_DB_STALL_PATTERN_3_4__DIDT_STALL_PATTERN_4__SHIFT,  0x0001 },
+
+	/* DIDT_DB_STALL_PATTERN_5_6 */
+	{   ixDIDT_DB_STALL_PATTERN_5_6,  DIDT_DB_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_5_MASK,    DIDT_DB_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_5__SHIFT,  0x0000 },
+	{   ixDIDT_DB_STALL_PATTERN_5_6,  DIDT_DB_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_6_MASK,    DIDT_DB_STALL_PATTERN_5_6__DIDT_STALL_PATTERN_6__SHIFT,  0x0000 },
+
+	/* DIDT_DB_STALL_PATTERN_7 */
+	{   ixDIDT_DB_STALL_PATTERN_7,    DIDT_DB_STALL_PATTERN_7__DIDT_STALL_PATTERN_7_MASK,      DIDT_DB_STALL_PATTERN_7__DIDT_STALL_PATTERN_7__SHIFT,    0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SELCacConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ */
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x00060021 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x00860021 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x01060021 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x01860021 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x02060021 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x02860021 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x03060021 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x03860021 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x04060021 },
+	/* TD */
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x000E0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x008E0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x010E0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x018E0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x020E0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x028E0020 },
+	/* TCP */
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x001c0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x009c0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x011c0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x019c0020 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x021c0020 },
+	/* DB */
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x00200008 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x00820008 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x01020008 },
+	{   ixSE_CAC_CNTL,                     0xFFFFFFFF,                                          0,                                                     0x01820008 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+
+static const struct vega10_didt_config_reg SEEDCStallPatternConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ */
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_1_2,   0xFFFFFFFF,                                          0,                                                     0x00030001 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_3_4,   0xFFFFFFFF,                                          0,                                                     0x000F0007 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_5_6,   0xFFFFFFFF,                                          0,                                                     0x003F001F },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_7,     0xFFFFFFFF,                                          0,                                                     0x0000007F },
+	/* TD */
+	{   ixDIDT_TD_EDC_STALL_PATTERN_1_2,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TD_EDC_STALL_PATTERN_3_4,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TD_EDC_STALL_PATTERN_5_6,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TD_EDC_STALL_PATTERN_7,     0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	/* TCP */
+	{   ixDIDT_TCP_EDC_STALL_PATTERN_1_2,  0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TCP_EDC_STALL_PATTERN_3_4,  0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TCP_EDC_STALL_PATTERN_5_6,  0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TCP_EDC_STALL_PATTERN_7,    0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	/* DB */
+	{   ixDIDT_DB_EDC_STALL_PATTERN_1_2,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_DB_EDC_STALL_PATTERN_3_4,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_DB_EDC_STALL_PATTERN_5_6,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_DB_EDC_STALL_PATTERN_7,     0xFFFFFFFF,                                          0,                                                     0x00000000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEEDCForceStallPatternConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ */
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_1_2,   0xFFFFFFFF,                                          0,                                                     0x00000015 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_3_4,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_5_6,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_7,     0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	/* TD */
+	{   ixDIDT_TD_EDC_STALL_PATTERN_1_2,   0xFFFFFFFF,                                          0,                                                     0x00000015 },
+	{   ixDIDT_TD_EDC_STALL_PATTERN_3_4,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TD_EDC_STALL_PATTERN_5_6,   0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TD_EDC_STALL_PATTERN_7,     0xFFFFFFFF,                                          0,                                                     0x00000000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEEDCStallDelayConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ */
+	{   ixDIDT_SQ_EDC_STALL_DELAY_1,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_2,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_3,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_4,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	/* TD */
+	{   ixDIDT_TD_EDC_STALL_DELAY_1,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TD_EDC_STALL_DELAY_2,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TD_EDC_STALL_DELAY_3,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TD_EDC_STALL_DELAY_4,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	/* TCP */
+	{   ixDIDT_TCP_EDC_STALL_DELAY_1,      0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TCP_EDC_STALL_DELAY_2,      0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TCP_EDC_STALL_DELAY_3,      0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	{   ixDIDT_TCP_EDC_STALL_DELAY_4,      0xFFFFFFFF,                                          0,                                                     0x00000000 },
+	/* DB */
+	{   ixDIDT_DB_EDC_STALL_DELAY_1,       0xFFFFFFFF,                                          0,                                                     0x00000000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEEDCThresholdConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   ixDIDT_SQ_EDC_THRESHOLD,           0xFFFFFFFF,                                          0,                                                     0x0000010E },
+	{   ixDIDT_TD_EDC_THRESHOLD,           0xFFFFFFFF,                                          0,                                                     0xFFFFFFFF },
+	{   ixDIDT_TCP_EDC_THRESHOLD,          0xFFFFFFFF,                                          0,                                                     0xFFFFFFFF },
+	{   ixDIDT_DB_EDC_THRESHOLD,           0xFFFFFFFF,                                          0,                                                     0xFFFFFFFF },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEEDCCtrlResetConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ */
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_EN_MASK,                       DIDT_SQ_EDC_CTRL__EDC_EN__SHIFT,                        0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_SW_RST_MASK,                   DIDT_SQ_EDC_CTRL__EDC_SW_RST__SHIFT,                    0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE_MASK,          DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL_MASK,              DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL__SHIFT,               0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT_MASK,  DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT__SHIFT,   0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS_MASK,   DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS__SHIFT,    0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA_MASK,     DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA__SHIFT,      0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_EN_MASK,                    DIDT_SQ_EDC_CTRL__GC_EDC_EN__SHIFT,                     0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY_MASK,          DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN__SHIFT,          0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN__SHIFT,          0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEEDCCtrlConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ */
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_EN_MASK,                       DIDT_SQ_EDC_CTRL__EDC_EN__SHIFT,                        0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_SW_RST_MASK,                   DIDT_SQ_EDC_CTRL__EDC_SW_RST__SHIFT,                    0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE_MASK,          DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL_MASK,              DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL__SHIFT,               0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT_MASK,  DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT__SHIFT,   0x0004 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS_MASK,   DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS__SHIFT,    0x0006 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA_MASK,     DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA__SHIFT,      0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_EN_MASK,                    DIDT_SQ_EDC_CTRL__GC_EDC_EN__SHIFT,                     0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY_MASK,          DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN__SHIFT,          0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN__SHIFT,          0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg SEEDCCtrlForceStallConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ */
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_EN_MASK,                       DIDT_SQ_EDC_CTRL__EDC_EN__SHIFT,                        0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_SW_RST_MASK,                   DIDT_SQ_EDC_CTRL__EDC_SW_RST__SHIFT,                    0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE_MASK,          DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL_MASK,              DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL__SHIFT,               0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT_MASK,  DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT__SHIFT,   0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS_MASK,   DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS__SHIFT,    0x000C },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA_MASK,     DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA__SHIFT,      0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_EN_MASK,                    DIDT_SQ_EDC_CTRL__GC_EDC_EN__SHIFT,                     0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY_MASK,          DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN__SHIFT,          0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN__SHIFT,          0x0001 },
+
+	/* TD */
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__EDC_EN_MASK,                       DIDT_TD_EDC_CTRL__EDC_EN__SHIFT,                        0x0000 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__EDC_SW_RST_MASK,                   DIDT_TD_EDC_CTRL__EDC_SW_RST__SHIFT,                    0x0000 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__EDC_CLK_EN_OVERRIDE_MASK,          DIDT_TD_EDC_CTRL__EDC_CLK_EN_OVERRIDE__SHIFT,           0x0000 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__EDC_FORCE_STALL_MASK,              DIDT_TD_EDC_CTRL__EDC_FORCE_STALL__SHIFT,               0x0001 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT_MASK,  DIDT_TD_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT__SHIFT,   0x0001 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS_MASK,   DIDT_TD_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS__SHIFT,    0x000E },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA_MASK,     DIDT_TD_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA__SHIFT,      0x0000 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__GC_EDC_EN_MASK,                    DIDT_TD_EDC_CTRL__GC_EDC_EN__SHIFT,                     0x0000 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__GC_EDC_STALL_POLICY_MASK,          DIDT_TD_EDC_CTRL__GC_EDC_STALL_POLICY__SHIFT,           0x0000 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__GC_EDC_LEVEL_COMB_EN_MASK,         DIDT_TD_EDC_CTRL__GC_EDC_LEVEL_COMB_EN__SHIFT,          0x0000 },
+	{   ixDIDT_TD_EDC_CTRL,                DIDT_TD_EDC_CTRL__SE_EDC_LEVEL_COMB_EN_MASK,         DIDT_TD_EDC_CTRL__SE_EDC_LEVEL_COMB_EN__SHIFT,          0x0001 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg    GCDiDtDroopCtrlConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   mmGC_DIDT_DROOP_CTRL,             GC_DIDT_DROOP_CTRL__DIDT_DROOP_LEVEL_EN_MASK,   GC_DIDT_DROOP_CTRL__DIDT_DROOP_LEVEL_EN__SHIFT,  0x0000 },
+	{   mmGC_DIDT_DROOP_CTRL,             GC_DIDT_DROOP_CTRL__DIDT_DROOP_THRESHOLD_MASK,   GC_DIDT_DROOP_CTRL__DIDT_DROOP_THRESHOLD__SHIFT,  0x0000 },
+	{   mmGC_DIDT_DROOP_CTRL,             GC_DIDT_DROOP_CTRL__DIDT_DROOP_LEVEL_INDEX_MASK,   GC_DIDT_DROOP_CTRL__DIDT_DROOP_LEVEL_INDEX__SHIFT,  0x0000 },
+	{   mmGC_DIDT_DROOP_CTRL,             GC_DIDT_DROOP_CTRL__DIDT_LEVEL_SEL_MASK,   GC_DIDT_DROOP_CTRL__DIDT_LEVEL_SEL__SHIFT,  0x0000 },
+	{   mmGC_DIDT_DROOP_CTRL,             GC_DIDT_DROOP_CTRL__DIDT_DROOP_LEVEL_OVERFLOW_MASK,   GC_DIDT_DROOP_CTRL__DIDT_DROOP_LEVEL_OVERFLOW__SHIFT,  0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg    GCDiDtCtrl0Config_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   mmGC_DIDT_CTRL0,                  GC_DIDT_CTRL0__DIDT_CTRL_EN_MASK,   GC_DIDT_CTRL0__DIDT_CTRL_EN__SHIFT,  0x0000 },
+	{   mmGC_DIDT_CTRL0,                  GC_DIDT_CTRL0__PHASE_OFFSET_MASK,   GC_DIDT_CTRL0__PHASE_OFFSET__SHIFT,  0x0000 },
+	{   mmGC_DIDT_CTRL0,                  GC_DIDT_CTRL0__DIDT_SW_RST_MASK,   GC_DIDT_CTRL0__DIDT_SW_RST__SHIFT,  0x0000 },
+	{   mmGC_DIDT_CTRL0,                  GC_DIDT_CTRL0__DIDT_CLK_EN_OVERRIDE_MASK,   GC_DIDT_CTRL0__DIDT_CLK_EN_OVERRIDE__SHIFT,  0x0000 },
+	{   mmGC_DIDT_CTRL0,                  GC_DIDT_CTRL0__DIDT_TRIGGER_THROTTLE_LOWBIT_MASK,   GC_DIDT_CTRL0__DIDT_TRIGGER_THROTTLE_LOWBIT__SHIFT,  0x0000 },
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+
+static const struct vega10_didt_config_reg   PSMSEEDCStallPatternConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ EDC STALL PATTERNs */
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_1_2,  DIDT_SQ_EDC_STALL_PATTERN_1_2__EDC_STALL_PATTERN_1_MASK,   DIDT_SQ_EDC_STALL_PATTERN_1_2__EDC_STALL_PATTERN_1__SHIFT,   0x0101 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_1_2,  DIDT_SQ_EDC_STALL_PATTERN_1_2__EDC_STALL_PATTERN_2_MASK,   DIDT_SQ_EDC_STALL_PATTERN_1_2__EDC_STALL_PATTERN_2__SHIFT,   0x0101 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_3_4,  DIDT_SQ_EDC_STALL_PATTERN_3_4__EDC_STALL_PATTERN_3_MASK,   DIDT_SQ_EDC_STALL_PATTERN_3_4__EDC_STALL_PATTERN_3__SHIFT,   0x1111 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_3_4,  DIDT_SQ_EDC_STALL_PATTERN_3_4__EDC_STALL_PATTERN_4_MASK,   DIDT_SQ_EDC_STALL_PATTERN_3_4__EDC_STALL_PATTERN_4__SHIFT,   0x1111 },
+
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_5_6,  DIDT_SQ_EDC_STALL_PATTERN_5_6__EDC_STALL_PATTERN_5_MASK,   DIDT_SQ_EDC_STALL_PATTERN_5_6__EDC_STALL_PATTERN_5__SHIFT,   0x1515 },
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_5_6,  DIDT_SQ_EDC_STALL_PATTERN_5_6__EDC_STALL_PATTERN_6_MASK,   DIDT_SQ_EDC_STALL_PATTERN_5_6__EDC_STALL_PATTERN_6__SHIFT,   0x1515 },
+
+	{   ixDIDT_SQ_EDC_STALL_PATTERN_7,  DIDT_SQ_EDC_STALL_PATTERN_7__EDC_STALL_PATTERN_7_MASK,   DIDT_SQ_EDC_STALL_PATTERN_7__EDC_STALL_PATTERN_7__SHIFT,     0x5555 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg   PSMSEEDCStallDelayConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ EDC STALL DELAYs */
+	{   ixDIDT_SQ_EDC_STALL_DELAY_1,      DIDT_SQ_EDC_STALL_DELAY_1__EDC_STALL_DELAY_SQ0_MASK,  DIDT_SQ_EDC_STALL_DELAY_1__EDC_STALL_DELAY_SQ0__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_1,      DIDT_SQ_EDC_STALL_DELAY_1__EDC_STALL_DELAY_SQ1_MASK,  DIDT_SQ_EDC_STALL_DELAY_1__EDC_STALL_DELAY_SQ1__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_1,      DIDT_SQ_EDC_STALL_DELAY_1__EDC_STALL_DELAY_SQ2_MASK,  DIDT_SQ_EDC_STALL_DELAY_1__EDC_STALL_DELAY_SQ2__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_1,      DIDT_SQ_EDC_STALL_DELAY_1__EDC_STALL_DELAY_SQ3_MASK,  DIDT_SQ_EDC_STALL_DELAY_1__EDC_STALL_DELAY_SQ3__SHIFT,  0x0000 },
+
+	{   ixDIDT_SQ_EDC_STALL_DELAY_2,      DIDT_SQ_EDC_STALL_DELAY_2__EDC_STALL_DELAY_SQ4_MASK,  DIDT_SQ_EDC_STALL_DELAY_2__EDC_STALL_DELAY_SQ4__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_2,      DIDT_SQ_EDC_STALL_DELAY_2__EDC_STALL_DELAY_SQ5_MASK,  DIDT_SQ_EDC_STALL_DELAY_2__EDC_STALL_DELAY_SQ5__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_2,      DIDT_SQ_EDC_STALL_DELAY_2__EDC_STALL_DELAY_SQ6_MASK,  DIDT_SQ_EDC_STALL_DELAY_2__EDC_STALL_DELAY_SQ6__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_2,      DIDT_SQ_EDC_STALL_DELAY_2__EDC_STALL_DELAY_SQ7_MASK,  DIDT_SQ_EDC_STALL_DELAY_2__EDC_STALL_DELAY_SQ7__SHIFT,  0x0000 },
+
+	{   ixDIDT_SQ_EDC_STALL_DELAY_3,      DIDT_SQ_EDC_STALL_DELAY_3__EDC_STALL_DELAY_SQ8_MASK,  DIDT_SQ_EDC_STALL_DELAY_3__EDC_STALL_DELAY_SQ8__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_3,      DIDT_SQ_EDC_STALL_DELAY_3__EDC_STALL_DELAY_SQ9_MASK,  DIDT_SQ_EDC_STALL_DELAY_3__EDC_STALL_DELAY_SQ9__SHIFT,  0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_3,      DIDT_SQ_EDC_STALL_DELAY_3__EDC_STALL_DELAY_SQ10_MASK, DIDT_SQ_EDC_STALL_DELAY_3__EDC_STALL_DELAY_SQ10__SHIFT, 0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_3,      DIDT_SQ_EDC_STALL_DELAY_3__EDC_STALL_DELAY_SQ11_MASK, DIDT_SQ_EDC_STALL_DELAY_3__EDC_STALL_DELAY_SQ11__SHIFT, 0x0000 },
+
+	{   ixDIDT_SQ_EDC_STALL_DELAY_4,      DIDT_SQ_EDC_STALL_DELAY_4__EDC_STALL_DELAY_SQ12_MASK, DIDT_SQ_EDC_STALL_DELAY_4__EDC_STALL_DELAY_SQ12__SHIFT, 0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_4,      DIDT_SQ_EDC_STALL_DELAY_4__EDC_STALL_DELAY_SQ13_MASK, DIDT_SQ_EDC_STALL_DELAY_4__EDC_STALL_DELAY_SQ13__SHIFT, 0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_4,      DIDT_SQ_EDC_STALL_DELAY_4__EDC_STALL_DELAY_SQ14_MASK, DIDT_SQ_EDC_STALL_DELAY_4__EDC_STALL_DELAY_SQ14__SHIFT, 0x0000 },
+	{   ixDIDT_SQ_EDC_STALL_DELAY_4,      DIDT_SQ_EDC_STALL_DELAY_4__EDC_STALL_DELAY_SQ15_MASK, DIDT_SQ_EDC_STALL_DELAY_4__EDC_STALL_DELAY_SQ15__SHIFT, 0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg   PSMSEEDCThresholdConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ EDC THRESHOLD */
+	{   ixDIDT_SQ_EDC_THRESHOLD,           DIDT_SQ_EDC_THRESHOLD__EDC_THRESHOLD_MASK,           DIDT_SQ_EDC_THRESHOLD__EDC_THRESHOLD__SHIFT,            0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg   PSMSEEDCCtrlResetConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ EDC CTRL */
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_EN_MASK,                       DIDT_SQ_EDC_CTRL__EDC_EN__SHIFT,                        0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_SW_RST_MASK,                   DIDT_SQ_EDC_CTRL__EDC_SW_RST__SHIFT,                    0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE_MASK,          DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL_MASK,              DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL__SHIFT,               0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT_MASK,  DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT__SHIFT,   0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS_MASK,   DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS__SHIFT,    0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA_MASK,     DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA__SHIFT,      0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_EN_MASK,                    DIDT_SQ_EDC_CTRL__GC_EDC_EN__SHIFT,                     0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY_MASK,          DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN__SHIFT,          0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN__SHIFT,          0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg   PSMSEEDCCtrlConfig_Vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	/* SQ EDC CTRL */
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_EN_MASK,                       DIDT_SQ_EDC_CTRL__EDC_EN__SHIFT,                        0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_SW_RST_MASK,                   DIDT_SQ_EDC_CTRL__EDC_SW_RST__SHIFT,                    0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE_MASK,          DIDT_SQ_EDC_CTRL__EDC_CLK_EN_OVERRIDE__SHIFT,           0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL_MASK,              DIDT_SQ_EDC_CTRL__EDC_FORCE_STALL__SHIFT,               0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT_MASK,  DIDT_SQ_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT__SHIFT,   0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS_MASK,   DIDT_SQ_EDC_CTRL__EDC_STALL_PATTERN_BIT_NUMS__SHIFT,    0x000E },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA_MASK,     DIDT_SQ_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA__SHIFT,      0x0000 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_EN_MASK,                    DIDT_SQ_EDC_CTRL__GC_EDC_EN__SHIFT,                     0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY_MASK,          DIDT_SQ_EDC_CTRL__GC_EDC_STALL_POLICY__SHIFT,           0x0003 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__GC_EDC_LEVEL_COMB_EN__SHIFT,          0x0001 },
+	{   ixDIDT_SQ_EDC_CTRL,                DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN_MASK,         DIDT_SQ_EDC_CTRL__SE_EDC_LEVEL_COMB_EN__SHIFT,          0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg   PSMGCEDCThresholdConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   mmGC_EDC_THRESHOLD,                GC_EDC_THRESHOLD__EDC_THRESHOLD_MASK,                GC_EDC_THRESHOLD__EDC_THRESHOLD__SHIFT,                 0x0000000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg   PSMGCEDCDroopCtrlConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   mmGC_EDC_DROOP_CTRL,               GC_EDC_DROOP_CTRL__EDC_DROOP_LEVEL_EN_MASK,          GC_EDC_DROOP_CTRL__EDC_DROOP_LEVEL_EN__SHIFT,           0x0001 },
+	{   mmGC_EDC_DROOP_CTRL,               GC_EDC_DROOP_CTRL__EDC_DROOP_THRESHOLD_MASK,         GC_EDC_DROOP_CTRL__EDC_DROOP_THRESHOLD__SHIFT,          0x0384 },
+	{   mmGC_EDC_DROOP_CTRL,               GC_EDC_DROOP_CTRL__EDC_DROOP_LEVEL_INDEX_MASK,       GC_EDC_DROOP_CTRL__EDC_DROOP_LEVEL_INDEX__SHIFT,        0x0001 },
+	{   mmGC_EDC_DROOP_CTRL,               GC_EDC_DROOP_CTRL__AVG_PSM_SEL_MASK,                 GC_EDC_DROOP_CTRL__AVG_PSM_SEL__SHIFT,                  0x0001 },
+	{   mmGC_EDC_DROOP_CTRL,               GC_EDC_DROOP_CTRL__EDC_LEVEL_SEL_MASK,               GC_EDC_DROOP_CTRL__EDC_LEVEL_SEL__SHIFT,                0x0001 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg   PSMGCEDCCtrlResetConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_EN_MASK,                            GC_EDC_CTRL__EDC_EN__SHIFT,                             0x0000 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_SW_RST_MASK,                        GC_EDC_CTRL__EDC_SW_RST__SHIFT,                         0x0001 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_CLK_EN_OVERRIDE_MASK,               GC_EDC_CTRL__EDC_CLK_EN_OVERRIDE__SHIFT,                0x0000 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_FORCE_STALL_MASK,                   GC_EDC_CTRL__EDC_FORCE_STALL__SHIFT,                    0x0000 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT_MASK,       GC_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT__SHIFT,        0x0000 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA_MASK,          GC_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA__SHIFT,           0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg   PSMGCEDCCtrlConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_EN_MASK,                            GC_EDC_CTRL__EDC_EN__SHIFT,                             0x0001 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_SW_RST_MASK,                        GC_EDC_CTRL__EDC_SW_RST__SHIFT,                         0x0000 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_CLK_EN_OVERRIDE_MASK,               GC_EDC_CTRL__EDC_CLK_EN_OVERRIDE__SHIFT,                0x0000 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_FORCE_STALL_MASK,                   GC_EDC_CTRL__EDC_FORCE_STALL__SHIFT,                    0x0000 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT_MASK,       GC_EDC_CTRL__EDC_TRIGGER_THROTTLE_LOWBIT__SHIFT,        0x0000 },
+	{   mmGC_EDC_CTRL,                     GC_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA_MASK,          GC_EDC_CTRL__EDC_ALLOW_WRITE_PWRDELTA__SHIFT,           0x0000 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg    AvfsPSMResetConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   0x16A02,                         0xFFFFFFFF,                                            0x0,                                                    0x0000005F },
+	{   0x16A05,                         0xFFFFFFFF,                                            0x0,                                                    0x00000001 },
+	{   0x16A06,                         0x00000001,                                            0x0,                                                    0x02000000 },
+	{   0x16A01,                         0xFFFFFFFF,                                            0x0,                                                    0x00003027 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static const struct vega10_didt_config_reg    AvfsPSMInitConfig_vega10[] =
+{
+/* ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ *      Offset                             Mask                                                 Shift                                                  Value
+ * ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ */
+	{   0x16A05,                         0xFFFFFFFF,                                            0x18,                                                    0x00000001 },
+	{   0x16A05,                         0xFFFFFFFF,                                            0x8,                                                     0x00000003 },
+	{   0x16A05,                         0xFFFFFFFF,                                            0xa,                                                     0x00000006 },
+	{   0x16A05,                         0xFFFFFFFF,                                            0x7,                                                     0x00000000 },
+	{   0x16A06,                         0xFFFFFFFF,                                            0x18,                                                    0x00000001 },
+	{   0x16A06,                         0xFFFFFFFF,                                            0x19,                                                    0x00000001 },
+	{   0x16A01,                         0xFFFFFFFF,                                            0x0,                                                     0x00003027 },
+
+	{   0xFFFFFFFF  }  /* End of list */
+};
+
+static int vega10_program_didt_config_registers(struct pp_hwmgr *hwmgr, const struct vega10_didt_config_reg *config_regs, enum vega10_didt_config_reg_type reg_type)
+{
+	uint32_t data;
+
+	PP_ASSERT_WITH_CODE((config_regs != NULL), "[vega10_program_didt_config_registers] Invalid config register table!", return -EINVAL);
+
+	while (config_regs->offset != 0xFFFFFFFF) {
+		switch (reg_type) {
+		case VEGA10_CONFIGREG_DIDT:
+			data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, config_regs->offset);
+			data &= ~config_regs->mask;
+			data |= ((config_regs->value << config_regs->shift) & config_regs->mask);
+			cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, config_regs->offset, data);
+			break;
+		case VEGA10_CONFIGREG_GCCAC:
+			data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG_GC_CAC, config_regs->offset);
+			data &= ~config_regs->mask;
+			data |= ((config_regs->value << config_regs->shift) & config_regs->mask);
+			cgs_write_ind_register(hwmgr->device, CGS_IND_REG_GC_CAC, config_regs->offset, data);
+			break;
+		case VEGA10_CONFIGREG_SECAC:
+			data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG_SE_CAC, config_regs->offset);
+			data &= ~config_regs->mask;
+			data |= ((config_regs->value << config_regs->shift) & config_regs->mask);
+			cgs_write_ind_register(hwmgr->device, CGS_IND_REG_SE_CAC, config_regs->offset, data);
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		config_regs++;
+	}
+
+	return 0;
+}
+
+static int vega10_program_gc_didt_config_registers(struct pp_hwmgr *hwmgr, const struct vega10_didt_config_reg *config_regs)
+{
+	uint32_t data;
+
+	while (config_regs->offset != 0xFFFFFFFF) {
+		data = cgs_read_register(hwmgr->device, config_regs->offset);
+		data &= ~config_regs->mask;
+		data |= ((config_regs->value << config_regs->shift) & config_regs->mask);
+		cgs_write_register(hwmgr->device, config_regs->offset, data);
+		config_regs++;
+	}
+
+	return 0;
+}
+
+static void vega10_didt_set_mask(struct pp_hwmgr *hwmgr, const bool enable)
+{
+	uint32_t data;
+	int result;
+	uint32_t en = (enable ? 1 : 0);
+	uint32_t didt_block_info = SQ_IR_MASK | TCP_IR_MASK | TD_PCC_MASK;
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_SQRamping)) {
+		data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_SQ_CTRL0);
+		data &= ~DIDT_SQ_CTRL0__DIDT_CTRL_EN_MASK;
+		data |= ((en << DIDT_SQ_CTRL0__DIDT_CTRL_EN__SHIFT) & DIDT_SQ_CTRL0__DIDT_CTRL_EN_MASK);
+		cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_SQ_CTRL0, data);
+		didt_block_info &= ~SQ_Enable_MASK;
+		didt_block_info |= en << SQ_Enable_SHIFT;
+	}
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DBRamping)) {
+		data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_DB_CTRL0);
+		data &= ~DIDT_DB_CTRL0__DIDT_CTRL_EN_MASK;
+		data |= ((en << DIDT_DB_CTRL0__DIDT_CTRL_EN__SHIFT) & DIDT_DB_CTRL0__DIDT_CTRL_EN_MASK);
+		cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_DB_CTRL0, data);
+		didt_block_info &= ~DB_Enable_MASK;
+		didt_block_info |= en << DB_Enable_SHIFT;
+	}
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_TDRamping)) {
+		data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_TD_CTRL0);
+		data &= ~DIDT_TD_CTRL0__DIDT_CTRL_EN_MASK;
+		data |= ((en << DIDT_TD_CTRL0__DIDT_CTRL_EN__SHIFT) & DIDT_TD_CTRL0__DIDT_CTRL_EN_MASK);
+		cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_TD_CTRL0, data);
+		didt_block_info &= ~TD_Enable_MASK;
+		didt_block_info |= en << TD_Enable_SHIFT;
+	}
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_TCPRamping)) {
+		data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_TCP_CTRL0);
+		data &= ~DIDT_TCP_CTRL0__DIDT_CTRL_EN_MASK;
+		data |= ((en << DIDT_TCP_CTRL0__DIDT_CTRL_EN__SHIFT) & DIDT_TCP_CTRL0__DIDT_CTRL_EN_MASK);
+		cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_TCP_CTRL0, data);
+		didt_block_info &= ~TCP_Enable_MASK;
+		didt_block_info |= en << TCP_Enable_SHIFT;
+	}
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DBRRamping)) {
+		data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_DBR_CTRL0);
+		data &= ~DIDT_DBR_CTRL0__DIDT_CTRL_EN_MASK;
+		data |= ((en << DIDT_DBR_CTRL0__DIDT_CTRL_EN__SHIFT) & DIDT_DBR_CTRL0__DIDT_CTRL_EN_MASK);
+		cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_DBR_CTRL0, data);
+	}
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DiDtEDCEnable)) {
+		if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_SQRamping)) {
+			data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_SQ_EDC_CTRL);
+			data &= ~DIDT_SQ_EDC_CTRL__EDC_EN_MASK;
+			data |= ((en << DIDT_SQ_EDC_CTRL__EDC_EN__SHIFT) & DIDT_SQ_EDC_CTRL__EDC_EN_MASK);
+			data &= ~DIDT_SQ_EDC_CTRL__EDC_SW_RST_MASK;
+			data |= ((~en << DIDT_SQ_EDC_CTRL__EDC_SW_RST__SHIFT) & DIDT_SQ_EDC_CTRL__EDC_SW_RST_MASK);
+			cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_SQ_EDC_CTRL, data);
+		}
+
+		if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DBRamping)) {
+			data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_DB_EDC_CTRL);
+			data &= ~DIDT_DB_EDC_CTRL__EDC_EN_MASK;
+			data |= ((en << DIDT_DB_EDC_CTRL__EDC_EN__SHIFT) & DIDT_DB_EDC_CTRL__EDC_EN_MASK);
+			data &= ~DIDT_DB_EDC_CTRL__EDC_SW_RST_MASK;
+			data |= ((~en << DIDT_DB_EDC_CTRL__EDC_SW_RST__SHIFT) & DIDT_DB_EDC_CTRL__EDC_SW_RST_MASK);
+			cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_DB_EDC_CTRL, data);
+		}
+
+		if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_TDRamping)) {
+			data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_TD_EDC_CTRL);
+			data &= ~DIDT_TD_EDC_CTRL__EDC_EN_MASK;
+			data |= ((en << DIDT_TD_EDC_CTRL__EDC_EN__SHIFT) & DIDT_TD_EDC_CTRL__EDC_EN_MASK);
+			data &= ~DIDT_TD_EDC_CTRL__EDC_SW_RST_MASK;
+			data |= ((~en << DIDT_TD_EDC_CTRL__EDC_SW_RST__SHIFT) & DIDT_TD_EDC_CTRL__EDC_SW_RST_MASK);
+			cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_TD_EDC_CTRL, data);
+		}
+
+		if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_TCPRamping)) {
+			data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_TCP_EDC_CTRL);
+			data &= ~DIDT_TCP_EDC_CTRL__EDC_EN_MASK;
+			data |= ((en << DIDT_TCP_EDC_CTRL__EDC_EN__SHIFT) & DIDT_TCP_EDC_CTRL__EDC_EN_MASK);
+			data &= ~DIDT_TCP_EDC_CTRL__EDC_SW_RST_MASK;
+			data |= ((~en << DIDT_TCP_EDC_CTRL__EDC_SW_RST__SHIFT) & DIDT_TCP_EDC_CTRL__EDC_SW_RST_MASK);
+			cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_TCP_EDC_CTRL, data);
+		}
+
+		if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_DBRRamping)) {
+			data = cgs_read_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_DBR_EDC_CTRL);
+			data &= ~DIDT_DBR_EDC_CTRL__EDC_EN_MASK;
+			data |= ((en << DIDT_DBR_EDC_CTRL__EDC_EN__SHIFT) & DIDT_DBR_EDC_CTRL__EDC_EN_MASK);
+			data &= ~DIDT_DBR_EDC_CTRL__EDC_SW_RST_MASK;
+			data |= ((~en << DIDT_DBR_EDC_CTRL__EDC_SW_RST__SHIFT) & DIDT_DBR_EDC_CTRL__EDC_SW_RST_MASK);
+			cgs_write_ind_register(hwmgr->device, CGS_IND_REG__DIDT, ixDIDT_DBR_EDC_CTRL, data);
+		}
+	}
+
+	if (enable) {
+		/* For Vega10, SMC does not support any mask yet. */
+		result = smum_send_msg_to_smc_with_parameter(hwmgr->smumgr, PPSMC_MSG_ConfigureGfxDidt, didt_block_info);
+		PP_ASSERT((0 == result), "[EnableDiDtConfig] SMC Configure Gfx Didt Failed!");
+	}
+}
+
+static int vega10_enable_cac_driving_se_didt_config(struct pp_hwmgr *hwmgr)
+{
+	int result;
+	uint32_t num_se = 0, count, data;
+	struct cgs_system_info sys_info = {0};
+	uint32_t reg;
+
+	sys_info.size = sizeof(struct cgs_system_info);
+	sys_info.info_id = CGS_SYSTEM_INFO_GFX_SE_INFO;
+	if (cgs_query_system_info(hwmgr->device, &sys_info) == 0)
+		num_se = sys_info.value;
+
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	cgs_lock_grbm_idx(hwmgr->device, true);
+	reg = soc15_get_register_offset(GC_HWID, 0, mmGRBM_GFX_INDEX_BASE_IDX, mmGRBM_GFX_INDEX);
+	for (count = 0; count < num_se; count++) {
+		data = GRBM_GFX_INDEX__INSTANCE_BROADCAST_WRITES_MASK | GRBM_GFX_INDEX__SH_BROADCAST_WRITES_MASK | (count << GRBM_GFX_INDEX__SE_INDEX__SHIFT);
+		cgs_write_register(hwmgr->device, reg, data);
+
+		result =  vega10_program_didt_config_registers(hwmgr, SEDiDtStallCtrlConfig_vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtStallPatternConfig_vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtWeightConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtCtrl1Config_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtCtrl2Config_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtCtrl3Config_vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtTuningCtrlConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SELCacConfig_Vega10, VEGA10_CONFIGREG_SECAC);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtCtrl0Config_Vega10, VEGA10_CONFIGREG_DIDT);
+
+		if (0 != result)
+			break;
+	}
+	cgs_write_register(hwmgr->device, reg, 0xE0000000);
+	cgs_lock_grbm_idx(hwmgr->device, false);
+
+	vega10_didt_set_mask(hwmgr, true);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	return 0;
+}
+
+static int vega10_disable_cac_driving_se_didt_config(struct pp_hwmgr *hwmgr)
+{
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	vega10_didt_set_mask(hwmgr, false);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	return 0;
+}
+
+static int vega10_enable_psm_gc_didt_config(struct pp_hwmgr *hwmgr)
+{
+	int result;
+	uint32_t num_se = 0, count, data;
+	struct cgs_system_info sys_info = {0};
+	uint32_t reg;
+
+	sys_info.size = sizeof(struct cgs_system_info);
+	sys_info.info_id = CGS_SYSTEM_INFO_GFX_SE_INFO;
+	if (cgs_query_system_info(hwmgr->device, &sys_info) == 0)
+		num_se = sys_info.value;
+
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	cgs_lock_grbm_idx(hwmgr->device, true);
+	reg = soc15_get_register_offset(GC_HWID, 0, mmGRBM_GFX_INDEX_BASE_IDX, mmGRBM_GFX_INDEX);
+	for (count = 0; count < num_se; count++) {
+		data = GRBM_GFX_INDEX__INSTANCE_BROADCAST_WRITES_MASK | GRBM_GFX_INDEX__SH_BROADCAST_WRITES_MASK | (count << GRBM_GFX_INDEX__SE_INDEX__SHIFT);
+		cgs_write_register(hwmgr->device, reg, data);
+
+		result = vega10_program_didt_config_registers(hwmgr, SEDiDtStallCtrlConfig_vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtStallPatternConfig_vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtCtrl3Config_vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEDiDtCtrl0Config_Vega10, VEGA10_CONFIGREG_DIDT);
+		if (0 != result)
+			break;
+	}
+	cgs_write_register(hwmgr->device, reg, 0xE0000000);
+	cgs_lock_grbm_idx(hwmgr->device, false);
+
+	vega10_didt_set_mask(hwmgr, true);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	vega10_program_gc_didt_config_registers(hwmgr, GCDiDtDroopCtrlConfig_vega10);
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_GCEDC))
+		vega10_program_gc_didt_config_registers(hwmgr, GCDiDtCtrl0Config_vega10);
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_PSM))
+		vega10_program_gc_didt_config_registers(hwmgr, AvfsPSMInitConfig_vega10);
+
+	return 0;
+}
+
+static int vega10_disable_psm_gc_didt_config(struct pp_hwmgr *hwmgr)
+{
+	uint32_t data;
+
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	vega10_didt_set_mask(hwmgr, false);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_GCEDC)) {
+		data = 0x00000000;
+		cgs_write_register(hwmgr->device, mmGC_DIDT_CTRL0, data);
+	}
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_PSM))
+		vega10_program_gc_didt_config_registers(hwmgr, AvfsPSMResetConfig_vega10);
+
+	return 0;
+}
+
+static int vega10_enable_se_edc_config(struct pp_hwmgr *hwmgr)
+{
+	int result;
+	uint32_t num_se = 0, count, data;
+	struct cgs_system_info sys_info = {0};
+	uint32_t reg;
+
+	sys_info.size = sizeof(struct cgs_system_info);
+	sys_info.info_id = CGS_SYSTEM_INFO_GFX_SE_INFO;
+	if (cgs_query_system_info(hwmgr->device, &sys_info) == 0)
+		num_se = sys_info.value;
+
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	cgs_lock_grbm_idx(hwmgr->device, true);
+	reg = soc15_get_register_offset(GC_HWID, 0, mmGRBM_GFX_INDEX_BASE_IDX, mmGRBM_GFX_INDEX);
+	for (count = 0; count < num_se; count++) {
+		data = GRBM_GFX_INDEX__INSTANCE_BROADCAST_WRITES_MASK | GRBM_GFX_INDEX__SH_BROADCAST_WRITES_MASK | (count << GRBM_GFX_INDEX__SE_INDEX__SHIFT);
+		cgs_write_register(hwmgr->device, reg, data);
+		result = vega10_program_didt_config_registers(hwmgr, SEDiDtWeightConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEEDCStallPatternConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEEDCStallDelayConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEEDCThresholdConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEEDCCtrlResetConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, SEEDCCtrlConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+
+		if (0 != result)
+			break;
+	}
+	cgs_write_register(hwmgr->device, reg, 0xE0000000);
+	cgs_lock_grbm_idx(hwmgr->device, false);
+
+	vega10_didt_set_mask(hwmgr, true);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	return 0;
+}
+
+static int vega10_disable_se_edc_config(struct pp_hwmgr *hwmgr)
+{
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	vega10_didt_set_mask(hwmgr, false);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	return 0;
+}
+
+static int vega10_enable_psm_gc_edc_config(struct pp_hwmgr *hwmgr)
+{
+	int result = 0;
+	uint32_t num_se = 0;
+	uint32_t count, data;
+	struct cgs_system_info sys_info = {0};
+	uint32_t reg;
+
+	sys_info.size = sizeof(struct cgs_system_info);
+	sys_info.info_id = CGS_SYSTEM_INFO_GFX_SE_INFO;
+	if (cgs_query_system_info(hwmgr->device, &sys_info) == 0)
+		num_se = sys_info.value;
+
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	vega10_program_gc_didt_config_registers(hwmgr, AvfsPSMResetConfig_vega10);
+
+	cgs_lock_grbm_idx(hwmgr->device, true);
+	reg = soc15_get_register_offset(GC_HWID, 0, mmGRBM_GFX_INDEX_BASE_IDX, mmGRBM_GFX_INDEX);
+	for (count = 0; count < num_se; count++) {
+		data = GRBM_GFX_INDEX__INSTANCE_BROADCAST_WRITES_MASK | GRBM_GFX_INDEX__SH_BROADCAST_WRITES_MASK | ( count << GRBM_GFX_INDEX__SE_INDEX__SHIFT);
+		cgs_write_register(hwmgr->device, reg, data);
+		result |= vega10_program_didt_config_registers(hwmgr, PSMSEEDCStallPatternConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, PSMSEEDCStallDelayConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, PSMSEEDCCtrlResetConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+		result |= vega10_program_didt_config_registers(hwmgr, PSMSEEDCCtrlConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+
+		if (0 != result)
+			break;
+	}
+	cgs_write_register(hwmgr->device, reg, 0xE0000000);
+	cgs_lock_grbm_idx(hwmgr->device, false);
+
+	vega10_didt_set_mask(hwmgr, true);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	vega10_program_gc_didt_config_registers(hwmgr, PSMGCEDCDroopCtrlConfig_vega10);
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_GCEDC)) {
+		vega10_program_gc_didt_config_registers(hwmgr, PSMGCEDCCtrlResetConfig_vega10);
+		vega10_program_gc_didt_config_registers(hwmgr, PSMGCEDCCtrlConfig_vega10);
+	}
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_PSM))
+		vega10_program_gc_didt_config_registers(hwmgr,  AvfsPSMInitConfig_vega10);
+
+	return 0;
+}
+
+static int vega10_disable_psm_gc_edc_config(struct pp_hwmgr *hwmgr)
+{
+	uint32_t data;
+
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	vega10_didt_set_mask(hwmgr, false);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_GCEDC)) {
+		data = 0x00000000;
+		cgs_write_register(hwmgr->device, mmGC_EDC_CTRL, data);
+	}
+
+	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps, PHM_PlatformCaps_PSM))
+		vega10_program_gc_didt_config_registers(hwmgr,  AvfsPSMResetConfig_vega10);
+
+	return 0;
+}
+
+static int vega10_enable_se_edc_force_stall_config(struct pp_hwmgr *hwmgr)
+{
+	uint32_t reg;
+	int result;
+
+	cgs_enter_safe_mode(hwmgr->device, true);
+
+	cgs_lock_grbm_idx(hwmgr->device, true);
+	reg = soc15_get_register_offset(GC_HWID, 0, mmGRBM_GFX_INDEX_BASE_IDX, mmGRBM_GFX_INDEX);
+	cgs_write_register(hwmgr->device, reg, 0xE0000000);
+	cgs_lock_grbm_idx(hwmgr->device, false);
+
+	result = vega10_program_didt_config_registers(hwmgr, SEEDCForceStallPatternConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+	result |= vega10_program_didt_config_registers(hwmgr, SEEDCCtrlForceStallConfig_Vega10, VEGA10_CONFIGREG_DIDT);
+	if (0 != result)
+		return result;
+
+	vega10_didt_set_mask(hwmgr, false);
+
+	cgs_enter_safe_mode(hwmgr->device, false);
+
+	return 0;
+}
+
+static int vega10_disable_se_edc_force_stall_config(struct pp_hwmgr *hwmgr)
+{
+	int result;
+
+	result = vega10_disable_se_edc_config(hwmgr);
+	PP_ASSERT_WITH_CODE((0 == result), "[DisableDiDtConfig] Pre DIDT disable clock gating failed!", return result);
+
+	return 0;
+}
+
+int vega10_enable_didt_config(struct pp_hwmgr *hwmgr)
+{
+	int result = 0;
+	struct vega10_hwmgr *data = (struct vega10_hwmgr *)(hwmgr->backend);
+
+	if (data->smu_features[GNLD_DIDT].supported) {
+		if (data->smu_features[GNLD_DIDT].enabled)
+			PP_DBG_LOG("[EnableDiDtConfig] Feature DiDt Already enabled!\n");
+
+		switch (data->registry_data.didt_mode) {
+		case 0:
+			result = vega10_enable_cac_driving_se_didt_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[EnableDiDt] Attempt to enable DiDt Mode 0 Failed!", return result);
+			break;
+		case 2:
+			result = vega10_enable_psm_gc_didt_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[EnableDiDt] Attempt to enable DiDt Mode 2 Failed!", return result);
+			break;
+		case 3:
+			result = vega10_enable_se_edc_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[EnableDiDt] Attempt to enable DiDt Mode 3 Failed!", return result);
+			break;
+		case 1:
+		case 4:
+		case 5:
+			result = vega10_enable_psm_gc_edc_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[EnableDiDt] Attempt to enable DiDt Mode 1/4/5 Failed!", return result);
+			break;
+		case 6:
+			result = vega10_enable_se_edc_force_stall_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[EnableDiDt] Attempt to enable DiDt Mode 6 Failed!", return result);
+			break;
+		default:
+			result = -EINVAL;
+			break;
+		}
+
+		if (0 == result) {
+			PP_ASSERT_WITH_CODE((!vega10_enable_smc_features(hwmgr->smumgr, true, data->smu_features[GNLD_DIDT].smu_feature_bitmap)),
+				"[EnableDiDtConfig] Attempt to Enable DiDt feature Failed!", return result);
+			data->smu_features[GNLD_DIDT].enabled = true;
+		}
+	}
+
+	return result;
+}
+
+int vega10_disable_didt_config(struct pp_hwmgr *hwmgr)
+{
+	int result = 0;
+	struct vega10_hwmgr *data = (struct vega10_hwmgr *)(hwmgr->backend);
+
+	if (data->smu_features[GNLD_DIDT].supported) {
+		if (!data->smu_features[GNLD_DIDT].enabled)
+			PP_DBG_LOG("[DisableDiDtConfig] Feature DiDt Already Disabled!\n");
+
+		switch (data->registry_data.didt_mode) {
+		case 0:
+			result = vega10_disable_cac_driving_se_didt_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[DisableDiDt] Attempt to disable DiDt Mode 0 Failed!", return result);
+			break;
+		case 2:
+			result = vega10_disable_psm_gc_didt_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[DisableDiDt] Attempt to disable DiDt Mode 2 Failed!", return result);
+			break;
+		case 3:
+			result = vega10_disable_se_edc_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[DisableDiDt] Attempt to disable DiDt Mode 3 Failed!", return result);
+			break;
+		case 1:
+		case 4:
+		case 5:
+			result = vega10_disable_psm_gc_edc_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[DisableDiDt] Attempt to disable DiDt Mode 1/4/5 Failed!", return result);
+			break;
+		case 6:
+			result = vega10_disable_se_edc_force_stall_config(hwmgr);
+			PP_ASSERT_WITH_CODE((0 == result), "[DisableDiDt] Attempt to disable DiDt Mode 6 Failed!", return result);
+			break;
+		default:
+			result = -EINVAL;
+			break;
+		}
+
+		if (0 == result) {
+			PP_ASSERT_WITH_CODE((0 == vega10_enable_smc_features(hwmgr->smumgr, false, data->smu_features[GNLD_DIDT].smu_feature_bitmap)),
+					"[DisableDiDtConfig] Attempt to Disable DiDt feature Failed!", return result);
+			data->smu_features[GNLD_DIDT].enabled = false;
+		}
+	}
+
+	return result;
+}
 
 void vega10_initialize_power_tune_defaults(struct pp_hwmgr *hwmgr)
 {
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_powertune.h b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_powertune.h
index 9ecaa27..b95771a 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_powertune.h
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_powertune.h
@@ -31,6 +31,12 @@ enum vega10_pt_config_reg_type {
 	VEGA10_CONFIGREG_MAX
 };
 
+enum vega10_didt_config_reg_type {
+	VEGA10_CONFIGREG_DIDT = 0,
+	VEGA10_CONFIGREG_GCCAC,
+	VEGA10_CONFIGREG_SECAC
+};
+
 /* PowerContainment Features */
 #define POWERCONTAINMENT_FEATURE_DTE             0x00000001
 #define POWERCONTAINMENT_FEATURE_TDCLimit        0x00000002
@@ -44,6 +50,13 @@ struct vega10_pt_config_reg {
 	enum vega10_pt_config_reg_type       type;
 };
 
+struct vega10_didt_config_reg {
+	uint32_t		offset;
+	uint32_t		mask;
+	uint32_t		shift;
+	uint32_t		value;
+};
+
 struct vega10_pt_defaults {
     uint8_t   SviLoadLineEn;
     uint8_t   SviLoadLineVddC;
@@ -62,5 +75,8 @@ int vega10_set_power_limit(struct pp_hwmgr *hwmgr, uint32_t n);
 int vega10_power_control_set_level(struct pp_hwmgr *hwmgr);
 int vega10_disable_power_containment(struct pp_hwmgr *hwmgr);
 
+int vega10_enable_didt_config(struct pp_hwmgr *hwmgr);
+int vega10_disable_didt_config(struct pp_hwmgr *hwmgr);
+
 #endif  /* _VEGA10_POWERTUNE_H_ */
 
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c
index 1623644..e343df1 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_processpptables.c
@@ -31,6 +31,8 @@
 #include "cgs_common.h"
 #include "vega10_pptable.h"
 
+#define NUM_DSPCLK_LEVELS 8
+
 static void set_hw_cap(struct pp_hwmgr *hwmgr, bool enable,
 		enum phm_platform_caps cap)
 {
@@ -644,11 +646,11 @@ static int get_gfxclk_voltage_dependency_table(
 	return 0;
 }
 
-static int get_dcefclk_voltage_dependency_table(
+static int get_pix_clk_voltage_dependency_table(
 		struct pp_hwmgr *hwmgr,
 		struct phm_ppt_v1_clock_voltage_dependency_table
 			**pp_vega10_clk_dep_table,
-		const ATOM_Vega10_DCEFCLK_Dependency_Table *clk_dep_table)
+		const  ATOM_Vega10_PIXCLK_Dependency_Table *clk_dep_table)
 {
 	uint32_t table_size, i;
 	struct phm_ppt_v1_clock_voltage_dependency_table
@@ -681,6 +683,76 @@ static int get_dcefclk_voltage_dependency_table(
 	return 0;
 }
 
+static int get_dcefclk_voltage_dependency_table(
+		struct pp_hwmgr *hwmgr,
+		struct phm_ppt_v1_clock_voltage_dependency_table
+			**pp_vega10_clk_dep_table,
+		const ATOM_Vega10_DCEFCLK_Dependency_Table *clk_dep_table)
+{
+	uint32_t table_size, i;
+	uint8_t num_entries;
+	struct phm_ppt_v1_clock_voltage_dependency_table
+				*clk_table;
+	struct cgs_system_info sys_info = {0};
+	uint32_t dev_id;
+	uint32_t rev_id;
+
+	PP_ASSERT_WITH_CODE((clk_dep_table->ucNumEntries != 0),
+			"Invalid PowerPlay Table!", return -1);
+
+/*
+ * Workaround: add another DPM level for pioneer cards,
+ * since the VBIOS is locked down.
+ * This DPM level was added to support 3DPM monitors @ 4K120Hz.
+ *
+ */
+	sys_info.size = sizeof(struct cgs_system_info);
+	sys_info.info_id = CGS_SYSTEM_INFO_PCIE_DEV;
+	cgs_query_system_info(hwmgr->device, &sys_info);
+	dev_id = (uint32_t)sys_info.value;
+
+	sys_info.size = sizeof(struct cgs_system_info);
+	sys_info.info_id = CGS_SYSTEM_INFO_PCIE_REV;
+	cgs_query_system_info(hwmgr->device, &sys_info);
+	rev_id = (uint32_t)sys_info.value;
+
+	if (dev_id == 0x6863 && rev_id == 0 &&
+		le32_to_cpu(clk_dep_table->entries[clk_dep_table->ucNumEntries - 1].ulClk) < 90000)
+		num_entries = clk_dep_table->ucNumEntries + 1 > NUM_DSPCLK_LEVELS ?
+				NUM_DSPCLK_LEVELS : clk_dep_table->ucNumEntries + 1;
+	else
+		num_entries = clk_dep_table->ucNumEntries;
+
+
+	table_size = sizeof(uint32_t) +
+			sizeof(phm_ppt_v1_clock_voltage_dependency_record) *
+			num_entries;
+
+	clk_table = (struct phm_ppt_v1_clock_voltage_dependency_table *)
+			kzalloc(table_size, GFP_KERNEL);
+
+	if (!clk_table)
+		return -ENOMEM;
+
+	clk_table->count = (uint32_t)num_entries;
+
+	for (i = 0; i < clk_dep_table->ucNumEntries; i++) {
+		clk_table->entries[i].vddInd =
+				clk_dep_table->entries[i].ucVddInd;
+		clk_table->entries[i].clk =
+				le32_to_cpu(clk_dep_table->entries[i].ulClk);
+	}
+
+	if (i < num_entries) {
+		clk_table->entries[i].vddInd = clk_dep_table->entries[i-1].ucVddInd;
+		clk_table->entries[i].clk = 90000;
+	}
+
+	*pp_vega10_clk_dep_table = clk_table;
+
+	return 0;
+}
+
 static int get_pcie_table(struct pp_hwmgr *hwmgr,
 		struct phm_ppt_v1_pcie_table **vega10_pcie_table,
 		const Vega10_PPTable_Generic_SubTable_Header *table)
@@ -862,21 +934,21 @@ static int init_powerplay_extended_tables(
 				gfxclk_dep_table);
 
 	if (!result && powerplay_table->usPixclkDependencyTableOffset)
-		result = get_dcefclk_voltage_dependency_table(hwmgr,
+		result = get_pix_clk_voltage_dependency_table(hwmgr,
 				&pp_table_info->vdd_dep_on_pixclk,
-				(const ATOM_Vega10_DCEFCLK_Dependency_Table*)
+				(const ATOM_Vega10_PIXCLK_Dependency_Table*)
 				pixclk_dep_table);
 
 	if (!result && powerplay_table->usPhyClkDependencyTableOffset)
-		result = get_dcefclk_voltage_dependency_table(hwmgr,
+		result = get_pix_clk_voltage_dependency_table(hwmgr,
 				&pp_table_info->vdd_dep_on_phyclk,
-				(const ATOM_Vega10_DCEFCLK_Dependency_Table *)
+				(const ATOM_Vega10_PIXCLK_Dependency_Table *)
 				phyclk_dep_table);
 
 	if (!result && powerplay_table->usDispClkDependencyTableOffset)
-		result = get_dcefclk_voltage_dependency_table(hwmgr,
+		result = get_pix_clk_voltage_dependency_table(hwmgr,
 				&pp_table_info->vdd_dep_on_dispclk,
-				(const ATOM_Vega10_DCEFCLK_Dependency_Table *)
+				(const ATOM_Vega10_PIXCLK_Dependency_Table *)
 				dispclk_dep_table);
 
 	if (!result && powerplay_table->usDcefclkDependencyTableOffset)
diff --git a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c
index e7ab8eb..d442434 100644
--- a/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c
+++ b/drivers/gpu/drm/amd/powerplay/hwmgr/vega10_thermal.c
@@ -321,10 +321,7 @@ int vega10_fan_ctrl_reset_fan_speed_to_default(struct pp_hwmgr *hwmgr)
 
 	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps,
 			PHM_PlatformCaps_MicrocodeFanControl)) {
-		result = vega10_fan_ctrl_set_static_mode(hwmgr,
-				FDO_PWM_MODE_STATIC);
-		if (!result)
-			result = vega10_fan_ctrl_start_smc_fan_control(hwmgr);
+		result = vega10_fan_ctrl_start_smc_fan_control(hwmgr);
 	} else
 		result = vega10_fan_ctrl_set_default_mode(hwmgr);
 
@@ -633,7 +630,6 @@ int tf_vega10_thermal_start_smc_fan_control(struct pp_hwmgr *hwmgr,
 	if (phm_cap_enabled(hwmgr->platform_descriptor.platformCaps,
 			PHM_PlatformCaps_MicrocodeFanControl)) {
 		vega10_fan_ctrl_start_smc_fan_control(hwmgr);
-		vega10_fan_ctrl_set_static_mode(hwmgr, FDO_PWM_MODE_STATIC);
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h b/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
index a1ebe10..a4c8b09 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/hardwaremanager.h
@@ -164,9 +164,14 @@ enum phm_platform_caps {
 	PHM_PlatformCaps_EnablePlatformPowerManagement,         /* indicates that Platform Power Management feature is supported */
 	PHM_PlatformCaps_SurpriseRemoval,                       /* indicates that surprise removal feature is requested */
 	PHM_PlatformCaps_NewCACVoltage,                         /* indicates new CAC voltage table support */
+	PHM_PlatformCaps_DiDtSupport,                           /* for dI/dT feature */
 	PHM_PlatformCaps_DBRamping,                             /* for dI/dT feature */
 	PHM_PlatformCaps_TDRamping,                             /* for dI/dT feature */
 	PHM_PlatformCaps_TCPRamping,                            /* for dI/dT feature */
+	PHM_PlatformCaps_DBRRamping,                            /* for dI/dT feature */
+	PHM_PlatformCaps_DiDtEDCEnable,                         /* for dI/dT feature */
+	PHM_PlatformCaps_GCEDC,                                 /* for dI/dT feature */
+	PHM_PlatformCaps_PSM,                                   /* for dI/dT feature */
 	PHM_PlatformCaps_EnableSMU7ThermalManagement,           /* SMC will manage thermal events */
 	PHM_PlatformCaps_FPS,                                   /* FPS support */
 	PHM_PlatformCaps_ACP,                                   /* ACP support */
diff --git a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
index 47e57bd..91b0105 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/hwmgr.h
@@ -128,6 +128,8 @@ struct phm_uvd_arbiter {
 	uint32_t dclk;
 	uint32_t vclk_ceiling;
 	uint32_t dclk_ceiling;
+	uint32_t vclk_soft_min;
+	uint32_t dclk_soft_min;
 };
 
 struct phm_vce_arbiter {
diff --git a/drivers/gpu/drm/amd/powerplay/inc/pp_debug.h b/drivers/gpu/drm/amd/powerplay/inc/pp_debug.h
index f3f9ebb..822cd8b 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/pp_debug.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/pp_debug.h
@@ -42,6 +42,12 @@
 		}				\
 	} while (0)
 
+#define PP_ASSERT(cond, msg)	\
+	do {					\
+		if (!(cond)) {			\
+			pr_warn("%s\n", msg);	\
+		}				\
+	} while (0)
 
 #define PP_DBG_LOG(fmt, ...) \
 	do { \
diff --git a/drivers/gpu/drm/amd/powerplay/inc/pp_soc15.h b/drivers/gpu/drm/amd/powerplay/inc/pp_soc15.h
index 227d999..a511611 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/pp_soc15.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/pp_soc15.h
@@ -41,6 +41,8 @@ inline static uint32_t soc15_get_register_offset(
 		reg = MP1_BASE.instance[inst].segment[segment] + offset;
 	else if (hw_id == DF_HWID)
 		reg = DF_BASE.instance[inst].segment[segment] + offset;
+	else if (hw_id == GC_HWID)
+		reg = GC_BASE.instance[inst].segment[segment] + offset;
 
 	return reg;
 }
diff --git a/drivers/gpu/drm/amd/powerplay/inc/rv_ppsmc.h b/drivers/gpu/drm/amd/powerplay/inc/rv_ppsmc.h
index e0e106f..901c960c 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/rv_ppsmc.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/rv_ppsmc.h
@@ -66,7 +66,12 @@
 #define PPSMC_MSG_SetMinVddcrSocVoltage         0x22
 #define PPSMC_MSG_SetMinVideoFclkFreq           0x23
 #define PPSMC_MSG_SetMinDeepSleepDcefclk        0x24
-#define PPSMC_Message_Count                     0x25
+#define PPSMC_MSG_ForcePowerDownGfx             0x25
+#define PPSMC_MSG_SetPhyclkVoltageByFreq        0x26
+#define PPSMC_MSG_SetDppclkVoltageByFreq        0x27
+#define PPSMC_MSG_SetSoftMinVcn                 0x28
+#define PPSMC_Message_Count                     0x29
+
 
 typedef uint16_t PPSMC_Result;
 typedef int      PPSMC_Msg;
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu9.h b/drivers/gpu/drm/amd/powerplay/inc/smu9.h
index 9ef2490..550ed67 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smu9.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smu9.h
@@ -55,9 +55,9 @@
 #define FEATURE_FW_CTF_BIT              23
 #define FEATURE_LED_DISPLAY_BIT         24
 #define FEATURE_FAN_CONTROL_BIT         25
-#define FEATURE_VOLTAGE_CONTROLLER_BIT  26
-#define FEATURE_SPARE_27_BIT            27
-#define FEATURE_SPARE_28_BIT            28
+#define FEATURE_FAST_PPT_BIT            26
+#define FEATURE_GFX_EDC_BIT             27
+#define FEATURE_ACG_BIT                 28
 #define FEATURE_SPARE_29_BIT            29
 #define FEATURE_SPARE_30_BIT            30
 #define FEATURE_SPARE_31_BIT            31
@@ -90,9 +90,10 @@
 #define FFEATURE_FW_CTF_MASK             (1 << FEATURE_FW_CTF_BIT             )
 #define FFEATURE_LED_DISPLAY_MASK        (1 << FEATURE_LED_DISPLAY_BIT        )
 #define FFEATURE_FAN_CONTROL_MASK        (1 << FEATURE_FAN_CONTROL_BIT        )
-#define FFEATURE_VOLTAGE_CONTROLLER_MASK (1 << FEATURE_VOLTAGE_CONTROLLER_BIT )
-#define FFEATURE_SPARE_27_MASK           (1 << FEATURE_SPARE_27_BIT           )
-#define FFEATURE_SPARE_28_MASK           (1 << FEATURE_SPARE_28_BIT           )
+
+#define FEATURE_FAST_PPT_MASK            (1 << FEATURE_FAST_PPT_BIT           )
+#define FEATURE_GFX_EDC_MASK             (1 << FEATURE_GFX_EDC_BIT            )
+#define FEATURE_ACG_MASK                 (1 << FEATURE_ACG_BIT                )
 #define FFEATURE_SPARE_29_MASK           (1 << FEATURE_SPARE_29_BIT           )
 #define FFEATURE_SPARE_30_MASK           (1 << FEATURE_SPARE_30_BIT           )
 #define FFEATURE_SPARE_31_MASK           (1 << FEATURE_SPARE_31_BIT           )
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h b/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
index 532186b..f6d6c61 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smu9_driver_if.h
@@ -312,7 +312,10 @@ typedef struct {
 
   PllSetting_t GfxBoostState;
 
-  uint32_t     Reserved[14];
+  uint8_t      AcgEnable[NUM_GFXCLK_DPM_LEVELS];
+  GbVdroopTable_t AcgBtcGbVdroopTable;
+  QuadraticInt_t  AcgAvfsGb;
+  uint32_t     Reserved[4];
 
   /* Padding - ignore */
   uint32_t     MmHubPadding[7]; /* SMU internal use */
diff --git a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
index 976e942..5d61cc9 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/smumgr.h
@@ -131,6 +131,7 @@ struct pp_smumgr_func {
 	bool (*is_dpm_running)(struct pp_hwmgr *hwmgr);
 	int (*populate_requested_graphic_levels)(struct pp_hwmgr *hwmgr,
 			struct amd_pp_profile *request);
+	bool (*is_hw_avfs_present)(struct pp_smumgr *smumgr);
 };
 
 struct pp_smumgr {
@@ -202,6 +203,8 @@ extern bool smum_is_dpm_running(struct pp_hwmgr *hwmgr);
 extern int smum_populate_requested_graphic_levels(struct pp_hwmgr *hwmgr,
 		struct amd_pp_profile *request);
 
+extern bool smum_is_hw_avfs_present(struct pp_smumgr *smumgr);
+
 #define SMUM_FIELD_SHIFT(reg, field) reg##__##field##__SHIFT
 
 #define SMUM_FIELD_MASK(reg, field) reg##__##field##_MASK
diff --git a/drivers/gpu/drm/amd/powerplay/inc/vega10_ppsmc.h b/drivers/gpu/drm/amd/powerplay/inc/vega10_ppsmc.h
index b4af9e8..cb070eb 100644
--- a/drivers/gpu/drm/amd/powerplay/inc/vega10_ppsmc.h
+++ b/drivers/gpu/drm/amd/powerplay/inc/vega10_ppsmc.h
@@ -124,6 +124,10 @@ typedef uint16_t PPSMC_Result;
 #define PPSMC_MSG_NumOfDisplays                  0x56
 #define PPSMC_MSG_ReadSerialNumTop32             0x58
 #define PPSMC_MSG_ReadSerialNumBottom32          0x59
+#define PPSMC_MSG_RunAcgBtc                      0x5C
+#define PPSMC_MSG_RunAcgInClosedLoop             0x5D
+#define PPSMC_MSG_RunAcgInOpenLoop               0x5E
+#define PPSMC_MSG_InitializeAcg                  0x5F
 #define PPSMC_MSG_GetCurrPkgPwr                  0x61
 #define PPSMC_Message_Count                      0x62
 
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
index 6a320b2..8712f09 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.c
@@ -2129,6 +2129,25 @@ int fiji_thermal_setup_fan_table(struct pp_hwmgr *hwmgr)
 	return 0;
 }
 
+
+int fiji_thermal_avfs_enable(struct pp_hwmgr *hwmgr)
+{
+	int ret;
+	struct pp_smumgr *smumgr = (struct pp_smumgr *)(hwmgr->smumgr);
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
+
+	if (smu_data->avfs.avfs_btc_status != AVFS_BTC_ENABLEAVFS)
+		return 0;
+
+	ret = smum_send_msg_to_smc(smumgr, PPSMC_MSG_EnableAvfs);
+
+	if (!ret)
+		/* If this param is not changed, this function could fire unnecessarily */
+		smu_data->avfs.avfs_btc_status = AVFS_BTC_COMPLETED_PREVIOUSLY;
+
+	return ret;
+}
+
 static int fiji_program_mem_timing_parameters(struct pp_hwmgr *hwmgr)
 {
 	struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h
index 0e9e1f2..d9c72d9 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smc.h
@@ -48,5 +48,6 @@ int fiji_initialize_mc_reg_table(struct pp_hwmgr *hwmgr);
 bool fiji_is_dpm_running(struct pp_hwmgr *hwmgr);
 int fiji_populate_requested_graphic_levels(struct pp_hwmgr *hwmgr,
 		struct amd_pp_profile *request);
+int fiji_thermal_avfs_enable(struct pp_hwmgr *hwmgr);
 #endif
 
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
index a1cb785..6ae948f 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.c
@@ -161,56 +161,47 @@ static int fiji_start_smu_in_non_protection_mode(struct pp_smumgr *smumgr)
 
 static int fiji_setup_pwr_virus(struct pp_smumgr *smumgr)
 {
-	int i, result = -1;
+	int i;
+	int result = -EINVAL;
 	uint32_t reg, data;
-	const PWR_Command_Table *virus = PwrVirusTable;
-	struct fiji_smumgr *priv = (struct fiji_smumgr *)(smumgr->backend);
 
-	priv->avfs.AvfsBtcStatus = AVFS_LOAD_VIRUS;
-	for (i = 0; (i < PWR_VIRUS_TABLE_SIZE); i++) {
-		switch (virus->command) {
+	const PWR_Command_Table *pvirus = PwrVirusTable;
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
+
+	for (i = 0; i < PWR_VIRUS_TABLE_SIZE; i++) {
+		switch (pvirus->command) {
 		case PwrCmdWrite:
-			reg  = virus->reg;
-			data = virus->data;
+			reg  = pvirus->reg;
+			data = pvirus->data;
 			cgs_write_register(smumgr->device, reg, data);
 			break;
+
 		case PwrCmdEnd:
-			priv->avfs.AvfsBtcStatus = AVFS_BTC_VIRUS_LOADED;
 			result = 0;
 			break;
+
 		default:
-			pr_err("Table Exit with Invalid Command!");
-			priv->avfs.AvfsBtcStatus = AVFS_BTC_VIRUS_FAIL;
-			result = -1;
+			pr_info("Table Exit with Invalid Command!");
+			smu_data->avfs.avfs_btc_status = AVFS_BTC_VIRUS_FAIL;
+			result = -EINVAL;
 			break;
 		}
-		virus++;
+		pvirus++;
 	}
+
 	return result;
 }
 
 static int fiji_start_avfs_btc(struct pp_smumgr *smumgr)
 {
 	int result = 0;
-	struct fiji_smumgr *priv = (struct fiji_smumgr *)(smumgr->backend);
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
 
-	priv->avfs.AvfsBtcStatus = AVFS_BTC_STARTED;
-	if (priv->avfs.AvfsBtcParam) {
-		if (!smum_send_msg_to_smc_with_parameter(smumgr,
-				PPSMC_MSG_PerformBtc, priv->avfs.AvfsBtcParam)) {
-			if (!smum_send_msg_to_smc(smumgr, PPSMC_MSG_EnableAvfs)) {
-				priv->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_UNSAVED;
-				result = 0;
-			} else {
-				pr_err("[AVFS][fiji_start_avfs_btc] Attempt"
-						" to Enable AVFS Failed!");
-				smum_send_msg_to_smc(smumgr, PPSMC_MSG_DisableAvfs);
-				result = -1;
-			}
-		} else {
-			pr_err("[AVFS][fiji_start_avfs_btc] "
-					"PerformBTC SMU msg failed");
-			result = -1;
+	if (0 != smu_data->avfs.avfs_btc_param) {
+		if (0 != smu7_send_msg_to_smc_with_parameter(smumgr,
+				PPSMC_MSG_PerformBtc, smu_data->avfs.avfs_btc_param)) {
+			pr_info("[AVFS][Fiji_PerformBtc] PerformBTC SMU msg failed\n");
+			result = -EINVAL;
 		}
 	}
 	/* Soft-Reset to reset the engine before loading uCode */
@@ -224,42 +215,6 @@ static int fiji_start_avfs_btc(struct pp_smumgr *smumgr)
 	return result;
 }
 
-static int fiji_setup_pm_fuse_for_avfs(struct pp_smumgr *smumgr)
-{
-	int result = 0;
-	uint32_t table_start;
-	uint32_t charz_freq_addr, inversion_voltage_addr, charz_freq;
-	uint16_t inversion_voltage;
-
-	charz_freq = 0x30750000; /* In 10KHz units 0x00007530 Actual value */
-	inversion_voltage = 0x1A04; /* mV Q14.2 0x41A Actual value */
-
-	PP_ASSERT_WITH_CODE(0 == smu7_read_smc_sram_dword(smumgr,
-			SMU7_FIRMWARE_HEADER_LOCATION + offsetof(SMU73_Firmware_Header,
-					PmFuseTable), &table_start, 0x40000),
-			"[AVFS][Fiji_SetupGfxLvlStruct] SMU could not communicate "
-			"starting address of PmFuse structure",
-			return -1;);
-
-	charz_freq_addr = table_start +
-			offsetof(struct SMU73_Discrete_PmFuses, PsmCharzFreq);
-	inversion_voltage_addr = table_start +
-			offsetof(struct SMU73_Discrete_PmFuses, InversionVoltage);
-
-	result = smu7_copy_bytes_to_smc(smumgr, charz_freq_addr,
-			(uint8_t *)(&charz_freq), sizeof(charz_freq), 0x40000);
-	PP_ASSERT_WITH_CODE(0 == result,
-			"[AVFS][fiji_setup_pm_fuse_for_avfs] charz_freq could not "
-			"be populated.", return -1;);
-
-	result = smu7_copy_bytes_to_smc(smumgr, inversion_voltage_addr,
-			(uint8_t *)(&inversion_voltage), sizeof(inversion_voltage), 0x40000);
-	PP_ASSERT_WITH_CODE(0 == result, "[AVFS][fiji_setup_pm_fuse_for_avfs] "
-			"charz_freq could not be populated.", return -1;);
-
-	return result;
-}
-
 static int fiji_setup_graphics_level_structure(struct pp_smumgr *smumgr)
 {
 	int32_t vr_config;
@@ -298,93 +253,41 @@ static int fiji_setup_graphics_level_structure(struct pp_smumgr *smumgr)
 	return 0;
 }
 
-/* Work in Progress */
-static int fiji_restore_vft_table(struct pp_smumgr *smumgr)
-{
-	struct fiji_smumgr *priv = (struct fiji_smumgr *)(smumgr->backend);
-
-	if (AVFS_BTC_COMPLETED_SAVED == priv->avfs.AvfsBtcStatus) {
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_RESTORED;
-		return 0;
-	} else
-		return -EINVAL;
-}
-
-/* Work in Progress */
-static int fiji_save_vft_table(struct pp_smumgr *smumgr)
-{
-	struct fiji_smumgr *priv = (struct fiji_smumgr *)(smumgr->backend);
-
-	if (AVFS_BTC_COMPLETED_SAVED == priv->avfs.AvfsBtcStatus) {
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_RESTORED;
-		return 0;
-	} else
-		return -EINVAL;
-}
-
 static int fiji_avfs_event_mgr(struct pp_smumgr *smumgr, bool smu_started)
 {
-	struct fiji_smumgr *priv = (struct fiji_smumgr *)(smumgr->backend);
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
 
-	switch (priv->avfs.AvfsBtcStatus) {
-	case AVFS_BTC_COMPLETED_SAVED: /*S3 State - Pre SMU Start */
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_RESTOREVFT_FAILED;
-		PP_ASSERT_WITH_CODE(0 == fiji_restore_vft_table(smumgr),
-				"[AVFS][fiji_avfs_event_mgr] Could not Copy Graphics "
-				"Level table over to SMU",
-				return -1;);
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_RESTORED;
+	switch (smu_data->avfs.avfs_btc_status) {
+	case AVFS_BTC_COMPLETED_PREVIOUSLY:
 		break;
-	case AVFS_BTC_COMPLETED_RESTORED: /*S3 State - Post SMU Start*/
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_SMUMSG_ERROR;
-		PP_ASSERT_WITH_CODE(0 == smum_send_msg_to_smc(smumgr,
-				0x666),
-				"[AVFS][fiji_avfs_event_mgr] SMU did not respond "
-				"correctly to VftTableIsValid Msg",
-				return -1;);
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_SMUMSG_ERROR;
-		PP_ASSERT_WITH_CODE(0 == smum_send_msg_to_smc(smumgr,
-				PPSMC_MSG_EnableAvfs),
-				"[AVFS][fiji_avfs_event_mgr] SMU did not respond "
-				"correctly to EnableAvfs Message Msg",
-				return -1;);
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_SAVED;
-		break;
+
 	case AVFS_BTC_BOOT: /*Cold Boot State - Post SMU Start*/
 		if (!smu_started)
 			break;
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_FAILED;
-		PP_ASSERT_WITH_CODE(0 == fiji_setup_pm_fuse_for_avfs(smumgr),
-				"[AVFS][fiji_avfs_event_mgr] Failure at "
-				"fiji_setup_pm_fuse_for_avfs",
-				return -1;);
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_DPMTABLESETUP_FAILED;
+		smu_data->avfs.avfs_btc_status = AVFS_BTC_FAILED;
 		PP_ASSERT_WITH_CODE(0 == fiji_setup_graphics_level_structure(smumgr),
 				"[AVFS][fiji_avfs_event_mgr] Could not Copy Graphics Level"
 				" table over to SMU",
-				return -1;);
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_VIRUS_FAIL;
+				return -EINVAL;);
+		smu_data->avfs.avfs_btc_status = AVFS_BTC_VIRUS_FAIL;
 		PP_ASSERT_WITH_CODE(0 == fiji_setup_pwr_virus(smumgr),
 				"[AVFS][fiji_avfs_event_mgr] Could not setup "
 				"Pwr Virus for AVFS ",
-				return -1;);
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_FAILED;
+				return -EINVAL;);
+		smu_data->avfs.avfs_btc_status = AVFS_BTC_FAILED;
 		PP_ASSERT_WITH_CODE(0 == fiji_start_avfs_btc(smumgr),
 				"[AVFS][fiji_avfs_event_mgr] Failure at "
 				"fiji_start_avfs_btc. AVFS Disabled",
-				return -1;);
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_SAVEVFT_FAILED;
-		PP_ASSERT_WITH_CODE(0 == fiji_save_vft_table(smumgr),
-				"[AVFS][fiji_avfs_event_mgr] Could not save VFT Table",
-				return -1;);
-		priv->avfs.AvfsBtcStatus = AVFS_BTC_COMPLETED_SAVED;
+				return -EINVAL;);
+
+		smu_data->avfs.avfs_btc_status = AVFS_BTC_ENABLEAVFS;
 		break;
 	case AVFS_BTC_DISABLED: /* Do nothing */
-		break;
 	case AVFS_BTC_NOTSUPPORTED: /* Do nothing */
+	case AVFS_BTC_ENABLEAVFS:
 		break;
 	default:
-		pr_err("[AVFS] Something is broken. See log!");
+		pr_err("AVFS failed status is %x !\n", smu_data->avfs.avfs_btc_status);
 		break;
 	}
 	return 0;
@@ -477,19 +380,6 @@ static int fiji_smu_init(struct pp_smumgr *smumgr)
 	if (smu7_init(smumgr))
 		return -EINVAL;
 
-	fiji_priv->avfs.AvfsBtcStatus = AVFS_BTC_BOOT;
-	if (fiji_is_hw_avfs_present(smumgr))
-		/* AVFS Parameter
-		 * 0 - BTC DC disabled, BTC AC disabled
-		 * 1 - BTC DC enabled,  BTC AC disabled
-		 * 2 - BTC DC disabled, BTC AC enabled
-		 * 3 - BTC DC enabled,  BTC AC enabled
-		 * Default is 0 - BTC DC disabled, BTC AC disabled
-		 */
-		fiji_priv->avfs.AvfsBtcParam = 0;
-	else
-		fiji_priv->avfs.AvfsBtcStatus = AVFS_BTC_NOTSUPPORTED;
-
 	for (i = 0; i < SMU73_MAX_LEVELS_GRAPHICS; i++)
 		fiji_priv->activity_target[i] = 30;
 
@@ -514,10 +404,12 @@ const struct pp_smumgr_func fiji_smu_funcs = {
 	.init_smc_table = fiji_init_smc_table,
 	.update_sclk_threshold = fiji_update_sclk_threshold,
 	.thermal_setup_fan_table = fiji_thermal_setup_fan_table,
+	.thermal_avfs_enable = fiji_thermal_avfs_enable,
 	.populate_all_graphic_levels = fiji_populate_all_graphic_levels,
 	.populate_all_memory_levels = fiji_populate_all_memory_levels,
 	.get_mac_definition = fiji_get_mac_definition,
 	.initialize_mc_reg_table = fiji_initialize_mc_reg_table,
 	.is_dpm_running = fiji_is_dpm_running,
 	.populate_requested_graphic_levels = fiji_populate_requested_graphic_levels,
+	.is_hw_avfs_present = fiji_is_hw_avfs_present,
 };
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.h b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.h
index adcbdfb..175bf9f 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.h
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/fiji_smumgr.h
@@ -28,17 +28,8 @@
 #include "smu7_smumgr.h"
 
 
-
-struct fiji_smu_avfs {
-	enum AVFS_BTC_STATUS AvfsBtcStatus;
-	uint32_t           AvfsBtcParam;
-};
-
-
 struct fiji_smumgr {
 	struct smu7_smumgr                   smu7_data;
-
-	struct fiji_smu_avfs avfs;
 	struct SMU73_Discrete_DpmTable       smc_state_table;
 	struct SMU73_Discrete_Ulv            ulv_setting;
 	struct SMU73_Discrete_PmFuses  power_tune_table;
@@ -47,7 +38,5 @@ struct fiji_smumgr {
 
 };
 
-
-
 #endif
 
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smc.c b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smc.c
index f68e759..99a00bd 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smc.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smc.c
@@ -1498,7 +1498,7 @@ static int polaris10_populate_avfs_parameters(struct pp_hwmgr *hwmgr)
 			table_info->vdd_dep_on_sclk;
 
 
-	if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
+	if (((struct smu7_smumgr *)smu_data)->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
 		return result;
 
 	result = atomctrl_get_avfs_information(hwmgr, &avfs_params);
@@ -1889,7 +1889,7 @@ int polaris10_thermal_avfs_enable(struct pp_hwmgr *hwmgr)
 {
 	int ret;
 	struct pp_smumgr *smumgr = (struct pp_smumgr *)(hwmgr->smumgr);
-	struct polaris10_smumgr *smu_data = (struct polaris10_smumgr *)(smumgr->backend);
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
 	struct smu7_hwmgr *data = (struct smu7_hwmgr *)(hwmgr->backend);
 
 	if (smu_data->avfs.avfs_btc_status == AVFS_BTC_NOTSUPPORTED)
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
index 9616ced..75f43da 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.c
@@ -60,16 +60,14 @@ static const SMU74_Discrete_GraphicsLevel avfs_graphics_level_polaris10[8] = {
 static const SMU74_Discrete_MemoryLevel avfs_memory_level_polaris10 = {
 	0x100ea446, 0, 0x30750000, 0x01, 0x01, 0x01, 0x00, 0x00, 0x64, 0x00, 0x00, 0x1f00, 0x00, 0x00};
 
-
 static int polaris10_setup_pwr_virus(struct pp_smumgr *smumgr)
 {
 	int i;
-	int result = -1;
+	int result = -EINVAL;
 	uint32_t reg, data;
 
 	const PWR_Command_Table *pvirus = pwr_virus_table;
-	struct polaris10_smumgr *smu_data = (struct polaris10_smumgr *)(smumgr->backend);
-
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
 
 	for (i = 0; i < PWR_VIRUS_TABLE_SIZE; i++) {
 		switch (pvirus->command) {
@@ -86,7 +84,7 @@ static int polaris10_setup_pwr_virus(struct pp_smumgr *smumgr)
 		default:
 			pr_info("Table Exit with Invalid Command!");
 			smu_data->avfs.avfs_btc_status = AVFS_BTC_VIRUS_FAIL;
-			result = -1;
+			result = -EINVAL;
 			break;
 		}
 		pvirus++;
@@ -98,7 +96,7 @@ static int polaris10_setup_pwr_virus(struct pp_smumgr *smumgr)
 static int polaris10_perform_btc(struct pp_smumgr *smumgr)
 {
 	int result = 0;
-	struct polaris10_smumgr *smu_data = (struct polaris10_smumgr *)(smumgr->backend);
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
 
 	if (0 != smu_data->avfs.avfs_btc_param) {
 		if (0 != smu7_send_msg_to_smc_with_parameter(smumgr, PPSMC_MSG_PerformBtc, smu_data->avfs.avfs_btc_param)) {
@@ -172,10 +170,11 @@ static int polaris10_setup_graphics_level_structure(struct pp_smumgr *smumgr)
 	return 0;
 }
 
+
 static int
 polaris10_avfs_event_mgr(struct pp_smumgr *smumgr, bool SMU_VFT_INTACT)
 {
-	struct polaris10_smumgr *smu_data = (struct polaris10_smumgr *)(smumgr->backend);
+	struct smu7_smumgr *smu_data = (struct smu7_smumgr *)(smumgr->backend);
 
 	switch (smu_data->avfs.avfs_btc_status) {
 	case AVFS_BTC_COMPLETED_PREVIOUSLY:
@@ -185,30 +184,31 @@ polaris10_avfs_event_mgr(struct pp_smumgr *smumgr, bool SMU_VFT_INTACT)
 
 		smu_data->avfs.avfs_btc_status = AVFS_BTC_DPMTABLESETUP_FAILED;
 		PP_ASSERT_WITH_CODE(0 == polaris10_setup_graphics_level_structure(smumgr),
-		"[AVFS][Polaris10_AVFSEventMgr] Could not Copy Graphics Level table over to SMU",
-		return -1);
+			"[AVFS][Polaris10_AVFSEventMgr] Could not Copy Graphics Level table over to SMU",
+			return -EINVAL);
 
 		if (smu_data->avfs.avfs_btc_param > 1) {
 			pr_info("[AVFS][Polaris10_AVFSEventMgr] AC BTC has not been successfully verified on Fiji. There may be in this setting.");
 			smu_data->avfs.avfs_btc_status = AVFS_BTC_VIRUS_FAIL;
-			PP_ASSERT_WITH_CODE(-1 == polaris10_setup_pwr_virus(smumgr),
+			PP_ASSERT_WITH_CODE(0 == polaris10_setup_pwr_virus(smumgr),
 			"[AVFS][Polaris10_AVFSEventMgr] Could not setup Pwr Virus for AVFS ",
-			return -1);
+			return -EINVAL);
 		}
 
 		smu_data->avfs.avfs_btc_status = AVFS_BTC_FAILED;
 		PP_ASSERT_WITH_CODE(0 == polaris10_perform_btc(smumgr),
 					"[AVFS][Polaris10_AVFSEventMgr] Failure at SmuPolaris10_PerformBTC. AVFS Disabled",
-				 return -1);
-
+				 return -EINVAL);
+		smu_data->avfs.avfs_btc_status = AVFS_BTC_ENABLEAVFS;
 		break;
 
 	case AVFS_BTC_DISABLED:
+	case AVFS_BTC_ENABLEAVFS:
 	case AVFS_BTC_NOTSUPPORTED:
 		break;
 
 	default:
-		pr_info("[AVFS] Something is broken. See log!");
+		pr_err("AVFS failed status is %x!\n", smu_data->avfs.avfs_btc_status);
 		break;
 	}
 
@@ -376,11 +376,6 @@ static int polaris10_smu_init(struct pp_smumgr *smumgr)
 	if (smu7_init(smumgr))
 		return -EINVAL;
 
-	if (polaris10_is_hw_avfs_present(smumgr))
-		smu_data->avfs.avfs_btc_status = AVFS_BTC_BOOT;
-	else
-		smu_data->avfs.avfs_btc_status = AVFS_BTC_NOTSUPPORTED;
-
 	for (i = 0; i < SMU74_MAX_LEVELS_GRAPHICS; i++)
 		smu_data->activity_target[i] = PPPOLARIS10_TARGETACTIVITY_DFLT;
 
@@ -410,4 +405,5 @@ const struct pp_smumgr_func polaris10_smu_funcs = {
 	.get_mac_definition = polaris10_get_mac_definition,
 	.is_dpm_running = polaris10_is_dpm_running,
 	.populate_requested_graphic_levels = polaris10_populate_requested_graphic_levels,
+	.is_hw_avfs_present = polaris10_is_hw_avfs_present,
 };
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.h b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.h
index 49ebf1d..5e19c24 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.h
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/polaris10_smumgr.h
@@ -32,11 +32,6 @@
 
 #define SMC_RAM_END 0x40000
 
-struct polaris10_avfs {
-	enum AVFS_BTC_STATUS avfs_btc_status;
-	uint32_t           avfs_btc_param;
-};
-
 struct polaris10_pt_defaults {
 	uint8_t   SviLoadLineEn;
 	uint8_t   SviLoadLineVddC;
@@ -51,8 +46,6 @@ struct polaris10_pt_defaults {
 	uint16_t  BAPMTI_RC[SMU74_DTE_ITERATIONS * SMU74_DTE_SOURCES * SMU74_DTE_SINKS];
 };
 
-
-
 struct polaris10_range_table {
 	uint32_t trans_lower_frequency; /* in 10khz */
 	uint32_t trans_upper_frequency;
@@ -61,14 +54,13 @@ struct polaris10_range_table {
 struct polaris10_smumgr {
 	struct smu7_smumgr smu7_data;
 	uint8_t protected_mode;
-	struct polaris10_avfs  avfs;
 	SMU74_Discrete_DpmTable              smc_state_table;
 	struct SMU74_Discrete_Ulv            ulv_setting;
 	struct SMU74_Discrete_PmFuses  power_tune_table;
 	struct polaris10_range_table                range_table[NUM_SCLK_RANGE];
 	const struct polaris10_pt_defaults       *power_tune_defaults;
-	uint32_t                   activity_target[SMU74_MAX_LEVELS_GRAPHICS];
-	uint32_t                   bif_sclk_table[SMU74_MAX_LEVELS_LINK];
+	uint32_t               activity_target[SMU74_MAX_LEVELS_GRAPHICS];
+	uint32_t               bif_sclk_table[SMU74_MAX_LEVELS_LINK];
 };
 
 
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
index 35ac276..76347ff 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.c
@@ -540,7 +540,6 @@ int smu7_upload_smu_firmware_image(struct pp_smumgr *smumgr)
 	return result;
 }
 
-
 int smu7_init(struct pp_smumgr *smumgr)
 {
 	struct smu7_smumgr *smu_data;
@@ -596,6 +595,11 @@ int smu7_init(struct pp_smumgr *smumgr)
 		(cgs_handle_t)smu_data->smu_buffer.handle);
 		return -EINVAL);
 
+	if (smum_is_hw_avfs_present(smumgr))
+		smu_data->avfs.avfs_btc_status = AVFS_BTC_BOOT;
+	else
+		smu_data->avfs.avfs_btc_status = AVFS_BTC_NOTSUPPORTED;
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.h b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.h
index 919be43..ee5e32d 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.h
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smu7_smumgr.h
@@ -37,6 +37,11 @@ struct smu7_buffer_entry {
 	unsigned long  handle;
 };
 
+struct smu7_avfs {
+	enum AVFS_BTC_STATUS avfs_btc_status;
+	uint32_t           avfs_btc_param;
+};
+
 struct smu7_smumgr {
 	uint8_t *header;
 	uint8_t *mec_image;
@@ -50,7 +55,8 @@ struct smu7_smumgr {
 	uint32_t                             arb_table_start;
 	uint32_t                             ulv_setting_starts;
 	uint8_t                              security_hard_key;
-	uint32_t acpi_optimization;
+	uint32_t                             acpi_optimization;
+	struct smu7_avfs                     avfs;
 };
 
 
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
index bcc61ff..3bdf647 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/smumgr.c
@@ -43,7 +43,8 @@ MODULE_FIRMWARE("amdgpu/polaris11_smc.bin");
 MODULE_FIRMWARE("amdgpu/polaris11_smc_sk.bin");
 MODULE_FIRMWARE("amdgpu/polaris11_k_smc.bin");
 MODULE_FIRMWARE("amdgpu/polaris12_smc.bin");
-
+MODULE_FIRMWARE("amdgpu/vega10_smc.bin");
+MODULE_FIRMWARE("amdgpu/vega10_acg_smc.bin");
 
 int smum_early_init(struct pp_instance *handle)
 {
@@ -403,3 +404,11 @@ int smum_populate_requested_graphic_levels(struct pp_hwmgr *hwmgr,
 
 	return 0;
 }
+
+bool smum_is_hw_avfs_present(struct pp_smumgr *smumgr)
+{
+	if (smumgr->smumgr_funcs->is_hw_avfs_present)
+		return smumgr->smumgr_funcs->is_hw_avfs_present(smumgr);
+
+	return false;
+}
diff --git a/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c b/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
index 2696784..408514c 100644
--- a/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
+++ b/drivers/gpu/drm/amd/powerplay/smumgr/vega10_smumgr.c
@@ -356,6 +356,9 @@ int vega10_set_tools_address(struct pp_smumgr *smumgr)
 static int vega10_verify_smc_interface(struct pp_smumgr *smumgr)
 {
 	uint32_t smc_driver_if_version;
+	struct cgs_system_info sys_info = {0};
+	uint32_t dev_id;
+	uint32_t rev_id;
 
 	PP_ASSERT_WITH_CODE(!vega10_send_msg_to_smc(smumgr,
 			PPSMC_MSG_GetDriverIfVersion),
@@ -363,12 +366,27 @@ static int vega10_verify_smc_interface(struct pp_smumgr *smumgr)
 			return -EINVAL);
 	vega10_read_arg_from_smc(smumgr, &smc_driver_if_version);
 
-	if (smc_driver_if_version != SMU9_DRIVER_IF_VERSION) {
-		pr_err("Your firmware(0x%x) doesn't match \
-			SMU9_DRIVER_IF_VERSION(0x%x). \
-			Please update your firmware!\n",
-			smc_driver_if_version, SMU9_DRIVER_IF_VERSION);
-		return -EINVAL;
+	sys_info.size = sizeof(struct cgs_system_info);
+	sys_info.info_id = CGS_SYSTEM_INFO_PCIE_DEV;
+	cgs_query_system_info(smumgr->device, &sys_info);
+	dev_id = (uint32_t)sys_info.value;
+
+	sys_info.size = sizeof(struct cgs_system_info);
+	sys_info.info_id = CGS_SYSTEM_INFO_PCIE_REV;
+	cgs_query_system_info(smumgr->device, &sys_info);
+	rev_id = (uint32_t)sys_info.value;
+
+	if (!((dev_id == 0x687f) &&
+		((rev_id == 0xc0) ||
+		(rev_id == 0xc1) ||
+		(rev_id == 0xc3)))) {
+		if (smc_driver_if_version != SMU9_DRIVER_IF_VERSION) {
+			pr_err("Your firmware(0x%x) doesn't match \
+				SMU9_DRIVER_IF_VERSION(0x%x). \
+				Please update your firmware!\n",
+				smc_driver_if_version, SMU9_DRIVER_IF_VERSION);
+			return -EINVAL;
+		}
 	}
 
 	return 0;
diff --git a/drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h b/drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h
index dbd4fd3a..8bd3810 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h
+++ b/drivers/gpu/drm/amd/scheduler/gpu_sched_trace.h
@@ -16,16 +16,16 @@ TRACE_EVENT(amd_sched_job,
 	    TP_ARGS(sched_job),
 	    TP_STRUCT__entry(
 			     __field(struct amd_sched_entity *, entity)
-			     __field(struct amd_sched_job *, sched_job)
 			     __field(struct dma_fence *, fence)
 			     __field(const char *, name)
+			     __field(uint64_t, id)
 			     __field(u32, job_count)
 			     __field(int, hw_job_count)
 			     ),
 
 	    TP_fast_assign(
 			   __entry->entity = sched_job->s_entity;
-			   __entry->sched_job = sched_job;
+			   __entry->id = sched_job->id;
 			   __entry->fence = &sched_job->s_fence->finished;
 			   __entry->name = sched_job->sched->name;
 			   __entry->job_count = kfifo_len(
@@ -33,8 +33,9 @@ TRACE_EVENT(amd_sched_job,
 			   __entry->hw_job_count = atomic_read(
 				   &sched_job->sched->hw_rq_count);
 			   ),
-	    TP_printk("entity=%p, sched job=%p, fence=%p, ring=%s, job count:%u, hw job count:%d",
-		      __entry->entity, __entry->sched_job, __entry->fence, __entry->name,
+	    TP_printk("entity=%p, id=%llu, fence=%p, ring=%s, job count:%u, hw job count:%d",
+		      __entry->entity, __entry->id,
+		      __entry->fence, __entry->name,
 		      __entry->job_count, __entry->hw_job_count)
 );
 
diff --git a/drivers/gpu/drm/arc/arcpgu_crtc.c b/drivers/gpu/drm/arc/arcpgu_crtc.c
index ad9a959..16903dc 100644
--- a/drivers/gpu/drm/arc/arcpgu_crtc.c
+++ b/drivers/gpu/drm/arc/arcpgu_crtc.c
@@ -64,6 +64,20 @@ static const struct drm_crtc_funcs arc_pgu_crtc_funcs = {
 	.atomic_destroy_state = drm_atomic_helper_crtc_destroy_state,
 };
 
+static enum drm_mode_status arc_pgu_crtc_mode_valid(struct drm_crtc *crtc,
+						    const struct drm_display_mode *mode)
+{
+	struct arcpgu_drm_private *arcpgu = crtc_to_arcpgu_priv(crtc);
+	long rate, clk_rate = mode->clock * 1000;
+	long diff = clk_rate / 200; /* +-0.5% allowed by HDMI spec */
+
+	rate = clk_round_rate(arcpgu->clk, clk_rate);
+	if ((max(rate, clk_rate) - min(rate, clk_rate) < diff) && (rate > 0))
+		return MODE_OK;
+
+	return MODE_NOCLOCK;
+}
+
 static void arc_pgu_crtc_mode_set_nofb(struct drm_crtc *crtc)
 {
 	struct arcpgu_drm_private *arcpgu = crtc_to_arcpgu_priv(crtc);
@@ -106,7 +120,8 @@ static void arc_pgu_crtc_mode_set_nofb(struct drm_crtc *crtc)
 	clk_set_rate(arcpgu->clk, m->crtc_clock * 1000);
 }
 
-static void arc_pgu_crtc_enable(struct drm_crtc *crtc)
+static void arc_pgu_crtc_atomic_enable(struct drm_crtc *crtc,
+				       struct drm_crtc_state *old_state)
 {
 	struct arcpgu_drm_private *arcpgu = crtc_to_arcpgu_priv(crtc);
 
@@ -116,7 +131,8 @@ static void arc_pgu_crtc_enable(struct drm_crtc *crtc)
 		      ARCPGU_CTRL_ENABLE_MASK);
 }
 
-static void arc_pgu_crtc_disable(struct drm_crtc *crtc)
+static void arc_pgu_crtc_atomic_disable(struct drm_crtc *crtc,
+					struct drm_crtc_state *old_state)
 {
 	struct arcpgu_drm_private *arcpgu = crtc_to_arcpgu_priv(crtc);
 
@@ -129,20 +145,6 @@ static void arc_pgu_crtc_disable(struct drm_crtc *crtc)
 			      ~ARCPGU_CTRL_ENABLE_MASK);
 }
 
-static int arc_pgu_crtc_atomic_check(struct drm_crtc *crtc,
-				     struct drm_crtc_state *state)
-{
-	struct arcpgu_drm_private *arcpgu = crtc_to_arcpgu_priv(crtc);
-	struct drm_display_mode *mode = &state->adjusted_mode;
-	long rate, clk_rate = mode->clock * 1000;
-
-	rate = clk_round_rate(arcpgu->clk, clk_rate);
-	if (rate != clk_rate)
-		return -EINVAL;
-
-	return 0;
-}
-
 static void arc_pgu_crtc_atomic_begin(struct drm_crtc *crtc,
 				      struct drm_crtc_state *state)
 {
@@ -158,15 +160,13 @@ static void arc_pgu_crtc_atomic_begin(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs arc_pgu_crtc_helper_funcs = {
+	.mode_valid	= arc_pgu_crtc_mode_valid,
 	.mode_set	= drm_helper_crtc_mode_set,
 	.mode_set_base	= drm_helper_crtc_mode_set_base,
 	.mode_set_nofb	= arc_pgu_crtc_mode_set_nofb,
-	.enable		= arc_pgu_crtc_enable,
-	.disable	= arc_pgu_crtc_disable,
-	.prepare	= arc_pgu_crtc_disable,
-	.commit		= arc_pgu_crtc_enable,
-	.atomic_check	= arc_pgu_crtc_atomic_check,
 	.atomic_begin	= arc_pgu_crtc_atomic_begin,
+	.atomic_enable	= arc_pgu_crtc_atomic_enable,
+	.atomic_disable	= arc_pgu_crtc_atomic_disable,
 };
 
 static void arc_pgu_plane_atomic_update(struct drm_plane *plane,
@@ -218,6 +218,7 @@ static struct drm_plane *arc_pgu_plane_init(struct drm_device *drm)
 
 	ret = drm_universal_plane_init(drm, plane, 0xff, &arc_pgu_plane_funcs,
 				       formats, ARRAY_SIZE(formats),
+				       NULL,
 				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret)
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/arc/arcpgu_drv.c b/drivers/gpu/drm/arc/arcpgu_drv.c
index 3e43a5d..289eda5 100644
--- a/drivers/gpu/drm/arc/arcpgu_drv.c
+++ b/drivers/gpu/drm/arc/arcpgu_drv.c
@@ -31,7 +31,7 @@ static void arcpgu_fb_output_poll_changed(struct drm_device *dev)
 	drm_fbdev_cma_hotplug_event(arcpgu->fbdev);
 }
 
-static struct drm_mode_config_funcs arcpgu_drm_modecfg_funcs = {
+static const struct drm_mode_config_funcs arcpgu_drm_modecfg_funcs = {
 	.fb_create  = drm_fb_cma_create,
 	.output_poll_changed = arcpgu_fb_output_poll_changed,
 	.atomic_check = drm_atomic_helper_check,
@@ -48,29 +48,7 @@ static void arcpgu_setup_mode_config(struct drm_device *drm)
 	drm->mode_config.funcs = &arcpgu_drm_modecfg_funcs;
 }
 
-static int arcpgu_gem_mmap(struct file *filp, struct vm_area_struct *vma)
-{
-	int ret;
-
-	ret = drm_gem_mmap(filp, vma);
-	if (ret)
-		return ret;
-
-	vma->vm_page_prot = pgprot_noncached(vm_get_page_prot(vma->vm_flags));
-	return 0;
-}
-
-static const struct file_operations arcpgu_drm_ops = {
-	.owner = THIS_MODULE,
-	.open = drm_open,
-	.release = drm_release,
-	.unlocked_ioctl = drm_ioctl,
-	.compat_ioctl = drm_compat_ioctl,
-	.poll = drm_poll,
-	.read = drm_read,
-	.llseek = no_llseek,
-	.mmap = arcpgu_gem_mmap,
-};
+DEFINE_DRM_GEM_CMA_FOPS(arcpgu_drm_ops);
 
 static void arcpgu_lastclose(struct drm_device *drm)
 {
@@ -142,7 +120,7 @@ static int arcpgu_load(struct drm_device *drm)
 		return -ENODEV;
 	}
 
-	platform_set_drvdata(pdev, arcpgu);
+	platform_set_drvdata(pdev, drm);
 	return 0;
 }
 
@@ -160,11 +138,37 @@ static int arcpgu_unload(struct drm_device *drm)
 	return 0;
 }
 
+#ifdef CONFIG_DEBUG_FS
+static int arcpgu_show_pxlclock(struct seq_file *m, void *arg)
+{
+	struct drm_info_node *node = (struct drm_info_node *)m->private;
+	struct drm_device *drm = node->minor->dev;
+	struct arcpgu_drm_private *arcpgu = drm->dev_private;
+	unsigned long clkrate = clk_get_rate(arcpgu->clk);
+	unsigned long mode_clock = arcpgu->crtc.mode.crtc_clock * 1000;
+
+	seq_printf(m, "hw  : %lu\n", clkrate);
+	seq_printf(m, "mode: %lu\n", mode_clock);
+	return 0;
+}
+
+static struct drm_info_list arcpgu_debugfs_list[] = {
+	{ "clocks", arcpgu_show_pxlclock, 0 },
+	{ "fb", drm_fb_cma_debugfs_show, 0 },
+};
+
+static int arcpgu_debugfs_init(struct drm_minor *minor)
+{
+	return drm_debugfs_create_files(arcpgu_debugfs_list,
+		ARRAY_SIZE(arcpgu_debugfs_list), minor->debugfs_root, minor);
+}
+#endif
+
 static struct drm_driver arcpgu_drm_driver = {
 	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME |
 			   DRIVER_ATOMIC,
 	.lastclose = arcpgu_lastclose,
-	.name = "drm-arcpgu",
+	.name = "arcpgu",
 	.desc = "ARC PGU Controller",
 	.date = "20160219",
 	.major = 1,
@@ -172,8 +176,6 @@ static struct drm_driver arcpgu_drm_driver = {
 	.patchlevel = 0,
 	.fops = &arcpgu_drm_ops,
 	.dumb_create = drm_gem_cma_dumb_create,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
@@ -185,6 +187,9 @@ static struct drm_driver arcpgu_drm_driver = {
 	.gem_prime_vmap = drm_gem_cma_prime_vmap,
 	.gem_prime_vunmap = drm_gem_cma_prime_vunmap,
 	.gem_prime_mmap = drm_gem_cma_prime_mmap,
+#ifdef CONFIG_DEBUG_FS
+	.debugfs_init = arcpgu_debugfs_init,
+#endif
 };
 
 static int arcpgu_probe(struct platform_device *pdev)
diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c b/drivers/gpu/drm/arm/hdlcd_crtc.c
index d67b6f1..72b22b8 100644
--- a/drivers/gpu/drm/arm/hdlcd_crtc.c
+++ b/drivers/gpu/drm/arm/hdlcd_crtc.c
@@ -165,7 +165,8 @@ static void hdlcd_crtc_mode_set_nofb(struct drm_crtc *crtc)
 	clk_set_rate(hdlcd->clk, m->crtc_clock * 1000);
 }
 
-static void hdlcd_crtc_enable(struct drm_crtc *crtc)
+static void hdlcd_crtc_atomic_enable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	struct hdlcd_drm_private *hdlcd = crtc_to_hdlcd_priv(crtc);
 
@@ -175,7 +176,8 @@ static void hdlcd_crtc_enable(struct drm_crtc *crtc)
 	drm_crtc_vblank_on(crtc);
 }
 
-static void hdlcd_crtc_disable(struct drm_crtc *crtc)
+static void hdlcd_crtc_atomic_disable(struct drm_crtc *crtc,
+				      struct drm_crtc_state *old_state)
 {
 	struct hdlcd_drm_private *hdlcd = crtc_to_hdlcd_priv(crtc);
 
@@ -218,10 +220,10 @@ static void hdlcd_crtc_atomic_begin(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs hdlcd_crtc_helper_funcs = {
-	.enable		= hdlcd_crtc_enable,
-	.disable	= hdlcd_crtc_disable,
 	.atomic_check	= hdlcd_crtc_atomic_check,
 	.atomic_begin	= hdlcd_crtc_atomic_begin,
+	.atomic_enable	= hdlcd_crtc_atomic_enable,
+	.atomic_disable	= hdlcd_crtc_atomic_disable,
 };
 
 static int hdlcd_plane_atomic_check(struct drm_plane *plane,
@@ -313,6 +315,7 @@ static struct drm_plane *hdlcd_plane_init(struct drm_device *drm)
 
 	ret = drm_universal_plane_init(drm, plane, 0xff, &hdlcd_plane_funcs,
 				       formats, ARRAY_SIZE(formats),
+				       NULL,
 				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/arm/hdlcd_drv.c b/drivers/gpu/drm/arm/hdlcd_drv.c
index d3da87f..f9bda7b 100644
--- a/drivers/gpu/drm/arm/hdlcd_drv.c
+++ b/drivers/gpu/drm/arm/hdlcd_drv.c
@@ -253,8 +253,6 @@ static struct drm_driver hdlcd_driver = {
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops = &drm_gem_cma_vm_ops,
 	.dumb_create = drm_gem_cma_dumb_create,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_export = drm_gem_prime_export,
@@ -343,7 +341,6 @@ static int hdlcd_drm_bind(struct device *dev)
 	}
 err_fbdev:
 	drm_kms_helper_poll_fini(drm);
-	drm_vblank_cleanup(drm);
 err_vblank:
 	pm_runtime_disable(drm->dev);
 err_pm_active:
@@ -375,7 +372,6 @@ static void hdlcd_drm_unbind(struct device *dev)
 	component_unbind_all(dev, drm);
 	of_node_put(hdlcd->crtc.port);
 	hdlcd->crtc.port = NULL;
-	drm_vblank_cleanup(drm);
 	pm_runtime_get_sync(drm->dev);
 	drm_irq_uninstall(drm);
 	pm_runtime_put_sync(drm->dev);
diff --git a/drivers/gpu/drm/arm/malidp_crtc.c b/drivers/gpu/drm/arm/malidp_crtc.c
index 4bb38a2..3615d18 100644
--- a/drivers/gpu/drm/arm/malidp_crtc.c
+++ b/drivers/gpu/drm/arm/malidp_crtc.c
@@ -46,7 +46,8 @@ static enum drm_mode_status malidp_crtc_mode_valid(struct drm_crtc *crtc,
 	return MODE_OK;
 }
 
-static void malidp_crtc_enable(struct drm_crtc *crtc)
+static void malidp_crtc_atomic_enable(struct drm_crtc *crtc,
+				      struct drm_crtc_state *old_state)
 {
 	struct malidp_drm *malidp = crtc_to_malidp_device(crtc);
 	struct malidp_hw_device *hwdev = malidp->dev;
@@ -69,7 +70,8 @@ static void malidp_crtc_enable(struct drm_crtc *crtc)
 	drm_crtc_vblank_on(crtc);
 }
 
-static void malidp_crtc_disable(struct drm_crtc *crtc)
+static void malidp_crtc_atomic_disable(struct drm_crtc *crtc,
+				       struct drm_crtc_state *old_state)
 {
 	struct malidp_drm *malidp = crtc_to_malidp_device(crtc);
 	struct malidp_hw_device *hwdev = malidp->dev;
@@ -408,9 +410,9 @@ static int malidp_crtc_atomic_check(struct drm_crtc *crtc,
 
 static const struct drm_crtc_helper_funcs malidp_crtc_helper_funcs = {
 	.mode_valid = malidp_crtc_mode_valid,
-	.enable = malidp_crtc_enable,
-	.disable = malidp_crtc_disable,
 	.atomic_check = malidp_crtc_atomic_check,
+	.atomic_enable = malidp_crtc_atomic_enable,
+	.atomic_disable = malidp_crtc_atomic_disable,
 };
 
 static struct drm_crtc_state *malidp_crtc_duplicate_state(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/arm/malidp_drv.c b/drivers/gpu/drm/arm/malidp_drv.c
index 01b13d2..1a57cc2 100644
--- a/drivers/gpu/drm/arm/malidp_drv.c
+++ b/drivers/gpu/drm/arm/malidp_drv.c
@@ -225,7 +225,7 @@ static void malidp_atomic_commit_tail(struct drm_atomic_state *state)
 
 	drm_atomic_helper_commit_modeset_disables(drm, state);
 
-	for_each_crtc_in_state(state, crtc, old_crtc_state, i) {
+	for_each_old_crtc_in_state(state, crtc, old_crtc_state, i) {
 		malidp_atomic_commit_update_gamma(crtc, old_crtc_state);
 		malidp_atomic_commit_update_coloradj(crtc, old_crtc_state);
 		malidp_atomic_commit_se_config(crtc, old_crtc_state);
@@ -331,8 +331,6 @@ static struct drm_driver malidp_driver = {
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops = &drm_gem_cma_vm_ops,
 	.dumb_create = drm_gem_cma_dumb_create,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_export = drm_gem_prime_export,
diff --git a/drivers/gpu/drm/arm/malidp_planes.c b/drivers/gpu/drm/arm/malidp_planes.c
index 600fa7b..94e7e3f 100644
--- a/drivers/gpu/drm/arm/malidp_planes.c
+++ b/drivers/gpu/drm/arm/malidp_planes.c
@@ -128,7 +128,6 @@ static void malidp_plane_atomic_print_state(struct drm_printer *p,
 static const struct drm_plane_funcs malidp_de_plane_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.destroy = malidp_de_plane_destroy,
 	.reset = malidp_plane_reset,
 	.atomic_duplicate_state = malidp_duplicate_plane_state,
@@ -398,7 +397,7 @@ int malidp_de_planes_init(struct drm_device *drm)
 					DRM_PLANE_TYPE_OVERLAY;
 		ret = drm_universal_plane_init(drm, &plane->base, crtcs,
 					       &malidp_de_plane_funcs, formats,
-					       n, plane_type, NULL);
+					       n, NULL, plane_type, NULL);
 		if (ret < 0)
 			goto cleanup;
 
diff --git a/drivers/gpu/drm/armada/armada_crtc.c b/drivers/gpu/drm/armada/armada_crtc.c
index 4fe19fd..2a4d163 100644
--- a/drivers/gpu/drm/armada/armada_crtc.c
+++ b/drivers/gpu/drm/armada/armada_crtc.c
@@ -334,16 +334,6 @@ static void armada_drm_vblank_off(struct armada_crtc *dcrtc)
 	armada_drm_plane_work_run(dcrtc, dcrtc->crtc.primary);
 }
 
-void armada_drm_crtc_gamma_set(struct drm_crtc *crtc, u16 r, u16 g, u16 b,
-	int idx)
-{
-}
-
-void armada_drm_crtc_gamma_get(struct drm_crtc *crtc, u16 *r, u16 *g, u16 *b,
-	int idx)
-{
-}
-
 /* The mode_config.mutex will be held for this call */
 static void armada_drm_crtc_dpms(struct drm_crtc *crtc, int dpms)
 {
@@ -1150,13 +1140,13 @@ int armada_drm_plane_init(struct armada_plane *plane)
 	return 0;
 }
 
-static struct drm_prop_enum_list armada_drm_csc_yuv_enum_list[] = {
+static const struct drm_prop_enum_list armada_drm_csc_yuv_enum_list[] = {
 	{ CSC_AUTO,        "Auto" },
 	{ CSC_YUV_CCIR601, "CCIR601" },
 	{ CSC_YUV_CCIR709, "CCIR709" },
 };
 
-static struct drm_prop_enum_list armada_drm_csc_rgb_enum_list[] = {
+static const struct drm_prop_enum_list armada_drm_csc_rgb_enum_list[] = {
 	{ CSC_AUTO,         "Auto" },
 	{ CSC_RGB_COMPUTER, "Computer system" },
 	{ CSC_RGB_STUDIO,   "Studio" },
@@ -1269,6 +1259,7 @@ static int armada_drm_crtc_create(struct drm_device *drm, struct device *dev,
 				       &armada_primary_plane_funcs,
 				       armada_primary_formats,
 				       ARRAY_SIZE(armada_primary_formats),
+				       NULL,
 				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		kfree(primary);
@@ -1329,8 +1320,7 @@ armada_lcd_bind(struct device *dev, struct device *master, void *data)
 		port = of_get_child_by_name(parent, "port");
 		of_node_put(np);
 		if (!port) {
-			dev_err(dev, "no port node found in %s\n",
-				parent->full_name);
+			dev_err(dev, "no port node found in %pOF\n", parent);
 			return -ENXIO;
 		}
 
@@ -1364,7 +1354,7 @@ static int armada_lcd_remove(struct platform_device *pdev)
 	return 0;
 }
 
-static struct of_device_id armada_lcd_of_match[] = {
+static const struct of_device_id armada_lcd_of_match[] = {
 	{
 		.compatible	= "marvell,dove-lcd",
 		.data		= &armada510_ops,
diff --git a/drivers/gpu/drm/armada/armada_crtc.h b/drivers/gpu/drm/armada/armada_crtc.h
index 7e8906d3..bab11f4 100644
--- a/drivers/gpu/drm/armada/armada_crtc.h
+++ b/drivers/gpu/drm/armada/armada_crtc.h
@@ -102,8 +102,6 @@ struct armada_crtc {
 };
 #define drm_to_armada_crtc(c) container_of(c, struct armada_crtc, crtc)
 
-void armada_drm_crtc_gamma_set(struct drm_crtc *, u16, u16, u16, int);
-void armada_drm_crtc_gamma_get(struct drm_crtc *, u16 *, u16 *, u16 *, int);
 void armada_drm_crtc_update_regs(struct armada_crtc *, struct armada_regs *);
 
 void armada_drm_crtc_plane_disable(struct armada_crtc *dcrtc,
diff --git a/drivers/gpu/drm/armada/armada_drv.c b/drivers/gpu/drm/armada/armada_drv.c
index e618fab..0b3227c 100644
--- a/drivers/gpu/drm/armada/armada_drv.c
+++ b/drivers/gpu/drm/armada/armada_drv.c
@@ -232,8 +232,8 @@ static void armada_add_endpoints(struct device *dev,
 			of_node_put(remote);
 			continue;
 		} else if (!of_device_is_available(remote->parent)) {
-			dev_warn(dev, "parent device of %s is not available\n",
-				 remote->full_name);
+			dev_warn(dev, "parent device of %pOF is not available\n",
+				 remote);
 			of_node_put(remote);
 			continue;
 		}
diff --git a/drivers/gpu/drm/armada/armada_fbdev.c b/drivers/gpu/drm/armada/armada_fbdev.c
index 602dfea..29c7d04 100644
--- a/drivers/gpu/drm/armada/armada_fbdev.c
+++ b/drivers/gpu/drm/armada/armada_fbdev.c
@@ -81,7 +81,6 @@ static int armada_fb_create(struct drm_fb_helper *fbh,
 
 	strlcpy(info->fix.id, "armada-drmfb", sizeof(info->fix.id));
 	info->par = fbh;
-	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &armada_fb_ops;
 	info->fix.smem_start = obj->phys_addr;
 	info->fix.smem_len = obj->obj.size;
@@ -118,8 +117,6 @@ static int armada_fb_probe(struct drm_fb_helper *fbh,
 }
 
 static const struct drm_fb_helper_funcs armada_fb_helper_funcs = {
-	.gamma_set	= armada_drm_crtc_gamma_set,
-	.gamma_get	= armada_drm_crtc_gamma_get,
 	.fb_probe	= armada_fb_probe,
 };
 
diff --git a/drivers/gpu/drm/armada/armada_overlay.c b/drivers/gpu/drm/armada/armada_overlay.c
index e9a29df..edc4491 100644
--- a/drivers/gpu/drm/armada/armada_overlay.c
+++ b/drivers/gpu/drm/armada/armada_overlay.c
@@ -388,7 +388,7 @@ static const uint32_t armada_ovl_formats[] = {
 	DRM_FORMAT_BGR565,
 };
 
-static struct drm_prop_enum_list armada_drm_colorkey_enum_list[] = {
+static const struct drm_prop_enum_list armada_drm_colorkey_enum_list[] = {
 	{ CKMODE_DISABLE, "disabled" },
 	{ CKMODE_Y,       "Y component" },
 	{ CKMODE_U,       "U component" },
@@ -460,6 +460,7 @@ int armada_overlay_plane_create(struct drm_device *dev, unsigned long crtcs)
 				       &armada_ovl_plane_funcs,
 				       armada_ovl_formats,
 				       ARRAY_SIZE(armada_ovl_formats),
+				       NULL,
 				       DRM_PLANE_TYPE_OVERLAY, NULL);
 	if (ret) {
 		kfree(dplane);
diff --git a/drivers/gpu/drm/ast/ast_dp501.c b/drivers/gpu/drm/ast/ast_dp501.c
index 76f07f3..749646a 100644
--- a/drivers/gpu/drm/ast/ast_dp501.c
+++ b/drivers/gpu/drm/ast/ast_dp501.c
@@ -4,16 +4,11 @@
 #include "ast_drv.h"
 MODULE_FIRMWARE("ast_dp501_fw.bin");
 
-int ast_load_dp501_microcode(struct drm_device *dev)
+static int ast_load_dp501_microcode(struct drm_device *dev)
 {
 	struct ast_private *ast = dev->dev_private;
-	static char *fw_name = "ast_dp501_fw.bin";
-	int err;
-	err = request_firmware(&ast->dp501_fw, fw_name, dev->dev);
-	if (err)
-		return err;
 
-	return 0;
+	return request_firmware(&ast->dp501_fw, "ast_dp501_fw.bin", dev->dev);
 }
 
 static void send_ack(struct ast_private *ast)
@@ -187,7 +182,7 @@ bool ast_backup_fw(struct drm_device *dev, u8 *addr, u32 size)
 	return false;
 }
 
-bool ast_launch_m68k(struct drm_device *dev)
+static bool ast_launch_m68k(struct drm_device *dev)
 {
 	struct ast_private *ast = dev->dev_private;
 	u32 i, data, len = 0;
@@ -201,7 +196,11 @@ bool ast_launch_m68k(struct drm_device *dev)
 		if (ast->dp501_fw_addr) {
 			fw_addr = ast->dp501_fw_addr;
 			len = 32*1024;
-		} else if (ast->dp501_fw) {
+		} else {
+			if (!ast->dp501_fw &&
+			    ast_load_dp501_microcode(dev) < 0)
+				return false;
+
 			fw_addr = (u8 *)ast->dp501_fw->data;
 			len = ast->dp501_fw->size;
 		}
@@ -432,3 +431,11 @@ void ast_init_3rdtx(struct drm_device *dev)
 		}
 	}
 }
+
+void ast_release_firmware(struct drm_device *dev)
+{
+	struct ast_private *ast = dev->dev_private;
+
+	release_firmware(ast->dp501_fw);
+	ast->dp501_fw = NULL;
+}
diff --git a/drivers/gpu/drm/ast/ast_drv.c b/drivers/gpu/drm/ast/ast_drv.c
index fd7c9ee..69dab82 100644
--- a/drivers/gpu/drm/ast/ast_drv.c
+++ b/drivers/gpu/drm/ast/ast_drv.c
@@ -197,7 +197,6 @@ static struct drm_driver driver = {
 
 	.load = ast_driver_load,
 	.unload = ast_driver_unload,
-	.set_busid = drm_pci_set_busid,
 
 	.fops = &ast_fops,
 	.name = DRIVER_NAME,
@@ -210,7 +209,6 @@ static struct drm_driver driver = {
 	.gem_free_object_unlocked = ast_gem_free_object,
 	.dumb_create = ast_dumb_create,
 	.dumb_map_offset = ast_dumb_mmap_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 
 };
 
@@ -221,11 +219,11 @@ static int __init ast_init(void)
 
 	if (ast_modeset == 0)
 		return -EINVAL;
-	return drm_pci_init(&driver, &ast_pci_driver);
+	return pci_register_driver(&ast_pci_driver);
 }
 static void __exit ast_exit(void)
 {
-	drm_pci_exit(&driver, &ast_pci_driver);
+	pci_unregister_driver(&ast_pci_driver);
 }
 
 module_init(ast_init);
diff --git a/drivers/gpu/drm/ast/ast_drv.h b/drivers/gpu/drm/ast/ast_drv.h
index 8880f0b..e6c4cd3 100644
--- a/drivers/gpu/drm/ast/ast_drv.h
+++ b/drivers/gpu/drm/ast/ast_drv.h
@@ -245,7 +245,6 @@ struct ast_connector {
 
 struct ast_crtc {
 	struct drm_crtc base;
-	u8 lut_r[256], lut_g[256], lut_b[256];
 	struct drm_gem_object *cursor_bo;
 	uint64_t cursor_addr;
 	int cursor_width, cursor_height;
@@ -401,11 +400,10 @@ void ast_post_gpu(struct drm_device *dev);
 u32 ast_mindwm(struct ast_private *ast, u32 r);
 void ast_moutdwm(struct ast_private *ast, u32 r, u32 v);
 /* ast dp501 */
-int ast_load_dp501_microcode(struct drm_device *dev);
 void ast_set_dp501_video_output(struct drm_device *dev, u8 mode);
-bool ast_launch_m68k(struct drm_device *dev);
 bool ast_backup_fw(struct drm_device *dev, u8 *addr, u32 size);
 bool ast_dp501_read_edid(struct drm_device *dev, u8 *ediddata);
 u8 ast_get_dp501_max_clk(struct drm_device *dev);
 void ast_init_3rdtx(struct drm_device *dev);
+void ast_release_firmware(struct drm_device *dev);
 #endif
diff --git a/drivers/gpu/drm/ast/ast_fb.c b/drivers/gpu/drm/ast/ast_fb.c
index 4ad4acd..0cd827e 100644
--- a/drivers/gpu/drm/ast/ast_fb.c
+++ b/drivers/gpu/drm/ast/ast_fb.c
@@ -231,7 +231,6 @@ static int astfb_create(struct drm_fb_helper *helper,
 
 	strcpy(info->fix.id, "astdrmfb");
 
-	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &astfb_ops;
 
 	info->apertures->ranges[0].base = pci_resource_start(dev->pdev, 0);
@@ -255,27 +254,7 @@ static int astfb_create(struct drm_fb_helper *helper,
 	return ret;
 }
 
-static void ast_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-			       u16 blue, int regno)
-{
-	struct ast_crtc *ast_crtc = to_ast_crtc(crtc);
-	ast_crtc->lut_r[regno] = red >> 8;
-	ast_crtc->lut_g[regno] = green >> 8;
-	ast_crtc->lut_b[regno] = blue >> 8;
-}
-
-static void ast_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-			       u16 *blue, int regno)
-{
-	struct ast_crtc *ast_crtc = to_ast_crtc(crtc);
-	*red = ast_crtc->lut_r[regno] << 8;
-	*green = ast_crtc->lut_g[regno] << 8;
-	*blue = ast_crtc->lut_b[regno] << 8;
-}
-
 static const struct drm_fb_helper_funcs ast_fb_helper_funcs = {
-	.gamma_set = ast_fb_gamma_set,
-	.gamma_get = ast_fb_gamma_get,
 	.fb_probe = astfb_create,
 };
 
@@ -287,7 +266,7 @@ static void ast_fbdev_destroy(struct drm_device *dev,
 	drm_fb_helper_unregister_fbi(&afbdev->helper);
 
 	if (afb->obj) {
-		drm_gem_object_unreference_unlocked(afb->obj);
+		drm_gem_object_put_unlocked(afb->obj);
 		afb->obj = NULL;
 	}
 	drm_fb_helper_fini(&afbdev->helper);
diff --git a/drivers/gpu/drm/ast/ast_main.c b/drivers/gpu/drm/ast/ast_main.c
index 262c2c0..dac3558 100644
--- a/drivers/gpu/drm/ast/ast_main.c
+++ b/drivers/gpu/drm/ast/ast_main.c
@@ -387,9 +387,9 @@ static void ast_user_framebuffer_destroy(struct drm_framebuffer *fb)
 {
 	struct ast_framebuffer *ast_fb = to_ast_framebuffer(fb);
 
-	drm_gem_object_unreference_unlocked(ast_fb->obj);
+	drm_gem_object_put_unlocked(ast_fb->obj);
 	drm_framebuffer_cleanup(fb);
-	kfree(fb);
+	kfree(ast_fb);
 }
 
 static const struct drm_framebuffer_funcs ast_fb_funcs = {
@@ -429,13 +429,13 @@ ast_user_framebuffer_create(struct drm_device *dev,
 
 	ast_fb = kzalloc(sizeof(*ast_fb), GFP_KERNEL);
 	if (!ast_fb) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR(-ENOMEM);
 	}
 
 	ret = ast_framebuffer_init(dev, ast_fb, mode_cmd, obj);
 	if (ret) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		kfree(ast_fb);
 		return ERR_PTR(ret);
 	}
@@ -576,6 +576,7 @@ void ast_driver_unload(struct drm_device *dev)
 {
 	struct ast_private *ast = dev->dev_private;
 
+	ast_release_firmware(dev);
 	kfree(ast->dp501_fw_addr);
 	ast_mode_fini(dev);
 	ast_fbdev_fini(dev);
@@ -627,7 +628,7 @@ int ast_dumb_create(struct drm_file *file,
 		return ret;
 
 	ret = drm_gem_handle_create(file, gobj, &handle);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (ret)
 		return ret;
 
@@ -675,7 +676,7 @@ ast_dumb_mmap_offset(struct drm_file *file,
 	bo = gem_to_ast_bo(obj);
 	*offset = ast_bo_mmap_offset(bo);
 
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return 0;
 
diff --git a/drivers/gpu/drm/ast/ast_mode.c b/drivers/gpu/drm/ast/ast_mode.c
index aaef0a6..6f3849e 100644
--- a/drivers/gpu/drm/ast/ast_mode.c
+++ b/drivers/gpu/drm/ast/ast_mode.c
@@ -63,15 +63,18 @@ static inline void ast_load_palette_index(struct ast_private *ast,
 static void ast_crtc_load_lut(struct drm_crtc *crtc)
 {
 	struct ast_private *ast = crtc->dev->dev_private;
-	struct ast_crtc *ast_crtc = to_ast_crtc(crtc);
+	u16 *r, *g, *b;
 	int i;
 
 	if (!crtc->enabled)
 		return;
 
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
+
 	for (i = 0; i < 256; i++)
-		ast_load_palette_index(ast, i, ast_crtc->lut_r[i],
-				       ast_crtc->lut_g[i], ast_crtc->lut_b[i]);
+		ast_load_palette_index(ast, i, *r++ >> 8, *g++ >> 8, *b++ >> 8);
 }
 
 static bool ast_get_vbios_mode_info(struct drm_crtc *crtc, struct drm_display_mode *mode,
@@ -613,7 +616,23 @@ static int ast_crtc_mode_set(struct drm_crtc *crtc,
 
 static void ast_crtc_disable(struct drm_crtc *crtc)
 {
+	int ret;
 
+	DRM_DEBUG_KMS("\n");
+	ast_crtc_dpms(crtc, DRM_MODE_DPMS_OFF);
+	if (crtc->primary->fb) {
+		struct ast_framebuffer *ast_fb = to_ast_framebuffer(crtc->primary->fb);
+		struct drm_gem_object *obj = ast_fb->obj;
+		struct ast_bo *bo = gem_to_ast_bo(obj);
+
+		ret = ast_bo_reserve(bo, false);
+		if (ret)
+			return;
+
+		ast_bo_push_sysram(bo);
+		ast_bo_unreserve(bo);
+	}
+	crtc->primary->fb = NULL;
 }
 
 static void ast_crtc_prepare(struct drm_crtc *crtc)
@@ -633,7 +652,6 @@ static const struct drm_crtc_helper_funcs ast_crtc_helper_funcs = {
 	.mode_set = ast_crtc_mode_set,
 	.mode_set_base = ast_crtc_mode_set_base,
 	.disable = ast_crtc_disable,
-	.load_lut = ast_crtc_load_lut,
 	.prepare = ast_crtc_prepare,
 	.commit = ast_crtc_commit,
 
@@ -648,15 +666,6 @@ static int ast_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
 			      u16 *blue, uint32_t size,
 			      struct drm_modeset_acquire_ctx *ctx)
 {
-	struct ast_crtc *ast_crtc = to_ast_crtc(crtc);
-	int i;
-
-	/* userspace palettes are always correct as is */
-	for (i = 0; i < size; i++) {
-		ast_crtc->lut_r[i] = red[i] >> 8;
-		ast_crtc->lut_g[i] = green[i] >> 8;
-		ast_crtc->lut_b[i] = blue[i] >> 8;
-	}
 	ast_crtc_load_lut(crtc);
 
 	return 0;
@@ -681,7 +690,6 @@ static const struct drm_crtc_funcs ast_crtc_funcs = {
 static int ast_crtc_init(struct drm_device *dev)
 {
 	struct ast_crtc *crtc;
-	int i;
 
 	crtc = kzalloc(sizeof(struct ast_crtc), GFP_KERNEL);
 	if (!crtc)
@@ -690,12 +698,6 @@ static int ast_crtc_init(struct drm_device *dev)
 	drm_crtc_init(dev, &crtc->base, &ast_crtc_funcs);
 	drm_mode_crtc_set_gamma_size(&crtc->base, 256);
 	drm_crtc_helper_add(&crtc->base, &ast_crtc_helper_funcs);
-
-	for (i = 0; i < 256; i++) {
-		crtc->lut_r[i] = i;
-		crtc->lut_g[i] = i;
-		crtc->lut_b[i] = i;
-	}
 	return 0;
 }
 
@@ -948,7 +950,7 @@ static void ast_cursor_fini(struct drm_device *dev)
 {
 	struct ast_private *ast = dev->dev_private;
 	ttm_bo_kunmap(&ast->cache_kmap);
-	drm_gem_object_unreference_unlocked(ast->cursor_cache);
+	drm_gem_object_put_unlocked(ast->cursor_cache);
 }
 
 int ast_mode_init(struct drm_device *dev)
@@ -1213,10 +1215,10 @@ static int ast_cursor_set(struct drm_crtc *crtc,
 
 	ast_show_cursor(crtc);
 
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 	return 0;
 fail:
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/ast/ast_ttm.c b/drivers/gpu/drm/ast/ast_ttm.c
index 5808498..696a15d 100644
--- a/drivers/gpu/drm/ast/ast_ttm.c
+++ b/drivers/gpu/drm/ast/ast_ttm.c
@@ -323,10 +323,8 @@ int ast_bo_create(struct drm_device *dev, int size, int align,
 		return -ENOMEM;
 
 	ret = drm_gem_object_init(dev, &astbo->gem, size);
-	if (ret) {
-		kfree(astbo);
-		return ret;
-	}
+	if (ret)
+		goto error;
 
 	astbo->bo.bdev = &ast->ttm.bdev;
 
@@ -340,10 +338,13 @@ int ast_bo_create(struct drm_device *dev, int size, int align,
 			  align >> PAGE_SHIFT, false, NULL, acc_size,
 			  NULL, NULL, ast_bo_ttm_destroy);
 	if (ret)
-		return ret;
+		goto error;
 
 	*pastbo = astbo;
 	return 0;
+error:
+	kfree(astbo);
+	return ret;
 }
 
 static inline u64 ast_bo_gpu_offset(struct ast_bo *bo)
@@ -376,7 +377,7 @@ int ast_bo_pin(struct ast_bo *bo, u32 pl_flag, u64 *gpu_addr)
 
 int ast_bo_unpin(struct ast_bo *bo)
 {
-	int i, ret;
+	int i;
 	if (!bo->pin_count) {
 		DRM_ERROR("unpin bad %p\n", bo);
 		return 0;
@@ -387,11 +388,7 @@ int ast_bo_unpin(struct ast_bo *bo)
 
 	for (i = 0; i < bo->placement.num_placement ; i++)
 		bo->placements[i].flags &= ~TTM_PL_FLAG_NO_EVICT;
-	ret = ttm_bo_validate(&bo->bo, &bo->placement, false, false);
-	if (ret)
-		return ret;
-
-	return 0;
+	return ttm_bo_validate(&bo->bo, &bo->placement, false, false);
 }
 
 int ast_bo_push_sysram(struct ast_bo *bo)
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
index 5348985..d732810 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_crtc.c
@@ -149,7 +149,8 @@ atmel_hlcdc_crtc_mode_valid(struct drm_crtc *c,
 	return atmel_hlcdc_dc_mode_valid(crtc->dc, mode);
 }
 
-static void atmel_hlcdc_crtc_disable(struct drm_crtc *c)
+static void atmel_hlcdc_crtc_atomic_disable(struct drm_crtc *c,
+					    struct drm_crtc_state *old_state)
 {
 	struct drm_device *dev = c->dev;
 	struct atmel_hlcdc_crtc *crtc = drm_crtc_to_atmel_hlcdc_crtc(c);
@@ -183,7 +184,8 @@ static void atmel_hlcdc_crtc_disable(struct drm_crtc *c)
 	pm_runtime_put_sync(dev->dev);
 }
 
-static void atmel_hlcdc_crtc_enable(struct drm_crtc *c)
+static void atmel_hlcdc_crtc_atomic_enable(struct drm_crtc *c,
+					   struct drm_crtc_state *old_state)
 {
 	struct drm_device *dev = c->dev;
 	struct atmel_hlcdc_crtc *crtc = drm_crtc_to_atmel_hlcdc_crtc(c);
@@ -235,7 +237,7 @@ static int atmel_hlcdc_crtc_select_output_mode(struct drm_crtc_state *state)
 
 	crtc = drm_crtc_to_atmel_hlcdc_crtc(state->crtc);
 
-	for_each_connector_in_state(state->state, connector, cstate, i) {
+	for_each_new_connector_in_state(state->state, connector, cstate, i) {
 		struct drm_display_info *info = &connector->display_info;
 		unsigned int supported_fmts = 0;
 		int j;
@@ -319,11 +321,11 @@ static const struct drm_crtc_helper_funcs lcdc_crtc_helper_funcs = {
 	.mode_set = drm_helper_crtc_mode_set,
 	.mode_set_nofb = atmel_hlcdc_crtc_mode_set_nofb,
 	.mode_set_base = drm_helper_crtc_mode_set_base,
-	.disable = atmel_hlcdc_crtc_disable,
-	.enable = atmel_hlcdc_crtc_enable,
 	.atomic_check = atmel_hlcdc_crtc_atomic_check,
 	.atomic_begin = atmel_hlcdc_crtc_atomic_begin,
 	.atomic_flush = atmel_hlcdc_crtc_atomic_flush,
+	.atomic_enable = atmel_hlcdc_crtc_atomic_enable,
+	.atomic_disable = atmel_hlcdc_crtc_atomic_disable,
 };
 
 static void atmel_hlcdc_crtc_destroy(struct drm_crtc *c)
@@ -429,6 +431,7 @@ static const struct drm_crtc_funcs atmel_hlcdc_crtc_funcs = {
 	.atomic_destroy_state = atmel_hlcdc_crtc_destroy_state,
 	.enable_vblank = atmel_hlcdc_crtc_enable_vblank,
 	.disable_vblank = atmel_hlcdc_crtc_disable_vblank,
+	.gamma_set = drm_atomic_helper_legacy_gamma_set,
 };
 
 int atmel_hlcdc_crtc_create(struct drm_device *dev)
@@ -484,6 +487,10 @@ int atmel_hlcdc_crtc_create(struct drm_device *dev)
 	drm_crtc_helper_add(&crtc->base, &lcdc_crtc_helper_funcs);
 	drm_crtc_vblank_reset(&crtc->base);
 
+	drm_mode_crtc_set_gamma_size(&crtc->base, ATMEL_HLCDC_CLUT_SIZE);
+	drm_crtc_enable_color_mgmt(&crtc->base, 0, false,
+				   ATMEL_HLCDC_CLUT_SIZE);
+
 	dc->crtc = &crtc->base;
 
 	return 0;
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
index 30dbffd..74d66e1 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.c
@@ -42,6 +42,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_at91sam9n12_layers[] = {
 			.default_color = 3,
 			.general_config = 4,
 		},
+		.clut_offset = 0x400,
 	},
 };
 
@@ -73,6 +74,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_at91sam9x5_layers[] = {
 			.disc_pos = 5,
 			.disc_size = 6,
 		},
+		.clut_offset = 0x400,
 	},
 	{
 		.name = "overlay1",
@@ -91,6 +93,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_at91sam9x5_layers[] = {
 			.chroma_key_mask = 8,
 			.general_config = 9,
 		},
+		.clut_offset = 0x800,
 	},
 	{
 		.name = "high-end-overlay",
@@ -112,6 +115,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_at91sam9x5_layers[] = {
 			.scaler_config = 13,
 			.csc = 14,
 		},
+		.clut_offset = 0x1000,
 	},
 	{
 		.name = "cursor",
@@ -131,6 +135,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_at91sam9x5_layers[] = {
 			.chroma_key_mask = 8,
 			.general_config = 9,
 		},
+		.clut_offset = 0x1400,
 	},
 };
 
@@ -162,6 +167,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d3_layers[] = {
 			.disc_pos = 5,
 			.disc_size = 6,
 		},
+		.clut_offset = 0x600,
 	},
 	{
 		.name = "overlay1",
@@ -180,6 +186,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d3_layers[] = {
 			.chroma_key_mask = 8,
 			.general_config = 9,
 		},
+		.clut_offset = 0xa00,
 	},
 	{
 		.name = "overlay2",
@@ -198,6 +205,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d3_layers[] = {
 			.chroma_key_mask = 8,
 			.general_config = 9,
 		},
+		.clut_offset = 0xe00,
 	},
 	{
 		.name = "high-end-overlay",
@@ -223,6 +231,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d3_layers[] = {
 			},
 			.csc = 14,
 		},
+		.clut_offset = 0x1200,
 	},
 	{
 		.name = "cursor",
@@ -244,6 +253,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d3_layers[] = {
 			.general_config = 9,
 			.scaler_config = 13,
 		},
+		.clut_offset = 0x1600,
 	},
 };
 
@@ -275,6 +285,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d4_layers[] = {
 			.disc_pos = 5,
 			.disc_size = 6,
 		},
+		.clut_offset = 0x600,
 	},
 	{
 		.name = "overlay1",
@@ -293,6 +304,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d4_layers[] = {
 			.chroma_key_mask = 8,
 			.general_config = 9,
 		},
+		.clut_offset = 0xa00,
 	},
 	{
 		.name = "overlay2",
@@ -311,6 +323,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d4_layers[] = {
 			.chroma_key_mask = 8,
 			.general_config = 9,
 		},
+		.clut_offset = 0xe00,
 	},
 	{
 		.name = "high-end-overlay",
@@ -336,6 +349,7 @@ static const struct atmel_hlcdc_layer_desc atmel_hlcdc_sama5d4_layers[] = {
 			},
 			.csc = 14,
 		},
+		.clut_offset = 0x1200,
 	},
 };
 
@@ -451,8 +465,7 @@ static void atmel_hlcdc_fb_output_poll_changed(struct drm_device *dev)
 {
 	struct atmel_hlcdc_dc *dc = dev->dev_private;
 
-	if (dc->fbdev)
-		drm_fbdev_cma_hotplug_event(dc->fbdev);
+	drm_fbdev_cma_hotplug_event(dc->fbdev);
 }
 
 struct atmel_hlcdc_dc_commit {
@@ -526,14 +539,13 @@ static int atmel_hlcdc_dc_atomic_commit(struct drm_device *dev,
 		dc->commit.pending = true;
 	spin_unlock(&dc->commit.wait.lock);
 
-	if (ret) {
-		kfree(commit);
-		goto error;
-	}
+	if (ret)
+		goto err_free;
 
-	/* Swap the state, this is the point of no return. */
-	drm_atomic_helper_swap_state(state, true);
+	/* We have our own synchronization through the commit lock. */
+	BUG_ON(drm_atomic_helper_swap_state(state, false) < 0);
 
+	/* Swap state succeeded, this is the point of no return. */
 	drm_atomic_state_get(state);
 	if (async)
 		queue_work(dc->wq, &commit->work);
@@ -542,6 +554,8 @@ static int atmel_hlcdc_dc_atomic_commit(struct drm_device *dev,
 
 	return 0;
 
+err_free:
+	kfree(commit);
 error:
 	drm_atomic_helper_cleanup_planes(dev, state);
 	return ret;
@@ -747,8 +761,6 @@ static struct drm_driver atmel_hlcdc_dc_driver = {
 	.gem_prime_vunmap = drm_gem_cma_prime_vunmap,
 	.gem_prime_mmap = drm_gem_cma_prime_mmap,
 	.dumb_create = drm_gem_cma_dumb_create,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.fops = &fops,
 	.name = "atmel-hlcdc",
 	.desc = "Atmel HLCD Controller DRM",
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h
index b0596a8..4237b04 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_dc.h
@@ -88,6 +88,11 @@
 #define ATMEL_HLCDC_YUV422SWP			BIT(17)
 #define ATMEL_HLCDC_DSCALEOPT			BIT(20)
 
+#define ATMEL_HLCDC_C1_MODE			ATMEL_HLCDC_CLUT_MODE(0)
+#define ATMEL_HLCDC_C2_MODE			ATMEL_HLCDC_CLUT_MODE(1)
+#define ATMEL_HLCDC_C4_MODE			ATMEL_HLCDC_CLUT_MODE(2)
+#define ATMEL_HLCDC_C8_MODE			ATMEL_HLCDC_CLUT_MODE(3)
+
 #define ATMEL_HLCDC_XRGB4444_MODE		ATMEL_HLCDC_RGB_MODE(0)
 #define ATMEL_HLCDC_ARGB4444_MODE		ATMEL_HLCDC_RGB_MODE(1)
 #define ATMEL_HLCDC_RGBA4444_MODE		ATMEL_HLCDC_RGB_MODE(2)
@@ -142,6 +147,8 @@
 #define ATMEL_HLCDC_DMA_CHANNEL_DSCR_DONE	BIT(2)
 #define ATMEL_HLCDC_DMA_CHANNEL_DSCR_OVERRUN	BIT(3)
 
+#define ATMEL_HLCDC_CLUT_SIZE			256
+
 #define ATMEL_HLCDC_MAX_LAYERS			6
 
 /**
@@ -259,6 +266,7 @@ struct atmel_hlcdc_layer_desc {
 	int id;
 	int regs_offset;
 	int cfgs_offset;
+	int clut_offset;
 	struct atmel_hlcdc_formats *formats;
 	struct atmel_hlcdc_layer_cfg_layout layout;
 	int max_width;
@@ -414,6 +422,14 @@ static inline u32 atmel_hlcdc_layer_read_cfg(struct atmel_hlcdc_layer *layer,
 					  (cfgid * sizeof(u32)));
 }
 
+static inline void atmel_hlcdc_layer_write_clut(struct atmel_hlcdc_layer *layer,
+						unsigned int c, u32 val)
+{
+	regmap_write(layer->regmap,
+		     layer->desc->clut_offset + c * sizeof(u32),
+		     val);
+}
+
 static inline void atmel_hlcdc_layer_init(struct atmel_hlcdc_layer *layer,
 				const struct atmel_hlcdc_layer_desc *desc,
 				struct regmap *regmap)
diff --git a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
index 1124200..703c2d13 100644
--- a/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
+++ b/drivers/gpu/drm/atmel-hlcdc/atmel_hlcdc_plane.c
@@ -83,6 +83,7 @@ drm_plane_state_to_atmel_hlcdc_plane_state(struct drm_plane_state *s)
 #define SUBPIXEL_MASK			0xffff
 
 static uint32_t rgb_formats[] = {
+	DRM_FORMAT_C8,
 	DRM_FORMAT_XRGB4444,
 	DRM_FORMAT_ARGB4444,
 	DRM_FORMAT_RGBA4444,
@@ -100,6 +101,7 @@ struct atmel_hlcdc_formats atmel_hlcdc_plane_rgb_formats = {
 };
 
 static uint32_t rgb_and_yuv_formats[] = {
+	DRM_FORMAT_C8,
 	DRM_FORMAT_XRGB4444,
 	DRM_FORMAT_ARGB4444,
 	DRM_FORMAT_RGBA4444,
@@ -128,6 +130,9 @@ struct atmel_hlcdc_formats atmel_hlcdc_plane_rgb_and_yuv_formats = {
 static int atmel_hlcdc_format_to_plane_mode(u32 format, u32 *mode)
 {
 	switch (format) {
+	case DRM_FORMAT_C8:
+		*mode = ATMEL_HLCDC_C8_MODE;
+		break;
 	case DRM_FORMAT_XRGB4444:
 		*mode = ATMEL_HLCDC_XRGB4444_MODE;
 		break;
@@ -424,6 +429,29 @@ static void atmel_hlcdc_plane_update_format(struct atmel_hlcdc_plane *plane,
 				    ATMEL_HLCDC_LAYER_FORMAT_CFG, cfg);
 }
 
+static void atmel_hlcdc_plane_update_clut(struct atmel_hlcdc_plane *plane)
+{
+	struct drm_crtc *crtc = plane->base.crtc;
+	struct drm_color_lut *lut;
+	int idx;
+
+	if (!crtc || !crtc->state)
+		return;
+
+	if (!crtc->state->color_mgmt_changed || !crtc->state->gamma_lut)
+		return;
+
+	lut = (struct drm_color_lut *)crtc->state->gamma_lut->data;
+
+	for (idx = 0; idx < ATMEL_HLCDC_CLUT_SIZE; idx++, lut++) {
+		u32 val = ((lut->red << 8) & 0xff0000) |
+			(lut->green & 0xff00) |
+			(lut->blue >> 8);
+
+		atmel_hlcdc_layer_write_clut(&plane->layer, idx, val);
+	}
+}
+
 static void atmel_hlcdc_plane_update_buffers(struct atmel_hlcdc_plane *plane,
 					struct atmel_hlcdc_plane_state *state)
 {
@@ -768,6 +796,7 @@ static void atmel_hlcdc_plane_atomic_update(struct drm_plane *p,
 	atmel_hlcdc_plane_update_pos_and_size(plane, state);
 	atmel_hlcdc_plane_update_general_settings(plane, state);
 	atmel_hlcdc_plane_update_format(plane, state);
+	atmel_hlcdc_plane_update_clut(plane);
 	atmel_hlcdc_plane_update_buffers(plane, state);
 	atmel_hlcdc_plane_update_disc_area(plane, state);
 
@@ -809,7 +838,7 @@ static void atmel_hlcdc_plane_destroy(struct drm_plane *p)
 	struct atmel_hlcdc_plane *plane = drm_plane_to_atmel_hlcdc_plane(p);
 
 	if (plane->base.fb)
-		drm_framebuffer_unreference(plane->base.fb);
+		drm_framebuffer_put(plane->base.fb);
 
 	drm_plane_cleanup(p);
 }
@@ -911,7 +940,7 @@ void atmel_hlcdc_plane_irq(struct atmel_hlcdc_plane *plane)
 			desc->name);
 }
 
-static struct drm_plane_helper_funcs atmel_hlcdc_layer_plane_helper_funcs = {
+static const struct drm_plane_helper_funcs atmel_hlcdc_layer_plane_helper_funcs = {
 	.atomic_check = atmel_hlcdc_plane_atomic_check,
 	.atomic_update = atmel_hlcdc_plane_atomic_update,
 	.atomic_disable = atmel_hlcdc_plane_atomic_disable,
@@ -958,7 +987,7 @@ static void atmel_hlcdc_plane_reset(struct drm_plane *p)
 		state = drm_plane_state_to_atmel_hlcdc_plane_state(p->state);
 
 		if (state->base.fb)
-			drm_framebuffer_unreference(state->base.fb);
+			drm_framebuffer_put(state->base.fb);
 
 		kfree(state);
 		p->state = NULL;
@@ -996,7 +1025,7 @@ atmel_hlcdc_plane_atomic_duplicate_state(struct drm_plane *p)
 	}
 
 	if (copy->base.fb)
-		drm_framebuffer_reference(copy->base.fb);
+		drm_framebuffer_get(copy->base.fb);
 
 	return &copy->base;
 }
@@ -1015,15 +1044,14 @@ static void atmel_hlcdc_plane_atomic_destroy_state(struct drm_plane *p,
 	}
 
 	if (s->fb)
-		drm_framebuffer_unreference(s->fb);
+		drm_framebuffer_put(s->fb);
 
 	kfree(state);
 }
 
-static struct drm_plane_funcs layer_plane_funcs = {
+static const struct drm_plane_funcs layer_plane_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.destroy = atmel_hlcdc_plane_destroy,
 	.reset = atmel_hlcdc_plane_reset,
 	.atomic_duplicate_state = atmel_hlcdc_plane_atomic_duplicate_state,
@@ -1058,7 +1086,8 @@ static int atmel_hlcdc_plane_create(struct drm_device *dev,
 	ret = drm_universal_plane_init(dev, &plane->base, 0,
 				       &layer_plane_funcs,
 				       desc->formats->formats,
-				       desc->formats->nformats, type, NULL);
+				       desc->formats->nformats,
+				       NULL, type, NULL);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/bochs/bochs_drv.c b/drivers/gpu/drm/bochs/bochs_drv.c
index aa34251..7b203184 100644
--- a/drivers/gpu/drm/bochs/bochs_drv.c
+++ b/drivers/gpu/drm/bochs/bochs_drv.c
@@ -84,7 +84,6 @@ static struct drm_driver bochs_driver = {
 	.driver_features	= DRIVER_GEM | DRIVER_MODESET,
 	.load			= bochs_load,
 	.unload			= bochs_unload,
-	.set_busid		= drm_pci_set_busid,
 	.fops			= &bochs_fops,
 	.name			= "bochs-drm",
 	.desc			= "bochs dispi vga interface (qemu stdvga)",
@@ -94,7 +93,6 @@ static struct drm_driver bochs_driver = {
 	.gem_free_object_unlocked = bochs_gem_free_object,
 	.dumb_create            = bochs_dumb_create,
 	.dumb_map_offset        = bochs_dumb_mmap_offset,
-	.dumb_destroy           = drm_gem_dumb_destroy,
 };
 
 /* ---------------------------------------------------------------------- */
@@ -224,12 +222,12 @@ static int __init bochs_init(void)
 	if (bochs_modeset == 0)
 		return -EINVAL;
 
-	return drm_pci_init(&bochs_driver, &bochs_pci_driver);
+	return pci_register_driver(&bochs_pci_driver);
 }
 
 static void __exit bochs_exit(void)
 {
-	drm_pci_exit(&bochs_driver, &bochs_pci_driver);
+	pci_unregister_driver(&bochs_pci_driver);
 }
 
 module_init(bochs_init);
diff --git a/drivers/gpu/drm/bochs/bochs_fbdev.c b/drivers/gpu/drm/bochs/bochs_fbdev.c
index c38deff..14eb8d0 100644
--- a/drivers/gpu/drm/bochs/bochs_fbdev.c
+++ b/drivers/gpu/drm/bochs/bochs_fbdev.c
@@ -23,9 +23,9 @@ static int bochsfb_mmap(struct fb_info *info,
 static struct fb_ops bochsfb_ops = {
 	.owner = THIS_MODULE,
 	DRM_FB_HELPER_DEFAULT_OPS,
-	.fb_fillrect = drm_fb_helper_sys_fillrect,
-	.fb_copyarea = drm_fb_helper_sys_copyarea,
-	.fb_imageblit = drm_fb_helper_sys_imageblit,
+	.fb_fillrect = drm_fb_helper_cfb_fillrect,
+	.fb_copyarea = drm_fb_helper_cfb_copyarea,
+	.fb_imageblit = drm_fb_helper_cfb_imageblit,
 	.fb_mmap = bochsfb_mmap,
 };
 
@@ -118,7 +118,6 @@ static int bochsfb_create(struct drm_fb_helper *helper,
 
 	strcpy(info->fix.id, "bochsdrmfb");
 
-	info->flags = FBINFO_DEFAULT;
 	info->fbops = &bochsfb_ops;
 
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
diff --git a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
index f75ab62..b2431ae 100644
--- a/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
+++ b/drivers/gpu/drm/bridge/adv7511/adv7511_drv.c
@@ -785,8 +785,7 @@ adv7511_connector_detect(struct drm_connector *connector, bool force)
 	return adv7511_detect(adv, connector);
 }
 
-static struct drm_connector_funcs adv7511_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
+static const struct drm_connector_funcs adv7511_connector_funcs = {
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = adv7511_connector_detect,
 	.destroy = drm_connector_cleanup,
@@ -857,7 +856,7 @@ static int adv7511_bridge_attach(struct drm_bridge *bridge)
 	return ret;
 }
 
-static struct drm_bridge_funcs adv7511_bridge_funcs = {
+static const struct drm_bridge_funcs adv7511_bridge_funcs = {
 	.enable = adv7511_bridge_enable,
 	.disable = adv7511_bridge_disable,
 	.mode_set = adv7511_bridge_mode_set,
@@ -1126,11 +1125,7 @@ static int adv7511_probe(struct i2c_client *i2c, const struct i2c_device_id *id)
 	adv7511->bridge.funcs = &adv7511_bridge_funcs;
 	adv7511->bridge.of_node = dev->of_node;
 
-	ret = drm_bridge_add(&adv7511->bridge);
-	if (ret) {
-		dev_err(dev, "failed to add adv7511 bridge\n");
-		goto err_unregister_cec;
-	}
+	drm_bridge_add(&adv7511->bridge);
 
 	adv7511_audio_init(dev, adv7511);
 
diff --git a/drivers/gpu/drm/bridge/analogix-anx78xx.c b/drivers/gpu/drm/bridge/analogix-anx78xx.c
index 9006578..9385eb0 100644
--- a/drivers/gpu/drm/bridge/analogix-anx78xx.c
+++ b/drivers/gpu/drm/bridge/analogix-anx78xx.c
@@ -1002,7 +1002,6 @@ static enum drm_connector_status anx78xx_detect(struct drm_connector *connector,
 }
 
 static const struct drm_connector_funcs anx78xx_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = anx78xx_detect,
 	.destroy = drm_connector_cleanup,
@@ -1097,7 +1096,8 @@ static void anx78xx_bridge_mode_set(struct drm_bridge *bridge,
 
 	mutex_lock(&anx78xx->lock);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, adjusted_mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, adjusted_mode,
+						       false);
 	if (err) {
 		DRM_ERROR("Failed to setup AVI infoframe: %d\n", err);
 		goto unlock;
@@ -1438,11 +1438,7 @@ static int anx78xx_i2c_probe(struct i2c_client *client,
 
 	anx78xx->bridge.funcs = &anx78xx_bridge_funcs;
 
-	err = drm_bridge_add(&anx78xx->bridge);
-	if (err < 0) {
-		DRM_ERROR("Failed to add drm bridge: %d\n", err);
-		goto err_poweroff;
-	}
+	drm_bridge_add(&anx78xx->bridge);
 
 	/* If cable is pulled out, just poweroff and wait for HPD event */
 	if (!gpiod_get_value(anx78xx->pdata.gpiod_hpd))
diff --git a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
index 4c758ed..5dd3f1c 100644
--- a/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
+++ b/drivers/gpu/drm/bridge/analogix/analogix_dp_core.c
@@ -1005,7 +1005,6 @@ analogix_dp_detect(struct drm_connector *connector, bool force)
 }
 
 static const struct drm_connector_funcs analogix_dp_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = analogix_dp_detect,
 	.destroy = drm_connector_cleanup,
diff --git a/drivers/gpu/drm/bridge/dumb-vga-dac.c b/drivers/gpu/drm/bridge/dumb-vga-dac.c
index 831a606..de5e7de 100644
--- a/drivers/gpu/drm/bridge/dumb-vga-dac.c
+++ b/drivers/gpu/drm/bridge/dumb-vga-dac.c
@@ -92,7 +92,6 @@ dumb_vga_connector_detect(struct drm_connector *connector, bool force)
 }
 
 static const struct drm_connector_funcs dumb_vga_con_funcs = {
-	.dpms			= drm_atomic_helper_connector_dpms,
 	.detect			= dumb_vga_connector_detect,
 	.fill_modes		= drm_helper_probe_single_connector_modes,
 	.destroy		= drm_connector_cleanup,
@@ -177,7 +176,6 @@ static struct i2c_adapter *dumb_vga_retrieve_ddc(struct device *dev)
 static int dumb_vga_probe(struct platform_device *pdev)
 {
 	struct dumb_vga *vga;
-	int ret;
 
 	vga = devm_kzalloc(&pdev->dev, sizeof(*vga), GFP_KERNEL);
 	if (!vga)
@@ -186,7 +184,7 @@ static int dumb_vga_probe(struct platform_device *pdev)
 
 	vga->vdd = devm_regulator_get_optional(&pdev->dev, "vdd");
 	if (IS_ERR(vga->vdd)) {
-		ret = PTR_ERR(vga->vdd);
+		int ret = PTR_ERR(vga->vdd);
 		if (ret == -EPROBE_DEFER)
 			return -EPROBE_DEFER;
 		vga->vdd = NULL;
@@ -207,11 +205,9 @@ static int dumb_vga_probe(struct platform_device *pdev)
 	vga->bridge.funcs = &dumb_vga_bridge_funcs;
 	vga->bridge.of_node = pdev->dev.of_node;
 
-	ret = drm_bridge_add(&vga->bridge);
-	if (ret && !IS_ERR(vga->ddc))
-		i2c_put_adapter(vga->ddc);
+	drm_bridge_add(&vga->bridge);
 
-	return ret;
+	return 0;
 }
 
 static int dumb_vga_remove(struct platform_device *pdev)
diff --git a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c
index 11f1108..7ccadba 100644
--- a/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c
+++ b/drivers/gpu/drm/bridge/megachips-stdpxxxx-ge-b850v3-fw.c
@@ -193,7 +193,6 @@ static enum drm_connector_status ge_b850v3_lvds_detect(
 }
 
 static const struct drm_connector_funcs ge_b850v3_lvds_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = ge_b850v3_lvds_detect,
 	.destroy = drm_connector_cleanup,
diff --git a/drivers/gpu/drm/bridge/nxp-ptn3460.c b/drivers/gpu/drm/bridge/nxp-ptn3460.c
index 4f64e71..d64a328 100644
--- a/drivers/gpu/drm/bridge/nxp-ptn3460.c
+++ b/drivers/gpu/drm/bridge/nxp-ptn3460.c
@@ -238,7 +238,6 @@ static const struct drm_connector_helper_funcs ptn3460_connector_helper_funcs =
 };
 
 static const struct drm_connector_funcs ptn3460_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
@@ -332,11 +331,7 @@ static int ptn3460_probe(struct i2c_client *client,
 
 	ptn_bridge->bridge.funcs = &ptn3460_bridge_funcs;
 	ptn_bridge->bridge.of_node = dev->of_node;
-	ret = drm_bridge_add(&ptn_bridge->bridge);
-	if (ret) {
-		DRM_ERROR("Failed to add bridge\n");
-		return ret;
-	}
+	drm_bridge_add(&ptn_bridge->bridge);
 
 	i2c_set_clientdata(client, ptn_bridge);
 
diff --git a/drivers/gpu/drm/bridge/panel.c b/drivers/gpu/drm/bridge/panel.c
index 67fe19e..e0cca19 100644
--- a/drivers/gpu/drm/bridge/panel.c
+++ b/drivers/gpu/drm/bridge/panel.c
@@ -50,7 +50,6 @@ panel_bridge_connector_helper_funcs = {
 };
 
 static const struct drm_connector_funcs panel_bridge_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
@@ -158,7 +157,6 @@ struct drm_bridge *drm_panel_bridge_add(struct drm_panel *panel,
 					u32 connector_type)
 {
 	struct panel_bridge *panel_bridge;
-	int ret;
 
 	if (!panel)
 		return ERR_PTR(-EINVAL);
@@ -176,9 +174,7 @@ struct drm_bridge *drm_panel_bridge_add(struct drm_panel *panel,
 	panel_bridge->bridge.of_node = panel->dev->of_node;
 #endif
 
-	ret = drm_bridge_add(&panel_bridge->bridge);
-	if (ret)
-		return ERR_PTR(ret);
+	drm_bridge_add(&panel_bridge->bridge);
 
 	return &panel_bridge->bridge;
 }
@@ -198,3 +194,33 @@ void drm_panel_bridge_remove(struct drm_bridge *bridge)
 	devm_kfree(panel_bridge->panel->dev, bridge);
 }
 EXPORT_SYMBOL(drm_panel_bridge_remove);
+
+static void devm_drm_panel_bridge_release(struct device *dev, void *res)
+{
+	struct drm_bridge **bridge = res;
+
+	drm_panel_bridge_remove(*bridge);
+}
+
+struct drm_bridge *devm_drm_panel_bridge_add(struct device *dev,
+					     struct drm_panel *panel,
+					     u32 connector_type)
+{
+	struct drm_bridge **ptr, *bridge;
+
+	ptr = devres_alloc(devm_drm_panel_bridge_release, sizeof(*ptr),
+			   GFP_KERNEL);
+	if (!ptr)
+		return ERR_PTR(-ENOMEM);
+
+	bridge = drm_panel_bridge_add(panel, connector_type);
+	if (!IS_ERR(bridge)) {
+		*ptr = bridge;
+		devres_add(dev, ptr);
+	} else {
+		devres_free(ptr);
+	}
+
+	return bridge;
+}
+EXPORT_SYMBOL(devm_drm_panel_bridge_add);
diff --git a/drivers/gpu/drm/bridge/parade-ps8622.c b/drivers/gpu/drm/bridge/parade-ps8622.c
index 6f22f9f..81198f5 100644
--- a/drivers/gpu/drm/bridge/parade-ps8622.c
+++ b/drivers/gpu/drm/bridge/parade-ps8622.c
@@ -476,7 +476,6 @@ static const struct drm_connector_helper_funcs ps8622_connector_helper_funcs = {
 };
 
 static const struct drm_connector_funcs ps8622_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
@@ -598,11 +597,7 @@ static int ps8622_probe(struct i2c_client *client,
 
 	ps8622->bridge.funcs = &ps8622_bridge_funcs;
 	ps8622->bridge.of_node = dev->of_node;
-	ret = drm_bridge_add(&ps8622->bridge);
-	if (ret) {
-		DRM_ERROR("Failed to add bridge\n");
-		return ret;
-	}
+	drm_bridge_add(&ps8622->bridge);
 
 	i2c_set_clientdata(client, ps8622);
 
diff --git a/drivers/gpu/drm/bridge/sii902x.c b/drivers/gpu/drm/bridge/sii902x.c
index 9b87067..b1ab4ab 100644
--- a/drivers/gpu/drm/bridge/sii902x.c
+++ b/drivers/gpu/drm/bridge/sii902x.c
@@ -124,7 +124,6 @@ sii902x_connector_detect(struct drm_connector *connector, bool force)
 }
 
 static const struct drm_connector_funcs sii902x_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = sii902x_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
@@ -269,7 +268,7 @@ static void sii902x_bridge_mode_set(struct drm_bridge *bridge,
 	if (ret)
 		return;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame, adj);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame, adj, false);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return;
@@ -418,11 +417,7 @@ static int sii902x_probe(struct i2c_client *client,
 
 	sii902x->bridge.funcs = &sii902x_bridge_funcs;
 	sii902x->bridge.of_node = dev->of_node;
-	ret = drm_bridge_add(&sii902x->bridge);
-	if (ret) {
-		dev_err(dev, "Failed to add drm_bridge\n");
-		return ret;
-	}
+	drm_bridge_add(&sii902x->bridge);
 
 	i2c_set_clientdata(client, sii902x);
 
diff --git a/drivers/gpu/drm/bridge/sil-sii8620.c b/drivers/gpu/drm/bridge/sil-sii8620.c
index 2d51a22..5131bfb 100644
--- a/drivers/gpu/drm/bridge/sil-sii8620.c
+++ b/drivers/gpu/drm/bridge/sil-sii8620.c
@@ -597,9 +597,9 @@ static void sii8620_mt_read_devcap(struct sii8620 *ctx, bool xdevcap)
 static void sii8620_mt_read_devcap_reg_recv(struct sii8620 *ctx,
 		struct sii8620_mt_msg *msg)
 {
-	u8 reg = msg->reg[0] & 0x7f;
+	u8 reg = msg->reg[1] & 0x7f;
 
-	if (msg->reg[0] & 0x80)
+	if (msg->reg[1] & 0x80)
 		ctx->xdevcap[reg] = msg->ret;
 	else
 		ctx->devcap[reg] = msg->ret;
diff --git a/drivers/gpu/drm/bridge/synopsys/Kconfig b/drivers/gpu/drm/bridge/synopsys/Kconfig
index 53e78d0..3cc53b4 100644
--- a/drivers/gpu/drm/bridge/synopsys/Kconfig
+++ b/drivers/gpu/drm/bridge/synopsys/Kconfig
@@ -2,6 +2,7 @@
 	tristate
 	select DRM_KMS_HELPER
 	select REGMAP_MMIO
+	select CEC_CORE if CEC_NOTIFIER
 
 config DRM_DW_HDMI_AHB_AUDIO
 	tristate "Synopsys Designware AHB Audio interface"
@@ -22,3 +23,18 @@
 	help
 	  Support the I2S Audio interface which is part of the Synopsys
 	  Designware HDMI block.
+
+config DRM_DW_HDMI_CEC
+	tristate "Synopsys Designware CEC interface"
+	depends on DRM_DW_HDMI
+	select CEC_CORE
+	select CEC_NOTIFIER
+	help
+	  Support the CEC interface which is part of the Synopsys
+	  Designware HDMI block.
+
+config DRM_DW_MIPI_DSI
+	tristate
+	select DRM_KMS_HELPER
+	select DRM_MIPI_DSI
+	select DRM_PANEL_BRIDGE
diff --git a/drivers/gpu/drm/bridge/synopsys/Makefile b/drivers/gpu/drm/bridge/synopsys/Makefile
index 17aa7a6..5dad97d 100644
--- a/drivers/gpu/drm/bridge/synopsys/Makefile
+++ b/drivers/gpu/drm/bridge/synopsys/Makefile
@@ -3,3 +3,6 @@
 obj-$(CONFIG_DRM_DW_HDMI) += dw-hdmi.o
 obj-$(CONFIG_DRM_DW_HDMI_AHB_AUDIO) += dw-hdmi-ahb-audio.o
 obj-$(CONFIG_DRM_DW_HDMI_I2S_AUDIO) += dw-hdmi-i2s-audio.o
+obj-$(CONFIG_DRM_DW_HDMI_CEC) += dw-hdmi-cec.o
+
+obj-$(CONFIG_DRM_DW_MIPI_DSI) += dw-mipi-dsi.o
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
index 8f2d137..cf3f0ca 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-ahb-audio.c
@@ -517,7 +517,7 @@ static snd_pcm_uframes_t dw_hdmi_pointer(struct snd_pcm_substream *substream)
 	return bytes_to_frames(runtime, dw->buf_offset);
 }
 
-static struct snd_pcm_ops snd_dw_hdmi_ops = {
+static const struct snd_pcm_ops snd_dw_hdmi_ops = {
 	.open = dw_hdmi_open,
 	.close = dw_hdmi_close,
 	.ioctl = snd_pcm_lib_ioctl,
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c
new file mode 100644
index 0000000..6c32351
--- /dev/null
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.c
@@ -0,0 +1,327 @@
+/*
+ * Designware HDMI CEC driver
+ *
+ * Copyright (C) 2015-2017 Russell King.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+
+#include <drm/drm_edid.h>
+
+#include <media/cec.h>
+#include <media/cec-notifier.h>
+
+#include "dw-hdmi-cec.h"
+
+enum {
+	HDMI_IH_CEC_STAT0	= 0x0106,
+	HDMI_IH_MUTE_CEC_STAT0	= 0x0186,
+
+	HDMI_CEC_CTRL		= 0x7d00,
+	CEC_CTRL_START		= BIT(0),
+	CEC_CTRL_FRAME_TYP	= 3 << 1,
+	CEC_CTRL_RETRY		= 0 << 1,
+	CEC_CTRL_NORMAL		= 1 << 1,
+	CEC_CTRL_IMMED		= 2 << 1,
+
+	HDMI_CEC_STAT		= 0x7d01,
+	CEC_STAT_DONE		= BIT(0),
+	CEC_STAT_EOM		= BIT(1),
+	CEC_STAT_NACK		= BIT(2),
+	CEC_STAT_ARBLOST	= BIT(3),
+	CEC_STAT_ERROR_INIT	= BIT(4),
+	CEC_STAT_ERROR_FOLL	= BIT(5),
+	CEC_STAT_WAKEUP		= BIT(6),
+
+	HDMI_CEC_MASK		= 0x7d02,
+	HDMI_CEC_POLARITY	= 0x7d03,
+	HDMI_CEC_INT		= 0x7d04,
+	HDMI_CEC_ADDR_L		= 0x7d05,
+	HDMI_CEC_ADDR_H		= 0x7d06,
+	HDMI_CEC_TX_CNT		= 0x7d07,
+	HDMI_CEC_RX_CNT		= 0x7d08,
+	HDMI_CEC_TX_DATA0	= 0x7d10,
+	HDMI_CEC_RX_DATA0	= 0x7d20,
+	HDMI_CEC_LOCK		= 0x7d30,
+	HDMI_CEC_WKUPCTRL	= 0x7d31,
+};
+
+struct dw_hdmi_cec {
+	struct dw_hdmi *hdmi;
+	const struct dw_hdmi_cec_ops *ops;
+	u32 addresses;
+	struct cec_adapter *adap;
+	struct cec_msg rx_msg;
+	unsigned int tx_status;
+	bool tx_done;
+	bool rx_done;
+	struct cec_notifier *notify;
+	int irq;
+};
+
+static void dw_hdmi_write(struct dw_hdmi_cec *cec, u8 val, int offset)
+{
+	cec->ops->write(cec->hdmi, val, offset);
+}
+
+static u8 dw_hdmi_read(struct dw_hdmi_cec *cec, int offset)
+{
+	return cec->ops->read(cec->hdmi, offset);
+}
+
+static int dw_hdmi_cec_log_addr(struct cec_adapter *adap, u8 logical_addr)
+{
+	struct dw_hdmi_cec *cec = cec_get_drvdata(adap);
+
+	if (logical_addr == CEC_LOG_ADDR_INVALID)
+		cec->addresses = 0;
+	else
+		cec->addresses |= BIT(logical_addr) | BIT(15);
+
+	dw_hdmi_write(cec, cec->addresses & 255, HDMI_CEC_ADDR_L);
+	dw_hdmi_write(cec, cec->addresses >> 8, HDMI_CEC_ADDR_H);
+
+	return 0;
+}
+
+static int dw_hdmi_cec_transmit(struct cec_adapter *adap, u8 attempts,
+				u32 signal_free_time, struct cec_msg *msg)
+{
+	struct dw_hdmi_cec *cec = cec_get_drvdata(adap);
+	unsigned int i, ctrl;
+
+	switch (signal_free_time) {
+	case CEC_SIGNAL_FREE_TIME_RETRY:
+		ctrl = CEC_CTRL_RETRY;
+		break;
+	case CEC_SIGNAL_FREE_TIME_NEW_INITIATOR:
+	default:
+		ctrl = CEC_CTRL_NORMAL;
+		break;
+	case CEC_SIGNAL_FREE_TIME_NEXT_XFER:
+		ctrl = CEC_CTRL_IMMED;
+		break;
+	}
+
+	for (i = 0; i < msg->len; i++)
+		dw_hdmi_write(cec, msg->msg[i], HDMI_CEC_TX_DATA0 + i);
+
+	dw_hdmi_write(cec, msg->len, HDMI_CEC_TX_CNT);
+	dw_hdmi_write(cec, ctrl | CEC_CTRL_START, HDMI_CEC_CTRL);
+
+	return 0;
+}
+
+static irqreturn_t dw_hdmi_cec_hardirq(int irq, void *data)
+{
+	struct cec_adapter *adap = data;
+	struct dw_hdmi_cec *cec = cec_get_drvdata(adap);
+	unsigned int stat = dw_hdmi_read(cec, HDMI_IH_CEC_STAT0);
+	irqreturn_t ret = IRQ_HANDLED;
+
+	if (stat == 0)
+		return IRQ_NONE;
+
+	dw_hdmi_write(cec, stat, HDMI_IH_CEC_STAT0);
+
+	if (stat & CEC_STAT_ERROR_INIT) {
+		cec->tx_status = CEC_TX_STATUS_ERROR;
+		cec->tx_done = true;
+		ret = IRQ_WAKE_THREAD;
+	} else if (stat & CEC_STAT_DONE) {
+		cec->tx_status = CEC_TX_STATUS_OK;
+		cec->tx_done = true;
+		ret = IRQ_WAKE_THREAD;
+	} else if (stat & CEC_STAT_NACK) {
+		cec->tx_status = CEC_TX_STATUS_NACK;
+		cec->tx_done = true;
+		ret = IRQ_WAKE_THREAD;
+	}
+
+	if (stat & CEC_STAT_EOM) {
+		unsigned int len, i;
+
+		len = dw_hdmi_read(cec, HDMI_CEC_RX_CNT);
+		if (len > sizeof(cec->rx_msg.msg))
+			len = sizeof(cec->rx_msg.msg);
+
+		for (i = 0; i < len; i++)
+			cec->rx_msg.msg[i] =
+				dw_hdmi_read(cec, HDMI_CEC_RX_DATA0 + i);
+
+		dw_hdmi_write(cec, 0, HDMI_CEC_LOCK);
+
+		cec->rx_msg.len = len;
+		smp_wmb();
+		cec->rx_done = true;
+
+		ret = IRQ_WAKE_THREAD;
+	}
+
+	return ret;
+}
+
+static irqreturn_t dw_hdmi_cec_thread(int irq, void *data)
+{
+	struct cec_adapter *adap = data;
+	struct dw_hdmi_cec *cec = cec_get_drvdata(adap);
+
+	if (cec->tx_done) {
+		cec->tx_done = false;
+		cec_transmit_attempt_done(adap, cec->tx_status);
+	}
+	if (cec->rx_done) {
+		cec->rx_done = false;
+		smp_rmb();
+		cec_received_msg(adap, &cec->rx_msg);
+	}
+	return IRQ_HANDLED;
+}
+
+static int dw_hdmi_cec_enable(struct cec_adapter *adap, bool enable)
+{
+	struct dw_hdmi_cec *cec = cec_get_drvdata(adap);
+
+	if (!enable) {
+		dw_hdmi_write(cec, ~0, HDMI_CEC_MASK);
+		dw_hdmi_write(cec, ~0, HDMI_IH_MUTE_CEC_STAT0);
+		dw_hdmi_write(cec, 0, HDMI_CEC_POLARITY);
+
+		cec->ops->disable(cec->hdmi);
+	} else {
+		unsigned int irqs;
+
+		dw_hdmi_write(cec, 0, HDMI_CEC_CTRL);
+		dw_hdmi_write(cec, ~0, HDMI_IH_CEC_STAT0);
+		dw_hdmi_write(cec, 0, HDMI_CEC_LOCK);
+
+		dw_hdmi_cec_log_addr(cec->adap, CEC_LOG_ADDR_INVALID);
+
+		cec->ops->enable(cec->hdmi);
+
+		irqs = CEC_STAT_ERROR_INIT | CEC_STAT_NACK | CEC_STAT_EOM |
+		       CEC_STAT_DONE;
+		dw_hdmi_write(cec, irqs, HDMI_CEC_POLARITY);
+		dw_hdmi_write(cec, ~irqs, HDMI_CEC_MASK);
+		dw_hdmi_write(cec, ~irqs, HDMI_IH_MUTE_CEC_STAT0);
+	}
+	return 0;
+}
+
+static const struct cec_adap_ops dw_hdmi_cec_ops = {
+	.adap_enable = dw_hdmi_cec_enable,
+	.adap_log_addr = dw_hdmi_cec_log_addr,
+	.adap_transmit = dw_hdmi_cec_transmit,
+};
+
+static void dw_hdmi_cec_del(void *data)
+{
+	struct dw_hdmi_cec *cec = data;
+
+	cec_delete_adapter(cec->adap);
+}
+
+static int dw_hdmi_cec_probe(struct platform_device *pdev)
+{
+	struct dw_hdmi_cec_data *data = dev_get_platdata(&pdev->dev);
+	struct dw_hdmi_cec *cec;
+	int ret;
+
+	if (!data)
+		return -ENXIO;
+
+	/*
+	 * Our device is just a convenience - we want to link to the real
+	 * hardware device here, so that userspace can see the association
+	 * between the HDMI hardware and its associated CEC chardev.
+	 */
+	cec = devm_kzalloc(&pdev->dev, sizeof(*cec), GFP_KERNEL);
+	if (!cec)
+		return -ENOMEM;
+
+	cec->irq = data->irq;
+	cec->ops = data->ops;
+	cec->hdmi = data->hdmi;
+
+	platform_set_drvdata(pdev, cec);
+
+	dw_hdmi_write(cec, 0, HDMI_CEC_TX_CNT);
+	dw_hdmi_write(cec, ~0, HDMI_CEC_MASK);
+	dw_hdmi_write(cec, ~0, HDMI_IH_MUTE_CEC_STAT0);
+	dw_hdmi_write(cec, 0, HDMI_CEC_POLARITY);
+
+	cec->adap = cec_allocate_adapter(&dw_hdmi_cec_ops, cec, "dw_hdmi",
+					 CEC_CAP_LOG_ADDRS | CEC_CAP_TRANSMIT |
+					 CEC_CAP_RC | CEC_CAP_PASSTHROUGH,
+					 CEC_MAX_LOG_ADDRS);
+	if (IS_ERR(cec->adap))
+		return PTR_ERR(cec->adap);
+
+	/* override the module pointer */
+	cec->adap->owner = THIS_MODULE;
+
+	ret = devm_add_action(&pdev->dev, dw_hdmi_cec_del, cec);
+	if (ret) {
+		cec_delete_adapter(cec->adap);
+		return ret;
+	}
+
+	ret = devm_request_threaded_irq(&pdev->dev, cec->irq,
+					dw_hdmi_cec_hardirq,
+					dw_hdmi_cec_thread, IRQF_SHARED,
+					"dw-hdmi-cec", cec->adap);
+	if (ret < 0)
+		return ret;
+
+	cec->notify = cec_notifier_get(pdev->dev.parent);
+	if (!cec->notify)
+		return -ENOMEM;
+
+	ret = cec_register_adapter(cec->adap, pdev->dev.parent);
+	if (ret < 0) {
+		cec_notifier_put(cec->notify);
+		return ret;
+	}
+
+	/*
+	 * CEC documentation says we must not call cec_delete_adapter
+	 * after a successful call to cec_register_adapter().
+	 */
+	devm_remove_action(&pdev->dev, dw_hdmi_cec_del, cec);
+
+	cec_register_cec_notifier(cec->adap, cec->notify);
+
+	return 0;
+}
+
+static int dw_hdmi_cec_remove(struct platform_device *pdev)
+{
+	struct dw_hdmi_cec *cec = platform_get_drvdata(pdev);
+
+	cec_unregister_adapter(cec->adap);
+	cec_notifier_put(cec->notify);
+
+	return 0;
+}
+
+static struct platform_driver dw_hdmi_cec_driver = {
+	.probe	= dw_hdmi_cec_probe,
+	.remove	= dw_hdmi_cec_remove,
+	.driver = {
+		.name = "dw-hdmi-cec",
+	},
+};
+module_platform_driver(dw_hdmi_cec_driver);
+
+MODULE_AUTHOR("Russell King <rmk+kernel@armlinux.org.uk>");
+MODULE_DESCRIPTION("Synopsys Designware HDMI CEC driver for i.MX");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS(PLATFORM_MODULE_PREFIX "dw-hdmi-cec");
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.h b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.h
new file mode 100644
index 0000000..cf4dc12
--- /dev/null
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-cec.h
@@ -0,0 +1,19 @@
+#ifndef DW_HDMI_CEC_H
+#define DW_HDMI_CEC_H
+
+struct dw_hdmi;
+
+struct dw_hdmi_cec_ops {
+	void (*write)(struct dw_hdmi *hdmi, u8 val, int offset);
+	u8 (*read)(struct dw_hdmi *hdmi, int offset);
+	void (*enable)(struct dw_hdmi *hdmi);
+	void (*disable)(struct dw_hdmi *hdmi);
+};
+
+struct dw_hdmi_cec_data {
+	struct dw_hdmi *hdmi;
+	const struct dw_hdmi_cec_ops *ops;
+	int irq;
+};
+
+#endif
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
index b2cf59f..3b7e5c5 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi-i2s-audio.c
@@ -1,7 +1,8 @@
 /*
  * dw-hdmi-i2s-audio.c
  *
- * Copyright (c) 2016 Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
+ * Copyright (c) 2017 Renesas Solutions Corp.
+ * Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
  *
  * This program is free software; you can redistribute it and/or modify
  * it under the terms of the GNU General Public License version 2 as
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
index ead1124..bf14214 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.c
@@ -35,8 +35,12 @@
 
 #include "dw-hdmi.h"
 #include "dw-hdmi-audio.h"
+#include "dw-hdmi-cec.h"
+
+#include <media/cec-notifier.h>
 
 #define DDC_SEGMENT_ADDR	0x30
+
 #define HDMI_EDID_LEN		512
 
 enum hdmi_datamap {
@@ -130,6 +134,7 @@ struct dw_hdmi {
 	unsigned int version;
 
 	struct platform_device *audio;
+	struct platform_device *cec;
 	struct device *dev;
 	struct clk *isfr_clk;
 	struct clk *iahb_clk;
@@ -163,6 +168,7 @@ struct dw_hdmi {
 	bool bridge_is_on;		/* indicates the bridge is on */
 	bool rxsense;			/* rxsense state */
 	u8 phy_mask;			/* desired phy int mask settings */
+	u8 mc_clkdis;			/* clock disable register */
 
 	spinlock_t audio_lock;
 	struct mutex audio_mutex;
@@ -175,6 +181,8 @@ struct dw_hdmi {
 	struct regmap *regm;
 	void (*enable_audio)(struct dw_hdmi *hdmi);
 	void (*disable_audio)(struct dw_hdmi *hdmi);
+
+	struct cec_notifier *cec_notifier;
 };
 
 #define HDMI_IH_PHY_STAT0_RX_SENSE \
@@ -546,8 +554,11 @@ EXPORT_SYMBOL_GPL(dw_hdmi_set_sample_rate);
 
 static void hdmi_enable_audio_clk(struct dw_hdmi *hdmi, bool enable)
 {
-	hdmi_modb(hdmi, enable ? 0 : HDMI_MC_CLKDIS_AUDCLK_DISABLE,
-		  HDMI_MC_CLKDIS_AUDCLK_DISABLE, HDMI_MC_CLKDIS);
+	if (enable)
+		hdmi->mc_clkdis &= ~HDMI_MC_CLKDIS_AUDCLK_DISABLE;
+	else
+		hdmi->mc_clkdis |= HDMI_MC_CLKDIS_AUDCLK_DISABLE;
+	hdmi_writeb(hdmi, hdmi->mc_clkdis, HDMI_MC_CLKDIS);
 }
 
 static void dw_hdmi_ahb_audio_enable(struct dw_hdmi *hdmi)
@@ -1317,7 +1328,7 @@ static void hdmi_config_AVI(struct dw_hdmi *hdmi, struct drm_display_mode *mode)
 	u8 val;
 
 	/* Initialise info frame from DRM mode */
-	drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 
 	if (hdmi_bus_fmt_is_yuv444(hdmi->hdmi_data.enc_out_bus_format))
 		frame.colorspace = HDMI_COLORSPACE_YUV444;
@@ -1569,8 +1580,6 @@ static void hdmi_av_composer(struct dw_hdmi *hdmi,
 /* HDMI Initialization Step B.4 */
 static void dw_hdmi_enable_video_path(struct dw_hdmi *hdmi)
 {
-	u8 clkdis;
-
 	/* control period minimum duration */
 	hdmi_writeb(hdmi, 12, HDMI_FC_CTRLDUR);
 	hdmi_writeb(hdmi, 32, HDMI_FC_EXCTRLDUR);
@@ -1582,17 +1591,21 @@ static void dw_hdmi_enable_video_path(struct dw_hdmi *hdmi)
 	hdmi_writeb(hdmi, 0x21, HDMI_FC_CH2PREAM);
 
 	/* Enable pixel clock and tmds data path */
-	clkdis = 0x7F;
-	clkdis &= ~HDMI_MC_CLKDIS_PIXELCLK_DISABLE;
-	hdmi_writeb(hdmi, clkdis, HDMI_MC_CLKDIS);
+	hdmi->mc_clkdis |= HDMI_MC_CLKDIS_HDCPCLK_DISABLE |
+			   HDMI_MC_CLKDIS_CSCCLK_DISABLE |
+			   HDMI_MC_CLKDIS_AUDCLK_DISABLE |
+			   HDMI_MC_CLKDIS_PREPCLK_DISABLE |
+			   HDMI_MC_CLKDIS_TMDSCLK_DISABLE;
+	hdmi->mc_clkdis &= ~HDMI_MC_CLKDIS_PIXELCLK_DISABLE;
+	hdmi_writeb(hdmi, hdmi->mc_clkdis, HDMI_MC_CLKDIS);
 
-	clkdis &= ~HDMI_MC_CLKDIS_TMDSCLK_DISABLE;
-	hdmi_writeb(hdmi, clkdis, HDMI_MC_CLKDIS);
+	hdmi->mc_clkdis &= ~HDMI_MC_CLKDIS_TMDSCLK_DISABLE;
+	hdmi_writeb(hdmi, hdmi->mc_clkdis, HDMI_MC_CLKDIS);
 
 	/* Enable csc path */
 	if (is_color_space_conversion(hdmi)) {
-		clkdis &= ~HDMI_MC_CLKDIS_CSCCLK_DISABLE;
-		hdmi_writeb(hdmi, clkdis, HDMI_MC_CLKDIS);
+		hdmi->mc_clkdis &= ~HDMI_MC_CLKDIS_CSCCLK_DISABLE;
+		hdmi_writeb(hdmi, hdmi->mc_clkdis, HDMI_MC_CLKDIS);
 	}
 
 	/* Enable color space conversion if needed */
@@ -1783,7 +1796,6 @@ static void initialize_hdmi_ih_mutes(struct dw_hdmi *hdmi)
 	hdmi_writeb(hdmi, 0xff, HDMI_AUD_HBR_MASK);
 	hdmi_writeb(hdmi, 0xff, HDMI_GP_MASK);
 	hdmi_writeb(hdmi, 0xff, HDMI_A_APIINTMSK);
-	hdmi_writeb(hdmi, 0xff, HDMI_CEC_MASK);
 	hdmi_writeb(hdmi, 0xff, HDMI_I2CM_INT);
 	hdmi_writeb(hdmi, 0xff, HDMI_I2CM_CTLINT);
 
@@ -1896,6 +1908,7 @@ static int dw_hdmi_connector_get_modes(struct drm_connector *connector)
 		hdmi->sink_is_hdmi = drm_detect_hdmi_monitor(edid);
 		hdmi->sink_has_audio = drm_detect_monitor_audio(edid);
 		drm_mode_connector_update_edid_property(connector, edid);
+		cec_notifier_set_phys_addr_from_edid(hdmi->cec_notifier, edid);
 		ret = drm_add_edid_modes(connector, edid);
 		/* Store the ELD */
 		drm_edid_to_eld(connector, edid);
@@ -1920,7 +1933,6 @@ static void dw_hdmi_connector_force(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs dw_hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = dw_hdmi_connector_detect,
 	.destroy = drm_connector_cleanup,
@@ -2119,11 +2131,16 @@ static irqreturn_t dw_hdmi_irq(int irq, void *dev_id)
 	 * ask the source to re-read the EDID.
 	 */
 	if (intr_stat &
-	    (HDMI_IH_PHY_STAT0_RX_SENSE | HDMI_IH_PHY_STAT0_HPD))
+	    (HDMI_IH_PHY_STAT0_RX_SENSE | HDMI_IH_PHY_STAT0_HPD)) {
 		__dw_hdmi_setup_rx_sense(hdmi,
 					 phy_stat & HDMI_PHY_HPD,
 					 phy_stat & HDMI_PHY_RX_SENSE);
 
+		if ((phy_stat & (HDMI_PHY_RX_SENSE | HDMI_PHY_HPD)) == 0)
+			cec_notifier_set_phys_addr(hdmi->cec_notifier,
+						   CEC_PHYS_ADDR_INVALID);
+	}
+
 	if (intr_stat & HDMI_IH_PHY_STAT0_HPD) {
 		dev_dbg(hdmi->dev, "EVENT=%s\n",
 			phy_int_pol & HDMI_PHY_HPD ? "plugin" : "plugout");
@@ -2170,6 +2187,7 @@ static const struct dw_hdmi_phy_data dw_hdmi_phys[] = {
 		.name = "DWC HDMI 2.0 TX PHY",
 		.gen = 2,
 		.has_svsret = true,
+		.configure = hdmi_phy_configure_dwc_hdmi_3d_tx,
 	}, {
 		.type = DW_HDMI_PHY_VENDOR_PHY,
 		.name = "Vendor PHY",
@@ -2219,6 +2237,29 @@ static int dw_hdmi_detect_phy(struct dw_hdmi *hdmi)
 	return -ENODEV;
 }
 
+static void dw_hdmi_cec_enable(struct dw_hdmi *hdmi)
+{
+	mutex_lock(&hdmi->mutex);
+	hdmi->mc_clkdis &= ~HDMI_MC_CLKDIS_CECCLK_DISABLE;
+	hdmi_writeb(hdmi, hdmi->mc_clkdis, HDMI_MC_CLKDIS);
+	mutex_unlock(&hdmi->mutex);
+}
+
+static void dw_hdmi_cec_disable(struct dw_hdmi *hdmi)
+{
+	mutex_lock(&hdmi->mutex);
+	hdmi->mc_clkdis |= HDMI_MC_CLKDIS_CECCLK_DISABLE;
+	hdmi_writeb(hdmi, hdmi->mc_clkdis, HDMI_MC_CLKDIS);
+	mutex_unlock(&hdmi->mutex);
+}
+
+static const struct dw_hdmi_cec_ops dw_hdmi_cec_ops = {
+	.write = hdmi_writeb,
+	.read = hdmi_readb,
+	.enable = dw_hdmi_cec_enable,
+	.disable = dw_hdmi_cec_disable,
+};
+
 static const struct regmap_config hdmi_regmap_8bit_config = {
 	.reg_bits	= 32,
 	.val_bits	= 8,
@@ -2241,6 +2282,7 @@ __dw_hdmi_probe(struct platform_device *pdev,
 	struct device_node *np = dev->of_node;
 	struct platform_device_info pdevinfo;
 	struct device_node *ddc_node;
+	struct dw_hdmi_cec_data cec;
 	struct dw_hdmi *hdmi;
 	struct resource *iores = NULL;
 	int irq;
@@ -2261,6 +2303,7 @@ __dw_hdmi_probe(struct platform_device *pdev,
 	hdmi->disabled = true;
 	hdmi->rxsense = true;
 	hdmi->phy_mask = (u8)~(HDMI_PHY_HPD | HDMI_PHY_RX_SENSE);
+	hdmi->mc_clkdis = 0x7f;
 
 	mutex_init(&hdmi->mutex);
 	mutex_init(&hdmi->audio_mutex);
@@ -2376,6 +2419,12 @@ __dw_hdmi_probe(struct platform_device *pdev,
 	if (ret)
 		goto err_iahb;
 
+	hdmi->cec_notifier = cec_notifier_get(dev);
+	if (!hdmi->cec_notifier) {
+		ret = -ENOMEM;
+		goto err_iahb;
+	}
+
 	/*
 	 * To prevent overflows in HDMI_IH_FC_STAT2, set the clk regenerator
 	 * N and cts values before enabling phy
@@ -2438,6 +2487,19 @@ __dw_hdmi_probe(struct platform_device *pdev,
 		hdmi->audio = platform_device_register_full(&pdevinfo);
 	}
 
+	if (config0 & HDMI_CONFIG0_CEC) {
+		cec.hdmi = hdmi;
+		cec.ops = &dw_hdmi_cec_ops;
+		cec.irq = irq;
+
+		pdevinfo.name = "dw-hdmi-cec";
+		pdevinfo.data = &cec;
+		pdevinfo.size_data = sizeof(cec);
+		pdevinfo.dma_mask = 0;
+
+		hdmi->cec = platform_device_register_full(&pdevinfo);
+	}
+
 	/* Reset HDMI DDC I2C master controller and mute I2CM interrupts */
 	if (hdmi->i2c)
 		dw_hdmi_i2c_init(hdmi);
@@ -2452,6 +2514,9 @@ __dw_hdmi_probe(struct platform_device *pdev,
 		hdmi->ddc = NULL;
 	}
 
+	if (hdmi->cec_notifier)
+		cec_notifier_put(hdmi->cec_notifier);
+
 	clk_disable_unprepare(hdmi->iahb_clk);
 err_isfr:
 	clk_disable_unprepare(hdmi->isfr_clk);
@@ -2465,10 +2530,15 @@ static void __dw_hdmi_remove(struct dw_hdmi *hdmi)
 {
 	if (hdmi->audio && !IS_ERR(hdmi->audio))
 		platform_device_unregister(hdmi->audio);
+	if (!IS_ERR(hdmi->cec))
+		platform_device_unregister(hdmi->cec);
 
 	/* Disable all interrupts */
 	hdmi_writeb(hdmi, ~0, HDMI_IH_MUTE_PHY_STAT0);
 
+	if (hdmi->cec_notifier)
+		cec_notifier_put(hdmi->cec_notifier);
+
 	clk_disable_unprepare(hdmi->iahb_clk);
 	clk_disable_unprepare(hdmi->isfr_clk);
 
@@ -2485,17 +2555,12 @@ int dw_hdmi_probe(struct platform_device *pdev,
 		  const struct dw_hdmi_plat_data *plat_data)
 {
 	struct dw_hdmi *hdmi;
-	int ret;
 
 	hdmi = __dw_hdmi_probe(pdev, plat_data);
 	if (IS_ERR(hdmi))
 		return PTR_ERR(hdmi);
 
-	ret = drm_bridge_add(&hdmi->bridge);
-	if (ret < 0) {
-		__dw_hdmi_remove(hdmi);
-		return ret;
-	}
+	drm_bridge_add(&hdmi->bridge);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.h b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.h
index c59f87e..9d90eb9 100644
--- a/drivers/gpu/drm/bridge/synopsys/dw-hdmi.h
+++ b/drivers/gpu/drm/bridge/synopsys/dw-hdmi.h
@@ -478,51 +478,6 @@
 #define HDMI_A_PRESETUP                         0x501A
 #define HDMI_A_SRM_BASE                         0x5020
 
-/* CEC Engine Registers */
-#define HDMI_CEC_CTRL                           0x7D00
-#define HDMI_CEC_STAT                           0x7D01
-#define HDMI_CEC_MASK                           0x7D02
-#define HDMI_CEC_POLARITY                       0x7D03
-#define HDMI_CEC_INT                            0x7D04
-#define HDMI_CEC_ADDR_L                         0x7D05
-#define HDMI_CEC_ADDR_H                         0x7D06
-#define HDMI_CEC_TX_CNT                         0x7D07
-#define HDMI_CEC_RX_CNT                         0x7D08
-#define HDMI_CEC_TX_DATA0                       0x7D10
-#define HDMI_CEC_TX_DATA1                       0x7D11
-#define HDMI_CEC_TX_DATA2                       0x7D12
-#define HDMI_CEC_TX_DATA3                       0x7D13
-#define HDMI_CEC_TX_DATA4                       0x7D14
-#define HDMI_CEC_TX_DATA5                       0x7D15
-#define HDMI_CEC_TX_DATA6                       0x7D16
-#define HDMI_CEC_TX_DATA7                       0x7D17
-#define HDMI_CEC_TX_DATA8                       0x7D18
-#define HDMI_CEC_TX_DATA9                       0x7D19
-#define HDMI_CEC_TX_DATA10                      0x7D1a
-#define HDMI_CEC_TX_DATA11                      0x7D1b
-#define HDMI_CEC_TX_DATA12                      0x7D1c
-#define HDMI_CEC_TX_DATA13                      0x7D1d
-#define HDMI_CEC_TX_DATA14                      0x7D1e
-#define HDMI_CEC_TX_DATA15                      0x7D1f
-#define HDMI_CEC_RX_DATA0                       0x7D20
-#define HDMI_CEC_RX_DATA1                       0x7D21
-#define HDMI_CEC_RX_DATA2                       0x7D22
-#define HDMI_CEC_RX_DATA3                       0x7D23
-#define HDMI_CEC_RX_DATA4                       0x7D24
-#define HDMI_CEC_RX_DATA5                       0x7D25
-#define HDMI_CEC_RX_DATA6                       0x7D26
-#define HDMI_CEC_RX_DATA7                       0x7D27
-#define HDMI_CEC_RX_DATA8                       0x7D28
-#define HDMI_CEC_RX_DATA9                       0x7D29
-#define HDMI_CEC_RX_DATA10                      0x7D2a
-#define HDMI_CEC_RX_DATA11                      0x7D2b
-#define HDMI_CEC_RX_DATA12                      0x7D2c
-#define HDMI_CEC_RX_DATA13                      0x7D2d
-#define HDMI_CEC_RX_DATA14                      0x7D2e
-#define HDMI_CEC_RX_DATA15                      0x7D2f
-#define HDMI_CEC_LOCK                           0x7D30
-#define HDMI_CEC_WKUPCTRL                       0x7D31
-
 /* I2C Master Registers (E-DDC) */
 #define HDMI_I2CM_SLAVE                         0x7E00
 #define HDMI_I2CM_ADDRESS                       0x7E01
@@ -555,6 +510,7 @@ enum {
 
 /* CONFIG0_ID field values */
 	HDMI_CONFIG0_I2S = 0x10,
+	HDMI_CONFIG0_CEC = 0x02,
 
 /* CONFIG1_ID field values */
 	HDMI_CONFIG1_AHB = 0x01,
diff --git a/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
new file mode 100644
index 0000000..63c7a01
--- /dev/null
+++ b/drivers/gpu/drm/bridge/synopsys/dw-mipi-dsi.c
@@ -0,0 +1,981 @@
+/*
+ * Copyright (c) 2016, Fuzhou Rockchip Electronics Co., Ltd
+ * Copyright (C) STMicroelectronics SA 2017
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * Modified by Philippe Cornu <philippe.cornu@st.com>
+ * This generic Synopsys DesignWare MIPI DSI host driver is based on the
+ * Rockchip version from rockchip/dw-mipi-dsi.c with phy & bridge APIs.
+ */
+
+#include <linux/clk.h>
+#include <linux/component.h>
+#include <linux/iopoll.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/reset.h>
+#include <drm/drmP.h>
+#include <drm/drm_atomic_helper.h>
+#include <drm/drm_bridge.h>
+#include <drm/drm_crtc.h>
+#include <drm/drm_crtc_helper.h>
+#include <drm/drm_mipi_dsi.h>
+#include <drm/drm_of.h>
+#include <drm/bridge/dw_mipi_dsi.h>
+#include <video/mipi_display.h>
+
+#define DSI_VERSION			0x00
+#define DSI_PWR_UP			0x04
+#define RESET				0
+#define POWERUP				BIT(0)
+
+#define DSI_CLKMGR_CFG			0x08
+#define TO_CLK_DIVIDSION(div)		(((div) & 0xff) << 8)
+#define TX_ESC_CLK_DIVIDSION(div)	(((div) & 0xff) << 0)
+
+#define DSI_DPI_VCID			0x0c
+#define DPI_VID(vid)			(((vid) & 0x3) << 0)
+
+#define DSI_DPI_COLOR_CODING		0x10
+#define EN18_LOOSELY			BIT(8)
+#define DPI_COLOR_CODING_16BIT_1	0x0
+#define DPI_COLOR_CODING_16BIT_2	0x1
+#define DPI_COLOR_CODING_16BIT_3	0x2
+#define DPI_COLOR_CODING_18BIT_1	0x3
+#define DPI_COLOR_CODING_18BIT_2	0x4
+#define DPI_COLOR_CODING_24BIT		0x5
+
+#define DSI_DPI_CFG_POL			0x14
+#define COLORM_ACTIVE_LOW		BIT(4)
+#define SHUTD_ACTIVE_LOW		BIT(3)
+#define HSYNC_ACTIVE_LOW		BIT(2)
+#define VSYNC_ACTIVE_LOW		BIT(1)
+#define DATAEN_ACTIVE_LOW		BIT(0)
+
+#define DSI_DPI_LP_CMD_TIM		0x18
+#define OUTVACT_LPCMD_TIME(p)		(((p) & 0xff) << 16)
+#define INVACT_LPCMD_TIME(p)		((p) & 0xff)
+
+#define DSI_DBI_CFG			0x20
+#define DSI_DBI_CMDSIZE			0x28
+
+#define DSI_PCKHDL_CFG			0x2c
+#define EN_CRC_RX			BIT(4)
+#define EN_ECC_RX			BIT(3)
+#define EN_BTA				BIT(2)
+#define EN_EOTP_RX			BIT(1)
+#define EN_EOTP_TX			BIT(0)
+
+#define DSI_MODE_CFG			0x34
+#define ENABLE_VIDEO_MODE		0
+#define ENABLE_CMD_MODE			BIT(0)
+
+#define DSI_VID_MODE_CFG		0x38
+#define FRAME_BTA_ACK			BIT(14)
+#define ENABLE_LOW_POWER		(0x3f << 8)
+#define ENABLE_LOW_POWER_MASK		(0x3f << 8)
+#define VID_MODE_TYPE_NON_BURST_SYNC_PULSES	0x0
+#define VID_MODE_TYPE_NON_BURST_SYNC_EVENTS	0x1
+#define VID_MODE_TYPE_BURST			0x2
+#define VID_MODE_TYPE_MASK			0x3
+
+#define DSI_VID_PKT_SIZE		0x3c
+#define VID_PKT_SIZE(p)			(((p) & 0x3fff) << 0)
+#define VID_PKT_MAX_SIZE		0x3fff
+
+#define DSI_VID_HSA_TIME		0x48
+#define DSI_VID_HBP_TIME		0x4c
+#define DSI_VID_HLINE_TIME		0x50
+#define DSI_VID_VSA_LINES		0x54
+#define DSI_VID_VBP_LINES		0x58
+#define DSI_VID_VFP_LINES		0x5c
+#define DSI_VID_VACTIVE_LINES		0x60
+#define DSI_CMD_MODE_CFG		0x68
+#define MAX_RD_PKT_SIZE_LP		BIT(24)
+#define DCS_LW_TX_LP			BIT(19)
+#define DCS_SR_0P_TX_LP			BIT(18)
+#define DCS_SW_1P_TX_LP			BIT(17)
+#define DCS_SW_0P_TX_LP			BIT(16)
+#define GEN_LW_TX_LP			BIT(14)
+#define GEN_SR_2P_TX_LP			BIT(13)
+#define GEN_SR_1P_TX_LP			BIT(12)
+#define GEN_SR_0P_TX_LP			BIT(11)
+#define GEN_SW_2P_TX_LP			BIT(10)
+#define GEN_SW_1P_TX_LP			BIT(9)
+#define GEN_SW_0P_TX_LP			BIT(8)
+#define EN_ACK_RQST			BIT(1)
+#define EN_TEAR_FX			BIT(0)
+
+#define CMD_MODE_ALL_LP			(MAX_RD_PKT_SIZE_LP | \
+					 DCS_LW_TX_LP | \
+					 DCS_SR_0P_TX_LP | \
+					 DCS_SW_1P_TX_LP | \
+					 DCS_SW_0P_TX_LP | \
+					 GEN_LW_TX_LP | \
+					 GEN_SR_2P_TX_LP | \
+					 GEN_SR_1P_TX_LP | \
+					 GEN_SR_0P_TX_LP | \
+					 GEN_SW_2P_TX_LP | \
+					 GEN_SW_1P_TX_LP | \
+					 GEN_SW_0P_TX_LP)
+
+#define DSI_GEN_HDR			0x6c
+#define GEN_HDATA(data)			(((data) & 0xffff) << 8)
+#define GEN_HDATA_MASK			(0xffff << 8)
+#define GEN_HTYPE(type)			(((type) & 0xff) << 0)
+#define GEN_HTYPE_MASK			0xff
+
+#define DSI_GEN_PLD_DATA		0x70
+
+#define DSI_CMD_PKT_STATUS		0x74
+#define GEN_CMD_EMPTY			BIT(0)
+#define GEN_CMD_FULL			BIT(1)
+#define GEN_PLD_W_EMPTY			BIT(2)
+#define GEN_PLD_W_FULL			BIT(3)
+#define GEN_PLD_R_EMPTY			BIT(4)
+#define GEN_PLD_R_FULL			BIT(5)
+#define GEN_RD_CMD_BUSY			BIT(6)
+
+#define DSI_TO_CNT_CFG			0x78
+#define HSTX_TO_CNT(p)			(((p) & 0xffff) << 16)
+#define LPRX_TO_CNT(p)			((p) & 0xffff)
+
+#define DSI_BTA_TO_CNT			0x8c
+#define DSI_LPCLK_CTRL			0x94
+#define AUTO_CLKLANE_CTRL		BIT(1)
+#define PHY_TXREQUESTCLKHS		BIT(0)
+
+#define DSI_PHY_TMR_LPCLK_CFG		0x98
+#define PHY_CLKHS2LP_TIME(lbcc)		(((lbcc) & 0x3ff) << 16)
+#define PHY_CLKLP2HS_TIME(lbcc)		((lbcc) & 0x3ff)
+
+#define DSI_PHY_TMR_CFG			0x9c
+#define PHY_HS2LP_TIME(lbcc)		(((lbcc) & 0xff) << 24)
+#define PHY_LP2HS_TIME(lbcc)		(((lbcc) & 0xff) << 16)
+#define MAX_RD_TIME(lbcc)		((lbcc) & 0x7fff)
+
+#define DSI_PHY_RSTZ			0xa0
+#define PHY_DISFORCEPLL			0
+#define PHY_ENFORCEPLL			BIT(3)
+#define PHY_DISABLECLK			0
+#define PHY_ENABLECLK			BIT(2)
+#define PHY_RSTZ			0
+#define PHY_UNRSTZ			BIT(1)
+#define PHY_SHUTDOWNZ			0
+#define PHY_UNSHUTDOWNZ			BIT(0)
+
+#define DSI_PHY_IF_CFG			0xa4
+#define N_LANES(n)			((((n) - 1) & 0x3) << 0)
+#define PHY_STOP_WAIT_TIME(cycle)	(((cycle) & 0xff) << 8)
+
+#define DSI_PHY_STATUS			0xb0
+#define LOCK				BIT(0)
+#define STOP_STATE_CLK_LANE		BIT(2)
+
+#define DSI_PHY_TST_CTRL0		0xb4
+#define PHY_TESTCLK			BIT(1)
+#define PHY_UNTESTCLK			0
+#define PHY_TESTCLR			BIT(0)
+#define PHY_UNTESTCLR			0
+
+#define DSI_PHY_TST_CTRL1		0xb8
+#define PHY_TESTEN			BIT(16)
+#define PHY_UNTESTEN			0
+#define PHY_TESTDOUT(n)			(((n) & 0xff) << 8)
+#define PHY_TESTDIN(n)			(((n) & 0xff) << 0)
+
+#define DSI_INT_ST0			0xbc
+#define DSI_INT_ST1			0xc0
+#define DSI_INT_MSK0			0xc4
+#define DSI_INT_MSK1			0xc8
+
+#define PHY_STATUS_TIMEOUT_US		10000
+#define CMD_PKT_STATUS_TIMEOUT_US	20000
+
+struct dw_mipi_dsi {
+	struct drm_bridge bridge;
+	struct mipi_dsi_host dsi_host;
+	struct drm_bridge *panel_bridge;
+	bool is_panel_bridge;
+	struct device *dev;
+	void __iomem *base;
+
+	struct clk *pclk;
+
+	unsigned int lane_mbps; /* per lane */
+	u32 channel;
+	u32 lanes;
+	u32 format;
+	unsigned long mode_flags;
+
+	const struct dw_mipi_dsi_plat_data *plat_data;
+};
+
+/*
+ * The controller should generate 2 frames before
+ * preparing the peripheral.
+ */
+static void dw_mipi_dsi_wait_for_two_frames(struct drm_display_mode *mode)
+{
+	int refresh, two_frames;
+
+	refresh = drm_mode_vrefresh(mode);
+	two_frames = DIV_ROUND_UP(MSEC_PER_SEC, refresh) * 2;
+	msleep(two_frames);
+}
+
+static inline struct dw_mipi_dsi *host_to_dsi(struct mipi_dsi_host *host)
+{
+	return container_of(host, struct dw_mipi_dsi, dsi_host);
+}
+
+static inline struct dw_mipi_dsi *bridge_to_dsi(struct drm_bridge *bridge)
+{
+	return container_of(bridge, struct dw_mipi_dsi, bridge);
+}
+
+static inline void dsi_write(struct dw_mipi_dsi *dsi, u32 reg, u32 val)
+{
+	writel(val, dsi->base + reg);
+}
+
+static inline u32 dsi_read(struct dw_mipi_dsi *dsi, u32 reg)
+{
+	return readl(dsi->base + reg);
+}
+
+static int dw_mipi_dsi_host_attach(struct mipi_dsi_host *host,
+				   struct mipi_dsi_device *device)
+{
+	struct dw_mipi_dsi *dsi = host_to_dsi(host);
+	struct drm_bridge *bridge;
+	struct drm_panel *panel;
+	int ret;
+
+	if (device->lanes > dsi->plat_data->max_data_lanes) {
+		dev_err(dsi->dev, "the number of data lanes(%u) is too many\n",
+			device->lanes);
+		return -EINVAL;
+	}
+
+	dsi->lanes = device->lanes;
+	dsi->channel = device->channel;
+	dsi->format = device->format;
+	dsi->mode_flags = device->mode_flags;
+
+	ret = drm_of_find_panel_or_bridge(host->dev->of_node, 1, 0,
+					  &panel, &bridge);
+	if (ret)
+		return ret;
+
+	if (panel) {
+		bridge = drm_panel_bridge_add(panel, DRM_MODE_CONNECTOR_DSI);
+		if (IS_ERR(bridge))
+			return PTR_ERR(bridge);
+		dsi->is_panel_bridge = true;
+	}
+
+	dsi->panel_bridge = bridge;
+
+	drm_bridge_add(&dsi->bridge);
+
+	return 0;
+}
+
+static int dw_mipi_dsi_host_detach(struct mipi_dsi_host *host,
+				   struct mipi_dsi_device *device)
+{
+	struct dw_mipi_dsi *dsi = host_to_dsi(host);
+
+	if (dsi->is_panel_bridge)
+		drm_panel_bridge_remove(dsi->panel_bridge);
+
+	drm_bridge_remove(&dsi->bridge);
+
+	return 0;
+}
+
+static void dw_mipi_message_config(struct dw_mipi_dsi *dsi,
+				   const struct mipi_dsi_msg *msg)
+{
+	bool lpm = msg->flags & MIPI_DSI_MSG_USE_LPM;
+	u32 val = 0;
+
+	if (msg->flags & MIPI_DSI_MSG_REQ_ACK)
+		val |= EN_ACK_RQST;
+	if (lpm)
+		val |= CMD_MODE_ALL_LP;
+
+	dsi_write(dsi, DSI_LPCLK_CTRL, lpm ? 0 : PHY_TXREQUESTCLKHS);
+	dsi_write(dsi, DSI_CMD_MODE_CFG, val);
+}
+
+static int dw_mipi_dsi_gen_pkt_hdr_write(struct dw_mipi_dsi *dsi, u32 hdr_val)
+{
+	int ret;
+	u32 val, mask;
+
+	ret = readl_poll_timeout(dsi->base + DSI_CMD_PKT_STATUS,
+				 val, !(val & GEN_CMD_FULL), 1000,
+				 CMD_PKT_STATUS_TIMEOUT_US);
+	if (ret < 0) {
+		dev_err(dsi->dev, "failed to get available command FIFO\n");
+		return ret;
+	}
+
+	dsi_write(dsi, DSI_GEN_HDR, hdr_val);
+
+	mask = GEN_CMD_EMPTY | GEN_PLD_W_EMPTY;
+	ret = readl_poll_timeout(dsi->base + DSI_CMD_PKT_STATUS,
+				 val, (val & mask) == mask,
+				 1000, CMD_PKT_STATUS_TIMEOUT_US);
+	if (ret < 0) {
+		dev_err(dsi->dev, "failed to write command FIFO\n");
+		return ret;
+	}
+
+	return 0;
+}
+
+static int dw_mipi_dsi_dcs_short_write(struct dw_mipi_dsi *dsi,
+				       const struct mipi_dsi_msg *msg)
+{
+	const u8 *tx_buf = msg->tx_buf;
+	u16 data = 0;
+	u32 val;
+
+	if (msg->tx_len > 0)
+		data |= tx_buf[0];
+	if (msg->tx_len > 1)
+		data |= tx_buf[1] << 8;
+
+	if (msg->tx_len > 2) {
+		dev_err(dsi->dev, "too long tx buf length %zu for short write\n",
+			msg->tx_len);
+		return -EINVAL;
+	}
+
+	val = GEN_HDATA(data) | GEN_HTYPE(msg->type);
+	return dw_mipi_dsi_gen_pkt_hdr_write(dsi, val);
+}
+
+static int dw_mipi_dsi_dcs_long_write(struct dw_mipi_dsi *dsi,
+				      const struct mipi_dsi_msg *msg)
+{
+	const u8 *tx_buf = msg->tx_buf;
+	int len = msg->tx_len, pld_data_bytes = sizeof(u32), ret;
+	u32 hdr_val = GEN_HDATA(msg->tx_len) | GEN_HTYPE(msg->type);
+	u32 remainder;
+	u32 val;
+
+	if (msg->tx_len < 3) {
+		dev_err(dsi->dev, "wrong tx buf length %zu for long write\n",
+			msg->tx_len);
+		return -EINVAL;
+	}
+
+	while (DIV_ROUND_UP(len, pld_data_bytes)) {
+		if (len < pld_data_bytes) {
+			remainder = 0;
+			memcpy(&remainder, tx_buf, len);
+			dsi_write(dsi, DSI_GEN_PLD_DATA, remainder);
+			len = 0;
+		} else {
+			memcpy(&remainder, tx_buf, pld_data_bytes);
+			dsi_write(dsi, DSI_GEN_PLD_DATA, remainder);
+			tx_buf += pld_data_bytes;
+			len -= pld_data_bytes;
+		}
+
+		ret = readl_poll_timeout(dsi->base + DSI_CMD_PKT_STATUS,
+					 val, !(val & GEN_PLD_W_FULL), 1000,
+					 CMD_PKT_STATUS_TIMEOUT_US);
+		if (ret < 0) {
+			dev_err(dsi->dev,
+				"failed to get available write payload FIFO\n");
+			return ret;
+		}
+	}
+
+	return dw_mipi_dsi_gen_pkt_hdr_write(dsi, hdr_val);
+}
+
+static ssize_t dw_mipi_dsi_host_transfer(struct mipi_dsi_host *host,
+					 const struct mipi_dsi_msg *msg)
+{
+	struct dw_mipi_dsi *dsi = host_to_dsi(host);
+	int ret;
+
+	/*
+	 * TODO dw drv improvements
+	 * use mipi_dsi_create_packet() instead of all following
+	 * functions and code (no switch cases, no
+	 * dw_mipi_dsi_dcs_short_write(), only the loop in long_write...)
+	 * and use packet.header...
+	 */
+	dw_mipi_message_config(dsi, msg);
+
+	switch (msg->type) {
+	case MIPI_DSI_DCS_SHORT_WRITE:
+	case MIPI_DSI_DCS_SHORT_WRITE_PARAM:
+	case MIPI_DSI_SET_MAXIMUM_RETURN_PACKET_SIZE:
+		ret = dw_mipi_dsi_dcs_short_write(dsi, msg);
+		break;
+	case MIPI_DSI_DCS_LONG_WRITE:
+		ret = dw_mipi_dsi_dcs_long_write(dsi, msg);
+		break;
+	default:
+		dev_err(dsi->dev, "unsupported message type 0x%02x\n",
+			msg->type);
+		ret = -EINVAL;
+	}
+
+	return ret;
+}
+
+static const struct mipi_dsi_host_ops dw_mipi_dsi_host_ops = {
+	.attach = dw_mipi_dsi_host_attach,
+	.detach = dw_mipi_dsi_host_detach,
+	.transfer = dw_mipi_dsi_host_transfer,
+};
+
+static void dw_mipi_dsi_video_mode_config(struct dw_mipi_dsi *dsi)
+{
+	u32 val;
+
+	/*
+	 * TODO dw drv improvements
+	 * enabling low power is panel-dependent, we should use the
+	 * panel configuration here...
+	 */
+	val = ENABLE_LOW_POWER;
+
+	if (dsi->mode_flags & MIPI_DSI_MODE_VIDEO_BURST)
+		val |= VID_MODE_TYPE_BURST;
+	else if (dsi->mode_flags & MIPI_DSI_MODE_VIDEO_SYNC_PULSE)
+		val |= VID_MODE_TYPE_NON_BURST_SYNC_PULSES;
+	else
+		val |= VID_MODE_TYPE_NON_BURST_SYNC_EVENTS;
+
+	dsi_write(dsi, DSI_VID_MODE_CFG, val);
+}
+
+static void dw_mipi_dsi_set_mode(struct dw_mipi_dsi *dsi,
+				 unsigned long mode_flags)
+{
+	dsi_write(dsi, DSI_PWR_UP, RESET);
+
+	if (mode_flags & MIPI_DSI_MODE_VIDEO) {
+		dsi_write(dsi, DSI_MODE_CFG, ENABLE_VIDEO_MODE);
+		dw_mipi_dsi_video_mode_config(dsi);
+		dsi_write(dsi, DSI_LPCLK_CTRL, PHY_TXREQUESTCLKHS);
+	} else {
+		dsi_write(dsi, DSI_MODE_CFG, ENABLE_CMD_MODE);
+	}
+
+	dsi_write(dsi, DSI_PWR_UP, POWERUP);
+}
+
+static void dw_mipi_dsi_disable(struct dw_mipi_dsi *dsi)
+{
+	dsi_write(dsi, DSI_PWR_UP, RESET);
+	dsi_write(dsi, DSI_PHY_RSTZ, PHY_RSTZ);
+}
+
+static void dw_mipi_dsi_init(struct dw_mipi_dsi *dsi)
+{
+	/*
+	 * The maximum permitted escape clock is 20MHz and it is derived from
+	 * lanebyteclk, which is running at "lane_mbps / 8".  Thus we want:
+	 *
+	 *     (lane_mbps >> 3) / esc_clk_division < 20
+	 * which is:
+	 *     esc_clk_division > (lane_mbps >> 3) / 20
+	 */
+	u32 esc_clk_division = (dsi->lane_mbps >> 3) / 20 + 1;
+
+	dsi_write(dsi, DSI_PWR_UP, RESET);
+
+	/*
+	 * TODO dw drv improvements
+	 * timeout clock division should be computed with the
+	 * high speed transmission counter timeout and byte lane...
+	 */
+	dsi_write(dsi, DSI_CLKMGR_CFG, TO_CLK_DIVIDSION(10) |
+		  TX_ESC_CLK_DIVIDSION(esc_clk_division));
+}
+
+static void dw_mipi_dsi_dpi_config(struct dw_mipi_dsi *dsi,
+				   struct drm_display_mode *mode)
+{
+	u32 val = 0, color = 0;
+
+	switch (dsi->format) {
+	case MIPI_DSI_FMT_RGB888:
+		color = DPI_COLOR_CODING_24BIT;
+		break;
+	case MIPI_DSI_FMT_RGB666:
+		color = DPI_COLOR_CODING_18BIT_2 | EN18_LOOSELY;
+		break;
+	case MIPI_DSI_FMT_RGB666_PACKED:
+		color = DPI_COLOR_CODING_18BIT_1;
+		break;
+	case MIPI_DSI_FMT_RGB565:
+		color = DPI_COLOR_CODING_16BIT_1;
+		break;
+	}
+
+	if (mode->flags & DRM_MODE_FLAG_NVSYNC)
+		val |= VSYNC_ACTIVE_LOW;
+	if (mode->flags & DRM_MODE_FLAG_NHSYNC)
+		val |= HSYNC_ACTIVE_LOW;
+
+	dsi_write(dsi, DSI_DPI_VCID, DPI_VID(dsi->channel));
+	dsi_write(dsi, DSI_DPI_COLOR_CODING, color);
+	dsi_write(dsi, DSI_DPI_CFG_POL, val);
+	/*
+	 * TODO dw drv improvements
+	 * largest packet sizes during hfp or during vsa/vbp/vfp
+	 * should be computed according to byte lane, lane number and only
+	 * if sending lp cmds in high speed is enabled (PHY_TXREQUESTCLKHS)
+	 */
+	dsi_write(dsi, DSI_DPI_LP_CMD_TIM, OUTVACT_LPCMD_TIME(4)
+		  | INVACT_LPCMD_TIME(4));
+}
+
+static void dw_mipi_dsi_packet_handler_config(struct dw_mipi_dsi *dsi)
+{
+	dsi_write(dsi, DSI_PCKHDL_CFG, EN_CRC_RX | EN_ECC_RX | EN_BTA);
+}
+
+static void dw_mipi_dsi_video_packet_config(struct dw_mipi_dsi *dsi,
+					    struct drm_display_mode *mode)
+{
+	/*
+	 * TODO dw drv improvements
+	 * only burst mode is supported here. For non-burst video modes,
+	 * we should compute DSI_VID_PKT_SIZE, DSI_VCCR.NUMC &
+	 * DSI_VNPCR.NPSIZE... especially because this driver supports
+	 * non-burst video modes, see dw_mipi_dsi_video_mode_config()...
+	 */
+	dsi_write(dsi, DSI_VID_PKT_SIZE, VID_PKT_SIZE(mode->hdisplay));
+}
+
+static void dw_mipi_dsi_command_mode_config(struct dw_mipi_dsi *dsi)
+{
+	/*
+	 * TODO dw drv improvements
+	 * compute high speed transmission counter timeout according
+	 * to the timeout clock division (TO_CLK_DIVIDSION) and byte lane...
+	 */
+	dsi_write(dsi, DSI_TO_CNT_CFG, HSTX_TO_CNT(1000) | LPRX_TO_CNT(1000));
+	/*
+	 * TODO dw drv improvements
+	 * the Bus-Turn-Around Timeout Counter should be computed
+	 * according to byte lane...
+	 */
+	dsi_write(dsi, DSI_BTA_TO_CNT, 0xd00);
+	dsi_write(dsi, DSI_MODE_CFG, ENABLE_CMD_MODE);
+}
+
+/* Get lane byte clock cycles. */
+static u32 dw_mipi_dsi_get_hcomponent_lbcc(struct dw_mipi_dsi *dsi,
+					   struct drm_display_mode *mode,
+					   u32 hcomponent)
+{
+	u32 frac, lbcc;
+
+	lbcc = hcomponent * dsi->lane_mbps * MSEC_PER_SEC / 8;
+
+	frac = lbcc % mode->clock;
+	lbcc = lbcc / mode->clock;
+	if (frac)
+		lbcc++;
+
+	return lbcc;
+}
+
+static void dw_mipi_dsi_line_timer_config(struct dw_mipi_dsi *dsi,
+					  struct drm_display_mode *mode)
+{
+	u32 htotal, hsa, hbp, lbcc;
+
+	htotal = mode->htotal;
+	hsa = mode->hsync_end - mode->hsync_start;
+	hbp = mode->htotal - mode->hsync_end;
+
+	/*
+	 * TODO dw drv improvements
+	 * computations below may be improved...
+	 */
+	lbcc = dw_mipi_dsi_get_hcomponent_lbcc(dsi, mode, htotal);
+	dsi_write(dsi, DSI_VID_HLINE_TIME, lbcc);
+
+	lbcc = dw_mipi_dsi_get_hcomponent_lbcc(dsi, mode, hsa);
+	dsi_write(dsi, DSI_VID_HSA_TIME, lbcc);
+
+	lbcc = dw_mipi_dsi_get_hcomponent_lbcc(dsi, mode, hbp);
+	dsi_write(dsi, DSI_VID_HBP_TIME, lbcc);
+}
+
+static void dw_mipi_dsi_vertical_timing_config(struct dw_mipi_dsi *dsi,
+					       struct drm_display_mode *mode)
+{
+	u32 vactive, vsa, vfp, vbp;
+
+	vactive = mode->vdisplay;
+	vsa = mode->vsync_end - mode->vsync_start;
+	vfp = mode->vsync_start - mode->vdisplay;
+	vbp = mode->vtotal - mode->vsync_end;
+
+	dsi_write(dsi, DSI_VID_VACTIVE_LINES, vactive);
+	dsi_write(dsi, DSI_VID_VSA_LINES, vsa);
+	dsi_write(dsi, DSI_VID_VFP_LINES, vfp);
+	dsi_write(dsi, DSI_VID_VBP_LINES, vbp);
+}
+
+static void dw_mipi_dsi_dphy_timing_config(struct dw_mipi_dsi *dsi)
+{
+	/*
+	 * TODO dw drv improvements
+	 * data & clock lane timers should be computed according to panel
+	 * blankings and to the automatic clock lane control mode...
+	 * note: DSI_PHY_TMR_CFG.MAX_RD_TIME should be in line with
+	 * DSI_CMD_MODE_CFG.MAX_RD_PKT_SIZE_LP (see CMD_MODE_ALL_LP)
+	 */
+	dsi_write(dsi, DSI_PHY_TMR_CFG, PHY_HS2LP_TIME(0x40)
+		  | PHY_LP2HS_TIME(0x40) | MAX_RD_TIME(10000));
+
+	dsi_write(dsi, DSI_PHY_TMR_LPCLK_CFG, PHY_CLKHS2LP_TIME(0x40)
+		  | PHY_CLKLP2HS_TIME(0x40));
+}
+
+static void dw_mipi_dsi_dphy_interface_config(struct dw_mipi_dsi *dsi)
+{
+	/*
+	 * TODO dw drv improvements
+	 * stop wait time should be the maximum between host dsi
+	 * and panel stop wait times
+	 */
+	dsi_write(dsi, DSI_PHY_IF_CFG, PHY_STOP_WAIT_TIME(0x20) |
+		  N_LANES(dsi->lanes));
+}
+
+static void dw_mipi_dsi_dphy_init(struct dw_mipi_dsi *dsi)
+{
+	/* Clear PHY state */
+	dsi_write(dsi, DSI_PHY_RSTZ, PHY_DISFORCEPLL | PHY_DISABLECLK
+		  | PHY_RSTZ | PHY_SHUTDOWNZ);
+	dsi_write(dsi, DSI_PHY_TST_CTRL0, PHY_UNTESTCLR);
+	dsi_write(dsi, DSI_PHY_TST_CTRL0, PHY_TESTCLR);
+	dsi_write(dsi, DSI_PHY_TST_CTRL0, PHY_UNTESTCLR);
+}
+
+static void dw_mipi_dsi_dphy_enable(struct dw_mipi_dsi *dsi)
+{
+	u32 val;
+	int ret;
+
+	dsi_write(dsi, DSI_PHY_RSTZ, PHY_ENFORCEPLL | PHY_ENABLECLK |
+		  PHY_UNRSTZ | PHY_UNSHUTDOWNZ);
+
+	ret = readl_poll_timeout(dsi->base + DSI_PHY_STATUS,
+				 val, val & LOCK, 1000, PHY_STATUS_TIMEOUT_US);
+	if (ret < 0)
+		DRM_DEBUG_DRIVER("failed to wait phy lock state\n");
+
+	ret = readl_poll_timeout(dsi->base + DSI_PHY_STATUS,
+				 val, val & STOP_STATE_CLK_LANE, 1000,
+				 PHY_STATUS_TIMEOUT_US);
+	if (ret < 0)
+		DRM_DEBUG_DRIVER("failed to wait phy clk lane stop state\n");
+}
+
+static void dw_mipi_dsi_clear_err(struct dw_mipi_dsi *dsi)
+{
+	dsi_read(dsi, DSI_INT_ST0);
+	dsi_read(dsi, DSI_INT_ST1);
+	dsi_write(dsi, DSI_INT_MSK0, 0);
+	dsi_write(dsi, DSI_INT_MSK1, 0);
+}
+
+static void dw_mipi_dsi_bridge_post_disable(struct drm_bridge *bridge)
+{
+	struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
+
+	/*
+	 * Switch to command mode before panel-bridge post_disable &
+	 * panel unprepare.
+	 * Note: panel-bridge disable & panel disable have been called
+	 * before by the drm framework.
+	 */
+	dw_mipi_dsi_set_mode(dsi, 0);
+
+	/*
+	 * TODO Only way found to call panel-bridge post_disable &
+	 * panel unprepare before the dsi "final" disable...
+	 * This needs to be fixed in the drm_bridge framework and the API
+	 * needs to be updated to manage our own call chains...
+	 */
+	dsi->panel_bridge->funcs->post_disable(dsi->panel_bridge);
+
+	dw_mipi_dsi_disable(dsi);
+	clk_disable_unprepare(dsi->pclk);
+	pm_runtime_put(dsi->dev);
+}
+
+void dw_mipi_dsi_bridge_mode_set(struct drm_bridge *bridge,
+				 struct drm_display_mode *mode,
+				 struct drm_display_mode *adjusted_mode)
+{
+	struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
+	const struct dw_mipi_dsi_phy_ops *phy_ops = dsi->plat_data->phy_ops;
+	void *priv_data = dsi->plat_data->priv_data;
+	int ret;
+
+	clk_prepare_enable(dsi->pclk);
+
+	ret = phy_ops->get_lane_mbps(priv_data, mode, dsi->mode_flags,
+				     dsi->lanes, dsi->format, &dsi->lane_mbps);
+	if (ret)
+		DRM_DEBUG_DRIVER("Phy get_lane_mbps() failed\n");
+
+	pm_runtime_get_sync(dsi->dev);
+	dw_mipi_dsi_init(dsi);
+	dw_mipi_dsi_dpi_config(dsi, mode);
+	dw_mipi_dsi_packet_handler_config(dsi);
+	dw_mipi_dsi_video_mode_config(dsi);
+	dw_mipi_dsi_video_packet_config(dsi, mode);
+	dw_mipi_dsi_command_mode_config(dsi);
+	dw_mipi_dsi_line_timer_config(dsi, mode);
+	dw_mipi_dsi_vertical_timing_config(dsi, mode);
+
+	dw_mipi_dsi_dphy_init(dsi);
+	dw_mipi_dsi_dphy_timing_config(dsi);
+	dw_mipi_dsi_dphy_interface_config(dsi);
+
+	dw_mipi_dsi_clear_err(dsi);
+
+	ret = phy_ops->init(priv_data);
+	if (ret)
+		DRM_DEBUG_DRIVER("Phy init() failed\n");
+
+	dw_mipi_dsi_dphy_enable(dsi);
+
+	dw_mipi_dsi_wait_for_two_frames(mode);
+
+	/* Switch to cmd mode for panel-bridge pre_enable & panel prepare */
+	dw_mipi_dsi_set_mode(dsi, 0);
+}
+
+static void dw_mipi_dsi_bridge_enable(struct drm_bridge *bridge)
+{
+	struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
+
+	/* Switch to video mode for panel-bridge enable & panel enable */
+	dw_mipi_dsi_set_mode(dsi, MIPI_DSI_MODE_VIDEO);
+}
+
+static enum drm_mode_status
+dw_mipi_dsi_bridge_mode_valid(struct drm_bridge *bridge,
+			      const struct drm_display_mode *mode)
+{
+	struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
+	const struct dw_mipi_dsi_plat_data *pdata = dsi->plat_data;
+	enum drm_mode_status mode_status = MODE_OK;
+
+	if (pdata->mode_valid)
+		mode_status = pdata->mode_valid(pdata->priv_data, mode);
+
+	return mode_status;
+}
+
+static int dw_mipi_dsi_bridge_attach(struct drm_bridge *bridge)
+{
+	struct dw_mipi_dsi *dsi = bridge_to_dsi(bridge);
+
+	if (!bridge->encoder) {
+		DRM_ERROR("Parent encoder object not found\n");
+		return -ENODEV;
+	}
+
+	/* Set the encoder type as caller does not know it */
+	bridge->encoder->encoder_type = DRM_MODE_ENCODER_DSI;
+
+	/* Attach the panel-bridge to the dsi bridge */
+	return drm_bridge_attach(bridge->encoder, dsi->panel_bridge, bridge);
+}
+
+static const struct drm_bridge_funcs dw_mipi_dsi_bridge_funcs = {
+	.mode_set     = dw_mipi_dsi_bridge_mode_set,
+	.enable	      = dw_mipi_dsi_bridge_enable,
+	.post_disable = dw_mipi_dsi_bridge_post_disable,
+	.mode_valid   = dw_mipi_dsi_bridge_mode_valid,
+	.attach	      = dw_mipi_dsi_bridge_attach,
+};
+
+static struct dw_mipi_dsi *
+__dw_mipi_dsi_probe(struct platform_device *pdev,
+		    const struct dw_mipi_dsi_plat_data *plat_data)
+{
+	struct device *dev = &pdev->dev;
+	struct reset_control *apb_rst;
+	struct dw_mipi_dsi *dsi;
+	struct resource *res;
+	int ret;
+
+	dsi = devm_kzalloc(dev, sizeof(*dsi), GFP_KERNEL);
+	if (!dsi)
+		return ERR_PTR(-ENOMEM);
+
+	dsi->dev = dev;
+	dsi->plat_data = plat_data;
+
+	if (!plat_data->phy_ops->init || !plat_data->phy_ops->get_lane_mbps) {
+		DRM_ERROR("Phy not properly configured\n");
+		return ERR_PTR(-ENODEV);
+	}
+
+	if (!plat_data->base) {
+		res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+		if (!res)
+			return ERR_PTR(-ENODEV);
+
+		dsi->base = devm_ioremap_resource(dev, res);
+		if (IS_ERR(dsi->base))
+			return ERR_PTR(-ENODEV);
+
+	} else {
+		dsi->base = plat_data->base;
+	}
+
+	dsi->pclk = devm_clk_get(dev, "pclk");
+	if (IS_ERR(dsi->pclk)) {
+		ret = PTR_ERR(dsi->pclk);
+		dev_err(dev, "Unable to get pclk: %d\n", ret);
+		return ERR_PTR(ret);
+	}
+
+	/*
+	 * Note that the reset was not defined in the initial device tree, so
+	 * we have to be prepared for it not being found.
+	 */
+	apb_rst = devm_reset_control_get(dev, "apb");
+	if (IS_ERR(apb_rst)) {
+		ret = PTR_ERR(apb_rst);
+		if (ret == -ENOENT) {
+			apb_rst = NULL;
+		} else {
+			dev_err(dev, "Unable to get reset control: %d\n", ret);
+			return ERR_PTR(ret);
+		}
+	}
+
+	if (apb_rst) {
+		ret = clk_prepare_enable(dsi->pclk);
+		if (ret) {
+			dev_err(dev, "%s: Failed to enable pclk\n", __func__);
+			return ERR_PTR(ret);
+		}
+
+		reset_control_assert(apb_rst);
+		usleep_range(10, 20);
+		reset_control_deassert(apb_rst);
+
+		clk_disable_unprepare(dsi->pclk);
+	}
+
+	pm_runtime_enable(dev);
+
+	dsi->dsi_host.ops = &dw_mipi_dsi_host_ops;
+	dsi->dsi_host.dev = dev;
+	ret = mipi_dsi_host_register(&dsi->dsi_host);
+	if (ret) {
+		dev_err(dev, "Failed to register MIPI host: %d\n", ret);
+		return ERR_PTR(ret);
+	}
+
+	dsi->bridge.driver_private = dsi;
+	dsi->bridge.funcs = &dw_mipi_dsi_bridge_funcs;
+#ifdef CONFIG_OF
+	dsi->bridge.of_node = pdev->dev.of_node;
+#endif
+
+	dev_set_drvdata(dev, dsi);
+
+	return dsi;
+}
+
+static void __dw_mipi_dsi_remove(struct dw_mipi_dsi *dsi)
+{
+	pm_runtime_disable(dsi->dev);
+}
+
+/*
+ * Probe/remove API, used from platforms based on the DRM bridge API.
+ */
+int dw_mipi_dsi_probe(struct platform_device *pdev,
+		      const struct dw_mipi_dsi_plat_data *plat_data)
+{
+	struct dw_mipi_dsi *dsi;
+
+	dsi = __dw_mipi_dsi_probe(pdev, plat_data);
+	if (IS_ERR(dsi))
+		return PTR_ERR(dsi);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dw_mipi_dsi_probe);
+
+void dw_mipi_dsi_remove(struct platform_device *pdev)
+{
+	struct dw_mipi_dsi *dsi = platform_get_drvdata(pdev);
+
+	mipi_dsi_host_unregister(&dsi->dsi_host);
+
+	__dw_mipi_dsi_remove(dsi);
+}
+EXPORT_SYMBOL_GPL(dw_mipi_dsi_remove);
+
+/*
+ * Bind/unbind API, used from platforms based on the component framework.
+ */
+int dw_mipi_dsi_bind(struct platform_device *pdev, struct drm_encoder *encoder,
+		     const struct dw_mipi_dsi_plat_data *plat_data)
+{
+	struct dw_mipi_dsi *dsi;
+	int ret;
+
+	dsi = __dw_mipi_dsi_probe(pdev, plat_data);
+	if (IS_ERR(dsi))
+		return PTR_ERR(dsi);
+
+	ret = drm_bridge_attach(encoder, &dsi->bridge, NULL);
+	if (ret) {
+		dw_mipi_dsi_remove(pdev);
+		DRM_ERROR("Failed to initialize bridge with drm\n");
+		return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dw_mipi_dsi_bind);
+
+void dw_mipi_dsi_unbind(struct device *dev)
+{
+	struct dw_mipi_dsi *dsi = dev_get_drvdata(dev);
+
+	__dw_mipi_dsi_remove(dsi);
+}
+EXPORT_SYMBOL_GPL(dw_mipi_dsi_unbind);
+
+MODULE_AUTHOR("Chris Zhong <zyw@rock-chips.com>");
+MODULE_AUTHOR("Philippe Cornu <philippe.cornu@st.com>");
+MODULE_DESCRIPTION("DW MIPI DSI host controller driver");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("platform:dw-mipi-dsi");
diff --git a/drivers/gpu/drm/bridge/tc358767.c b/drivers/gpu/drm/bridge/tc358767.c
index 0529e50..8571cfd 100644
--- a/drivers/gpu/drm/bridge/tc358767.c
+++ b/drivers/gpu/drm/bridge/tc358767.c
@@ -1160,7 +1160,6 @@ static const struct drm_connector_helper_funcs tc_connector_helper_funcs = {
 };
 
 static const struct drm_connector_funcs tc_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
@@ -1325,11 +1324,7 @@ static int tc_probe(struct i2c_client *client, const struct i2c_device_id *id)
 
 	tc->bridge.funcs = &tc_bridge_funcs;
 	tc->bridge.of_node = dev->of_node;
-	ret = drm_bridge_add(&tc->bridge);
-	if (ret) {
-		dev_err(dev, "Failed to add drm_bridge: %d\n", ret);
-		goto err_unregister_aux;
-	}
+	drm_bridge_add(&tc->bridge);
 
 	i2c_set_clientdata(client, tc);
 
diff --git a/drivers/gpu/drm/bridge/ti-tfp410.c b/drivers/gpu/drm/bridge/ti-tfp410.c
index eee4efd..acb8570 100644
--- a/drivers/gpu/drm/bridge/ti-tfp410.c
+++ b/drivers/gpu/drm/bridge/ti-tfp410.c
@@ -102,7 +102,6 @@ tfp410_connector_detect(struct drm_connector *connector, bool force)
 }
 
 static const struct drm_connector_funcs tfp410_con_funcs = {
-	.dpms			= drm_atomic_helper_connector_dpms,
 	.detect			= tfp410_connector_detect,
 	.fill_modes		= drm_helper_probe_single_connector_modes,
 	.destroy		= drm_connector_cleanup,
@@ -237,11 +236,7 @@ static int tfp410_init(struct device *dev)
 		}
 	}
 
-	ret = drm_bridge_add(&dvi->bridge);
-	if (ret) {
-		dev_err(dev, "drm_bridge_add() failed: %d\n", ret);
-		goto fail;
-	}
+	drm_bridge_add(&dvi->bridge);
 
 	return 0;
 fail:
diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.c b/drivers/gpu/drm/cirrus/cirrus_drv.c
index d893ea2..69c4e35 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.c
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.c
@@ -132,7 +132,6 @@ static struct drm_driver driver = {
 	.driver_features = DRIVER_MODESET | DRIVER_GEM,
 	.load = cirrus_driver_load,
 	.unload = cirrus_driver_unload,
-	.set_busid = drm_pci_set_busid,
 	.fops = &cirrus_driver_fops,
 	.name = DRIVER_NAME,
 	.desc = DRIVER_DESC,
@@ -143,7 +142,6 @@ static struct drm_driver driver = {
 	.gem_free_object_unlocked = cirrus_gem_free_object,
 	.dumb_create = cirrus_dumb_create,
 	.dumb_map_offset = cirrus_dumb_mmap_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 };
 
 static const struct dev_pm_ops cirrus_pm_ops = {
@@ -166,12 +164,12 @@ static int __init cirrus_init(void)
 
 	if (cirrus_modeset == 0)
 		return -EINVAL;
-	return drm_pci_init(&driver, &cirrus_pci_driver);
+	return pci_register_driver(&cirrus_pci_driver);
 }
 
 static void __exit cirrus_exit(void)
 {
-	drm_pci_exit(&driver, &cirrus_pci_driver);
+	pci_unregister_driver(&cirrus_pci_driver);
 }
 
 module_init(cirrus_init);
diff --git a/drivers/gpu/drm/cirrus/cirrus_drv.h b/drivers/gpu/drm/cirrus/cirrus_drv.h
index 8690352..be2d7e48 100644
--- a/drivers/gpu/drm/cirrus/cirrus_drv.h
+++ b/drivers/gpu/drm/cirrus/cirrus_drv.h
@@ -96,7 +96,6 @@
 
 struct cirrus_crtc {
 	struct drm_crtc			base;
-	u8				lut_r[256], lut_g[256], lut_b[256];
 	int				last_dpms;
 	bool				enabled;
 };
@@ -180,13 +179,6 @@ cirrus_bo(struct ttm_buffer_object *bo)
 #define to_cirrus_obj(x) container_of(x, struct cirrus_gem_object, base)
 #define DRM_FILE_PAGE_OFFSET (0x100000000ULL >> PAGE_SHIFT)
 
-				/* cirrus_mode.c */
-void cirrus_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-			     u16 blue, int regno);
-void cirrus_crtc_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-			     u16 *blue, int regno);
-
-
 				/* cirrus_main.c */
 int cirrus_device_init(struct cirrus_device *cdev,
 		      struct drm_device *ddev,
diff --git a/drivers/gpu/drm/cirrus/cirrus_fbdev.c b/drivers/gpu/drm/cirrus/cirrus_fbdev.c
index 7fa58ee..32fbfba 100644
--- a/drivers/gpu/drm/cirrus/cirrus_fbdev.c
+++ b/drivers/gpu/drm/cirrus/cirrus_fbdev.c
@@ -215,7 +215,6 @@ static int cirrusfb_create(struct drm_fb_helper *helper,
 
 	strcpy(info->fix.id, "cirrusdrmfb");
 
-	info->flags = FBINFO_DEFAULT;
 	info->fbops = &cirrusfb_ops;
 
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
@@ -252,7 +251,7 @@ static int cirrus_fbdev_destroy(struct drm_device *dev,
 	drm_fb_helper_unregister_fbi(&gfbdev->helper);
 
 	if (gfb->obj) {
-		drm_gem_object_unreference_unlocked(gfb->obj);
+		drm_gem_object_put_unlocked(gfb->obj);
 		gfb->obj = NULL;
 	}
 
@@ -265,8 +264,6 @@ static int cirrus_fbdev_destroy(struct drm_device *dev,
 }
 
 static const struct drm_fb_helper_funcs cirrus_fb_helper_funcs = {
-	.gamma_set = cirrus_crtc_fb_gamma_set,
-	.gamma_get = cirrus_crtc_fb_gamma_get,
 	.fb_probe = cirrusfb_create,
 };
 
diff --git a/drivers/gpu/drm/cirrus/cirrus_main.c b/drivers/gpu/drm/cirrus/cirrus_main.c
index e7fc95f..b5f52854 100644
--- a/drivers/gpu/drm/cirrus/cirrus_main.c
+++ b/drivers/gpu/drm/cirrus/cirrus_main.c
@@ -18,7 +18,7 @@ static void cirrus_user_framebuffer_destroy(struct drm_framebuffer *fb)
 {
 	struct cirrus_framebuffer *cirrus_fb = to_cirrus_framebuffer(fb);
 
-	drm_gem_object_unreference_unlocked(cirrus_fb->obj);
+	drm_gem_object_put_unlocked(cirrus_fb->obj);
 	drm_framebuffer_cleanup(fb);
 	kfree(fb);
 }
@@ -67,13 +67,13 @@ cirrus_user_framebuffer_create(struct drm_device *dev,
 
 	cirrus_fb = kzalloc(sizeof(*cirrus_fb), GFP_KERNEL);
 	if (!cirrus_fb) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR(-ENOMEM);
 	}
 
 	ret = cirrus_framebuffer_init(dev, cirrus_fb, mode_cmd, obj);
 	if (ret) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		kfree(cirrus_fb);
 		return ERR_PTR(ret);
 	}
@@ -261,7 +261,7 @@ int cirrus_dumb_create(struct drm_file *file,
 		return ret;
 
 	ret = drm_gem_handle_create(file, gobj, &handle);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (ret)
 		return ret;
 
@@ -310,7 +310,7 @@ cirrus_dumb_mmap_offset(struct drm_file *file,
 	bo = gem_to_cirrus_bo(obj);
 	*offset = cirrus_bo_mmap_offset(bo);
 
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/cirrus/cirrus_mode.c b/drivers/gpu/drm/cirrus/cirrus_mode.c
index 53f6f0f..a4c4a46 100644
--- a/drivers/gpu/drm/cirrus/cirrus_mode.c
+++ b/drivers/gpu/drm/cirrus/cirrus_mode.c
@@ -31,25 +31,6 @@
  * This file contains setup code for the CRTC.
  */
 
-static void cirrus_crtc_load_lut(struct drm_crtc *crtc)
-{
-	struct cirrus_crtc *cirrus_crtc = to_cirrus_crtc(crtc);
-	struct drm_device *dev = crtc->dev;
-	struct cirrus_device *cdev = dev->dev_private;
-	int i;
-
-	if (!crtc->enabled)
-		return;
-
-	for (i = 0; i < CIRRUS_LUT_SIZE; i++) {
-		/* VGA registers */
-		WREG8(PALETTE_INDEX, i);
-		WREG8(PALETTE_DATA, cirrus_crtc->lut_r[i]);
-		WREG8(PALETTE_DATA, cirrus_crtc->lut_g[i]);
-		WREG8(PALETTE_DATA, cirrus_crtc->lut_b[i]);
-	}
-}
-
 /*
  * The DRM core requires DPMS functions, but they make little sense in our
  * case and so are just stubs
@@ -330,15 +311,25 @@ static int cirrus_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
 				 u16 *blue, uint32_t size,
 				 struct drm_modeset_acquire_ctx *ctx)
 {
-	struct cirrus_crtc *cirrus_crtc = to_cirrus_crtc(crtc);
+	struct drm_device *dev = crtc->dev;
+	struct cirrus_device *cdev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 
-	for (i = 0; i < size; i++) {
-		cirrus_crtc->lut_r[i] = red[i];
-		cirrus_crtc->lut_g[i] = green[i];
-		cirrus_crtc->lut_b[i] = blue[i];
+	if (!crtc->enabled)
+		return 0;
+
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
+
+	for (i = 0; i < CIRRUS_LUT_SIZE; i++) {
+		/* VGA registers */
+		WREG8(PALETTE_INDEX, i);
+		WREG8(PALETTE_DATA, *r++ >> 8);
+		WREG8(PALETTE_DATA, *g++ >> 8);
+		WREG8(PALETTE_DATA, *b++ >> 8);
 	}
-	cirrus_crtc_load_lut(crtc);
 
 	return 0;
 }
@@ -365,7 +356,6 @@ static const struct drm_crtc_helper_funcs cirrus_helper_funcs = {
 	.mode_set_base = cirrus_crtc_mode_set_base,
 	.prepare = cirrus_crtc_prepare,
 	.commit = cirrus_crtc_commit,
-	.load_lut = cirrus_crtc_load_lut,
 };
 
 /* CRTC setup */
@@ -373,7 +363,6 @@ static void cirrus_crtc_init(struct drm_device *dev)
 {
 	struct cirrus_device *cdev = dev->dev_private;
 	struct cirrus_crtc *cirrus_crtc;
-	int i;
 
 	cirrus_crtc = kzalloc(sizeof(struct cirrus_crtc) +
 			      (CIRRUSFB_CONN_LIMIT * sizeof(struct drm_connector *)),
@@ -387,37 +376,9 @@ static void cirrus_crtc_init(struct drm_device *dev)
 	drm_mode_crtc_set_gamma_size(&cirrus_crtc->base, CIRRUS_LUT_SIZE);
 	cdev->mode_info.crtc = cirrus_crtc;
 
-	for (i = 0; i < CIRRUS_LUT_SIZE; i++) {
-		cirrus_crtc->lut_r[i] = i;
-		cirrus_crtc->lut_g[i] = i;
-		cirrus_crtc->lut_b[i] = i;
-	}
-
 	drm_crtc_helper_add(&cirrus_crtc->base, &cirrus_helper_funcs);
 }
 
-/** Sets the color ramps on behalf of fbcon */
-void cirrus_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-			      u16 blue, int regno)
-{
-	struct cirrus_crtc *cirrus_crtc = to_cirrus_crtc(crtc);
-
-	cirrus_crtc->lut_r[regno] = red;
-	cirrus_crtc->lut_g[regno] = green;
-	cirrus_crtc->lut_b[regno] = blue;
-}
-
-/** Gets the color ramps on behalf of fbcon */
-void cirrus_crtc_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-			      u16 *blue, int regno)
-{
-	struct cirrus_crtc *cirrus_crtc = to_cirrus_crtc(crtc);
-
-	*red = cirrus_crtc->lut_r[regno];
-	*green = cirrus_crtc->lut_g[regno];
-	*blue = cirrus_crtc->lut_b[regno];
-}
-
 static void cirrus_encoder_mode_set(struct drm_encoder *encoder,
 				struct drm_display_mode *mode,
 				struct drm_display_mode *adjusted_mode)
diff --git a/drivers/gpu/drm/drm_atomic.c b/drivers/gpu/drm/drm_atomic.c
index aed25c4..2fd383d 100644
--- a/drivers/gpu/drm/drm_atomic.c
+++ b/drivers/gpu/drm/drm_atomic.c
@@ -29,7 +29,6 @@
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_mode.h>
-#include <drm/drm_plane_helper.h>
 #include <drm/drm_print.h>
 #include <linux/sync_file.h>
 
@@ -188,12 +187,15 @@ void drm_atomic_state_default_clear(struct drm_atomic_state *state)
 	}
 
 	for (i = 0; i < state->num_private_objs; i++) {
-		void *obj_state = state->private_objs[i].obj_state;
+		struct drm_private_obj *obj = state->private_objs[i].ptr;
 
-		state->private_objs[i].funcs->destroy_state(obj_state);
-		state->private_objs[i].obj = NULL;
-		state->private_objs[i].obj_state = NULL;
-		state->private_objs[i].funcs = NULL;
+		if (!obj)
+			continue;
+
+		obj->funcs->atomic_destroy_state(obj,
+						 state->private_objs[i].state);
+		state->private_objs[i].ptr = NULL;
+		state->private_objs[i].state = NULL;
 	}
 	state->num_private_objs = 0;
 
@@ -409,34 +411,6 @@ int drm_atomic_set_mode_prop_for_crtc(struct drm_crtc_state *state,
 }
 EXPORT_SYMBOL(drm_atomic_set_mode_prop_for_crtc);
 
-/**
- * drm_atomic_replace_property_blob - replace a blob property
- * @blob: a pointer to the member blob to be replaced
- * @new_blob: the new blob to replace with
- * @replaced: whether the blob has been replaced
- *
- * RETURNS:
- * Zero on success, error code on failure
- */
-static void
-drm_atomic_replace_property_blob(struct drm_property_blob **blob,
-				 struct drm_property_blob *new_blob,
-				 bool *replaced)
-{
-	struct drm_property_blob *old_blob = *blob;
-
-	if (old_blob == new_blob)
-		return;
-
-	drm_property_blob_put(old_blob);
-	if (new_blob)
-		drm_property_blob_get(new_blob);
-	*blob = new_blob;
-	*replaced = true;
-
-	return;
-}
-
 static int
 drm_atomic_replace_property_blob_from_id(struct drm_device *dev,
 					 struct drm_property_blob **blob,
@@ -457,7 +431,7 @@ drm_atomic_replace_property_blob_from_id(struct drm_device *dev,
 		}
 	}
 
-	drm_atomic_replace_property_blob(blob, new_blob, replaced);
+	*replaced |= drm_property_replace_blob(blob, new_blob);
 	drm_property_blob_put(new_blob);
 
 	return 0;
@@ -739,7 +713,7 @@ EXPORT_SYMBOL(drm_atomic_get_plane_state);
  * RETURNS:
  * Zero on success, error code on failure
  */
-int drm_atomic_plane_set_property(struct drm_plane *plane,
+static int drm_atomic_plane_set_property(struct drm_plane *plane,
 		struct drm_plane_state *state, struct drm_property *property,
 		uint64_t val)
 {
@@ -796,7 +770,6 @@ int drm_atomic_plane_set_property(struct drm_plane *plane,
 
 	return 0;
 }
-EXPORT_SYMBOL(drm_atomic_plane_set_property);
 
 /**
  * drm_atomic_plane_get_property - get property value from plane state
@@ -991,11 +964,44 @@ static void drm_atomic_plane_print_state(struct drm_printer *p,
 }
 
 /**
+ * drm_atomic_private_obj_init - initialize private object
+ * @obj: private object
+ * @state: initial private object state
+ * @funcs: pointer to the struct of function pointers that identify the object
+ * type
+ *
+ * Initialize the private object, which can be embedded into any
+ * driver private object that needs its own atomic state.
+ */
+void
+drm_atomic_private_obj_init(struct drm_private_obj *obj,
+			    struct drm_private_state *state,
+			    const struct drm_private_state_funcs *funcs)
+{
+	memset(obj, 0, sizeof(*obj));
+
+	obj->state = state;
+	obj->funcs = funcs;
+}
+EXPORT_SYMBOL(drm_atomic_private_obj_init);
+
+/**
+ * drm_atomic_private_obj_fini - finalize private object
+ * @obj: private object
+ *
+ * Finalize the private object.
+ */
+void
+drm_atomic_private_obj_fini(struct drm_private_obj *obj)
+{
+	obj->funcs->atomic_destroy_state(obj, obj->state);
+}
+EXPORT_SYMBOL(drm_atomic_private_obj_fini);
+
+/**
  * drm_atomic_get_private_obj_state - get private object state
  * @state: global atomic state
  * @obj: private object to get the state for
- * @funcs: pointer to the struct of function pointers that identify the object
- * type
  *
  * This function returns the private object state for the given private object,
  * allocating the state if needed. It does not grab any locks as the caller is
@@ -1005,18 +1011,18 @@ static void drm_atomic_plane_print_state(struct drm_printer *p,
  *
  * Either the allocated state or the error code encoded into a pointer.
  */
-void *
-drm_atomic_get_private_obj_state(struct drm_atomic_state *state, void *obj,
-			      const struct drm_private_state_funcs *funcs)
+struct drm_private_state *
+drm_atomic_get_private_obj_state(struct drm_atomic_state *state,
+				 struct drm_private_obj *obj)
 {
 	int index, num_objs, i;
 	size_t size;
 	struct __drm_private_objs_state *arr;
+	struct drm_private_state *obj_state;
 
 	for (i = 0; i < state->num_private_objs; i++)
-		if (obj == state->private_objs[i].obj &&
-		    state->private_objs[i].obj_state)
-			return state->private_objs[i].obj_state;
+		if (obj == state->private_objs[i].ptr)
+			return state->private_objs[i].state;
 
 	num_objs = state->num_private_objs + 1;
 	size = sizeof(*state->private_objs) * num_objs;
@@ -1028,18 +1034,21 @@ drm_atomic_get_private_obj_state(struct drm_atomic_state *state, void *obj,
 	index = state->num_private_objs;
 	memset(&state->private_objs[index], 0, sizeof(*state->private_objs));
 
-	state->private_objs[index].obj_state = funcs->duplicate_state(state, obj);
-	if (!state->private_objs[index].obj_state)
+	obj_state = obj->funcs->atomic_duplicate_state(obj);
+	if (!obj_state)
 		return ERR_PTR(-ENOMEM);
 
-	state->private_objs[index].obj = obj;
-	state->private_objs[index].funcs = funcs;
+	state->private_objs[index].state = obj_state;
+	state->private_objs[index].old_state = obj->state;
+	state->private_objs[index].new_state = obj_state;
+	state->private_objs[index].ptr = obj;
+
 	state->num_private_objs = num_objs;
 
-	DRM_DEBUG_ATOMIC("Added new private object state %p to %p\n",
-			 state->private_objs[index].obj_state, state);
+	DRM_DEBUG_ATOMIC("Added new private object %p state %p to %p\n",
+			 obj, obj_state, state);
 
-	return state->private_objs[index].obj_state;
+	return obj_state;
 }
 EXPORT_SYMBOL(drm_atomic_get_private_obj_state);
 
@@ -1135,7 +1144,7 @@ EXPORT_SYMBOL(drm_atomic_get_connector_state);
  * RETURNS:
  * Zero on success, error code on failure
  */
-int drm_atomic_connector_set_property(struct drm_connector *connector,
+static int drm_atomic_connector_set_property(struct drm_connector *connector,
 		struct drm_connector_state *state, struct drm_property *property,
 		uint64_t val)
 {
@@ -1202,7 +1211,6 @@ int drm_atomic_connector_set_property(struct drm_connector *connector,
 
 	return 0;
 }
-EXPORT_SYMBOL(drm_atomic_connector_set_property);
 
 static void drm_atomic_connector_print_state(struct drm_printer *p,
 		const struct drm_connector_state *state)
@@ -1580,38 +1588,6 @@ drm_atomic_add_affected_planes(struct drm_atomic_state *state,
 EXPORT_SYMBOL(drm_atomic_add_affected_planes);
 
 /**
- * drm_atomic_legacy_backoff - locking backoff for legacy ioctls
- * @state: atomic state
- *
- * This function should be used by legacy entry points which don't understand
- * -EDEADLK semantics. For simplicity this one will grab all modeset locks after
- * the slowpath completed.
- */
-void drm_atomic_legacy_backoff(struct drm_atomic_state *state)
-{
-	struct drm_device *dev = state->dev;
-	int ret;
-	bool global = false;
-
-	if (WARN_ON(dev->mode_config.acquire_ctx == state->acquire_ctx)) {
-		global = true;
-
-		dev->mode_config.acquire_ctx = NULL;
-	}
-
-retry:
-	drm_modeset_backoff(state->acquire_ctx);
-
-	ret = drm_modeset_lock_all_ctx(dev, state->acquire_ctx);
-	if (ret)
-		goto retry;
-
-	if (global)
-		dev->mode_config.acquire_ctx = state->acquire_ctx;
-}
-EXPORT_SYMBOL(drm_atomic_legacy_backoff);
-
-/**
  * drm_atomic_check_only - check whether a given config would work
  * @state: atomic configuration to check
  *
@@ -1857,9 +1833,60 @@ static struct drm_pending_vblank_event *create_vblank_event(
 	return e;
 }
 
-static int atomic_set_prop(struct drm_atomic_state *state,
-		struct drm_mode_object *obj, struct drm_property *prop,
-		uint64_t prop_value)
+int drm_atomic_connector_commit_dpms(struct drm_atomic_state *state,
+				     struct drm_connector *connector,
+				     int mode)
+{
+	struct drm_connector *tmp_connector;
+	struct drm_connector_state *new_conn_state;
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *crtc_state;
+	int i, ret, old_mode = connector->dpms;
+	bool active = false;
+
+	ret = drm_modeset_lock(&state->dev->mode_config.connection_mutex,
+			       state->acquire_ctx);
+	if (ret)
+		return ret;
+
+	if (mode != DRM_MODE_DPMS_ON)
+		mode = DRM_MODE_DPMS_OFF;
+	connector->dpms = mode;
+
+	crtc = connector->state->crtc;
+	if (!crtc)
+		goto out;
+	ret = drm_atomic_add_affected_connectors(state, crtc);
+	if (ret)
+		goto out;
+
+	crtc_state = drm_atomic_get_crtc_state(state, crtc);
+	if (IS_ERR(crtc_state)) {
+		ret = PTR_ERR(crtc_state);
+		goto out;
+	}
+
+	for_each_new_connector_in_state(state, tmp_connector, new_conn_state, i) {
+		if (new_conn_state->crtc != crtc)
+			continue;
+		if (tmp_connector->dpms == DRM_MODE_DPMS_ON) {
+			active = true;
+			break;
+		}
+	}
+
+	crtc_state->active = active;
+	ret = drm_atomic_commit(state);
+out:
+	if (ret != 0)
+		connector->dpms = old_mode;
+	return ret;
+}
+
+int drm_atomic_set_property(struct drm_atomic_state *state,
+			    struct drm_mode_object *obj,
+			    struct drm_property *prop,
+			    uint64_t prop_value)
 {
 	struct drm_mode_object *ref;
 	int ret;
@@ -2042,7 +2069,7 @@ static int prepare_crtc_signaling(struct drm_device *dev,
 {
 	struct drm_crtc *crtc;
 	struct drm_crtc_state *crtc_state;
-	int i, ret;
+	int i, c = 0, ret;
 
 	if (arg->flags & DRM_MODE_ATOMIC_TEST_ONLY)
 		return 0;
@@ -2103,8 +2130,17 @@ static int prepare_crtc_signaling(struct drm_device *dev,
 
 			crtc_state->event->base.fence = fence;
 		}
+
+		c++;
 	}
 
+	/*
+	 * With this flag set, userspace would wait for an event that can never
+	 * arrive, because there is no CRTC to signal it.
+	 */
+	if (c == 0 && (arg->flags & DRM_MODE_PAGE_FLIP_EVENT))
+		return -EINVAL;
+
 	return 0;
 }
 
@@ -2272,7 +2308,8 @@ int drm_mode_atomic_ioctl(struct drm_device *dev,
 				goto out;
 			}
 
-			ret = atomic_set_prop(state, obj, prop, prop_value);
+			ret = drm_atomic_set_property(state, obj, prop,
+						      prop_value);
 			if (ret) {
 				drm_mode_object_put(obj);
 				goto out;
diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 86d3093..4e53aae 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -795,6 +795,9 @@ int drm_atomic_helper_check(struct drm_device *dev,
 	if (ret)
 		return ret;
 
+	if (state->legacy_cursor_update)
+		state->async_update = !drm_atomic_helper_async_check(dev, state);
+
 	return ret;
 }
 EXPORT_SYMBOL(drm_atomic_helper_check);
@@ -918,16 +921,12 @@ drm_atomic_helper_update_legacy_modeset_state(struct drm_device *dev,
 		crtc = new_conn_state->crtc;
 		if ((!crtc && old_conn_state->crtc) ||
 		    (crtc && drm_atomic_crtc_needs_modeset(crtc->state))) {
-			struct drm_property *dpms_prop =
-				dev->mode_config.dpms_property;
 			int mode = DRM_MODE_DPMS_OFF;
 
 			if (crtc && crtc->state->active)
 				mode = DRM_MODE_DPMS_ON;
 
 			connector->dpms = mode;
-			drm_object_property_set_value(&connector->base,
-						      dpms_prop, mode);
 		}
 	}
 
@@ -1069,12 +1068,13 @@ void drm_atomic_helper_commit_modeset_enables(struct drm_device *dev,
 					      struct drm_atomic_state *old_state)
 {
 	struct drm_crtc *crtc;
+	struct drm_crtc_state *old_crtc_state;
 	struct drm_crtc_state *new_crtc_state;
 	struct drm_connector *connector;
 	struct drm_connector_state *new_conn_state;
 	int i;
 
-	for_each_new_crtc_in_state(old_state, crtc, new_crtc_state, i) {
+	for_each_oldnew_crtc_in_state(old_state, crtc, old_crtc_state, new_crtc_state, i) {
 		const struct drm_crtc_helper_funcs *funcs;
 
 		/* Need to filter out CRTCs where only planes change. */
@@ -1090,8 +1090,8 @@ void drm_atomic_helper_commit_modeset_enables(struct drm_device *dev,
 			DRM_DEBUG_ATOMIC("enabling [CRTC:%d:%s]\n",
 					 crtc->base.id, crtc->name);
 
-			if (funcs->enable)
-				funcs->enable(crtc);
+			if (funcs->atomic_enable)
+				funcs->atomic_enable(crtc, old_crtc_state);
 			else
 				funcs->commit(crtc);
 		}
@@ -1191,9 +1191,13 @@ EXPORT_SYMBOL(drm_atomic_helper_wait_for_fences);
  *
  * Helper to, after atomic commit, wait for vblanks on all effected
  * crtcs (ie. before cleaning up old framebuffers using
- * drm_atomic_helper_cleanup_planes()). It will only wait on crtcs where the
+ * drm_atomic_helper_cleanup_planes()). It will only wait on CRTCs where the
  * framebuffers have actually changed to optimize for the legacy cursor and
  * plane update use-case.
+ *
+ * Drivers using the nonblocking commit tracking support initialized by calling
+ * drm_atomic_helper_setup_commit() should look at
+ * drm_atomic_helper_wait_for_flip_done() as an alternative.
  */
 void
 drm_atomic_helper_wait_for_vblanks(struct drm_device *dev,
@@ -1241,27 +1245,54 @@ drm_atomic_helper_wait_for_vblanks(struct drm_device *dev,
 EXPORT_SYMBOL(drm_atomic_helper_wait_for_vblanks);
 
 /**
+ * drm_atomic_helper_wait_for_flip_done - wait for all page flips to be done
+ * @dev: DRM device
+ * @old_state: atomic state object with old state structures
+ *
+ * Helper to, after atomic commit, wait for page flips on all affected
+ * CRTCs (i.e. before cleaning up old framebuffers using
+ * drm_atomic_helper_cleanup_planes()). Compared to
+ * drm_atomic_helper_wait_for_vblanks() this waits for completion on all
+ * CRTCs, assuming that cursor-only updates signal their completion
+ * immediately (or using a different path).
+ *
+ * This requires that drivers use the nonblocking commit tracking support
+ * initialized using drm_atomic_helper_setup_commit().
+ */
+void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
+					  struct drm_atomic_state *old_state)
+{
+	struct drm_crtc_state *unused;
+	struct drm_crtc *crtc;
+	int i;
+
+	for_each_new_crtc_in_state(old_state, crtc, unused, i) {
+		struct drm_crtc_commit *commit = old_state->crtcs[i].commit;
+		int ret;
+
+		if (!commit)
+			continue;
+
+		ret = wait_for_completion_timeout(&commit->flip_done, 10 * HZ);
+		if (ret == 0)
+			DRM_ERROR("[CRTC:%d:%s] flip_done timed out\n",
+				  crtc->base.id, crtc->name);
+	}
+}
+EXPORT_SYMBOL(drm_atomic_helper_wait_for_flip_done);
+
+/**
  * drm_atomic_helper_commit_tail - commit atomic update to hardware
  * @old_state: atomic state object with old state structures
  *
  * This is the default implementation for the
- * &drm_mode_config_helper_funcs.atomic_commit_tail hook.
+ * &drm_mode_config_helper_funcs.atomic_commit_tail hook, for drivers
+ * that do not support runtime_pm or do not need the CRTC to be
+ * enabled to perform a commit. Otherwise, see
+ * drm_atomic_helper_commit_tail_rpm().
  *
  * Note that the default ordering of how the various stages are called is to
- * match the legacy modeset helper library closest. One peculiarity of that is
- * that it doesn't mesh well with runtime PM at all.
- *
- * For drivers supporting runtime PM the recommended sequence is instead ::
- *
- *     drm_atomic_helper_commit_modeset_disables(dev, old_state);
- *
- *     drm_atomic_helper_commit_modeset_enables(dev, old_state);
- *
- *     drm_atomic_helper_commit_planes(dev, old_state,
- *                                     DRM_PLANE_COMMIT_ACTIVE_ONLY);
- *
- * for committing the atomic update to hardware.  See the kerneldoc entries for
- * these three functions for more details.
+ * match the legacy modeset helper library closest.
  */
 void drm_atomic_helper_commit_tail(struct drm_atomic_state *old_state)
 {
@@ -1281,6 +1312,35 @@ void drm_atomic_helper_commit_tail(struct drm_atomic_state *old_state)
 }
 EXPORT_SYMBOL(drm_atomic_helper_commit_tail);
 
+/**
+ * drm_atomic_helper_commit_tail_rpm - commit atomic update to hardware
+ * @old_state: new modeset state to be committed
+ *
+ * This is an alternative implementation for the
+ * &drm_mode_config_helper_funcs.atomic_commit_tail hook, for drivers
+ * that support runtime_pm or need the CRTC to be enabled to perform a
+ * commit. Otherwise, one should use the default implementation
+ * drm_atomic_helper_commit_tail().
+ */
+void drm_atomic_helper_commit_tail_rpm(struct drm_atomic_state *old_state)
+{
+	struct drm_device *dev = old_state->dev;
+
+	drm_atomic_helper_commit_modeset_disables(dev, old_state);
+
+	drm_atomic_helper_commit_modeset_enables(dev, old_state);
+
+	drm_atomic_helper_commit_planes(dev, old_state,
+					DRM_PLANE_COMMIT_ACTIVE_ONLY);
+
+	drm_atomic_helper_commit_hw_done(old_state);
+
+	drm_atomic_helper_wait_for_vblanks(dev, old_state);
+
+	drm_atomic_helper_cleanup_planes(dev, old_state);
+}
+EXPORT_SYMBOL(drm_atomic_helper_commit_tail_rpm);
+
 static void commit_tail(struct drm_atomic_state *old_state)
 {
 	struct drm_device *dev = old_state->dev;
@@ -1311,6 +1371,114 @@ static void commit_work(struct work_struct *work)
 }
 
 /**
+ * drm_atomic_helper_async_check - check if state can be committed asynchronously
+ * @dev: DRM device
+ * @state: the driver state object
+ *
+ * This helper will check if it is possible to commit the state asynchronously.
+ * Async commits are not supposed to swap the states like normal sync commits
+ * but just do in-place changes on the current state.
+ *
+ * It returns 0 if the commit can happen asynchronously, or an error if not.
+ * An error only means the state can't be committed asynchronously; in that
+ * case the commit should be treated like a normal synchronous commit.
+ */
+int drm_atomic_helper_async_check(struct drm_device *dev,
+				   struct drm_atomic_state *state)
+{
+	struct drm_crtc *crtc;
+	struct drm_crtc_state *crtc_state;
+	struct drm_crtc_commit *commit;
+	struct drm_plane *__plane, *plane = NULL;
+	struct drm_plane_state *__plane_state, *plane_state = NULL;
+	const struct drm_plane_helper_funcs *funcs;
+	int i, j, n_planes = 0;
+
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
+		if (drm_atomic_crtc_needs_modeset(crtc_state))
+			return -EINVAL;
+	}
+
+	for_each_new_plane_in_state(state, __plane, __plane_state, i) {
+		n_planes++;
+		plane = __plane;
+		plane_state = __plane_state;
+	}
+
+	/* FIXME: we support only single plane updates for now */
+	if (!plane || n_planes != 1)
+		return -EINVAL;
+
+	if (!plane_state->crtc)
+		return -EINVAL;
+
+	funcs = plane->helper_private;
+	if (!funcs->atomic_async_update)
+		return -EINVAL;
+
+	if (plane_state->fence)
+		return -EINVAL;
+
+	/*
+	 * Don't do an async update if there is an outstanding commit modifying
+	 * the plane.  This prevents our async update's changes from getting
+	 * overridden by a previous synchronous update's state.
+	 */
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
+		if (plane->crtc != crtc)
+			continue;
+
+		spin_lock(&crtc->commit_lock);
+		commit = list_first_entry_or_null(&crtc->commit_list,
+						  struct drm_crtc_commit,
+						  commit_entry);
+		if (!commit) {
+			spin_unlock(&crtc->commit_lock);
+			continue;
+		}
+		spin_unlock(&crtc->commit_lock);
+
+		if (!crtc->state->state)
+			continue;
+
+		for_each_plane_in_state(crtc->state->state, __plane,
+					__plane_state, j) {
+			if (__plane == plane)
+				return -EINVAL;
+		}
+	}
+
+	return funcs->atomic_async_check(plane, plane_state);
+}
+EXPORT_SYMBOL(drm_atomic_helper_async_check);
+
+/**
+ * drm_atomic_helper_async_commit - commit state asynchronously
+ * @dev: DRM device
+ * @state: the driver state object
+ *
+ * This function commits a state asynchronously, i.e., not vblank
+ * synchronized. It should be used on a state only when
+ * drm_atomic_helper_async_check() succeeds. Async commits are not supposed
+ * to swap the states like normal sync commits, but just do in-place changes
+ * on the current state.
+ */
+void drm_atomic_helper_async_commit(struct drm_device *dev,
+				    struct drm_atomic_state *state)
+{
+	struct drm_plane *plane;
+	struct drm_plane_state *plane_state;
+	const struct drm_plane_helper_funcs *funcs;
+	int i;
+
+	for_each_new_plane_in_state(state, plane, plane_state, i) {
+		funcs = plane->helper_private;
+		funcs->atomic_async_update(plane, plane_state);
+	}
+}
+EXPORT_SYMBOL(drm_atomic_helper_async_commit);
+
+/**
  * drm_atomic_helper_commit - commit validated state object
  * @dev: DRM device
  * @state: the driver state object
@@ -1334,6 +1502,17 @@ int drm_atomic_helper_commit(struct drm_device *dev,
 {
 	int ret;
 
+	if (state->async_update) {
+		ret = drm_atomic_helper_prepare_planes(dev, state);
+		if (ret)
+			return ret;
+
+		drm_atomic_helper_async_commit(dev, state);
+		drm_atomic_helper_cleanup_planes(dev, state);
+
+		return 0;
+	}
+
 	ret = drm_atomic_helper_setup_commit(state, nonblock);
 	if (ret)
 		return ret;
@@ -1346,10 +1525,8 @@ int drm_atomic_helper_commit(struct drm_device *dev,
 
 	if (!nonblock) {
 		ret = drm_atomic_helper_wait_for_fences(dev, state, true);
-		if (ret) {
-			drm_atomic_helper_cleanup_planes(dev, state);
-			return ret;
-		}
+		if (ret)
+			goto err;
 	}
 
 	/*
@@ -1358,7 +1535,9 @@ int drm_atomic_helper_commit(struct drm_device *dev,
 	 * the software side now.
 	 */
 
-	drm_atomic_helper_swap_state(state, true);
+	ret = drm_atomic_helper_swap_state(state, true);
+	if (ret)
+		goto err;
 
 	/*
 	 * Everything below can be run asynchronously without the need to grab
@@ -1387,6 +1566,10 @@ int drm_atomic_helper_commit(struct drm_device *dev,
 		commit_tail(state);
 
 	return 0;
+
+err:
+	drm_atomic_helper_cleanup_planes(dev, state);
+	return ret;
 }
 EXPORT_SYMBOL(drm_atomic_helper_commit);
 
@@ -1680,9 +1863,7 @@ void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *old_state)
 
 		/* backend must have consumed any event by now */
 		WARN_ON(new_crtc_state->event);
-		spin_lock(&crtc->commit_lock);
 		complete_all(&commit->hw_done);
-		spin_unlock(&crtc->commit_lock);
 	}
 }
 EXPORT_SYMBOL(drm_atomic_helper_commit_hw_done);
@@ -1711,7 +1892,6 @@ void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *old_state)
 		if (WARN_ON(!commit))
 			continue;
 
-		spin_lock(&crtc->commit_lock);
 		complete_all(&commit->cleanup_done);
 		WARN_ON(!try_wait_for_completion(&commit->hw_done));
 
@@ -1721,8 +1901,6 @@ void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *old_state)
 		if (try_wait_for_completion(&commit->flip_done))
 			goto del_commit;
 
-		spin_unlock(&crtc->commit_lock);
-
 		/* We must wait for the vblank event to signal our completion
 		 * before releasing our reference, since the vblank work does
 		 * not hold a reference of its own. */
@@ -1732,8 +1910,8 @@ void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *old_state)
 			DRM_ERROR("[CRTC:%d:%s] flip_done timed out\n",
 				  crtc->base.id, crtc->name);
 
-		spin_lock(&crtc->commit_lock);
 del_commit:
+		spin_lock(&crtc->commit_lock);
 		list_del(&commit->commit_entry);
 		spin_unlock(&crtc->commit_lock);
 	}
@@ -2069,14 +2247,14 @@ EXPORT_SYMBOL(drm_atomic_helper_cleanup_planes);
 /**
  * drm_atomic_helper_swap_state - store atomic state into current sw state
  * @state: atomic state
- * @stall: stall for proceeding commits
+ * @stall: stall for preceding commits
  *
  * This function stores the atomic state into the current state pointers in all
  * driver objects. It should be called after all failing steps have been done
  * and succeeded, but before the actual hardware state is committed.
  *
  * For cleanup and error recovery the current state for all changed objects will
- * be swaped into @state.
+ * be swapped into @state.
  *
  * With that sequence it fits perfectly into the plane prepare/cleanup sequence:
  *
@@ -2095,12 +2273,16 @@ EXPORT_SYMBOL(drm_atomic_helper_cleanup_planes);
  * the &drm_plane.state, &drm_crtc.state or &drm_connector.state pointer. With
  * the current atomic helpers this is almost always the case, since the helpers
  * don't pass the right state structures to the callbacks.
+ *
+ * Returns:
+ *
+ * Returns 0 on success. Can return -ERESTARTSYS when @stall is true and the
+ * waiting for the previous commits has been interrupted.
  */
-void drm_atomic_helper_swap_state(struct drm_atomic_state *state,
+int drm_atomic_helper_swap_state(struct drm_atomic_state *state,
 				  bool stall)
 {
-	int i;
-	long ret;
+	int i, ret;
 	struct drm_connector *connector;
 	struct drm_connector_state *old_conn_state, *new_conn_state;
 	struct drm_crtc *crtc;
@@ -2108,8 +2290,8 @@ void drm_atomic_helper_swap_state(struct drm_atomic_state *state,
 	struct drm_plane *plane;
 	struct drm_plane_state *old_plane_state, *new_plane_state;
 	struct drm_crtc_commit *commit;
-	void *obj, *obj_state;
-	const struct drm_private_state_funcs *funcs;
+	struct drm_private_obj *obj;
+	struct drm_private_state *old_obj_state, *new_obj_state;
 
 	if (stall) {
 		for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
@@ -2123,12 +2305,11 @@ void drm_atomic_helper_swap_state(struct drm_atomic_state *state,
 			if (!commit)
 				continue;
 
-			ret = wait_for_completion_timeout(&commit->hw_done,
-							  10*HZ);
-			if (ret == 0)
-				DRM_ERROR("[CRTC:%d:%s] hw_done timed out\n",
-					  crtc->base.id, crtc->name);
+			ret = wait_for_completion_interruptible(&commit->hw_done);
 			drm_crtc_commit_put(commit);
+
+			if (ret)
+				return ret;
 		}
 	}
 
@@ -2171,8 +2352,17 @@ void drm_atomic_helper_swap_state(struct drm_atomic_state *state,
 		plane->state = new_plane_state;
 	}
 
-	__for_each_private_obj(state, obj, obj_state, i, funcs)
-		funcs->swap_state(obj, &state->private_objs[i].obj_state);
+	for_each_oldnew_private_obj_in_state(state, obj, old_obj_state, new_obj_state, i) {
+		WARN_ON(obj->state != old_obj_state);
+
+		old_obj_state->state = state;
+		new_obj_state->state = NULL;
+
+		state->private_objs[i].state = old_obj_state;
+		obj->state = new_obj_state;
+	}
+
+	return 0;
 }
 EXPORT_SYMBOL(drm_atomic_helper_swap_state);
 
@@ -2526,6 +2716,7 @@ int drm_atomic_helper_disable_all(struct drm_device *dev,
 	struct drm_plane *plane;
 	struct drm_crtc_state *crtc_state;
 	struct drm_crtc *crtc;
+	unsigned plane_mask = 0;
 	int ret, i;
 
 	state = drm_atomic_state_alloc(dev);
@@ -2556,22 +2747,26 @@ int drm_atomic_helper_disable_all(struct drm_device *dev,
 			goto free;
 	}
 
-	for_each_connector_in_state(state, conn, conn_state, i) {
+	for_each_new_connector_in_state(state, conn, conn_state, i) {
 		ret = drm_atomic_set_crtc_for_connector(conn_state, NULL);
 		if (ret < 0)
 			goto free;
 	}
 
-	for_each_plane_in_state(state, plane, plane_state, i) {
+	for_each_new_plane_in_state(state, plane, plane_state, i) {
 		ret = drm_atomic_set_crtc_for_plane(plane_state, NULL);
 		if (ret < 0)
 			goto free;
 
 		drm_atomic_set_fb_for_plane(plane_state, NULL);
+		plane_mask |= BIT(drm_plane_index(plane));
+		plane->old_fb = plane->fb;
 	}
 
 	ret = drm_atomic_commit(state);
 free:
+	if (plane_mask)
+		drm_atomic_clean_old_fb(dev, plane_mask, ret);
 	drm_atomic_state_put(state);
 	return ret;
 }
@@ -2702,11 +2897,16 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state,
 	struct drm_connector_state *new_conn_state;
 	struct drm_crtc *crtc;
 	struct drm_crtc_state *new_crtc_state;
+	unsigned plane_mask = 0;
+	struct drm_device *dev = state->dev;
+	int ret;
 
 	state->acquire_ctx = ctx;
 
-	for_each_new_plane_in_state(state, plane, new_plane_state, i)
+	for_each_new_plane_in_state(state, plane, new_plane_state, i) {
+		plane_mask |= BIT(drm_plane_index(plane));
 		state->planes[i].old_state = plane->state;
+	}
 
 	for_each_new_crtc_in_state(state, crtc, new_crtc_state, i)
 		state->crtcs[i].old_state = crtc->state;
@@ -2714,7 +2914,11 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state,
 	for_each_new_connector_in_state(state, connector, new_conn_state, i)
 		state->connectors[i].old_state = connector->state;
 
-	return drm_atomic_commit(state);
+	ret = drm_atomic_commit(state);
+	if (plane_mask)
+		drm_atomic_clean_old_fb(dev, plane_mask, ret);
+
+	return ret;
 }
 EXPORT_SYMBOL(drm_atomic_helper_commit_duplicated_state);
 
@@ -2763,177 +2967,11 @@ int drm_atomic_helper_resume(struct drm_device *dev,
 }
 EXPORT_SYMBOL(drm_atomic_helper_resume);
 
-/**
- * drm_atomic_helper_crtc_set_property - helper for crtc properties
- * @crtc: DRM crtc
- * @property: DRM property
- * @val: value of property
- *
- * Provides a default crtc set_property handler using the atomic driver
- * interface.
- *
- * RETURNS:
- * Zero on success, error code on failure
- */
-int
-drm_atomic_helper_crtc_set_property(struct drm_crtc *crtc,
-				    struct drm_property *property,
-				    uint64_t val)
-{
-	struct drm_atomic_state *state;
-	struct drm_crtc_state *crtc_state;
-	int ret = 0;
-
-	state = drm_atomic_state_alloc(crtc->dev);
-	if (!state)
-		return -ENOMEM;
-
-	/* ->set_property is always called with all locks held. */
-	state->acquire_ctx = crtc->dev->mode_config.acquire_ctx;
-retry:
-	crtc_state = drm_atomic_get_crtc_state(state, crtc);
-	if (IS_ERR(crtc_state)) {
-		ret = PTR_ERR(crtc_state);
-		goto fail;
-	}
-
-	ret = drm_atomic_crtc_set_property(crtc, crtc_state,
-			property, val);
-	if (ret)
-		goto fail;
-
-	ret = drm_atomic_commit(state);
-fail:
-	if (ret == -EDEADLK)
-		goto backoff;
-
-	drm_atomic_state_put(state);
-	return ret;
-
-backoff:
-	drm_atomic_state_clear(state);
-	drm_atomic_legacy_backoff(state);
-
-	goto retry;
-}
-EXPORT_SYMBOL(drm_atomic_helper_crtc_set_property);
-
-/**
- * drm_atomic_helper_plane_set_property - helper for plane properties
- * @plane: DRM plane
- * @property: DRM property
- * @val: value of property
- *
- * Provides a default plane set_property handler using the atomic driver
- * interface.
- *
- * RETURNS:
- * Zero on success, error code on failure
- */
-int
-drm_atomic_helper_plane_set_property(struct drm_plane *plane,
-				    struct drm_property *property,
-				    uint64_t val)
-{
-	struct drm_atomic_state *state;
-	struct drm_plane_state *plane_state;
-	int ret = 0;
-
-	state = drm_atomic_state_alloc(plane->dev);
-	if (!state)
-		return -ENOMEM;
-
-	/* ->set_property is always called with all locks held. */
-	state->acquire_ctx = plane->dev->mode_config.acquire_ctx;
-retry:
-	plane_state = drm_atomic_get_plane_state(state, plane);
-	if (IS_ERR(plane_state)) {
-		ret = PTR_ERR(plane_state);
-		goto fail;
-	}
-
-	ret = drm_atomic_plane_set_property(plane, plane_state,
-			property, val);
-	if (ret)
-		goto fail;
-
-	ret = drm_atomic_commit(state);
-fail:
-	if (ret == -EDEADLK)
-		goto backoff;
-
-	drm_atomic_state_put(state);
-	return ret;
-
-backoff:
-	drm_atomic_state_clear(state);
-	drm_atomic_legacy_backoff(state);
-
-	goto retry;
-}
-EXPORT_SYMBOL(drm_atomic_helper_plane_set_property);
-
-/**
- * drm_atomic_helper_connector_set_property - helper for connector properties
- * @connector: DRM connector
- * @property: DRM property
- * @val: value of property
- *
- * Provides a default connector set_property handler using the atomic driver
- * interface.
- *
- * RETURNS:
- * Zero on success, error code on failure
- */
-int
-drm_atomic_helper_connector_set_property(struct drm_connector *connector,
-				    struct drm_property *property,
-				    uint64_t val)
-{
-	struct drm_atomic_state *state;
-	struct drm_connector_state *connector_state;
-	int ret = 0;
-
-	state = drm_atomic_state_alloc(connector->dev);
-	if (!state)
-		return -ENOMEM;
-
-	/* ->set_property is always called with all locks held. */
-	state->acquire_ctx = connector->dev->mode_config.acquire_ctx;
-retry:
-	connector_state = drm_atomic_get_connector_state(state, connector);
-	if (IS_ERR(connector_state)) {
-		ret = PTR_ERR(connector_state);
-		goto fail;
-	}
-
-	ret = drm_atomic_connector_set_property(connector, connector_state,
-			property, val);
-	if (ret)
-		goto fail;
-
-	ret = drm_atomic_commit(state);
-fail:
-	if (ret == -EDEADLK)
-		goto backoff;
-
-	drm_atomic_state_put(state);
-	return ret;
-
-backoff:
-	drm_atomic_state_clear(state);
-	drm_atomic_legacy_backoff(state);
-
-	goto retry;
-}
-EXPORT_SYMBOL(drm_atomic_helper_connector_set_property);
-
-static int page_flip_common(
-				struct drm_atomic_state *state,
-				struct drm_crtc *crtc,
-				struct drm_framebuffer *fb,
-				struct drm_pending_vblank_event *event,
-				uint32_t flags)
+static int page_flip_common(struct drm_atomic_state *state,
+			    struct drm_crtc *crtc,
+			    struct drm_framebuffer *fb,
+			    struct drm_pending_vblank_event *event,
+			    uint32_t flags)
 {
 	struct drm_plane *plane = crtc->primary;
 	struct drm_plane_state *plane_state;
@@ -3027,13 +3065,12 @@ EXPORT_SYMBOL(drm_atomic_helper_page_flip);
  * Returns:
  * Returns 0 on success, negative errno numbers on failure.
  */
-int drm_atomic_helper_page_flip_target(
-				struct drm_crtc *crtc,
-				struct drm_framebuffer *fb,
-				struct drm_pending_vblank_event *event,
-				uint32_t flags,
-				uint32_t target,
-				struct drm_modeset_acquire_ctx *ctx)
+int drm_atomic_helper_page_flip_target(struct drm_crtc *crtc,
+				       struct drm_framebuffer *fb,
+				       struct drm_pending_vblank_event *event,
+				       uint32_t flags,
+				       uint32_t target,
+				       struct drm_modeset_acquire_ctx *ctx)
 {
 	struct drm_plane *plane = crtc->primary;
 	struct drm_atomic_state *state;
@@ -3065,85 +3102,6 @@ int drm_atomic_helper_page_flip_target(
 EXPORT_SYMBOL(drm_atomic_helper_page_flip_target);
 
 /**
- * drm_atomic_helper_connector_dpms() - connector dpms helper implementation
- * @connector: affected connector
- * @mode: DPMS mode
- *
- * This is the main helper function provided by the atomic helper framework for
- * implementing the legacy DPMS connector interface. It computes the new desired
- * &drm_crtc_state.active state for the corresponding CRTC (if the connector is
- * enabled) and updates it.
- *
- * Returns:
- * Returns 0 on success, negative errno numbers on failure.
- */
-int drm_atomic_helper_connector_dpms(struct drm_connector *connector,
-				     int mode)
-{
-	struct drm_mode_config *config = &connector->dev->mode_config;
-	struct drm_atomic_state *state;
-	struct drm_crtc_state *crtc_state;
-	struct drm_crtc *crtc;
-	struct drm_connector *tmp_connector;
-	struct drm_connector_list_iter conn_iter;
-	int ret;
-	bool active = false;
-	int old_mode = connector->dpms;
-
-	if (mode != DRM_MODE_DPMS_ON)
-		mode = DRM_MODE_DPMS_OFF;
-
-	connector->dpms = mode;
-	crtc = connector->state->crtc;
-
-	if (!crtc)
-		return 0;
-
-	state = drm_atomic_state_alloc(connector->dev);
-	if (!state)
-		return -ENOMEM;
-
-	state->acquire_ctx = crtc->dev->mode_config.acquire_ctx;
-retry:
-	crtc_state = drm_atomic_get_crtc_state(state, crtc);
-	if (IS_ERR(crtc_state)) {
-		ret = PTR_ERR(crtc_state);
-		goto fail;
-	}
-
-	WARN_ON(!drm_modeset_is_locked(&config->connection_mutex));
-
-	drm_connector_list_iter_begin(connector->dev, &conn_iter);
-	drm_for_each_connector_iter(tmp_connector, &conn_iter) {
-		if (tmp_connector->state->crtc != crtc)
-			continue;
-
-		if (tmp_connector->dpms == DRM_MODE_DPMS_ON) {
-			active = true;
-			break;
-		}
-	}
-	drm_connector_list_iter_end(&conn_iter);
-	crtc_state->active = active;
-
-	ret = drm_atomic_commit(state);
-fail:
-	if (ret == -EDEADLK)
-		goto backoff;
-	if (ret != 0)
-		connector->dpms = old_mode;
-	drm_atomic_state_put(state);
-	return ret;
-
-backoff:
-	drm_atomic_state_clear(state);
-	drm_atomic_legacy_backoff(state);
-
-	goto retry;
-}
-EXPORT_SYMBOL(drm_atomic_helper_connector_dpms);
-
-/**
  * drm_atomic_helper_best_encoder - Helper for
  * 	&drm_connector_helper_funcs.best_encoder callback
  * @connector: Connector control structure
@@ -3612,12 +3570,12 @@ int drm_atomic_helper_legacy_gamma_set(struct drm_crtc *crtc,
 				       struct drm_modeset_acquire_ctx *ctx)
 {
 	struct drm_device *dev = crtc->dev;
-	struct drm_mode_config *config = &dev->mode_config;
 	struct drm_atomic_state *state;
 	struct drm_crtc_state *crtc_state;
 	struct drm_property_blob *blob = NULL;
 	struct drm_color_lut *blob_data;
 	int i, ret = 0;
+	bool replaced;
 
 	state = drm_atomic_state_alloc(crtc->dev);
 	if (!state)
@@ -3648,20 +3606,10 @@ int drm_atomic_helper_legacy_gamma_set(struct drm_crtc *crtc,
 	}
 
 	/* Reset DEGAMMA_LUT and CTM properties. */
-	ret = drm_atomic_crtc_set_property(crtc, crtc_state,
-			config->degamma_lut_property, 0);
-	if (ret)
-		goto fail;
-
-	ret = drm_atomic_crtc_set_property(crtc, crtc_state,
-			config->ctm_property, 0);
-	if (ret)
-		goto fail;
-
-	ret = drm_atomic_crtc_set_property(crtc, crtc_state,
-			config->gamma_lut_property, blob->base.id);
-	if (ret)
-		goto fail;
+	replaced  = drm_property_replace_blob(&crtc_state->degamma_lut, NULL);
+	replaced |= drm_property_replace_blob(&crtc_state->ctm, NULL);
+	replaced |= drm_property_replace_blob(&crtc_state->gamma_lut, blob);
+	crtc_state->color_mgmt_changed |= replaced;
 
 	ret = drm_atomic_commit(state);
 
@@ -3671,3 +3619,18 @@ int drm_atomic_helper_legacy_gamma_set(struct drm_crtc *crtc,
 	return ret;
 }
 EXPORT_SYMBOL(drm_atomic_helper_legacy_gamma_set);
+
+/**
+ * __drm_atomic_helper_private_obj_duplicate_state - copy atomic private state
+ * @obj: private object
+ * @state: new private object state
+ *
+ * Copies atomic state from a private object's current state and resets inferred values.
+ * This is useful for drivers that subclass the private state.
+ */
+void __drm_atomic_helper_private_obj_duplicate_state(struct drm_private_obj *obj,
+						     struct drm_private_state *state)
+{
+	memcpy(state, obj->state, sizeof(*state));
+}
+EXPORT_SYMBOL(__drm_atomic_helper_private_obj_duplicate_state);
diff --git a/drivers/gpu/drm/drm_color_mgmt.c b/drivers/gpu/drm/drm_color_mgmt.c
index 3eda500..fe09827 100644
--- a/drivers/gpu/drm/drm_color_mgmt.c
+++ b/drivers/gpu/drm/drm_color_mgmt.c
@@ -128,6 +128,9 @@ EXPORT_SYMBOL(drm_color_lut_extract);
  * optional. The gamma and degamma properties are only attached if
  * their size is not 0 and ctm_property is only attached if has_ctm is
  * true.
+ *
+ * Drivers should use drm_atomic_helper_legacy_gamma_set() to implement the
+ * legacy &drm_crtc_funcs.gamma_set callback.
  */
 void drm_crtc_enable_color_mgmt(struct drm_crtc *crtc,
 				uint degamma_lut_size,
diff --git a/drivers/gpu/drm/drm_connector.c b/drivers/gpu/drm/drm_connector.c
index 8072e6e..ba9f36c 100644
--- a/drivers/gpu/drm/drm_connector.c
+++ b/drivers/gpu/drm/drm_connector.c
@@ -717,9 +717,9 @@ DRM_ENUM_NAME_FN(drm_get_tv_subconnector_name,
  * 	drivers, it remaps to controlling the "ACTIVE" property on the CRTC the
  * 	connector is linked to. Drivers should never set this property directly,
  * 	it is handled by the DRM core by calling the &drm_connector_funcs.dpms
- * 	callback. Atomic drivers should implement this hook using
- * 	drm_atomic_helper_connector_dpms(). This is the only property standard
- * 	connector property that userspace can change.
+ * 	callback. For atomic drivers the remapping to the "ACTIVE" property is
+ * 	implemented in the DRM core.  This is the only standard connector
+ * 	property that userspace can change.
  * PATH:
  * 	Connector path property to identify how this sink is physically
  * 	connected. Used by DP MST. This should be set by calling
@@ -1225,7 +1225,6 @@ int drm_mode_connector_set_obj_prop(struct drm_mode_object *obj,
 	} else if (connector->funcs->set_property)
 		ret = connector->funcs->set_property(connector, property, value);
 
-	/* store the property value if successful */
 	if (!ret)
 		drm_object_property_set_value(&connector->base, property, value);
 	return ret;
diff --git a/drivers/gpu/drm/drm_crtc_helper.c b/drivers/gpu/drm/drm_crtc_helper.c
index 4afdf79..eab36a4 100644
--- a/drivers/gpu/drm/drm_crtc_helper.c
+++ b/drivers/gpu/drm/drm_crtc_helper.c
@@ -863,8 +863,7 @@ static int drm_helper_choose_crtc_dpms(struct drm_crtc *crtc)
  * provided by the driver.
  *
  * This function is deprecated.  New drivers must implement atomic modeset
- * support, for which this function is unsuitable. Instead drivers should use
- * drm_atomic_helper_connector_dpms().
+ * support, where DPMS is handled in the DRM core.
  *
  * Returns:
  * Always returns 0.
diff --git a/drivers/gpu/drm/drm_crtc_internal.h b/drivers/gpu/drm/drm_crtc_internal.h
index d077c54..a435820 100644
--- a/drivers/gpu/drm/drm_crtc_internal.h
+++ b/drivers/gpu/drm/drm_crtc_internal.h
@@ -178,6 +178,13 @@ struct drm_minor;
 int drm_atomic_debugfs_init(struct drm_minor *minor);
 #endif
 
+int drm_atomic_connector_commit_dpms(struct drm_atomic_state *state,
+				     struct drm_connector *connector,
+				     int mode);
+int drm_atomic_set_property(struct drm_atomic_state *state,
+			    struct drm_mode_object *obj,
+			    struct drm_property *prop,
+			    uint64_t prop_value);
 int drm_atomic_get_property(struct drm_mode_object *obj,
 			    struct drm_property *property, uint64_t *val);
 int drm_mode_atomic_ioctl(struct drm_device *dev,
diff --git a/drivers/gpu/drm/drm_debugfs_crc.c b/drivers/gpu/drm/drm_debugfs_crc.c
index 1722d8f..f9e26dd 100644
--- a/drivers/gpu/drm/drm_debugfs_crc.c
+++ b/drivers/gpu/drm/drm_debugfs_crc.c
@@ -136,20 +136,50 @@ static int crtc_crc_data_count(struct drm_crtc_crc *crc)
 	return CIRC_CNT(crc->head, crc->tail, DRM_CRC_ENTRIES_NR);
 }
 
+static void crtc_crc_cleanup(struct drm_crtc_crc *crc)
+{
+	kfree(crc->entries);
+	crc->entries = NULL;
+	crc->head = 0;
+	crc->tail = 0;
+	crc->values_cnt = 0;
+	crc->opened = false;
+}
+
 static int crtc_crc_open(struct inode *inode, struct file *filep)
 {
 	struct drm_crtc *crtc = inode->i_private;
 	struct drm_crtc_crc *crc = &crtc->crc;
 	struct drm_crtc_crc_entry *entries = NULL;
 	size_t values_cnt;
-	int ret;
+	int ret = 0;
 
-	if (crc->opened)
-		return -EBUSY;
+	if (drm_drv_uses_atomic_modeset(crtc->dev)) {
+		ret = drm_modeset_lock_interruptible(&crtc->mutex, NULL);
+		if (ret)
+			return ret;
+
+		if (!crtc->state->active)
+			ret = -EIO;
+		drm_modeset_unlock(&crtc->mutex);
+
+		if (ret)
+			return ret;
+	}
+
+	spin_lock_irq(&crc->lock);
+	if (!crc->opened)
+		crc->opened = true;
+	else
+		ret = -EBUSY;
+	spin_unlock_irq(&crc->lock);
+
+	if (ret)
+		return ret;
 
 	ret = crtc->funcs->set_crc_source(crtc, crc->source, &values_cnt);
 	if (ret)
-		return ret;
+		goto err;
 
 	if (WARN_ON(values_cnt > DRM_MAX_CRC_NR)) {
 		ret = -EINVAL;
@@ -170,7 +200,6 @@ static int crtc_crc_open(struct inode *inode, struct file *filep)
 	spin_lock_irq(&crc->lock);
 	crc->entries = entries;
 	crc->values_cnt = values_cnt;
-	crc->opened = true;
 
 	/*
 	 * Only return once we got a first frame, so userspace doesn't have to
@@ -182,12 +211,17 @@ static int crtc_crc_open(struct inode *inode, struct file *filep)
 						crc->lock);
 	spin_unlock_irq(&crc->lock);
 
-	WARN_ON(ret);
+	if (ret)
+		goto err_disable;
 
 	return 0;
 
 err_disable:
 	crtc->funcs->set_crc_source(crtc, NULL, &values_cnt);
+err:
+	spin_lock_irq(&crc->lock);
+	crtc_crc_cleanup(crc);
+	spin_unlock_irq(&crc->lock);
 	return ret;
 }
 
@@ -197,17 +231,12 @@ static int crtc_crc_release(struct inode *inode, struct file *filep)
 	struct drm_crtc_crc *crc = &crtc->crc;
 	size_t values_cnt;
 
-	spin_lock_irq(&crc->lock);
-	kfree(crc->entries);
-	crc->entries = NULL;
-	crc->head = 0;
-	crc->tail = 0;
-	crc->values_cnt = 0;
-	crc->opened = false;
-	spin_unlock_irq(&crc->lock);
-
 	crtc->funcs->set_crc_source(crtc, NULL, &values_cnt);
 
+	spin_lock_irq(&crc->lock);
+	crtc_crc_cleanup(crc);
+	spin_unlock_irq(&crc->lock);
+
 	return 0;
 }
 
@@ -334,7 +363,7 @@ int drm_crtc_add_crc_entry(struct drm_crtc *crtc, bool has_frame,
 	spin_lock(&crc->lock);
 
 	/* Caller may not have noticed yet that userspace has stopped reading */
-	if (!crc->opened) {
+	if (!crc->entries) {
 		spin_unlock(&crc->lock);
 		return -EINVAL;
 	}
diff --git a/drivers/gpu/drm/drm_dp_mst_topology.c b/drivers/gpu/drm/drm_dp_mst_topology.c
index ae5f068..41b492f 100644
--- a/drivers/gpu/drm/drm_dp_mst_topology.c
+++ b/drivers/gpu/drm/drm_dp_mst_topology.c
@@ -31,6 +31,8 @@
 #include <drm/drmP.h>
 
 #include <drm/drm_fixed.h>
+#include <drm/drm_atomic.h>
+#include <drm/drm_atomic_helper.h>
 
 /**
  * DOC: dp mst helper
@@ -1342,15 +1344,17 @@ static void drm_dp_mst_link_probe_work(struct work_struct *work)
 static bool drm_dp_validate_guid(struct drm_dp_mst_topology_mgr *mgr,
 				 u8 *guid)
 {
-	static u8 zero_guid[16];
+	u64 salt;
 
-	if (!memcmp(guid, zero_guid, 16)) {
-		u64 salt = get_jiffies_64();
-		memcpy(&guid[0], &salt, sizeof(u64));
-		memcpy(&guid[8], &salt, sizeof(u64));
-		return false;
-	}
-	return true;
+	if (memchr_inv(guid, 0, 16))
+		return true;
+
+	salt = get_jiffies_64();
+
+	memcpy(&guid[0], &salt, sizeof(u64));
+	memcpy(&guid[8], &salt, sizeof(u64));
+
+	return false;
 }
 
 #if 0
@@ -2540,8 +2544,8 @@ int drm_dp_atomic_find_vcpi_slots(struct drm_atomic_state *state,
 	int req_slots;
 
 	topology_state = drm_atomic_get_mst_topology_state(state, mgr);
-	if (topology_state == NULL)
-		return -ENOMEM;
+	if (IS_ERR(topology_state))
+		return PTR_ERR(topology_state);
 
 	port = drm_dp_get_validated_port_ref(mgr, port);
 	if (port == NULL)
@@ -2580,8 +2584,8 @@ int drm_dp_atomic_release_vcpi_slots(struct drm_atomic_state *state,
 	struct drm_dp_mst_topology_state *topology_state;
 
 	topology_state = drm_atomic_get_mst_topology_state(state, mgr);
-	if (topology_state == NULL)
-		return -ENOMEM;
+	if (IS_ERR(topology_state))
+		return PTR_ERR(topology_state);
 
 	/* We cannot rely on port->vcpi.num_slots to update
 	 * topology_state->avail_slots as the port may not exist if the parent
@@ -3017,41 +3021,32 @@ static void drm_dp_destroy_connector_work(struct work_struct *work)
 		(*mgr->cbs->hotplug)(mgr);
 }
 
-void *drm_dp_mst_duplicate_state(struct drm_atomic_state *state, void *obj)
+static struct drm_private_state *
+drm_dp_mst_duplicate_state(struct drm_private_obj *obj)
 {
-	struct drm_dp_mst_topology_mgr *mgr = obj;
-	struct drm_dp_mst_topology_state *new_mst_state;
+	struct drm_dp_mst_topology_state *state;
 
-	if (WARN_ON(!mgr->state))
+	state = kmemdup(obj->state, sizeof(*state), GFP_KERNEL);
+	if (!state)
 		return NULL;
 
-	new_mst_state = kmemdup(mgr->state, sizeof(*new_mst_state), GFP_KERNEL);
-	if (new_mst_state)
-		new_mst_state->state = state;
-	return new_mst_state;
+	__drm_atomic_helper_private_obj_duplicate_state(obj, &state->base);
+
+	return &state->base;
 }
 
-void drm_dp_mst_swap_state(void *obj, void **obj_state_ptr)
+static void drm_dp_mst_destroy_state(struct drm_private_obj *obj,
+				     struct drm_private_state *state)
 {
-	struct drm_dp_mst_topology_mgr *mgr = obj;
-	struct drm_dp_mst_topology_state **topology_state_ptr;
+	struct drm_dp_mst_topology_state *mst_state =
+		to_dp_mst_topology_state(state);
 
-	topology_state_ptr = (struct drm_dp_mst_topology_state **)obj_state_ptr;
-
-	mgr->state->state = (*topology_state_ptr)->state;
-	swap(*topology_state_ptr, mgr->state);
-	mgr->state->state = NULL;
-}
-
-void drm_dp_mst_destroy_state(void *obj_state)
-{
-	kfree(obj_state);
+	kfree(mst_state);
 }
 
 static const struct drm_private_state_funcs mst_state_funcs = {
-	.duplicate_state = drm_dp_mst_duplicate_state,
-	.swap_state = drm_dp_mst_swap_state,
-	.destroy_state = drm_dp_mst_destroy_state,
+	.atomic_duplicate_state = drm_dp_mst_duplicate_state,
+	.atomic_destroy_state = drm_dp_mst_destroy_state,
 };
 
 /**
@@ -3075,8 +3070,7 @@ struct drm_dp_mst_topology_state *drm_atomic_get_mst_topology_state(struct drm_a
 	struct drm_device *dev = mgr->dev;
 
 	WARN_ON(!drm_modeset_is_locked(&dev->mode_config.connection_mutex));
-	return drm_atomic_get_private_obj_state(state, mgr,
-						&mst_state_funcs);
+	return to_dp_mst_topology_state(drm_atomic_get_private_obj_state(state, &mgr->base));
 }
 EXPORT_SYMBOL(drm_atomic_get_mst_topology_state);
 
@@ -3096,6 +3090,8 @@ int drm_dp_mst_topology_mgr_init(struct drm_dp_mst_topology_mgr *mgr,
 				 int max_dpcd_transaction_bytes,
 				 int max_payloads, int conn_base_id)
 {
+	struct drm_dp_mst_topology_state *mst_state;
+
 	mutex_init(&mgr->lock);
 	mutex_init(&mgr->qlock);
 	mutex_init(&mgr->payload_lock);
@@ -3124,14 +3120,18 @@ int drm_dp_mst_topology_mgr_init(struct drm_dp_mst_topology_mgr *mgr,
 	if (test_calc_pbn_mode() < 0)
 		DRM_ERROR("MST PBN self-test failed\n");
 
-	mgr->state = kzalloc(sizeof(*mgr->state), GFP_KERNEL);
-	if (mgr->state == NULL)
+	mst_state = kzalloc(sizeof(*mst_state), GFP_KERNEL);
+	if (mst_state == NULL)
 		return -ENOMEM;
-	mgr->state->mgr = mgr;
+
+	mst_state->mgr = mgr;
 
 	/* max. time slots - one slot for MTP header */
-	mgr->state->avail_slots = 63;
-	mgr->funcs = &mst_state_funcs;
+	mst_state->avail_slots = 63;
+
+	drm_atomic_private_obj_init(&mgr->base,
+				    &mst_state->base,
+				    &mst_state_funcs);
 
 	return 0;
 }
@@ -3153,8 +3153,7 @@ void drm_dp_mst_topology_mgr_destroy(struct drm_dp_mst_topology_mgr *mgr)
 	mutex_unlock(&mgr->payload_lock);
 	mgr->dev = NULL;
 	mgr->aux = NULL;
-	kfree(mgr->state);
-	mgr->state = NULL;
+	drm_atomic_private_obj_fini(&mgr->base);
 	mgr->funcs = NULL;
 }
 EXPORT_SYMBOL(drm_dp_mst_topology_mgr_destroy);
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 37b8ad3..be38ac7 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -63,6 +63,15 @@ module_param_named(debug, drm_debug, int, 0600);
 static DEFINE_SPINLOCK(drm_minor_lock);
 static struct idr drm_minors_idr;
 
+/*
+ * If the drm core fails to init for whatever reason,
+ * we should prevent any drivers from registering with it.
+ * It's best to check this at drm_dev_init(), as some drivers
+ * prefer to embed struct drm_device into their own device
+ * structure and call drm_dev_init() themselves.
+ */
+static bool drm_core_init_complete = false;
+
 static struct dentry *drm_debugfs_root;
 
 #define DRM_PRINTK_FMT "[" DRM_NAME ":%s]%s %pV"
@@ -282,7 +291,7 @@ struct drm_minor *drm_minor_acquire(unsigned int minor_id)
 
 	if (!minor) {
 		return ERR_PTR(-ENODEV);
-	} else if (drm_device_is_unplugged(minor->dev)) {
+	} else if (drm_dev_is_unplugged(minor->dev)) {
 		drm_dev_unref(minor->dev);
 		return ERR_PTR(-ENODEV);
 	}
@@ -355,26 +364,32 @@ void drm_put_dev(struct drm_device *dev)
 }
 EXPORT_SYMBOL(drm_put_dev);
 
-void drm_unplug_dev(struct drm_device *dev)
+static void drm_device_set_unplugged(struct drm_device *dev)
 {
-	/* for a USB device */
-	if (drm_core_check_feature(dev, DRIVER_MODESET))
-		drm_modeset_unregister_all(dev);
+	smp_wmb();
+	atomic_set(&dev->unplugged, 1);
+}
 
-	drm_minor_unregister(dev, DRM_MINOR_PRIMARY);
-	drm_minor_unregister(dev, DRM_MINOR_RENDER);
-	drm_minor_unregister(dev, DRM_MINOR_CONTROL);
+/**
+ * drm_dev_unplug - unplug a DRM device
+ * @dev: DRM device
+ *
+ * This unplugs a hotpluggable DRM device, which makes it inaccessible to
+ * userspace operations. Entry-points can use drm_dev_is_unplugged() to check
+ * for this. This essentially unregisters the device like drm_dev_unregister(),
+ * but can be called while there are still open users of @dev.
+ */
+void drm_dev_unplug(struct drm_device *dev)
+{
+	drm_dev_unregister(dev);
 
 	mutex_lock(&drm_global_mutex);
-
 	drm_device_set_unplugged(dev);
-
-	if (dev->open_count == 0) {
-		drm_put_dev(dev);
-	}
+	if (dev->open_count == 0)
+		drm_dev_unref(dev);
 	mutex_unlock(&drm_global_mutex);
 }
-EXPORT_SYMBOL(drm_unplug_dev);
+EXPORT_SYMBOL(drm_dev_unplug);
 
 /*
  * DRM internal mount
@@ -484,6 +499,11 @@ int drm_dev_init(struct drm_device *dev,
 {
 	int ret;
 
+	if (!drm_core_init_complete) {
+		DRM_ERROR("DRM core is not initialized\n");
+		return -ENODEV;
+	}
+
 	kref_init(&dev->ref);
 	dev->dev = parent;
 	dev->driver = driver;
@@ -821,6 +841,9 @@ EXPORT_SYMBOL(drm_dev_register);
  * drm_dev_register() but does not deallocate the device. The caller must call
  * drm_dev_unref() to drop their final reference.
  *
+ * A special form of unregistering for hotpluggable devices is drm_dev_unplug(),
+ * which can be called while there are still open users of @dev.
+ *
  * This should be called first in the device teardown code to make sure
  * userspace can't access the device instance any more.
  */
@@ -828,7 +851,8 @@ void drm_dev_unregister(struct drm_device *dev)
 {
 	struct drm_map_list *r_list, *list_temp;
 
-	drm_lastclose(dev);
+	if (drm_core_check_feature(dev, DRIVER_LEGACY))
+		drm_lastclose(dev);
 
 	dev->registered = false;
 
@@ -966,6 +990,8 @@ static int __init drm_core_init(void)
 	if (ret < 0)
 		goto error;
 
+	drm_core_init_complete = true;
+
 	DRM_DEBUG("Initialized\n");
 	return 0;
 
diff --git a/drivers/gpu/drm/drm_dumb_buffers.c b/drivers/gpu/drm/drm_dumb_buffers.c
index 10307cc..39ac15c 100644
--- a/drivers/gpu/drm/drm_dumb_buffers.c
+++ b/drivers/gpu/drm/drm_dumb_buffers.c
@@ -24,6 +24,7 @@
  */
 
 #include <drm/drmP.h>
+#include <drm/drm_gem.h>
 
 #include "drm_crtc_internal.h"
 
@@ -42,9 +43,10 @@
  * create dumb buffers suitable for scanout, which can then be used to create
  * KMS frame buffers.
  *
- * To support dumb objects drivers must implement the &drm_driver.dumb_create,
- * &drm_driver.dumb_destroy and &drm_driver.dumb_map_offset operations. See
- * there for further details.
+ * To support dumb objects drivers must implement the &drm_driver.dumb_create
+ * operation. &drm_driver.dumb_destroy defaults to drm_gem_dumb_destroy() if
+ * not set and &drm_driver.dumb_map_offset defaults to
+ * drm_gem_dumb_map_offset(). See the callbacks for further details.
  *
  * Note that dumb objects may not be used for gpu acceleration, as has been
  * attempted on some ARM embedded platforms. Such drivers really must have
@@ -108,11 +110,16 @@ int drm_mode_mmap_dumb_ioctl(struct drm_device *dev,
 {
 	struct drm_mode_map_dumb *args = data;
 
-	/* call driver ioctl to get mmap offset */
-	if (!dev->driver->dumb_map_offset)
+	if (!dev->driver->dumb_create)
 		return -ENOSYS;
 
-	return dev->driver->dumb_map_offset(file_priv, dev, args->handle, &args->offset);
+	if (dev->driver->dumb_map_offset)
+		return dev->driver->dumb_map_offset(file_priv, dev,
+						    args->handle,
+						    &args->offset);
+	else
+		return drm_gem_dumb_map_offset(file_priv, dev, args->handle,
+					       &args->offset);
 }
 
 int drm_mode_destroy_dumb_ioctl(struct drm_device *dev,
@@ -120,9 +127,12 @@ int drm_mode_destroy_dumb_ioctl(struct drm_device *dev,
 {
 	struct drm_mode_destroy_dumb *args = data;
 
-	if (!dev->driver->dumb_destroy)
+	if (!dev->driver->dumb_create)
 		return -ENOSYS;
 
-	return dev->driver->dumb_destroy(file_priv, dev, args->handle);
+	if (dev->driver->dumb_destroy)
+		return dev->driver->dumb_destroy(file_priv, dev, args->handle);
+	else
+		return drm_gem_dumb_destroy(file_priv, dev, args->handle);
 }
 
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 2e55599..6bb6337 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -1006,6 +1006,221 @@ static const struct drm_display_mode edid_cea_modes[] = {
 		   2492, 2640, 0, 1080, 1084, 1089, 1125, 0,
 		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
 	 .vrefresh = 100, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_16_9, },
+	/* 65 - 1280x720@24Hz */
+	{ DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 59400, 1280, 3040,
+		   3080, 3300, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 24, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 66 - 1280x720@25Hz */
+	{ DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 74250, 1280, 3700,
+		   3740, 3960, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 25, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 67 - 1280x720@30Hz */
+	{ DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 74250, 1280, 3040,
+		   3080, 3300, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 30, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 68 - 1280x720@50Hz */
+	{ DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 74250, 1280, 1720,
+		   1760, 1980, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 50, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 69 - 1280x720@60Hz */
+	{ DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 74250, 1280, 1390,
+		   1430, 1650, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 60, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 70 - 1280x720@100Hz */
+	{ DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 148500, 1280, 1720,
+		   1760, 1980, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 100, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 71 - 1280x720@120Hz */
+	{ DRM_MODE("1280x720", DRM_MODE_TYPE_DRIVER, 148500, 1280, 1390,
+		   1430, 1650, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 120, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 72 - 1920x1080@24Hz */
+	{ DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 74250, 1920, 2558,
+		   2602, 2750, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 24, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 73 - 1920x1080@25Hz */
+	{ DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 74250, 1920, 2448,
+		   2492, 2640, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 25, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 74 - 1920x1080@30Hz */
+	{ DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 74250, 1920, 2008,
+		   2052, 2200, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 30, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 75 - 1920x1080@50Hz */
+	{ DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 148500, 1920, 2448,
+		   2492, 2640, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 50, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 76 - 1920x1080@60Hz */
+	{ DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 148500, 1920, 2008,
+		   2052, 2200, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 60, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 77 - 1920x1080@100Hz */
+	{ DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 297000, 1920, 2448,
+		   2492, 2640, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 100, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 78 - 1920x1080@120Hz */
+	{ DRM_MODE("1920x1080", DRM_MODE_TYPE_DRIVER, 297000, 1920, 2008,
+		   2052, 2200, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 120, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 79 - 1680x720@24Hz */
+	{ DRM_MODE("1680x720", DRM_MODE_TYPE_DRIVER, 59400, 1680, 3040,
+		   3080, 3300, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 24, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 80 - 1680x720@25Hz */
+	{ DRM_MODE("1680x720", DRM_MODE_TYPE_DRIVER, 59400, 1680, 2908,
+		   2948, 3168, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 25, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 81 - 1680x720@30Hz */
+	{ DRM_MODE("1680x720", DRM_MODE_TYPE_DRIVER, 59400, 1680, 2380,
+		   2420, 2640, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 30, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 82 - 1680x720@50Hz */
+	{ DRM_MODE("1680x720", DRM_MODE_TYPE_DRIVER, 82500, 1680, 1940,
+		   1980, 2200, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 50, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 83 - 1680x720@60Hz */
+	{ DRM_MODE("1680x720", DRM_MODE_TYPE_DRIVER, 99000, 1680, 1940,
+		   1980, 2200, 0, 720, 725, 730, 750, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 60, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 84 - 1680x720@100Hz */
+	{ DRM_MODE("1680x720", DRM_MODE_TYPE_DRIVER, 165000, 1680, 1740,
+		   1780, 2000, 0, 720, 725, 730, 825, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 100, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 85 - 1680x720@120Hz */
+	{ DRM_MODE("1680x720", DRM_MODE_TYPE_DRIVER, 198000, 1680, 1740,
+		   1780, 2000, 0, 720, 725, 730, 825, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 120, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 86 - 2560x1080@24Hz */
+	{ DRM_MODE("2560x1080", DRM_MODE_TYPE_DRIVER, 99000, 2560, 3558,
+		   3602, 3750, 0, 1080, 1084, 1089, 1100, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 24, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 87 - 2560x1080@25Hz */
+	{ DRM_MODE("2560x1080", DRM_MODE_TYPE_DRIVER, 90000, 2560, 3008,
+		   3052, 3200, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 25, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 88 - 2560x1080@30Hz */
+	{ DRM_MODE("2560x1080", DRM_MODE_TYPE_DRIVER, 118800, 2560, 3328,
+		   3372, 3520, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 30, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 89 - 2560x1080@50Hz */
+	{ DRM_MODE("2560x1080", DRM_MODE_TYPE_DRIVER, 185625, 2560, 3108,
+		   3152, 3300, 0, 1080, 1084, 1089, 1125, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 50, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 90 - 2560x1080@60Hz */
+	{ DRM_MODE("2560x1080", DRM_MODE_TYPE_DRIVER, 198000, 2560, 2808,
+		   2852, 3000, 0, 1080, 1084, 1089, 1100, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 60, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 91 - 2560x1080@100Hz */
+	{ DRM_MODE("2560x1080", DRM_MODE_TYPE_DRIVER, 371250, 2560, 2778,
+		   2822, 2970, 0, 1080, 1084, 1089, 1250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 100, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 92 - 2560x1080@120Hz */
+	{ DRM_MODE("2560x1080", DRM_MODE_TYPE_DRIVER, 495000, 2560, 3108,
+		   3152, 3300, 0, 1080, 1084, 1089, 1250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 120, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 93 - 3840x2160p@24Hz 16:9 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 297000, 3840, 5116,
+		   5204, 5500, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 24, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_16_9, },
+	/* 94 - 3840x2160p@25Hz 16:9 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 297000, 3840, 4896,
+		   4984, 5280, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 25, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_16_9, },
+	/* 95 - 3840x2160p@30Hz 16:9 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 297000, 3840, 4016,
+		   4104, 4400, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 30, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_16_9, },
+	/* 96 - 3840x2160p@50Hz 16:9 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 594000, 3840, 4896,
+		   4984, 5280, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 50, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_16_9, },
+	/* 97 - 3840x2160p@60Hz 16:9 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 594000, 3840, 4016,
+		   4104, 4400, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 60, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_16_9, },
+	/* 98 - 4096x2160p@24Hz 256:135 */
+	{ DRM_MODE("4096x2160", DRM_MODE_TYPE_DRIVER, 297000, 4096, 5116,
+		   5204, 5500, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 24, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_256_135, },
+	/* 99 - 4096x2160p@25Hz 256:135 */
+	{ DRM_MODE("4096x2160", DRM_MODE_TYPE_DRIVER, 297000, 4096, 5064,
+		   5152, 5280, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 25, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_256_135, },
+	/* 100 - 4096x2160p@30Hz 256:135 */
+	{ DRM_MODE("4096x2160", DRM_MODE_TYPE_DRIVER, 297000, 4096, 4184,
+		   4272, 4400, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 30, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_256_135, },
+	/* 101 - 4096x2160p@50Hz 256:135 */
+	{ DRM_MODE("4096x2160", DRM_MODE_TYPE_DRIVER, 594000, 4096, 5064,
+		   5152, 5280, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 50, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_256_135, },
+	/* 102 - 4096x2160p@60Hz 256:135 */
+	{ DRM_MODE("4096x2160", DRM_MODE_TYPE_DRIVER, 594000, 4096, 4184,
+		   4272, 4400, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 60, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_256_135, },
+	/* 103 - 3840x2160p@24Hz 64:27 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 297000, 3840, 5116,
+		   5204, 5500, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 24, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 104 - 3840x2160p@25Hz 64:27 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 297000, 3840, 4896,
+		   4984, 5280, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 25, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 105 - 3840x2160p@30Hz 64:27 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 297000, 3840, 4016,
+		   4104, 4400, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 30, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 106 - 3840x2160p@50Hz 64:27 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 594000, 3840, 4896,
+		   4984, 5280, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 50, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
+	/* 107 - 3840x2160p@60Hz 64:27 */
+	{ DRM_MODE("3840x2160", DRM_MODE_TYPE_DRIVER, 594000, 3840, 4016,
+		   4104, 4400, 0, 2160, 2168, 2178, 2250, 0,
+		   DRM_MODE_FLAG_PHSYNC | DRM_MODE_FLAG_PVSYNC),
+	  .vrefresh = 60, .picture_aspect_ratio = HDMI_PICTURE_ASPECT_64_27, },
 };
 
 /*
@@ -2566,7 +2781,10 @@ add_detailed_modes(struct drm_connector *connector, struct edid *edid,
 #define VIDEO_BLOCK     0x02
 #define VENDOR_BLOCK    0x03
 #define SPEAKER_BLOCK	0x04
-#define VIDEO_CAPABILITY_BLOCK	0x07
+#define USE_EXTENDED_TAG 0x07
+#define EXT_VIDEO_CAPABILITY_BLOCK 0x00
+#define EXT_VIDEO_DATA_BLOCK_420	0x0E
+#define EXT_VIDEO_CAP_BLOCK_Y420CMDB 0x0F
 #define EDID_BASIC_AUDIO	(1 << 6)
 #define EDID_CEA_YCRCB444	(1 << 5)
 #define EDID_CEA_YCRCB422	(1 << 4)
@@ -2902,6 +3120,15 @@ add_alternate_cea_modes(struct drm_connector *connector, struct edid *edid)
 	return modes;
 }
 
+static u8 svd_to_vic(u8 svd)
+{
+	/* 0-6 bit vic, 7th bit native mode indicator */
+	if ((svd >= 1 && svd <= 64) || (svd >= 129 && svd <= 192))
+		return svd & 127;
+
+	return svd;
+}
+
 static struct drm_display_mode *
 drm_display_mode_from_vic_index(struct drm_connector *connector,
 				const u8 *video_db, u8 video_len,
@@ -2915,7 +3142,7 @@ drm_display_mode_from_vic_index(struct drm_connector *connector,
 		return NULL;
 
 	/* CEA modes are numbered 1..127 */
-	vic = (video_db[video_index] & 127);
+	vic = svd_to_vic(video_db[video_index]);
 	if (!drm_valid_cea_vic(vic))
 		return NULL;
 
@@ -2928,15 +3155,85 @@ drm_display_mode_from_vic_index(struct drm_connector *connector,
 	return newmode;
 }
 
+/*
+ * do_y420vdb_modes - Parse YCBCR 420 only modes
+ * @connector: connector corresponding to the HDMI sink
+ * @svds: start of the data block of CEA YCBCR 420 VDB
+ * @svds_len: length of the CEA YCBCR 420 VDB
+ *
+ * Parse the CEA-861-F YCBCR 420 Video Data Block (Y420VDB)
+ * which contains modes which can be supported in YCBCR 420
+ * output format only.
+ */
+static int do_y420vdb_modes(struct drm_connector *connector,
+			    const u8 *svds, u8 svds_len)
+{
+	int modes = 0, i;
+	struct drm_device *dev = connector->dev;
+	struct drm_display_info *info = &connector->display_info;
+	struct drm_hdmi_info *hdmi = &info->hdmi;
+
+	for (i = 0; i < svds_len; i++) {
+		u8 vic = svd_to_vic(svds[i]);
+		struct drm_display_mode *newmode;
+
+		if (!drm_valid_cea_vic(vic))
+			continue;
+
+		newmode = drm_mode_duplicate(dev, &edid_cea_modes[vic]);
+		if (!newmode)
+			break;
+		bitmap_set(hdmi->y420_vdb_modes, vic, 1);
+		drm_mode_probed_add(connector, newmode);
+		modes++;
+	}
+
+	if (modes > 0)
+		info->color_formats |= DRM_COLOR_FORMAT_YCRCB420;
+	return modes;
+}
+
+/*
+ * drm_add_cmdb_modes - Add a YCBCR 420 mode into bitmap
+ * @connector: connector corresponding to the HDMI sink
+ * @svd: CEA SVD whose vic is to be added in the map
+ *
+ * Makes an entry for a videomode in the YCBCR 420 bitmap
+ */
+static void
+drm_add_cmdb_modes(struct drm_connector *connector, u8 svd)
+{
+	u8 vic = svd_to_vic(svd);
+	struct drm_hdmi_info *hdmi = &connector->display_info.hdmi;
+
+	if (!drm_valid_cea_vic(vic))
+		return;
+
+	bitmap_set(hdmi->y420_cmdb_modes, vic, 1);
+}
+
 static int
 do_cea_modes(struct drm_connector *connector, const u8 *db, u8 len)
 {
 	int i, modes = 0;
+	struct drm_hdmi_info *hdmi = &connector->display_info.hdmi;
 
 	for (i = 0; i < len; i++) {
 		struct drm_display_mode *mode;
 		mode = drm_display_mode_from_vic_index(connector, db, len, i);
 		if (mode) {
+			/*
+			 * YCBCR420 capability block contains a bitmap which
+			 * gives the index of CEA modes from CEA VDB, which
+			 * can support YCBCR 420 sampling output also (apart
+			 * from RGB/YCBCR444 etc).
+			 * For example, if the bit 0 in bitmap is set,
+			 * first mode in VDB can support YCBCR420 output too.
+			 * Add YCBCR420 modes only if sink is HDMI 2.0 capable.
+			 */
+			if (i < 64 && hdmi->y420_cmdb_map & (1ULL << i))
+				drm_add_cmdb_modes(connector, db[i]);
+
 			drm_mode_probed_add(connector, mode);
 			modes++;
 		}
@@ -3218,6 +3515,12 @@ cea_db_payload_len(const u8 *db)
 }
 
 static int
+cea_db_extended_tag(const u8 *db)
+{
+	return db[1];
+}
+
+static int
 cea_db_tag(const u8 *db)
 {
 	return db[0] >> 5;
@@ -3272,9 +3575,77 @@ static bool cea_db_is_hdmi_forum_vsdb(const u8 *db)
 	return oui == HDMI_FORUM_IEEE_OUI;
 }
 
+static bool cea_db_is_y420cmdb(const u8 *db)
+{
+	if (cea_db_tag(db) != USE_EXTENDED_TAG)
+		return false;
+
+	if (!cea_db_payload_len(db))
+		return false;
+
+	if (cea_db_extended_tag(db) != EXT_VIDEO_CAP_BLOCK_Y420CMDB)
+		return false;
+
+	return true;
+}
+
+static bool cea_db_is_y420vdb(const u8 *db)
+{
+	if (cea_db_tag(db) != USE_EXTENDED_TAG)
+		return false;
+
+	if (!cea_db_payload_len(db))
+		return false;
+
+	if (cea_db_extended_tag(db) != EXT_VIDEO_DATA_BLOCK_420)
+		return false;
+
+	return true;
+}
+
 #define for_each_cea_db(cea, i, start, end) \
 	for ((i) = (start); (i) < (end) && (i) + cea_db_payload_len(&(cea)[(i)]) < (end); (i) += cea_db_payload_len(&(cea)[(i)]) + 1)
 
+static void drm_parse_y420cmdb_bitmap(struct drm_connector *connector,
+				      const u8 *db)
+{
+	struct drm_display_info *info = &connector->display_info;
+	struct drm_hdmi_info *hdmi = &info->hdmi;
+	u8 map_len = cea_db_payload_len(db) - 1;
+	u8 count;
+	u64 map = 0;
+
+	if (map_len == 0) {
+		/* All CEA modes also support YCBCR 420 sampling. */
+		hdmi->y420_cmdb_map = U64_MAX;
+		info->color_formats |= DRM_COLOR_FORMAT_YCRCB420;
+		return;
+	}
+
+	/*
+	 * This map indicates which of the existing CEA block modes
+	 * from VDB can support YCBCR420 output too. So if bit 0 is
+	 * set, the first mode from VDB can support YCBCR420 output too.
+	 * We will parse and keep this map, before parsing VDB itself
+	 * to avoid going through the same block again and again.
+	 *
+	 * Spec is not clear about max possible size of this block.
+	 * Clamping max bitmap block size at 8 bytes. Every byte can
+	 * address 8 CEA modes, in this way this map can address
+	 * 8*8 = first 64 SVDs.
+	 */
+	if (WARN_ON_ONCE(map_len > 8))
+		map_len = 8;
+
+	for (count = 0; count < map_len; count++)
+		map |= (u64)db[2 + count] << (8 * count);
+
+	if (map)
+		info->color_formats |= DRM_COLOR_FORMAT_YCRCB420;
+
+	hdmi->y420_cmdb_map = map;
+}
+
 static int
 add_cea_modes(struct drm_connector *connector, struct edid *edid)
 {
@@ -3297,10 +3668,16 @@ add_cea_modes(struct drm_connector *connector, struct edid *edid)
 				video = db + 1;
 				video_len = dbl;
 				modes += do_cea_modes(connector, video, dbl);
-			}
-			else if (cea_db_is_hdmi_vsdb(db)) {
+			} else if (cea_db_is_hdmi_vsdb(db)) {
 				hdmi = db;
 				hdmi_len = dbl;
+			} else if (cea_db_is_y420vdb(db)) {
+				const u8 *vdb420 = &db[2];
+
+				/* Add 4:2:0(only) modes present in EDID */
+				modes += do_y420vdb_modes(connector,
+							  vdb420,
+							  dbl - 1);
 			}
 		}
 	}
@@ -3793,8 +4170,10 @@ bool drm_rgb_quant_range_selectable(struct edid *edid)
 		return false;
 
 	for_each_cea_db(edid_ext, i, start, end) {
-		if (cea_db_tag(&edid_ext[i]) == VIDEO_CAPABILITY_BLOCK &&
-		    cea_db_payload_len(&edid_ext[i]) == 2) {
+		if (cea_db_tag(&edid_ext[i]) == USE_EXTENDED_TAG &&
+		    cea_db_payload_len(&edid_ext[i]) == 2 &&
+		    cea_db_extended_tag(&edid_ext[i]) ==
+			EXT_VIDEO_CAPABILITY_BLOCK) {
 			DRM_DEBUG_KMS("CEA VCDB 0x%02x\n", edid_ext[i + 2]);
 			return edid_ext[i + 2] & EDID_CEA_VCDB_QS;
 		}
@@ -3823,6 +4202,16 @@ drm_default_rgb_quant_range(const struct drm_display_mode *mode)
 }
 EXPORT_SYMBOL(drm_default_rgb_quant_range);
 
+static void drm_parse_ycbcr420_deep_color_info(struct drm_connector *connector,
+					       const u8 *db)
+{
+	u8 dc_mask;
+	struct drm_hdmi_info *hdmi = &connector->display_info.hdmi;
+
+	dc_mask = db[7] & DRM_EDID_YCBCR420_DC_MASK;
+	hdmi->y420_dc_modes |= dc_mask;
+}
+
 static void drm_parse_hdmi_forum_vsdb(struct drm_connector *connector,
 				 const u8 *hf_vsdb)
 {
@@ -3863,6 +4252,8 @@ static void drm_parse_hdmi_forum_vsdb(struct drm_connector *connector,
 				scdc->scrambling.low_rates = true;
 		}
 	}
+
+	drm_parse_ycbcr420_deep_color_info(connector, hf_vsdb);
 }
 
 static void drm_parse_hdmi_deep_color_info(struct drm_connector *connector,
@@ -3981,6 +4372,8 @@ static void drm_parse_cea_ext(struct drm_connector *connector,
 			drm_parse_hdmi_vsdb_video(connector, db);
 		if (cea_db_is_hdmi_forum_vsdb(db))
 			drm_parse_hdmi_forum_vsdb(connector, db);
+		if (cea_db_is_y420cmdb(db))
+			drm_parse_y420cmdb_bitmap(connector, db);
 	}
 }
 
@@ -4215,6 +4608,13 @@ int drm_add_edid_modes(struct drm_connector *connector, struct edid *edid)
 	quirks = edid_get_quirks(edid);
 
 	/*
+	 * CEA-861-F adds a YCBCR 420 capability map block for HDMI 2.0 sinks.
+	 * To avoid parsing the same block multiple times, parse that map
+	 * from the sink info before parsing the CEA modes.
+	 */
+	drm_add_display_info(connector, edid);
+
+	/*
 	 * EDID spec says modes should be preferred in this order:
 	 * - preferred detailed mode
 	 * - other detailed modes from base block
@@ -4241,8 +4641,6 @@ int drm_add_edid_modes(struct drm_connector *connector, struct edid *edid)
 	if (quirks & (EDID_QUIRK_PREFER_LARGE_60 | EDID_QUIRK_PREFER_LARGE_75))
 		edid_fixup_preferred(connector, quirks);
 
-	drm_add_display_info(connector, edid);
-
 	if (quirks & EDID_QUIRK_FORCE_6BPC)
 		connector->display_info.bpc = 6;
 
@@ -4334,12 +4732,14 @@ EXPORT_SYMBOL(drm_set_preferred_mode);
  *                                              data from a DRM display mode
  * @frame: HDMI AVI infoframe
  * @mode: DRM display mode
+ * @is_hdmi2_sink: Sink is HDMI 2.0 compliant
  *
  * Return: 0 on success or a negative error code on failure.
  */
 int
 drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
-					 const struct drm_display_mode *mode)
+					 const struct drm_display_mode *mode,
+					 bool is_hdmi2_sink)
 {
 	int err;
 
@@ -4355,6 +4755,28 @@ drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
 
 	frame->video_code = drm_match_cea_mode(mode);
 
+	/*
+	 * HDMI 1.4 VIC range: 1 <= VIC <= 64 (CEA-861-D) but
+	 * HDMI 2.0 VIC range: 1 <= VIC <= 107 (CEA-861-F). So we
+	 * have to make sure we don't break HDMI 1.4 sinks.
+	 */
+	if (!is_hdmi2_sink && frame->video_code > 64)
+		frame->video_code = 0;
+
+	/*
+	 * HDMI spec says if a mode is found in HDMI 1.4b 4K modes
+	 * we should send its VIC in vendor infoframes, else send the
+	 * VIC in AVI infoframes. Let's check if this mode is present in
+	 * HDMI 1.4b 4K modes.
+	 */
+	if (frame->video_code) {
+		u8 vendor_if_vic = drm_match_hdmi_mode(mode);
+		bool is_s3d = mode->flags & DRM_MODE_FLAG_3D_MASK;
+
+		if (drm_valid_hdmi_vic(vendor_if_vic) && !is_s3d)
+			frame->video_code = 0;
+	}
+
 	frame->picture_aspect = HDMI_PICTURE_ASPECT_NONE;
 
 	/*
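The SVD decoding and Y420CMDB bitmap handling added to drm_edid.c above can be tried outside the kernel. The following is a minimal user-space sketch, not the kernel code itself: svd_to_vic() mirrors the helper from the patch, while y420cmdb_map() is a hypothetical name for the byte-assembly loop in drm_parse_y420cmdb_bitmap(), assuming the same least-significant-byte-first order and the same 8-byte clamp.

```c
#include <stdint.h>

/*
 * Mirrors svd_to_vic() from the patch: bit 7 is a native-mode flag
 * only for VICs 1..64, so it is masked off only in those ranges.
 */
uint8_t svd_to_vic(uint8_t svd)
{
	if ((svd >= 1 && svd <= 64) || (svd >= 129 && svd <= 192))
		return svd & 127;

	return svd;
}

/*
 * Hypothetical helper (not in the patch): build the capability map
 * from the bitmap bytes of a Y420CMDB. Bit n of the result refers to
 * the (n+1)-th SVD of the regular video data block. A zero-length
 * map means every SVD also supports YCbCr 4:2:0.
 */
uint64_t y420cmdb_map(const uint8_t *bytes, uint8_t map_len)
{
	uint64_t map = 0;
	uint8_t i;

	if (map_len == 0)
		return UINT64_MAX;

	/* Same clamp as the patch: at most 8 bytes, i.e. 64 SVDs. */
	if (map_len > 8)
		map_len = 8;

	for (i = 0; i < map_len; i++)
		map |= (uint64_t)bytes[i] << (8 * i);

	return map;
}
```

For instance, svd_to_vic(0x90) yields 16 (native flag stripped), and a one-byte bitmap of 0x02 sets bit 1 of the map, which marks the second SVD in the video data block as 4:2:0 capable.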
diff --git a/drivers/gpu/drm/drm_fb_cma_helper.c b/drivers/gpu/drm/drm_fb_cma_helper.c
index 53f9bdf..f2ee883 100644
--- a/drivers/gpu/drm/drm_fb_cma_helper.c
+++ b/drivers/gpu/drm/drm_fb_cma_helper.c
@@ -18,27 +18,17 @@
  */
 
 #include <drm/drmP.h>
-#include <drm/drm_atomic.h>
-#include <drm/drm_crtc.h>
 #include <drm/drm_fb_helper.h>
-#include <drm/drm_crtc_helper.h>
+#include <drm/drm_framebuffer.h>
 #include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_fb_cma_helper.h>
-#include <linux/dma-buf.h>
-#include <linux/dma-mapping.h>
 #include <linux/module.h>
-#include <linux/reservation.h>
 
 #define DEFAULT_FBDEFIO_DELAY_MS 50
 
-struct drm_fb_cma {
-	struct drm_framebuffer		fb;
-	struct drm_gem_cma_object	*obj[4];
-};
-
 struct drm_fbdev_cma {
 	struct drm_fb_helper	fb_helper;
-	struct drm_fb_cma	*fb;
 	const struct drm_framebuffer_funcs *fb_funcs;
 };
 
@@ -90,69 +80,19 @@ static inline struct drm_fbdev_cma *to_fbdev_cma(struct drm_fb_helper *helper)
 	return container_of(helper, struct drm_fbdev_cma, fb_helper);
 }
 
-static inline struct drm_fb_cma *to_fb_cma(struct drm_framebuffer *fb)
-{
-	return container_of(fb, struct drm_fb_cma, fb);
-}
-
 void drm_fb_cma_destroy(struct drm_framebuffer *fb)
 {
-	struct drm_fb_cma *fb_cma = to_fb_cma(fb);
-	int i;
-
-	for (i = 0; i < 4; i++) {
-		if (fb_cma->obj[i])
-			drm_gem_object_put_unlocked(&fb_cma->obj[i]->base);
-	}
-
-	drm_framebuffer_cleanup(fb);
-	kfree(fb_cma);
+	drm_gem_fb_destroy(fb);
 }
 EXPORT_SYMBOL(drm_fb_cma_destroy);
 
 int drm_fb_cma_create_handle(struct drm_framebuffer *fb,
 	struct drm_file *file_priv, unsigned int *handle)
 {
-	struct drm_fb_cma *fb_cma = to_fb_cma(fb);
-
-	return drm_gem_handle_create(file_priv,
-			&fb_cma->obj[0]->base, handle);
+	return drm_gem_fb_create_handle(fb, file_priv, handle);
 }
 EXPORT_SYMBOL(drm_fb_cma_create_handle);
 
-static struct drm_framebuffer_funcs drm_fb_cma_funcs = {
-	.destroy	= drm_fb_cma_destroy,
-	.create_handle	= drm_fb_cma_create_handle,
-};
-
-static struct drm_fb_cma *drm_fb_cma_alloc(struct drm_device *dev,
-	const struct drm_mode_fb_cmd2 *mode_cmd,
-	struct drm_gem_cma_object **obj,
-	unsigned int num_planes, const struct drm_framebuffer_funcs *funcs)
-{
-	struct drm_fb_cma *fb_cma;
-	int ret;
-	int i;
-
-	fb_cma = kzalloc(sizeof(*fb_cma), GFP_KERNEL);
-	if (!fb_cma)
-		return ERR_PTR(-ENOMEM);
-
-	drm_helper_mode_fill_fb_struct(dev, &fb_cma->fb, mode_cmd);
-
-	for (i = 0; i < num_planes; i++)
-		fb_cma->obj[i] = obj[i];
-
-	ret = drm_framebuffer_init(dev, &fb_cma->fb, funcs);
-	if (ret) {
-		dev_err(dev->dev, "Failed to initialize framebuffer: %d\n", ret);
-		kfree(fb_cma);
-		return ERR_PTR(ret);
-	}
-
-	return fb_cma;
-}
-
 /**
  * drm_fb_cma_create_with_funcs() - helper function for the
  *                                  &drm_mode_config_funcs.fb_create
@@ -170,53 +110,7 @@ struct drm_framebuffer *drm_fb_cma_create_with_funcs(struct drm_device *dev,
 	struct drm_file *file_priv, const struct drm_mode_fb_cmd2 *mode_cmd,
 	const struct drm_framebuffer_funcs *funcs)
 {
-	const struct drm_format_info *info;
-	struct drm_fb_cma *fb_cma;
-	struct drm_gem_cma_object *objs[4];
-	struct drm_gem_object *obj;
-	int ret;
-	int i;
-
-	info = drm_get_format_info(dev, mode_cmd);
-	if (!info)
-		return ERR_PTR(-EINVAL);
-
-	for (i = 0; i < info->num_planes; i++) {
-		unsigned int width = mode_cmd->width / (i ? info->hsub : 1);
-		unsigned int height = mode_cmd->height / (i ? info->vsub : 1);
-		unsigned int min_size;
-
-		obj = drm_gem_object_lookup(file_priv, mode_cmd->handles[i]);
-		if (!obj) {
-			dev_err(dev->dev, "Failed to lookup GEM object\n");
-			ret = -ENOENT;
-			goto err_gem_object_put;
-		}
-
-		min_size = (height - 1) * mode_cmd->pitches[i]
-			 + width * info->cpp[i]
-			 + mode_cmd->offsets[i];
-
-		if (obj->size < min_size) {
-			drm_gem_object_put_unlocked(obj);
-			ret = -EINVAL;
-			goto err_gem_object_put;
-		}
-		objs[i] = to_drm_gem_cma_obj(obj);
-	}
-
-	fb_cma = drm_fb_cma_alloc(dev, mode_cmd, objs, i, funcs);
-	if (IS_ERR(fb_cma)) {
-		ret = PTR_ERR(fb_cma);
-		goto err_gem_object_put;
-	}
-
-	return &fb_cma->fb;
-
-err_gem_object_put:
-	for (i--; i >= 0; i--)
-		drm_gem_object_put_unlocked(&objs[i]->base);
-	return ERR_PTR(ret);
+	return drm_gem_fb_create_with_funcs(dev, file_priv, mode_cmd, funcs);
 }
 EXPORT_SYMBOL_GPL(drm_fb_cma_create_with_funcs);
 
@@ -233,8 +127,7 @@ EXPORT_SYMBOL_GPL(drm_fb_cma_create_with_funcs);
 struct drm_framebuffer *drm_fb_cma_create(struct drm_device *dev,
 	struct drm_file *file_priv, const struct drm_mode_fb_cmd2 *mode_cmd)
 {
-	return drm_fb_cma_create_with_funcs(dev, file_priv, mode_cmd,
-					    &drm_fb_cma_funcs);
+	return drm_gem_fb_create(dev, file_priv, mode_cmd);
 }
 EXPORT_SYMBOL_GPL(drm_fb_cma_create);
 
@@ -250,12 +143,13 @@ EXPORT_SYMBOL_GPL(drm_fb_cma_create);
 struct drm_gem_cma_object *drm_fb_cma_get_gem_obj(struct drm_framebuffer *fb,
 						  unsigned int plane)
 {
-	struct drm_fb_cma *fb_cma = to_fb_cma(fb);
+	struct drm_gem_object *gem;
 
-	if (plane >= 4)
+	gem = drm_gem_fb_get_obj(fb, plane);
+	if (!gem)
 		return NULL;
 
-	return fb_cma->obj[plane];
+	return to_drm_gem_cma_obj(gem);
 }
 EXPORT_SYMBOL_GPL(drm_fb_cma_get_gem_obj);
 
@@ -272,13 +166,14 @@ dma_addr_t drm_fb_cma_get_gem_addr(struct drm_framebuffer *fb,
 				   struct drm_plane_state *state,
 				   unsigned int plane)
 {
-	struct drm_fb_cma *fb_cma = to_fb_cma(fb);
+	struct drm_gem_cma_object *obj;
 	dma_addr_t paddr;
 
-	if (plane >= 4)
+	obj = drm_fb_cma_get_gem_obj(fb, plane);
+	if (!obj)
 		return 0;
 
-	paddr = fb_cma->obj[plane]->paddr + fb->offsets[plane];
+	paddr = obj->paddr + fb->offsets[plane];
 	paddr += fb->format->cpp[plane] * (state->src_x >> 16);
 	paddr += fb->pitches[plane] * (state->src_y >> 16);
 
@@ -302,26 +197,13 @@ EXPORT_SYMBOL_GPL(drm_fb_cma_get_gem_addr);
 int drm_fb_cma_prepare_fb(struct drm_plane *plane,
 			  struct drm_plane_state *state)
 {
-	struct dma_buf *dma_buf;
-	struct dma_fence *fence;
-
-	if ((plane->state->fb == state->fb) || !state->fb)
-		return 0;
-
-	dma_buf = drm_fb_cma_get_gem_obj(state->fb, 0)->base.dma_buf;
-	if (dma_buf) {
-		fence = reservation_object_get_excl_rcu(dma_buf->resv);
-		drm_atomic_set_fence_for_plane(state, fence);
-	}
-
-	return 0;
+	return drm_gem_fb_prepare_fb(plane, state);
 }
 EXPORT_SYMBOL_GPL(drm_fb_cma_prepare_fb);
 
 #ifdef CONFIG_DEBUG_FS
 static void drm_fb_cma_describe(struct drm_framebuffer *fb, struct seq_file *m)
 {
-	struct drm_fb_cma *fb_cma = to_fb_cma(fb);
 	int i;
 
 	seq_printf(m, "fb: %dx%d@%4.4s\n", fb->width, fb->height,
@@ -330,7 +212,7 @@ static void drm_fb_cma_describe(struct drm_framebuffer *fb, struct seq_file *m)
 	for (i = 0; i < fb->format->num_planes; i++) {
 		seq_printf(m, "   %d: offset=%d pitch=%d, obj: ",
 				i, fb->offsets[i], fb->pitches[i]);
-		drm_gem_cma_describe(fb_cma->obj[i], m);
+		drm_gem_cma_describe(drm_fb_cma_get_gem_obj(fb, i), m);
 	}
 }
 
@@ -431,7 +313,6 @@ drm_fbdev_cma_create(struct drm_fb_helper *helper,
 	struct drm_fb_helper_surface_size *sizes)
 {
 	struct drm_fbdev_cma *fbdev_cma = to_fbdev_cma(helper);
-	struct drm_mode_fb_cmd2 mode_cmd = { 0 };
 	struct drm_device *dev = helper->dev;
 	struct drm_gem_cma_object *obj;
 	struct drm_framebuffer *fb;
@@ -446,14 +327,7 @@ drm_fbdev_cma_create(struct drm_fb_helper *helper,
 			sizes->surface_bpp);
 
 	bytes_per_pixel = DIV_ROUND_UP(sizes->surface_bpp, 8);
-
-	mode_cmd.width = sizes->surface_width;
-	mode_cmd.height = sizes->surface_height;
-	mode_cmd.pitches[0] = sizes->surface_width * bytes_per_pixel;
-	mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
-		sizes->surface_depth);
-
-	size = mode_cmd.pitches[0] * mode_cmd.height;
+	size = sizes->surface_width * sizes->surface_height * bytes_per_pixel;
 	obj = drm_gem_cma_create(dev, size);
 	if (IS_ERR(obj))
 		return -ENOMEM;
@@ -464,15 +338,14 @@ drm_fbdev_cma_create(struct drm_fb_helper *helper,
 		goto err_gem_free_object;
 	}
 
-	fbdev_cma->fb = drm_fb_cma_alloc(dev, &mode_cmd, &obj, 1,
-					 fbdev_cma->fb_funcs);
-	if (IS_ERR(fbdev_cma->fb)) {
+	fb = drm_gem_fbdev_fb_create(dev, sizes, 0, &obj->base,
+				     fbdev_cma->fb_funcs);
+	if (IS_ERR(fb)) {
 		dev_err(dev->dev, "Failed to allocate DRM framebuffer.\n");
-		ret = PTR_ERR(fbdev_cma->fb);
+		ret = PTR_ERR(fb);
 		goto err_fb_info_destroy;
 	}
 
-	fb = &fbdev_cma->fb->fb;
 	helper->fb = fb;
 
 	fbi->par = helper;
@@ -500,7 +373,7 @@ drm_fbdev_cma_create(struct drm_fb_helper *helper,
 	return 0;
 
 err_cma_destroy:
-	drm_framebuffer_remove(&fbdev_cma->fb->fb);
+	drm_framebuffer_remove(fb);
 err_fb_info_destroy:
 	drm_fb_helper_fini(helper);
 err_gem_free_object:
@@ -570,6 +443,11 @@ struct drm_fbdev_cma *drm_fbdev_cma_init_with_funcs(struct drm_device *dev,
 }
 EXPORT_SYMBOL_GPL(drm_fbdev_cma_init_with_funcs);
 
+static const struct drm_framebuffer_funcs drm_fb_cma_funcs = {
+	.destroy	= drm_gem_fb_destroy,
+	.create_handle	= drm_gem_fb_create_handle,
+};
+
 /**
  * drm_fbdev_cma_init() - Allocate and initializes a drm_fbdev_cma struct
  * @dev: DRM device
@@ -597,8 +475,8 @@ void drm_fbdev_cma_fini(struct drm_fbdev_cma *fbdev_cma)
 	if (fbdev_cma->fb_helper.fbdev)
 		drm_fbdev_cma_defio_fini(fbdev_cma->fb_helper.fbdev);
 
-	if (fbdev_cma->fb)
-		drm_framebuffer_remove(&fbdev_cma->fb->fb);
+	if (fbdev_cma->fb_helper.fb)
+		drm_framebuffer_remove(fbdev_cma->fb_helper.fb);
 
 	drm_fb_helper_fini(&fbdev_cma->fb_helper);
 	kfree(fbdev_cma);
@@ -640,7 +518,7 @@ EXPORT_SYMBOL_GPL(drm_fbdev_cma_hotplug_event);
  * Calls drm_fb_helper_set_suspend, which is a wrapper around
  * fb_set_suspend implemented by fbdev core.
  */
-void drm_fbdev_cma_set_suspend(struct drm_fbdev_cma *fbdev_cma, int state)
+void drm_fbdev_cma_set_suspend(struct drm_fbdev_cma *fbdev_cma, bool state)
 {
 	if (fbdev_cma)
 		drm_fb_helper_set_suspend(&fbdev_cma->fb_helper, state);
@@ -657,7 +535,7 @@ EXPORT_SYMBOL(drm_fbdev_cma_set_suspend);
  * fb_set_suspend implemented by fbdev core.
  */
 void drm_fbdev_cma_set_suspend_unlocked(struct drm_fbdev_cma *fbdev_cma,
-					int state)
+					bool state)
 {
 	if (fbdev_cma)
 		drm_fb_helper_set_suspend_unlocked(&fbdev_cma->fb_helper,
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index 574af01..1b8f013 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -106,11 +106,11 @@ static DEFINE_MUTEX(kernel_fb_helper_lock);
  */
 
 #define drm_fb_helper_for_each_connector(fbh, i__) \
-	for (({ lockdep_assert_held(&(fbh)->dev->mode_config.mutex); }), \
+	for (({ lockdep_assert_held(&(fbh)->lock); }), \
 	     i__ = 0; i__ < (fbh)->connector_count; i__++)
 
-int drm_fb_helper_add_one_connector(struct drm_fb_helper *fb_helper,
-				    struct drm_connector *connector)
+static int __drm_fb_helper_add_one_connector(struct drm_fb_helper *fb_helper,
+					     struct drm_connector *connector)
 {
 	struct drm_fb_helper_connector *fb_conn;
 	struct drm_fb_helper_connector **temp;
@@ -119,7 +119,7 @@ int drm_fb_helper_add_one_connector(struct drm_fb_helper *fb_helper,
 	if (!drm_fbdev_emulation)
 		return 0;
 
-	WARN_ON(!mutex_is_locked(&fb_helper->dev->mode_config.mutex));
+	lockdep_assert_held(&fb_helper->lock);
 
 	count = fb_helper->connector_count + 1;
 
@@ -141,8 +141,21 @@ int drm_fb_helper_add_one_connector(struct drm_fb_helper *fb_helper,
 	drm_connector_get(connector);
 	fb_conn->connector = connector;
 	fb_helper->connector_info[fb_helper->connector_count++] = fb_conn;
+
 	return 0;
 }
+
+int drm_fb_helper_add_one_connector(struct drm_fb_helper *fb_helper,
+				    struct drm_connector *connector)
+{
+	int err;
+
+	mutex_lock(&fb_helper->lock);
+	err = __drm_fb_helper_add_one_connector(fb_helper, connector);
+	mutex_unlock(&fb_helper->lock);
+
+	return err;
+}
 EXPORT_SYMBOL(drm_fb_helper_add_one_connector);
 
 /**
@@ -169,11 +182,10 @@ int drm_fb_helper_single_add_all_connectors(struct drm_fb_helper *fb_helper)
 	if (!drm_fbdev_emulation)
 		return 0;
 
-	mutex_lock(&dev->mode_config.mutex);
+	mutex_lock(&fb_helper->lock);
 	drm_connector_list_iter_begin(dev, &conn_iter);
 	drm_for_each_connector_iter(connector, &conn_iter) {
-		ret = drm_fb_helper_add_one_connector(fb_helper, connector);
-
+		ret = __drm_fb_helper_add_one_connector(fb_helper, connector);
 		if (ret)
 			goto fail;
 	}
@@ -192,14 +204,14 @@ int drm_fb_helper_single_add_all_connectors(struct drm_fb_helper *fb_helper)
 	fb_helper->connector_count = 0;
 out:
 	drm_connector_list_iter_end(&conn_iter);
-	mutex_unlock(&dev->mode_config.mutex);
+	mutex_unlock(&fb_helper->lock);
 
 	return ret;
 }
 EXPORT_SYMBOL(drm_fb_helper_single_add_all_connectors);
 
-int drm_fb_helper_remove_one_connector(struct drm_fb_helper *fb_helper,
-				       struct drm_connector *connector)
+static int __drm_fb_helper_remove_one_connector(struct drm_fb_helper *fb_helper,
+						struct drm_connector *connector)
 {
 	struct drm_fb_helper_connector *fb_helper_connector;
 	int i, j;
@@ -207,9 +219,9 @@ int drm_fb_helper_remove_one_connector(struct drm_fb_helper *fb_helper,
 	if (!drm_fbdev_emulation)
 		return 0;
 
-	WARN_ON(!mutex_is_locked(&fb_helper->dev->mode_config.mutex));
+	lockdep_assert_held(&fb_helper->lock);
 
-	for (i = 0; i < fb_helper->connector_count; i++) {
+	drm_fb_helper_for_each_connector(fb_helper, i) {
 		if (fb_helper->connector_info[i]->connector == connector)
 			break;
 	}
@@ -227,23 +239,19 @@ int drm_fb_helper_remove_one_connector(struct drm_fb_helper *fb_helper,
 
 	return 0;
 }
-EXPORT_SYMBOL(drm_fb_helper_remove_one_connector);
 
-static void drm_fb_helper_save_lut_atomic(struct drm_crtc *crtc, struct drm_fb_helper *helper)
+int drm_fb_helper_remove_one_connector(struct drm_fb_helper *fb_helper,
+				       struct drm_connector *connector)
 {
-	uint16_t *r_base, *g_base, *b_base;
-	int i;
+	int err;
 
-	if (helper->funcs->gamma_get == NULL)
-		return;
+	mutex_lock(&fb_helper->lock);
+	err = __drm_fb_helper_remove_one_connector(fb_helper, connector);
+	mutex_unlock(&fb_helper->lock);
 
-	r_base = crtc->gamma_store;
-	g_base = r_base + crtc->gamma_size;
-	b_base = g_base + crtc->gamma_size;
-
-	for (i = 0; i < crtc->gamma_size; i++)
-		helper->funcs->gamma_get(crtc, &r_base[i], &g_base[i], &b_base[i], i);
+	return err;
 }
+EXPORT_SYMBOL(drm_fb_helper_remove_one_connector);
 
 static void drm_fb_helper_restore_lut_atomic(struct drm_crtc *crtc)
 {
@@ -285,7 +293,6 @@ int drm_fb_helper_debug_enter(struct fb_info *info)
 			if (drm_drv_uses_atomic_modeset(mode_set->crtc->dev))
 				continue;
 
-			drm_fb_helper_save_lut_atomic(mode_set->crtc, helper);
 			funcs->mode_set_base_atomic(mode_set->crtc,
 						    mode_set->fb,
 						    mode_set->x,
@@ -298,20 +305,6 @@ int drm_fb_helper_debug_enter(struct fb_info *info)
 }
 EXPORT_SYMBOL(drm_fb_helper_debug_enter);
 
-/* Find the real fb for a given fb helper CRTC */
-static struct drm_framebuffer *drm_mode_config_fb(struct drm_crtc *crtc)
-{
-	struct drm_device *dev = crtc->dev;
-	struct drm_crtc *c;
-
-	drm_for_each_crtc(c, dev) {
-		if (crtc->base.id == c->base.id)
-			return c->primary->fb;
-	}
-
-	return NULL;
-}
-
 /**
  * drm_fb_helper_debug_leave - implementation for &fb_ops.fb_debug_leave
  * @info: fbdev registered by the helper
@@ -328,8 +321,11 @@ int drm_fb_helper_debug_leave(struct fb_info *info)
 		struct drm_mode_set *mode_set = &helper->crtc_info[i].mode_set;
 
 		crtc = mode_set->crtc;
+		if (drm_drv_uses_atomic_modeset(crtc->dev))
+			continue;
+
 		funcs = crtc->helper_private;
-		fb = drm_mode_config_fb(crtc);
+		fb = crtc->primary->fb;
 
 		if (!crtc->enabled)
 			continue;
@@ -342,9 +338,6 @@ int drm_fb_helper_debug_leave(struct fb_info *info)
 		if (funcs->mode_set_base_atomic == NULL)
 			continue;
 
-		if (drm_drv_uses_atomic_modeset(crtc->dev))
-			continue;
-
 		drm_fb_helper_restore_lut_atomic(mode_set->crtc);
 		funcs->mode_set_base_atomic(mode_set->crtc, fb, crtc->x,
 					    crtc->y, LEAVE_ATOMIC_MODE_SET);
@@ -354,19 +347,24 @@ int drm_fb_helper_debug_leave(struct fb_info *info)
 }
 EXPORT_SYMBOL(drm_fb_helper_debug_leave);
 
-static int restore_fbdev_mode_atomic(struct drm_fb_helper *fb_helper)
+static int restore_fbdev_mode_atomic(struct drm_fb_helper *fb_helper, bool active)
 {
 	struct drm_device *dev = fb_helper->dev;
 	struct drm_plane *plane;
 	struct drm_atomic_state *state;
 	int i, ret;
 	unsigned int plane_mask;
+	struct drm_modeset_acquire_ctx ctx;
+
+	drm_modeset_acquire_init(&ctx, 0);
 
 	state = drm_atomic_state_alloc(dev);
-	if (!state)
-		return -ENOMEM;
+	if (!state) {
+		ret = -ENOMEM;
+		goto out_ctx;
+	}
 
-	state->acquire_ctx = dev->mode_config.acquire_ctx;
+	state->acquire_ctx = &ctx;
 retry:
 	plane_mask = 0;
 	drm_for_each_plane(plane, dev) {
@@ -375,7 +373,7 @@ static int restore_fbdev_mode_atomic(struct drm_fb_helper *fb_helper)
 		plane_state = drm_atomic_get_plane_state(state, plane);
 		if (IS_ERR(plane_state)) {
 			ret = PTR_ERR(plane_state);
-			goto fail;
+			goto out_state;
 		}
 
 		plane_state->rotation = DRM_MODE_ROTATE_0;
@@ -389,7 +387,7 @@ static int restore_fbdev_mode_atomic(struct drm_fb_helper *fb_helper)
 
 		ret = __drm_atomic_helper_disable_plane(plane, plane_state);
 		if (ret != 0)
-			goto fail;
+			goto out_state;
 	}
 
 	for (i = 0; i < fb_helper->crtc_count; i++) {
@@ -397,23 +395,38 @@ static int restore_fbdev_mode_atomic(struct drm_fb_helper *fb_helper)
 
 		ret = __drm_atomic_helper_set_config(mode_set, state);
 		if (ret != 0)
-			goto fail;
+			goto out_state;
+
+		/*
+		 * __drm_atomic_helper_set_config() sets active when a
+		 * mode is set; unconditionally clear it if we force DPMS off.
+		 */
+		if (!active) {
+			struct drm_crtc *crtc = mode_set->crtc;
+			struct drm_crtc_state *crtc_state = drm_atomic_get_new_crtc_state(state, crtc);
+
+			crtc_state->active = false;
+		}
 	}
 
 	ret = drm_atomic_commit(state);
 
-fail:
+out_state:
 	drm_atomic_clean_old_fb(dev, plane_mask, ret);
 
 	if (ret == -EDEADLK)
 		goto backoff;
 
 	drm_atomic_state_put(state);
+out_ctx:
+	drm_modeset_drop_locks(&ctx);
+	drm_modeset_acquire_fini(&ctx);
+
 	return ret;
 
 backoff:
 	drm_atomic_state_clear(state);
-	drm_atomic_legacy_backoff(state);
+	drm_modeset_backoff(&ctx);
 
 	goto retry;
 }
@@ -422,8 +435,9 @@ static int restore_fbdev_mode_legacy(struct drm_fb_helper *fb_helper)
 {
 	struct drm_device *dev = fb_helper->dev;
 	struct drm_plane *plane;
-	int i;
+	int i, ret = 0;
 
+	drm_modeset_lock_all(fb_helper->dev);
 	drm_for_each_plane(plane, dev) {
 		if (plane->type != DRM_PLANE_TYPE_PRIMARY)
 			drm_plane_force_disable(plane);
@@ -437,34 +451,33 @@ static int restore_fbdev_mode_legacy(struct drm_fb_helper *fb_helper)
 	for (i = 0; i < fb_helper->crtc_count; i++) {
 		struct drm_mode_set *mode_set = &fb_helper->crtc_info[i].mode_set;
 		struct drm_crtc *crtc = mode_set->crtc;
-		int ret;
 
 		if (crtc->funcs->cursor_set2) {
 			ret = crtc->funcs->cursor_set2(crtc, NULL, 0, 0, 0, 0, 0);
 			if (ret)
-				return ret;
+				goto out;
 		} else if (crtc->funcs->cursor_set) {
 			ret = crtc->funcs->cursor_set(crtc, NULL, 0, 0, 0);
 			if (ret)
-				return ret;
+				goto out;
 		}
 
 		ret = drm_mode_set_config_internal(mode_set);
 		if (ret)
-			return ret;
+			goto out;
 	}
+out:
+	drm_modeset_unlock_all(fb_helper->dev);
 
-	return 0;
+	return ret;
 }
 
 static int restore_fbdev_mode(struct drm_fb_helper *fb_helper)
 {
 	struct drm_device *dev = fb_helper->dev;
 
-	drm_warn_on_modeset_not_all_locked(dev);
-
 	if (drm_drv_uses_atomic_modeset(dev))
-		return restore_fbdev_mode_atomic(fb_helper);
+		return restore_fbdev_mode_atomic(fb_helper, true);
 	else
 		return restore_fbdev_mode_legacy(fb_helper);
 }
@@ -482,23 +495,26 @@ static int restore_fbdev_mode(struct drm_fb_helper *fb_helper)
  */
 int drm_fb_helper_restore_fbdev_mode_unlocked(struct drm_fb_helper *fb_helper)
 {
-	struct drm_device *dev = fb_helper->dev;
 	bool do_delayed;
 	int ret;
 
 	if (!drm_fbdev_emulation)
 		return -ENODEV;
 
-	drm_modeset_lock_all(dev);
+	if (READ_ONCE(fb_helper->deferred_setup))
+		return 0;
+
+	mutex_lock(&fb_helper->lock);
 	ret = restore_fbdev_mode(fb_helper);
 
 	do_delayed = fb_helper->delayed_hotplug;
 	if (do_delayed)
 		fb_helper->delayed_hotplug = false;
-	drm_modeset_unlock_all(dev);
+	mutex_unlock(&fb_helper->lock);
 
 	if (do_delayed)
 		drm_fb_helper_hotplug_event(fb_helper);
+
 	return ret;
 }
 EXPORT_SYMBOL(drm_fb_helper_restore_fbdev_mode_unlocked);
@@ -517,10 +533,12 @@ static bool drm_fb_helper_is_bound(struct drm_fb_helper *fb_helper)
 		return false;
 
 	drm_for_each_crtc(crtc, dev) {
+		drm_modeset_lock(&crtc->mutex, NULL);
 		if (crtc->primary->fb)
 			crtcs_bound++;
 		if (crtc->primary->fb == fb_helper->fb)
 			bound++;
+		drm_modeset_unlock(&crtc->mutex);
 	}
 
 	if (bound < crtcs_bound)
@@ -548,11 +566,11 @@ static bool drm_fb_helper_force_kernel_mode(void)
 		if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
 			continue;
 
-		drm_modeset_lock_all(dev);
+		mutex_lock(&helper->lock);
 		ret = restore_fbdev_mode(helper);
 		if (ret)
 			error = true;
-		drm_modeset_unlock_all(dev);
+		mutex_unlock(&helper->lock);
 	}
 	return error;
 }
@@ -581,23 +599,14 @@ static struct sysrq_key_op sysrq_drm_fb_helper_restore_op = {
 static struct sysrq_key_op sysrq_drm_fb_helper_restore_op = { };
 #endif
 
-static void drm_fb_helper_dpms(struct fb_info *info, int dpms_mode)
+static void dpms_legacy(struct drm_fb_helper *fb_helper, int dpms_mode)
 {
-	struct drm_fb_helper *fb_helper = info->par;
 	struct drm_device *dev = fb_helper->dev;
 	struct drm_crtc *crtc;
 	struct drm_connector *connector;
 	int i, j;
 
-	/*
-	 * For each CRTC in this fb, turn the connectors on/off.
-	 */
 	drm_modeset_lock_all(dev);
-	if (!drm_fb_helper_is_bound(fb_helper)) {
-		drm_modeset_unlock_all(dev);
-		return;
-	}
-
 	for (i = 0; i < fb_helper->crtc_count; i++) {
 		crtc = fb_helper->crtc_info[i].mode_set.crtc;
 
@@ -615,6 +624,26 @@ static void drm_fb_helper_dpms(struct fb_info *info, int dpms_mode)
 	drm_modeset_unlock_all(dev);
 }
 
+static void drm_fb_helper_dpms(struct fb_info *info, int dpms_mode)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+
+	/*
+	 * For each CRTC in this fb, turn the connectors on/off.
+	 */
+	mutex_lock(&fb_helper->lock);
+	if (!drm_fb_helper_is_bound(fb_helper)) {
+		mutex_unlock(&fb_helper->lock);
+		return;
+	}
+
+	if (drm_drv_uses_atomic_modeset(fb_helper->dev))
+		restore_fbdev_mode_atomic(fb_helper, dpms_mode == DRM_MODE_DPMS_ON);
+	else
+		dpms_legacy(fb_helper, dpms_mode);
+	mutex_unlock(&fb_helper->lock);
+}
+
 /**
  * drm_fb_helper_blank - implementation for &fb_ops.fb_blank
  * @blank: desired blanking state
@@ -734,6 +763,7 @@ void drm_fb_helper_prepare(struct drm_device *dev, struct drm_fb_helper *helper,
 	INIT_WORK(&helper->resume_work, drm_fb_helper_resume_worker);
 	INIT_WORK(&helper->dirty_work, drm_fb_helper_dirty_work);
 	helper->dirty_clip.x1 = helper->dirty_clip.y1 = ~0;
+	mutex_init(&helper->lock);
 	helper->funcs = funcs;
 	helper->dev = dev;
 }
@@ -899,6 +929,7 @@ void drm_fb_helper_fini(struct drm_fb_helper *fb_helper)
 	}
 	mutex_unlock(&kernel_fb_helper_lock);
 
+	mutex_destroy(&fb_helper->lock);
 	drm_fb_helper_crtc_free(fb_helper);
 
 }
@@ -1167,22 +1198,23 @@ void drm_fb_helper_set_suspend_unlocked(struct drm_fb_helper *fb_helper,
 }
 EXPORT_SYMBOL(drm_fb_helper_set_suspend_unlocked);
 
-static int setcolreg(struct drm_crtc *crtc, u16 red, u16 green,
-		     u16 blue, u16 regno, struct fb_info *info)
+static int setcmap_pseudo_palette(struct fb_cmap *cmap, struct fb_info *info)
 {
-	struct drm_fb_helper *fb_helper = info->par;
-	struct drm_framebuffer *fb = fb_helper->fb;
+	u32 *palette = (u32 *)info->pseudo_palette;
+	int i;
 
-	if (info->fix.visual == FB_VISUAL_TRUECOLOR) {
-		u32 *palette;
+	if (cmap->start + cmap->len > 16)
+		return -EINVAL;
+
+	for (i = 0; i < cmap->len; ++i) {
+		u16 red = cmap->red[i];
+		u16 green = cmap->green[i];
+		u16 blue = cmap->blue[i];
 		u32 value;
-		/* place color in psuedopalette */
-		if (regno > 16)
-			return -EINVAL;
-		palette = (u32 *)info->pseudo_palette;
-		red >>= (16 - info->var.red.length);
-		green >>= (16 - info->var.green.length);
-		blue >>= (16 - info->var.blue.length);
+
+		red >>= 16 - info->var.red.length;
+		green >>= 16 - info->var.green.length;
+		blue >>= 16 - info->var.blue.length;
 		value = (red << info->var.red.offset) |
 			(green << info->var.green.offset) |
 			(blue << info->var.blue.offset);
@@ -1192,25 +1224,171 @@ static int setcolreg(struct drm_crtc *crtc, u16 red, u16 green,
 			mask <<= info->var.transp.offset;
 			value |= mask;
 		}
-		palette[regno] = value;
-		return 0;
+		palette[cmap->start + i] = value;
 	}
 
-	/*
-	 * The driver really shouldn't advertise pseudo/directcolor
-	 * visuals if it can't deal with the palette.
-	 */
-	if (WARN_ON(!fb_helper->funcs->gamma_set ||
-		    !fb_helper->funcs->gamma_get))
-		return -EINVAL;
-
-	WARN_ON(fb->format->cpp[0] != 1);
-
-	fb_helper->funcs->gamma_set(crtc, red, green, blue, regno);
-
 	return 0;
 }
 
+static int setcmap_legacy(struct fb_cmap *cmap, struct fb_info *info)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+	struct drm_crtc *crtc;
+	u16 *r, *g, *b;
+	int i, ret = 0;
+
+	drm_modeset_lock_all(fb_helper->dev);
+	for (i = 0; i < fb_helper->crtc_count; i++) {
+		crtc = fb_helper->crtc_info[i].mode_set.crtc;
+		ret = -EINVAL;
+		if (!crtc->funcs->gamma_set || !crtc->gamma_size)
+			goto out;
+		if (cmap->start + cmap->len > crtc->gamma_size)
+			goto out;
+
+		r = crtc->gamma_store;
+		g = r + crtc->gamma_size;
+		b = g + crtc->gamma_size;
+
+		memcpy(r + cmap->start, cmap->red, cmap->len * sizeof(*r));
+		memcpy(g + cmap->start, cmap->green, cmap->len * sizeof(*g));
+		memcpy(b + cmap->start, cmap->blue, cmap->len * sizeof(*b));
+
+		ret = crtc->funcs->gamma_set(crtc, r, g, b,
+					     crtc->gamma_size, NULL);
+		if (ret)
+			goto out;
+	}
+out:
+	drm_modeset_unlock_all(fb_helper->dev);
+	return ret;
+}
+
+static struct drm_property_blob *setcmap_new_gamma_lut(struct drm_crtc *crtc,
+						       struct fb_cmap *cmap)
+{
+	struct drm_device *dev = crtc->dev;
+	struct drm_property_blob *gamma_lut;
+	struct drm_color_lut *lut;
+	int size = crtc->gamma_size;
+	int i;
+
+	if (!size || cmap->start + cmap->len > size)
+		return ERR_PTR(-EINVAL);
+
+	gamma_lut = drm_property_create_blob(dev, sizeof(*lut) * size, NULL);
+	if (IS_ERR(gamma_lut))
+		return gamma_lut;
+
+	lut = (struct drm_color_lut *)gamma_lut->data;
+	if (cmap->start || cmap->len != size) {
+		u16 *r = crtc->gamma_store;
+		u16 *g = r + crtc->gamma_size;
+		u16 *b = g + crtc->gamma_size;
+
+		for (i = 0; i < cmap->start; i++) {
+			lut[i].red = r[i];
+			lut[i].green = g[i];
+			lut[i].blue = b[i];
+		}
+		for (i = cmap->start + cmap->len; i < size; i++) {
+			lut[i].red = r[i];
+			lut[i].green = g[i];
+			lut[i].blue = b[i];
+		}
+	}
+
+	for (i = 0; i < cmap->len; i++) {
+		lut[cmap->start + i].red = cmap->red[i];
+		lut[cmap->start + i].green = cmap->green[i];
+		lut[cmap->start + i].blue = cmap->blue[i];
+	}
+
+	return gamma_lut;
+}
+
+static int setcmap_atomic(struct fb_cmap *cmap, struct fb_info *info)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+	struct drm_device *dev = fb_helper->dev;
+	struct drm_property_blob *gamma_lut = NULL;
+	struct drm_modeset_acquire_ctx ctx;
+	struct drm_crtc_state *crtc_state;
+	struct drm_atomic_state *state;
+	struct drm_crtc *crtc;
+	u16 *r, *g, *b;
+	int i, ret = 0;
+	bool replaced;
+
+	drm_modeset_acquire_init(&ctx, 0);
+
+	state = drm_atomic_state_alloc(dev);
+	if (!state) {
+		ret = -ENOMEM;
+		goto out_ctx;
+	}
+
+	state->acquire_ctx = &ctx;
+retry:
+	for (i = 0; i < fb_helper->crtc_count; i++) {
+		crtc = fb_helper->crtc_info[i].mode_set.crtc;
+
+		if (!gamma_lut)
+			gamma_lut = setcmap_new_gamma_lut(crtc, cmap);
+		if (IS_ERR(gamma_lut)) {
+			ret = PTR_ERR(gamma_lut);
+			gamma_lut = NULL;
+			goto out_state;
+		}
+
+		crtc_state = drm_atomic_get_crtc_state(state, crtc);
+		if (IS_ERR(crtc_state)) {
+			ret = PTR_ERR(crtc_state);
+			goto out_state;
+		}
+
+		replaced  = drm_property_replace_blob(&crtc_state->degamma_lut,
+						      NULL);
+		replaced |= drm_property_replace_blob(&crtc_state->ctm, NULL);
+		replaced |= drm_property_replace_blob(&crtc_state->gamma_lut,
+						      gamma_lut);
+		crtc_state->color_mgmt_changed |= replaced;
+	}
+
+	ret = drm_atomic_commit(state);
+	if (ret)
+		goto out_state;
+
+	for (i = 0; i < fb_helper->crtc_count; i++) {
+		crtc = fb_helper->crtc_info[i].mode_set.crtc;
+
+		r = crtc->gamma_store;
+		g = r + crtc->gamma_size;
+		b = g + crtc->gamma_size;
+
+		memcpy(r + cmap->start, cmap->red, cmap->len * sizeof(*r));
+		memcpy(g + cmap->start, cmap->green, cmap->len * sizeof(*g));
+		memcpy(b + cmap->start, cmap->blue, cmap->len * sizeof(*b));
+	}
+
+out_state:
+	if (ret == -EDEADLK)
+		goto backoff;
+
+	drm_property_blob_put(gamma_lut);
+	drm_atomic_state_put(state);
+out_ctx:
+	drm_modeset_drop_locks(&ctx);
+	drm_modeset_acquire_fini(&ctx);
+
+	return ret;
+
+backoff:
+	drm_atomic_state_clear(state);
+	drm_modeset_backoff(&ctx);
+	goto retry;
+}
+
 /**
  * drm_fb_helper_setcmap - implementation for &fb_ops.fb_setcmap
  * @cmap: cmap to set
@@ -1219,52 +1397,29 @@ static int setcolreg(struct drm_crtc *crtc, u16 red, u16 green,
 int drm_fb_helper_setcmap(struct fb_cmap *cmap, struct fb_info *info)
 {
 	struct drm_fb_helper *fb_helper = info->par;
-	struct drm_device *dev = fb_helper->dev;
-	const struct drm_crtc_helper_funcs *crtc_funcs;
-	u16 *red, *green, *blue, *transp;
-	struct drm_crtc *crtc;
-	int i, j, rc = 0;
-	int start;
+	int ret;
 
 	if (oops_in_progress)
 		return -EBUSY;
 
-	drm_modeset_lock_all(dev);
+	mutex_lock(&fb_helper->lock);
+
 	if (!drm_fb_helper_is_bound(fb_helper)) {
-		drm_modeset_unlock_all(dev);
-		return -EBUSY;
+		ret = -EBUSY;
+		goto out;
 	}
 
-	for (i = 0; i < fb_helper->crtc_count; i++) {
-		crtc = fb_helper->crtc_info[i].mode_set.crtc;
-		crtc_funcs = crtc->helper_private;
+	if (info->fix.visual == FB_VISUAL_TRUECOLOR)
+		ret = setcmap_pseudo_palette(cmap, info);
+	else if (drm_drv_uses_atomic_modeset(fb_helper->dev))
+		ret = setcmap_atomic(cmap, info);
+	else
+		ret = setcmap_legacy(cmap, info);
 
-		red = cmap->red;
-		green = cmap->green;
-		blue = cmap->blue;
-		transp = cmap->transp;
-		start = cmap->start;
+out:
+	mutex_unlock(&fb_helper->lock);
 
-		for (j = 0; j < cmap->len; j++) {
-			u16 hred, hgreen, hblue, htransp = 0xffff;
-
-			hred = *red++;
-			hgreen = *green++;
-			hblue = *blue++;
-
-			if (transp)
-				htransp = *transp++;
-
-			rc = setcolreg(crtc, hred, hgreen, hblue, start++, info);
-			if (rc)
-				goto out;
-		}
-		if (crtc_funcs->load_lut)
-			crtc_funcs->load_lut(crtc);
-	}
- out:
-	drm_modeset_unlock_all(dev);
-	return rc;
+	return ret;
 }
 EXPORT_SYMBOL(drm_fb_helper_setcmap);
 
@@ -1281,12 +1436,11 @@ int drm_fb_helper_ioctl(struct fb_info *info, unsigned int cmd,
 			unsigned long arg)
 {
 	struct drm_fb_helper *fb_helper = info->par;
-	struct drm_device *dev = fb_helper->dev;
 	struct drm_mode_set *mode_set;
 	struct drm_crtc *crtc;
 	int ret = 0;
 
-	mutex_lock(&dev->mode_config.mutex);
+	mutex_lock(&fb_helper->lock);
 	if (!drm_fb_helper_is_bound(fb_helper)) {
 		ret = -EBUSY;
 		goto unlock;
@@ -1331,7 +1485,7 @@ int drm_fb_helper_ioctl(struct fb_info *info, unsigned int cmd,
 	}
 
 unlock:
-	mutex_unlock(&dev->mode_config.mutex);
+	mutex_unlock(&fb_helper->lock);
 	return ret;
 }
 EXPORT_SYMBOL(drm_fb_helper_ioctl);
@@ -1463,61 +1617,36 @@ int drm_fb_helper_set_par(struct fb_info *info)
 }
 EXPORT_SYMBOL(drm_fb_helper_set_par);
 
-static int pan_display_atomic(struct fb_var_screeninfo *var,
-			      struct fb_info *info)
+static void pan_set(struct drm_fb_helper *fb_helper, int x, int y)
 {
-	struct drm_fb_helper *fb_helper = info->par;
-	struct drm_device *dev = fb_helper->dev;
-	struct drm_atomic_state *state;
-	struct drm_plane *plane;
-	int i, ret;
-	unsigned int plane_mask;
+	int i;
 
-	state = drm_atomic_state_alloc(dev);
-	if (!state)
-		return -ENOMEM;
-
-	state->acquire_ctx = dev->mode_config.acquire_ctx;
-retry:
-	plane_mask = 0;
 	for (i = 0; i < fb_helper->crtc_count; i++) {
 		struct drm_mode_set *mode_set;
 
 		mode_set = &fb_helper->crtc_info[i].mode_set;
 
-		mode_set->x = var->xoffset;
-		mode_set->y = var->yoffset;
-
-		ret = __drm_atomic_helper_set_config(mode_set, state);
-		if (ret != 0)
-			goto fail;
-
-		plane = mode_set->crtc->primary;
-		plane_mask |= (1 << drm_plane_index(plane));
-		plane->old_fb = plane->fb;
+		mode_set->x = x;
+		mode_set->y = y;
 	}
+}
 
-	ret = drm_atomic_commit(state);
-	if (ret != 0)
-		goto fail;
+static int pan_display_atomic(struct fb_var_screeninfo *var,
+			      struct fb_info *info)
+{
+	struct drm_fb_helper *fb_helper = info->par;
+	int ret;
 
-	info->var.xoffset = var->xoffset;
-	info->var.yoffset = var->yoffset;
+	pan_set(fb_helper, var->xoffset, var->yoffset);
 
-fail:
-	drm_atomic_clean_old_fb(dev, plane_mask, ret);
+	ret = restore_fbdev_mode_atomic(fb_helper, true);
+	if (!ret) {
+		info->var.xoffset = var->xoffset;
+		info->var.yoffset = var->yoffset;
+	} else
+		pan_set(fb_helper, info->var.xoffset, info->var.yoffset);
 
-	if (ret == -EDEADLK)
-		goto backoff;
-
-	drm_atomic_state_put(state);
 	return ret;
-
-backoff:
-	drm_atomic_state_clear(state);
-	drm_atomic_legacy_backoff(state);
-
-	goto retry;
 }
 
 static int pan_display_legacy(struct fb_var_screeninfo *var,
@@ -1528,6 +1657,7 @@ static int pan_display_legacy(struct fb_var_screeninfo *var,
 	int ret = 0;
 	int i;
 
+	drm_modeset_lock_all(fb_helper->dev);
 	for (i = 0; i < fb_helper->crtc_count; i++) {
 		modeset = &fb_helper->crtc_info[i].mode_set;
 
@@ -1542,6 +1672,7 @@ static int pan_display_legacy(struct fb_var_screeninfo *var,
 			}
 		}
 	}
+	drm_modeset_unlock_all(fb_helper->dev);
 
 	return ret;
 }
@@ -1561,9 +1692,9 @@ int drm_fb_helper_pan_display(struct fb_var_screeninfo *var,
 	if (oops_in_progress)
 		return -EBUSY;
 
-	drm_modeset_lock_all(dev);
+	mutex_lock(&fb_helper->lock);
 	if (!drm_fb_helper_is_bound(fb_helper)) {
-		drm_modeset_unlock_all(dev);
+		mutex_unlock(&fb_helper->lock);
 		return -EBUSY;
 	}
 
@@ -1571,7 +1702,7 @@ int drm_fb_helper_pan_display(struct fb_var_screeninfo *var,
 		ret = pan_display_atomic(var, info);
 	else
 		ret = pan_display_legacy(var, info);
-	drm_modeset_unlock_all(dev);
+	mutex_unlock(&fb_helper->lock);
 
 	return ret;
 }
@@ -1579,8 +1710,7 @@ EXPORT_SYMBOL(drm_fb_helper_pan_display);
 
 /*
  * Allocates the backing storage and sets up the fbdev info structure through
- * the ->fb_probe callback and then registers the fbdev and sets up the panic
- * notifier.
+ * the ->fb_probe callback.
  */
 static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
 					 int preferred_bpp)
@@ -1678,13 +1808,8 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
 	}
 
 	if (crtc_count == 0 || sizes.fb_width == -1 || sizes.fb_height == -1) {
-		/*
-		 * hmm everyone went away - assume VGA cable just fell out
-		 * and will come back later.
-		 */
-		DRM_INFO("Cannot find any crtc or sizes - going 1024x768\n");
-		sizes.fb_width = sizes.surface_width = 1024;
-		sizes.fb_height = sizes.surface_height = 768;
+		DRM_INFO("Cannot find any crtc or sizes\n");
+		return -EAGAIN;
 	}
 
 	/* Handle our overallocation */
@@ -1696,17 +1821,6 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
 	if (ret < 0)
 		return ret;
 
-	/*
-	 * Set the fb pointer - usually drm_setup_crtcs does this for hotplug
-	 * events, but at init time drm_setup_crtcs needs to be called before
-	 * the fb is allocated (since we need to figure out the desired size of
-	 * the fb before we can allocate it ...). Hence we need to fix things up
-	 * here again.
-	 */
-	for (i = 0; i < fb_helper->crtc_count; i++)
-		if (fb_helper->crtc_info[i].mode_set.num_connectors)
-			fb_helper->crtc_info[i].mode_set.fb = fb_helper->fb;
-
 	return 0;
 }
 
@@ -1768,8 +1882,6 @@ void drm_fb_helper_fill_var(struct fb_info *info, struct drm_fb_helper *fb_helpe
 	info->var.xoffset = 0;
 	info->var.yoffset = 0;
 	info->var.activate = FB_ACTIVATE_NOW;
-	info->var.height = -1;
-	info->var.width = -1;
 
 	switch (fb->format->depth) {
 	case 8:
@@ -1831,12 +1943,11 @@ void drm_fb_helper_fill_var(struct fb_info *info, struct drm_fb_helper *fb_helpe
 EXPORT_SYMBOL(drm_fb_helper_fill_var);
 
 static int drm_fb_helper_probe_connector_modes(struct drm_fb_helper *fb_helper,
-					       uint32_t maxX,
-					       uint32_t maxY)
+						uint32_t maxX,
+						uint32_t maxY)
 {
 	struct drm_connector *connector;
-	int count = 0;
-	int i;
+	int i, count = 0;
 
 	drm_fb_helper_for_each_connector(fb_helper, i) {
 		connector = fb_helper->connector_info[i]->connector;
@@ -2234,11 +2345,8 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper,
 	int i;
 
 	DRM_DEBUG_KMS("\n");
-	if (drm_fb_helper_probe_connector_modes(fb_helper, width, height) == 0)
-		DRM_DEBUG_KMS("No connectors reported connected with modes\n");
-
 	/* prevent concurrent modification of connector_count by hotplug */
-	lockdep_assert_held(&fb_helper->dev->mode_config.mutex);
+	lockdep_assert_held(&fb_helper->lock);
 
 	crtcs = kcalloc(fb_helper->connector_count,
 			sizeof(struct drm_fb_helper_crtc *), GFP_KERNEL);
@@ -2253,6 +2361,9 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper,
 		goto out;
 	}
 
+	mutex_lock(&fb_helper->dev->mode_config.mutex);
+	if (drm_fb_helper_probe_connector_modes(fb_helper, width, height) == 0)
+		DRM_DEBUG_KMS("No connectors reported connected with modes\n");
 	drm_enable_connectors(fb_helper, enabled);
 
 	if (!(fb_helper->funcs->initial_config &&
@@ -2274,6 +2385,7 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper,
 
 		drm_pick_crtcs(fb_helper, crtcs, modes, 0, width, height);
 	}
+	mutex_unlock(&fb_helper->dev->mode_config.mutex);
 
 	/* need to set the modesets up here for use later */
 	/* fill out the connector<->crtc mappings into the modesets */
@@ -2285,9 +2397,9 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper,
 		struct drm_display_mode *mode = modes[i];
 		struct drm_fb_helper_crtc *fb_crtc = crtcs[i];
 		struct drm_fb_offset *offset = &offsets[i];
-		struct drm_mode_set *modeset = &fb_crtc->mode_set;
 
 		if (mode && fb_crtc) {
+			struct drm_mode_set *modeset = &fb_crtc->mode_set;
 			struct drm_connector *connector =
 				fb_helper->connector_info[i]->connector;
 
@@ -2301,7 +2413,6 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper,
 							   fb_crtc->desired_mode);
 			drm_connector_get(connector);
 			modeset->connectors[modeset->num_connectors++] = connector;
-			modeset->fb = fb_helper->fb;
 			modeset->x = offset->x;
 			modeset->y = offset->y;
 		}
@@ -2313,6 +2424,91 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper,
 	kfree(enabled);
 }
 
+/*
+ * This is a continuation of drm_setup_crtcs() that sets up anything related
+ * to the framebuffer. During initialization, drm_setup_crtcs() is called before
+ * the framebuffer has been allocated (fb_helper->fb and fb_helper->fbdev).
+ * So, any setup that touches those fields needs to be done here instead of in
+ * drm_setup_crtcs().
+ */
+static void drm_setup_crtcs_fb(struct drm_fb_helper *fb_helper)
+{
+	struct fb_info *info = fb_helper->fbdev;
+	int i;
+
+	for (i = 0; i < fb_helper->crtc_count; i++)
+		if (fb_helper->crtc_info[i].mode_set.num_connectors)
+			fb_helper->crtc_info[i].mode_set.fb = fb_helper->fb;
+
+	mutex_lock(&fb_helper->dev->mode_config.mutex);
+	drm_fb_helper_for_each_connector(fb_helper, i) {
+		struct drm_connector *connector =
+					fb_helper->connector_info[i]->connector;
+
+		/* use first connected connector for the physical dimensions */
+		if (connector->status == connector_status_connected) {
+			info->var.width = connector->display_info.width_mm;
+			info->var.height = connector->display_info.height_mm;
+			break;
+		}
+	}
+	mutex_unlock(&fb_helper->dev->mode_config.mutex);
+}
+
+/* Note: Drops fb_helper->lock before returning. */
+static int
+__drm_fb_helper_initial_config_and_unlock(struct drm_fb_helper *fb_helper,
+					  int bpp_sel)
+{
+	struct drm_device *dev = fb_helper->dev;
+	struct fb_info *info;
+	unsigned int width, height;
+	int ret;
+
+	width = dev->mode_config.max_width;
+	height = dev->mode_config.max_height;
+
+	drm_setup_crtcs(fb_helper, width, height);
+	ret = drm_fb_helper_single_fb_probe(fb_helper, bpp_sel);
+	if (ret < 0) {
+		if (ret == -EAGAIN) {
+			fb_helper->preferred_bpp = bpp_sel;
+			fb_helper->deferred_setup = true;
+			ret = 0;
+		}
+		mutex_unlock(&fb_helper->lock);
+
+		return ret;
+	}
+	drm_setup_crtcs_fb(fb_helper);
+
+	fb_helper->deferred_setup = false;
+
+	info = fb_helper->fbdev;
+	info->var.pixclock = 0;
+
+	/* Need to drop locks to avoid recursive deadlock in
+	 * register_framebuffer. This is ok because the only thing left to do is
+	 * register the fbdev emulation instance in kernel_fb_helper_list. */
+	mutex_unlock(&fb_helper->lock);
+
+	ret = register_framebuffer(info);
+	if (ret < 0)
+		return ret;
+
+	dev_info(dev->dev, "fb%d: %s frame buffer device\n",
+		 info->node, info->fix.id);
+
+	mutex_lock(&kernel_fb_helper_lock);
+	if (list_empty(&kernel_fb_helper_list))
+		register_sysrq_key('v', &sysrq_drm_fb_helper_restore_op);
+
+	list_add(&fb_helper->kernel_fb_list, &kernel_fb_helper_list);
+	mutex_unlock(&kernel_fb_helper_lock);
+
+	return 0;
+}
+
 /**
  * drm_fb_helper_initial_config - setup a sane initial connector configuration
  * @fb_helper: fb_helper device struct
@@ -2357,39 +2553,15 @@ static void drm_setup_crtcs(struct drm_fb_helper *fb_helper,
  */
 int drm_fb_helper_initial_config(struct drm_fb_helper *fb_helper, int bpp_sel)
 {
-	struct drm_device *dev = fb_helper->dev;
-	struct fb_info *info;
 	int ret;
 
 	if (!drm_fbdev_emulation)
 		return 0;
 
-	mutex_lock(&dev->mode_config.mutex);
-	drm_setup_crtcs(fb_helper,
-			dev->mode_config.max_width,
-			dev->mode_config.max_height);
-	ret = drm_fb_helper_single_fb_probe(fb_helper, bpp_sel);
-	mutex_unlock(&dev->mode_config.mutex);
-	if (ret)
-		return ret;
+	mutex_lock(&fb_helper->lock);
+	ret = __drm_fb_helper_initial_config_and_unlock(fb_helper, bpp_sel);
 
-	info = fb_helper->fbdev;
-	info->var.pixclock = 0;
-	ret = register_framebuffer(info);
-	if (ret < 0)
-		return ret;
-
-	dev_info(dev->dev, "fb%d: %s frame buffer device\n",
-		 info->node, info->fix.id);
-
-	mutex_lock(&kernel_fb_helper_lock);
-	if (list_empty(&kernel_fb_helper_list))
-		register_sysrq_key('v', &sysrq_drm_fb_helper_restore_op);
-
-	list_add(&fb_helper->kernel_fb_list, &kernel_fb_helper_list);
-	mutex_unlock(&kernel_fb_helper_lock);
-
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(drm_fb_helper_initial_config);
 
@@ -2416,22 +2588,29 @@ EXPORT_SYMBOL(drm_fb_helper_initial_config);
  */
 int drm_fb_helper_hotplug_event(struct drm_fb_helper *fb_helper)
 {
-	struct drm_device *dev = fb_helper->dev;
+	int err = 0;
 
 	if (!drm_fbdev_emulation)
 		return 0;
 
-	mutex_lock(&dev->mode_config.mutex);
+	mutex_lock(&fb_helper->lock);
+	if (fb_helper->deferred_setup) {
+		err = __drm_fb_helper_initial_config_and_unlock(fb_helper,
+				fb_helper->preferred_bpp);
+		return err;
+	}
+
 	if (!fb_helper->fb || !drm_fb_helper_is_bound(fb_helper)) {
 		fb_helper->delayed_hotplug = true;
-		mutex_unlock(&dev->mode_config.mutex);
-		return 0;
+		mutex_unlock(&fb_helper->lock);
+		return err;
 	}
+
 	DRM_DEBUG_KMS("\n");
 
 	drm_setup_crtcs(fb_helper, fb_helper->fb->width, fb_helper->fb->height);
-
-	mutex_unlock(&dev->mode_config.mutex);
+	drm_setup_crtcs_fb(fb_helper);
+	mutex_unlock(&fb_helper->lock);
 
 	drm_fb_helper_set_par(fb_helper->fbdev);
 
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 84f3a24..b3c6e99 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -75,7 +75,7 @@ DEFINE_MUTEX(drm_global_mutex);
  * for drivers which use the CMA GEM helpers it's drm_gem_cma_mmap().
  *
  * No other file operations are supported by the DRM userspace API. Overall the
- * following is an example #file_operations structure::
+ * following is an example &file_operations structure::
  *
  *     static const example_drm_fops = {
  *             .owner = THIS_MODULE,
@@ -92,6 +92,11 @@ DEFINE_MUTEX(drm_global_mutex);
  * For plain GEM based drivers there is the DEFINE_DRM_GEM_FOPS() macro, and for
  * CMA based drivers there is the DEFINE_DRM_GEM_CMA_FOPS() macro to make this
  * simpler.
+ *
+ * The driver's &file_operations must be stored in &drm_driver.fops.
+ *
+ * For driver-private IOCTL handling see the more detailed discussion in
+ * :ref:`IOCTL support in the userland interfaces chapter<drm_driver_ioctl>`.
  */
 
 static int drm_open_helper(struct file *filp, struct drm_minor *minor);
@@ -431,7 +436,7 @@ int drm_release(struct inode *inode, struct file *filp)
 
 	if (!--dev->open_count) {
 		drm_lastclose(dev);
-		if (drm_device_is_unplugged(dev))
+		if (drm_dev_is_unplugged(dev))
 			drm_put_dev(dev);
 	}
 	mutex_unlock(&drm_global_mutex);
diff --git a/drivers/gpu/drm/drm_framebuffer.c b/drivers/gpu/drm/drm_framebuffer.c
index b3ef4f1..af27984 100644
--- a/drivers/gpu/drm/drm_framebuffer.c
+++ b/drivers/gpu/drm/drm_framebuffer.c
@@ -817,7 +817,7 @@ static int atomic_remove_fb(struct drm_framebuffer *fb)
 		plane->old_fb = plane->fb;
 	}
 
-	for_each_connector_in_state(state, conn, conn_state, i) {
+	for_each_new_connector_in_state(state, conn, conn_state, i) {
 		ret = drm_atomic_set_crtc_for_connector(conn_state, NULL);
 
 		if (ret)
diff --git a/drivers/gpu/drm/drm_gem.c b/drivers/gpu/drm/drm_gem.c
index cdaac37..88c6d78 100644
--- a/drivers/gpu/drm/drm_gem.c
+++ b/drivers/gpu/drm/drm_gem.c
@@ -311,6 +311,41 @@ drm_gem_handle_delete(struct drm_file *filp, u32 handle)
 EXPORT_SYMBOL(drm_gem_handle_delete);
 
 /**
+ * drm_gem_dumb_map_offset - return the fake mmap offset for a gem object
+ * @file: drm file-private structure containing the gem object
+ * @dev: corresponding drm_device
+ * @handle: gem object handle
+ * @offset: return location for the fake mmap offset
+ *
+ * This implements the &drm_driver.dumb_map_offset kms driver callback for
+ * drivers which use gem to manage their backing storage.
+ *
+ * Returns:
+ * 0 on success or a negative error code on failure.
+ */
+int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
+			    u32 handle, u64 *offset)
+{
+	struct drm_gem_object *obj;
+	int ret;
+
+	obj = drm_gem_object_lookup(file, handle);
+	if (!obj)
+		return -ENOENT;
+
+	ret = drm_gem_create_mmap_offset(obj);
+	if (ret)
+		goto out;
+
+	*offset = drm_vma_node_offset_addr(&obj->vma_node);
+out:
+	drm_gem_object_put_unlocked(obj);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(drm_gem_dumb_map_offset);
+
+/**
  * drm_gem_dumb_destroy - dumb fb callback helper for gem based drivers
  * @file: drm file-private structure to remove the dumb handle from
  * @dev: corresponding drm_device
@@ -826,13 +861,15 @@ drm_gem_object_put_unlocked(struct drm_gem_object *obj)
 		return;
 
 	dev = obj->dev;
-	might_lock(&dev->struct_mutex);
 
-	if (dev->driver->gem_free_object_unlocked)
+	if (dev->driver->gem_free_object_unlocked) {
 		kref_put(&obj->refcount, drm_gem_object_free);
-	else if (kref_put_mutex(&obj->refcount, drm_gem_object_free,
+	} else {
+		might_lock(&dev->struct_mutex);
+		if (kref_put_mutex(&obj->refcount, drm_gem_object_free,
 				&dev->struct_mutex))
-		mutex_unlock(&dev->struct_mutex);
+			mutex_unlock(&dev->struct_mutex);
+	}
 }
 EXPORT_SYMBOL(drm_gem_object_put_unlocked);
 
@@ -964,7 +1001,7 @@ int drm_gem_mmap(struct file *filp, struct vm_area_struct *vma)
 	struct drm_vma_offset_node *node;
 	int ret;
 
-	if (drm_device_is_unplugged(dev))
+	if (drm_dev_is_unplugged(dev))
 		return -ENODEV;
 
 	drm_vma_offset_lock_lookup(dev->vma_offset_manager);
diff --git a/drivers/gpu/drm/drm_gem_cma_helper.c b/drivers/gpu/drm/drm_gem_cma_helper.c
index bc28e75..373e33f 100644
--- a/drivers/gpu/drm/drm_gem_cma_helper.c
+++ b/drivers/gpu/drm/drm_gem_cma_helper.c
@@ -177,7 +177,7 @@ drm_gem_cma_create_with_handle(struct drm_file *file_priv,
  * This function frees the backing memory of the CMA GEM object, cleans up the
  * GEM object state and frees the memory used to store the object itself.
  * Drivers using the CMA helpers should set this as their
- * &drm_driver.gem_free_object callback.
+ * &drm_driver.gem_free_object_unlocked callback.
  */
 void drm_gem_cma_free_object(struct drm_gem_object *gem_obj)
 {
@@ -264,41 +264,6 @@ int drm_gem_cma_dumb_create(struct drm_file *file_priv,
 }
 EXPORT_SYMBOL_GPL(drm_gem_cma_dumb_create);
 
-/**
- * drm_gem_cma_dumb_map_offset - return the fake mmap offset for a CMA GEM
- *     object
- * @file_priv: DRM file-private structure containing the GEM object
- * @drm: DRM device
- * @handle: GEM object handle
- * @offset: return location for the fake mmap offset
- *
- * This function look up an object by its handle and returns the fake mmap
- * offset associated with it. Drivers using the CMA helpers should set this
- * as their &drm_driver.dumb_map_offset callback.
- *
- * Returns:
- * 0 on success or a negative error code on failure.
- */
-int drm_gem_cma_dumb_map_offset(struct drm_file *file_priv,
-				struct drm_device *drm, u32 handle,
-				u64 *offset)
-{
-	struct drm_gem_object *gem_obj;
-
-	gem_obj = drm_gem_object_lookup(file_priv, handle);
-	if (!gem_obj) {
-		dev_err(drm->dev, "failed to lookup GEM object\n");
-		return -EINVAL;
-	}
-
-	*offset = drm_vma_node_offset_addr(&gem_obj->vma_node);
-
-	drm_gem_object_put_unlocked(gem_obj);
-
-	return 0;
-}
-EXPORT_SYMBOL_GPL(drm_gem_cma_dumb_map_offset);
-
 const struct vm_operations_struct drm_gem_cma_vm_ops = {
 	.open = drm_gem_vm_open,
 	.close = drm_gem_vm_close,
@@ -390,7 +355,7 @@ unsigned long drm_gem_cma_get_unmapped_area(struct file *filp,
 	struct drm_device *dev = priv->minor->dev;
 	struct drm_vma_offset_node *node;
 
-	if (drm_device_is_unplugged(dev))
+	if (drm_dev_is_unplugged(dev))
 		return -ENODEV;
 
 	drm_vma_offset_lock_lookup(dev->vma_offset_manager);
diff --git a/drivers/gpu/drm/drm_gem_framebuffer_helper.c b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
new file mode 100644
index 0000000..d54a083
--- /dev/null
+++ b/drivers/gpu/drm/drm_gem_framebuffer_helper.c
@@ -0,0 +1,283 @@
+/*
+ * drm gem framebuffer helper functions
+ *
+ * Copyright (C) 2017 Noralf Trønnes
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/dma-buf.h>
+#include <linux/dma-fence.h>
+#include <linux/reservation.h>
+#include <linux/slab.h>
+
+#include <drm/drmP.h>
+#include <drm/drm_atomic.h>
+#include <drm/drm_fb_helper.h>
+#include <drm/drm_fourcc.h>
+#include <drm/drm_framebuffer.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_gem_framebuffer_helper.h>
+#include <drm/drm_modeset_helper.h>
+
+/**
+ * DOC: overview
+ *
+ * This library provides helpers for drivers that don't subclass
+ * &drm_framebuffer and use &drm_gem_object for their backing storage.
+ *
+ * Drivers without additional needs to validate framebuffers can simply use
+ * drm_gem_fb_create() and everything is wired up automatically. But all
+ * parts can be used individually.
+ */
+
+/**
+ * drm_gem_fb_get_obj() - Get GEM object for framebuffer
+ * @fb: The framebuffer
+ * @plane: Which plane
+ *
+ * Returns the GEM object for given framebuffer.
+ */
+struct drm_gem_object *drm_gem_fb_get_obj(struct drm_framebuffer *fb,
+					  unsigned int plane)
+{
+	if (plane >= 4)
+		return NULL;
+
+	return fb->obj[plane];
+}
+EXPORT_SYMBOL_GPL(drm_gem_fb_get_obj);
+
+static struct drm_framebuffer *
+drm_gem_fb_alloc(struct drm_device *dev,
+		 const struct drm_mode_fb_cmd2 *mode_cmd,
+		 struct drm_gem_object **obj, unsigned int num_planes,
+		 const struct drm_framebuffer_funcs *funcs)
+{
+	struct drm_framebuffer *fb;
+	int ret, i;
+
+	fb = kzalloc(sizeof(*fb), GFP_KERNEL);
+	if (!fb)
+		return ERR_PTR(-ENOMEM);
+
+	drm_helper_mode_fill_fb_struct(dev, fb, mode_cmd);
+
+	for (i = 0; i < num_planes; i++)
+		fb->obj[i] = obj[i];
+
+	ret = drm_framebuffer_init(dev, fb, funcs);
+	if (ret) {
+		DRM_DEV_ERROR(dev->dev, "Failed to init framebuffer: %d\n",
+			      ret);
+		kfree(fb);
+		return ERR_PTR(ret);
+	}
+
+	return fb;
+}
+
+/**
+ * drm_gem_fb_destroy - Free GEM backed framebuffer
+ * @fb: DRM framebuffer
+ *
+ * Frees a GEM backed framebuffer with its backing buffer(s) and the structure
+ * itself. Drivers can use this as their &drm_framebuffer_funcs.destroy
+ * callback.
+ */
+void drm_gem_fb_destroy(struct drm_framebuffer *fb)
+{
+	int i;
+
+	for (i = 0; i < 4; i++)
+		drm_gem_object_put_unlocked(fb->obj[i]);
+
+	drm_framebuffer_cleanup(fb);
+	kfree(fb);
+}
+EXPORT_SYMBOL(drm_gem_fb_destroy);
+
+/**
+ * drm_gem_fb_create_handle - Create handle for GEM backed framebuffer
+ * @fb: DRM framebuffer
+ * @file: drm file
+ * @handle: handle created
+ *
+ * Drivers can use this as their &drm_framebuffer_funcs.create_handle
+ * callback.
+ *
+ * Returns:
+ * 0 on success or a negative error code on failure.
+ */
+int drm_gem_fb_create_handle(struct drm_framebuffer *fb, struct drm_file *file,
+			     unsigned int *handle)
+{
+	return drm_gem_handle_create(file, fb->obj[0], handle);
+}
+EXPORT_SYMBOL(drm_gem_fb_create_handle);
+
+/**
+ * drm_gem_fb_create_with_funcs() - helper function for the
+ *                                  &drm_mode_config_funcs.fb_create
+ *                                  callback
+ * @dev: DRM device
+ * @file: drm file for the ioctl call
+ * @mode_cmd: metadata from the userspace fb creation request
+ * @funcs: vtable to be used for the new framebuffer object
+ *
+ * This can be used to set &drm_framebuffer_funcs for drivers that need the
+ * &drm_framebuffer_funcs.dirty callback. Use drm_gem_fb_create() if you don't
+ * need to change &drm_framebuffer_funcs.
+ * The function does buffer size validation.
+ */
+struct drm_framebuffer *
+drm_gem_fb_create_with_funcs(struct drm_device *dev, struct drm_file *file,
+			     const struct drm_mode_fb_cmd2 *mode_cmd,
+			     const struct drm_framebuffer_funcs *funcs)
+{
+	const struct drm_format_info *info;
+	struct drm_gem_object *objs[4];
+	struct drm_framebuffer *fb;
+	int ret, i;
+
+	info = drm_get_format_info(dev, mode_cmd);
+	if (!info)
+		return ERR_PTR(-EINVAL);
+
+	for (i = 0; i < info->num_planes; i++) {
+		unsigned int width = mode_cmd->width / (i ? info->hsub : 1);
+		unsigned int height = mode_cmd->height / (i ? info->vsub : 1);
+		unsigned int min_size;
+
+		objs[i] = drm_gem_object_lookup(file, mode_cmd->handles[i]);
+		if (!objs[i]) {
+			DRM_DEV_ERROR(dev->dev, "Failed to lookup GEM\n");
+			ret = -ENOENT;
+			goto err_gem_object_put;
+		}
+
+		min_size = (height - 1) * mode_cmd->pitches[i]
+			 + width * info->cpp[i]
+			 + mode_cmd->offsets[i];
+
+		if (objs[i]->size < min_size) {
+			drm_gem_object_put_unlocked(objs[i]);
+			ret = -EINVAL;
+			goto err_gem_object_put;
+		}
+	}
+
+	fb = drm_gem_fb_alloc(dev, mode_cmd, objs, i, funcs);
+	if (IS_ERR(fb)) {
+		ret = PTR_ERR(fb);
+		goto err_gem_object_put;
+	}
+
+	return fb;
+
+err_gem_object_put:
+	for (i--; i >= 0; i--)
+		drm_gem_object_put_unlocked(objs[i]);
+
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(drm_gem_fb_create_with_funcs);
+
+static const struct drm_framebuffer_funcs drm_gem_fb_funcs = {
+	.destroy	= drm_gem_fb_destroy,
+	.create_handle	= drm_gem_fb_create_handle,
+};
+
+/**
+ * drm_gem_fb_create() - &drm_mode_config_funcs.fb_create callback function
+ * @dev: DRM device
+ * @file: drm file for the ioctl call
+ * @mode_cmd: metadata from the userspace fb creation request
+ *
+ * If your hardware has special alignment or pitch requirements these should be
+ * checked before calling this function. The function does buffer size
+ * validation. Use drm_gem_fb_create_with_funcs() if you need to set
+ * &drm_framebuffer_funcs.dirty.
+ */
+struct drm_framebuffer *
+drm_gem_fb_create(struct drm_device *dev, struct drm_file *file,
+		  const struct drm_mode_fb_cmd2 *mode_cmd)
+{
+	return drm_gem_fb_create_with_funcs(dev, file, mode_cmd,
+					    &drm_gem_fb_funcs);
+}
+EXPORT_SYMBOL_GPL(drm_gem_fb_create);
+
+/**
+ * drm_gem_fb_prepare_fb() - Prepare gem framebuffer
+ * @plane: Which plane
+ * @state: Plane state the fence will be attached to
+ *
+ * This can be used as the &drm_plane_helper_funcs.prepare_fb hook.
+ *
+ * This function checks if the plane FB has a dma-buf attached, extracts
+ * the exclusive fence and attaches it to plane state for the atomic helper
+ * to wait on.
+ *
+ * There is no need for &drm_plane_helper_funcs.cleanup_fb hook for simple
+ * gem based framebuffer drivers which have their buffers always pinned in
+ * memory.
+ */
+int drm_gem_fb_prepare_fb(struct drm_plane *plane,
+			  struct drm_plane_state *state)
+{
+	struct dma_buf *dma_buf;
+	struct dma_fence *fence;
+
+	if ((plane->state->fb == state->fb) || !state->fb)
+		return 0;
+
+	dma_buf = drm_gem_fb_get_obj(state->fb, 0)->dma_buf;
+	if (dma_buf) {
+		fence = reservation_object_get_excl_rcu(dma_buf->resv);
+		drm_atomic_set_fence_for_plane(state, fence);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(drm_gem_fb_prepare_fb);
+
+/**
+ * drm_gem_fbdev_fb_create - Create a drm_framebuffer for fbdev emulation
+ * @dev: DRM device
+ * @sizes: fbdev size description
+ * @pitch_align: optional pitch alignment
+ * @obj: GEM object backing the framebuffer
+ * @funcs: vtable to be used for the new framebuffer object
+ *
+ * This function creates a framebuffer for use with fbdev emulation.
+ *
+ * Returns:
+ * Pointer to a drm_framebuffer on success or an error pointer on failure.
+ */
+struct drm_framebuffer *
+drm_gem_fbdev_fb_create(struct drm_device *dev,
+			struct drm_fb_helper_surface_size *sizes,
+			unsigned int pitch_align, struct drm_gem_object *obj,
+			const struct drm_framebuffer_funcs *funcs)
+{
+	struct drm_mode_fb_cmd2 mode_cmd = { 0 };
+
+	mode_cmd.width = sizes->surface_width;
+	mode_cmd.height = sizes->surface_height;
+	mode_cmd.pitches[0] = sizes->surface_width *
+			      DIV_ROUND_UP(sizes->surface_bpp, 8);
+	if (pitch_align)
+		mode_cmd.pitches[0] = roundup(mode_cmd.pitches[0],
+					      pitch_align);
+	mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
+							sizes->surface_depth);
+	if (obj->size < mode_cmd.pitches[0] * mode_cmd.height)
+		return ERR_PTR(-EINVAL);
+
+	return drm_gem_fb_alloc(dev, &mode_cmd, &obj, 1, funcs);
+}
+EXPORT_SYMBOL(drm_gem_fbdev_fb_create);
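
Editor's note (not part of the patch): the size validation in drm_gem_fb_create_with_funcs() and the pitch calculation in drm_gem_fbdev_fb_create() can be tried outside the kernel. The following is a minimal userspace sketch of the same arithmetic. The names plane_layout, plane_min_size(), plane_fits() and fbdev_pitch() are hypothetical and exist only for illustration. Unlike the helper above, the sketch computes in 64 bits, which is an assumption made here to avoid wrap-around and not something the patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-plane values the helper reads. */
struct plane_layout {
	uint32_t width;   /* plane width in pixels (after hsub division) */
	uint32_t height;  /* plane height in lines (after vsub division) */
	uint32_t pitch;   /* bytes per line */
	uint32_t cpp;     /* bytes per pixel */
	uint32_t offset;  /* byte offset of the plane inside the buffer */
};

/*
 * Smallest buffer that can hold the plane: every line but the last takes a
 * full pitch, the last line only needs width * cpp bytes, and everything is
 * shifted by the plane offset. The caller must ensure height >= 1.
 */
uint64_t plane_min_size(const struct plane_layout *p)
{
	return (uint64_t)(p->height - 1) * p->pitch
	     + (uint64_t)p->width * p->cpp
	     + p->offset;
}

/* True if a buffer of buf_size bytes is large enough for the plane. */
bool plane_fits(const struct plane_layout *p, uint64_t buf_size)
{
	if (p->width == 0 || p->height == 0)
		return false;

	return buf_size >= plane_min_size(p);
}

/*
 * Pitch as computed for fbdev emulation: width times whole bytes per pixel,
 * optionally rounded up to a multiple of align (0 means no alignment).
 */
uint32_t fbdev_pitch(uint32_t width, uint32_t bpp, uint32_t align)
{
	uint32_t pitch = width * ((bpp + 7) / 8);

	if (align)
		pitch = ((pitch + align - 1) / align) * align;

	return pitch;
}
```

For a 1920x1080 buffer at 4 bytes per pixel with a tightly packed pitch of 7680 bytes, plane_min_size() equals height * pitch. With a padded pitch the result is smaller than height * pitch, because the padding after the last line is not required.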
diff --git a/drivers/gpu/drm/drm_internal.h b/drivers/gpu/drm/drm_internal.h
index 5edc24b..fbc3f30 100644
--- a/drivers/gpu/drm/drm_internal.h
+++ b/drivers/gpu/drm/drm_internal.h
@@ -32,6 +32,7 @@ void drm_lastclose(struct drm_device *dev);
 int drm_irq_by_busid(struct drm_device *dev, void *data,
 		     struct drm_file *file_priv);
 void drm_pci_agp_destroy(struct drm_device *dev);
+int drm_pci_set_busid(struct drm_device *dev, struct drm_master *master);
 
 /* drm_prime.c */
 int drm_prime_handle_to_fd_ioctl(struct drm_device *dev, void *data,
@@ -56,14 +57,19 @@ int drm_gem_name_info(struct seq_file *m, void *data);
 /* drm_vblank.c */
 extern unsigned int drm_timestamp_monotonic;
 void drm_vblank_disable_and_save(struct drm_device *dev, unsigned int pipe);
+void drm_vblank_cleanup(struct drm_device *dev);
 
 /* IOCTLS */
-int drm_wait_vblank(struct drm_device *dev, void *data,
-		    struct drm_file *filp);
+int drm_wait_vblank_ioctl(struct drm_device *dev, void *data,
+			  struct drm_file *filp);
+int drm_legacy_modeset_ctl_ioctl(struct drm_device *dev, void *data,
+				 struct drm_file *file_priv);
+
+/* drm_irq.c */
+
+/* IOCTLS */
 int drm_legacy_irq_control(struct drm_device *dev, void *data,
 			   struct drm_file *file_priv);
-int drm_legacy_modeset_ctl(struct drm_device *dev, void *data,
-			   struct drm_file *file_priv);
 
 /* drm_auth.c */
 int drm_getmagic(struct drm_device *dev, void *data,
@@ -161,3 +167,9 @@ int drm_syncobj_handle_to_fd_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file_private);
 int drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 				   struct drm_file *file_private);
+int drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
+			   struct drm_file *file_private);
+int drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
+			    struct drm_file *file_private);
+int drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
+			     struct drm_file *file_private);
diff --git a/drivers/gpu/drm/drm_ioc32.c b/drivers/gpu/drm/drm_ioc32.c
index d1f2028..f8e96e6 100644
--- a/drivers/gpu/drm/drm_ioc32.c
+++ b/drivers/gpu/drm/drm_ioc32.c
@@ -842,7 +842,7 @@ static int compat_drm_wait_vblank(struct file *file, unsigned int cmd,
 	req.request.type = req32.request.type;
 	req.request.sequence = req32.request.sequence;
 	req.request.signal = req32.request.signal;
-	err = drm_ioctl_kernel(file, drm_wait_vblank, &req, DRM_UNLOCKED);
+	err = drm_ioctl_kernel(file, drm_wait_vblank_ioctl, &req, DRM_UNLOCKED);
 	if (err)
 		return err;
 
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index f1eb326..a9ae6dd 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -143,8 +143,8 @@ static int drm_set_busid(struct drm_device *dev, struct drm_file *file_priv)
 	if (master->unique != NULL)
 		drm_unset_busid(dev, master);
 
-	if (dev->driver->set_busid) {
-		ret = dev->driver->set_busid(dev, master);
+	if (dev->dev && dev_is_pci(dev->dev)) {
+		ret = drm_pci_set_busid(dev, master);
 		if (ret) {
 			drm_unset_busid(dev, master);
 			return ret;
@@ -603,9 +603,9 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 	DRM_IOCTL_DEF(DRM_IOCTL_SG_ALLOC, drm_legacy_sg_alloc, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 	DRM_IOCTL_DEF(DRM_IOCTL_SG_FREE, drm_legacy_sg_free, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 
-	DRM_IOCTL_DEF(DRM_IOCTL_WAIT_VBLANK, drm_wait_vblank, DRM_UNLOCKED),
+	DRM_IOCTL_DEF(DRM_IOCTL_WAIT_VBLANK, drm_wait_vblank_ioctl, DRM_UNLOCKED),
 
-	DRM_IOCTL_DEF(DRM_IOCTL_MODESET_CTL, drm_legacy_modeset_ctl, 0),
+	DRM_IOCTL_DEF(DRM_IOCTL_MODESET_CTL, drm_legacy_modeset_ctl_ioctl, 0),
 
 	DRM_IOCTL_DEF(DRM_IOCTL_UPDATE_DRAW, drm_noop, DRM_AUTH|DRM_MASTER|DRM_ROOT_ONLY),
 
@@ -657,6 +657,12 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE, drm_syncobj_fd_to_handle_ioctl,
 		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_WAIT, drm_syncobj_wait_ioctl,
+		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_RESET, drm_syncobj_reset_ioctl,
+		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF(DRM_IOCTL_SYNCOBJ_SIGNAL, drm_syncobj_signal_ioctl,
+		      DRM_UNLOCKED|DRM_RENDER_ALLOW),
 };
 
 #define DRM_CORE_IOCTL_COUNT	ARRAY_SIZE( drm_ioctls )
@@ -695,7 +701,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
  * 
  * DRM driver private IOCTL must be in the range from DRM_COMMAND_BASE to
  * DRM_COMMAND_END. Finally you need an array of &struct drm_ioctl_desc to wire
- * up the handlers and set the access rights:
+ * up the handlers and set the access rights::
  *
  *     static const struct drm_ioctl_desc my_driver_ioctls[] = {
  *         DRM_IOCTL_DEF_DRV(MY_DRIVER_OPERATION, my_driver_operation,
@@ -704,6 +710,9 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
  *
  * And then assign this to the &drm_driver.ioctls field in your driver
  * structure.
+ *
+ * See the separate chapter on :ref:`file operations<drm_driver_fops>` for how
+ * the driver-specific IOCTLs are wired up.
  */
 
 long drm_ioctl_kernel(struct file *file, drm_ioctl_t *func, void *kdata,
@@ -713,7 +722,7 @@ long drm_ioctl_kernel(struct file *file, drm_ioctl_t *func, void *kdata,
 	struct drm_device *dev = file_priv->minor->dev;
 	int retcode;
 
-	if (drm_device_is_unplugged(dev))
+	if (drm_dev_is_unplugged(dev))
 		return -ENODEV;
 
 	retcode = drm_ioctl_permit(flags, file_priv);
@@ -762,7 +771,7 @@ long drm_ioctl(struct file *filp,
 
 	dev = file_priv->minor->dev;
 
-	if (drm_device_is_unplugged(dev))
+	if (drm_dev_is_unplugged(dev))
 		return -ENODEV;
 
 	is_driver_ioctl = nr >= DRM_COMMAND_BASE && nr < DRM_COMMAND_END;
diff --git a/drivers/gpu/drm/drm_mipi_dsi.c b/drivers/gpu/drm/drm_mipi_dsi.c
index 1160a57..4b47226 100644
--- a/drivers/gpu/drm/drm_mipi_dsi.c
+++ b/drivers/gpu/drm/drm_mipi_dsi.c
@@ -165,14 +165,14 @@ of_mipi_dsi_device_add(struct mipi_dsi_host *host, struct device_node *node)
 	u32 reg;
 
 	if (of_modalias_node(node, info.type, sizeof(info.type)) < 0) {
-		dev_err(dev, "modalias failure on %s\n", node->full_name);
+		dev_err(dev, "modalias failure on %pOF\n", node);
 		return ERR_PTR(-EINVAL);
 	}
 
 	ret = of_property_read_u32(node, "reg", &reg);
 	if (ret) {
-		dev_err(dev, "device node %s has no valid reg property: %d\n",
-			node->full_name, ret);
+		dev_err(dev, "device node %pOF has no valid reg property: %d\n",
+			node, ret);
 		return ERR_PTR(-EINVAL);
 	}
 
diff --git a/drivers/gpu/drm/drm_mode_config.c b/drivers/gpu/drm/drm_mode_config.c
index d986225..74f6ff5 100644
--- a/drivers/gpu/drm/drm_mode_config.c
+++ b/drivers/gpu/drm/drm_mode_config.c
@@ -337,6 +337,13 @@ static int drm_mode_create_standard_properties(struct drm_device *dev)
 		return -ENOMEM;
 	dev->mode_config.gamma_lut_size_property = prop;
 
+	prop = drm_property_create(dev,
+				   DRM_MODE_PROP_IMMUTABLE | DRM_MODE_PROP_BLOB,
+				   "IN_FORMATS", 0);
+	if (!prop)
+		return -ENOMEM;
+	dev->mode_config.modifiers_property = prop;
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/drm_mode_object.c b/drivers/gpu/drm/drm_mode_object.c
index da9a9ad..1055533 100644
--- a/drivers/gpu/drm/drm_mode_object.c
+++ b/drivers/gpu/drm/drm_mode_object.c
@@ -233,6 +233,9 @@ int drm_object_property_set_value(struct drm_mode_object *obj,
 {
 	int i;
 
+	WARN_ON(drm_drv_uses_atomic_modeset(property->dev) &&
+		!(property->flags & DRM_MODE_PROP_IMMUTABLE));
+
 	for (i = 0; i < obj->properties->count; i++) {
 		if (obj->properties->properties[i] == property) {
 			obj->properties->values[i] = val;
@@ -244,24 +247,7 @@ int drm_object_property_set_value(struct drm_mode_object *obj,
 }
 EXPORT_SYMBOL(drm_object_property_set_value);
 
-/**
- * drm_object_property_get_value - retrieve the value of a property
- * @obj: drm mode object to get property value from
- * @property: property to retrieve
- * @val: storage for the property value
- *
- * This function retrieves the softare state of the given property for the given
- * property. Since there is no driver callback to retrieve the current property
- * value this might be out of sync with the hardware, depending upon the driver
- * and property.
- *
- * Atomic drivers should never call this function directly, the core will read
- * out property values through the various ->atomic_get_property callbacks.
- *
- * Returns:
- * Zero on success, error code on failure.
- */
-int drm_object_property_get_value(struct drm_mode_object *obj,
+int __drm_object_property_get_value(struct drm_mode_object *obj,
 				  struct drm_property *property, uint64_t *val)
 {
 	int i;
@@ -284,6 +270,31 @@ int drm_object_property_get_value(struct drm_mode_object *obj,
 
 	return -EINVAL;
 }
+
+/**
+ * drm_object_property_get_value - retrieve the value of a property
+ * @obj: drm mode object to get property value from
+ * @property: property to retrieve
+ * @val: storage for the property value
+ *
+ * This function retrieves the software state of the given property for the
+ * given object. Since there is no driver callback to retrieve the current
+ * property value this might be out of sync with the hardware, depending upon
+ * the driver and property.
+ *
+ * Atomic drivers should never call this function directly, the core will read
+ * out property values through the various ->atomic_get_property callbacks.
+ *
+ * Returns:
+ * Zero on success, error code on failure.
+ */
+int drm_object_property_get_value(struct drm_mode_object *obj,
+				  struct drm_property *property, uint64_t *val)
+{
+	WARN_ON(drm_drv_uses_atomic_modeset(property->dev));
+
+	return __drm_object_property_get_value(obj, property, val);
+}
 EXPORT_SYMBOL(drm_object_property_get_value);
 
 /* helper for getconnector and getproperties ioctls */
@@ -302,7 +313,7 @@ int drm_mode_object_get_properties(struct drm_mode_object *obj, bool atomic,
 			continue;
 
 		if (*arg_count_props > count) {
-			ret = drm_object_property_get_value(obj, prop, &val);
+			ret = __drm_object_property_get_value(obj, prop, &val);
 			if (ret)
 				return ret;
 
@@ -381,6 +392,83 @@ struct drm_property *drm_mode_obj_find_prop_id(struct drm_mode_object *obj,
 	return NULL;
 }
 
+static int set_property_legacy(struct drm_mode_object *obj,
+			       struct drm_property *prop,
+			       uint64_t prop_value)
+{
+	struct drm_device *dev = prop->dev;
+	struct drm_mode_object *ref;
+	int ret = -EINVAL;
+
+	if (!drm_property_change_valid_get(prop, prop_value, &ref))
+		return -EINVAL;
+
+	drm_modeset_lock_all(dev);
+	switch (obj->type) {
+	case DRM_MODE_OBJECT_CONNECTOR:
+		ret = drm_mode_connector_set_obj_prop(obj, prop,
+						      prop_value);
+		break;
+	case DRM_MODE_OBJECT_CRTC:
+		ret = drm_mode_crtc_set_obj_prop(obj, prop, prop_value);
+		break;
+	case DRM_MODE_OBJECT_PLANE:
+		ret = drm_mode_plane_set_obj_prop(obj_to_plane(obj),
+						  prop, prop_value);
+		break;
+	}
+	drm_property_change_valid_put(prop, ref);
+	drm_modeset_unlock_all(dev);
+
+	return ret;
+}
+
+static int set_property_atomic(struct drm_mode_object *obj,
+			       struct drm_property *prop,
+			       uint64_t prop_value)
+{
+	struct drm_device *dev = prop->dev;
+	struct drm_atomic_state *state;
+	struct drm_modeset_acquire_ctx ctx;
+	int ret;
+
+	state = drm_atomic_state_alloc(dev);
+	if (!state)
+		return -ENOMEM;
+
+	drm_modeset_acquire_init(&ctx, 0);
+	state->acquire_ctx = &ctx;
+retry:
+	if (prop == state->dev->mode_config.dpms_property) {
+		if (obj->type != DRM_MODE_OBJECT_CONNECTOR) {
+			ret = -EINVAL;
+			goto out;
+		}
+
+		ret = drm_atomic_connector_commit_dpms(state,
+						       obj_to_connector(obj),
+						       prop_value);
+	} else {
+		ret = drm_atomic_set_property(state, obj, prop, prop_value);
+		if (ret)
+			goto out;
+		ret = drm_atomic_commit(state);
+	}
+out:
+	if (ret == -EDEADLK) {
+		drm_atomic_state_clear(state);
+		drm_modeset_backoff(&ctx);
+		goto retry;
+	}
+
+	drm_atomic_state_put(state);
+
+	drm_modeset_drop_locks(&ctx);
+	drm_modeset_acquire_fini(&ctx);
+
+	return ret;
+}
+
 int drm_mode_obj_set_property_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv)
 {
@@ -388,18 +476,13 @@ int drm_mode_obj_set_property_ioctl(struct drm_device *dev, void *data,
 	struct drm_mode_object *arg_obj;
 	struct drm_property *property;
 	int ret = -EINVAL;
-	struct drm_mode_object *ref;
 
 	if (!drm_core_check_feature(dev, DRIVER_MODESET))
 		return -EINVAL;
 
-	drm_modeset_lock_all(dev);
-
 	arg_obj = drm_mode_object_find(dev, arg->obj_id, arg->obj_type);
-	if (!arg_obj) {
-		ret = -ENOENT;
-		goto out;
-	}
+	if (!arg_obj)
+		return -ENOENT;
 
 	if (!arg_obj->properties)
 		goto out_unref;
@@ -408,28 +491,12 @@ int drm_mode_obj_set_property_ioctl(struct drm_device *dev, void *data,
 	if (!property)
 		goto out_unref;
 
-	if (!drm_property_change_valid_get(property, arg->value, &ref))
-		goto out_unref;
-
-	switch (arg_obj->type) {
-	case DRM_MODE_OBJECT_CONNECTOR:
-		ret = drm_mode_connector_set_obj_prop(arg_obj, property,
-						      arg->value);
-		break;
-	case DRM_MODE_OBJECT_CRTC:
-		ret = drm_mode_crtc_set_obj_prop(arg_obj, property, arg->value);
-		break;
-	case DRM_MODE_OBJECT_PLANE:
-		ret = drm_mode_plane_set_obj_prop(obj_to_plane(arg_obj),
-						  property, arg->value);
-		break;
-	}
-
-	drm_property_change_valid_put(property, ref);
+	if (drm_drv_uses_atomic_modeset(property->dev))
+		ret = set_property_atomic(arg_obj, property, arg->value);
+	else
+		ret = set_property_legacy(arg_obj, property, arg->value);
 
 out_unref:
 	drm_mode_object_put(arg_obj);
-out:
-	drm_modeset_unlock_all(dev);
 	return ret;
 }
diff --git a/drivers/gpu/drm/drm_modes.c b/drivers/gpu/drm/drm_modes.c
index f2493b9..4a3f68a 100644
--- a/drivers/gpu/drm/drm_modes.c
+++ b/drivers/gpu/drm/drm_modes.c
@@ -709,8 +709,8 @@ int of_get_drm_display_mode(struct device_node *np,
 	if (bus_flags)
 		drm_bus_flags_from_videomode(&vm, bus_flags);
 
-	pr_debug("%s: got %dx%d display mode from %s\n",
-		of_node_full_name(np), vm.hactive, vm.vactive, np->name);
+	pr_debug("%pOF: got %dx%d display mode from %s\n",
+		np, vm.hactive, vm.vactive, np->name);
 	drm_mode_debug_printmodeline(dmode);
 
 	return 0;
@@ -1083,6 +1083,34 @@ drm_mode_validate_size(const struct drm_display_mode *mode,
 }
 EXPORT_SYMBOL(drm_mode_validate_size);
 
+/**
+ * drm_mode_validate_ycbcr420 - add 'ycbcr420-only' modes only when allowed
+ * @mode: mode to check
+ * @connector: drm connector under action
+ *
+ * This function is a helper which can be used to filter out any YCBCR420
+ * only mode, when the source doesn't support it.
+ *
+ * Returns:
+ * The mode status
+ */
+enum drm_mode_status
+drm_mode_validate_ycbcr420(const struct drm_display_mode *mode,
+			   struct drm_connector *connector)
+{
+	u8 vic = drm_match_cea_mode(mode);
+	enum drm_mode_status status = MODE_OK;
+	struct drm_hdmi_info *hdmi = &connector->display_info.hdmi;
+
+	if (test_bit(vic, hdmi->y420_vdb_modes)) {
+		if (!connector->ycbcr_420_allowed)
+			status = MODE_NO_420;
+	}
+
+	return status;
+}
+EXPORT_SYMBOL(drm_mode_validate_ycbcr420);
+
 #define MODE_STATUS(status) [MODE_ ## status + 3] = #status
 
 static const char * const drm_mode_status_names[] = {
@@ -1122,6 +1150,7 @@ static const char * const drm_mode_status_names[] = {
 	MODE_STATUS(ONE_SIZE),
 	MODE_STATUS(NO_REDUCED),
 	MODE_STATUS(NO_STEREO),
+	MODE_STATUS(NO_420),
 	MODE_STATUS(STALE),
 	MODE_STATUS(BAD),
 	MODE_STATUS(ERROR),
@@ -1576,3 +1605,61 @@ int drm_mode_convert_umode(struct drm_display_mode *out,
 out:
 	return ret;
 }
+
+/**
+ * drm_mode_is_420_only - if a given videomode can be only supported in YCBCR420
+ * output format
+ *
+ * @display: display under action
+ * @mode: video mode to be tested.
+ *
+ * Returns:
+ * true if the mode can be supported only in YCBCR420 format
+ * false if not.
+ */
+bool drm_mode_is_420_only(const struct drm_display_info *display,
+			  const struct drm_display_mode *mode)
+{
+	u8 vic = drm_match_cea_mode(mode);
+
+	return test_bit(vic, display->hdmi.y420_vdb_modes);
+}
+EXPORT_SYMBOL(drm_mode_is_420_only);
+
+/**
+ * drm_mode_is_420_also - if a given videomode can be supported in YCBCR420
+ * output format also (along with RGB/YCBCR444/422)
+ *
+ * @display: display under action.
+ * @mode: video mode to be tested.
+ *
+ * Returns:
+ * true if the mode can also be supported in YCBCR420 format
+ * false if not.
+ */
+bool drm_mode_is_420_also(const struct drm_display_info *display,
+			  const struct drm_display_mode *mode)
+{
+	u8 vic = drm_match_cea_mode(mode);
+
+	return test_bit(vic, display->hdmi.y420_cmdb_modes);
+}
+EXPORT_SYMBOL(drm_mode_is_420_also);
+/**
+ * drm_mode_is_420 - if a given videomode can be supported in YCBCR420
+ * output format
+ *
+ * @display: display under action.
+ * @mode: video mode to be tested.
+ *
+ * Returns:
+ * true if the mode can be supported in YCBCR420 format
+ * false if not.
+ */
+bool drm_mode_is_420(const struct drm_display_info *display,
+		     const struct drm_display_mode *mode)
+{
+	return drm_mode_is_420_only(display, mode) ||
+		drm_mode_is_420_also(display, mode);
+}
+EXPORT_SYMBOL(drm_mode_is_420);
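
Editor's note (not part of the patch): the three drm_mode_is_420*() helpers each reduce to a bit test, indexed by the mode's CEA VIC, against one of two bitmaps kept per display. The following is a minimal userspace sketch of that classification. The y420_caps structure, the VIC_BITS size and the vic_set()/vic_test() helpers are hypothetical stand-ins for the kernel's bitmap fields and test_bit(), and the sketch takes the VIC directly instead of matching a display mode.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

#define VIC_BITS      128 /* assumed bitmap size, for illustration only */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define VIC_WORDS     ((VIC_BITS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* Hypothetical stand-in for the two YCbCr 4:2:0 bitmaps kept per display. */
struct y420_caps {
	unsigned long only[VIC_WORDS]; /* modes usable in 4:2:0 only */
	unsigned long also[VIC_WORDS]; /* modes usable in 4:2:0 and others */
};

/* Mark a VIC in a bitmap; out-of-range values are ignored. */
void vic_set(unsigned long *map, uint8_t vic)
{
	if (vic < VIC_BITS)
		map[vic / BITS_PER_WORD] |= 1UL << (vic % BITS_PER_WORD);
}

/* Userspace replacement for test_bit(), bounded by VIC_BITS. */
static bool vic_test(const unsigned long *map, uint8_t vic)
{
	if (vic >= VIC_BITS)
		return false;

	return (map[vic / BITS_PER_WORD] >> (vic % BITS_PER_WORD)) & 1UL;
}

/* The mode can be driven only in 4:2:0. */
bool mode_is_420_only(const struct y420_caps *caps, uint8_t vic)
{
	return vic_test(caps->only, vic);
}

/* The mode can be driven in 4:2:0 in addition to other output formats. */
bool mode_is_420_also(const struct y420_caps *caps, uint8_t vic)
{
	return vic_test(caps->also, vic);
}

/* The mode can be driven in 4:2:0 at all. */
bool mode_is_420(const struct y420_caps *caps, uint8_t vic)
{
	return mode_is_420_only(caps, vic) || mode_is_420_also(caps, vic);
}
```

A source that cannot output 4:2:0 has to reject every mode for which mode_is_420_only() is true, which is the filtering drm_mode_validate_ycbcr420() performs by returning MODE_NO_420.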
diff --git a/drivers/gpu/drm/drm_modeset_helper.c b/drivers/gpu/drm/drm_modeset_helper.c
index 2b33825..9cb1eed 100644
--- a/drivers/gpu/drm/drm_modeset_helper.c
+++ b/drivers/gpu/drm/drm_modeset_helper.c
@@ -124,6 +124,7 @@ static struct drm_plane *create_primary_plane(struct drm_device *dev)
 				       &drm_primary_helper_funcs,
 				       safe_modeset_formats,
 				       ARRAY_SIZE(safe_modeset_formats),
+				       NULL,
 				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		kfree(primary);
diff --git a/drivers/gpu/drm/drm_modeset_lock.c b/drivers/gpu/drm/drm_modeset_lock.c
index 64ef09a..af4e906 100644
--- a/drivers/gpu/drm/drm_modeset_lock.c
+++ b/drivers/gpu/drm/drm_modeset_lock.c
@@ -52,7 +52,12 @@
  *     drm_modeset_drop_locks(&ctx);
  *     drm_modeset_acquire_fini(&ctx);
  *
- * On top of of these per-object locks using &ww_mutex there's also an overall
+ * If all that is needed is a single modeset lock, then the &struct
+ * drm_modeset_acquire_ctx is not needed and the locking can be simplified
+ * by passing a NULL instead of ctx in the drm_modeset_lock()
+ * call and, when done, by calling drm_modeset_unlock().
+ *
+ * On top of these per-object locks using &ww_mutex there's also an overall
  * &drm_mode_config.mutex, for protecting everything else. Mostly this means
  * probe state of connectors, and preventing hotplug add/removal of connectors.
  *
@@ -313,11 +318,14 @@ EXPORT_SYMBOL(drm_modeset_lock_init);
  * @lock: lock to take
  * @ctx: acquire ctx
  *
- * If ctx is not NULL, then its ww acquire context is used and the
+ * If @ctx is not NULL, then its ww acquire context is used and the
  * lock will be tracked by the context and can be released by calling
  * drm_modeset_drop_locks().  If -EDEADLK is returned, this means a
  * deadlock scenario has been detected and it is an error to attempt
  * to take any more locks without first calling drm_modeset_backoff().
+ *
+ * If @ctx is NULL then the function call behaves like a normal,
+ * non-nesting mutex_lock() call.
  */
 int drm_modeset_lock(struct drm_modeset_lock *lock,
 		struct drm_modeset_acquire_ctx *ctx)
diff --git a/drivers/gpu/drm/drm_of.c b/drivers/gpu/drm/drm_of.c
index 2120f33..8dafbdf 100644
--- a/drivers/gpu/drm/drm_of.c
+++ b/drivers/gpu/drm/drm_of.c
@@ -160,8 +160,8 @@ int drm_of_component_probe(struct device *dev,
 				of_node_put(remote);
 				continue;
 			} else if (!of_device_is_available(remote->parent)) {
-				dev_warn(dev, "parent device of %s is not available\n",
-					 remote->full_name);
+				dev_warn(dev, "parent device of %pOF is not available\n",
+					 remote);
 				of_node_put(remote);
 				continue;
 			}
diff --git a/drivers/gpu/drm/drm_pci.c b/drivers/gpu/drm/drm_pci.c
index 1eb4fc3..1235c98 100644
--- a/drivers/gpu/drm/drm_pci.c
+++ b/drivers/gpu/drm/drm_pci.c
@@ -149,7 +149,6 @@ int drm_pci_set_busid(struct drm_device *dev, struct drm_master *master)
 	master->unique_len = strlen(master->unique);
 	return 0;
 }
-EXPORT_SYMBOL(drm_pci_set_busid);
 
 static int drm_pci_irq_by_busid(struct drm_device *dev, struct drm_irq_busid *p)
 {
@@ -281,20 +280,15 @@ int drm_get_pci_dev(struct pci_dev *pdev, const struct pci_device_id *ent,
 EXPORT_SYMBOL(drm_get_pci_dev);
 
 /**
- * drm_pci_init - Register matching PCI devices with the DRM subsystem
+ * drm_legacy_pci_init - shadow-attach a legacy DRM PCI driver
  * @driver: DRM device driver
  * @pdriver: PCI device driver
  *
- * Initializes a drm_device structures, registering the stubs and initializing
- * the AGP device.
- *
- * NOTE: This function is deprecated. Modern modesetting drm drivers should use
- * pci_register_driver() directly, this function only provides shadow-binding
- * support for old legacy drivers on top of that core pci function.
+ * This is deprecated and only used by legacy dri1 drivers.
  *
  * Return: 0 on success or a negative error code on failure.
  */
-int drm_pci_init(struct drm_driver *driver, struct pci_driver *pdriver)
+int drm_legacy_pci_init(struct drm_driver *driver, struct pci_driver *pdriver)
 {
 	struct pci_dev *pdev = NULL;
 	const struct pci_device_id *pid;
@@ -302,8 +296,8 @@ int drm_pci_init(struct drm_driver *driver, struct pci_driver *pdriver)
 
 	DRM_DEBUG("\n");
 
-	if (!(driver->driver_features & DRIVER_LEGACY))
-		return pci_register_driver(pdriver);
+	if (WARN_ON(!(driver->driver_features & DRIVER_LEGACY)))
+		return -EINVAL;
 
 	/* If not using KMS, fall back to stealth mode manual scanning. */
 	INIT_LIST_HEAD(&driver->legacy_dev_list);
@@ -330,6 +324,7 @@ int drm_pci_init(struct drm_driver *driver, struct pci_driver *pdriver)
 	}
 	return 0;
 }
+EXPORT_SYMBOL(drm_legacy_pci_init);
 
 int drm_pcie_get_speed_cap_mask(struct drm_device *dev, u32 *mask)
 {
@@ -391,11 +386,6 @@ EXPORT_SYMBOL(drm_pcie_get_max_link_width);
 
 #else
 
-int drm_pci_init(struct drm_driver *driver, struct pci_driver *pdriver)
-{
-	return -1;
-}
-
 void drm_pci_agp_destroy(struct drm_device *dev) {}
 
 int drm_irq_by_busid(struct drm_device *dev, void *data,
@@ -405,27 +395,21 @@ int drm_irq_by_busid(struct drm_device *dev, void *data,
 }
 #endif
 
-EXPORT_SYMBOL(drm_pci_init);
-
 /**
- * drm_pci_exit - Unregister matching PCI devices from the DRM subsystem
+ * drm_legacy_pci_exit - unregister shadow-attach legacy DRM driver
  * @driver: DRM device driver
  * @pdriver: PCI device driver
  *
- * Unregisters one or more devices matched by a PCI driver from the DRM
- * subsystem.
- *
- * NOTE: This function is deprecated. Modern modesetting drm drivers should use
- * pci_unregister_driver() directly, this function only provides shadow-binding
- * support for old legacy drivers on top of that core pci function.
+ * Unregister a DRM driver shadow-attached through drm_legacy_pci_init(). This
+ * is deprecated and only used by dri1 drivers.
  */
-void drm_pci_exit(struct drm_driver *driver, struct pci_driver *pdriver)
+void drm_legacy_pci_exit(struct drm_driver *driver, struct pci_driver *pdriver)
 {
 	struct drm_device *dev, *tmp;
 	DRM_DEBUG("\n");
 
 	if (!(driver->driver_features & DRIVER_LEGACY)) {
-		pci_unregister_driver(pdriver);
+		WARN_ON(1);
 	} else {
 		list_for_each_entry_safe(dev, tmp, &driver->legacy_dev_list,
 					 legacy_dev_list) {
@@ -435,4 +419,4 @@ void drm_pci_exit(struct drm_driver *driver, struct pci_driver *pdriver)
 	}
 	DRM_INFO("Module unloaded\n");
 }
-EXPORT_SYMBOL(drm_pci_exit);
+EXPORT_SYMBOL(drm_legacy_pci_exit);
diff --git a/drivers/gpu/drm/drm_plane.c b/drivers/gpu/drm/drm_plane.c
index e40c12f..7a00351 100644
--- a/drivers/gpu/drm/drm_plane.c
+++ b/drivers/gpu/drm/drm_plane.c
@@ -62,6 +62,87 @@ static unsigned int drm_num_planes(struct drm_device *dev)
 	return num;
 }
 
+static inline u32 *
+formats_ptr(struct drm_format_modifier_blob *blob)
+{
+	return (u32 *)(((char *)blob) + blob->formats_offset);
+}
+
+static inline struct drm_format_modifier *
+modifiers_ptr(struct drm_format_modifier_blob *blob)
+{
+	return (struct drm_format_modifier *)(((char *)blob) + blob->modifiers_offset);
+}
+
+static int create_in_format_blob(struct drm_device *dev, struct drm_plane *plane)
+{
+	const struct drm_mode_config *config = &dev->mode_config;
+	struct drm_property_blob *blob;
+	struct drm_format_modifier *mod;
+	size_t blob_size, formats_size, modifiers_size;
+	struct drm_format_modifier_blob *blob_data;
+	unsigned int i, j;
+
+	formats_size = sizeof(__u32) * plane->format_count;
+	if (WARN_ON(!formats_size)) {
+		/* 0 formats are never expected */
+		return 0;
+	}
+
+	modifiers_size =
+		sizeof(struct drm_format_modifier) * plane->modifier_count;
+
+	blob_size = sizeof(struct drm_format_modifier_blob);
+	/* Modifiers offset is a pointer to a struct with a 64 bit field so it
+	 * should be naturally aligned to 8B.
+	 */
+	BUILD_BUG_ON(sizeof(struct drm_format_modifier_blob) % 8);
+	blob_size += ALIGN(formats_size, 8);
+	blob_size += modifiers_size;
+
+	blob = drm_property_create_blob(dev, blob_size, NULL);
+	if (IS_ERR(blob))
+		return -1;
+
+	blob_data = (struct drm_format_modifier_blob *)blob->data;
+	blob_data->version = FORMAT_BLOB_CURRENT;
+	blob_data->count_formats = plane->format_count;
+	blob_data->formats_offset = sizeof(struct drm_format_modifier_blob);
+	blob_data->count_modifiers = plane->modifier_count;
+
+	blob_data->modifiers_offset =
+		ALIGN(blob_data->formats_offset + formats_size, 8);
+
+	memcpy(formats_ptr(blob_data), plane->format_types, formats_size);
+
+	/* If we can't determine support, just bail */
+	if (!plane->funcs->format_mod_supported)
+		goto done;
+
+	mod = modifiers_ptr(blob_data);
+	for (i = 0; i < plane->modifier_count; i++) {
+		for (j = 0; j < plane->format_count; j++) {
+			if (plane->funcs->format_mod_supported(plane,
+							       plane->format_types[j],
+							       plane->modifiers[i])) {
+
+				mod->formats |= 1ULL << j;
+			}
+		}
+
+		mod->modifier = plane->modifiers[i];
+		mod->offset = 0;
+		mod->pad = 0;
+		mod++;
+	}
+
+done:
+	drm_object_attach_property(&plane->base, config->modifiers_property,
+				   blob->base.id);
+
+	return 0;
+}
+
 /**
  * drm_universal_plane_init - Initialize a new universal plane object
  * @dev: DRM device
@@ -70,6 +151,8 @@ static unsigned int drm_num_planes(struct drm_device *dev)
  * @funcs: callbacks for the new plane
  * @formats: array of supported formats (DRM_FORMAT\_\*)
  * @format_count: number of elements in @formats
+ * @format_modifiers: array of format modifiers terminated by
+ *                    DRM_FORMAT_MOD_INVALID
  * @type: type of plane (overlay, primary, cursor)
  * @name: printf style format string for the plane name, or NULL for default name
  *
@@ -82,10 +165,12 @@ int drm_universal_plane_init(struct drm_device *dev, struct drm_plane *plane,
 			     uint32_t possible_crtcs,
 			     const struct drm_plane_funcs *funcs,
 			     const uint32_t *formats, unsigned int format_count,
+			     const uint64_t *format_modifiers,
 			     enum drm_plane_type type,
 			     const char *name, ...)
 {
 	struct drm_mode_config *config = &dev->mode_config;
+	unsigned int format_modifier_count = 0;
 	int ret;
 
 	ret = drm_mode_object_add(dev, &plane->base, DRM_MODE_OBJECT_PLANE);
@@ -105,6 +190,31 @@ int drm_universal_plane_init(struct drm_device *dev, struct drm_plane *plane,
 		return -ENOMEM;
 	}
 
+	/*
+	 * First driver to need more than 64 formats needs to fix this. Each
+	 * format is encoded as a bit and the current code only supports a u64.
+	 */
+	if (WARN_ON(format_count > 64))
+		return -EINVAL;
+
+	if (format_modifiers) {
+		const uint64_t *temp_modifiers = format_modifiers;
+		while (*temp_modifiers++ != DRM_FORMAT_MOD_INVALID)
+			format_modifier_count++;
+	}
+
+	plane->modifier_count = format_modifier_count;
+	plane->modifiers = kmalloc_array(format_modifier_count,
+					 sizeof(format_modifiers[0]),
+					 GFP_KERNEL);
+
+	if (format_modifier_count && !plane->modifiers) {
+		DRM_DEBUG_KMS("out of memory when allocating plane\n");
+		kfree(plane->format_types);
+		drm_mode_object_unregister(dev, &plane->base);
+		return -ENOMEM;
+	}
+
 	if (name) {
 		va_list ap;
 
@@ -117,12 +227,15 @@ int drm_universal_plane_init(struct drm_device *dev, struct drm_plane *plane,
 	}
 	if (!plane->name) {
 		kfree(plane->format_types);
+		kfree(plane->modifiers);
 		drm_mode_object_unregister(dev, &plane->base);
 		return -ENOMEM;
 	}
 
 	memcpy(plane->format_types, formats, format_count * sizeof(uint32_t));
 	plane->format_count = format_count;
+	memcpy(plane->modifiers, format_modifiers,
+	       format_modifier_count * sizeof(format_modifiers[0]));
 	plane->possible_crtcs = possible_crtcs;
 	plane->type = type;
 
@@ -149,6 +262,9 @@ int drm_universal_plane_init(struct drm_device *dev, struct drm_plane *plane,
 		drm_object_attach_property(&plane->base, config->prop_src_h, 0);
 	}
 
+	if (config->allow_fb_modifiers)
+		create_in_format_blob(dev, plane);
+
 	return 0;
 }
 EXPORT_SYMBOL(drm_universal_plane_init);
@@ -205,7 +321,8 @@ int drm_plane_init(struct drm_device *dev, struct drm_plane *plane,
 
 	type = is_primary ? DRM_PLANE_TYPE_PRIMARY : DRM_PLANE_TYPE_OVERLAY;
 	return drm_universal_plane_init(dev, plane, possible_crtcs, funcs,
-					formats, format_count, type, NULL);
+					formats, format_count,
+					NULL, type, NULL);
 }
 EXPORT_SYMBOL(drm_plane_init);
 
@@ -224,6 +341,7 @@ void drm_plane_cleanup(struct drm_plane *plane)
 	drm_modeset_lock_fini(&plane->mutex);
 
 	kfree(plane->format_types);
+	kfree(plane->modifiers);
 	drm_mode_object_unregister(dev, &plane->base);
 
 	BUG_ON(list_empty(&plane->head));
diff --git a/drivers/gpu/drm/drm_probe_helper.c b/drivers/gpu/drm/drm_probe_helper.c
index 00e6832..904966c 100644
--- a/drivers/gpu/drm/drm_probe_helper.c
+++ b/drivers/gpu/drm/drm_probe_helper.c
@@ -528,6 +528,10 @@ int drm_helper_probe_single_connector_modes(struct drm_connector *connector,
 		if (mode->status == MODE_OK)
 			mode->status = drm_mode_validate_pipeline(mode,
 								  connector);
+
+		if (mode->status == MODE_OK)
+			mode->status = drm_mode_validate_ycbcr420(mode,
+								  connector);
 	}
 
 prune:
diff --git a/drivers/gpu/drm/drm_property.c b/drivers/gpu/drm/drm_property.c
index 3e88fa2..bc51282 100644
--- a/drivers/gpu/drm/drm_property.c
+++ b/drivers/gpu/drm/drm_property.c
@@ -709,6 +709,29 @@ int drm_property_replace_global_blob(struct drm_device *dev,
 }
 EXPORT_SYMBOL(drm_property_replace_global_blob);
 
+/**
+ * drm_property_replace_blob - replace a blob property
+ * @blob: a pointer to the member blob to be replaced
+ * @new_blob: the new blob to replace with
+ *
+ * Return: true if the blob was in fact replaced.
+ */
+bool drm_property_replace_blob(struct drm_property_blob **blob,
+			       struct drm_property_blob *new_blob)
+{
+	struct drm_property_blob *old_blob = *blob;
+
+	if (old_blob == new_blob)
+		return false;
+
+	drm_property_blob_put(old_blob);
+	if (new_blob)
+		drm_property_blob_get(new_blob);
+	*blob = new_blob;
+	return true;
+}
+EXPORT_SYMBOL(drm_property_replace_blob);
+
 int drm_mode_getblob_ioctl(struct drm_device *dev,
 			   void *data, struct drm_file *file_priv)
 {
diff --git a/drivers/gpu/drm/drm_scdc_helper.c b/drivers/gpu/drm/drm_scdc_helper.c
index 3cd96a9..7d1b0f0 100644
--- a/drivers/gpu/drm/drm_scdc_helper.c
+++ b/drivers/gpu/drm/drm_scdc_helper.c
@@ -194,19 +194,26 @@ EXPORT_SYMBOL(drm_scdc_set_scrambling);
  * @adapter: I2C adapter for DDC channel
  * @set: ret or reset the high clock ratio
  *
- * TMDS clock ratio calculations go like this:
- * TMDS character = 10 bit TMDS encoded value
- * TMDS character rate = The rate at which TMDS characters are transmitted(Mcsc)
- * TMDS bit rate = 10x TMDS character rate
- * As per the spec:
- * TMDS clock rate for pixel clock < 340 MHz = 1x the character rate
- *	= 1/10 pixel clock rate
- * TMDS clock rate for pixel clock > 340 MHz = 0.25x the character rate
- *	= 1/40 pixel clock rate
  *
- * Writes to the TMDS config register over SCDC channel, and:
- * sets TMDS clock ratio to 1/40 when set = 1
- * sets TMDS clock ratio to 1/10 when set = 0
+ *	TMDS clock ratio calculations go like this:
+ *		TMDS character = 10 bit TMDS encoded value
+ *
+ *		TMDS character rate = The rate at which TMDS characters are
+ *		transmitted (Mcsc)
+ *
+ *		TMDS bit rate = 10x TMDS character rate
+ *
+ *	As per the spec:
+ *		TMDS clock rate for pixel clock < 340 MHz = 1x the character
+ *		rate = 1/10 pixel clock rate
+ *
+ *		TMDS clock rate for pixel clock > 340 MHz = 0.25x the character
+ *		rate = 1/40 pixel clock rate
+ *
+ *	Writes to the TMDS config register over SCDC channel, and:
+ *		sets TMDS clock ratio to 1/40 when set = 1
+ *
+ *		sets TMDS clock ratio to 1/10 when set = 0
  *
  * Returns:
  * True if write is successful, false otherwise.
diff --git a/drivers/gpu/drm/drm_simple_kms_helper.c b/drivers/gpu/drm/drm_simple_kms_helper.c
index e084f9f..dc9fd10 100644
--- a/drivers/gpu/drm/drm_simple_kms_helper.c
+++ b/drivers/gpu/drm/drm_simple_kms_helper.c
@@ -37,10 +37,18 @@ static const struct drm_encoder_funcs drm_simple_kms_encoder_funcs = {
 static int drm_simple_kms_crtc_check(struct drm_crtc *crtc,
 				     struct drm_crtc_state *state)
 {
+	bool has_primary = state->plane_mask &
+			   BIT(drm_plane_index(crtc->primary));
+
+	/* We always want to have an active plane with an active CRTC */
+	if (has_primary != state->enable)
+		return -EINVAL;
+
 	return drm_atomic_add_affected_planes(state->state, crtc);
 }
 
-static void drm_simple_kms_crtc_enable(struct drm_crtc *crtc)
+static void drm_simple_kms_crtc_enable(struct drm_crtc *crtc,
+				       struct drm_crtc_state *old_state)
 {
 	struct drm_simple_display_pipe *pipe;
 
@@ -51,7 +59,8 @@ static void drm_simple_kms_crtc_enable(struct drm_crtc *crtc)
 	pipe->funcs->enable(pipe, crtc->state);
 }
 
-static void drm_simple_kms_crtc_disable(struct drm_crtc *crtc)
+static void drm_simple_kms_crtc_disable(struct drm_crtc *crtc,
+					struct drm_crtc_state *old_state)
 {
 	struct drm_simple_display_pipe *pipe;
 
@@ -64,8 +73,8 @@ static void drm_simple_kms_crtc_disable(struct drm_crtc *crtc)
 
 static const struct drm_crtc_helper_funcs drm_simple_kms_crtc_helper_funcs = {
 	.atomic_check = drm_simple_kms_crtc_check,
-	.disable = drm_simple_kms_crtc_disable,
-	.enable = drm_simple_kms_crtc_enable,
+	.atomic_enable = drm_simple_kms_crtc_enable,
+	.atomic_disable = drm_simple_kms_crtc_disable,
 };
 
 static const struct drm_crtc_funcs drm_simple_kms_crtc_funcs = {
@@ -88,9 +97,6 @@ static int drm_simple_kms_plane_atomic_check(struct drm_plane *plane,
 	pipe = container_of(plane, struct drm_simple_display_pipe, plane);
 	crtc_state = drm_atomic_get_new_crtc_state(plane_state->state,
 						   &pipe->crtc);
-	if (crtc_state->enable != !!plane_state->crtc)
-		return -EINVAL; /* plane must match crtc enable state */
-
 	if (!crtc_state->enable)
 		return 0; /* nothing to check when disabling or disabled */
 
@@ -193,6 +199,7 @@ EXPORT_SYMBOL(drm_simple_display_pipe_attach_bridge);
  * @funcs: callbacks for the display pipe (optional)
  * @formats: array of supported formats (DRM_FORMAT\_\*)
  * @format_count: number of elements in @formats
+ * @format_modifiers: array of format modifiers
  * @connector: connector to attach and register (optional)
  *
  * Sets up a display pipeline which consist of a really simple
@@ -213,6 +220,7 @@ int drm_simple_display_pipe_init(struct drm_device *dev,
 			struct drm_simple_display_pipe *pipe,
 			const struct drm_simple_display_pipe_funcs *funcs,
 			const uint32_t *formats, unsigned int format_count,
+			const uint64_t *format_modifiers,
 			struct drm_connector *connector)
 {
 	struct drm_encoder *encoder = &pipe->encoder;
@@ -227,6 +235,7 @@ int drm_simple_display_pipe_init(struct drm_device *dev,
 	ret = drm_universal_plane_init(dev, plane, 0,
 				       &drm_simple_kms_plane_funcs,
 				       formats, format_count,
+				       format_modifiers,
 				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret)
 		return ret;
diff --git a/drivers/gpu/drm/drm_syncobj.c b/drivers/gpu/drm/drm_syncobj.c
index 789ba0b..0422b8c 100644
--- a/drivers/gpu/drm/drm_syncobj.c
+++ b/drivers/gpu/drm/drm_syncobj.c
@@ -1,5 +1,7 @@
 /*
  * Copyright 2017 Red Hat
+ * Parts ported from amdgpu (fence wait code).
+ * Copyright 2016 Advanced Micro Devices, Inc.
  *
  * Permission is hereby granted, free of charge, to any person obtaining a
  * copy of this software and associated documentation files (the "Software"),
@@ -31,6 +33,9 @@
  * that contain an optional fence. The fence can be updated with a new
  * fence, or be NULL.
  *
+ * syncobj's can be waited upon; the wait blocks until the underlying
+ * fence signals.
+ *
  * syncobj's can be export to fd's and back, these fd's are opaque and
  * have no other use case, except passing the syncobj between processes.
  *
@@ -46,6 +51,7 @@
 #include <linux/fs.h>
 #include <linux/anon_inodes.h>
 #include <linux/sync_file.h>
+#include <linux/sched/signal.h>
 
 #include "drm_internal.h"
 #include <drm/drm_syncobj.h>
@@ -75,6 +81,75 @@ struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 }
 EXPORT_SYMBOL(drm_syncobj_find);
 
+static void drm_syncobj_add_callback_locked(struct drm_syncobj *syncobj,
+					    struct drm_syncobj_cb *cb,
+					    drm_syncobj_func_t func)
+{
+	cb->func = func;
+	list_add_tail(&cb->node, &syncobj->cb_list);
+}
+
+static int drm_syncobj_fence_get_or_add_callback(struct drm_syncobj *syncobj,
+						 struct dma_fence **fence,
+						 struct drm_syncobj_cb *cb,
+						 drm_syncobj_func_t func)
+{
+	int ret;
+
+	*fence = drm_syncobj_fence_get(syncobj);
+	if (*fence)
+		return 1;
+
+	spin_lock(&syncobj->lock);
+	/* We've already tried once to get a fence and failed.  Now that we
+	 * have the lock, try one more time just to be sure we don't add a
+	 * callback when a fence has already been set.
+	 */
+	if (syncobj->fence) {
+		*fence = dma_fence_get(syncobj->fence);
+		ret = 1;
+	} else {
+		*fence = NULL;
+		drm_syncobj_add_callback_locked(syncobj, cb, func);
+		ret = 0;
+	}
+	spin_unlock(&syncobj->lock);
+
+	return ret;
+}
+
+/**
+ * drm_syncobj_add_callback - adds a callback to syncobj::cb_list
+ * @syncobj: Sync object to which to add the callback
+ * @cb: Callback to add
+ * @func: Func to use when initializing the drm_syncobj_cb struct
+ *
+ * This adds a callback to be called next time the fence is replaced
+ */
+void drm_syncobj_add_callback(struct drm_syncobj *syncobj,
+			      struct drm_syncobj_cb *cb,
+			      drm_syncobj_func_t func)
+{
+	spin_lock(&syncobj->lock);
+	drm_syncobj_add_callback_locked(syncobj, cb, func);
+	spin_unlock(&syncobj->lock);
+}
+EXPORT_SYMBOL(drm_syncobj_add_callback);
+
+/**
+ * drm_syncobj_remove_callback - removes a callback from syncobj::cb_list
+ * @syncobj: Sync object from which to remove the callback
+ * @cb: Callback to remove
+ */
+void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
+				 struct drm_syncobj_cb *cb)
+{
+	spin_lock(&syncobj->lock);
+	list_del_init(&cb->node);
+	spin_unlock(&syncobj->lock);
+}
+EXPORT_SYMBOL(drm_syncobj_remove_callback);
+
 /**
  * drm_syncobj_replace_fence - replace fence in a sync object.
  * @syncobj: Sync object to replace fence in
@@ -86,18 +161,75 @@ void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence)
 {
 	struct dma_fence *old_fence;
+	struct drm_syncobj_cb *cur, *tmp;
 
 	if (fence)
 		dma_fence_get(fence);
-	old_fence = xchg(&syncobj->fence, fence);
+
+	spin_lock(&syncobj->lock);
+
+	old_fence = syncobj->fence;
+	syncobj->fence = fence;
+
+	if (fence != old_fence) {
+		list_for_each_entry_safe(cur, tmp, &syncobj->cb_list, node) {
+			list_del_init(&cur->node);
+			cur->func(syncobj, cur);
+		}
+	}
+
+	spin_unlock(&syncobj->lock);
 
 	dma_fence_put(old_fence);
 }
 EXPORT_SYMBOL(drm_syncobj_replace_fence);
 
-int drm_syncobj_fence_get(struct drm_file *file_private,
-			  u32 handle,
-			  struct dma_fence **fence)
+struct drm_syncobj_null_fence {
+	struct dma_fence base;
+	spinlock_t lock;
+};
+
+static const char *drm_syncobj_null_fence_get_name(struct dma_fence *fence)
+{
+	return "syncobjnull";
+}
+
+static bool drm_syncobj_null_fence_enable_signaling(struct dma_fence *fence)
+{
+	dma_fence_enable_sw_signaling(fence);
+	return !dma_fence_is_signaled(fence);
+}
+
+static const struct dma_fence_ops drm_syncobj_null_fence_ops = {
+	.get_driver_name = drm_syncobj_null_fence_get_name,
+	.get_timeline_name = drm_syncobj_null_fence_get_name,
+	.enable_signaling = drm_syncobj_null_fence_enable_signaling,
+	.wait = dma_fence_default_wait,
+	.release = NULL,
+};
+
+static int drm_syncobj_assign_null_handle(struct drm_syncobj *syncobj)
+{
+	struct drm_syncobj_null_fence *fence;
+	fence = kzalloc(sizeof(*fence), GFP_KERNEL);
+	if (fence == NULL)
+		return -ENOMEM;
+
+	spin_lock_init(&fence->lock);
+	dma_fence_init(&fence->base, &drm_syncobj_null_fence_ops,
+		       &fence->lock, 0, 0);
+	dma_fence_signal(&fence->base);
+
+	drm_syncobj_replace_fence(syncobj, &fence->base);
+
+	dma_fence_put(&fence->base);
+
+	return 0;
+}
+
+int drm_syncobj_find_fence(struct drm_file *file_private,
+			   u32 handle,
+			   struct dma_fence **fence)
 {
 	struct drm_syncobj *syncobj = drm_syncobj_find(file_private, handle);
 	int ret = 0;
@@ -105,14 +237,14 @@ int drm_syncobj_fence_get(struct drm_file *file_private,
 	if (!syncobj)
 		return -ENOENT;
 
-	*fence = dma_fence_get(syncobj->fence);
+	*fence = drm_syncobj_fence_get(syncobj);
 	if (!*fence) {
 		ret = -EINVAL;
 	}
 	drm_syncobj_put(syncobj);
 	return ret;
 }
-EXPORT_SYMBOL(drm_syncobj_fence_get);
+EXPORT_SYMBOL(drm_syncobj_find_fence);
 
 /**
  * drm_syncobj_free - free a sync object.
@@ -125,13 +257,13 @@ void drm_syncobj_free(struct kref *kref)
 	struct drm_syncobj *syncobj = container_of(kref,
 						   struct drm_syncobj,
 						   refcount);
-	dma_fence_put(syncobj->fence);
+	drm_syncobj_replace_fence(syncobj, NULL);
 	kfree(syncobj);
 }
 EXPORT_SYMBOL(drm_syncobj_free);
 
 static int drm_syncobj_create(struct drm_file *file_private,
-			      u32 *handle)
+			      u32 *handle, uint32_t flags)
 {
 	int ret;
 	struct drm_syncobj *syncobj;
@@ -141,6 +273,16 @@ static int drm_syncobj_create(struct drm_file *file_private,
 		return -ENOMEM;
 
 	kref_init(&syncobj->refcount);
+	INIT_LIST_HEAD(&syncobj->cb_list);
+	spin_lock_init(&syncobj->lock);
+
+	if (flags & DRM_SYNCOBJ_CREATE_SIGNALED) {
+		ret = drm_syncobj_assign_null_handle(syncobj);
+		if (ret < 0) {
+			drm_syncobj_put(syncobj);
+			return ret;
+		}
+	}
 
 	idr_preload(GFP_KERNEL);
 	spin_lock(&file_private->syncobj_table_lock);
@@ -307,7 +449,7 @@ int drm_syncobj_export_sync_file(struct drm_file *file_private,
 	if (fd < 0)
 		return fd;
 
-	ret = drm_syncobj_fence_get(file_private, handle, &fence);
+	ret = drm_syncobj_find_fence(file_private, handle, &fence);
 	if (ret)
 		goto err_put_fd;
 
@@ -330,7 +472,6 @@ int drm_syncobj_export_sync_file(struct drm_file *file_private,
 }
 /**
  * drm_syncobj_open - initalizes syncobj file-private structures at devnode open time
- * @dev: drm_device which is being opened by userspace
  * @file_private: drm file-private structure to set up
  *
  * Called at device open time, sets up the structure for handling refcounting
@@ -354,7 +495,6 @@ drm_syncobj_release_handle(int id, void *ptr, void *data)
 
 /**
  * drm_syncobj_release - release file-private sync object resources
- * @dev: drm_device which is being closed by userspace
  * @file_private: drm file-private structure to clean up
  *
  * Called at close time when the filp is going away.
@@ -379,11 +519,11 @@ drm_syncobj_create_ioctl(struct drm_device *dev, void *data,
 		return -ENODEV;
 
 	/* no valid flags yet */
-	if (args->flags)
+	if (args->flags & ~DRM_SYNCOBJ_CREATE_SIGNALED)
 		return -EINVAL;
 
 	return drm_syncobj_create(file_private,
-				  &args->handle);
+				  &args->handle, args->flags);
 }
 
 int
@@ -449,3 +589,368 @@ drm_syncobj_fd_to_handle_ioctl(struct drm_device *dev, void *data,
 	return drm_syncobj_fd_to_handle(file_private, args->fd,
 					&args->handle);
 }
+
+struct syncobj_wait_entry {
+	struct task_struct *task;
+	struct dma_fence *fence;
+	struct dma_fence_cb fence_cb;
+	struct drm_syncobj_cb syncobj_cb;
+};
+
+static void syncobj_wait_fence_func(struct dma_fence *fence,
+				    struct dma_fence_cb *cb)
+{
+	struct syncobj_wait_entry *wait =
+		container_of(cb, struct syncobj_wait_entry, fence_cb);
+
+	wake_up_process(wait->task);
+}
+
+static void syncobj_wait_syncobj_func(struct drm_syncobj *syncobj,
+				      struct drm_syncobj_cb *cb)
+{
+	struct syncobj_wait_entry *wait =
+		container_of(cb, struct syncobj_wait_entry, syncobj_cb);
+
+	/* This happens inside the syncobj lock */
+	wait->fence = dma_fence_get(syncobj->fence);
+	wake_up_process(wait->task);
+}
+
+static signed long drm_syncobj_array_wait_timeout(struct drm_syncobj **syncobjs,
+						  uint32_t count,
+						  uint32_t flags,
+						  signed long timeout,
+						  uint32_t *idx)
+{
+	struct syncobj_wait_entry *entries;
+	struct dma_fence *fence;
+	signed long ret;
+	uint32_t signaled_count, i;
+
+	entries = kcalloc(count, sizeof(*entries), GFP_KERNEL);
+	if (!entries)
+		return -ENOMEM;
+
+	/* Walk the list of sync objects and initialize entries.  We do
+	 * this up-front so that we can properly return -EINVAL if there is
+	 * a syncobj with a missing fence and then never have the chance of
+	 * returning -EINVAL again.
+	 */
+	signaled_count = 0;
+	for (i = 0; i < count; ++i) {
+		entries[i].task = current;
+		entries[i].fence = drm_syncobj_fence_get(syncobjs[i]);
+		if (!entries[i].fence) {
+			if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+				continue;
+			} else {
+				ret = -EINVAL;
+				goto cleanup_entries;
+			}
+		}
+
+		if (dma_fence_is_signaled(entries[i].fence)) {
+			if (signaled_count == 0 && idx)
+				*idx = i;
+			signaled_count++;
+		}
+	}
+
+	/* Initialize ret to the max of timeout and 1.  That way, the
+	 * default return value indicates a successful wait and not a
+	 * timeout.
+	 */
+	ret = max_t(signed long, timeout, 1);
+
+	if (signaled_count == count ||
+	    (signaled_count > 0 &&
+	     !(flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL)))
+		goto cleanup_entries;
+
+	/* There's a very annoying laxness in the dma_fence API here, in
+	 * that backends are not required to automatically report when a
+	 * fence is signaled prior to fence->ops->enable_signaling() being
+	 * called.  So here if we fail to match signaled_count, we need to
+	 * fall through and try a 0 timeout wait!
+	 */
+
+	if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT) {
+		for (i = 0; i < count; ++i) {
+			drm_syncobj_fence_get_or_add_callback(syncobjs[i],
+							      &entries[i].fence,
+							      &entries[i].syncobj_cb,
+							      syncobj_wait_syncobj_func);
+		}
+	}
+
+	do {
+		set_current_state(TASK_INTERRUPTIBLE);
+
+		signaled_count = 0;
+		for (i = 0; i < count; ++i) {
+			fence = entries[i].fence;
+			if (!fence)
+				continue;
+
+			if (dma_fence_is_signaled(fence) ||
+			    (!entries[i].fence_cb.func &&
+			     dma_fence_add_callback(fence,
+						    &entries[i].fence_cb,
+						    syncobj_wait_fence_func))) {
+				/* The fence has been signaled */
+				if (flags & DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL) {
+					signaled_count++;
+				} else {
+					if (idx)
+						*idx = i;
+					goto done_waiting;
+				}
+			}
+		}
+
+		if (signaled_count == count)
+			goto done_waiting;
+
+		if (timeout == 0) {
+			/* If we are doing a 0 timeout wait and we got
+			 * here, then we just timed out.
+			 */
+			ret = 0;
+			goto done_waiting;
+		}
+
+		ret = schedule_timeout(ret);
+
+		if (ret > 0 && signal_pending(current))
+			ret = -ERESTARTSYS;
+	} while (ret > 0);
+
+done_waiting:
+	__set_current_state(TASK_RUNNING);
+
+cleanup_entries:
+	for (i = 0; i < count; ++i) {
+		if (entries[i].syncobj_cb.func)
+			drm_syncobj_remove_callback(syncobjs[i],
+						    &entries[i].syncobj_cb);
+		if (entries[i].fence_cb.func)
+			dma_fence_remove_callback(entries[i].fence,
+						  &entries[i].fence_cb);
+		dma_fence_put(entries[i].fence);
+	}
+	kfree(entries);
+
+	return ret;
+}
+
+/**
+ * drm_timeout_abs_to_jiffies - calculate jiffies timeout from absolute value
+ *
+ * @timeout_nsec: absolute timeout in ns, 0 for poll
+ *
+ * Calculate the timeout in jiffies from an absolute time in nanoseconds.
+ */
+static signed long drm_timeout_abs_to_jiffies(int64_t timeout_nsec)
+{
+	ktime_t abs_timeout, now;
+	u64 timeout_ns, timeout_jiffies64;
+
+	/* treat a 0 timeout as a poll - absolute 0 doesn't seem valid */
+	if (timeout_nsec == 0)
+		return 0;
+
+	abs_timeout = ns_to_ktime(timeout_nsec);
+	now = ktime_get();
+
+	if (!ktime_after(abs_timeout, now))
+		return 0;
+
+	timeout_ns = ktime_to_ns(ktime_sub(abs_timeout, now));
+
+	timeout_jiffies64 = nsecs_to_jiffies64(timeout_ns);
+	/*  clamp timeout to avoid infinite timeout */
+	if (timeout_jiffies64 >= MAX_SCHEDULE_TIMEOUT - 1)
+		return MAX_SCHEDULE_TIMEOUT - 1;
+
+	return timeout_jiffies64 + 1;
+}
+
+static int drm_syncobj_array_wait(struct drm_device *dev,
+				  struct drm_file *file_private,
+				  struct drm_syncobj_wait *wait,
+				  struct drm_syncobj **syncobjs)
+{
+	signed long timeout = drm_timeout_abs_to_jiffies(wait->timeout_nsec);
+	signed long ret = 0;
+	uint32_t first = ~0;
+
+	ret = drm_syncobj_array_wait_timeout(syncobjs,
+					     wait->count_handles,
+					     wait->flags,
+					     timeout, &first);
+	if (ret < 0)
+		return ret;
+
+	wait->first_signaled = first;
+	if (ret == 0)
+		return -ETIME;
+	return 0;
+}
+
+static int drm_syncobj_array_find(struct drm_file *file_private,
+				  void *user_handles, uint32_t count_handles,
+				  struct drm_syncobj ***syncobjs_out)
+{
+	uint32_t i, *handles;
+	struct drm_syncobj **syncobjs;
+	int ret;
+
+	handles = kmalloc_array(count_handles, sizeof(*handles), GFP_KERNEL);
+	if (handles == NULL)
+		return -ENOMEM;
+
+	if (copy_from_user(handles, user_handles,
+			   sizeof(uint32_t) * count_handles)) {
+		ret = -EFAULT;
+		goto err_free_handles;
+	}
+
+	syncobjs = kmalloc_array(count_handles, sizeof(*syncobjs), GFP_KERNEL);
+	if (syncobjs == NULL) {
+		ret = -ENOMEM;
+		goto err_free_handles;
+	}
+
+	for (i = 0; i < count_handles; i++) {
+		syncobjs[i] = drm_syncobj_find(file_private, handles[i]);
+		if (!syncobjs[i]) {
+			ret = -ENOENT;
+			goto err_put_syncobjs;
+		}
+	}
+
+	kfree(handles);
+	*syncobjs_out = syncobjs;
+	return 0;
+
+err_put_syncobjs:
+	while (i-- > 0)
+		drm_syncobj_put(syncobjs[i]);
+	kfree(syncobjs);
+err_free_handles:
+	kfree(handles);
+
+	return ret;
+}
+
+static void drm_syncobj_array_free(struct drm_syncobj **syncobjs,
+				   uint32_t count)
+{
+	uint32_t i;
+	for (i = 0; i < count; i++)
+		drm_syncobj_put(syncobjs[i]);
+	kfree(syncobjs);
+}
+
+int
+drm_syncobj_wait_ioctl(struct drm_device *dev, void *data,
+		       struct drm_file *file_private)
+{
+	struct drm_syncobj_wait *args = data;
+	struct drm_syncobj **syncobjs;
+	int ret = 0;
+
+	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+		return -ENODEV;
+
+	if (args->flags & ~(DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL |
+			    DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT))
+		return -EINVAL;
+
+	if (args->count_handles == 0)
+		return -EINVAL;
+
+	ret = drm_syncobj_array_find(file_private,
+				     u64_to_user_ptr(args->handles),
+				     args->count_handles,
+				     &syncobjs);
+	if (ret < 0)
+		return ret;
+
+	ret = drm_syncobj_array_wait(dev, file_private,
+				     args, syncobjs);
+
+	drm_syncobj_array_free(syncobjs, args->count_handles);
+
+	return ret;
+}
+
+int
+drm_syncobj_reset_ioctl(struct drm_device *dev, void *data,
+			struct drm_file *file_private)
+{
+	struct drm_syncobj_array *args = data;
+	struct drm_syncobj **syncobjs;
+	uint32_t i;
+	int ret;
+
+	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+		return -ENODEV;
+
+	if (args->pad != 0)
+		return -EINVAL;
+
+	if (args->count_handles == 0)
+		return -EINVAL;
+
+	ret = drm_syncobj_array_find(file_private,
+				     u64_to_user_ptr(args->handles),
+				     args->count_handles,
+				     &syncobjs);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < args->count_handles; i++)
+		drm_syncobj_replace_fence(syncobjs[i], NULL);
+
+	drm_syncobj_array_free(syncobjs, args->count_handles);
+
+	return 0;
+}
+
+int
+drm_syncobj_signal_ioctl(struct drm_device *dev, void *data,
+			 struct drm_file *file_private)
+{
+	struct drm_syncobj_array *args = data;
+	struct drm_syncobj **syncobjs;
+	uint32_t i;
+	int ret;
+
+	if (!drm_core_check_feature(dev, DRIVER_SYNCOBJ))
+		return -ENODEV;
+
+	if (args->pad != 0)
+		return -EINVAL;
+
+	if (args->count_handles == 0)
+		return -EINVAL;
+
+	ret = drm_syncobj_array_find(file_private,
+				     u64_to_user_ptr(args->handles),
+				     args->count_handles,
+				     &syncobjs);
+	if (ret < 0)
+		return ret;
+
+	for (i = 0; i < args->count_handles; i++) {
+		ret = drm_syncobj_assign_null_handle(syncobjs[i]);
+		if (ret < 0)
+			break;
+	}
+
+	drm_syncobj_array_free(syncobjs, args->count_handles);
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index e9f33cd..70f2b95 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -31,6 +31,41 @@
 #include "drm_trace.h"
 #include "drm_internal.h"
 
+/**
+ * DOC: vblank handling
+ *
+ * Vertical blanking plays a major role in graphics rendering. To achieve
+ * tear-free display, users must synchronize page flips and/or rendering to
+ * vertical blanking. The DRM API offers ioctls to perform page flips
+ * synchronized to vertical blanking and wait for vertical blanking.
+ *
+ * The DRM core handles most of the vertical blanking management logic, which
+ * involves filtering out spurious interrupts, keeping race-free blanking
+ * counters, coping with counter wrap-around and resets and keeping use counts.
+ * It relies on the driver to generate vertical blanking interrupts and
+ * optionally provide a hardware vertical blanking counter.
+ *
+ * Drivers must initialize the vertical blanking handling core with a call to
+ * drm_vblank_init(). Minimally, a driver needs to implement
+ * &drm_crtc_funcs.enable_vblank and &drm_crtc_funcs.disable_vblank plus call
+ * drm_crtc_handle_vblank() in its vblank interrupt handler for working vblank
+ * support.
+ *
+ * Vertical blanking interrupts can be enabled by the DRM core or by drivers
+ * themselves (for instance to handle page flipping operations).  The DRM core
+ * maintains a vertical blanking use count to ensure that the interrupts are not
+ * disabled while a user still needs them. To increment the use count, drivers
+ * call drm_crtc_vblank_get() and release the vblank reference again with
+ * drm_crtc_vblank_put(). In between these two calls vblank interrupts are
+ * guaranteed to be enabled.
+ *
+ * On many devices, disabling the vblank interrupt cannot be done in a race-free
+ * manner, see &drm_driver.vblank_disable_immediate and
+ * &drm_driver.max_vblank_count. In that case the vblank core only disables the
+ * vblanks after a timer has expired, which can be configured through the
+ * ``vblankoffdelay`` module parameter.
+ */
+
 /* Retry timestamp calculation up to 3 times to satisfy
  * drm_timestamp_precision before giving up.
  */
@@ -259,16 +294,17 @@ static u32 drm_vblank_count(struct drm_device *dev, unsigned int pipe)
 }
 
 /**
- * drm_accurate_vblank_count - retrieve the master vblank counter
+ * drm_crtc_accurate_vblank_count - retrieve the master vblank counter
  * @crtc: which counter to retrieve
  *
- * This function is similar to @drm_crtc_vblank_count but this
- * function interpolates to handle a race with vblank irq's.
+ * This function is similar to drm_crtc_vblank_count() but this function
+ * interpolates to handle a race with vblank interrupts using the high precision
+ * timestamping support.
  *
- * This is mostly useful for hardware that can obtain the scanout
- * position, but doesn't have a frame counter.
+ * This is mostly useful for hardware that can obtain the scanout position, but
+ * doesn't have a hardware frame counter.
  */
-u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
+u32 drm_crtc_accurate_vblank_count(struct drm_crtc *crtc)
 {
 	struct drm_device *dev = crtc->dev;
 	unsigned int pipe = drm_crtc_index(crtc);
@@ -287,7 +323,7 @@ u32 drm_accurate_vblank_count(struct drm_crtc *crtc)
 
 	return vblank;
 }
-EXPORT_SYMBOL(drm_accurate_vblank_count);
+EXPORT_SYMBOL(drm_crtc_accurate_vblank_count);
 
 static void __disable_vblank(struct drm_device *dev, unsigned int pipe)
 {
@@ -358,15 +394,6 @@ static void vblank_disable_fn(unsigned long arg)
 	spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
 }
 
-/**
- * drm_vblank_cleanup - cleanup vblank support
- * @dev: DRM device
- *
- * This function cleans up any resources allocated in drm_vblank_init.
- *
- * Drivers which don't use drm_irq_install() need to set &drm_device.irq_enabled
- * themselves, to signal to the DRM core that vblank interrupts are enabled.
- */
 void drm_vblank_cleanup(struct drm_device *dev)
 {
 	unsigned int pipe;
@@ -388,7 +415,6 @@ void drm_vblank_cleanup(struct drm_device *dev)
 
 	dev->num_crtcs = 0;
 }
-EXPORT_SYMBOL(drm_vblank_cleanup);
 
 /**
  * drm_vblank_init - initialize vblank support
@@ -396,6 +422,8 @@ EXPORT_SYMBOL(drm_vblank_cleanup);
  * @num_crtcs: number of CRTCs supported by @dev
  *
  * This function initializes vblank support for @num_crtcs display pipelines.
+ * Cleanup is handled by the DRM core, or through calling drm_dev_fini() for
+ * drivers with a &drm_driver.release callback.
  *
  * Returns:
  * Zero on success or a negative error code on failure.
@@ -468,11 +496,11 @@ EXPORT_SYMBOL(drm_crtc_vblank_waitqueue);
  * @crtc: drm_crtc whose timestamp constants should be updated.
  * @mode: display mode containing the scanout timings
  *
- * Calculate and store various constants which are later
- * needed by vblank and swap-completion timestamping, e.g,
- * by drm_calc_vbltimestamp_from_scanoutpos(). They are
- * derived from CRTC's true scanout timing, so they take
- * things like panel scaling or other adjustments into account.
+ * Calculate and store various constants which are later needed by vblank and
+ * swap-completion timestamping, e.g. by
+ * drm_calc_vbltimestamp_from_scanoutpos(). They are derived from CRTC's true
+ * scanout timing, so they take things like panel scaling or other adjustments
+ * into account.
  */
 void drm_calc_timestamping_constants(struct drm_crtc *crtc,
 				     const struct drm_display_mode *mode)
@@ -535,25 +563,14 @@ EXPORT_SYMBOL(drm_calc_timestamping_constants);
  *     if flag is set.
  *
  * Implements calculation of exact vblank timestamps from given drm_display_mode
- * timings and current video scanout position of a CRTC. This can be called from
- * within get_vblank_timestamp() implementation of a kms driver to implement the
- * actual timestamping.
+ * timings and current video scanout position of a CRTC. This can be directly
+ * used as the &drm_driver.get_vblank_timestamp implementation of a kms driver
+ * if &drm_driver.get_scanout_position is implemented.
  *
- * Should return timestamps conforming to the OML_sync_control OpenML
- * extension specification. The timestamp corresponds to the end of
- * the vblank interval, aka start of scanout of topmost-leftmost display
- * pixel in the following video frame.
- *
- * Requires support for optional dev->driver->get_scanout_position()
- * in kms driver, plus a bit of setup code to provide a drm_display_mode
- * that corresponds to the true scanout timing.
- *
- * The current implementation only handles standard video modes. It
- * returns as no operation if a doublescan or interlaced video mode is
- * active. Higher level code is expected to handle this.
- *
- * This function can be used to implement the &drm_driver.get_vblank_timestamp
- * directly, if the driver implements the &drm_driver.get_scanout_position hook.
+ * The current implementation only handles standard video modes. For double scan
+ * and interlaced modes the driver is supposed to adjust the hardware mode
+ * (taken from &drm_crtc_state.adjusted_mode for atomic modeset drivers) to
+ * match the scanout position reported.
  *
  * Note that atomic drivers must call drm_calc_timestamping_constants() before
  * enabling a CRTC. The atomic helpers already take care of that in
@@ -738,7 +755,9 @@ drm_get_last_vbltimestamp(struct drm_device *dev, unsigned int pipe,
  *
  * Fetches the "cooked" vblank count value that represents the number of
  * vblank events since the system was booted, including lost events due to
- * modesetting activity.
+ * modesetting activity. Note that this counter isn't correct against a racing
+ * vblank interrupt (since it only reports the software vblank counter), see
+ * drm_crtc_accurate_vblank_count() for such use-cases.
  *
  * Returns:
  * The software vblank counter.
@@ -749,20 +768,6 @@ u32 drm_crtc_vblank_count(struct drm_crtc *crtc)
 }
 EXPORT_SYMBOL(drm_crtc_vblank_count);
 
-/**
- * drm_vblank_count_and_time - retrieve "cooked" vblank counter value and the
- *     system timestamp corresponding to that vblank counter value.
- * @dev: DRM device
- * @pipe: index of CRTC whose counter to retrieve
- * @vblanktime: Pointer to struct timeval to receive the vblank timestamp.
- *
- * Fetches the "cooked" vblank count value that represents the number of
- * vblank events since the system was booted, including lost events due to
- * modesetting activity. Returns corresponding system timestamp of the time
- * of the vblank interval that corresponds to the current vblank counter value.
- *
- * This is the legacy version of drm_crtc_vblank_count_and_time().
- */
 static u32 drm_vblank_count_and_time(struct drm_device *dev, unsigned int pipe,
 				     struct timeval *vblanktime)
 {
@@ -831,7 +836,7 @@ static void send_vblank_event(struct drm_device *dev,
  * NOTE: Drivers using this to send out the &drm_crtc_state.event as part of an
  * atomic commit must ensure that the next vblank happens at exactly the same
  * time as the atomic commit is committed to the hardware. This function itself
- * does **not** protect again the next vblank interrupt racing with either this
+ * does **not** protect against the next vblank interrupt racing with either this
  * function call or the atomic commit operation. A possible sequence could be:
  *
  * 1. Driver commits new hardware state into vblank-synchronized registers.
@@ -852,8 +857,8 @@ static void send_vblank_event(struct drm_device *dev,
  * handler by calling drm_crtc_send_vblank_event() and make sure that there's no
  * possible race with the hardware committing the atomic update.
  *
- * Caller must hold event lock. Caller must also hold a vblank reference for
- * the event @e, which will be dropped when the next vblank arrives.
+ * Caller must hold a vblank reference for the event @e, which will be dropped
+ * when the next vblank arrives.
  */
 void drm_crtc_arm_vblank_event(struct drm_crtc *crtc,
 			       struct drm_pending_vblank_event *e)
@@ -913,14 +918,6 @@ static int __enable_vblank(struct drm_device *dev, unsigned int pipe)
 	return dev->driver->enable_vblank(dev, pipe);
 }
 
-/**
- * drm_vblank_enable - enable the vblank interrupt on a CRTC
- * @dev: DRM device
- * @pipe: CRTC index
- *
- * Returns:
- * Zero on success or a negative error code on failure.
- */
 static int drm_vblank_enable(struct drm_device *dev, unsigned int pipe)
 {
 	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
@@ -958,19 +955,6 @@ static int drm_vblank_enable(struct drm_device *dev, unsigned int pipe)
 	return ret;
 }
 
-/**
- * drm_vblank_get - get a reference count on vblank events
- * @dev: DRM device
- * @pipe: index of CRTC to own
- *
- * Acquire a reference count on vblank events to avoid having them disabled
- * while in use.
- *
- * This is the legacy version of drm_crtc_vblank_get().
- *
- * Returns:
- * Zero on success or a negative error code on failure.
- */
 static int drm_vblank_get(struct drm_device *dev, unsigned int pipe)
 {
 	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
@@ -1014,16 +998,6 @@ int drm_crtc_vblank_get(struct drm_crtc *crtc)
 }
 EXPORT_SYMBOL(drm_crtc_vblank_get);
 
-/**
- * drm_vblank_put - release ownership of vblank events
- * @dev: DRM device
- * @pipe: index of CRTC to release
- *
- * Release ownership of a given vblank counter, turning off interrupts
- * if possible. Disable interrupts after drm_vblank_offdelay milliseconds.
- *
- * This is the legacy version of drm_crtc_vblank_put().
- */
 static void drm_vblank_put(struct drm_device *dev, unsigned int pipe)
 {
 	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
@@ -1067,6 +1041,8 @@ EXPORT_SYMBOL(drm_crtc_vblank_put);
  * This waits for one vblank to pass on @pipe, using the irq driver interfaces.
  * It is a failure to call this when the vblank irq for @pipe is disabled, e.g.
  * due to lack of driver support or because the crtc is off.
+ *
+ * This is the legacy version of drm_crtc_wait_one_vblank().
  */
 void drm_wait_one_vblank(struct drm_device *dev, unsigned int pipe)
 {
@@ -1116,7 +1092,7 @@ EXPORT_SYMBOL(drm_crtc_wait_one_vblank);
  * stored so that drm_vblank_on can restore it again.
  *
  * Drivers must use this function when the hardware vblank counter can get
- * reset, e.g. when suspending.
+ * reset, e.g. when suspending or disabling the @crtc in general.
  */
 void drm_crtc_vblank_off(struct drm_crtc *crtc)
 {
@@ -1184,6 +1160,8 @@ EXPORT_SYMBOL(drm_crtc_vblank_off);
  * drm_crtc_vblank_on() functions. The difference compared to
  * drm_crtc_vblank_off() is that this function doesn't save the vblank counter
  * and hence doesn't need to call any driver hooks.
+ *
+ * This is useful for recovering driver state e.g. on driver load, or on resume.
  */
 void drm_crtc_vblank_reset(struct drm_crtc *crtc)
 {
@@ -1212,9 +1190,10 @@ EXPORT_SYMBOL(drm_crtc_vblank_reset);
  * @crtc: CRTC in question
  *
  * This functions restores the vblank interrupt state captured with
- * drm_crtc_vblank_off() again. Note that calls to drm_crtc_vblank_on() and
- * drm_crtc_vblank_off() can be unbalanced and so can also be unconditionally called
- * in driver load code to reflect the current hardware state of the crtc.
+ * drm_crtc_vblank_off() again and is generally called when enabling @crtc. Note
+ * that calls to drm_crtc_vblank_on() and drm_crtc_vblank_off() can be
+ * unbalanced and so can also be unconditionally called in driver load code to
+ * reflect the current hardware state of the crtc.
  */
 void drm_crtc_vblank_on(struct drm_crtc *crtc)
 {
@@ -1299,8 +1278,8 @@ static void drm_legacy_vblank_post_modeset(struct drm_device *dev,
 	}
 }
 
-int drm_legacy_modeset_ctl(struct drm_device *dev, void *data,
-			   struct drm_file *file_priv)
+int drm_legacy_modeset_ctl_ioctl(struct drm_device *dev, void *data,
+				 struct drm_file *file_priv)
 {
 	struct drm_modeset_ctl *modeset = data;
 	unsigned int pipe;
@@ -1419,22 +1398,8 @@ static bool drm_wait_vblank_is_query(union drm_wait_vblank *vblwait)
 					  _DRM_VBLANK_NEXTONMISS));
 }
 
-/*
- * Wait for VBLANK.
- *
- * \param inode device inode.
- * \param file_priv DRM file private.
- * \param cmd command.
- * \param data user argument, pointing to a drm_wait_vblank structure.
- * \return zero on success or a negative number on failure.
- *
- * This function enables the vblank interrupt on the pipe requested, then
- * sleeps waiting for the requested sequence number to occur, and drops
- * the vblank interrupt refcount afterwards. (vblank IRQ disable follows that
- * after a timeout with no further vblank waits scheduled).
- */
-int drm_wait_vblank(struct drm_device *dev, void *data,
-		    struct drm_file *file_priv)
+int drm_wait_vblank_ioctl(struct drm_device *dev, void *data,
+			  struct drm_file *file_priv)
 {
 	struct drm_vblank_crtc *vblank;
 	union drm_wait_vblank *vblwait = data;
diff --git a/drivers/gpu/drm/drm_vm.c b/drivers/gpu/drm/drm_vm.c
index 1170b32..13a59ed 100644
--- a/drivers/gpu/drm/drm_vm.c
+++ b/drivers/gpu/drm/drm_vm.c
@@ -631,7 +631,7 @@ int drm_legacy_mmap(struct file *filp, struct vm_area_struct *vma)
 	struct drm_device *dev = priv->minor->dev;
 	int ret;
 
-	if (drm_device_is_unplugged(dev))
+	if (drm_dev_is_unplugged(dev))
 		return -ENODEV;
 
 	mutex_lock(&dev->struct_mutex);
diff --git a/drivers/gpu/drm/etnaviv/Kconfig b/drivers/gpu/drm/etnaviv/Kconfig
index 71cee4e..38b477b 100644
--- a/drivers/gpu/drm/etnaviv/Kconfig
+++ b/drivers/gpu/drm/etnaviv/Kconfig
@@ -10,6 +10,8 @@
 	select IOMMU_API
 	select IOMMU_SUPPORT
 	select WANT_DEV_COREDUMP
+	select CMA if HAVE_DMA_CONTIGUOUS
+	select DMA_CMA if HAVE_DMA_CONTIGUOUS
 	help
 	  DRM driver for Vivante GPUs.
 
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index 91e17ae..2cb4773 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -316,7 +316,7 @@ static int etnaviv_ioctl_gem_cpu_prep(struct drm_device *dev, void *data,
 
 	ret = etnaviv_gem_cpu_prep(obj, args->op, &TS(args->timeout));
 
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return ret;
 }
@@ -337,7 +337,7 @@ static int etnaviv_ioctl_gem_cpu_fini(struct drm_device *dev, void *data,
 
 	ret = etnaviv_gem_cpu_fini(obj);
 
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return ret;
 }
@@ -357,7 +357,7 @@ static int etnaviv_ioctl_gem_info(struct drm_device *dev, void *data,
 		return -ENOENT;
 
 	ret = etnaviv_gem_mmap_offset(obj, &args->offset);
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return ret;
 }
@@ -446,7 +446,7 @@ static int etnaviv_ioctl_gem_wait(struct drm_device *dev, void *data,
 
 	ret = etnaviv_gem_wait_bo(gpu, obj, timeout);
 
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem.c b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
index 9a3bea7..5a63459 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem.c
@@ -68,7 +68,7 @@ static int etnaviv_gem_shmem_get_pages(struct etnaviv_gem_object *etnaviv_obj)
 	struct page **p = drm_gem_get_pages(&etnaviv_obj->base);
 
 	if (IS_ERR(p)) {
-		dev_err(dev->dev, "could not get pages: %ld\n", PTR_ERR(p));
+		dev_dbg(dev->dev, "could not get pages: %ld\n", PTR_ERR(p));
 		return PTR_ERR(p);
 	}
 
@@ -265,7 +265,7 @@ void etnaviv_gem_mapping_reference(struct etnaviv_vram_mapping *mapping)
 {
 	struct etnaviv_gem_object *etnaviv_obj = mapping->object;
 
-	drm_gem_object_reference(&etnaviv_obj->base);
+	drm_gem_object_get(&etnaviv_obj->base);
 
 	mutex_lock(&etnaviv_obj->lock);
 	WARN_ON(mapping->use == 0);
@@ -282,7 +282,7 @@ void etnaviv_gem_mapping_unreference(struct etnaviv_vram_mapping *mapping)
 	mapping->use -= 1;
 	mutex_unlock(&etnaviv_obj->lock);
 
-	drm_gem_object_unreference_unlocked(&etnaviv_obj->base);
+	drm_gem_object_put_unlocked(&etnaviv_obj->base);
 }
 
 struct etnaviv_vram_mapping *etnaviv_gem_mapping_get(
@@ -358,7 +358,7 @@ struct etnaviv_vram_mapping *etnaviv_gem_mapping_get(
 		return ERR_PTR(ret);
 
 	/* Take a reference on the object */
-	drm_gem_object_reference(obj);
+	drm_gem_object_get(obj);
 	return mapping;
 }
 
@@ -413,6 +413,16 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
 	bool write = !!(op & ETNA_PREP_WRITE);
 	int ret;
 
+	if (!etnaviv_obj->sgt) {
+		void *ret;
+
+		mutex_lock(&etnaviv_obj->lock);
+		ret = etnaviv_gem_get_pages(etnaviv_obj);
+		mutex_unlock(&etnaviv_obj->lock);
+		if (IS_ERR(ret))
+			return PTR_ERR(ret);
+	}
+
 	if (op & ETNA_PREP_NOSYNC) {
 		if (!reservation_object_test_signaled_rcu(etnaviv_obj->resv,
 							  write))
@@ -427,16 +437,6 @@ int etnaviv_gem_cpu_prep(struct drm_gem_object *obj, u32 op,
 	}
 
 	if (etnaviv_obj->flags & ETNA_BO_CACHED) {
-		if (!etnaviv_obj->sgt) {
-			void *ret;
-
-			mutex_lock(&etnaviv_obj->lock);
-			ret = etnaviv_gem_get_pages(etnaviv_obj);
-			mutex_unlock(&etnaviv_obj->lock);
-			if (IS_ERR(ret))
-				return PTR_ERR(ret);
-		}
-
 		dma_sync_sg_for_cpu(dev->dev, etnaviv_obj->sgt->sgl,
 				    etnaviv_obj->sgt->nents,
 				    etnaviv_op_to_dma_dir(op));
@@ -662,7 +662,8 @@ static struct drm_gem_object *__etnaviv_gem_new(struct drm_device *dev,
 		 * going to pin these pages.
 		 */
 		mapping = obj->filp->f_mapping;
-		mapping_set_gfp_mask(mapping, GFP_HIGHUSER);
+		mapping_set_gfp_mask(mapping, GFP_HIGHUSER |
+				     __GFP_RETRY_MAYFAIL | __GFP_NOWARN);
 	}
 
 	if (ret)
@@ -671,7 +672,7 @@ static struct drm_gem_object *__etnaviv_gem_new(struct drm_device *dev,
 	return obj;
 
 fail:
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 	return ERR_PTR(ret);
 }
 
@@ -688,14 +689,14 @@ int etnaviv_gem_new_handle(struct drm_device *dev, struct drm_file *file,
 
 	ret = etnaviv_gem_obj_add(dev, obj);
 	if (ret < 0) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
 	ret = drm_gem_handle_create(file, obj, handle);
 
 	/* drop reference from allocate - handle holds it now */
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return ret;
 }
@@ -712,7 +713,7 @@ struct drm_gem_object *etnaviv_gem_new(struct drm_device *dev,
 
 	ret = etnaviv_gem_obj_add(dev, obj);
 	if (ret < 0) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR(ret);
 	}
 
@@ -800,7 +801,7 @@ static void __etnaviv_gem_userptr_get_pages(struct work_struct *_work)
 	}
 
 	mutex_unlock(&etnaviv_obj->lock);
-	drm_gem_object_unreference_unlocked(&etnaviv_obj->base);
+	drm_gem_object_put_unlocked(&etnaviv_obj->base);
 
 	mmput(work->mm);
 	put_task_struct(work->task);
@@ -858,7 +859,7 @@ static int etnaviv_gem_userptr_get_pages(struct etnaviv_gem_object *etnaviv_obj)
 	}
 
 	get_task_struct(current);
-	drm_gem_object_reference(&etnaviv_obj->base);
+	drm_gem_object_get(&etnaviv_obj->base);
 
 	work->mm = mm;
 	work->task = current;
@@ -924,6 +925,6 @@ int etnaviv_gem_new_userptr(struct drm_device *dev, struct drm_file *file,
 	ret = drm_gem_handle_create(file, &etnaviv_obj->base, handle);
 unreference:
 	/* drop reference from allocate - handle holds it now */
-	drm_gem_object_unreference_unlocked(&etnaviv_obj->base);
+	drm_gem_object_put_unlocked(&etnaviv_obj->base);
 	return ret;
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
index e5da4f23..ae88472 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
@@ -146,7 +146,7 @@ struct drm_gem_object *etnaviv_gem_prime_import_sg_table(struct drm_device *dev,
 	return &etnaviv_obj->base;
 
 fail:
-	drm_gem_object_unreference_unlocked(&etnaviv_obj->base);
+	drm_gem_object_put_unlocked(&etnaviv_obj->base);
 
 	return ERR_PTR(ret);
 }
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
index 6463fc2..a7ff2e4 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_submit.c
@@ -88,7 +88,7 @@ static int submit_lookup_objects(struct etnaviv_gem_submit *submit,
 		 * Take a refcount on the object. The file table lock
 		 * prevents the object_idr's refcount on this being dropped.
 		 */
-		drm_gem_object_reference(obj);
+		drm_gem_object_get(obj);
 
 		submit->bos[i].obj = to_etnaviv_bo(obj);
 	}
@@ -291,7 +291,7 @@ static void submit_cleanup(struct etnaviv_gem_submit *submit)
 		struct etnaviv_gem_object *etnaviv_obj = submit->bos[i].obj;
 
 		submit_unlock_object(submit, i);
-		drm_gem_object_unreference_unlocked(&etnaviv_obj->base);
+		drm_gem_object_put_unlocked(&etnaviv_obj->base);
 	}
 
 	ww_acquire_fini(&submit->ticket);
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
index ada45fd..fc9a6a8 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gpu.c
@@ -1622,10 +1622,12 @@ static int etnaviv_gpu_bind(struct device *dev, struct device *master,
 	struct etnaviv_gpu *gpu = dev_get_drvdata(dev);
 	int ret;
 
-	gpu->cooling = thermal_of_cooling_device_register(dev->of_node,
+	if (IS_ENABLED(CONFIG_THERMAL)) {
+		gpu->cooling = thermal_of_cooling_device_register(dev->of_node,
 				(char *)dev_name(dev), gpu, &cooling_ops);
-	if (IS_ERR(gpu->cooling))
-		return PTR_ERR(gpu->cooling);
+		if (IS_ERR(gpu->cooling))
+			return PTR_ERR(gpu->cooling);
+	}
 
 #ifdef CONFIG_PM
 	ret = pm_runtime_get_sync(gpu->dev);
diff --git a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
index 5792ca8..730b8d9 100644
--- a/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
+++ b/drivers/gpu/drm/exynos/exynos5433_drm_decon.c
@@ -13,6 +13,7 @@
 #include <linux/platform_device.h>
 #include <linux/clk.h>
 #include <linux/component.h>
+#include <linux/iopoll.h>
 #include <linux/mfd/syscon.h>
 #include <linux/of_device.h>
 #include <linux/of_gpio.h>
@@ -33,9 +34,8 @@
 #define WINDOWS_NR	3
 #define MIN_FB_WIDTH_FOR_16WORD_BURST	128
 
-#define IFTYPE_I80	(1 << 0)
-#define I80_HW_TRG	(1 << 1)
-#define IFTYPE_HDMI	(1 << 2)
+#define I80_HW_TRG	(1 << 0)
+#define IFTYPE_HDMI	(1 << 1)
 
 static const char * const decon_clks_name[] = {
 	"pclk",
@@ -57,6 +57,8 @@ struct decon_context {
 	struct regmap			*sysreg;
 	struct clk			*clks[ARRAY_SIZE(decon_clks_name)];
 	unsigned int			irq;
+	unsigned int			irq_vsync;
+	unsigned int			irq_lcd_sys;
 	unsigned int			te_irq;
 	unsigned long			out_type;
 	int				first_win;
@@ -90,7 +92,7 @@ static int decon_enable_vblank(struct exynos_drm_crtc *crtc)
 	u32 val;
 
 	val = VIDINTCON0_INTEN;
-	if (ctx->out_type & IFTYPE_I80)
+	if (crtc->i80_mode)
 		val |= VIDINTCON0_FRAMEDONE;
 	else
 		val |= VIDINTCON0_INTFRMEN | VIDINTCON0_FRAMESEL_FP;
@@ -139,7 +141,7 @@ static u32 decon_get_frame_count(struct decon_context *ctx, bool end)
 
 	switch (status & (VIDCON1_VSTATUS_MASK | VIDCON1_I80_ACTIVE)) {
 	case VIDCON1_VSTATUS_VS:
-		if (!(ctx->out_type & IFTYPE_I80))
+		if (!(ctx->crtc->i80_mode))
 			--frm;
 		break;
 	case VIDCON1_VSTATUS_BP:
@@ -166,7 +168,7 @@ static u32 decon_get_vblank_counter(struct exynos_drm_crtc *crtc)
 
 static void decon_setup_trigger(struct decon_context *ctx)
 {
-	if (!(ctx->out_type & (IFTYPE_I80 | I80_HW_TRG)))
+	if (!ctx->crtc->i80_mode && !(ctx->out_type & I80_HW_TRG))
 		return;
 
 	if (!(ctx->out_type & I80_HW_TRG)) {
@@ -206,7 +208,7 @@ static void decon_commit(struct exynos_drm_crtc *crtc)
 	val = VIDOUT_LCD_ON;
 	if (interlaced)
 		val |= VIDOUT_INTERLACE_EN_F;
-	if (ctx->out_type & IFTYPE_I80) {
+	if (crtc->i80_mode) {
 		val |= VIDOUT_COMMAND_IF;
 	} else {
 		val |= VIDOUT_RGB_IF;
@@ -222,7 +224,7 @@ static void decon_commit(struct exynos_drm_crtc *crtc)
 			VIDTCON2_HOZVAL(m->hdisplay - 1);
 	writel(val, ctx->addr + DECON_VIDTCON2);
 
-	if (!(ctx->out_type & IFTYPE_I80)) {
+	if (!crtc->i80_mode) {
 		int vbp = m->crtc_vtotal - m->crtc_vsync_end;
 		int vfp = m->crtc_vsync_start - m->crtc_vdisplay;
 
@@ -277,16 +279,14 @@ static void decon_win_set_pixfmt(struct decon_context *ctx, unsigned int win,
 		val |= WINCONx_BURSTLEN_16WORD;
 		break;
 	case DRM_FORMAT_ARGB8888:
+	default:
 		val |= WINCONx_BPPMODE_32BPP_A8888;
 		val |= WINCONx_WSWP_F | WINCONx_BLD_PIX_F | WINCONx_ALPHA_SEL_F;
 		val |= WINCONx_BURSTLEN_16WORD;
 		break;
-	default:
-		DRM_ERROR("Proper pixel format is not set\n");
-		return;
 	}
 
-	DRM_DEBUG_KMS("bpp = %u\n", fb->format->cpp[0] * 8);
+	DRM_DEBUG_KMS("cpp = %u\n", fb->format->cpp[0]);
 
 	/*
 	 * In case of exynos, setting dma-burst to 16Word causes permanent
@@ -329,7 +329,7 @@ static void decon_update_plane(struct exynos_drm_crtc *crtc,
 	struct decon_context *ctx = crtc->ctx;
 	struct drm_framebuffer *fb = state->base.fb;
 	unsigned int win = plane->index;
-	unsigned int bpp = fb->format->cpp[0];
+	unsigned int cpp = fb->format->cpp[0];
 	unsigned int pitch = fb->pitches[0];
 	dma_addr_t dma_addr = exynos_drm_fb_dma_addr(fb, 0);
 	u32 val;
@@ -365,11 +365,11 @@ static void decon_update_plane(struct exynos_drm_crtc *crtc,
 	writel(val, ctx->addr + DECON_VIDW0xADD1B0(win));
 
 	if (!(ctx->out_type & IFTYPE_HDMI))
-		val = BIT_VAL(pitch - state->crtc.w * bpp, 27, 14)
-			| BIT_VAL(state->crtc.w * bpp, 13, 0);
+		val = BIT_VAL(pitch - state->crtc.w * cpp, 27, 14)
+			| BIT_VAL(state->crtc.w * cpp, 13, 0);
 	else
-		val = BIT_VAL(pitch - state->crtc.w * bpp, 29, 15)
-			| BIT_VAL(state->crtc.w * bpp, 14, 0);
+		val = BIT_VAL(pitch - state->crtc.w * cpp, 29, 15)
+			| BIT_VAL(state->crtc.w * cpp, 14, 0);
 	writel(val, ctx->addr + DECON_VIDW0xADD2(win));
 
 	decon_win_set_pixfmt(ctx, win, fb);
@@ -407,24 +407,19 @@ static void decon_atomic_flush(struct exynos_drm_crtc *crtc)
 
 static void decon_swreset(struct decon_context *ctx)
 {
-	unsigned int tries;
 	unsigned long flags;
+	u32 val;
+	int ret;
 
 	writel(0, ctx->addr + DECON_VIDCON0);
-	for (tries = 2000; tries; --tries) {
-		if (~readl(ctx->addr + DECON_VIDCON0) & VIDCON0_STOP_STATUS)
-			break;
-		udelay(10);
-	}
+	readl_poll_timeout(ctx->addr + DECON_VIDCON0, val,
+			   ~val & VIDCON0_STOP_STATUS, 12, 20000);
 
 	writel(VIDCON0_SWRESET, ctx->addr + DECON_VIDCON0);
-	for (tries = 2000; tries; --tries) {
-		if (~readl(ctx->addr + DECON_VIDCON0) & VIDCON0_SWRESET)
-			break;
-		udelay(10);
-	}
+	ret = readl_poll_timeout(ctx->addr + DECON_VIDCON0, val,
+				 ~val & VIDCON0_SWRESET, 12, 20000);
 
-	WARN(tries == 0, "failed to software reset DECON\n");
+	WARN(ret < 0, "failed to software reset DECON\n");
 
 	spin_lock_irqsave(&ctx->vblank_lock, flags);
 	ctx->frame_id = 0;
@@ -515,6 +510,22 @@ static void decon_clear_channels(struct exynos_drm_crtc *crtc)
 		clk_disable_unprepare(ctx->clks[i]);
 }
 
+static enum drm_mode_status decon_mode_valid(struct exynos_drm_crtc *crtc,
+		const struct drm_display_mode *mode)
+{
+	struct decon_context *ctx = crtc->ctx;
+
+	ctx->irq = crtc->i80_mode ? ctx->irq_lcd_sys : ctx->irq_vsync;
+
+	if (ctx->irq)
+		return MODE_OK;
+
+	dev_info(ctx->dev, "Sink requires %s mode, but appropriate interrupt is not provided.\n",
+			crtc->i80_mode ? "command" : "video");
+
+	return MODE_BAD;
+}
+
 static const struct exynos_drm_crtc_ops decon_crtc_ops = {
 	.enable			= decon_enable,
 	.disable		= decon_disable,
@@ -524,6 +535,7 @@ static const struct exynos_drm_crtc_ops decon_crtc_ops = {
 	.atomic_begin		= decon_atomic_begin,
 	.update_plane		= decon_update_plane,
 	.disable_plane		= decon_disable_plane,
+	.mode_valid		= decon_mode_valid,
 	.atomic_flush		= decon_atomic_flush,
 };
 
@@ -674,19 +686,22 @@ static const struct of_device_id exynos5433_decon_driver_dt_match[] = {
 MODULE_DEVICE_TABLE(of, exynos5433_decon_driver_dt_match);
 
 static int decon_conf_irq(struct decon_context *ctx, const char *name,
-		irq_handler_t handler, unsigned long int flags, bool required)
+		irq_handler_t handler, unsigned long int flags)
 {
 	struct platform_device *pdev = to_platform_device(ctx->dev);
 	int ret, irq = platform_get_irq_byname(pdev, name);
 
 	if (irq < 0) {
-		if (irq == -EPROBE_DEFER)
+		switch (irq) {
+		case -EPROBE_DEFER:
 			return irq;
-		if (required)
-			dev_err(ctx->dev, "cannot get %s IRQ\n", name);
-		else
-			irq = 0;
-		return irq;
+		case -ENODATA:
+		case -ENXIO:
+			return 0;
+		default:
+			dev_err(ctx->dev, "IRQ %s get failed, %d\n", name, irq);
+			return irq;
+		}
 	}
 	irq_set_status_flags(irq, IRQ_NOAUTOEN);
 	ret = devm_request_irq(ctx->dev, irq, handler, flags, "drm_decon", ctx);
@@ -714,11 +729,8 @@ static int exynos5433_decon_probe(struct platform_device *pdev)
 	ctx->out_type = (unsigned long)of_device_get_match_data(dev);
 	spin_lock_init(&ctx->vblank_lock);
 
-	if (ctx->out_type & IFTYPE_HDMI) {
+	if (ctx->out_type & IFTYPE_HDMI)
 		ctx->first_win = 1;
-	} else if (of_get_child_by_name(dev->of_node, "i80-if-timings")) {
-		ctx->out_type |= IFTYPE_I80;
-	}
 
 	for (i = 0; i < ARRAY_SIZE(decon_clks_name); i++) {
 		struct clk *clk;
@@ -742,25 +754,23 @@ static int exynos5433_decon_probe(struct platform_device *pdev)
 		return PTR_ERR(ctx->addr);
 	}
 
-	if (ctx->out_type & IFTYPE_I80) {
-		ret = decon_conf_irq(ctx, "lcd_sys", decon_irq_handler, 0, true);
-		if (ret < 0)
-			return ret;
-		ctx->irq = ret;
+	ret = decon_conf_irq(ctx, "vsync", decon_irq_handler, 0);
+	if (ret < 0)
+		return ret;
+	ctx->irq_vsync = ret;
 
-		ret = decon_conf_irq(ctx, "te", decon_te_irq_handler,
-				     IRQF_TRIGGER_RISING, false);
-		if (ret < 0)
+	ret = decon_conf_irq(ctx, "lcd_sys", decon_irq_handler, 0);
+	if (ret < 0)
+		return ret;
+	ctx->irq_lcd_sys = ret;
+
+	ret = decon_conf_irq(ctx, "te", decon_te_irq_handler,
+			IRQF_TRIGGER_RISING);
+	if (ret < 0)
 			return ret;
-		if (ret) {
-			ctx->te_irq = ret;
-			ctx->out_type &= ~I80_HW_TRG;
-		}
-	} else {
-		ret = decon_conf_irq(ctx, "vsync", decon_irq_handler, 0, true);
-		if (ret < 0)
-			return ret;
-		ctx->irq = ret;
+	if (ret) {
+		ctx->te_irq = ret;
+		ctx->out_type &= ~I80_HW_TRG;
 	}
 
 	if (ctx->out_type & I80_HW_TRG) {
diff --git a/drivers/gpu/drm/exynos/exynos7_drm_decon.c b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
index 3e88269..615efcf 100644
--- a/drivers/gpu/drm/exynos/exynos7_drm_decon.c
+++ b/drivers/gpu/drm/exynos/exynos7_drm_decon.c
@@ -309,19 +309,14 @@ static void decon_win_set_pixfmt(struct decon_context *ctx, unsigned int win,
 		val |= WINCONx_BURSTLEN_16WORD;
 		break;
 	case DRM_FORMAT_BGRA8888:
+	default:
 		val |= WINCONx_BPPMODE_32BPP_BGRA | WINCONx_BLD_PIX |
 			WINCONx_ALPHA_SEL;
 		val |= WINCONx_BURSTLEN_16WORD;
 		break;
-	default:
-		DRM_DEBUG_KMS("invalid pixel size so using unpacked 24bpp.\n");
-
-		val |= WINCONx_BPPMODE_24BPP_xRGB;
-		val |= WINCONx_BURSTLEN_16WORD;
-		break;
 	}
 
-	DRM_DEBUG_KMS("bpp = %d\n", fb->format->cpp[0] * 8);
+	DRM_DEBUG_KMS("cpp = %d\n", fb->format->cpp[0]);
 
 	/*
 	 * In case of exynos, setting dma-burst to 16Word causes permanent
@@ -398,7 +393,7 @@ static void decon_update_plane(struct exynos_drm_crtc *crtc,
 	unsigned int last_x;
 	unsigned int last_y;
 	unsigned int win = plane->index;
-	unsigned int bpp = fb->format->cpp[0];
+	unsigned int cpp = fb->format->cpp[0];
 	unsigned int pitch = fb->pitches[0];
 
 	if (ctx->suspended)
@@ -418,7 +413,7 @@ static void decon_update_plane(struct exynos_drm_crtc *crtc,
 	val = (unsigned long)exynos_drm_fb_dma_addr(fb, 0);
 	writel(val, ctx->regs + VIDW_BUF_START(win));
 
-	padding = (pitch / bpp) - fb->width;
+	padding = (pitch / cpp) - fb->width;
 
 	/* buffer size */
 	writel(fb->width + padding, ctx->regs + VIDW_WHOLE_X(win));
diff --git a/drivers/gpu/drm/exynos/exynos_dp.c b/drivers/gpu/drm/exynos/exynos_dp.c
index 385537b..39629e7 100644
--- a/drivers/gpu/drm/exynos/exynos_dp.c
+++ b/drivers/gpu/drm/exynos/exynos_dp.c
@@ -155,7 +155,7 @@ static int exynos_dp_bind(struct device *dev, struct device *master, void *data)
 	struct exynos_dp_device *dp = dev_get_drvdata(dev);
 	struct drm_encoder *encoder = &dp->encoder;
 	struct drm_device *drm_dev = data;
-	int pipe, ret;
+	int ret;
 
 	/*
 	 * Just like the probe function said, we don't need the
@@ -179,20 +179,15 @@ static int exynos_dp_bind(struct device *dev, struct device *master, void *data)
 			return ret;
 	}
 
-	pipe = exynos_drm_crtc_get_pipe_from_type(drm_dev,
-						  EXYNOS_DISPLAY_TYPE_LCD);
-	if (pipe < 0)
-		return pipe;
-
-	encoder->possible_crtcs = 1 << pipe;
-
-	DRM_DEBUG_KMS("possible_crtcs = 0x%x\n", encoder->possible_crtcs);
-
 	drm_encoder_init(drm_dev, encoder, &exynos_dp_encoder_funcs,
 			 DRM_MODE_ENCODER_TMDS, NULL);
 
 	drm_encoder_helper_add(encoder, &exynos_dp_encoder_helper_funcs);
 
+	ret = exynos_drm_set_possible_crtcs(encoder, EXYNOS_DISPLAY_TYPE_LCD);
+	if (ret < 0)
+		return ret;
+
 	dp->plat_data.encoder = encoder;
 
 	return analogix_dp_bind(dev, dp->drm_dev, &dp->plat_data);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_core.c b/drivers/gpu/drm/exynos/exynos_drm_core.c
index edbd98f..b0c0621 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_core.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_core.c
@@ -13,6 +13,7 @@
  */
 
 #include <drm/drmP.h>
+
 #include "exynos_drm_drv.h"
 #include "exynos_drm_crtc.h"
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_crtc.c b/drivers/gpu/drm/exynos/exynos_drm_crtc.c
index d72777f..6ce0821 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_crtc.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_crtc.c
@@ -16,12 +16,14 @@
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_atomic.h>
 #include <drm/drm_atomic_helper.h>
+#include <drm/drm_encoder.h>
 
 #include "exynos_drm_crtc.h"
 #include "exynos_drm_drv.h"
 #include "exynos_drm_plane.h"
 
-static void exynos_drm_crtc_enable(struct drm_crtc *crtc)
+static void exynos_drm_crtc_atomic_enable(struct drm_crtc *crtc,
+					  struct drm_crtc_state *old_state)
 {
 	struct exynos_drm_crtc *exynos_crtc = to_exynos_crtc(crtc);
 
@@ -31,7 +33,8 @@ static void exynos_drm_crtc_enable(struct drm_crtc *crtc)
 	drm_crtc_vblank_on(crtc);
 }
 
-static void exynos_drm_crtc_disable(struct drm_crtc *crtc)
+static void exynos_drm_crtc_atomic_disable(struct drm_crtc *crtc,
+					   struct drm_crtc_state *old_state)
 {
 	struct exynos_drm_crtc *exynos_crtc = to_exynos_crtc(crtc);
 
@@ -81,12 +84,24 @@ static void exynos_crtc_atomic_flush(struct drm_crtc *crtc,
 		exynos_crtc->ops->atomic_flush(exynos_crtc);
 }
 
+static enum drm_mode_status exynos_crtc_mode_valid(struct drm_crtc *crtc,
+	const struct drm_display_mode *mode)
+{
+	struct exynos_drm_crtc *exynos_crtc = to_exynos_crtc(crtc);
+
+	if (exynos_crtc->ops->mode_valid)
+		return exynos_crtc->ops->mode_valid(exynos_crtc, mode);
+
+	return MODE_OK;
+}
+
 static const struct drm_crtc_helper_funcs exynos_crtc_helper_funcs = {
-	.enable		= exynos_drm_crtc_enable,
-	.disable	= exynos_drm_crtc_disable,
+	.mode_valid	= exynos_crtc_mode_valid,
 	.atomic_check	= exynos_crtc_atomic_check,
 	.atomic_begin	= exynos_crtc_atomic_begin,
 	.atomic_flush	= exynos_crtc_atomic_flush,
+	.atomic_enable	= exynos_drm_crtc_atomic_enable,
+	.atomic_disable	= exynos_drm_crtc_atomic_disable,
 };
 
 void exynos_crtc_handle_event(struct exynos_drm_crtc *exynos_crtc)
@@ -189,16 +204,30 @@ struct exynos_drm_crtc *exynos_drm_crtc_create(struct drm_device *drm_dev,
 	return ERR_PTR(ret);
 }
 
-int exynos_drm_crtc_get_pipe_from_type(struct drm_device *drm_dev,
+struct exynos_drm_crtc *exynos_drm_crtc_get_by_type(struct drm_device *drm_dev,
 				       enum exynos_drm_output_type out_type)
 {
 	struct drm_crtc *crtc;
 
 	drm_for_each_crtc(crtc, drm_dev)
 		if (to_exynos_crtc(crtc)->type == out_type)
-			return drm_crtc_index(crtc);
+			return to_exynos_crtc(crtc);
 
-	return -EPERM;
+	return ERR_PTR(-EPERM);
+}
+
+int exynos_drm_set_possible_crtcs(struct drm_encoder *encoder,
+		enum exynos_drm_output_type out_type)
+{
+	struct exynos_drm_crtc *crtc = exynos_drm_crtc_get_by_type(encoder->dev,
+						out_type);
+
+	if (IS_ERR(crtc))
+		return PTR_ERR(crtc);
+
+	encoder->possible_crtcs = drm_crtc_mask(&crtc->base);
+
+	return 0;
 }
 
 void exynos_drm_crtc_te_handler(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_crtc.h b/drivers/gpu/drm/exynos/exynos_drm_crtc.h
index ef58b64..dec4461 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_crtc.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_crtc.h
@@ -15,21 +15,25 @@
 #ifndef _EXYNOS_DRM_CRTC_H_
 #define _EXYNOS_DRM_CRTC_H_
 
+
 #include "exynos_drm_drv.h"
 
 struct exynos_drm_crtc *exynos_drm_crtc_create(struct drm_device *drm_dev,
 					struct drm_plane *plane,
-					enum exynos_drm_output_type type,
+					enum exynos_drm_output_type out_type,
 					const struct exynos_drm_crtc_ops *ops,
 					void *context);
 void exynos_drm_crtc_wait_pending_update(struct exynos_drm_crtc *exynos_crtc);
 void exynos_drm_crtc_finish_update(struct exynos_drm_crtc *exynos_crtc,
 				   struct exynos_drm_plane *exynos_plane);
 
-/* This function gets pipe value to crtc device matched with out_type. */
-int exynos_drm_crtc_get_pipe_from_type(struct drm_device *drm_dev,
+/* This function gets crtc device matched with out_type. */
+struct exynos_drm_crtc *exynos_drm_crtc_get_by_type(struct drm_device *drm_dev,
 				       enum exynos_drm_output_type out_type);
 
+int exynos_drm_set_possible_crtcs(struct drm_encoder *encoder,
+		enum exynos_drm_output_type out_type);
+
 /*
  * This function calls the crtc device(manager)'s te_handler() callback
  * to trigger to transfer video image at the tearing effect synchronization
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dpi.c b/drivers/gpu/drm/exynos/exynos_drm_dpi.c
index 63abcd2..66945e0 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dpi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dpi.c
@@ -59,7 +59,6 @@ static void exynos_dpi_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs exynos_dpi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = exynos_dpi_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = exynos_dpi_connector_destroy,
@@ -203,19 +202,15 @@ int exynos_dpi_bind(struct drm_device *dev, struct drm_encoder *encoder)
 {
 	int ret;
 
-	ret = exynos_drm_crtc_get_pipe_from_type(dev, EXYNOS_DISPLAY_TYPE_LCD);
-	if (ret < 0)
-		return ret;
-
-	encoder->possible_crtcs = 1 << ret;
-
-	DRM_DEBUG_KMS("possible_crtcs = 0x%x\n", encoder->possible_crtcs);
-
 	drm_encoder_init(dev, encoder, &exynos_dpi_encoder_funcs,
 			 DRM_MODE_ENCODER_TMDS, NULL);
 
 	drm_encoder_helper_add(encoder, &exynos_dpi_encoder_helper_funcs);
 
+	ret = exynos_drm_set_possible_crtcs(encoder, EXYNOS_DISPLAY_TYPE_LCD);
+	if (ret < 0)
+		return ret;
+
 	ret = exynos_dpi_create_connector(encoder);
 	if (ret) {
 		DRM_ERROR("failed to create connector ret = %d\n", ret);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.c b/drivers/gpu/drm/exynos/exynos_drm_drv.c
index 242bd50..b1f7299 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.c
@@ -145,8 +145,6 @@ static struct drm_driver exynos_drm_driver = {
 	.gem_free_object_unlocked = exynos_drm_gem_free_object,
 	.gem_vm_ops		= &exynos_drm_gem_vm_ops,
 	.dumb_create		= exynos_drm_gem_dumb_create,
-	.dumb_map_offset	= exynos_drm_gem_dumb_map_offset,
-	.dumb_destroy		= drm_gem_dumb_destroy,
 	.prime_handle_to_fd	= drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle	= drm_gem_prime_fd_to_handle,
 	.gem_prime_export	= drm_gem_prime_export,
@@ -395,8 +393,9 @@ static int exynos_drm_bind(struct device *dev)
 	/* init kms poll for handling hpd */
 	drm_kms_helper_poll_init(drm);
 
-	/* force connectors detection */
-	drm_helper_hpd_irq_event(drm);
+	ret = exynos_drm_fbdev_init(drm);
+	if (ret)
+		goto err_cleanup_poll;
 
 	/* register the DRM device */
 	ret = drm_dev_register(drm, 0);
@@ -407,6 +406,7 @@ static int exynos_drm_bind(struct device *dev)
 
 err_cleanup_fbdev:
 	exynos_drm_fbdev_fini(drm);
+err_cleanup_poll:
 	drm_kms_helper_poll_fini(drm);
 	exynos_drm_device_subdrv_remove(drm);
 err_unbind_all:
diff --git a/drivers/gpu/drm/exynos/exynos_drm_drv.h b/drivers/gpu/drm/exynos/exynos_drm_drv.h
index a93de32..cf131c2 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_drv.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_drv.h
@@ -91,6 +91,7 @@ struct exynos_drm_plane {
 #define EXYNOS_DRM_PLANE_CAP_DOUBLE	(1 << 0)
 #define EXYNOS_DRM_PLANE_CAP_SCALE	(1 << 1)
 #define EXYNOS_DRM_PLANE_CAP_ZPOS	(1 << 2)
+#define EXYNOS_DRM_PLANE_CAP_TILE	(1 << 3)
 
 /*
  * Exynos DRM plane configuration structure.
@@ -117,6 +118,7 @@ struct exynos_drm_plane_config {
  * @disable: disable the device
  * @enable_vblank: specific driver callback for enabling vblank interrupt.
  * @disable_vblank: specific driver callback for disabling vblank interrupt.
+ * @mode_valid: specific driver callback for mode validation
  * @atomic_check: validate state
  * @atomic_begin: prepare device to receive an update
  * @atomic_flush: mark the end of device update
@@ -132,6 +134,8 @@ struct exynos_drm_crtc_ops {
 	int (*enable_vblank)(struct exynos_drm_crtc *crtc);
 	void (*disable_vblank)(struct exynos_drm_crtc *crtc);
 	u32 (*get_vblank_counter)(struct exynos_drm_crtc *crtc);
+	enum drm_mode_status (*mode_valid)(struct exynos_drm_crtc *crtc,
+		const struct drm_display_mode *mode);
 	int (*atomic_check)(struct exynos_drm_crtc *crtc,
 			    struct drm_crtc_state *state);
 	void (*atomic_begin)(struct exynos_drm_crtc *crtc);
@@ -162,6 +166,7 @@ struct exynos_drm_crtc {
 	const struct exynos_drm_crtc_ops	*ops;
 	void				*ctx;
 	struct exynos_drm_clk		*pipe_clk;
+	bool				i80_mode : 1;
 };
 
 static inline void exynos_drm_pipe_clk_enable(struct exynos_drm_crtc *crtc,
diff --git a/drivers/gpu/drm/exynos/exynos_drm_dsi.c b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
index b6a46d9..7904ffa 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_dsi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_dsi.c
@@ -254,7 +254,6 @@ struct exynos_dsi {
 	struct drm_encoder encoder;
 	struct mipi_dsi_host dsi_host;
 	struct drm_connector connector;
-	struct device_node *panel_node;
 	struct drm_panel *panel;
 	struct device *dev;
 
@@ -1329,12 +1328,13 @@ static int exynos_dsi_init(struct exynos_dsi *dsi)
 	return 0;
 }
 
-static int exynos_dsi_register_te_irq(struct exynos_dsi *dsi)
+static int exynos_dsi_register_te_irq(struct exynos_dsi *dsi,
+				      struct device *panel)
 {
 	int ret;
 	int te_gpio_irq;
 
-	dsi->te_gpio = of_get_named_gpio(dsi->panel_node, "te-gpios", 0);
+	dsi->te_gpio = of_get_named_gpio(panel->of_node, "te-gpios", 0);
 	if (dsi->te_gpio == -ENOENT)
 		return 0;
 
@@ -1374,85 +1374,6 @@ static void exynos_dsi_unregister_te_irq(struct exynos_dsi *dsi)
 	}
 }
 
-static int exynos_dsi_host_attach(struct mipi_dsi_host *host,
-				  struct mipi_dsi_device *device)
-{
-	struct exynos_dsi *dsi = host_to_dsi(host);
-
-	dsi->lanes = device->lanes;
-	dsi->format = device->format;
-	dsi->mode_flags = device->mode_flags;
-	dsi->panel_node = device->dev.of_node;
-
-	/*
-	 * This is a temporary solution and should be made by more generic way.
-	 *
-	 * If attached panel device is for command mode one, dsi should register
-	 * TE interrupt handler.
-	 */
-	if (!(dsi->mode_flags & MIPI_DSI_MODE_VIDEO)) {
-		int ret = exynos_dsi_register_te_irq(dsi);
-
-		if (ret)
-			return ret;
-	}
-
-	if (dsi->connector.dev)
-		drm_helper_hpd_irq_event(dsi->connector.dev);
-
-	return 0;
-}
-
-static int exynos_dsi_host_detach(struct mipi_dsi_host *host,
-				  struct mipi_dsi_device *device)
-{
-	struct exynos_dsi *dsi = host_to_dsi(host);
-
-	exynos_dsi_unregister_te_irq(dsi);
-
-	dsi->panel_node = NULL;
-
-	if (dsi->connector.dev)
-		drm_helper_hpd_irq_event(dsi->connector.dev);
-
-	return 0;
-}
-
-static ssize_t exynos_dsi_host_transfer(struct mipi_dsi_host *host,
-				        const struct mipi_dsi_msg *msg)
-{
-	struct exynos_dsi *dsi = host_to_dsi(host);
-	struct exynos_dsi_transfer xfer;
-	int ret;
-
-	if (!(dsi->state & DSIM_STATE_ENABLED))
-		return -EINVAL;
-
-	if (!(dsi->state & DSIM_STATE_INITIALIZED)) {
-		ret = exynos_dsi_init(dsi);
-		if (ret)
-			return ret;
-		dsi->state |= DSIM_STATE_INITIALIZED;
-	}
-
-	ret = mipi_dsi_create_packet(&xfer.packet, msg);
-	if (ret < 0)
-		return ret;
-
-	xfer.rx_len = msg->rx_len;
-	xfer.rx_payload = msg->rx_buf;
-	xfer.flags = msg->flags;
-
-	ret = exynos_dsi_transfer(dsi, &xfer);
-	return (ret < 0) ? ret : xfer.rx_done;
-}
-
-static const struct mipi_dsi_host_ops exynos_dsi_ops = {
-	.attach = exynos_dsi_host_attach,
-	.detach = exynos_dsi_host_detach,
-	.transfer = exynos_dsi_host_transfer,
-};
-
 static void exynos_dsi_enable(struct drm_encoder *encoder)
 {
 	struct exynos_dsi *dsi = encoder_to_dsi(encoder);
@@ -1508,25 +1429,7 @@ static void exynos_dsi_disable(struct drm_encoder *encoder)
 static enum drm_connector_status
 exynos_dsi_detect(struct drm_connector *connector, bool force)
 {
-	struct exynos_dsi *dsi = connector_to_dsi(connector);
-
-	if (!dsi->panel) {
-		dsi->panel = of_drm_find_panel(dsi->panel_node);
-		if (dsi->panel)
-			drm_panel_attach(dsi->panel, &dsi->connector);
-	} else if (!dsi->panel_node) {
-		struct drm_encoder *encoder;
-
-		encoder = platform_get_drvdata(to_platform_device(dsi->dev));
-		exynos_dsi_disable(encoder);
-		drm_panel_detach(dsi->panel);
-		dsi->panel = NULL;
-	}
-
-	if (dsi->panel)
-		return connector_status_connected;
-
-	return connector_status_disconnected;
+	return connector->status;
 }
 
 static void exynos_dsi_connector_destroy(struct drm_connector *connector)
@@ -1537,7 +1440,6 @@ static void exynos_dsi_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs exynos_dsi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = exynos_dsi_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = exynos_dsi_connector_destroy,
@@ -1576,6 +1478,7 @@ static int exynos_dsi_create_connector(struct drm_encoder *encoder)
 		return ret;
 	}
 
+	connector->status = connector_status_disconnected;
 	drm_connector_helper_add(connector, &exynos_dsi_connector_helper_funcs);
 	drm_mode_connector_attach_encoder(connector, encoder);
 
@@ -1612,14 +1515,112 @@ static const struct drm_encoder_funcs exynos_dsi_encoder_funcs = {
 
 MODULE_DEVICE_TABLE(of, exynos_dsi_of_match);
 
+static int exynos_dsi_host_attach(struct mipi_dsi_host *host,
+				  struct mipi_dsi_device *device)
+{
+	struct exynos_dsi *dsi = host_to_dsi(host);
+	struct drm_device *drm = dsi->connector.dev;
+
+	/*
+	 * This is a temporary solution and should be made by more generic way.
+	 *
+	 * If attached panel device is for command mode one, dsi should register
+	 * TE interrupt handler.
+	 */
+	if (!(device->mode_flags & MIPI_DSI_MODE_VIDEO)) {
+		int ret = exynos_dsi_register_te_irq(dsi, &device->dev);
+		if (ret)
+			return ret;
+	}
+
+	mutex_lock(&drm->mode_config.mutex);
+
+	dsi->lanes = device->lanes;
+	dsi->format = device->format;
+	dsi->mode_flags = device->mode_flags;
+	dsi->panel = of_drm_find_panel(device->dev.of_node);
+	if (dsi->panel) {
+		drm_panel_attach(dsi->panel, &dsi->connector);
+		dsi->connector.status = connector_status_connected;
+	}
+	exynos_drm_crtc_get_by_type(drm, EXYNOS_DISPLAY_TYPE_LCD)->i80_mode =
+			!(dsi->mode_flags & MIPI_DSI_MODE_VIDEO);
+
+	mutex_unlock(&drm->mode_config.mutex);
+
+	if (drm->mode_config.poll_enabled)
+		drm_kms_helper_hotplug_event(drm);
+
+	return 0;
+}
+
+static int exynos_dsi_host_detach(struct mipi_dsi_host *host,
+				  struct mipi_dsi_device *device)
+{
+	struct exynos_dsi *dsi = host_to_dsi(host);
+	struct drm_device *drm = dsi->connector.dev;
+
+	mutex_lock(&drm->mode_config.mutex);
+
+	if (dsi->panel) {
+		exynos_dsi_disable(&dsi->encoder);
+		drm_panel_detach(dsi->panel);
+		dsi->panel = NULL;
+		dsi->connector.status = connector_status_disconnected;
+	}
+
+	mutex_unlock(&drm->mode_config.mutex);
+
+	if (drm->mode_config.poll_enabled)
+		drm_kms_helper_hotplug_event(drm);
+
+	exynos_dsi_unregister_te_irq(dsi);
+
+	return 0;
+}
+
+static ssize_t exynos_dsi_host_transfer(struct mipi_dsi_host *host,
+					 const struct mipi_dsi_msg *msg)
+{
+	struct exynos_dsi *dsi = host_to_dsi(host);
+	struct exynos_dsi_transfer xfer;
+	int ret;
+
+	if (!(dsi->state & DSIM_STATE_ENABLED))
+		return -EINVAL;
+
+	if (!(dsi->state & DSIM_STATE_INITIALIZED)) {
+		ret = exynos_dsi_init(dsi);
+		if (ret)
+			return ret;
+		dsi->state |= DSIM_STATE_INITIALIZED;
+	}
+
+	ret = mipi_dsi_create_packet(&xfer.packet, msg);
+	if (ret < 0)
+		return ret;
+
+	xfer.rx_len = msg->rx_len;
+	xfer.rx_payload = msg->rx_buf;
+	xfer.flags = msg->flags;
+
+	ret = exynos_dsi_transfer(dsi, &xfer);
+	return (ret < 0) ? ret : xfer.rx_done;
+}
+
+static const struct mipi_dsi_host_ops exynos_dsi_ops = {
+	.attach = exynos_dsi_host_attach,
+	.detach = exynos_dsi_host_detach,
+	.transfer = exynos_dsi_host_transfer,
+};
+
 static int exynos_dsi_of_read_u32(const struct device_node *np,
 				  const char *propname, u32 *out_value)
 {
 	int ret = of_property_read_u32(np, propname, out_value);
 
 	if (ret < 0)
-		pr_err("%s: failed to get '%s' property\n", np->full_name,
-		       propname);
+		pr_err("%pOF: failed to get '%s' property\n", np, propname);
 
 	return ret;
 }
@@ -1664,20 +1665,15 @@ static int exynos_dsi_bind(struct device *dev, struct device *master,
 	struct drm_bridge *bridge;
 	int ret;
 
-	ret = exynos_drm_crtc_get_pipe_from_type(drm_dev,
-						  EXYNOS_DISPLAY_TYPE_LCD);
-	if (ret < 0)
-		return ret;
-
-	encoder->possible_crtcs = 1 << ret;
-
-	DRM_DEBUG_KMS("possible_crtcs = 0x%x\n", encoder->possible_crtcs);
-
 	drm_encoder_init(drm_dev, encoder, &exynos_dsi_encoder_funcs,
 			 DRM_MODE_ENCODER_TMDS, NULL);
 
 	drm_encoder_helper_add(encoder, &exynos_dsi_encoder_helper_funcs);
 
+	ret = exynos_drm_set_possible_crtcs(encoder, EXYNOS_DISPLAY_TYPE_LCD);
+	if (ret < 0)
+		return ret;
+
 	ret = exynos_dsi_create_connector(encoder);
 	if (ret) {
 		DRM_ERROR("failed to create connector ret = %d\n", ret);
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fb.c b/drivers/gpu/drm/exynos/exynos_drm_fb.c
index 73217c28..8208df5 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fb.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fb.c
@@ -199,33 +199,8 @@ dma_addr_t exynos_drm_fb_dma_addr(struct drm_framebuffer *fb, int index)
 	return exynos_fb->dma_addr[index];
 }
 
-static void exynos_drm_atomic_commit_tail(struct drm_atomic_state *state)
-{
-	struct drm_device *dev = state->dev;
-
-	drm_atomic_helper_commit_modeset_disables(dev, state);
-
-	drm_atomic_helper_commit_modeset_enables(dev, state);
-
-	/*
-	 * Exynos can't update planes with CRTCs and encoders disabled,
-	 * its updates routines, specially for FIMD, requires the clocks
-	 * to be enabled. So it is necessary to handle the modeset operations
-	 * *before* the commit_planes() step, this way it will always
-	 * have the relevant clocks enabled to perform the update.
-	 */
-	drm_atomic_helper_commit_planes(dev, state,
-					DRM_PLANE_COMMIT_ACTIVE_ONLY);
-
-	drm_atomic_helper_commit_hw_done(state);
-
-	drm_atomic_helper_wait_for_vblanks(dev, state);
-
-	drm_atomic_helper_cleanup_planes(dev, state);
-}
-
 static struct drm_mode_config_helper_funcs exynos_drm_mode_config_helpers = {
-	.atomic_commit_tail = exynos_drm_atomic_commit_tail,
+	.atomic_commit_tail = drm_atomic_helper_commit_tail_rpm,
 };
 
 static const struct drm_mode_config_funcs exynos_drm_mode_config_funcs = {
@@ -250,4 +225,6 @@ void exynos_drm_mode_config_init(struct drm_device *dev)
 
 	dev->mode_config.funcs = &exynos_drm_mode_config_funcs;
 	dev->mode_config.helper_private = &exynos_drm_mode_config_helpers;
+
+	dev->mode_config.allow_fb_modifiers = true;
 }
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
index 6415312..c3a0684 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fbdev.c
@@ -183,24 +183,6 @@ static const struct drm_fb_helper_funcs exynos_drm_fb_helper_funcs = {
 	.fb_probe =	exynos_drm_fbdev_create,
 };
 
-static bool exynos_drm_fbdev_is_anything_connected(struct drm_device *dev)
-{
-	struct drm_connector *connector;
-	bool ret = false;
-
-	mutex_lock(&dev->mode_config.mutex);
-	list_for_each_entry(connector, &dev->mode_config.connector_list, head) {
-		if (connector->status != connector_status_connected)
-			continue;
-
-		ret = true;
-		break;
-	}
-	mutex_unlock(&dev->mode_config.mutex);
-
-	return ret;
-}
-
 int exynos_drm_fbdev_init(struct drm_device *dev)
 {
 	struct exynos_drm_fbdev *fbdev;
@@ -211,9 +193,6 @@ int exynos_drm_fbdev_init(struct drm_device *dev)
 	if (!dev->mode_config.num_crtc || !dev->mode_config.num_connector)
 		return 0;
 
-	if (!exynos_drm_fbdev_is_anything_connected(dev))
-		return 0;
-
 	fbdev = kzalloc(sizeof(*fbdev), GFP_KERNEL);
 	if (!fbdev)
 		return -ENOMEM;
@@ -304,8 +283,5 @@ void exynos_drm_output_poll_changed(struct drm_device *dev)
 	struct exynos_drm_private *private = dev->dev_private;
 	struct drm_fb_helper *fb_helper = private->fb_helper;
 
-	if (fb_helper)
-		drm_fb_helper_hotplug_event(fb_helper);
-	else
-		exynos_drm_fbdev_init(dev);
+	drm_fb_helper_hotplug_event(fb_helper);
 }
diff --git a/drivers/gpu/drm/exynos/exynos_drm_fimd.c b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
index 60f93ca..d42ae2b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_fimd.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_fimd.c
@@ -583,18 +583,12 @@ static void fimd_win_set_pixfmt(struct fimd_context *ctx, unsigned int win,
 		val |= WINCONx_BURSTLEN_16WORD;
 		break;
 	case DRM_FORMAT_ARGB8888:
+	default:
 		val |= WINCON1_BPPMODE_25BPP_A1888
 			| WINCON1_BLD_PIX | WINCON1_ALPHA_SEL;
 		val |= WINCONx_WSWP;
 		val |= WINCONx_BURSTLEN_16WORD;
 		break;
-	default:
-		DRM_DEBUG_KMS("invalid pixel size so using unpacked 24bpp.\n");
-
-		val |= WINCON0_BPPMODE_24BPP_888;
-		val |= WINCONx_WSWP;
-		val |= WINCONx_BURSTLEN_16WORD;
-		break;
 	}
 
 	/*
@@ -718,13 +712,13 @@ static void fimd_update_plane(struct exynos_drm_crtc *crtc,
 	unsigned long val, size, offset;
 	unsigned int last_x, last_y, buf_offsize, line_size;
 	unsigned int win = plane->index;
-	unsigned int bpp = fb->format->cpp[0];
+	unsigned int cpp = fb->format->cpp[0];
 	unsigned int pitch = fb->pitches[0];
 
 	if (ctx->suspended)
 		return;
 
-	offset = state->src.x * bpp;
+	offset = state->src.x * cpp;
 	offset += state->src.y * pitch;
 
 	/* buffer start address */
@@ -743,8 +737,8 @@ static void fimd_update_plane(struct exynos_drm_crtc *crtc,
 			state->crtc.w, state->crtc.h);
 
 	/* buffer size */
-	buf_offsize = pitch - (state->crtc.w * bpp);
-	line_size = state->crtc.w * bpp;
+	buf_offsize = pitch - (state->crtc.w * cpp);
+	line_size = state->crtc.w * cpp;
 	val = VIDW_BUF_SIZE_OFFSET(buf_offsize) |
 		VIDW_BUF_SIZE_PAGEWIDTH(line_size) |
 		VIDW_BUF_SIZE_OFFSET_E(buf_offsize) |
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/drivers/gpu/drm/exynos/exynos_drm_gem.c
index c23479b..077de01 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.c
@@ -286,8 +286,8 @@ int exynos_drm_gem_map_ioctl(struct drm_device *dev, void *data,
 {
 	struct drm_exynos_gem_map *args = data;
 
-	return exynos_drm_gem_dumb_map_offset(file_priv, dev, args->handle,
-					      &args->offset);
+	return drm_gem_dumb_map_offset(file_priv, dev, args->handle,
+				       &args->offset);
 }
 
 dma_addr_t *exynos_drm_gem_get_dma_addr(struct drm_device *dev,
@@ -422,32 +422,6 @@ int exynos_drm_gem_dumb_create(struct drm_file *file_priv,
 	return 0;
 }
 
-int exynos_drm_gem_dumb_map_offset(struct drm_file *file_priv,
-				   struct drm_device *dev, uint32_t handle,
-				   uint64_t *offset)
-{
-	struct drm_gem_object *obj;
-	int ret = 0;
-
-	/*
-	 * get offset of memory allocated for drm framebuffer.
-	 * - this callback would be called by user application
-	 *	with DRM_IOCTL_MODE_MAP_DUMB command.
-	 */
-
-	obj = drm_gem_object_lookup(file_priv, handle);
-	if (!obj) {
-		DRM_ERROR("failed to lookup gem object.\n");
-		return -EINVAL;
-	}
-
-	*offset = drm_vma_node_offset_addr(&obj->vma_node);
-	DRM_DEBUG_KMS("offset = 0x%lx\n", (unsigned long)*offset);
-
-	drm_gem_object_unreference_unlocked(obj);
-	return ret;
-}
-
 int exynos_drm_gem_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.h b/drivers/gpu/drm/exynos/exynos_drm_gem.h
index 8545725..e86d1a9 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_gem.h
+++ b/drivers/gpu/drm/exynos/exynos_drm_gem.h
@@ -110,11 +110,6 @@ int exynos_drm_gem_dumb_create(struct drm_file *file_priv,
 			       struct drm_device *dev,
 			       struct drm_mode_create_dumb *args);
 
-/* map memory region for drm framebuffer to user space. */
-int exynos_drm_gem_dumb_map_offset(struct drm_file *file_priv,
-				   struct drm_device *dev, uint32_t handle,
-				   uint64_t *offset);
-
 /* page fault handler and mmap fault address(virtual) to physical memory. */
 int exynos_drm_gem_fault(struct vm_fault *vmf);
 
diff --git a/drivers/gpu/drm/exynos/exynos_drm_mic.c b/drivers/gpu/drm/exynos/exynos_drm_mic.c
index 16bbee8..ba4a32b 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_mic.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_mic.c
@@ -21,9 +21,12 @@
 #include <linux/component.h>
 #include <linux/pm_runtime.h>
 #include <drm/drmP.h>
+#include <drm/drm_encoder.h>
 #include <linux/mfd/syscon.h>
 #include <linux/regmap.h>
 
+#include "exynos_drm_drv.h"
+
 /* Sysreg registers for MIC */
 #define DSD_CFG_MUX	0x1004
 #define MIC0_RGB_MUX	(1 << 0)
@@ -85,12 +88,6 @@
 
 #define MIC_BS_SIZE_2D(x)	((x) & 0x3fff)
 
-enum {
-	ENDPOINT_DECON_NODE,
-	ENDPOINT_DSI_NODE,
-	NUM_ENDPOINTS
-};
-
 static char *clk_names[] = { "pclk_mic0", "sclk_rgb_vclk_to_mic0" };
 #define NUM_CLKS		ARRAY_SIZE(clk_names)
 static DEFINE_MUTEX(mic_mutex);
@@ -229,36 +226,6 @@ static void mic_set_reg_on(struct exynos_mic *mic, bool enable)
 	writel(reg, mic->reg + MIC_OP);
 }
 
-static int parse_dt(struct exynos_mic *mic)
-{
-	int ret = 0, i, j;
-	struct device_node *remote_node;
-	struct device_node *nodes[3];
-
-	/*
-	 * The order of endpoints does matter.
-	 * The first node must be for decon and the second one must be for dsi.
-	 */
-	for (i = 0, j = 0; i < NUM_ENDPOINTS; i++) {
-		remote_node = of_graph_get_remote_node(mic->dev->of_node, i, 0);
-		if (!remote_node) {
-			ret = -EPIPE;
-			goto exit;
-		}
-		nodes[j++] = remote_node;
-
-		if (i == ENDPOINT_DECON_NODE &&
-			of_get_child_by_name(remote_node, "i80-if-timings"))
-			mic->i80_mode = 1;
-	}
-
-exit:
-	while (--j > -1)
-		of_node_put(nodes[j]);
-
-	return ret;
-}
-
 static void mic_disable(struct drm_bridge *bridge) { }
 
 static void mic_post_disable(struct drm_bridge *bridge)
@@ -286,6 +253,7 @@ static void mic_mode_set(struct drm_bridge *bridge,
 
 	mutex_lock(&mic_mutex);
 	drm_display_mode_to_videomode(mode, &mic->vm);
+	mic->i80_mode = to_exynos_crtc(bridge->encoder->crtc)->i80_mode;
 	mutex_unlock(&mic_mutex);
 }
 
@@ -417,10 +385,6 @@ static int exynos_mic_probe(struct platform_device *pdev)
 
 	mic->dev = dev;
 
-	ret = parse_dt(mic);
-	if (ret)
-		goto err;
-
 	ret = of_address_to_resource(dev->of_node, 0, &res);
 	if (ret) {
 		DRM_ERROR("mic: Failed to get mem region for MIC\n");
diff --git a/drivers/gpu/drm/exynos/exynos_drm_plane.c b/drivers/gpu/drm/exynos/exynos_drm_plane.c
index 611b6fd..d2a90da 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_plane.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_plane.c
@@ -173,13 +173,35 @@ static struct drm_plane_funcs exynos_plane_funcs = {
 	.update_plane	= drm_atomic_helper_update_plane,
 	.disable_plane	= drm_atomic_helper_disable_plane,
 	.destroy	= drm_plane_cleanup,
-	.set_property	= drm_atomic_helper_plane_set_property,
 	.reset		= exynos_drm_plane_reset,
 	.atomic_duplicate_state = exynos_drm_plane_duplicate_state,
 	.atomic_destroy_state = exynos_drm_plane_destroy_state,
 };
 
 static int
+exynos_drm_plane_check_format(const struct exynos_drm_plane_config *config,
+			      struct exynos_drm_plane_state *state)
+{
+	struct drm_framebuffer *fb = state->base.fb;
+
+	switch (fb->modifier) {
+	case DRM_FORMAT_MOD_SAMSUNG_64_32_TILE:
+		if (!(config->capabilities & EXYNOS_DRM_PLANE_CAP_TILE))
+			return -ENOTSUPP;
+		break;
+
+	case DRM_FORMAT_MOD_LINEAR:
+		break;
+
+	default:
+		DRM_ERROR("unsupported pixel format modifier");
+		return -ENOTSUPP;
+	}
+
+	return 0;
+}
+
+static int
 exynos_drm_plane_check_size(const struct exynos_drm_plane_config *config,
 			    struct exynos_drm_plane_state *state)
 {
@@ -223,6 +245,10 @@ static int exynos_plane_atomic_check(struct drm_plane *plane,
 	/* translate state into exynos_state */
 	exynos_plane_mode_set(exynos_state);
 
+	ret = exynos_drm_plane_check_format(exynos_plane->config, exynos_state);
+	if (ret)
+		return ret;
+
 	ret = exynos_drm_plane_check_size(exynos_plane->config, exynos_state);
 	return ret;
 }
@@ -283,7 +309,7 @@ int exynos_plane_init(struct drm_device *dev,
 				       &exynos_plane_funcs,
 				       config->pixel_formats,
 				       config->num_pixel_formats,
-				       config->type, NULL);
+				       NULL, config->type, NULL);
 	if (err) {
 		DRM_ERROR("failed to initialize plane\n");
 		return err;
diff --git a/drivers/gpu/drm/exynos/exynos_drm_vidi.c b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
index cb8a728..53e03f8 100644
--- a/drivers/gpu/drm/exynos/exynos_drm_vidi.c
+++ b/drivers/gpu/drm/exynos/exynos_drm_vidi.c
@@ -289,7 +289,6 @@ static void vidi_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs vidi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = vidi_detect,
 	.destroy = vidi_connector_destroy,
@@ -382,7 +381,7 @@ static int vidi_bind(struct device *dev, struct device *master, void *data)
 	struct exynos_drm_plane *exynos_plane;
 	struct exynos_drm_plane_config plane_config = { 0 };
 	unsigned int i;
-	int pipe, ret;
+	int ret;
 
 	ctx->drm_dev = drm_dev;
 
@@ -407,20 +406,15 @@ static int vidi_bind(struct device *dev, struct device *master, void *data)
 		return PTR_ERR(ctx->crtc);
 	}
 
-	pipe = exynos_drm_crtc_get_pipe_from_type(drm_dev,
-						  EXYNOS_DISPLAY_TYPE_VIDI);
-	if (pipe < 0)
-		return pipe;
-
-	encoder->possible_crtcs = 1 << pipe;
-
-	DRM_DEBUG_KMS("possible_crtcs = 0x%x\n", encoder->possible_crtcs);
-
 	drm_encoder_init(drm_dev, encoder, &exynos_vidi_encoder_funcs,
 			 DRM_MODE_ENCODER_TMDS, NULL);
 
 	drm_encoder_helper_add(encoder, &exynos_vidi_encoder_helper_funcs);
 
+	ret = exynos_drm_set_possible_crtcs(encoder, EXYNOS_DISPLAY_TYPE_VIDI);
+	if (ret < 0)
+		return ret;
+
 	ret = vidi_create_connector(encoder);
 	if (ret) {
 		DRM_ERROR("failed to create connector ret = %d\n", ret);
diff --git a/drivers/gpu/drm/exynos/exynos_hdmi.c b/drivers/gpu/drm/exynos/exynos_hdmi.c
index d3b69d6..214fa5e 100644
--- a/drivers/gpu/drm/exynos/exynos_hdmi.c
+++ b/drivers/gpu/drm/exynos/exynos_hdmi.c
@@ -784,7 +784,7 @@ static void hdmi_reg_infoframes(struct hdmi_context *hdata)
 	}
 
 	ret = drm_hdmi_avi_infoframe_from_display_mode(&frm.avi,
-			&hdata->current_mode);
+			&hdata->current_mode, false);
 	if (!ret)
 		ret = hdmi_avi_infoframe_pack(&frm.avi, buf, sizeof(buf));
 	if (ret > 0) {
@@ -835,7 +835,6 @@ static void hdmi_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = hdmi_detect,
 	.destroy = hdmi_connector_destroy,
@@ -1698,32 +1697,25 @@ static int hdmi_bind(struct device *dev, struct device *master, void *data)
 	struct drm_device *drm_dev = data;
 	struct hdmi_context *hdata = dev_get_drvdata(dev);
 	struct drm_encoder *encoder = &hdata->encoder;
-	struct exynos_drm_crtc *exynos_crtc;
-	struct drm_crtc *crtc;
-	int ret, pipe;
+	struct exynos_drm_crtc *crtc;
+	int ret;
 
 	hdata->drm_dev = drm_dev;
 
-	pipe = exynos_drm_crtc_get_pipe_from_type(drm_dev,
-						  EXYNOS_DISPLAY_TYPE_HDMI);
-	if (pipe < 0)
-		return pipe;
-
 	hdata->phy_clk.enable = hdmiphy_clk_enable;
 
-	crtc = drm_crtc_from_index(drm_dev, pipe);
-	exynos_crtc = to_exynos_crtc(crtc);
-	exynos_crtc->pipe_clk = &hdata->phy_clk;
-
-	encoder->possible_crtcs = 1 << pipe;
-
-	DRM_DEBUG_KMS("possible_crtcs = 0x%x\n", encoder->possible_crtcs);
-
 	drm_encoder_init(drm_dev, encoder, &exynos_hdmi_encoder_funcs,
 			 DRM_MODE_ENCODER_TMDS, NULL);
 
 	drm_encoder_helper_add(encoder, &exynos_hdmi_encoder_helper_funcs);
 
+	ret = exynos_drm_set_possible_crtcs(encoder, EXYNOS_DISPLAY_TYPE_HDMI);
+	if (ret < 0)
+		return ret;
+
+	crtc = exynos_drm_crtc_get_by_type(drm_dev, EXYNOS_DISPLAY_TYPE_HDMI);
+	crtc->pipe_clk = &hdata->phy_clk;
+
 	ret = hdmi_create_connector(encoder);
 	if (ret) {
 		DRM_ERROR("failed to create connector ret = %d\n", ret);
diff --git a/drivers/gpu/drm/exynos/exynos_mixer.c b/drivers/gpu/drm/exynos/exynos_mixer.c
index a998a8d..0027554 100644
--- a/drivers/gpu/drm/exynos/exynos_mixer.c
+++ b/drivers/gpu/drm/exynos/exynos_mixer.c
@@ -148,7 +148,8 @@ static const struct exynos_drm_plane_config plane_configs[MIXER_WIN_NR] = {
 		.pixel_formats = vp_formats,
 		.num_pixel_formats = ARRAY_SIZE(vp_formats),
 		.capabilities = EXYNOS_DRM_PLANE_CAP_SCALE |
-				EXYNOS_DRM_PLANE_CAP_ZPOS,
+				EXYNOS_DRM_PLANE_CAP_ZPOS |
+				EXYNOS_DRM_PLANE_CAP_TILE,
 	},
 };
 
@@ -483,29 +484,18 @@ static void vp_video_buffer(struct mixer_context *ctx,
 	unsigned int priority = state->base.normalized_zpos + 1;
 	unsigned long flags;
 	dma_addr_t luma_addr[2], chroma_addr[2];
-	bool tiled_mode = false;
-	bool crcb_mode = false;
+	bool is_tiled, is_nv21;
 	u32 val;
 
-	switch (fb->format->format) {
-	case DRM_FORMAT_NV12:
-		crcb_mode = false;
-		break;
-	case DRM_FORMAT_NV21:
-		crcb_mode = true;
-		break;
-	default:
-		DRM_ERROR("pixel format for vp is wrong [%d].\n",
-				fb->format->format);
-		return;
-	}
+	is_nv21 = (fb->format->format == DRM_FORMAT_NV21);
+	is_tiled = (fb->modifier == DRM_FORMAT_MOD_SAMSUNG_64_32_TILE);
 
 	luma_addr[0] = exynos_drm_fb_dma_addr(fb, 0);
 	chroma_addr[0] = exynos_drm_fb_dma_addr(fb, 1);
 
 	if (mode->flags & DRM_MODE_FLAG_INTERLACE) {
 		__set_bit(MXR_BIT_INTERLACE, &ctx->flags);
-		if (tiled_mode) {
+		if (is_tiled) {
 			luma_addr[1] = luma_addr[0] + 0x40;
 			chroma_addr[1] = chroma_addr[0] + 0x40;
 		} else {
@@ -525,14 +515,14 @@ static void vp_video_buffer(struct mixer_context *ctx,
 	vp_reg_writemask(res, VP_MODE, val, VP_MODE_LINE_SKIP);
 
 	/* setup format */
-	val = (crcb_mode ? VP_MODE_NV21 : VP_MODE_NV12);
-	val |= (tiled_mode ? VP_MODE_MEM_TILED : VP_MODE_MEM_LINEAR);
+	val = (is_nv21 ? VP_MODE_NV21 : VP_MODE_NV12);
+	val |= (is_tiled ? VP_MODE_MEM_TILED : VP_MODE_MEM_LINEAR);
 	vp_reg_writemask(res, VP_MODE, val, VP_MODE_FMT_MASK);
 
 	/* setting size of input image */
 	vp_reg_write(res, VP_IMG_SIZE_Y, VP_IMG_HSIZE(fb->pitches[0]) |
 		VP_IMG_VSIZE(fb->height));
-	/* chroma height has to reduced by 2 to avoid chroma distorions */
+	/* chroma plane for NV12/NV21 is half the height of the luma plane */
 	vp_reg_write(res, VP_IMG_SIZE_C, VP_IMG_HSIZE(fb->pitches[0]) |
 		VP_IMG_VSIZE(fb->height / 2));
 
@@ -594,7 +584,7 @@ static void mixer_graph_buffer(struct mixer_context *ctx,
 	unsigned long flags;
 	unsigned int win = plane->index;
 	unsigned int x_ratio = 0, y_ratio = 0;
-	unsigned int src_x_offset, src_y_offset, dst_x_offset, dst_y_offset;
+	unsigned int dst_x_offset, dst_y_offset;
 	dma_addr_t dma_addr;
 	unsigned int fmt;
 	u32 val;
@@ -616,12 +606,9 @@ static void mixer_graph_buffer(struct mixer_context *ctx,
 
 	case DRM_FORMAT_XRGB8888:
 	case DRM_FORMAT_ARGB8888:
+	default:
 		fmt = MXR_FORMAT_ARGB8888;
 		break;
-
-	default:
-		DRM_DEBUG_KMS("pixelformat unsupported by mixer\n");
-		return;
 	}
 
 	/* ratio is already checked by common plane code */
@@ -631,12 +618,10 @@ static void mixer_graph_buffer(struct mixer_context *ctx,
 	dst_x_offset = state->crtc.x;
 	dst_y_offset = state->crtc.y;
 
-	/* converting dma address base and source offset */
+	/* translate dma address base so that the source image offset is zero */
 	dma_addr = exynos_drm_fb_dma_addr(fb, 0)
 		+ (state->src.x * fb->format->cpp[0])
 		+ (state->src.y * fb->pitches[0]);
-	src_x_offset = 0;
-	src_y_offset = 0;
 
 	if (mode->flags & DRM_MODE_FLAG_INTERLACE)
 		__set_bit(MXR_BIT_INTERLACE, &ctx->flags);
@@ -667,11 +652,6 @@ static void mixer_graph_buffer(struct mixer_context *ctx,
 	val |= MXR_GRP_WH_V_SCALE(y_ratio);
 	mixer_reg_write(res, MXR_GRAPHIC_WH(win), val);
 
-	/* setup offsets in source image */
-	val  = MXR_GRP_SXY_SX(src_x_offset);
-	val |= MXR_GRP_SXY_SY(src_y_offset);
-	mixer_reg_write(res, MXR_GRAPHIC_SXY(win), val);
-
 	/* setup offsets in display image */
 	val  = MXR_GRP_DXY_DX(dst_x_offset);
 	val |= MXR_GRP_DXY_DY(dst_y_offset);
@@ -748,6 +728,10 @@ static void mixer_win_reset(struct mixer_context *ctx)
 	if (test_bit(MXR_BIT_VP_ENABLED, &ctx->flags))
 		mixer_reg_writemask(res, MXR_CFG, 0, MXR_CFG_VP_ENABLE);
 
+	/* set all source image offsets to zero */
+	mixer_reg_write(res, MXR_GRAPHIC_SXY(0), 0);
+	mixer_reg_write(res, MXR_GRAPHIC_SXY(1), 0);
+
 	spin_unlock_irqrestore(&res->reg_slock, flags);
 }
 
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c
index cc4e944..0e37524 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_crtc.c
@@ -63,7 +63,8 @@ static void fsl_dcu_drm_crtc_atomic_disable(struct drm_crtc *crtc,
 	clk_disable_unprepare(fsl_dev->pix_clk);
 }
 
-static void fsl_dcu_drm_crtc_enable(struct drm_crtc *crtc)
+static void fsl_dcu_drm_crtc_atomic_enable(struct drm_crtc *crtc,
+					   struct drm_crtc_state *old_state)
 {
 	struct drm_device *dev = crtc->dev;
 	struct fsl_dcu_drm_device *fsl_dev = dev->dev_private;
@@ -133,7 +134,7 @@ static void fsl_dcu_drm_crtc_mode_set_nofb(struct drm_crtc *crtc)
 static const struct drm_crtc_helper_funcs fsl_dcu_drm_crtc_helper_funcs = {
 	.atomic_disable = fsl_dcu_drm_crtc_atomic_disable,
 	.atomic_flush = fsl_dcu_drm_crtc_atomic_flush,
-	.enable = fsl_dcu_drm_crtc_enable,
+	.atomic_enable = fsl_dcu_drm_crtc_atomic_enable,
 	.mode_set_nofb = fsl_dcu_drm_crtc_mode_set_nofb,
 };
 
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c
index 5cbde19..58e9e06 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_drv.c
@@ -176,8 +176,6 @@ static struct drm_driver fsl_dcu_drm_driver = {
 	.gem_prime_vunmap	= drm_gem_cma_prime_vunmap,
 	.gem_prime_mmap		= drm_gem_cma_prime_mmap,
 	.dumb_create		= drm_gem_cma_dumb_create,
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset,
-	.dumb_destroy		= drm_gem_dumb_destroy,
 	.fops			= &fsl_dcu_drm_fops,
 	.name			= "fsl-dcu-drm",
 	.desc			= "Freescale DCU DRM",
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c
index 0a20723..9554b24 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_plane.c
@@ -224,7 +224,7 @@ struct drm_plane *fsl_dcu_drm_primary_create_plane(struct drm_device *dev)
 				       &fsl_dcu_drm_plane_funcs,
 				       fsl_dcu_drm_plane_formats,
 				       ARRAY_SIZE(fsl_dcu_drm_plane_formats),
-				       DRM_PLANE_TYPE_PRIMARY, NULL);
+				       NULL, DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		kfree(primary);
 		primary = NULL;
diff --git a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c
index dcbf3c0..edd7d81 100644
--- a/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c
+++ b/drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c
@@ -63,7 +63,6 @@ static const struct drm_connector_funcs fsl_dcu_drm_connector_funcs = {
 	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 	.destroy = fsl_dcu_drm_connector_destroy,
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.reset = drm_atomic_helper_connector_reset,
 };
diff --git a/drivers/gpu/drm/gma500/framebuffer.c b/drivers/gpu/drm/gma500/framebuffer.c
index 7da70b6..2570c7f 100644
--- a/drivers/gpu/drm/gma500/framebuffer.c
+++ b/drivers/gpu/drm/gma500/framebuffer.c
@@ -479,26 +479,6 @@ static struct drm_framebuffer *psb_user_framebuffer_create
 	return psb_framebuffer_create(dev, cmd, r);
 }
 
-static void psbfb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-							u16 blue, int regno)
-{
-	struct gma_crtc *gma_crtc = to_gma_crtc(crtc);
-
-	gma_crtc->lut_r[regno] = red >> 8;
-	gma_crtc->lut_g[regno] = green >> 8;
-	gma_crtc->lut_b[regno] = blue >> 8;
-}
-
-static void psbfb_gamma_get(struct drm_crtc *crtc, u16 *red,
-					u16 *green, u16 *blue, int regno)
-{
-	struct gma_crtc *gma_crtc = to_gma_crtc(crtc);
-
-	*red = gma_crtc->lut_r[regno] << 8;
-	*green = gma_crtc->lut_g[regno] << 8;
-	*blue = gma_crtc->lut_b[regno] << 8;
-}
-
 static int psbfb_probe(struct drm_fb_helper *helper,
 				struct drm_fb_helper_surface_size *sizes)
 {
@@ -525,8 +505,6 @@ static int psbfb_probe(struct drm_fb_helper *helper,
 }
 
 static const struct drm_fb_helper_funcs psb_fb_helper_funcs = {
-	.gamma_set = psbfb_gamma_set,
-	.gamma_get = psbfb_gamma_get,
 	.fb_probe = psbfb_probe,
 };
 
diff --git a/drivers/gpu/drm/gma500/gem.c b/drivers/gpu/drm/gma500/gem.c
index 7da061a..1312397 100644
--- a/drivers/gpu/drm/gma500/gem.c
+++ b/drivers/gpu/drm/gma500/gem.c
@@ -48,36 +48,6 @@ int psb_gem_get_aperture(struct drm_device *dev, void *data,
 }
 
 /**
- *	psb_gem_dumb_map_gtt	-	buffer mapping for dumb interface
- *	@file: our drm client file
- *	@dev: drm device
- *	@handle: GEM handle to the object (from dumb_create)
- *
- *	Do the necessary setup to allow the mapping of the frame buffer
- *	into user memory. We don't have to do much here at the moment.
- */
-int psb_gem_dumb_map_gtt(struct drm_file *file, struct drm_device *dev,
-			 uint32_t handle, uint64_t *offset)
-{
-	int ret = 0;
-	struct drm_gem_object *obj;
-
-	/* GEM does all our handle to object mapping */
-	obj = drm_gem_object_lookup(file, handle);
-	if (obj == NULL)
-		return -ENOENT;
-
-	/* Make it mmapable */
-	ret = drm_gem_create_mmap_offset(obj);
-	if (ret)
-		goto out;
-	*offset = drm_vma_node_offset_addr(&obj->vma_node);
-out:
-	drm_gem_object_unreference_unlocked(obj);
-	return ret;
-}
-
-/**
  *	psb_gem_create		-	create a mappable object
  *	@file: the DRM file of the client
  *	@dev: our device
diff --git a/drivers/gpu/drm/gma500/gma_display.c b/drivers/gpu/drm/gma500/gma_display.c
index e7fd356..f3c48a2 100644
--- a/drivers/gpu/drm/gma500/gma_display.c
+++ b/drivers/gpu/drm/gma500/gma_display.c
@@ -144,33 +144,32 @@ void gma_crtc_load_lut(struct drm_crtc *crtc)
 	struct gma_crtc *gma_crtc = to_gma_crtc(crtc);
 	const struct psb_offset *map = &dev_priv->regmap[gma_crtc->pipe];
 	int palreg = map->palette;
+	u16 *r, *g, *b;
 	int i;
 
 	/* The clocks have to be on to load the palette. */
 	if (!crtc->enabled)
 		return;
 
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
+
 	if (gma_power_begin(dev, false)) {
 		for (i = 0; i < 256; i++) {
 			REG_WRITE(palreg + 4 * i,
-				  ((gma_crtc->lut_r[i] +
-				  gma_crtc->lut_adj[i]) << 16) |
-				  ((gma_crtc->lut_g[i] +
-				  gma_crtc->lut_adj[i]) << 8) |
-				  (gma_crtc->lut_b[i] +
-				  gma_crtc->lut_adj[i]));
+				  (((*r++ >> 8) + gma_crtc->lut_adj[i]) << 16) |
+				  (((*g++ >> 8) + gma_crtc->lut_adj[i]) << 8) |
+				  ((*b++ >> 8) + gma_crtc->lut_adj[i]));
 		}
 		gma_power_end(dev);
 	} else {
 		for (i = 0; i < 256; i++) {
 			/* FIXME: Why pipe[0] and not pipe[..._crtc->pipe]? */
 			dev_priv->regs.pipe[0].palette[i] =
-				  ((gma_crtc->lut_r[i] +
-				  gma_crtc->lut_adj[i]) << 16) |
-				  ((gma_crtc->lut_g[i] +
-				  gma_crtc->lut_adj[i]) << 8) |
-				  (gma_crtc->lut_b[i] +
-				  gma_crtc->lut_adj[i]);
+				(((*r++ >> 8) + gma_crtc->lut_adj[i]) << 16) |
+				(((*g++ >> 8) + gma_crtc->lut_adj[i]) << 8) |
+				((*b++ >> 8) + gma_crtc->lut_adj[i]);
 		}
 
 	}
@@ -180,15 +179,6 @@ int gma_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green, u16 *blue,
 		       u32 size,
 		       struct drm_modeset_acquire_ctx *ctx)
 {
-	struct gma_crtc *gma_crtc = to_gma_crtc(crtc);
-	int i;
-
-	for (i = 0; i < size; i++) {
-		gma_crtc->lut_r[i] = red[i] >> 8;
-		gma_crtc->lut_g[i] = green[i] >> 8;
-		gma_crtc->lut_b[i] = blue[i] >> 8;
-	}
-
 	gma_crtc_load_lut(crtc);
 
 	return 0;
diff --git a/drivers/gpu/drm/gma500/mdfld_dsi_pkg_sender.c b/drivers/gpu/drm/gma500/mdfld_dsi_pkg_sender.c
index 1616af2..c50534c 100644
--- a/drivers/gpu/drm/gma500/mdfld_dsi_pkg_sender.c
+++ b/drivers/gpu/drm/gma500/mdfld_dsi_pkg_sender.c
@@ -520,7 +520,7 @@ static int __read_panel_data(struct mdfld_dsi_pkg_sender *sender, u8 data_type,
 			u8 *data, u16 len, u32 *data_out, u16 len_out, bool hs)
 {
 	unsigned long flags;
-	struct drm_device *dev = sender->dev;
+	struct drm_device *dev;
 	int i;
 	u32 gen_data_reg;
 	int retry = MDFLD_DSI_READ_MAX_COUNT;
@@ -530,6 +530,8 @@ static int __read_panel_data(struct mdfld_dsi_pkg_sender *sender, u8 data_type,
 		return -EINVAL;
 	}
 
+	dev = sender->dev;
+
 	/**
 	 * do reading.
 	 * 0) send out generic read request
diff --git a/drivers/gpu/drm/gma500/mdfld_intel_display.c b/drivers/gpu/drm/gma500/mdfld_intel_display.c
index 63c6e086..531e4450 100644
--- a/drivers/gpu/drm/gma500/mdfld_intel_display.c
+++ b/drivers/gpu/drm/gma500/mdfld_intel_display.c
@@ -737,11 +737,7 @@ static int mdfld_crtc_mode_set(struct drm_crtc *crtc,
 					sizeof(struct drm_display_mode));
 
 	list_for_each_entry(connector, &mode_config->connector_list, head) {
-		if (!connector)
-			continue;
-
 		encoder = connector->encoder;
-
 		if (!encoder)
 			continue;
 
diff --git a/drivers/gpu/drm/gma500/psb_drv.c b/drivers/gpu/drm/gma500/psb_drv.c
index 1f9b35a..37a3be7 100644
--- a/drivers/gpu/drm/gma500/psb_drv.c
+++ b/drivers/gpu/drm/gma500/psb_drv.c
@@ -480,7 +480,6 @@ static struct drm_driver driver = {
 	.load = psb_driver_load,
 	.unload = psb_driver_unload,
 	.lastclose = psb_driver_lastclose,
-	.set_busid = drm_pci_set_busid,
 
 	.num_ioctls = ARRAY_SIZE(psb_ioctls),
 	.irq_preinstall = psb_irq_preinstall,
@@ -495,8 +494,6 @@ static struct drm_driver driver = {
 	.gem_vm_ops = &psb_gem_vm_ops,
 
 	.dumb_create = psb_gem_dumb_create,
-	.dumb_map_offset = psb_gem_dumb_map_gtt,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.ioctls = psb_ioctls,
 	.fops = &psb_gem_fops,
 	.name = DRIVER_NAME,
@@ -517,12 +514,12 @@ static struct pci_driver psb_pci_driver = {
 
 static int __init psb_init(void)
 {
-	return drm_pci_init(&driver, &psb_pci_driver);
+	return pci_register_driver(&psb_pci_driver);
 }
 
 static void __exit psb_exit(void)
 {
-	drm_pci_exit(&driver, &psb_pci_driver);
+	pci_unregister_driver(&psb_pci_driver);
 }
 
 late_initcall(psb_init);
diff --git a/drivers/gpu/drm/gma500/psb_drv.h b/drivers/gpu/drm/gma500/psb_drv.h
index 8366708..821497d 100644
--- a/drivers/gpu/drm/gma500/psb_drv.h
+++ b/drivers/gpu/drm/gma500/psb_drv.h
@@ -750,8 +750,6 @@ extern int psb_gem_get_aperture(struct drm_device *dev, void *data,
 			struct drm_file *file);
 extern int psb_gem_dumb_create(struct drm_file *file, struct drm_device *dev,
 			struct drm_mode_create_dumb *args);
-extern int psb_gem_dumb_map_gtt(struct drm_file *file, struct drm_device *dev,
-			uint32_t handle, uint64_t *offset);
 extern int psb_gem_fault(struct vm_fault *vmf);
 
 /* psb_device.c */
diff --git a/drivers/gpu/drm/gma500/psb_intel_display.c b/drivers/gpu/drm/gma500/psb_intel_display.c
index 7b6c849..8762efa 100644
--- a/drivers/gpu/drm/gma500/psb_intel_display.c
+++ b/drivers/gpu/drm/gma500/psb_intel_display.c
@@ -518,13 +518,8 @@ void psb_intel_crtc_init(struct drm_device *dev, int pipe,
 	gma_crtc->pipe = pipe;
 	gma_crtc->plane = pipe;
 
-	for (i = 0; i < 256; i++) {
-		gma_crtc->lut_r[i] = i;
-		gma_crtc->lut_g[i] = i;
-		gma_crtc->lut_b[i] = i;
-
+	for (i = 0; i < 256; i++)
 		gma_crtc->lut_adj[i] = 0;
-	}
 
 	gma_crtc->mode_dev = mode_dev;
 	gma_crtc->cursor_addr = 0;
diff --git a/drivers/gpu/drm/gma500/psb_intel_drv.h b/drivers/gpu/drm/gma500/psb_intel_drv.h
index 6a10215..e8e4ea1 100644
--- a/drivers/gpu/drm/gma500/psb_intel_drv.h
+++ b/drivers/gpu/drm/gma500/psb_intel_drv.h
@@ -172,7 +172,6 @@ struct gma_crtc {
 	int plane;
 	uint32_t cursor_addr;
 	struct gtt_range *cursor_gt;
-	u8 lut_r[256], lut_g[256], lut_b[256];
 	u8 lut_adj[256];
 	struct psb_intel_framebuffer *fbdev_fb;
 	/* a mode_set for fbdev users on this crtc */
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
index 59542bd..a956545 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_de.c
@@ -150,7 +150,6 @@ static const u32 channel_formats1[] = {
 static struct drm_plane_funcs hibmc_plane_funcs = {
 	.update_plane	= drm_atomic_helper_update_plane,
 	.disable_plane	= drm_atomic_helper_disable_plane,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.destroy = drm_plane_cleanup,
 	.reset = drm_atomic_helper_plane_reset,
 	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
@@ -181,6 +180,7 @@ static struct drm_plane *hibmc_plane_init(struct hibmc_drm_private *priv)
 	ret = drm_universal_plane_init(dev, plane, 1, &hibmc_plane_funcs,
 				       channel_formats1,
 				       ARRAY_SIZE(channel_formats1),
+				       NULL,
 				       DRM_PLANE_TYPE_PRIMARY,
 				       NULL);
 	if (ret) {
@@ -192,7 +192,8 @@ static struct drm_plane *hibmc_plane_init(struct hibmc_drm_private *priv)
 	return plane;
 }
 
-static void hibmc_crtc_enable(struct drm_crtc *crtc)
+static void hibmc_crtc_atomic_enable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	unsigned int reg;
 	struct hibmc_drm_private *priv = crtc->dev->dev_private;
@@ -209,7 +210,8 @@ static void hibmc_crtc_enable(struct drm_crtc *crtc)
 	drm_crtc_vblank_on(crtc);
 }
 
-static void hibmc_crtc_disable(struct drm_crtc *crtc)
+static void hibmc_crtc_atomic_disable(struct drm_crtc *crtc,
+				      struct drm_crtc_state *old_state)
 {
 	unsigned int reg;
 	struct hibmc_drm_private *priv = crtc->dev->dev_private;
@@ -453,11 +455,11 @@ static const struct drm_crtc_funcs hibmc_crtc_funcs = {
 };
 
 static const struct drm_crtc_helper_funcs hibmc_crtc_helper_funcs = {
-	.enable		= hibmc_crtc_enable,
-	.disable	= hibmc_crtc_disable,
 	.mode_set_nofb	= hibmc_crtc_mode_set_nofb,
 	.atomic_begin	= hibmc_crtc_atomic_begin,
 	.atomic_flush	= hibmc_crtc_atomic_flush,
+	.atomic_enable	= hibmc_crtc_atomic_enable,
+	.atomic_disable	= hibmc_crtc_atomic_disable,
 };
 
 int hibmc_de_init(struct hibmc_drm_private *priv)
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
index 2ffdbf9..d4f6f1f 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_drv.c
@@ -67,7 +67,6 @@ static struct drm_driver hibmc_driver = {
 	.gem_free_object_unlocked = hibmc_gem_free_object,
 	.dumb_create            = hibmc_dumb_create,
 	.dumb_map_offset        = hibmc_dumb_mmap_offset,
-	.dumb_destroy           = drm_gem_dumb_destroy,
 	.irq_handler		= hibmc_drm_interrupt,
 };
 
@@ -276,11 +275,12 @@ static int hibmc_unload(struct drm_device *dev)
 
 	hibmc_fbdev_fini(priv);
 
+	drm_atomic_helper_shutdown(dev);
+
 	if (dev->irq_enabled)
 		drm_irq_uninstall(dev);
 	if (priv->msi_enabled)
 		pci_disable_msi(dev->pdev);
-	drm_vblank_cleanup(dev);
 
 	hibmc_kms_fini(priv);
 	hibmc_mm_fini(priv);
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
index f5ac80d..b92595c 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_fbdev.c
@@ -131,7 +131,6 @@ static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
 
 	strcpy(info->fix.id, "hibmcdrmfb");
 
-	info->flags = FBINFO_DEFAULT;
 	info->fbops = &hibmc_drm_fb_ops;
 
 	drm_fb_helper_fill_fix(info, hi_fbdev->fb->fb.pitches[0],
@@ -158,7 +157,7 @@ static int hibmc_drm_fb_create(struct drm_fb_helper *helper,
 out_unreserve_ttm_bo:
 	ttm_bo_unreserve(&bo->bo);
 out_unref_gem:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 
 	return ret;
 }
@@ -173,7 +172,7 @@ static void hibmc_fbdev_destroy(struct hibmc_fbdev *fbdev)
 	drm_fb_helper_fini(fbh);
 
 	if (gfb)
-		drm_framebuffer_unreference(&gfb->fb);
+		drm_framebuffer_put(&gfb->fb);
 }
 
 static const struct drm_fb_helper_funcs hibmc_fbdev_helper_funcs = {
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
index 12a1855..ec4dd9d 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_drm_vdac.c
@@ -47,7 +47,6 @@ static const struct drm_connector_helper_funcs
 };
 
 static const struct drm_connector_funcs hibmc_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
index ac457c7..3518167 100644
--- a/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
+++ b/drivers/gpu/drm/hisilicon/hibmc/hibmc_ttm.c
@@ -444,7 +444,7 @@ int hibmc_dumb_create(struct drm_file *file, struct drm_device *dev,
 	}
 
 	ret = drm_gem_handle_create(file, gobj, &handle);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (ret) {
 		DRM_ERROR("failed to unreference GEM object: %d\n", ret);
 		return ret;
@@ -479,7 +479,7 @@ int hibmc_dumb_mmap_offset(struct drm_file *file, struct drm_device *dev,
 	bo = gem_to_hibmc_bo(obj);
 	*offset = hibmc_bo_mmap_offset(bo);
 
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 	return 0;
 }
 
@@ -487,7 +487,7 @@ static void hibmc_user_framebuffer_destroy(struct drm_framebuffer *fb)
 {
 	struct hibmc_framebuffer *hibmc_fb = to_hibmc_framebuffer(fb);
 
-	drm_gem_object_unreference_unlocked(hibmc_fb->obj);
+	drm_gem_object_put_unlocked(hibmc_fb->obj);
 	drm_framebuffer_cleanup(fb);
 	kfree(hibmc_fb);
 }
@@ -543,7 +543,7 @@ hibmc_user_framebuffer_create(struct drm_device *dev,
 
 	hibmc_fb = hibmc_framebuffer_init(dev, mode_cmd, obj);
 	if (IS_ERR(hibmc_fb)) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR((long)hibmc_fb);
 	}
 	return &hibmc_fb->fb;
diff --git a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
index f77dcfa..b4c7af3 100644
--- a/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
+++ b/drivers/gpu/drm/hisilicon/kirin/dw_drm_dsi.c
@@ -603,6 +603,72 @@ static void dsi_encoder_enable(struct drm_encoder *encoder)
 	dsi->enable = true;
 }
 
+static enum drm_mode_status dsi_encoder_phy_mode_valid(
+					struct drm_encoder *encoder,
+					const struct drm_display_mode *mode)
+{
+	struct dw_dsi *dsi = encoder_to_dsi(encoder);
+	struct mipi_phy_params phy;
+	u32 bpp = mipi_dsi_pixel_format_to_bpp(dsi->format);
+	u32 req_kHz, act_kHz, lane_byte_clk_kHz;
+
+	/* Calculate the lane byte clk using the adjusted mode clk */
+	memset(&phy, 0, sizeof(phy));
+	req_kHz = mode->clock * bpp / dsi->lanes;
+	act_kHz = dsi_calc_phy_rate(req_kHz, &phy);
+	lane_byte_clk_kHz = act_kHz / 8;
+
+	DRM_DEBUG_DRIVER("Checking mode %ix%i-%i@%i clock: %i...",
+			mode->hdisplay, mode->vdisplay, bpp,
+			drm_mode_vrefresh(mode), mode->clock);
+
+	/*
+	 * Make sure the adjusted mode clock and the lane byte clk
+	 * have a common denominator base frequency
+	 */
+	if (mode->clock/dsi->lanes == lane_byte_clk_kHz/3) {
+		DRM_DEBUG_DRIVER("OK!\n");
+		return MODE_OK;
+	}
+
+	DRM_DEBUG_DRIVER("BAD!\n");
+	return MODE_BAD;
+}
+
+static enum drm_mode_status dsi_encoder_mode_valid(struct drm_encoder *encoder,
+					const struct drm_display_mode *mode)
+
+{
+	const struct drm_crtc_helper_funcs *crtc_funcs = NULL;
+	struct drm_crtc *crtc = NULL;
+	struct drm_display_mode adj_mode;
+	enum drm_mode_status ret;
+
+	/*
+	 * The crtc might adjust the mode, so go through the
+	 * possible crtcs (technically just one) and call
+	 * mode_fixup to figure out the adjusted mode before we
+	 * validate it.
+	 */
+	drm_for_each_crtc(crtc, encoder->dev) {
+		/*
+		 * reset adj_mode to the mode value each time,
+		 * so we don't adjust the mode twice
+		 */
+		drm_mode_copy(&adj_mode, mode);
+
+		crtc_funcs = crtc->helper_private;
+		if (crtc_funcs && crtc_funcs->mode_fixup)
+			if (!crtc_funcs->mode_fixup(crtc, mode, &adj_mode))
+				return MODE_BAD;
+
+		ret = dsi_encoder_phy_mode_valid(encoder, &adj_mode);
+		if (ret != MODE_OK)
+			return ret;
+	}
+	return MODE_OK;
+}
+
 static void dsi_encoder_mode_set(struct drm_encoder *encoder,
 				 struct drm_display_mode *mode,
 				 struct drm_display_mode *adj_mode)
@@ -622,6 +688,7 @@ static int dsi_encoder_atomic_check(struct drm_encoder *encoder,
 
 static const struct drm_encoder_helper_funcs dw_encoder_helper_funcs = {
 	.atomic_check	= dsi_encoder_atomic_check,
+	.mode_valid	= dsi_encoder_mode_valid,
 	.mode_set	= dsi_encoder_mode_set,
 	.enable		= dsi_encoder_enable,
 	.disable	= dsi_encoder_disable
diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
index c96c228..9823477 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_ade.c
@@ -178,6 +178,19 @@ static void ade_init(struct ade_hw_ctx *ctx)
 			FRM_END_START_MASK, REG_EFFECTIVE_IN_ADEEN_FRMEND);
 }
 
+static bool ade_crtc_mode_fixup(struct drm_crtc *crtc,
+				const struct drm_display_mode *mode,
+				struct drm_display_mode *adjusted_mode)
+{
+	struct ade_crtc *acrtc = to_ade_crtc(crtc);
+	struct ade_hw_ctx *ctx = acrtc->ctx;
+
+	adjusted_mode->clock =
+		clk_round_rate(ctx->ade_pix_clk, mode->clock * 1000) / 1000;
+	return true;
+}
+
+
 static void ade_set_pix_clk(struct ade_hw_ctx *ctx,
 			    struct drm_display_mode *mode,
 			    struct drm_display_mode *adj_mode)
@@ -467,7 +480,8 @@ static void ade_dump_regs(void __iomem *base)
 static void ade_dump_regs(void __iomem *base) { }
 #endif
 
-static void ade_crtc_enable(struct drm_crtc *crtc)
+static void ade_crtc_atomic_enable(struct drm_crtc *crtc,
+				   struct drm_crtc_state *old_state)
 {
 	struct ade_crtc *acrtc = to_ade_crtc(crtc);
 	struct ade_hw_ctx *ctx = acrtc->ctx;
@@ -489,7 +503,8 @@ static void ade_crtc_enable(struct drm_crtc *crtc)
 	acrtc->enable = true;
 }
 
-static void ade_crtc_disable(struct drm_crtc *crtc)
+static void ade_crtc_atomic_disable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct ade_crtc *acrtc = to_ade_crtc(crtc);
 	struct ade_hw_ctx *ctx = acrtc->ctx;
@@ -553,11 +568,12 @@ static void ade_crtc_atomic_flush(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs ade_crtc_helper_funcs = {
-	.enable		= ade_crtc_enable,
-	.disable	= ade_crtc_disable,
+	.mode_fixup	= ade_crtc_mode_fixup,
 	.mode_set_nofb	= ade_crtc_mode_set_nofb,
 	.atomic_begin	= ade_crtc_atomic_begin,
 	.atomic_flush	= ade_crtc_atomic_flush,
+	.atomic_enable	= ade_crtc_atomic_enable,
+	.atomic_disable	= ade_crtc_atomic_disable,
 };
 
 static const struct drm_crtc_funcs ade_crtc_funcs = {
@@ -565,7 +581,6 @@ static const struct drm_crtc_funcs ade_crtc_funcs = {
 	.set_config	= drm_atomic_helper_set_config,
 	.page_flip	= drm_atomic_helper_page_flip,
 	.reset		= drm_atomic_helper_crtc_reset,
-	.set_property = drm_atomic_helper_crtc_set_property,
 	.atomic_duplicate_state	= drm_atomic_helper_crtc_duplicate_state,
 	.atomic_destroy_state	= drm_atomic_helper_crtc_destroy_state,
 	.enable_vblank	= ade_crtc_enable_vblank,
@@ -583,8 +598,7 @@ static int ade_crtc_init(struct drm_device *dev, struct drm_crtc *crtc,
 	 */
 	port = of_get_child_by_name(dev->dev->of_node, "port");
 	if (!port) {
-		DRM_ERROR("no port node found in %s\n",
-			  dev->dev->of_node->full_name);
+		DRM_ERROR("no port node found in %pOF\n", dev->dev->of_node);
 		return -EINVAL;
 	}
 	of_node_put(port);
@@ -889,7 +903,6 @@ static const struct drm_plane_helper_funcs ade_plane_helper_funcs = {
 static struct drm_plane_funcs ade_plane_funcs = {
 	.update_plane	= drm_atomic_helper_update_plane,
 	.disable_plane	= drm_atomic_helper_disable_plane,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.destroy = drm_plane_cleanup,
 	.reset = drm_atomic_helper_plane_reset,
 	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
@@ -909,7 +922,7 @@ static int ade_plane_init(struct drm_device *dev, struct ade_plane *aplane,
 		return ret;
 
 	ret = drm_universal_plane_init(dev, &aplane->base, 1, &ade_plane_funcs,
-				       fmts, fmts_cnt, type, NULL);
+				       fmts, fmts_cnt, NULL, type, NULL);
 	if (ret) {
 		DRM_ERROR("fail to init plane, ch=%d\n", aplane->ch);
 		return ret;
diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
index 9c90367..e27352c 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.c
@@ -34,14 +34,12 @@ static int kirin_drm_kms_cleanup(struct drm_device *dev)
 {
 	struct kirin_drm_private *priv = dev->dev_private;
 
-#ifdef CONFIG_DRM_FBDEV_EMULATION
 	if (priv->fbdev) {
 		drm_fbdev_cma_fini(priv->fbdev);
 		priv->fbdev = NULL;
 	}
-#endif
+
 	drm_kms_helper_poll_fini(dev);
-	drm_vblank_cleanup(dev);
 	dc_ops->cleanup(to_platform_device(dev->dev));
 	drm_mode_config_cleanup(dev);
 	devm_kfree(dev->dev, priv);
@@ -50,27 +48,16 @@ static int kirin_drm_kms_cleanup(struct drm_device *dev)
 	return 0;
 }
 
-#ifdef CONFIG_DRM_FBDEV_EMULATION
 static void kirin_fbdev_output_poll_changed(struct drm_device *dev)
 {
 	struct kirin_drm_private *priv = dev->dev_private;
 
-	if (priv->fbdev) {
-		drm_fbdev_cma_hotplug_event(priv->fbdev);
-	} else {
-		priv->fbdev = drm_fbdev_cma_init(dev, 32,
-						 dev->mode_config.num_connector);
-		if (IS_ERR(priv->fbdev))
-			priv->fbdev = NULL;
-	}
+	drm_fbdev_cma_hotplug_event(priv->fbdev);
 }
-#endif
 
 static const struct drm_mode_config_funcs kirin_drm_mode_config_funcs = {
 	.fb_create = drm_fb_cma_create,
-#ifdef CONFIG_DRM_FBDEV_EMULATION
 	.output_poll_changed = kirin_fbdev_output_poll_changed,
-#endif
 	.atomic_check = drm_atomic_helper_check,
 	.atomic_commit = drm_atomic_helper_commit,
 };
@@ -129,11 +116,18 @@ static int kirin_drm_kms_init(struct drm_device *dev)
 	/* init kms poll for handling hpd */
 	drm_kms_helper_poll_init(dev);
 
-	/* force detection after connectors init */
-	(void)drm_helper_hpd_irq_event(dev);
+	priv->fbdev = drm_fbdev_cma_init(dev, 32,
+					 dev->mode_config.num_connector);
 
+	if (IS_ERR(priv->fbdev)) {
+		DRM_ERROR("failed to initialize fbdev.\n");
+		ret = PTR_ERR(priv->fbdev);
+		goto err_cleanup_poll;
+	}
 	return 0;
 
+err_cleanup_poll:
+	drm_kms_helper_poll_fini(dev);
 err_unbind_all:
 	component_unbind_all(dev->dev, dev);
 err_dc_cleanup:
@@ -163,8 +157,6 @@ static struct drm_driver kirin_drm_driver = {
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops		= &drm_gem_cma_vm_ops,
 	.dumb_create		= kirin_gem_cma_dumb_create,
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset,
-	.dumb_destroy		= drm_gem_dumb_destroy,
 
 	.prime_handle_to_fd	= drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle	= drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
index 7f60c649..56cb62d 100644
--- a/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
+++ b/drivers/gpu/drm/hisilicon/kirin/kirin_drm_drv.h
@@ -20,9 +20,7 @@ struct kirin_dc_ops {
 };
 
 struct kirin_drm_private {
-#ifdef CONFIG_DRM_FBDEV_EMULATION
 	struct drm_fbdev_cma *fbdev;
-#endif
 };
 
 extern const struct kirin_dc_ops ade_dc_ops;
diff --git a/drivers/gpu/drm/i2c/tda998x_drv.c b/drivers/gpu/drm/i2c/tda998x_drv.c
index 86f47e1..54e3255 100644
--- a/drivers/gpu/drm/i2c/tda998x_drv.c
+++ b/drivers/gpu/drm/i2c/tda998x_drv.c
@@ -712,7 +712,7 @@ tda998x_write_avi(struct tda998x_priv *priv, struct drm_display_mode *mode)
 {
 	union hdmi_infoframe frame;
 
-	drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode);
+	drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
 	frame.avi.quantization_range = HDMI_QUANTIZATION_RANGE_FULL;
 
 	tda998x_write_if(priv, DIP_IF_FLAGS_IF2, REG_IF2_HB0, &frame);
@@ -969,14 +969,6 @@ static int tda998x_audio_codec_init(struct tda998x_priv *priv,
 
 /* DRM connector functions */
 
-static int tda998x_connector_dpms(struct drm_connector *connector, int mode)
-{
-	if (drm_core_check_feature(connector->dev, DRIVER_ATOMIC))
-		return drm_atomic_helper_connector_dpms(connector, mode);
-	else
-		return drm_helper_connector_dpms(connector, mode);
-}
-
 static int tda998x_connector_fill_modes(struct drm_connector *connector,
 					uint32_t maxX, uint32_t maxY)
 {
@@ -1014,7 +1006,7 @@ static void tda998x_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs tda998x_connector_funcs = {
-	.dpms = tda998x_connector_dpms,
+	.dpms = drm_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.fill_modes = tda998x_connector_fill_modes,
 	.detect = tda998x_connector_detect,
diff --git a/drivers/gpu/drm/i810/i810_drv.c b/drivers/gpu/drm/i810/i810_drv.c
index 37fd090..c69d5c4 100644
--- a/drivers/gpu/drm/i810/i810_drv.c
+++ b/drivers/gpu/drm/i810/i810_drv.c
@@ -59,7 +59,6 @@ static struct drm_driver driver = {
 	.load = i810_driver_load,
 	.lastclose = i810_driver_lastclose,
 	.preclose = i810_driver_preclose,
-	.set_busid = drm_pci_set_busid,
 	.dma_quiescent = i810_driver_dma_quiescent,
 	.ioctls = i810_ioctls,
 	.fops = &i810_driver_fops,
@@ -83,12 +82,12 @@ static int __init i810_init(void)
 		return -EINVAL;
 	}
 	driver.num_ioctls = i810_max_ioctl;
-	return drm_pci_init(&driver, &i810_pci_driver);
+	return drm_legacy_pci_init(&driver, &i810_pci_driver);
 }
 
 static void __exit i810_exit(void)
 {
-	drm_pci_exit(&driver, &i810_pci_driver);
+	drm_legacy_pci_exit(&driver, &i810_pci_driver);
 }
 
 module_init(i810_init);
diff --git a/drivers/gpu/drm/i915/Kconfig b/drivers/gpu/drm/i915/Kconfig
index a5cd5da..e9e64e8 100644
--- a/drivers/gpu/drm/i915/Kconfig
+++ b/drivers/gpu/drm/i915/Kconfig
@@ -21,6 +21,7 @@
 	select ACPI_BUTTON if ACPI
 	select SYNC_FILE
 	select IOSF_MBI
+	select CRC32
 	help
 	  Choose this option if you have a system that has "Intel Graphics
 	  Media Accelerator" or "HD Graphics" integrated graphics,
diff --git a/drivers/gpu/drm/i915/Kconfig.debug b/drivers/gpu/drm/i915/Kconfig.debug
index 78c5c04..aed7d20 100644
--- a/drivers/gpu/drm/i915/Kconfig.debug
+++ b/drivers/gpu/drm/i915/Kconfig.debug
@@ -25,6 +25,7 @@
         select DRM_VGEM # used by igt/prime_vgem (dmabuf interop checks)
         select DRM_DEBUG_MM if DRM=y
 	select DRM_DEBUG_MM_SELFTEST
+	select SW_SYNC # signaling validation framework (igt/syncobj*)
 	select DRM_I915_SW_FENCE_DEBUG_OBJECTS
 	select DRM_I915_SELFTEST
         default n
diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index f822731..892f52b 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -39,6 +39,7 @@
 	  i915_gem_gtt.o \
 	  i915_gem_internal.o \
 	  i915_gem.o \
+	  i915_gem_object.o \
 	  i915_gem_render_state.o \
 	  i915_gem_request.o \
 	  i915_gem_shrinker.o \
diff --git a/drivers/gpu/drm/i915/gvt/aperture_gm.c b/drivers/gpu/drm/i915/gvt/aperture_gm.c
index 325618d..ca3d192 100644
--- a/drivers/gpu/drm/i915/gvt/aperture_gm.c
+++ b/drivers/gpu/drm/i915/gvt/aperture_gm.c
@@ -285,8 +285,8 @@ static int alloc_resource(struct intel_vgpu *vgpu,
 	return 0;
 
 no_enough_resource:
-	gvt_vgpu_err("fail to allocate resource %s\n", item);
-	gvt_vgpu_err("request %luMB avail %luMB max %luMB taken %luMB\n",
+	gvt_err("fail to allocate resource %s\n", item);
+	gvt_err("request %luMB avail %luMB max %luMB taken %luMB\n",
 		BYTES_TO_MB(request), BYTES_TO_MB(avail),
 		BYTES_TO_MB(max), BYTES_TO_MB(taken));
 	return -ENOSPC;
diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c
index e556a46..21c36e2 100644
--- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
+++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
@@ -1382,13 +1382,13 @@ static inline int cmd_address_audit(struct parser_exec_state *s,
 			ret = -EINVAL;
 			goto err;
 		}
-	} else if ((!vgpu_gmadr_is_valid(s->vgpu, guest_gma)) ||
-			(!vgpu_gmadr_is_valid(s->vgpu,
-					      guest_gma + op_size - 1))) {
+	} else if (!intel_gvt_ggtt_validate_range(vgpu, guest_gma, op_size)) {
 		ret = -EINVAL;
 		goto err;
 	}
+
 	return 0;
+
 err:
 	gvt_vgpu_err("cmd_parser: Malicious %s detected, addr=0x%lx, len=%d!\n",
 			s->info->name, guest_gma, op_size);
@@ -2647,7 +2647,7 @@ static int shadow_workload_ring_buffer(struct intel_vgpu_workload *workload)
 	return 0;
 }
 
-int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
+int intel_gvt_scan_and_shadow_ringbuffer(struct intel_vgpu_workload *workload)
 {
 	int ret;
 	struct intel_vgpu *vgpu = workload->vgpu;
diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.h b/drivers/gpu/drm/i915/gvt/cmd_parser.h
index bed3351..2867036 100644
--- a/drivers/gpu/drm/i915/gvt/cmd_parser.h
+++ b/drivers/gpu/drm/i915/gvt/cmd_parser.h
@@ -42,7 +42,7 @@ void intel_gvt_clean_cmd_parser(struct intel_gvt *gvt);
 
 int intel_gvt_init_cmd_parser(struct intel_gvt *gvt);
 
-int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload);
+int intel_gvt_scan_and_shadow_ringbuffer(struct intel_vgpu_workload *workload);
 
 int intel_gvt_scan_and_shadow_wa_ctx(struct intel_shadow_wa_ctx *wa_ctx);
 
diff --git a/drivers/gpu/drm/i915/gvt/display.c b/drivers/gpu/drm/i915/gvt/display.c
index 7cb0818..3c31843 100644
--- a/drivers/gpu/drm/i915/gvt/display.c
+++ b/drivers/gpu/drm/i915/gvt/display.c
@@ -178,9 +178,9 @@ static void emulate_monitor_status_change(struct intel_vgpu *vgpu)
 				SDE_PORTE_HOTPLUG_SPT);
 		vgpu_vreg(vgpu, SKL_FUSE_STATUS) |=
 				SKL_FUSE_DOWNLOAD_STATUS |
-				SKL_FUSE_PG0_DIST_STATUS |
-				SKL_FUSE_PG1_DIST_STATUS |
-				SKL_FUSE_PG2_DIST_STATUS;
+				SKL_FUSE_PG_DIST_STATUS(SKL_PG0) |
+				SKL_FUSE_PG_DIST_STATUS(SKL_PG1) |
+				SKL_FUSE_PG_DIST_STATUS(SKL_PG2);
 		vgpu_vreg(vgpu, LCPLL1_CTL) |=
 				LCPLL_PLL_ENABLE |
 				LCPLL_PLL_LOCK;
diff --git a/drivers/gpu/drm/i915/gvt/execlist.c b/drivers/gpu/drm/i915/gvt/execlist.c
index 1648887..91b4300 100644
--- a/drivers/gpu/drm/i915/gvt/execlist.c
+++ b/drivers/gpu/drm/i915/gvt/execlist.c
@@ -622,6 +622,7 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
 	struct list_head *q = workload_q_head(vgpu, ring_id);
 	struct intel_vgpu_workload *last_workload = get_last_workload(q);
 	struct intel_vgpu_workload *workload = NULL;
+	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
 	u64 ring_context_gpa;
 	u32 head, tail, start, ctl, ctx_ctl, per_ctx, indirect_ctx;
 	int ret;
@@ -685,6 +686,7 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
 	workload->complete = complete_execlist_workload;
 	workload->status = -EINPROGRESS;
 	workload->emulate_schedule_in = emulate_schedule_in;
+	workload->shadowed = false;
 
 	if (ring_id == RCS) {
 		intel_gvt_hypervisor_read_gpa(vgpu, ring_context_gpa +
@@ -718,6 +720,17 @@ static int submit_context(struct intel_vgpu *vgpu, int ring_id,
 		return ret;
 	}
 
+	/* Only scan and shadow the first workload in the queue
+	 * as there is only one pre-allocated buf-obj for shadow.
+	 */
+	if (list_empty(workload_q_head(vgpu, ring_id))) {
+		intel_runtime_pm_get(dev_priv);
+		mutex_lock(&dev_priv->drm.struct_mutex);
+		intel_gvt_scan_and_shadow_workload(workload);
+		mutex_unlock(&dev_priv->drm.struct_mutex);
+		intel_runtime_pm_put(dev_priv);
+	}
+
 	queue_workload(workload);
 	return 0;
 }
@@ -800,6 +813,8 @@ static void clean_workloads(struct intel_vgpu *vgpu, unsigned long engine_mask)
 			list_del_init(&pos->list);
 			free_workload(pos);
 		}
+
+		clear_bit(engine->id, vgpu->shadow_ctx_desc_updated);
 	}
 }
 
diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index 6166e34..e6dfc33 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -259,7 +259,7 @@ static void write_pte64(struct drm_i915_private *dev_priv,
 	writeq(pte, addr);
 }
 
-static inline struct intel_gvt_gtt_entry *gtt_get_entry64(void *pt,
+static inline int gtt_get_entry64(void *pt,
 		struct intel_gvt_gtt_entry *e,
 		unsigned long index, bool hypervisor_access, unsigned long gpa,
 		struct intel_vgpu *vgpu)
@@ -268,22 +268,23 @@ static inline struct intel_gvt_gtt_entry *gtt_get_entry64(void *pt,
 	int ret;
 
 	if (WARN_ON(info->gtt_entry_size != 8))
-		return e;
+		return -EINVAL;
 
 	if (hypervisor_access) {
 		ret = intel_gvt_hypervisor_read_gpa(vgpu, gpa +
 				(index << info->gtt_entry_size_shift),
 				&e->val64, 8);
-		WARN_ON(ret);
+		if (WARN_ON(ret))
+			return ret;
 	} else if (!pt) {
 		e->val64 = read_pte64(vgpu->gvt->dev_priv, index);
 	} else {
 		e->val64 = *((u64 *)pt + index);
 	}
-	return e;
+	return 0;
 }
 
-static inline struct intel_gvt_gtt_entry *gtt_set_entry64(void *pt,
+static inline int gtt_set_entry64(void *pt,
 		struct intel_gvt_gtt_entry *e,
 		unsigned long index, bool hypervisor_access, unsigned long gpa,
 		struct intel_vgpu *vgpu)
@@ -292,19 +293,20 @@ static inline struct intel_gvt_gtt_entry *gtt_set_entry64(void *pt,
 	int ret;
 
 	if (WARN_ON(info->gtt_entry_size != 8))
-		return e;
+		return -EINVAL;
 
 	if (hypervisor_access) {
 		ret = intel_gvt_hypervisor_write_gpa(vgpu, gpa +
 				(index << info->gtt_entry_size_shift),
 				&e->val64, 8);
-		WARN_ON(ret);
+		if (WARN_ON(ret))
+			return ret;
 	} else if (!pt) {
 		write_pte64(vgpu->gvt->dev_priv, index, e->val64);
 	} else {
 		*((u64 *)pt + index) = e->val64;
 	}
-	return e;
+	return 0;
 }
 
 #define GTT_HAW 46
@@ -445,21 +447,25 @@ static int gtt_entry_p2m(struct intel_vgpu *vgpu, struct intel_gvt_gtt_entry *p,
 /*
  * MM helpers.
  */
-struct intel_gvt_gtt_entry *intel_vgpu_mm_get_entry(struct intel_vgpu_mm *mm,
+int intel_vgpu_mm_get_entry(struct intel_vgpu_mm *mm,
 		void *page_table, struct intel_gvt_gtt_entry *e,
 		unsigned long index)
 {
 	struct intel_gvt *gvt = mm->vgpu->gvt;
 	struct intel_gvt_gtt_pte_ops *ops = gvt->gtt.pte_ops;
+	int ret;
 
 	e->type = mm->page_table_entry_type;
 
-	ops->get_entry(page_table, e, index, false, 0, mm->vgpu);
+	ret = ops->get_entry(page_table, e, index, false, 0, mm->vgpu);
+	if (ret)
+		return ret;
+
 	ops->test_pse(e);
-	return e;
+	return 0;
 }
 
-struct intel_gvt_gtt_entry *intel_vgpu_mm_set_entry(struct intel_vgpu_mm *mm,
+int intel_vgpu_mm_set_entry(struct intel_vgpu_mm *mm,
 		void *page_table, struct intel_gvt_gtt_entry *e,
 		unsigned long index)
 {
@@ -472,7 +478,7 @@ struct intel_gvt_gtt_entry *intel_vgpu_mm_set_entry(struct intel_vgpu_mm *mm,
 /*
  * PPGTT shadow page table helpers.
  */
-static inline struct intel_gvt_gtt_entry *ppgtt_spt_get_entry(
+static inline int ppgtt_spt_get_entry(
 		struct intel_vgpu_ppgtt_spt *spt,
 		void *page_table, int type,
 		struct intel_gvt_gtt_entry *e, unsigned long index,
@@ -480,20 +486,24 @@ static inline struct intel_gvt_gtt_entry *ppgtt_spt_get_entry(
 {
 	struct intel_gvt *gvt = spt->vgpu->gvt;
 	struct intel_gvt_gtt_pte_ops *ops = gvt->gtt.pte_ops;
+	int ret;
 
 	e->type = get_entry_type(type);
 
 	if (WARN(!gtt_type_is_entry(e->type), "invalid entry type\n"))
-		return e;
+		return -EINVAL;
 
-	ops->get_entry(page_table, e, index, guest,
+	ret = ops->get_entry(page_table, e, index, guest,
 			spt->guest_page.gfn << GTT_PAGE_SHIFT,
 			spt->vgpu);
+	if (ret)
+		return ret;
+
 	ops->test_pse(e);
-	return e;
+	return 0;
 }
 
-static inline struct intel_gvt_gtt_entry *ppgtt_spt_set_entry(
+static inline int ppgtt_spt_set_entry(
 		struct intel_vgpu_ppgtt_spt *spt,
 		void *page_table, int type,
 		struct intel_gvt_gtt_entry *e, unsigned long index,
@@ -503,7 +513,7 @@ static inline struct intel_gvt_gtt_entry *ppgtt_spt_set_entry(
 	struct intel_gvt_gtt_pte_ops *ops = gvt->gtt.pte_ops;
 
 	if (WARN(!gtt_type_is_entry(e->type), "invalid entry type\n"))
-		return e;
+		return -EINVAL;
 
 	return ops->set_entry(page_table, e, index, guest,
 			spt->guest_page.gfn << GTT_PAGE_SHIFT,
@@ -792,13 +802,13 @@ static struct intel_vgpu_ppgtt_spt *ppgtt_find_shadow_page(
 
 #define for_each_present_guest_entry(spt, e, i) \
 	for (i = 0; i < pt_entries(spt); i++) \
-	if (spt->vgpu->gvt->gtt.pte_ops->test_present( \
-		ppgtt_get_guest_entry(spt, e, i)))
+		if (!ppgtt_get_guest_entry(spt, e, i) && \
+		    spt->vgpu->gvt->gtt.pte_ops->test_present(e))
 
 #define for_each_present_shadow_entry(spt, e, i) \
 	for (i = 0; i < pt_entries(spt); i++) \
-	if (spt->vgpu->gvt->gtt.pte_ops->test_present( \
-		ppgtt_get_shadow_entry(spt, e, i)))
+		if (!ppgtt_get_shadow_entry(spt, e, i) && \
+		    spt->vgpu->gvt->gtt.pte_ops->test_present(e))
 
 static void ppgtt_get_shadow_page(struct intel_vgpu_ppgtt_spt *spt)
 {
@@ -979,29 +989,26 @@ static int ppgtt_populate_shadow_page(struct intel_vgpu_ppgtt_spt *spt)
 }
 
 static int ppgtt_handle_guest_entry_removal(struct intel_vgpu_guest_page *gpt,
-		unsigned long index)
+		struct intel_gvt_gtt_entry *se, unsigned long index)
 {
 	struct intel_vgpu_ppgtt_spt *spt = guest_page_to_ppgtt_spt(gpt);
 	struct intel_vgpu_shadow_page *sp = &spt->shadow_page;
 	struct intel_vgpu *vgpu = spt->vgpu;
 	struct intel_gvt_gtt_pte_ops *ops = vgpu->gvt->gtt.pte_ops;
-	struct intel_gvt_gtt_entry e;
 	int ret;
 
-	ppgtt_get_shadow_entry(spt, &e, index);
-
-	trace_gpt_change(spt->vgpu->id, "remove", spt, sp->type, e.val64,
+	trace_gpt_change(spt->vgpu->id, "remove", spt, sp->type, se->val64,
 			 index);
 
-	if (!ops->test_present(&e))
+	if (!ops->test_present(se))
 		return 0;
 
-	if (ops->get_pfn(&e) == vgpu->gtt.scratch_pt[sp->type].page_mfn)
+	if (ops->get_pfn(se) == vgpu->gtt.scratch_pt[sp->type].page_mfn)
 		return 0;
 
-	if (gtt_type_is_pt(get_next_pt_type(e.type))) {
+	if (gtt_type_is_pt(get_next_pt_type(se->type))) {
 		struct intel_vgpu_ppgtt_spt *s =
-			ppgtt_find_shadow_page(vgpu, ops->get_pfn(&e));
+			ppgtt_find_shadow_page(vgpu, ops->get_pfn(se));
 		if (!s) {
 			gvt_vgpu_err("fail to find guest page\n");
 			ret = -ENXIO;
@@ -1011,12 +1018,10 @@ static int ppgtt_handle_guest_entry_removal(struct intel_vgpu_guest_page *gpt,
 		if (ret)
 			goto fail;
 	}
-	ops->set_pfn(&e, vgpu->gtt.scratch_pt[sp->type].page_mfn);
-	ppgtt_set_shadow_entry(spt, &e, index);
 	return 0;
 fail:
 	gvt_vgpu_err("fail: shadow page %p guest entry 0x%llx type %d\n",
-			spt, e.val64, e.type);
+			spt, se->val64, se->type);
 	return ret;
 }
 
@@ -1236,22 +1241,37 @@ static int ppgtt_handle_guest_write_page_table(
 {
 	struct intel_vgpu_ppgtt_spt *spt = guest_page_to_ppgtt_spt(gpt);
 	struct intel_vgpu *vgpu = spt->vgpu;
+	int type = spt->shadow_page.type;
 	struct intel_gvt_gtt_pte_ops *ops = vgpu->gvt->gtt.pte_ops;
+	struct intel_gvt_gtt_entry se;
 
 	int ret;
 	int new_present;
 
 	new_present = ops->test_present(we);
 
-	ret = ppgtt_handle_guest_entry_removal(gpt, index);
-	if (ret)
-		goto fail;
+	/*
+	 * Add the new entry first, then remove the old one. This guarantees
+	 * that the ppgtt table stays valid during the window between adding
+	 * and removal.
+	 */
+	ppgtt_get_shadow_entry(spt, &se, index);
 
 	if (new_present) {
 		ret = ppgtt_handle_guest_entry_add(gpt, we, index);
 		if (ret)
 			goto fail;
 	}
+
+	ret = ppgtt_handle_guest_entry_removal(gpt, &se, index);
+	if (ret)
+		goto fail;
+
+	if (!new_present) {
+		ops->set_pfn(&se, vgpu->gtt.scratch_pt[type].page_mfn);
+		ppgtt_set_shadow_entry(spt, &se, index);
+	}
+
 	return 0;
 fail:
 	gvt_vgpu_err("fail: shadow page %p guest entry 0x%llx type %d.\n",
@@ -1323,7 +1343,7 @@ static int ppgtt_handle_guest_write_page_table_bytes(void *gp,
 	struct intel_vgpu *vgpu = spt->vgpu;
 	struct intel_gvt_gtt_pte_ops *ops = vgpu->gvt->gtt.pte_ops;
 	const struct intel_gvt_device_info *info = &vgpu->gvt->device_info;
-	struct intel_gvt_gtt_entry we;
+	struct intel_gvt_gtt_entry we, se;
 	unsigned long index;
 	int ret;
 
@@ -1339,7 +1359,8 @@ static int ppgtt_handle_guest_write_page_table_bytes(void *gp,
 			return ret;
 	} else {
 		if (!test_bit(index, spt->post_shadow_bitmap)) {
-			ret = ppgtt_handle_guest_entry_removal(gpt, index);
+			ppgtt_get_shadow_entry(spt, &se, index);
+			ret = ppgtt_handle_guest_entry_removal(gpt, &se, index);
 			if (ret)
 				return ret;
 		}
@@ -1713,8 +1734,10 @@ unsigned long intel_vgpu_gma_to_gpa(struct intel_vgpu_mm *mm, unsigned long gma)
 		if (!vgpu_gmadr_is_valid(vgpu, gma))
 			goto err;
 
-		ggtt_get_guest_entry(mm, &e,
-			gma_ops->gma_to_ggtt_pte_index(gma));
+		ret = ggtt_get_guest_entry(mm, &e,
+				gma_ops->gma_to_ggtt_pte_index(gma));
+		if (ret)
+			goto err;
 		gpa = (pte_ops->get_pfn(&e) << GTT_PAGE_SHIFT)
 			+ (gma & ~GTT_PAGE_MASK);
 
@@ -1724,7 +1747,9 @@ unsigned long intel_vgpu_gma_to_gpa(struct intel_vgpu_mm *mm, unsigned long gma)
 
 	switch (mm->page_table_level) {
 	case 4:
-		ppgtt_get_shadow_root_entry(mm, &e, 0);
+		ret = ppgtt_get_shadow_root_entry(mm, &e, 0);
+		if (ret)
+			goto err;
 		gma_index[0] = gma_ops->gma_to_pml4_index(gma);
 		gma_index[1] = gma_ops->gma_to_l4_pdp_index(gma);
 		gma_index[2] = gma_ops->gma_to_pde_index(gma);
@@ -1732,15 +1757,19 @@ unsigned long intel_vgpu_gma_to_gpa(struct intel_vgpu_mm *mm, unsigned long gma)
 		index = 4;
 		break;
 	case 3:
-		ppgtt_get_shadow_root_entry(mm, &e,
+		ret = ppgtt_get_shadow_root_entry(mm, &e,
 				gma_ops->gma_to_l3_pdp_index(gma));
+		if (ret)
+			goto err;
 		gma_index[0] = gma_ops->gma_to_pde_index(gma);
 		gma_index[1] = gma_ops->gma_to_pte_index(gma);
 		index = 2;
 		break;
 	case 2:
-		ppgtt_get_shadow_root_entry(mm, &e,
+		ret = ppgtt_get_shadow_root_entry(mm, &e,
 				gma_ops->gma_to_pde_index(gma));
+		if (ret)
+			goto err;
 		gma_index[0] = gma_ops->gma_to_pte_index(gma);
 		index = 1;
 		break;
@@ -1755,6 +1784,11 @@ unsigned long intel_vgpu_gma_to_gpa(struct intel_vgpu_mm *mm, unsigned long gma)
 			(i == index - 1));
 		if (ret)
 			goto err;
+
+		if (!pte_ops->test_present(&e)) {
+			gvt_dbg_core("GMA 0x%lx is not present\n", gma);
+			goto err;
+		}
 	}
 
 	gpa = (pte_ops->get_pfn(&e) << GTT_PAGE_SHIFT)
@@ -2329,13 +2363,12 @@ void intel_vgpu_reset_ggtt(struct intel_vgpu *vgpu)
 /**
  * intel_vgpu_reset_gtt - reset the all GTT related status
  * @vgpu: a vGPU
- * @dmlr: true for vGPU Device Model Level Reset, false for GT Reset
  *
  * This function is called from vfio core to reset reset all
  * GTT related status, including GGTT, PPGTT, scratch page.
  *
  */
-void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu, bool dmlr)
+void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu)
 {
 	int i;
 
@@ -2347,9 +2380,6 @@ void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu, bool dmlr)
 	 */
 	intel_vgpu_free_mm(vgpu, INTEL_GVT_MM_PPGTT);
 
-	if (!dmlr)
-		return;
-
 	intel_vgpu_reset_ggtt(vgpu);
 
 	/* clear scratch page for security */
diff --git a/drivers/gpu/drm/i915/gvt/gtt.h b/drivers/gpu/drm/i915/gvt/gtt.h
index f88eb5e..30a4c8d 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.h
+++ b/drivers/gpu/drm/i915/gvt/gtt.h
@@ -49,14 +49,18 @@ struct intel_gvt_gtt_entry {
 };
 
 struct intel_gvt_gtt_pte_ops {
-	struct intel_gvt_gtt_entry *(*get_entry)(void *pt,
-		struct intel_gvt_gtt_entry *e,
-		unsigned long index, bool hypervisor_access, unsigned long gpa,
-		struct intel_vgpu *vgpu);
-	struct intel_gvt_gtt_entry *(*set_entry)(void *pt,
-		struct intel_gvt_gtt_entry *e,
-		unsigned long index, bool hypervisor_access, unsigned long gpa,
-		struct intel_vgpu *vgpu);
+	int (*get_entry)(void *pt,
+			 struct intel_gvt_gtt_entry *e,
+			 unsigned long index,
+			 bool hypervisor_access,
+			 unsigned long gpa,
+			 struct intel_vgpu *vgpu);
+	int (*set_entry)(void *pt,
+			 struct intel_gvt_gtt_entry *e,
+			 unsigned long index,
+			 bool hypervisor_access,
+			 unsigned long gpa,
+			 struct intel_vgpu *vgpu);
 	bool (*test_present)(struct intel_gvt_gtt_entry *e);
 	void (*clear_present)(struct intel_gvt_gtt_entry *e);
 	bool (*test_pse)(struct intel_gvt_gtt_entry *e);
@@ -143,12 +147,12 @@ struct intel_vgpu_mm {
 	struct intel_vgpu *vgpu;
 };
 
-extern struct intel_gvt_gtt_entry *intel_vgpu_mm_get_entry(
+extern int intel_vgpu_mm_get_entry(
 		struct intel_vgpu_mm *mm,
 		void *page_table, struct intel_gvt_gtt_entry *e,
 		unsigned long index);
 
-extern struct intel_gvt_gtt_entry *intel_vgpu_mm_set_entry(
+extern int intel_vgpu_mm_set_entry(
 		struct intel_vgpu_mm *mm,
 		void *page_table, struct intel_gvt_gtt_entry *e,
 		unsigned long index);
@@ -208,7 +212,7 @@ extern void intel_vgpu_clean_gtt(struct intel_vgpu *vgpu);
 void intel_vgpu_reset_ggtt(struct intel_vgpu *vgpu);
 
 extern int intel_gvt_init_gtt(struct intel_gvt *gvt);
-extern void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu, bool dmlr);
+void intel_vgpu_reset_gtt(struct intel_vgpu *vgpu);
 extern void intel_gvt_clean_gtt(struct intel_gvt *gvt);
 
 extern struct intel_vgpu_mm *intel_gvt_find_ppgtt_mm(struct intel_vgpu *vgpu,
diff --git a/drivers/gpu/drm/i915/gvt/gvt.h b/drivers/gpu/drm/i915/gvt/gvt.h
index 2964a4d..44b719e 100644
--- a/drivers/gpu/drm/i915/gvt/gvt.h
+++ b/drivers/gpu/drm/i915/gvt/gvt.h
@@ -167,6 +167,7 @@ struct intel_vgpu {
 	atomic_t running_workload_num;
 	DECLARE_BITMAP(tlb_handle_pending, I915_NUM_ENGINES);
 	struct i915_gem_context *shadow_ctx;
+	DECLARE_BITMAP(shadow_ctx_desc_updated, I915_NUM_ENGINES);
 
 #if IS_ENABLED(CONFIG_DRM_I915_GVT_KVMGT)
 	struct {
@@ -482,6 +483,8 @@ int intel_vgpu_init_opregion(struct intel_vgpu *vgpu, u32 gpa);
 int intel_vgpu_emulate_opregion_request(struct intel_vgpu *vgpu, u32 swsci);
 void populate_pvinfo_page(struct intel_vgpu *vgpu);
 
+int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload);
+
 struct intel_gvt_ops {
 	int (*emulate_cfg_read)(struct intel_vgpu *, unsigned int, void *,
 				unsigned int);
diff --git a/drivers/gpu/drm/i915/gvt/handlers.c b/drivers/gpu/drm/i915/gvt/handlers.c
index feed992..3502a59 100644
--- a/drivers/gpu/drm/i915/gvt/handlers.c
+++ b/drivers/gpu/drm/i915/gvt/handlers.c
@@ -113,9 +113,17 @@ static int new_mmio_info(struct intel_gvt *gvt,
 
 		info->offset = i;
 		p = find_mmio_info(gvt, info->offset);
-		if (p)
-			gvt_err("dup mmio definition offset %x\n",
+		if (p) {
+			WARN(1, "dup mmio definition offset %x\n",
 				info->offset);
+			kfree(info);
+
+			/* Return -EEXIST here so that GVT-g fails to load
+			 * and duplicated MMIO definitions are found as
+			 * early as possible.
+			 */
+			return -EEXIST;
+		}
 
 		info->ro_mask = ro_mask;
 		info->device = device;
@@ -1222,10 +1230,12 @@ static int power_well_ctl_mmio_write(struct intel_vgpu *vgpu,
 {
 	write_vreg(vgpu, offset, p_data, bytes);
 
-	if (vgpu_vreg(vgpu, offset) & HSW_PWR_WELL_ENABLE_REQUEST)
-		vgpu_vreg(vgpu, offset) |= HSW_PWR_WELL_STATE_ENABLED;
+	if (vgpu_vreg(vgpu, offset) & HSW_PWR_WELL_CTL_REQ(HSW_DISP_PW_GLOBAL))
+		vgpu_vreg(vgpu, offset) |=
+			HSW_PWR_WELL_CTL_STATE(HSW_DISP_PW_GLOBAL);
 	else
-		vgpu_vreg(vgpu, offset) &= ~HSW_PWR_WELL_STATE_ENABLED;
+		vgpu_vreg(vgpu, offset) &=
+			~HSW_PWR_WELL_CTL_STATE(HSW_DISP_PW_GLOBAL);
 	return 0;
 }
 
@@ -2242,10 +2252,17 @@ static int init_generic_mmio_info(struct intel_gvt *gvt)
 	MMIO_D(GEN6_RC6p_THRESHOLD, D_ALL);
 	MMIO_D(GEN6_RC6pp_THRESHOLD, D_ALL);
 	MMIO_D(GEN6_PMINTRMSK, D_ALL);
-	MMIO_DH(HSW_PWR_WELL_BIOS, D_BDW, NULL, power_well_ctl_mmio_write);
-	MMIO_DH(HSW_PWR_WELL_DRIVER, D_BDW, NULL, power_well_ctl_mmio_write);
-	MMIO_DH(HSW_PWR_WELL_KVMR, D_BDW, NULL, power_well_ctl_mmio_write);
-	MMIO_DH(HSW_PWR_WELL_DEBUG, D_BDW, NULL, power_well_ctl_mmio_write);
+	/*
+	 * Use an arbitrary power well controlled by the PWR_WELL_CTL
+	 * register.
+	 */
+	MMIO_DH(HSW_PWR_WELL_CTL_BIOS(HSW_DISP_PW_GLOBAL), D_BDW, NULL,
+		power_well_ctl_mmio_write);
+	MMIO_DH(HSW_PWR_WELL_CTL_DRIVER(HSW_DISP_PW_GLOBAL), D_BDW, NULL,
+		power_well_ctl_mmio_write);
+	MMIO_DH(HSW_PWR_WELL_CTL_KVMR, D_BDW, NULL, power_well_ctl_mmio_write);
+	MMIO_DH(HSW_PWR_WELL_CTL_DEBUG(HSW_DISP_PW_GLOBAL), D_BDW, NULL,
+		power_well_ctl_mmio_write);
 	MMIO_DH(HSW_PWR_WELL_CTL5, D_BDW, NULL, power_well_ctl_mmio_write);
 	MMIO_DH(HSW_PWR_WELL_CTL6, D_BDW, NULL, power_well_ctl_mmio_write);
 
@@ -2581,7 +2598,6 @@ static int init_broadwell_mmio_info(struct intel_gvt *gvt)
 	MMIO_F(0x24d0, 48, F_CMD_ACCESS, 0, 0, D_BDW_PLUS,
 		NULL, force_nonpriv_write);
 
-	MMIO_D(0x22040, D_BDW_PLUS);
 	MMIO_D(0x44484, D_BDW_PLUS);
 	MMIO_D(0x4448c, D_BDW_PLUS);
 
@@ -2636,9 +2652,13 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
 	MMIO_F(_DPD_AUX_CH_CTL, 6 * 4, 0, 0, 0, D_SKL_PLUS, NULL,
 						dp_aux_ch_ctl_mmio_write);
 
-	MMIO_D(HSW_PWR_WELL_BIOS, D_SKL_PLUS);
-	MMIO_DH(HSW_PWR_WELL_DRIVER, D_SKL_PLUS, NULL,
-						skl_power_well_ctl_write);
+	/*
+	 * Use an arbitrary power well controlled by the PWR_WELL_CTL
+	 * register.
+	 */
+	MMIO_D(HSW_PWR_WELL_CTL_BIOS(SKL_DISP_PW_MISC_IO), D_SKL_PLUS);
+	MMIO_DH(HSW_PWR_WELL_CTL_DRIVER(SKL_DISP_PW_MISC_IO), D_SKL_PLUS, NULL,
+		skl_power_well_ctl_write);
 	MMIO_DH(GEN6_PCODE_MAILBOX, D_SKL_PLUS, NULL, mailbox_write);
 
 	MMIO_D(0xa210, D_SKL_PLUS);
@@ -2831,7 +2851,6 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
 	MMIO_D(0x320f0, D_SKL | D_KBL);
 
 	MMIO_DFH(_REG_VCS2_EXCC, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
-	MMIO_DFH(_REG_VECS_EXCC, D_SKL_PLUS, F_CMD_ACCESS, NULL, NULL);
 	MMIO_D(0x70034, D_SKL_PLUS);
 	MMIO_D(0x71034, D_SKL_PLUS);
 	MMIO_D(0x72034, D_SKL_PLUS);
@@ -2849,10 +2868,7 @@ static int init_skl_mmio_info(struct intel_gvt *gvt)
 		NULL, NULL);
 
 	MMIO_D(0x4ab8, D_KBL);
-	MMIO_D(0x940c, D_SKL_PLUS);
 	MMIO_D(0x2248, D_SKL_PLUS | D_KBL);
-	MMIO_D(0x4ab0, D_SKL | D_KBL);
-	MMIO_D(0x20d4, D_SKL | D_KBL);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c
index fd0c85f..83e88c7 100644
--- a/drivers/gpu/drm/i915/gvt/kvmgt.c
+++ b/drivers/gpu/drm/i915/gvt/kvmgt.c
@@ -1170,10 +1170,27 @@ vgpu_id_show(struct device *dev, struct device_attribute *attr,
 	return sprintf(buf, "\n");
 }
 
+static ssize_t
+hw_id_show(struct device *dev, struct device_attribute *attr,
+	   char *buf)
+{
+	struct mdev_device *mdev = mdev_from_dev(dev);
+
+	if (mdev) {
+		struct intel_vgpu *vgpu = (struct intel_vgpu *)
+			mdev_get_drvdata(mdev);
+		return sprintf(buf, "%u\n",
+			       vgpu->shadow_ctx->hw_id);
+	}
+	return sprintf(buf, "\n");
+}
+
 static DEVICE_ATTR_RO(vgpu_id);
+static DEVICE_ATTR_RO(hw_id);
 
 static struct attribute *intel_vgpu_attrs[] = {
 	&dev_attr_vgpu_id.attr,
+	&dev_attr_hw_id.attr,
 	NULL
 };
 
diff --git a/drivers/gpu/drm/i915/gvt/render.c b/drivers/gpu/drm/i915/gvt/render.c
index 504e57c..2ea5422 100644
--- a/drivers/gpu/drm/i915/gvt/render.c
+++ b/drivers/gpu/drm/i915/gvt/render.c
@@ -207,18 +207,16 @@ static void load_mocs(struct intel_vgpu *vgpu, int ring_id)
 
 	offset.reg = regs[ring_id];
 	for (i = 0; i < 64; i++) {
-		gen9_render_mocs[ring_id][i] = I915_READ(offset);
+		gen9_render_mocs[ring_id][i] = I915_READ_FW(offset);
 		I915_WRITE(offset, vgpu_vreg(vgpu, offset));
-		POSTING_READ(offset);
 		offset.reg += 4;
 	}
 
 	if (ring_id == RCS) {
 		l3_offset.reg = 0xb020;
 		for (i = 0; i < 32; i++) {
-			gen9_render_mocs_L3[i] = I915_READ(l3_offset);
-			I915_WRITE(l3_offset, vgpu_vreg(vgpu, l3_offset));
-			POSTING_READ(l3_offset);
+			gen9_render_mocs_L3[i] = I915_READ_FW(l3_offset);
+			I915_WRITE_FW(l3_offset, vgpu_vreg(vgpu, l3_offset));
 			l3_offset.reg += 4;
 		}
 	}
@@ -242,18 +240,16 @@ static void restore_mocs(struct intel_vgpu *vgpu, int ring_id)
 
 	offset.reg = regs[ring_id];
 	for (i = 0; i < 64; i++) {
-		vgpu_vreg(vgpu, offset) = I915_READ(offset);
-		I915_WRITE(offset, gen9_render_mocs[ring_id][i]);
-		POSTING_READ(offset);
+		vgpu_vreg(vgpu, offset) = I915_READ_FW(offset);
+		I915_WRITE_FW(offset, gen9_render_mocs[ring_id][i]);
 		offset.reg += 4;
 	}
 
 	if (ring_id == RCS) {
 		l3_offset.reg = 0xb020;
 		for (i = 0; i < 32; i++) {
-			vgpu_vreg(vgpu, l3_offset) = I915_READ(l3_offset);
-			I915_WRITE(l3_offset, gen9_render_mocs_L3[i]);
-			POSTING_READ(l3_offset);
+			vgpu_vreg(vgpu, l3_offset) = I915_READ_FW(l3_offset);
+			I915_WRITE_FW(l3_offset, gen9_render_mocs_L3[i]);
 			l3_offset.reg += 4;
 		}
 	}
@@ -272,6 +268,7 @@ static void switch_mmio_to_vgpu(struct intel_vgpu *vgpu, int ring_id)
 	u32 ctx_ctrl = reg_state[CTX_CONTEXT_CONTROL_VAL];
 	u32 inhibit_mask =
 		_MASKED_BIT_ENABLE(CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT);
+	i915_reg_t last_reg = _MMIO(0);
 
 	if (IS_SKYLAKE(vgpu->gvt->dev_priv)
 		|| IS_KABYLAKE(vgpu->gvt->dev_priv)) {
@@ -287,7 +284,7 @@ static void switch_mmio_to_vgpu(struct intel_vgpu *vgpu, int ring_id)
 		if (mmio->ring_id != ring_id)
 			continue;
 
-		mmio->value = I915_READ(mmio->reg);
+		mmio->value = I915_READ_FW(mmio->reg);
 
 		/*
 		 * if it is an inhibit context, load in_context mmio
@@ -304,13 +301,18 @@ static void switch_mmio_to_vgpu(struct intel_vgpu *vgpu, int ring_id)
 		else
 			v = vgpu_vreg(vgpu, mmio->reg);
 
-		I915_WRITE(mmio->reg, v);
-		POSTING_READ(mmio->reg);
+		I915_WRITE_FW(mmio->reg, v);
+		last_reg = mmio->reg;
 
 		trace_render_mmio(vgpu->id, "load",
 				  i915_mmio_reg_offset(mmio->reg),
 				  mmio->value, v);
 	}
+
+	/* Make sure the switched MMIOs have taken effect. */
+	if (likely(INTEL_GVT_MMIO_OFFSET(last_reg)))
+		I915_READ_FW(last_reg);
+
 	handle_tlb_pending_event(vgpu, ring_id);
 }
 
@@ -319,6 +321,7 @@ static void switch_mmio_to_host(struct intel_vgpu *vgpu, int ring_id)
 {
 	struct drm_i915_private *dev_priv = vgpu->gvt->dev_priv;
 	struct render_mmio *mmio;
+	i915_reg_t last_reg = _MMIO(0);
 	u32 v;
 	int i, array_size;
 
@@ -335,7 +338,7 @@ static void switch_mmio_to_host(struct intel_vgpu *vgpu, int ring_id)
 		if (mmio->ring_id != ring_id)
 			continue;
 
-		vgpu_vreg(vgpu, mmio->reg) = I915_READ(mmio->reg);
+		vgpu_vreg(vgpu, mmio->reg) = I915_READ_FW(mmio->reg);
 
 		if (mmio->mask) {
 			vgpu_vreg(vgpu, mmio->reg) &= ~(mmio->mask << 16);
@@ -346,13 +349,17 @@ static void switch_mmio_to_host(struct intel_vgpu *vgpu, int ring_id)
 		if (mmio->in_context)
 			continue;
 
-		I915_WRITE(mmio->reg, v);
-		POSTING_READ(mmio->reg);
+		I915_WRITE_FW(mmio->reg, v);
+		last_reg = mmio->reg;
 
 		trace_render_mmio(vgpu->id, "restore",
 				  i915_mmio_reg_offset(mmio->reg),
 				  mmio->value, v);
 	}
+
+	/* Make sure the switched MMIOs have taken effect. */
+	if (likely(INTEL_GVT_MMIO_OFFSET(last_reg)))
+		I915_READ_FW(last_reg);
 }
 
 /**
@@ -367,12 +374,23 @@ static void switch_mmio_to_host(struct intel_vgpu *vgpu, int ring_id)
 void intel_gvt_switch_mmio(struct intel_vgpu *pre,
 			   struct intel_vgpu *next, int ring_id)
 {
+	struct drm_i915_private *dev_priv;
+
 	if (WARN_ON(!pre && !next))
 		return;
 
 	gvt_dbg_render("switch ring %d from %s to %s\n", ring_id,
 		       pre ? "vGPU" : "host", next ? "vGPU" : "HOST");
 
+	dev_priv = pre ? pre->gvt->dev_priv : next->gvt->dev_priv;
+
+	/**
+	 * We are using raw mmio access wrapper to improve the
+	 * performance for batch mmio read/write, so we need to
+	 * handle forcewake manually.
+	 */
+	intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
+
 	/**
 	 * TODO: Optimize for vGPU to vGPU switch by merging
 	 * switch_mmio_to_host() and switch_mmio_to_vgpu().
@@ -382,4 +400,6 @@ void intel_gvt_switch_mmio(struct intel_vgpu *pre,
 
 	if (next)
 		switch_mmio_to_vgpu(next, ring_id);
+
+	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 }
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.c b/drivers/gpu/drm/i915/gvt/scheduler.c
index 22e08eb..391800d 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.c
+++ b/drivers/gpu/drm/i915/gvt/scheduler.c
@@ -184,41 +184,52 @@ static int shadow_context_status_change(struct notifier_block *nb,
 	return NOTIFY_OK;
 }
 
-static int dispatch_workload(struct intel_vgpu_workload *workload)
+static void shadow_context_descriptor_update(struct i915_gem_context *ctx,
+		struct intel_engine_cs *engine)
+{
+	struct intel_context *ce = &ctx->engine[engine->id];
+	u64 desc = 0;
+
+	desc = ce->lrc_desc;
+
+	/* Update bits 0-11 of the context descriptor which includes flags
+	 * like GEN8_CTX_* cached in desc_template
+	 */
+	desc &= U64_MAX << 12;
+	desc |= ctx->desc_template & ((1ULL << 12) - 1);
+
+	ce->lrc_desc = desc;
+}
+
+/**
+ * intel_gvt_scan_and_shadow_workload - audit the workload by scanning and
+ * shadowing it, including the ringbuffer, wa_ctx and ctx.
+ * @workload: an abstract entity for each execlist submission.
+ *
+ * This function is called before the workload is submitted to i915, to make
+ * sure the content of the workload is valid.
+ */
+int intel_gvt_scan_and_shadow_workload(struct intel_vgpu_workload *workload)
 {
 	int ring_id = workload->ring_id;
 	struct i915_gem_context *shadow_ctx = workload->vgpu->shadow_ctx;
 	struct drm_i915_private *dev_priv = workload->vgpu->gvt->dev_priv;
-	struct intel_engine_cs *engine = dev_priv->engine[ring_id];
 	struct drm_i915_gem_request *rq;
 	struct intel_vgpu *vgpu = workload->vgpu;
-	struct intel_ring *ring;
 	int ret;
 
-	gvt_dbg_sched("ring id %d prepare to dispatch workload %p\n",
-		ring_id, workload);
+	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+
+	if (workload->shadowed)
+		return 0;
 
 	shadow_ctx->desc_template &= ~(0x3 << GEN8_CTX_ADDRESSING_MODE_SHIFT);
 	shadow_ctx->desc_template |= workload->ctx_desc.addressing_mode <<
 				    GEN8_CTX_ADDRESSING_MODE_SHIFT;
 
-	mutex_lock(&dev_priv->drm.struct_mutex);
-
-	/* pin shadow context by gvt even the shadow context will be pinned
-	 * when i915 alloc request. That is because gvt will update the guest
-	 * context from shadow context when workload is completed, and at that
-	 * moment, i915 may already unpined the shadow context to make the
-	 * shadow_ctx pages invalid. So gvt need to pin itself. After update
-	 * the guest context, gvt can unpin the shadow_ctx safely.
-	 */
-	ring = engine->context_pin(engine, shadow_ctx);
-	if (IS_ERR(ring)) {
-		ret = PTR_ERR(ring);
-		gvt_vgpu_err("fail to pin shadow context\n");
-		workload->status = ret;
-		mutex_unlock(&dev_priv->drm.struct_mutex);
-		return ret;
-	}
+	if (!test_and_set_bit(ring_id, vgpu->shadow_ctx_desc_updated))
+		shadow_context_descriptor_update(shadow_ctx,
+					dev_priv->engine[ring_id]);
 
 	rq = i915_gem_request_alloc(dev_priv->engine[ring_id], shadow_ctx);
 	if (IS_ERR(rq)) {
@@ -231,7 +242,7 @@ static int dispatch_workload(struct intel_vgpu_workload *workload)
 
 	workload->req = i915_gem_request_get(rq);
 
-	ret = intel_gvt_scan_and_shadow_workload(workload);
+	ret = intel_gvt_scan_and_shadow_ringbuffer(workload);
 	if (ret)
 		goto out;
 
@@ -246,25 +257,61 @@ static int dispatch_workload(struct intel_vgpu_workload *workload)
 	if (ret)
 		goto out;
 
+	workload->shadowed = true;
+
+out:
+	return ret;
+}
+
+static int dispatch_workload(struct intel_vgpu_workload *workload)
+{
+	int ring_id = workload->ring_id;
+	struct i915_gem_context *shadow_ctx = workload->vgpu->shadow_ctx;
+	struct drm_i915_private *dev_priv = workload->vgpu->gvt->dev_priv;
+	struct intel_engine_cs *engine = dev_priv->engine[ring_id];
+	struct intel_vgpu *vgpu = workload->vgpu;
+	struct intel_ring *ring;
+	int ret = 0;
+
+	gvt_dbg_sched("ring id %d prepare to dispatch workload %p\n",
+		ring_id, workload);
+
+	mutex_lock(&dev_priv->drm.struct_mutex);
+
+	ret = intel_gvt_scan_and_shadow_workload(workload);
+	if (ret)
+		goto out;
+
 	if (workload->prepare) {
 		ret = workload->prepare(workload);
 		if (ret)
 			goto out;
 	}
 
-	gvt_dbg_sched("ring id %d submit workload to i915 %p\n",
-			ring_id, workload->req);
+	/* pin shadow context by gvt even though the shadow context will be
+	 * pinned when i915 allocs a request. That is because gvt will update
+	 * the guest context from shadow context when workload is completed,
+	 * and at that moment i915 may have already unpinned the shadow context,
+	 * making the shadow_ctx pages invalid. So gvt needs to pin it itself.
+	 * After updating the guest context, gvt can unpin shadow_ctx safely.
+	 */
+	ring = engine->context_pin(engine, shadow_ctx);
+	if (IS_ERR(ring)) {
+		ret = PTR_ERR(ring);
+		gvt_vgpu_err("fail to pin shadow context\n");
+		goto out;
+	}
 
-	ret = 0;
-	workload->dispatched = true;
 out:
 	if (ret)
 		workload->status = ret;
 
-	if (!IS_ERR_OR_NULL(rq))
-		i915_add_request(rq);
-	else
-		engine->context_unpin(engine, shadow_ctx);
+	if (!IS_ERR_OR_NULL(workload->req)) {
+		gvt_dbg_sched("ring id %d submit workload to i915 %p\n",
+				ring_id, workload->req);
+		i915_add_request(workload->req);
+		workload->dispatched = true;
+	}
 
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 	return ret;
@@ -617,7 +664,7 @@ int intel_gvt_init_workload_scheduler(struct intel_gvt *gvt)
 
 void intel_vgpu_clean_gvt_context(struct intel_vgpu *vgpu)
 {
-	i915_gem_context_put_unlocked(vgpu->shadow_ctx);
+	i915_gem_context_put(vgpu->shadow_ctx);
 }
 
 int intel_vgpu_init_gvt_context(struct intel_vgpu *vgpu)
@@ -631,5 +678,7 @@ int intel_vgpu_init_gvt_context(struct intel_vgpu *vgpu)
 
 	vgpu->shadow_ctx->engine[RCS].initialised = true;
 
+	bitmap_zero(vgpu->shadow_ctx_desc_updated, I915_NUM_ENGINES);
+
 	return 0;
 }
diff --git a/drivers/gpu/drm/i915/gvt/scheduler.h b/drivers/gpu/drm/i915/gvt/scheduler.h
index 9b6bf51..0d431a9 100644
--- a/drivers/gpu/drm/i915/gvt/scheduler.h
+++ b/drivers/gpu/drm/i915/gvt/scheduler.h
@@ -82,6 +82,7 @@ struct intel_vgpu_workload {
 	struct drm_i915_gem_request *req;
 	/* if this workload has been dispatched to i915? */
 	bool dispatched;
+	bool shadowed;
 	int status;
 
 	struct intel_vgpu_mm *shadow_mm;
diff --git a/drivers/gpu/drm/i915/gvt/vgpu.c b/drivers/gpu/drm/i915/gvt/vgpu.c
index 3deadcb..02c61a1 100644
--- a/drivers/gpu/drm/i915/gvt/vgpu.c
+++ b/drivers/gpu/drm/i915/gvt/vgpu.c
@@ -43,6 +43,7 @@ void populate_pvinfo_page(struct intel_vgpu *vgpu)
 	vgpu_vreg(vgpu, vgtif_reg(version_minor)) = 0;
 	vgpu_vreg(vgpu, vgtif_reg(display_ready)) = 0;
 	vgpu_vreg(vgpu, vgtif_reg(vgt_id)) = vgpu->id;
+	vgpu_vreg(vgpu, vgtif_reg(vgt_caps)) = VGT_CAPS_FULL_48BIT_PPGTT;
 	vgpu_vreg(vgpu, vgtif_reg(avail_rs.mappable_gmadr.base)) =
 		vgpu_aperture_gmadr_base(vgpu);
 	vgpu_vreg(vgpu, vgtif_reg(avail_rs.mappable_gmadr.size)) =
@@ -504,11 +505,11 @@ void intel_gvt_reset_vgpu_locked(struct intel_vgpu *vgpu, bool dmlr,
 	/* full GPU reset or device model level reset */
 	if (engine_mask == ALL_ENGINES || dmlr) {
 
-		intel_vgpu_reset_gtt(vgpu, dmlr);
-
 		/*fence will not be reset during virtual reset */
-		if (dmlr)
+		if (dmlr) {
+			intel_vgpu_reset_gtt(vgpu);
 			intel_vgpu_reset_resource(vgpu);
+		}
 
 		intel_vgpu_reset_mmio(vgpu, dmlr);
 		populate_pvinfo_page(vgpu);
diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
index d1bd53b..48572b1 100644
--- a/drivers/gpu/drm/i915/i915_debugfs.c
+++ b/drivers/gpu/drm/i915/i915_debugfs.c
@@ -543,75 +543,6 @@ static int i915_gem_gtt_info(struct seq_file *m, void *data)
 	return 0;
 }
 
-static int i915_gem_pageflip_info(struct seq_file *m, void *data)
-{
-	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	struct drm_device *dev = &dev_priv->drm;
-	struct intel_crtc *crtc;
-	int ret;
-
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
-	for_each_intel_crtc(dev, crtc) {
-		const char pipe = pipe_name(crtc->pipe);
-		const char plane = plane_name(crtc->plane);
-		struct intel_flip_work *work;
-
-		spin_lock_irq(&dev->event_lock);
-		work = crtc->flip_work;
-		if (work == NULL) {
-			seq_printf(m, "No flip due on pipe %c (plane %c)\n",
-				   pipe, plane);
-		} else {
-			u32 pending;
-			u32 addr;
-
-			pending = atomic_read(&work->pending);
-			if (pending) {
-				seq_printf(m, "Flip ioctl preparing on pipe %c (plane %c)\n",
-					   pipe, plane);
-			} else {
-				seq_printf(m, "Flip pending (waiting for vsync) on pipe %c (plane %c)\n",
-					   pipe, plane);
-			}
-			if (work->flip_queued_req) {
-				struct intel_engine_cs *engine = work->flip_queued_req->engine;
-
-				seq_printf(m, "Flip queued on %s at seqno %x, last submitted seqno %x [current breadcrumb %x], completed? %d\n",
-					   engine->name,
-					   work->flip_queued_req->global_seqno,
-					   intel_engine_last_submit(engine),
-					   intel_engine_get_seqno(engine),
-					   i915_gem_request_completed(work->flip_queued_req));
-			} else
-				seq_printf(m, "Flip not associated with any ring\n");
-			seq_printf(m, "Flip queued on frame %d, (was ready on frame %d), now %d\n",
-				   work->flip_queued_vblank,
-				   work->flip_ready_vblank,
-				   intel_crtc_get_vblank_counter(crtc));
-			seq_printf(m, "%d prepares\n", atomic_read(&work->pending));
-
-			if (INTEL_GEN(dev_priv) >= 4)
-				addr = I915_HI_DISPBASE(I915_READ(DSPSURF(crtc->plane)));
-			else
-				addr = I915_READ(DSPADDR(crtc->plane));
-			seq_printf(m, "Current scanout address 0x%08x\n", addr);
-
-			if (work->pending_flip_obj) {
-				seq_printf(m, "New framebuffer address 0x%08lx\n", (long)work->gtt_offset);
-				seq_printf(m, "MMIO update completed? %d\n",  addr == work->gtt_offset);
-			}
-		}
-		spin_unlock_irq(&dev->event_lock);
-	}
-
-	mutex_unlock(&dev->struct_mutex);
-
-	return 0;
-}
-
 static int i915_gem_batch_pool_info(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
@@ -1159,7 +1090,7 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 		intel_uncore_forcewake_get(dev_priv, FORCEWAKE_ALL);
 
 		reqf = I915_READ(GEN6_RPNSWREQ);
-		if (IS_GEN9(dev_priv))
+		if (INTEL_GEN(dev_priv) >= 9)
 			reqf >>= 23;
 		else {
 			reqf &= ~GEN6_TURBO_DISABLE;
@@ -1181,7 +1112,7 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 		rpdownei = I915_READ(GEN6_RP_CUR_DOWN_EI) & GEN6_CURIAVG_MASK;
 		rpcurdown = I915_READ(GEN6_RP_CUR_DOWN) & GEN6_CURBSYTAVG_MASK;
 		rpprevdown = I915_READ(GEN6_RP_PREV_DOWN) & GEN6_CURBSYTAVG_MASK;
-		if (IS_GEN9(dev_priv))
+		if (INTEL_GEN(dev_priv) >= 9)
 			cagf = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
 		else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
 			cagf = (rpstat & HSW_CAGF_MASK) >> HSW_CAGF_SHIFT;
@@ -1210,7 +1141,7 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 			   dev_priv->rps.pm_intrmsk_mbz);
 		seq_printf(m, "GT_PERF_STATUS: 0x%08x\n", gt_perf_status);
 		seq_printf(m, "Render p-state ratio: %d\n",
-			   (gt_perf_status & (IS_GEN9(dev_priv) ? 0x1ff00 : 0xff00)) >> 8);
+			   (gt_perf_status & (INTEL_GEN(dev_priv) >= 9 ? 0x1ff00 : 0xff00)) >> 8);
 		seq_printf(m, "Render p-state VID: %d\n",
 			   gt_perf_status & 0xff);
 		seq_printf(m, "Render p-state limit: %d\n",
@@ -1241,18 +1172,21 @@ static int i915_frequency_info(struct seq_file *m, void *unused)
 
 		max_freq = (IS_GEN9_LP(dev_priv) ? rp_state_cap >> 0 :
 			    rp_state_cap >> 16) & 0xff;
-		max_freq *= (IS_GEN9_BC(dev_priv) ? GEN9_FREQ_SCALER : 1);
+		max_freq *= (IS_GEN9_BC(dev_priv) ||
+			     IS_CANNONLAKE(dev_priv) ? GEN9_FREQ_SCALER : 1);
 		seq_printf(m, "Lowest (RPN) frequency: %dMHz\n",
 			   intel_gpu_freq(dev_priv, max_freq));
 
 		max_freq = (rp_state_cap & 0xff00) >> 8;
-		max_freq *= (IS_GEN9_BC(dev_priv) ? GEN9_FREQ_SCALER : 1);
+		max_freq *= (IS_GEN9_BC(dev_priv) ||
+			     IS_CANNONLAKE(dev_priv) ? GEN9_FREQ_SCALER : 1);
 		seq_printf(m, "Nominal (RP1) frequency: %dMHz\n",
 			   intel_gpu_freq(dev_priv, max_freq));
 
 		max_freq = (IS_GEN9_LP(dev_priv) ? rp_state_cap >> 16 :
 			    rp_state_cap >> 0) & 0xff;
-		max_freq *= (IS_GEN9_BC(dev_priv) ? GEN9_FREQ_SCALER : 1);
+		max_freq *= (IS_GEN9_BC(dev_priv) ||
+			     IS_CANNONLAKE(dev_priv) ? GEN9_FREQ_SCALER : 1);
 		seq_printf(m, "Max non-overclocked (RP0) frequency: %dMHz\n",
 			   intel_gpu_freq(dev_priv, max_freq));
 		seq_printf(m, "Max overclocked frequency: %dMHz\n",
@@ -1407,6 +1341,23 @@ static int i915_hangcheck_info(struct seq_file *m, void *unused)
 	return 0;
 }
 
+static int i915_reset_info(struct seq_file *m, void *unused)
+{
+	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	seq_printf(m, "full gpu reset = %u\n", i915_reset_count(error));
+
+	for_each_engine(engine, dev_priv, id) {
+		seq_printf(m, "%s = %u\n", engine->name,
+			   i915_reset_engine_count(error, engine));
+	}
+
+	return 0;
+}
+
 static int ironlake_drpc_info(struct seq_file *m)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
@@ -1838,7 +1789,7 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused)
 	if (ret)
 		goto out;
 
-	if (IS_GEN9_BC(dev_priv)) {
+	if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		/* Convert GT frequency to 50 HZ units */
 		min_gpu_freq =
 			dev_priv->rps.min_freq_softlimit / GEN9_FREQ_SCALER;
@@ -1858,7 +1809,8 @@ static int i915_ring_freq_table(struct seq_file *m, void *unused)
 				       &ia_freq);
 		seq_printf(m, "%d\t\t%d\t\t\t\t%d\n",
 			   intel_gpu_freq(dev_priv, (gpu_freq *
-						     (IS_GEN9_BC(dev_priv) ?
+						     (IS_GEN9_BC(dev_priv) ||
+						      IS_CANNONLAKE(dev_priv) ?
 						      GEN9_FREQ_SCALER : 1))),
 			   ((ia_freq >> 0) & 0xff) * 100,
 			   ((ia_freq >> 8) & 0xff) * 100);
@@ -1914,7 +1866,7 @@ static int i915_gem_framebuffer_info(struct seq_file *m, void *data)
 		return ret;
 
 #ifdef CONFIG_DRM_FBDEV_EMULATION
-	if (dev_priv->fbdev) {
+	if (dev_priv->fbdev && dev_priv->fbdev->helper.fb) {
 		fbdev_fb = to_intel_framebuffer(dev_priv->fbdev->helper.fb);
 
 		seq_printf(m, "fbcon size: %d x %d, depth %d, %d bpp, modifier 0x%llx, refcount %d, obj ",
@@ -1970,7 +1922,7 @@ static int i915_context_status(struct seq_file *m, void *unused)
 	if (ret)
 		return ret;
 
-	list_for_each_entry(ctx, &dev_priv->context_list, link) {
+	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
 		seq_printf(m, "HW context %u ", ctx->hw_id);
 		if (ctx->pid) {
 			struct task_struct *task;
@@ -2002,12 +1954,6 @@ static int i915_context_status(struct seq_file *m, void *unused)
 			seq_putc(m, '\n');
 		}
 
-		seq_printf(m,
-			   "\tvma hashtable size=%u (actual %lu), count=%u\n",
-			   ctx->vma_lut.ht_size,
-			   BIT(ctx->vma_lut.ht_bits),
-			   ctx->vma_lut.ht_count);
-
 		seq_putc(m, '\n');
 	}
 
@@ -2076,7 +2022,7 @@ static int i915_dump_lrc(struct seq_file *m, void *unused)
 	if (ret)
 		return ret;
 
-	list_for_each_entry(ctx, &dev_priv->context_list, link)
+	list_for_each_entry(ctx, &dev_priv->contexts.list, link)
 		for_each_engine(engine, dev_priv, id)
 			i915_dump_lrc_obj(m, ctx, engine);
 
@@ -2310,6 +2256,8 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
 	seq_printf(m, "GPU busy? %s [%d requests]\n",
 		   yesno(dev_priv->gt.awake), dev_priv->gt.active_requests);
 	seq_printf(m, "CPU waiting? %d\n", count_irq_waiters(dev_priv));
+	seq_printf(m, "Boosts outstanding? %d\n",
+		   atomic_read(&dev_priv->rps.num_waiters));
 	seq_printf(m, "Frequency requested %d\n",
 		   intel_gpu_freq(dev_priv, dev_priv->rps.cur_freq));
 	seq_printf(m, "  min hard:%d, soft:%d; max soft:%d, hard:%d\n",
@@ -2323,22 +2271,20 @@ static int i915_rps_boost_info(struct seq_file *m, void *data)
 		   intel_gpu_freq(dev_priv, dev_priv->rps.boost_freq));
 
 	mutex_lock(&dev->filelist_mutex);
-	spin_lock(&dev_priv->rps.client_lock);
 	list_for_each_entry_reverse(file, &dev->filelist, lhead) {
 		struct drm_i915_file_private *file_priv = file->driver_priv;
 		struct task_struct *task;
 
 		rcu_read_lock();
 		task = pid_task(file->pid, PIDTYPE_PID);
-		seq_printf(m, "%s [%d]: %d boosts%s\n",
+		seq_printf(m, "%s [%d]: %d boosts\n",
 			   task ? task->comm : "<unknown>",
 			   task ? task->pid : -1,
-			   file_priv->rps.boosts,
-			   list_empty(&file_priv->rps.link) ? "" : ", active");
+			   atomic_read(&file_priv->rps.boosts));
 		rcu_read_unlock();
 	}
-	seq_printf(m, "Kernel (anonymous) boosts: %d\n", dev_priv->rps.boosts);
-	spin_unlock(&dev_priv->rps.client_lock);
+	seq_printf(m, "Kernel (anonymous) boosts: %d\n",
+		   atomic_read(&dev_priv->rps.boosts));
 	mutex_unlock(&dev->filelist_mutex);
 
 	if (INTEL_GEN(dev_priv) >= 6 &&
@@ -2831,7 +2777,7 @@ static int i915_sink_crc(struct seq_file *m, void *data)
 static int i915_energy_uJ(struct seq_file *m, void *data)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
-	u64 power;
+	unsigned long long power;
 	u32 units;
 
 	if (INTEL_GEN(dev_priv) < 6)
@@ -2839,15 +2785,18 @@ static int i915_energy_uJ(struct seq_file *m, void *data)
 
 	intel_runtime_pm_get(dev_priv);
 
-	rdmsrl(MSR_RAPL_POWER_UNIT, power);
-	power = (power & 0x1f00) >> 8;
-	units = 1000000 / (1 << power); /* convert to uJ */
+	if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &power)) {
+		intel_runtime_pm_put(dev_priv);
+		return -ENODEV;
+	}
+
+	units = (power & 0x1f00) >> 8;
 	power = I915_READ(MCH_SECP_NRG_STTS);
-	power *= units;
+	power = (1000000 * power) >> units; /* convert to uJ */
 
 	intel_runtime_pm_put(dev_priv);
 
-	seq_printf(m, "%llu", (long long unsigned)power);
+	seq_printf(m, "%llu", power);
 
 	return 0;
 }
@@ -3289,6 +3238,7 @@ static int i915_display_info(struct seq_file *m, void *unused)
 static int i915_engine_info(struct seq_file *m, void *unused)
 {
 	struct drm_i915_private *dev_priv = node_to_i915(m->private);
+	struct i915_gpu_error *error = &dev_priv->gpu_error;
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
 
@@ -3312,6 +3262,8 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			   engine->hangcheck.seqno,
 			   jiffies_to_msecs(jiffies - engine->hangcheck.action_timestamp),
 			   engine->timeline->inflight_seqnos);
+		seq_printf(m, "\tReset count: %d\n",
+			   i915_reset_engine_count(error, engine));
 
 		rcu_read_lock();
 
@@ -3370,8 +3322,10 @@ static int i915_engine_info(struct seq_file *m, void *unused)
 			ptr = I915_READ(RING_CONTEXT_STATUS_PTR(engine));
 			read = GEN8_CSB_READ_PTR(ptr);
 			write = GEN8_CSB_WRITE_PTR(ptr);
-			seq_printf(m, "\tExeclist CSB read %d, write %d\n",
-				   read, write);
+			seq_printf(m, "\tExeclist CSB read %d, write %d, interrupt posted? %s\n",
+				   read, write,
+				   yesno(test_bit(ENGINE_IRQ_EXECLIST,
+						  &engine->irq_posted)));
 			if (read >= GEN8_CSB_ENTRIES)
 				read = 0;
 			if (write >= GEN8_CSB_ENTRIES)
@@ -3758,13 +3712,18 @@ static ssize_t i915_displayport_test_active_write(struct file *file,
 
 	drm_connector_list_iter_begin(dev, &conn_iter);
 	drm_for_each_connector_iter(connector, &conn_iter) {
+		struct intel_encoder *encoder;
+
 		if (connector->connector_type !=
 		    DRM_MODE_CONNECTOR_DisplayPort)
 			continue;
 
-		if (connector->status == connector_status_connected &&
-		    connector->encoder != NULL) {
-			intel_dp = enc_to_intel_dp(connector->encoder);
+		encoder = to_intel_encoder(connector->encoder);
+		if (encoder && encoder->type == INTEL_OUTPUT_DP_MST)
+			continue;
+
+		if (encoder && connector->status == connector_status_connected) {
+			intel_dp = enc_to_intel_dp(&encoder->base);
 			status = kstrtoint(input_buffer, 10, &val);
 			if (status < 0)
 				break;
@@ -3796,13 +3755,18 @@ static int i915_displayport_test_active_show(struct seq_file *m, void *data)
 
 	drm_connector_list_iter_begin(dev, &conn_iter);
 	drm_for_each_connector_iter(connector, &conn_iter) {
+		struct intel_encoder *encoder;
+
 		if (connector->connector_type !=
 		    DRM_MODE_CONNECTOR_DisplayPort)
 			continue;
 
-		if (connector->status == connector_status_connected &&
-		    connector->encoder != NULL) {
-			intel_dp = enc_to_intel_dp(connector->encoder);
+		encoder = to_intel_encoder(connector->encoder);
+		if (encoder && encoder->type == INTEL_OUTPUT_DP_MST)
+			continue;
+
+		if (encoder && connector->status == connector_status_connected) {
+			intel_dp = enc_to_intel_dp(&encoder->base);
 			if (intel_dp->compliance.test_active)
 				seq_puts(m, "1");
 			else
@@ -3842,13 +3806,18 @@ static int i915_displayport_test_data_show(struct seq_file *m, void *data)
 
 	drm_connector_list_iter_begin(dev, &conn_iter);
 	drm_for_each_connector_iter(connector, &conn_iter) {
+		struct intel_encoder *encoder;
+
 		if (connector->connector_type !=
 		    DRM_MODE_CONNECTOR_DisplayPort)
 			continue;
 
-		if (connector->status == connector_status_connected &&
-		    connector->encoder != NULL) {
-			intel_dp = enc_to_intel_dp(connector->encoder);
+		encoder = to_intel_encoder(connector->encoder);
+		if (encoder && encoder->type == INTEL_OUTPUT_DP_MST)
+			continue;
+
+		if (encoder && connector->status == connector_status_connected) {
+			intel_dp = enc_to_intel_dp(&encoder->base);
 			if (intel_dp->compliance.test_type ==
 			    DP_TEST_LINK_EDID_READ)
 				seq_printf(m, "%lx",
@@ -3895,13 +3864,18 @@ static int i915_displayport_test_type_show(struct seq_file *m, void *data)
 
 	drm_connector_list_iter_begin(dev, &conn_iter);
 	drm_for_each_connector_iter(connector, &conn_iter) {
+		struct intel_encoder *encoder;
+
 		if (connector->connector_type !=
 		    DRM_MODE_CONNECTOR_DisplayPort)
 			continue;
 
-		if (connector->status == connector_status_connected &&
-		    connector->encoder != NULL) {
-			intel_dp = enc_to_intel_dp(connector->encoder);
+		encoder = to_intel_encoder(connector->encoder);
+		if (encoder && encoder->type == INTEL_OUTPUT_DP_MST)
+			continue;
+
+		if (encoder && connector->status == connector_status_connected) {
+			intel_dp = enc_to_intel_dp(&encoder->base);
 			seq_printf(m, "%02lx", intel_dp->compliance.test_type);
 		} else
 			seq_puts(m, "0");
@@ -4810,7 +4784,6 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_gem_gtt", i915_gem_gtt_info, 0},
 	{"i915_gem_pin_display", i915_gem_gtt_info, 0, (void *)1},
 	{"i915_gem_stolen", i915_gem_stolen_list_info },
-	{"i915_gem_pageflip", i915_gem_pageflip_info, 0},
 	{"i915_gem_request", i915_gem_request_info, 0},
 	{"i915_gem_seqno", i915_gem_seqno_info, 0},
 	{"i915_gem_fence_regs", i915_gem_fence_regs_info, 0},
@@ -4824,6 +4797,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
 	{"i915_huc_load_status", i915_huc_load_status_info, 0},
 	{"i915_frequency_info", i915_frequency_info, 0},
 	{"i915_hangcheck_info", i915_hangcheck_info, 0},
+	{"i915_reset_info", i915_reset_info, 0},
 	{"i915_drpc_info", i915_drpc_info, 0},
 	{"i915_emon_status", i915_emon_status, 0},
 	{"i915_ring_freq_table", i915_ring_freq_table, 0},
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index fc307e03..4310022 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -132,9 +132,13 @@ static enum intel_pch intel_virt_detect_pch(struct drm_i915_private *dev_priv)
 		DRM_DEBUG_KMS("Assuming Ibex Peak PCH\n");
 	} else if (IS_GEN6(dev_priv) || IS_IVYBRIDGE(dev_priv)) {
 		ret = PCH_CPT;
-		DRM_DEBUG_KMS("Assuming CouarPoint PCH\n");
+		DRM_DEBUG_KMS("Assuming CougarPoint PCH\n");
 	} else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv)) {
 		ret = PCH_LPT;
+		if (IS_HSW_ULT(dev_priv) || IS_BDW_ULT(dev_priv))
+			dev_priv->pch_id = INTEL_PCH_LPT_LP_DEVICE_ID_TYPE;
+		else
+			dev_priv->pch_id = INTEL_PCH_LPT_DEVICE_ID_TYPE;
 		DRM_DEBUG_KMS("Assuming LynxPoint PCH\n");
 	} else if (IS_SKYLAKE(dev_priv) || IS_KABYLAKE(dev_priv)) {
 		ret = PCH_SPT;
@@ -173,29 +177,25 @@ static void intel_detect_pch(struct drm_i915_private *dev_priv)
 	while ((pch = pci_get_class(PCI_CLASS_BRIDGE_ISA << 8, pch))) {
 		if (pch->vendor == PCI_VENDOR_ID_INTEL) {
 			unsigned short id = pch->device & INTEL_PCH_DEVICE_ID_MASK;
-			unsigned short id_ext = pch->device &
-				INTEL_PCH_DEVICE_ID_MASK_EXT;
+
+			dev_priv->pch_id = id;
 
 			if (id == INTEL_PCH_IBX_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id;
 				dev_priv->pch_type = PCH_IBX;
 				DRM_DEBUG_KMS("Found Ibex Peak PCH\n");
 				WARN_ON(!IS_GEN5(dev_priv));
 			} else if (id == INTEL_PCH_CPT_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id;
 				dev_priv->pch_type = PCH_CPT;
 				DRM_DEBUG_KMS("Found CougarPoint PCH\n");
-				WARN_ON(!(IS_GEN6(dev_priv) ||
-					IS_IVYBRIDGE(dev_priv)));
+				WARN_ON(!IS_GEN6(dev_priv) &&
+					!IS_IVYBRIDGE(dev_priv));
 			} else if (id == INTEL_PCH_PPT_DEVICE_ID_TYPE) {
 				/* PantherPoint is CPT compatible */
-				dev_priv->pch_id = id;
 				dev_priv->pch_type = PCH_CPT;
 				DRM_DEBUG_KMS("Found PantherPoint PCH\n");
-				WARN_ON(!(IS_GEN6(dev_priv) ||
-					IS_IVYBRIDGE(dev_priv)));
+				WARN_ON(!IS_GEN6(dev_priv) &&
+					!IS_IVYBRIDGE(dev_priv));
 			} else if (id == INTEL_PCH_LPT_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id;
 				dev_priv->pch_type = PCH_LPT;
 				DRM_DEBUG_KMS("Found LynxPoint PCH\n");
 				WARN_ON(!IS_HASWELL(dev_priv) &&
@@ -203,51 +203,60 @@ static void intel_detect_pch(struct drm_i915_private *dev_priv)
 				WARN_ON(IS_HSW_ULT(dev_priv) ||
 					IS_BDW_ULT(dev_priv));
 			} else if (id == INTEL_PCH_LPT_LP_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id;
 				dev_priv->pch_type = PCH_LPT;
 				DRM_DEBUG_KMS("Found LynxPoint LP PCH\n");
 				WARN_ON(!IS_HASWELL(dev_priv) &&
 					!IS_BROADWELL(dev_priv));
 				WARN_ON(!IS_HSW_ULT(dev_priv) &&
 					!IS_BDW_ULT(dev_priv));
+			} else if (id == INTEL_PCH_WPT_DEVICE_ID_TYPE) {
+				/* WildcatPoint is LPT compatible */
+				dev_priv->pch_type = PCH_LPT;
+				DRM_DEBUG_KMS("Found WildcatPoint PCH\n");
+				WARN_ON(!IS_HASWELL(dev_priv) &&
+					!IS_BROADWELL(dev_priv));
+				WARN_ON(IS_HSW_ULT(dev_priv) ||
+					IS_BDW_ULT(dev_priv));
+			} else if (id == INTEL_PCH_WPT_LP_DEVICE_ID_TYPE) {
+				/* WildcatPoint is LPT compatible */
+				dev_priv->pch_type = PCH_LPT;
+				DRM_DEBUG_KMS("Found WildcatPoint LP PCH\n");
+				WARN_ON(!IS_HASWELL(dev_priv) &&
+					!IS_BROADWELL(dev_priv));
+				WARN_ON(!IS_HSW_ULT(dev_priv) &&
+					!IS_BDW_ULT(dev_priv));
 			} else if (id == INTEL_PCH_SPT_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id;
 				dev_priv->pch_type = PCH_SPT;
 				DRM_DEBUG_KMS("Found SunrisePoint PCH\n");
 				WARN_ON(!IS_SKYLAKE(dev_priv) &&
 					!IS_KABYLAKE(dev_priv));
-			} else if (id_ext == INTEL_PCH_SPT_LP_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id_ext;
+			} else if (id == INTEL_PCH_SPT_LP_DEVICE_ID_TYPE) {
 				dev_priv->pch_type = PCH_SPT;
 				DRM_DEBUG_KMS("Found SunrisePoint LP PCH\n");
 				WARN_ON(!IS_SKYLAKE(dev_priv) &&
 					!IS_KABYLAKE(dev_priv));
 			} else if (id == INTEL_PCH_KBP_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id;
 				dev_priv->pch_type = PCH_KBP;
-				DRM_DEBUG_KMS("Found KabyPoint PCH\n");
+				DRM_DEBUG_KMS("Found Kaby Lake PCH (KBP)\n");
 				WARN_ON(!IS_SKYLAKE(dev_priv) &&
 					!IS_KABYLAKE(dev_priv));
 			} else if (id == INTEL_PCH_CNP_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id;
 				dev_priv->pch_type = PCH_CNP;
-				DRM_DEBUG_KMS("Found CannonPoint PCH\n");
+				DRM_DEBUG_KMS("Found Cannon Lake PCH (CNP)\n");
 				WARN_ON(!IS_CANNONLAKE(dev_priv) &&
 					!IS_COFFEELAKE(dev_priv));
-			} else if (id_ext == INTEL_PCH_CNP_LP_DEVICE_ID_TYPE) {
-				dev_priv->pch_id = id_ext;
+			} else if (id == INTEL_PCH_CNP_LP_DEVICE_ID_TYPE) {
 				dev_priv->pch_type = PCH_CNP;
-				DRM_DEBUG_KMS("Found CannonPoint LP PCH\n");
+				DRM_DEBUG_KMS("Found Cannon Lake LP PCH (CNP-LP)\n");
 				WARN_ON(!IS_CANNONLAKE(dev_priv) &&
 					!IS_COFFEELAKE(dev_priv));
-			} else if ((id == INTEL_PCH_P2X_DEVICE_ID_TYPE) ||
-				   (id == INTEL_PCH_P3X_DEVICE_ID_TYPE) ||
-				   ((id == INTEL_PCH_QEMU_DEVICE_ID_TYPE) &&
+			} else if (id == INTEL_PCH_P2X_DEVICE_ID_TYPE ||
+				   id == INTEL_PCH_P3X_DEVICE_ID_TYPE ||
+				   (id == INTEL_PCH_QEMU_DEVICE_ID_TYPE &&
 				    pch->subsystem_vendor ==
 					    PCI_SUBVENDOR_ID_REDHAT_QUMRANET &&
 				    pch->subsystem_device ==
 					    PCI_SUBDEVICE_ID_QEMU)) {
-				dev_priv->pch_id = id;
 				dev_priv->pch_type =
 					intel_virt_detect_pch(dev_priv);
 			} else
@@ -331,6 +340,8 @@ static int i915_getparam(struct drm_device *dev, void *data,
 		break;
 	case I915_PARAM_HAS_GPU_RESET:
 		value = i915.enable_hangcheck && intel_has_gpu_reset(dev_priv);
+		if (value && intel_has_reset_engine(dev_priv))
+			value = 2;
 		break;
 	case I915_PARAM_HAS_RESOURCE_STREAMER:
 		value = HAS_RESOURCE_STREAMER(dev_priv);
@@ -377,6 +388,7 @@ static int i915_getparam(struct drm_device *dev, void *data,
 	case I915_PARAM_HAS_EXEC_FENCE:
 	case I915_PARAM_HAS_EXEC_CAPTURE:
 	case I915_PARAM_HAS_EXEC_BATCH_FIRST:
+	case I915_PARAM_HAS_EXEC_FENCE_ARRAY:
 		/* For the time being all of these are always true;
 		 * if some supported hardware does not have one of these
 		 * features this value needs to be provided from
@@ -585,16 +597,19 @@ static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
 
 static void i915_gem_fini(struct drm_i915_private *dev_priv)
 {
+	/* Flush any outstanding unpin_work. */
+	i915_gem_drain_workqueue(dev_priv);
+
 	mutex_lock(&dev_priv->drm.struct_mutex);
 	intel_uc_fini_hw(dev_priv);
 	i915_gem_cleanup_engines(dev_priv);
-	i915_gem_context_fini(dev_priv);
+	i915_gem_contexts_fini(dev_priv);
 	i915_gem_cleanup_userptr(dev_priv);
 	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	i915_gem_drain_freed_objects(dev_priv);
 
-	WARN_ON(!list_empty(&dev_priv->context_list));
+	WARN_ON(!list_empty(&dev_priv->contexts.list));
 }
 
 static int i915_load_modeset_init(struct drm_device *dev)
@@ -862,7 +877,6 @@ static int i915_driver_init_early(struct drm_i915_private *dev_priv,
 	spin_lock_init(&dev_priv->uncore.lock);
 
 	spin_lock_init(&dev_priv->mm.object_stat_lock);
-	spin_lock_init(&dev_priv->mmio_flip_lock);
 	mutex_init(&dev_priv->sb_lock);
 	mutex_init(&dev_priv->modeset_restore_lock);
 	mutex_init(&dev_priv->av_mutex);
@@ -1227,6 +1241,7 @@ static void i915_driver_register(struct drm_i915_private *dev_priv)
  */
 static void i915_driver_unregister(struct drm_i915_private *dev_priv)
 {
+	intel_fbdev_unregister(dev_priv);
 	intel_audio_deinit(dev_priv);
 
 	intel_gpu_ips_teardown();
@@ -1319,7 +1334,7 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	ret = i915_load_modeset_init(&dev_priv->drm);
 	if (ret < 0)
-		goto out_cleanup_vblank;
+		goto out_cleanup_hw;
 
 	i915_driver_register(dev_priv);
 
@@ -1336,8 +1351,6 @@ int i915_driver_load(struct pci_dev *pdev, const struct pci_device_id *ent)
 
 	return 0;
 
-out_cleanup_vblank:
-	drm_vblank_cleanup(&dev_priv->drm);
 out_cleanup_hw:
 	i915_driver_cleanup_hw(dev_priv);
 out_cleanup_mmio:
@@ -1360,7 +1373,7 @@ void i915_driver_unload(struct drm_device *dev)
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 
-	intel_fbdev_fini(dev);
+	i915_driver_unregister(dev_priv);
 
 	if (i915_gem_suspend(dev_priv))
 		DRM_ERROR("failed to idle hardware; continuing to unload!\n");
@@ -1371,10 +1384,6 @@ void i915_driver_unload(struct drm_device *dev)
 
 	intel_gvt_cleanup(dev_priv);
 
-	i915_driver_unregister(dev_priv);
-
-	drm_vblank_cleanup(dev);
-
 	intel_modeset_cleanup(dev);
 
 	/*
@@ -1400,9 +1409,6 @@ void i915_driver_unload(struct drm_device *dev)
 	cancel_delayed_work_sync(&dev_priv->gpu_error.hangcheck_work);
 	i915_reset_error_state(dev_priv);
 
-	/* Flush any outstanding unpin_work. */
-	drain_workqueue(dev_priv->wq);
-
 	i915_gem_fini(dev_priv);
 	intel_uc_fini_fw(dev_priv);
 	intel_fbc_cleanup_cfb(dev_priv);
@@ -1427,9 +1433,10 @@ static void i915_driver_release(struct drm_device *dev)
 
 static int i915_driver_open(struct drm_device *dev, struct drm_file *file)
 {
+	struct drm_i915_private *i915 = to_i915(dev);
 	int ret;
 
-	ret = i915_gem_open(dev, file);
+	ret = i915_gem_open(i915, file);
 	if (ret)
 		return ret;
 
@@ -1459,7 +1466,7 @@ static void i915_driver_postclose(struct drm_device *dev, struct drm_file *file)
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 
 	mutex_lock(&dev->struct_mutex);
-	i915_gem_context_close(dev, file);
+	i915_gem_context_close(file);
 	i915_gem_release(dev, file);
 	mutex_unlock(&dev->struct_mutex);
 
@@ -1825,7 +1832,8 @@ static int i915_resume_switcheroo(struct drm_device *dev)
 
 /**
  * i915_reset - reset chip after a hang
- * @dev_priv: device private to reset
+ * @i915: #drm_i915_private to reset
+ * @flags: Reset options, e.g. #I915_RESET_QUIET to suppress the notice
  *
  * Reset the chip.  Useful if a hang is detected. Marks the device as wedged
  * on failure.
@@ -1840,33 +1848,34 @@ static int i915_resume_switcheroo(struct drm_device *dev)
  *   - re-init interrupt state
  *   - re-init display
  */
-void i915_reset(struct drm_i915_private *dev_priv)
+void i915_reset(struct drm_i915_private *i915, unsigned int flags)
 {
-	struct i915_gpu_error *error = &dev_priv->gpu_error;
+	struct i915_gpu_error *error = &i915->gpu_error;
 	int ret;
 
-	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+	lockdep_assert_held(&i915->drm.struct_mutex);
 	GEM_BUG_ON(!test_bit(I915_RESET_BACKOFF, &error->flags));
 
 	if (!test_bit(I915_RESET_HANDOFF, &error->flags))
 		return;
 
 	/* Clear any previous failed attempts at recovery. Time to try again. */
-	if (!i915_gem_unset_wedged(dev_priv))
+	if (!i915_gem_unset_wedged(i915))
 		goto wakeup;
 
+	if (!(flags & I915_RESET_QUIET))
+		dev_notice(i915->drm.dev, "Resetting chip after gpu hang\n");
 	error->reset_count++;
 
-	pr_notice("drm/i915: Resetting chip after gpu hang\n");
-	disable_irq(dev_priv->drm.irq);
-	ret = i915_gem_reset_prepare(dev_priv);
+	disable_irq(i915->drm.irq);
+	ret = i915_gem_reset_prepare(i915);
 	if (ret) {
 		DRM_ERROR("GPU recovery failed\n");
-		intel_gpu_reset(dev_priv, ALL_ENGINES);
+		intel_gpu_reset(i915, ALL_ENGINES);
 		goto error;
 	}
 
-	ret = intel_gpu_reset(dev_priv, ALL_ENGINES);
+	ret = intel_gpu_reset(i915, ALL_ENGINES);
 	if (ret) {
 		if (ret != -ENODEV)
 			DRM_ERROR("Failed to reset chip: %i\n", ret);
@@ -1875,8 +1884,8 @@ void i915_reset(struct drm_i915_private *dev_priv)
 		goto error;
 	}
 
-	i915_gem_reset(dev_priv);
-	intel_overlay_reset(dev_priv);
+	i915_gem_reset(i915);
+	intel_overlay_reset(i915);
 
 	/* Ok, now get things going again... */
 
@@ -1892,17 +1901,17 @@ void i915_reset(struct drm_i915_private *dev_priv)
 	 * was running at the time of the reset (i.e. we weren't VT
 	 * switched away).
 	 */
-	ret = i915_gem_init_hw(dev_priv);
+	ret = i915_gem_init_hw(i915);
 	if (ret) {
 		DRM_ERROR("Failed hw init on reset %d\n", ret);
 		goto error;
 	}
 
-	i915_queue_hangcheck(dev_priv);
+	i915_queue_hangcheck(i915);
 
 finish:
-	i915_gem_reset_finish(dev_priv);
-	enable_irq(dev_priv->drm.irq);
+	i915_gem_reset_finish(i915);
+	enable_irq(i915->drm.irq);
 
 wakeup:
 	clear_bit(I915_RESET_HANDOFF, &error->flags);
@@ -1910,10 +1919,74 @@ void i915_reset(struct drm_i915_private *dev_priv)
 	return;
 
 error:
-	i915_gem_set_wedged(dev_priv);
+	i915_gem_set_wedged(i915);
+	i915_gem_retire_requests(i915);
 	goto finish;
 }
 
+/**
+ * i915_reset_engine - reset GPU engine to recover from a hang
+ * @engine: engine to reset
+ * @flags: options
+ *
+ * Reset a specific GPU engine. Useful if a hang is detected.
+ * Returns zero on successful reset or otherwise an error code.
+ *
+ * Procedure is:
+ *  - identify the request that caused the hang and drop it
+ *  - reset engine (which will force the engine to idle)
+ *  - re-init/configure engine
+ */
+int i915_reset_engine(struct intel_engine_cs *engine, unsigned int flags)
+{
+	struct i915_gpu_error *error = &engine->i915->gpu_error;
+	struct drm_i915_gem_request *active_request;
+	int ret;
+
+	GEM_BUG_ON(!test_bit(I915_RESET_ENGINE + engine->id, &error->flags));
+
+	if (!(flags & I915_RESET_QUIET)) {
+		dev_notice(engine->i915->drm.dev,
+			   "Resetting %s after gpu hang\n", engine->name);
+	}
+	error->reset_engine_count[engine->id]++;
+
+	active_request = i915_gem_reset_prepare_engine(engine);
+	if (IS_ERR(active_request)) {
+		DRM_DEBUG_DRIVER("Previous reset failed, promote to full reset\n");
+		ret = PTR_ERR(active_request);
+		goto out;
+	}
+
+	ret = intel_gpu_reset(engine->i915, intel_engine_flag(engine));
+	if (ret) {
+		/* If we fail here, we expect to fallback to a global reset */
+		DRM_DEBUG_DRIVER("Failed to reset %s, ret=%d\n",
+				 engine->name, ret);
+		goto out;
+	}
+
+	/*
+	 * The request that caused the hang is stuck on elsp, we know the
+	 * active request and can drop it, adjust head to skip the offending
+	 * request to resume executing remaining requests in the queue.
+	 */
+	i915_gem_reset_engine(engine, active_request);
+
+	/*
+	 * The engine and its registers (and workarounds in case of render)
+	 * have been reset to their default values. Follow the init_ring
+	 * process to program RING_MODE, HWSP and re-enable submission.
+	 */
+	ret = engine->init_hw(engine);
+	if (ret)
+		goto out;
+
+out:
+	i915_gem_reset_finish_engine(engine);
+	return ret;
+}
+
 static int i915_pm_suspend(struct device *kdev)
 {
 	struct pci_dev *pdev = to_pci_dev(kdev);
@@ -2657,6 +2730,8 @@ static const struct drm_ioctl_desc i915_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_GETPARAM, i915_gem_context_getparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_GEM_CONTEXT_SETPARAM, i915_gem_context_setparam_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(I915_PERF_OPEN, i915_perf_open_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_PERF_ADD_CONFIG, i915_perf_add_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(I915_PERF_REMOVE_CONFIG, i915_perf_remove_config_ioctl, DRM_UNLOCKED|DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver driver = {
@@ -2665,12 +2740,11 @@ static struct drm_driver driver = {
 	 */
 	.driver_features =
 	    DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED | DRIVER_GEM | DRIVER_PRIME |
-	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC,
+	    DRIVER_RENDER | DRIVER_MODESET | DRIVER_ATOMIC | DRIVER_SYNCOBJ,
 	.release = i915_driver_release,
 	.open = i915_driver_open,
 	.lastclose = i915_driver_lastclose,
 	.postclose = i915_driver_postclose,
-	.set_busid = drm_pci_set_busid,
 
 	.gem_close_object = i915_gem_close_object,
 	.gem_free_object_unlocked = i915_gem_free_object,
@@ -2683,7 +2757,6 @@ static struct drm_driver driver = {
 
 	.dumb_create = i915_gem_dumb_create,
 	.dumb_map_offset = i915_gem_mmap_gtt,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.ioctls = i915_ioctls,
 	.num_ioctls = ARRAY_SIZE(i915_ioctls),
 	.fops = &i915_driver_fops,
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index e1f7c97..60267e3 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -80,8 +80,8 @@
 
 #define DRIVER_NAME		"i915"
 #define DRIVER_DESC		"Intel Graphics"
-#define DRIVER_DATE		"20170619"
-#define DRIVER_TIMESTAMP	1497857498
+#define DRIVER_DATE		"20170818"
+#define DRIVER_TIMESTAMP	1503088845
 
 /* Use I915_STATE_WARN(x) and I915_STATE_WARN_ON() (rather than WARN() and
  * WARN_ON()) for hw state sanity checks to check for unexpected conditions
@@ -122,7 +122,7 @@ static inline bool is_fixed16_zero(uint_fixed_16_16_t val)
 	return false;
 }
 
-static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
+static inline uint_fixed_16_16_t u32_to_fixed16(uint32_t val)
 {
 	uint_fixed_16_16_t fp;
 
@@ -132,17 +132,17 @@ static inline uint_fixed_16_16_t u32_to_fixed_16_16(uint32_t val)
 	return fp;
 }
 
-static inline uint32_t fixed_16_16_to_u32_round_up(uint_fixed_16_16_t fp)
+static inline uint32_t fixed16_to_u32_round_up(uint_fixed_16_16_t fp)
 {
 	return DIV_ROUND_UP(fp.val, 1 << 16);
 }
 
-static inline uint32_t fixed_16_16_to_u32(uint_fixed_16_16_t fp)
+static inline uint32_t fixed16_to_u32(uint_fixed_16_16_t fp)
 {
 	return fp.val >> 16;
 }
 
-static inline uint_fixed_16_16_t min_fixed_16_16(uint_fixed_16_16_t min1,
+static inline uint_fixed_16_16_t min_fixed16(uint_fixed_16_16_t min1,
 						 uint_fixed_16_16_t min2)
 {
 	uint_fixed_16_16_t min;
@@ -151,7 +151,7 @@ static inline uint_fixed_16_16_t min_fixed_16_16(uint_fixed_16_16_t min1,
 	return min;
 }
 
-static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
+static inline uint_fixed_16_16_t max_fixed16(uint_fixed_16_16_t max1,
 						 uint_fixed_16_16_t max2)
 {
 	uint_fixed_16_16_t max;
@@ -160,6 +160,14 @@ static inline uint_fixed_16_16_t max_fixed_16_16(uint_fixed_16_16_t max1,
 	return max;
 }
 
+static inline uint_fixed_16_16_t clamp_u64_to_fixed16(uint64_t val)
+{
+	uint_fixed_16_16_t fp;
+	WARN_ON(val >> 32);
+	fp.val = clamp_t(uint32_t, val, 0, ~0);
+	return fp;
+}
+
 static inline uint32_t div_round_up_fixed16(uint_fixed_16_16_t val,
 					    uint_fixed_16_16_t d)
 {
@@ -170,48 +178,30 @@ static inline uint32_t mul_round_up_u32_fixed16(uint32_t val,
 						uint_fixed_16_16_t mul)
 {
 	uint64_t intermediate_val;
-	uint32_t result;
 
 	intermediate_val = (uint64_t) val * mul.val;
 	intermediate_val = DIV_ROUND_UP_ULL(intermediate_val, 1 << 16);
 	WARN_ON(intermediate_val >> 32);
-	result = clamp_t(uint32_t, intermediate_val, 0, ~0);
-	return result;
+	return clamp_t(uint32_t, intermediate_val, 0, ~0);
 }
 
 static inline uint_fixed_16_16_t mul_fixed16(uint_fixed_16_16_t val,
 					     uint_fixed_16_16_t mul)
 {
 	uint64_t intermediate_val;
-	uint_fixed_16_16_t fp;
 
 	intermediate_val = (uint64_t) val.val * mul.val;
 	intermediate_val = intermediate_val >> 16;
-	WARN_ON(intermediate_val >> 32);
-	fp.val = clamp_t(uint32_t, intermediate_val, 0, ~0);
-	return fp;
+	return clamp_u64_to_fixed16(intermediate_val);
 }
 
-static inline uint_fixed_16_16_t fixed_16_16_div(uint32_t val, uint32_t d)
+static inline uint_fixed_16_16_t div_fixed16(uint32_t val, uint32_t d)
 {
-	uint_fixed_16_16_t fp, res;
-
-	fp = u32_to_fixed_16_16(val);
-	res.val = DIV_ROUND_UP(fp.val, d);
-	return res;
-}
-
-static inline uint_fixed_16_16_t fixed_16_16_div_u64(uint32_t val, uint32_t d)
-{
-	uint_fixed_16_16_t res;
 	uint64_t interm_val;
 
 	interm_val = (uint64_t)val << 16;
 	interm_val = DIV_ROUND_UP_ULL(interm_val, d);
-	WARN_ON(interm_val >> 32);
-	res.val = (uint32_t) interm_val;
-
-	return res;
+	return clamp_u64_to_fixed16(interm_val);
 }
 
 static inline uint32_t div_round_up_u32_fixed16(uint32_t val,
@@ -225,16 +215,32 @@ static inline uint32_t div_round_up_u32_fixed16(uint32_t val,
 	return clamp_t(uint32_t, interm_val, 0, ~0);
 }
 
-static inline uint_fixed_16_16_t mul_u32_fixed_16_16(uint32_t val,
+static inline uint_fixed_16_16_t mul_u32_fixed16(uint32_t val,
 						     uint_fixed_16_16_t mul)
 {
 	uint64_t intermediate_val;
-	uint_fixed_16_16_t fp;
 
 	intermediate_val = (uint64_t) val * mul.val;
-	WARN_ON(intermediate_val >> 32);
-	fp.val = (uint32_t) intermediate_val;
-	return fp;
+	return clamp_u64_to_fixed16(intermediate_val);
+}
+
+static inline uint_fixed_16_16_t add_fixed16(uint_fixed_16_16_t add1,
+					     uint_fixed_16_16_t add2)
+{
+	uint64_t interm_sum;
+
+	interm_sum = (uint64_t) add1.val + add2.val;
+	return clamp_u64_to_fixed16(interm_sum);
+}
+
+static inline uint_fixed_16_16_t add_fixed16_u32(uint_fixed_16_16_t add1,
+						 uint32_t add2)
+{
+	uint64_t interm_sum;
+	uint_fixed_16_16_t interm_add2 = u32_to_fixed16(add2);
+
+	interm_sum = (uint64_t) add1.val + interm_add2.val;
+	return clamp_u64_to_fixed16(interm_sum);
 }
 
 static inline const char *yesno(bool v)
@@ -584,8 +590,7 @@ struct drm_i915_file_private {
 	struct idr context_idr;
 
 	struct intel_rps_client {
-		struct list_head link;
-		unsigned boosts;
+		atomic_t boosts;
 	} rps;
 
 	unsigned int bsd_engine;
@@ -597,7 +602,7 @@ struct drm_i915_file_private {
  * to limit the badly behaving clients access to gpu.
  */
 #define I915_MAX_CLIENT_CONTEXT_BANS 3
-	int context_bans;
+	atomic_t context_bans;
 };
 
 /* Used by dp and fdi links */
@@ -641,6 +646,7 @@ struct intel_opregion {
 	u32 swsci_sbcb_sub_functions;
 	struct opregion_asle *asle;
 	void *rvda;
+	void *vbt_firmware;
 	const void *vbt;
 	u32 vbt_size;
 	u32 *lid_state;
@@ -710,11 +716,6 @@ struct drm_i915_display_funcs {
 	void (*fdi_link_train)(struct intel_crtc *crtc,
 			       const struct intel_crtc_state *crtc_state);
 	void (*init_clock_gating)(struct drm_i915_private *dev_priv);
-	int (*queue_flip)(struct drm_device *dev, struct drm_crtc *crtc,
-			  struct drm_framebuffer *fb,
-			  struct drm_i915_gem_object *obj,
-			  struct drm_i915_gem_request *req,
-			  uint32_t flags);
 	void (*hpd_irq_setup)(struct drm_i915_private *dev_priv);
 	/* clock updates for mode set */
 	/* cursor updates */
@@ -753,6 +754,7 @@ struct intel_csr {
 	func(has_csr); \
 	func(has_ddi); \
 	func(has_dp_mst); \
+	func(has_reset_engine); \
 	func(has_fbc); \
 	func(has_fpga_dbg); \
 	func(has_full_ppgtt); \
@@ -917,6 +919,7 @@ struct i915_gpu_state {
 		enum intel_engine_hangcheck_action hangcheck_action;
 		struct i915_address_space *vm;
 		int num_requests;
+		u32 reset_count;
 
 		/* position of active request inside the ring */
 		u32 rq_head, rq_post, rq_tail;
@@ -1056,6 +1059,11 @@ struct intel_fbc {
 	bool underrun_detected;
 	struct work_struct underrun_work;
 
+	/*
+	 * Due to the atomic rules we can't access some structures without the
+	 * appropriate locking, so we cache information here in order to avoid
+	 * these problems.
+	 */
 	struct intel_fbc_state_cache {
 		struct i915_vma *vma;
 
@@ -1077,6 +1085,13 @@ struct intel_fbc {
 		} fb;
 	} state_cache;
 
+	/*
+	 * This structure contains everything that's relevant to program the
+	 * hardware registers. When we want to figure out if we need to disable
+	 * and re-enable FBC for a new configuration we just check if there's
+	 * something different in the struct. The genx_fbc_activate functions
+	 * are supposed to read from it in order to program the registers.
+	 */
 	struct intel_fbc_reg_params {
 		struct i915_vma *vma;
 
@@ -1149,11 +1164,11 @@ struct i915_psr {
 enum intel_pch {
 	PCH_NONE = 0,	/* No PCH present */
 	PCH_IBX,	/* Ibexpeak PCH */
-	PCH_CPT,	/* Cougarpoint PCH */
-	PCH_LPT,	/* Lynxpoint PCH */
+	PCH_CPT,	/* Cougarpoint/Pantherpoint PCH */
+	PCH_LPT,	/* Lynxpoint/Wildcatpoint PCH */
 	PCH_SPT,        /* Sunrisepoint PCH */
-	PCH_KBP,        /* Kabypoint PCH */
-	PCH_CNP,        /* Cannonpoint PCH */
+	PCH_KBP,        /* Kaby Lake PCH */
+	PCH_CNP,        /* Cannon Lake PCH */
 	PCH_NOP,
 };
 
@@ -1166,6 +1181,7 @@ enum intel_sbi_destination {
 #define QUIRK_INVERT_BRIGHTNESS (1<<2)
 #define QUIRK_BACKLIGHT_PRESENT (1<<3)
 #define QUIRK_PIN_SWIZZLED_PAGES (1<<5)
+#define QUIRK_INCREASE_T12_DELAY (1<<6)
 
 struct intel_fbdev;
 struct intel_fbc_work;
@@ -1301,13 +1317,10 @@ struct intel_gen6_power_mgmt {
 	int last_adj;
 	enum { LOW_POWER, BETWEEN, HIGH_POWER } power;
 
-	spinlock_t client_lock;
-	struct list_head clients;
-	bool client_boost;
-
 	bool enabled;
 	struct delayed_work autoenable_work;
-	unsigned boosts;
+	atomic_t num_waiters;
+	atomic_t boosts;
 
 	/* manual wa residency calculations */
 	struct intel_rps_ei ei;
@@ -1383,12 +1396,23 @@ struct i915_power_well {
 	bool hw_enabled;
 	u64 domains;
 	/* unique identifier for this power well */
-	unsigned long id;
+	enum i915_power_well_id id;
 	/*
 	 * Arbitraty data associated with this power well. Platform and power
 	 * well specific.
 	 */
-	unsigned long data;
+	union {
+		struct {
+			enum dpio_phy phy;
+		} bxt;
+		struct {
+			/* Mask of pipes whose IRQ logic is backed by the pw */
+			u8 irq_pipe_mask;
+			/* The pw is backing the VGA functionality */
+			bool has_vga:1;
+			bool has_fuses:1;
+		} hsw;
+	};
 	const struct i915_power_well_ops *ops;
 };
 
@@ -1505,6 +1529,8 @@ struct i915_gpu_error {
 	/* Protected by the above dev->gpu_error.lock. */
 	struct i915_gpu_state *first_error;
 
+	atomic_t pending_fb_pin;
+
 	unsigned long missed_irq_rings;
 
 	/**
@@ -1550,6 +1576,12 @@ struct i915_gpu_error {
 	 * inspect the bit and do the reset directly, otherwise the worker
 	 * waits for the struct_mutex.
 	 *
+	 * #I915_RESET_ENGINE[num_engines] - Since the driver doesn't need to
+	 * acquire the struct_mutex to reset an engine, we need an explicit
+	 * flag to prevent two concurrent reset attempts in the same engine.
+	 * As the number of engines continues to grow, allocate the flags from
+	 * the most significant bits.
+	 *
 	 * #I915_WEDGED - If reset fails and we can no longer use the GPU,
 	 * we set the #I915_WEDGED bit. Prior to command submission, e.g.
 	 * i915_gem_request_alloc(), this bit is checked and the sequence
@@ -1558,7 +1590,12 @@ struct i915_gpu_error {
 	unsigned long flags;
 #define I915_RESET_BACKOFF	0
 #define I915_RESET_HANDOFF	1
+#define I915_RESET_MODESET	2
 #define I915_WEDGED		(BITS_PER_LONG - 1)
+#define I915_RESET_ENGINE	(I915_WEDGED - I915_NUM_ENGINES)
+
+	/** Number of times an engine has been reset */
+	u32 reset_engine_count[I915_NUM_ENGINES];
 
 	/**
 	 * Waitqueue to signal when a hang is detected. Used to for waiters
@@ -1869,6 +1906,7 @@ struct i915_workarounds {
 
 struct i915_virtual_gpu {
 	bool active;
+	u32 caps;
 };
 
 /* used in computing the new watermarks state */
@@ -1888,6 +1926,24 @@ struct i915_oa_reg {
 	u32 value;
 };
 
+struct i915_oa_config {
+	char uuid[UUID_STRING_LEN + 1];
+	int id;
+
+	const struct i915_oa_reg *mux_regs;
+	u32 mux_regs_len;
+	const struct i915_oa_reg *b_counter_regs;
+	u32 b_counter_regs_len;
+	const struct i915_oa_reg *flex_regs;
+	u32 flex_regs_len;
+
+	struct attribute_group sysfs_metric;
+	struct attribute *attrs[2];
+	struct device_attribute sysfs_metric_id;
+
+	atomic_t ref_count;
+};
+
 struct i915_perf_stream;
 
 /**
@@ -2000,6 +2056,11 @@ struct i915_perf_stream {
 	 * type of configured stream.
 	 */
 	const struct i915_perf_stream_ops *ops;
+
+	/**
+	 * @oa_config: The OA configuration used by the stream.
+	 */
+	struct i915_oa_config *oa_config;
 };
 
 /**
@@ -2007,6 +2068,25 @@ struct i915_perf_stream {
  */
 struct i915_oa_ops {
 	/**
+	 * @is_valid_b_counter_reg: Validates register's address for
+	 * programming boolean counters for a particular platform.
+	 */
+	bool (*is_valid_b_counter_reg)(struct drm_i915_private *dev_priv,
+				       u32 addr);
+
+	/**
+	 * @is_valid_mux_reg: Validates register's address for programming mux
+	 * for a particular platform.
+	 */
+	bool (*is_valid_mux_reg)(struct drm_i915_private *dev_priv, u32 addr);
+
+	/**
+	 * @is_valid_flex_reg: Validates register's address for programming
+	 * flex EU filtering for a particular platform.
+	 */
+	bool (*is_valid_flex_reg)(struct drm_i915_private *dev_priv, u32 addr);
+
+	/**
 	 * @init_oa_buffer: Resets the head and tail pointers of the
 	 * circular buffer for periodic OA reports.
 	 *
@@ -2024,20 +2104,13 @@ struct i915_oa_ops {
 	void (*init_oa_buffer)(struct drm_i915_private *dev_priv);
 
 	/**
-	 * @select_metric_set: The auto generated code that checks whether a
-	 * requested OA config is applicable to the system and if so sets up
-	 * the mux, oa and flex eu register config pointers according to the
-	 * current dev_priv->perf.oa.metrics_set.
-	 */
-	int (*select_metric_set)(struct drm_i915_private *dev_priv);
-
-	/**
 	 * @enable_metric_set: Selects and applies any MUX configuration to set
 	 * up the Boolean and Custom (B/C) counters that are part of the
 	 * counter reports being sampled. May apply system constraints such as
 	 * disabling EU clock gating as required.
 	 */
-	int (*enable_metric_set)(struct drm_i915_private *dev_priv);
+	int (*enable_metric_set)(struct drm_i915_private *dev_priv,
+				 const struct i915_oa_config *oa_config);
 
 	/**
 	 * @disable_metric_set: Remove system constraints associated with using
@@ -2083,6 +2156,7 @@ struct drm_i915_private {
 
 	struct kmem_cache *objects;
 	struct kmem_cache *vmas;
+	struct kmem_cache *luts;
 	struct kmem_cache *requests;
 	struct kmem_cache *dependencies;
 	struct kmem_cache *priorities;
@@ -2133,9 +2207,6 @@ struct drm_i915_private {
 	/* protects the irq masks */
 	spinlock_t irq_lock;
 
-	/* protects the mmio flip data */
-	spinlock_t mmio_flip_lock;
-
 	bool display_irqs_enabled;
 
 	/* To control wakeup latency, e.g. for irq-driven dp aux transfers. */
@@ -2236,18 +2307,10 @@ struct drm_i915_private {
 	DECLARE_HASHTABLE(mm_structs, 7);
 	struct mutex mm_lock;
 
-	/* The hw wants to have a stable context identifier for the lifetime
-	 * of the context (for OA, PASID, faults, etc). This is limited
-	 * in execlists to 21 bits.
-	 */
-	struct ida context_hw_ida;
-#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
-
 	/* Kernel Modesetting */
 
 	struct intel_crtc *plane_to_crtc_mapping[I915_MAX_PIPES];
 	struct intel_crtc *pipe_to_crtc_mapping[I915_MAX_PIPES];
-	wait_queue_head_t pending_flip_queue;
 
 #ifdef CONFIG_DEBUG_FS
 	struct intel_pipe_crc pipe_crc[I915_MAX_PIPES];
@@ -2303,11 +2366,9 @@ struct drm_i915_private {
 
 	struct drm_i915_gem_object *vlv_pctx;
 
-#ifdef CONFIG_DRM_FBDEV_EMULATION
 	/* list of fbdev register on this device */
 	struct intel_fbdev *fbdev;
 	struct work_struct fbdev_suspend_work;
-#endif
 
 	struct drm_property *broadcast_rgb_property;
 	struct drm_property *force_audio_property;
@@ -2321,7 +2382,18 @@ struct drm_i915_private {
 	 */
 	struct mutex av_mutex;
 
-	struct list_head context_list;
+	struct {
+		struct list_head list;
+		struct llist_head free_list;
+		struct work_struct free_work;
+
+		/* The hw wants to have a stable context identifier for the
+		 * lifetime of the context (for OA, PASID, faults, etc).
+		 * This is limited in execlists to 21 bits.
+		 */
+		struct ida hw_ida;
+#define MAX_CONTEXT_HW_ID (1<<21) /* exclusive */
+	} contexts;
 
 	u32 fdi_rx_config;
 
@@ -2399,10 +2471,32 @@ struct drm_i915_private {
 		struct kobject *metrics_kobj;
 		struct ctl_table_header *sysctl_header;
 
+		/*
+		 * Lock associated with adding/modifying/removing OA configs
+		 * in dev_priv->perf.metrics_idr.
+		 */
+		struct mutex metrics_lock;
+
+		/*
+		 * List of dynamic configurations, you need to hold
+		 * dev_priv->perf.metrics_lock to access it.
+		 */
+		struct idr metrics_idr;
+
+		/*
+		 * Lock associated with anything below within this structure
+		 * except exclusive_stream.
+		 */
 		struct mutex lock;
 		struct list_head streams;
 
 		struct {
+			/*
+			 * The stream currently using the OA unit. If accessed
+			 * outside a syscall associated with its file
+			 * descriptor, you need to hold
+			 * dev_priv->drm.struct_mutex.
+			 */
 			struct i915_perf_stream *exclusive_stream;
 
 			u32 specific_ctx_id;
@@ -2421,16 +2515,7 @@ struct drm_i915_private {
 			int period_exponent;
 			int timestamp_frequency;
 
-			int metrics_set;
-
-			const struct i915_oa_reg *mux_regs[6];
-			int mux_regs_lens[6];
-			int n_mux_configs;
-
-			const struct i915_oa_reg *b_counter_regs;
-			int b_counter_regs_len;
-			const struct i915_oa_reg *flex_regs;
-			int flex_regs_len;
+			struct i915_oa_config test_config;
 
 			struct {
 				struct i915_vma *vma;
@@ -2517,7 +2602,6 @@ struct drm_i915_private {
 
 			struct i915_oa_ops ops;
 			const struct i915_oa_format *oa_formats;
-			int n_builtin_sets;
 		} oa;
 	} perf;
 
@@ -2996,16 +3080,17 @@ intel_info(const struct drm_i915_private *dev_priv)
 
 #define HAS_POOLED_EU(dev_priv)	((dev_priv)->info.has_pooled_eu)
 
-#define INTEL_PCH_DEVICE_ID_MASK		0xff00
-#define INTEL_PCH_DEVICE_ID_MASK_EXT		0xff80
+#define INTEL_PCH_DEVICE_ID_MASK		0xff80
 #define INTEL_PCH_IBX_DEVICE_ID_TYPE		0x3b00
 #define INTEL_PCH_CPT_DEVICE_ID_TYPE		0x1c00
 #define INTEL_PCH_PPT_DEVICE_ID_TYPE		0x1e00
 #define INTEL_PCH_LPT_DEVICE_ID_TYPE		0x8c00
 #define INTEL_PCH_LPT_LP_DEVICE_ID_TYPE		0x9c00
+#define INTEL_PCH_WPT_DEVICE_ID_TYPE		0x8c80
+#define INTEL_PCH_WPT_LP_DEVICE_ID_TYPE		0x9c80
 #define INTEL_PCH_SPT_DEVICE_ID_TYPE		0xA100
 #define INTEL_PCH_SPT_LP_DEVICE_ID_TYPE		0x9D00
-#define INTEL_PCH_KBP_DEVICE_ID_TYPE		0xA200
+#define INTEL_PCH_KBP_DEVICE_ID_TYPE		0xA280
 #define INTEL_PCH_CNP_DEVICE_ID_TYPE		0xA300
 #define INTEL_PCH_CNP_LP_DEVICE_ID_TYPE		0x9D80
 #define INTEL_PCH_P2X_DEVICE_ID_TYPE		0x7100
@@ -3020,9 +3105,11 @@ intel_info(const struct drm_i915_private *dev_priv)
 #define HAS_PCH_SPT(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_SPT)
 #define HAS_PCH_LPT(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_LPT)
 #define HAS_PCH_LPT_LP(dev_priv) \
-	((dev_priv)->pch_id == INTEL_PCH_LPT_LP_DEVICE_ID_TYPE)
+	((dev_priv)->pch_id == INTEL_PCH_LPT_LP_DEVICE_ID_TYPE || \
+	 (dev_priv)->pch_id == INTEL_PCH_WPT_LP_DEVICE_ID_TYPE)
 #define HAS_PCH_LPT_H(dev_priv) \
-	((dev_priv)->pch_id == INTEL_PCH_LPT_DEVICE_ID_TYPE)
+	((dev_priv)->pch_id == INTEL_PCH_LPT_DEVICE_ID_TYPE || \
+	 (dev_priv)->pch_id == INTEL_PCH_WPT_DEVICE_ID_TYPE)
 #define HAS_PCH_CPT(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_CPT)
 #define HAS_PCH_IBX(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_IBX)
 #define HAS_PCH_NOP(dev_priv) (INTEL_PCH_TYPE(dev_priv) == PCH_NOP)
@@ -3088,7 +3175,13 @@ extern int i915_driver_load(struct pci_dev *pdev,
 extern void i915_driver_unload(struct drm_device *dev);
 extern int intel_gpu_reset(struct drm_i915_private *dev_priv, u32 engine_mask);
 extern bool intel_has_gpu_reset(struct drm_i915_private *dev_priv);
-extern void i915_reset(struct drm_i915_private *dev_priv);
+
+#define I915_RESET_QUIET BIT(0)
+extern void i915_reset(struct drm_i915_private *i915, unsigned int flags);
+extern int i915_reset_engine(struct intel_engine_cs *engine,
+			     unsigned int flags);
+
+extern bool intel_has_reset_engine(struct drm_i915_private *dev_priv);
 extern int intel_guc_reset(struct drm_i915_private *dev_priv);
 extern void intel_engine_init_hangcheck(struct intel_engine_cs *engine);
 extern void intel_hangcheck_init(struct drm_i915_private *dev_priv);
@@ -3107,7 +3200,8 @@ void intel_hpd_irq_handler(struct drm_i915_private *dev_priv,
 void intel_hpd_init(struct drm_i915_private *dev_priv);
 void intel_hpd_init_work(struct drm_i915_private *dev_priv);
 void intel_hpd_cancel_work(struct drm_i915_private *dev_priv);
-bool intel_hpd_pin_to_port(enum hpd_pin pin, enum port *port);
+enum port intel_hpd_pin_to_port(enum hpd_pin pin);
+enum hpd_pin intel_hpd_pin(enum port port);
 bool intel_hpd_disable(struct drm_i915_private *dev_priv, enum hpd_pin pin);
 void intel_hpd_enable(struct drm_i915_private *dev_priv, enum hpd_pin pin);
 
@@ -3276,6 +3370,26 @@ static inline void i915_gem_drain_freed_objects(struct drm_i915_private *i915)
 	} while (flush_work(&i915->mm.free_work));
 }
 
+static inline void i915_gem_drain_workqueue(struct drm_i915_private *i915)
+{
+	/*
+	 * Similar to objects above (see i915_gem_drain_freed_objects()), in
+	 * general we have workers that are armed by RCU and then rearm
+	 * themselves in their callbacks. To be paranoid, we need to
+	 * drain the workqueue a second time after waiting for the RCU
+	 * grace period so that we catch work queued via RCU from the first
+	 * pass. As neither drain_workqueue() nor flush_workqueue() reports
+	 * a result, we assume that no more than 2 passes are required
+	 * to catch all recursive RCU delayed work.
+	 *
+	 */
+	int pass = 2;
+	do {
+		rcu_barrier();
+		drain_workqueue(i915->wq);
+	} while (--pass);
+}
+
 struct i915_vma * __must_check
 i915_gem_object_ggtt_pin(struct drm_i915_gem_object *obj,
 			 const struct i915_ggtt_view *view,
@@ -3461,11 +3575,22 @@ static inline u32 i915_reset_count(struct i915_gpu_error *error)
 	return READ_ONCE(error->reset_count);
 }
 
+static inline u32 i915_reset_engine_count(struct i915_gpu_error *error,
+					  struct intel_engine_cs *engine)
+{
+	return READ_ONCE(error->reset_engine_count[engine->id]);
+}
+
+struct drm_i915_gem_request *
+i915_gem_reset_prepare_engine(struct intel_engine_cs *engine);
 int i915_gem_reset_prepare(struct drm_i915_private *dev_priv);
 void i915_gem_reset(struct drm_i915_private *dev_priv);
+void i915_gem_reset_finish_engine(struct intel_engine_cs *engine);
 void i915_gem_reset_finish(struct drm_i915_private *dev_priv);
 void i915_gem_set_wedged(struct drm_i915_private *dev_priv);
 bool i915_gem_unset_wedged(struct drm_i915_private *dev_priv);
+void i915_gem_reset_engine(struct intel_engine_cs *engine,
+			   struct drm_i915_gem_request *request);
 
 void i915_gem_init_mmio(struct drm_i915_private *i915);
 int __must_check i915_gem_init(struct drm_i915_private *dev_priv);
@@ -3499,7 +3624,7 @@ i915_gem_object_pin_to_display_plane(struct drm_i915_gem_object *obj,
 void i915_gem_object_unpin_from_display_plane(struct i915_vma *vma);
 int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
 				int align);
-int i915_gem_open(struct drm_device *dev, struct drm_file *file);
+int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file);
 void i915_gem_release(struct drm_device *dev, struct drm_file *file);
 
 int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
@@ -3531,40 +3656,25 @@ void i915_gem_object_save_bit_17_swizzle(struct drm_i915_gem_object *obj,
 					 struct sg_table *pages);
 
 static inline struct i915_gem_context *
+__i915_gem_context_lookup_rcu(struct drm_i915_file_private *file_priv, u32 id)
+{
+	return idr_find(&file_priv->context_idr, id);
+}
+
+static inline struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id)
 {
 	struct i915_gem_context *ctx;
 
-	lockdep_assert_held(&file_priv->dev_priv->drm.struct_mutex);
-
-	ctx = idr_find(&file_priv->context_idr, id);
-	if (!ctx)
-		return ERR_PTR(-ENOENT);
+	rcu_read_lock();
+	ctx = __i915_gem_context_lookup_rcu(file_priv, id);
+	if (ctx && !kref_get_unless_zero(&ctx->ref))
+		ctx = NULL;
+	rcu_read_unlock();
 
 	return ctx;
 }
 
-static inline struct i915_gem_context *
-i915_gem_context_get(struct i915_gem_context *ctx)
-{
-	kref_get(&ctx->ref);
-	return ctx;
-}
-
-static inline void i915_gem_context_put(struct i915_gem_context *ctx)
-{
-	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
-	kref_put(&ctx->ref, i915_gem_context_free);
-}
-
-static inline void i915_gem_context_put_unlocked(struct i915_gem_context *ctx)
-{
-	struct mutex *lock = &ctx->i915->drm.struct_mutex;
-
-	if (kref_put_mutex(&ctx->ref, i915_gem_context_free, lock))
-		mutex_unlock(lock);
-}
-
 static inline struct intel_timeline *
 i915_gem_context_lookup_timeline(struct i915_gem_context *ctx,
 				 struct intel_engine_cs *engine)
@@ -3577,6 +3687,10 @@ i915_gem_context_lookup_timeline(struct i915_gem_context *ctx,
 
 int i915_perf_open_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file);
+int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
+			       struct drm_file *file);
+int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
+				  struct drm_file *file);
 void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 			    struct i915_gem_context *ctx,
 			    uint32_t *reg_state);
@@ -4064,6 +4178,11 @@ static inline unsigned long msecs_to_jiffies_timeout(const unsigned int m)
 
 static inline unsigned long nsecs_to_jiffies_timeout(const u64 n)
 {
+	/* nsecs_to_jiffies64() does not guard against overflow */
+	if (NSEC_PER_SEC % HZ &&
+	    div_u64(n, NSEC_PER_SEC) >= MAX_JIFFY_OFFSET / HZ)
+		return MAX_JIFFY_OFFSET;
+
         return min_t(u64, MAX_JIFFY_OFFSET, nsecs_to_jiffies64(n) + 1);
 }
 
@@ -4210,10 +4329,11 @@ int remap_io_mapping(struct vm_area_struct *vma,
 		     unsigned long addr, unsigned long pfn, unsigned long size,
 		     struct io_mapping *iomap);
 
-static inline bool i915_gem_object_is_coherent(struct drm_i915_gem_object *obj)
+static inline bool
+intel_engine_can_store_dword(struct intel_engine_cs *engine)
 {
-	return (obj->cache_level != I915_CACHE_NONE ||
-		HAS_LLC(to_i915(obj->base.dev)));
+	return __intel_engine_can_store_dword(INTEL_GEN(engine->i915),
+					      engine->class);
 }
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 969bac8..b9e8e0d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -52,7 +52,7 @@ static bool cpu_write_needs_clflush(struct drm_i915_gem_object *obj)
 	if (obj->cache_dirty)
 		return false;
 
-	if (!obj->cache_coherent)
+	if (!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE))
 		return true;
 
 	return obj->pin_display;
@@ -253,7 +253,7 @@ __i915_gem_object_release_shmem(struct drm_i915_gem_object *obj,
 
 	if (needs_clflush &&
 	    (obj->base.read_domains & I915_GEM_DOMAIN_CPU) == 0 &&
-	    !obj->cache_coherent)
+	    !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ))
 		drm_clflush_sg(pages);
 
 	__start_cpu_write(obj);
@@ -388,7 +388,7 @@ i915_gem_object_wait_fence(struct dma_fence *fence,
 	 */
 	if (rps) {
 		if (INTEL_GEN(rq->i915) >= 6)
-			gen6_rps_boost(rq->i915, rps, rq->emitted_jiffies);
+			gen6_rps_boost(rq, rps);
 		else
 			rps = NULL;
 	}
@@ -399,22 +399,6 @@ i915_gem_object_wait_fence(struct dma_fence *fence,
 	if (flags & I915_WAIT_LOCKED && i915_gem_request_completed(rq))
 		i915_gem_request_retire_upto(rq);
 
-	if (rps && i915_gem_request_global_seqno(rq) == intel_engine_last_submit(rq->engine)) {
-		/* The GPU is now idle and this client has stalled.
-		 * Since no other client has submitted a request in the
-		 * meantime, assume that this client is the only one
-		 * supplying work to the GPU but is unable to keep that
-		 * work supplied because it is waiting. Since the GPU is
-		 * then never kept fully busy, RPS autoclocking will
-		 * keep the clocks relatively low, causing further delays.
-		 * Compensate by giving the synchronous client credit for
-		 * a waitboost next time.
-		 */
-		spin_lock(&rq->i915->rps.client_lock);
-		list_del_init(&rps->link);
-		spin_unlock(&rq->i915->rps.client_lock);
-	}
-
 	return timeout;
 }
 
@@ -577,46 +561,6 @@ static struct intel_rps_client *to_rps_client(struct drm_file *file)
 	return &fpriv->rps;
 }
 
-int
-i915_gem_object_attach_phys(struct drm_i915_gem_object *obj,
-			    int align)
-{
-	int ret;
-
-	if (align > obj->base.size)
-		return -EINVAL;
-
-	if (obj->ops == &i915_gem_phys_ops)
-		return 0;
-
-	if (obj->mm.madv != I915_MADV_WILLNEED)
-		return -EFAULT;
-
-	if (obj->base.filp == NULL)
-		return -EINVAL;
-
-	ret = i915_gem_object_unbind(obj);
-	if (ret)
-		return ret;
-
-	__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
-	if (obj->mm.pages)
-		return -EBUSY;
-
-	GEM_BUG_ON(obj->ops != &i915_gem_object_ops);
-	obj->ops = &i915_gem_phys_ops;
-
-	ret = i915_gem_object_pin_pages(obj);
-	if (ret)
-		goto err_xfer;
-
-	return 0;
-
-err_xfer:
-	obj->ops = &i915_gem_object_ops;
-	return ret;
-}
-
 static int
 i915_gem_phys_pwrite(struct drm_i915_gem_object *obj,
 		     struct drm_i915_gem_pwrite *args,
@@ -856,7 +800,8 @@ int i915_gem_obj_prepare_shmem_read(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
-	if (obj->cache_coherent || !static_cpu_has(X86_FEATURE_CLFLUSH)) {
+	if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ ||
+	    !static_cpu_has(X86_FEATURE_CLFLUSH)) {
 		ret = i915_gem_object_set_to_cpu_domain(obj, false);
 		if (ret)
 			goto err_unpin;
@@ -908,7 +853,8 @@ int i915_gem_obj_prepare_shmem_write(struct drm_i915_gem_object *obj,
 	if (ret)
 		return ret;
 
-	if (obj->cache_coherent || !static_cpu_has(X86_FEATURE_CLFLUSH)) {
+	if (obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE ||
+	    !static_cpu_has(X86_FEATURE_CLFLUSH)) {
 		ret = i915_gem_object_set_to_cpu_domain(obj, true);
 		if (ret)
 			goto err_unpin;
@@ -2756,34 +2702,38 @@ i915_gem_object_pwrite_gtt(struct drm_i915_gem_object *obj,
 	return 0;
 }
 
-static bool ban_context(const struct i915_gem_context *ctx)
+static bool ban_context(const struct i915_gem_context *ctx,
+			unsigned int score)
 {
 	return (i915_gem_context_is_bannable(ctx) &&
-		ctx->ban_score >= CONTEXT_SCORE_BAN_THRESHOLD);
+		score >= CONTEXT_SCORE_BAN_THRESHOLD);
 }
 
 static void i915_gem_context_mark_guilty(struct i915_gem_context *ctx)
 {
-	ctx->guilty_count++;
-	ctx->ban_score += CONTEXT_SCORE_GUILTY;
-	if (ban_context(ctx))
-		i915_gem_context_set_banned(ctx);
+	unsigned int score;
+	bool banned;
 
+	atomic_inc(&ctx->guilty_count);
+
+	score = atomic_add_return(CONTEXT_SCORE_GUILTY, &ctx->ban_score);
+	banned = ban_context(ctx, score);
 	DRM_DEBUG_DRIVER("context %s marked guilty (score %d) banned? %s\n",
-			 ctx->name, ctx->ban_score,
-			 yesno(i915_gem_context_is_banned(ctx)));
-
-	if (!i915_gem_context_is_banned(ctx) || IS_ERR_OR_NULL(ctx->file_priv))
+			 ctx->name, score, yesno(banned));
+	if (!banned)
 		return;
 
-	ctx->file_priv->context_bans++;
-	DRM_DEBUG_DRIVER("client %s has had %d context banned\n",
-			 ctx->name, ctx->file_priv->context_bans);
+	i915_gem_context_set_banned(ctx);
+	if (!IS_ERR_OR_NULL(ctx->file_priv)) {
+		atomic_inc(&ctx->file_priv->context_bans);
+		DRM_DEBUG_DRIVER("client %s has had %d context banned\n",
+				 ctx->name, atomic_read(&ctx->file_priv->context_bans));
+	}
 }
 
 static void i915_gem_context_mark_innocent(struct i915_gem_context *ctx)
 {
-	ctx->active_count++;
+	atomic_inc(&ctx->active_count);
 }
 
 struct drm_i915_gem_request *
@@ -2832,46 +2782,62 @@ static bool engine_stalled(struct intel_engine_cs *engine)
 	return true;
 }
 
+/*
+ * Ensure the irq handler finishes, and does not run again.
+ * Also return the active request so that we only search for it once.
+ */
+struct drm_i915_gem_request *
+i915_gem_reset_prepare_engine(struct intel_engine_cs *engine)
+{
+	struct drm_i915_gem_request *request = NULL;
+
+	/* Prevent the signaler thread from updating the request
+	 * state (by calling dma_fence_signal) as we are processing
+	 * the reset. The write from the GPU of the seqno is
+	 * asynchronous and the signaler thread may see a different
+	 * value to us and declare the request complete, even though
+	 * the reset routine has picked that request as the active
+	 * (incomplete) request. This conflict is not handled
+	 * gracefully!
+	 */
+	kthread_park(engine->breadcrumbs.signaler);
+
+	/* Prevent request submission to the hardware until we have
+	 * completed the reset in i915_gem_reset_finish(). If a request
+	 * is completed by one engine, it may then queue a request
+	 * to a second via its engine->irq_tasklet *just* as we are
+	 * calling engine->init_hw() and also writing the ELSP.
+	 * Turning off the engine->irq_tasklet until the reset is over
+	 * prevents the race.
+	 */
+	tasklet_kill(&engine->irq_tasklet);
+	tasklet_disable(&engine->irq_tasklet);
+
+	if (engine->irq_seqno_barrier)
+		engine->irq_seqno_barrier(engine);
+
+	request = i915_gem_find_active_request(engine);
+	if (request && request->fence.error == -EIO)
+		request = ERR_PTR(-EIO); /* Previous reset failed! */
+
+	return request;
+}
+
 int i915_gem_reset_prepare(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
+	struct drm_i915_gem_request *request;
 	enum intel_engine_id id;
 	int err = 0;
 
-	/* Ensure irq handler finishes, and not run again. */
 	for_each_engine(engine, dev_priv, id) {
-		struct drm_i915_gem_request *request;
-
-		/* Prevent the signaler thread from updating the request
-		 * state (by calling dma_fence_signal) as we are processing
-		 * the reset. The write from the GPU of the seqno is
-		 * asynchronous and the signaler thread may see a different
-		 * value to us and declare the request complete, even though
-		 * the reset routine have picked that request as the active
-		 * (incomplete) request. This conflict is not handled
-		 * gracefully!
-		 */
-		kthread_park(engine->breadcrumbs.signaler);
-
-		/* Prevent request submission to the hardware until we have
-		 * completed the reset in i915_gem_reset_finish(). If a request
-		 * is completed by one engine, it may then queue a request
-		 * to a second via its engine->irq_tasklet *just* as we are
-		 * calling engine->init_hw() and also writing the ELSP.
-		 * Turning off the engine->irq_tasklet until the reset is over
-		 * prevents the race.
-		 */
-		tasklet_kill(&engine->irq_tasklet);
-		tasklet_disable(&engine->irq_tasklet);
-
-		if (engine->irq_seqno_barrier)
-			engine->irq_seqno_barrier(engine);
-
-		if (engine_stalled(engine)) {
-			request = i915_gem_find_active_request(engine);
-			if (request && request->fence.error == -EIO)
-				err = -EIO; /* Previous reset failed! */
+		request = i915_gem_reset_prepare_engine(engine);
+		if (IS_ERR(request)) {
+			err = PTR_ERR(request);
+			continue;
 		}
+
+		engine->hangcheck.active_request = request;
 	}
 
 	i915_gem_revoke_fences(dev_priv);
@@ -2921,12 +2887,11 @@ static void engine_skip_context(struct drm_i915_gem_request *request)
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
 }
 
-/* Returns true if the request was guilty of hang */
-static bool i915_gem_reset_request(struct drm_i915_gem_request *request)
+/* Returns the request if it was guilty of the hang */
+static struct drm_i915_gem_request *
+i915_gem_reset_request(struct intel_engine_cs *engine,
+		       struct drm_i915_gem_request *request)
 {
-	/* Read once and return the resolution */
-	const bool guilty = engine_stalled(request->engine);
-
 	/* The guilty request will get skipped on a hung engine.
 	 *
 	 * Users of client default contexts do not rely on logical
@@ -2948,29 +2913,47 @@ static bool i915_gem_reset_request(struct drm_i915_gem_request *request)
 	 * subsequent hangs.
 	 */
 
-	if (guilty) {
+	if (engine_stalled(engine)) {
 		i915_gem_context_mark_guilty(request->ctx);
 		skip_request(request);
-	} else {
-		i915_gem_context_mark_innocent(request->ctx);
-		dma_fence_set_error(&request->fence, -EAGAIN);
-	}
-
-	return guilty;
-}
-
-static void i915_gem_reset_engine(struct intel_engine_cs *engine)
-{
-	struct drm_i915_gem_request *request;
-
-	request = i915_gem_find_active_request(engine);
-	if (request && i915_gem_reset_request(request)) {
-		DRM_DEBUG_DRIVER("resetting %s to restart from tail of request 0x%x\n",
-				 engine->name, request->global_seqno);
 
 		/* If this context is now banned, skip all pending requests. */
 		if (i915_gem_context_is_banned(request->ctx))
 			engine_skip_context(request);
+	} else {
+		/*
+		 * Since this is not the hung engine, it may have advanced
+		 * since the hang declaration. Double check by refinding
+		 * the active request at the time of the reset.
+		 */
+		request = i915_gem_find_active_request(engine);
+		if (request) {
+			i915_gem_context_mark_innocent(request->ctx);
+			dma_fence_set_error(&request->fence, -EAGAIN);
+
+			/* Rewind the engine to replay the incomplete rq */
+			spin_lock_irq(&engine->timeline->lock);
+			request = list_prev_entry(request, link);
+			if (&request->link == &engine->timeline->requests)
+				request = NULL;
+			spin_unlock_irq(&engine->timeline->lock);
+		}
+	}
+
+	return request;
+}
+
+void i915_gem_reset_engine(struct intel_engine_cs *engine,
+			   struct drm_i915_gem_request *request)
+{
+	engine->irq_posted = 0;
+
+	if (request)
+		request = i915_gem_reset_request(engine, request);
+
+	if (request) {
+		DRM_DEBUG_DRIVER("resetting %s to restart from tail of request 0x%x\n",
+				 engine->name, request->global_seqno);
 	}
 
 	/* Setup the CS to resume from the breadcrumb of the hung request */
@@ -2989,7 +2972,7 @@ void i915_gem_reset(struct drm_i915_private *dev_priv)
 	for_each_engine(engine, dev_priv, id) {
 		struct i915_gem_context *ctx;
 
-		i915_gem_reset_engine(engine);
+		i915_gem_reset_engine(engine, engine->hangcheck.active_request);
 		ctx = fetch_and_zero(&engine->last_retired_context);
 		if (ctx)
 			engine->context_unpin(engine, ctx);
@@ -3005,6 +2988,12 @@ void i915_gem_reset(struct drm_i915_private *dev_priv)
 	}
 }
 
+void i915_gem_reset_finish_engine(struct intel_engine_cs *engine)
+{
+	tasklet_enable(&engine->irq_tasklet);
+	kthread_unpark(engine->breadcrumbs.signaler);
+}
+
 void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
@@ -3013,13 +3002,14 @@ void i915_gem_reset_finish(struct drm_i915_private *dev_priv)
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
 	for_each_engine(engine, dev_priv, id) {
-		tasklet_enable(&engine->irq_tasklet);
-		kthread_unpark(engine->breadcrumbs.signaler);
+		engine->hangcheck.active_request = NULL;
+		i915_gem_reset_finish_engine(engine);
 	}
 }
 
 static void nop_submit_request(struct drm_i915_gem_request *request)
 {
+	GEM_BUG_ON(!i915_terminally_wedged(&request->i915->gpu_error));
 	dma_fence_set_error(&request->fence, -EIO);
 	i915_gem_request_submit(request);
 	intel_engine_init_global_seqno(request->engine, request->global_seqno);
@@ -3041,16 +3031,10 @@ static void engine_set_wedged(struct intel_engine_cs *engine)
 	/* Mark all executing requests as skipped */
 	spin_lock_irqsave(&engine->timeline->lock, flags);
 	list_for_each_entry(request, &engine->timeline->requests, link)
-		dma_fence_set_error(&request->fence, -EIO);
+		if (!i915_gem_request_completed(request))
+			dma_fence_set_error(&request->fence, -EIO);
 	spin_unlock_irqrestore(&engine->timeline->lock, flags);
 
-	/* Mark all pending requests as complete so that any concurrent
-	 * (lockless) lookup doesn't try and wait upon the request as we
-	 * reset it.
-	 */
-	intel_engine_init_global_seqno(engine,
-				       intel_engine_last_submit(engine));
-
 	/*
 	 * Clear the execlists queue up before freeing the requests, as those
 	 * are the ones that keep the context and ringbuffer backing objects
@@ -3071,7 +3055,21 @@ static void engine_set_wedged(struct intel_engine_cs *engine)
 		engine->execlist_first = NULL;
 
 		spin_unlock_irqrestore(&engine->timeline->lock, flags);
+
+		/* The port is checked prior to scheduling a tasklet, but
+		 * just in case we have suspended the tasklet to do the
+		 * wedging, make sure that when it wakes, it decides there
+		 * is no work to do by clearing the irq_posted bit.
+		 */
+		clear_bit(ENGINE_IRQ_EXECLIST, &engine->irq_posted);
 	}
+
+	/* Mark all pending requests as complete so that any concurrent
+	 * (lockless) lookup doesn't try and wait upon the request as we
+	 * reset it.
+	 */
+	intel_engine_init_global_seqno(engine,
+				       intel_engine_last_submit(engine));
 }
 
 static int __i915_gem_set_wedged_BKL(void *data)
@@ -3083,25 +3081,15 @@ static int __i915_gem_set_wedged_BKL(void *data)
 	for_each_engine(engine, i915, id)
 		engine_set_wedged(engine);
 
+	set_bit(I915_WEDGED, &i915->gpu_error.flags);
+	wake_up_all(&i915->gpu_error.reset_queue);
+
 	return 0;
 }
 
 void i915_gem_set_wedged(struct drm_i915_private *dev_priv)
 {
-	lockdep_assert_held(&dev_priv->drm.struct_mutex);
-	set_bit(I915_WEDGED, &dev_priv->gpu_error.flags);
-
-	/* Retire completed requests first so the list of inflight/incomplete
-	 * requests is accurate and we don't try and mark successful requests
-	 * as in error during __i915_gem_set_wedged_BKL().
-	 */
-	i915_gem_retire_requests(dev_priv);
-
 	stop_machine(__i915_gem_set_wedged_BKL, dev_priv, NULL);
-
-	i915_gem_context_lost(dev_priv);
-
-	mod_delayed_work(dev_priv->wq, &dev_priv->gt.idle_work, 0);
 }
 
 bool i915_gem_unset_wedged(struct drm_i915_private *i915)
@@ -3156,6 +3144,7 @@ bool i915_gem_unset_wedged(struct drm_i915_private *i915)
 	 * context and do not require stop_machine().
 	 */
 	intel_engines_reset_default_submission(i915);
+	i915_gem_contexts_lost(i915);
 
 	smp_mb__before_atomic(); /* complete takeover before enabling execbuf */
 	clear_bit(I915_WEDGED, &i915->gpu_error.flags);
@@ -3253,25 +3242,33 @@ i915_gem_idle_work_handler(struct work_struct *work)
 
 void i915_gem_close_object(struct drm_gem_object *gem, struct drm_file *file)
 {
+	struct drm_i915_private *i915 = to_i915(gem->dev);
 	struct drm_i915_gem_object *obj = to_intel_bo(gem);
 	struct drm_i915_file_private *fpriv = file->driver_priv;
-	struct i915_vma *vma, *vn;
+	struct i915_lut_handle *lut, *ln;
 
-	mutex_lock(&obj->base.dev->struct_mutex);
-	list_for_each_entry_safe(vma, vn, &obj->vma_list, obj_link)
-		if (vma->vm->file == fpriv)
+	mutex_lock(&i915->drm.struct_mutex);
+
+	list_for_each_entry_safe(lut, ln, &obj->lut_list, obj_link) {
+		struct i915_gem_context *ctx = lut->ctx;
+		struct i915_vma *vma;
+
+		if (ctx->file_priv != fpriv)
+			continue;
+
+		vma = radix_tree_delete(&ctx->handles_vma, lut->handle);
+
+		if (!i915_vma_is_ggtt(vma))
 			i915_vma_close(vma);
 
-	vma = obj->vma_hashed;
-	if (vma && vma->ctx->file_priv == fpriv)
-		i915_vma_unlink_ctx(vma);
+		list_del(&lut->obj_link);
+		list_del(&lut->ctx_link);
 
-	if (i915_gem_object_is_active(obj) &&
-	    !i915_gem_object_has_active_reference(obj)) {
-		i915_gem_object_set_active_reference(obj);
-		i915_gem_object_get(obj);
+		kmem_cache_free(i915->luts, lut);
+		__i915_gem_object_release_unless_active(obj);
 	}
-	mutex_unlock(&obj->base.dev->struct_mutex);
+
+	mutex_unlock(&i915->drm.struct_mutex);
 }
 
 static unsigned long to_wait_timeout(s64 timeout_ns)
@@ -3297,7 +3294,7 @@ static unsigned long to_wait_timeout(s64 timeout_ns)
  *  -ERESTARTSYS: signal interrupted the wait
  *  -ENONENT: object doesn't exist
  * Also possible, but rare:
- *  -EAGAIN: GPU wedged
+ *  -EAGAIN: incomplete, restart syscall
  *  -ENOMEM: damn
  *  -ENODEV: Internal IRQ fail
  *  -E?: The add request failed
@@ -3345,6 +3342,10 @@ i915_gem_wait_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
 		 */
 		if (ret == -ETIME && !nsecs_to_jiffies(args->timeout_ns))
 			args->timeout_ns = 0;
+
+		/* Asked to wait beyond the jiffy/scheduler precision? */
+		if (ret == -ETIME && args->timeout_ns)
+			ret = -EAGAIN;
 	}
 
 	i915_gem_object_put(obj);
@@ -3686,8 +3687,7 @@ int i915_gem_object_set_cache_level(struct drm_i915_gem_object *obj,
 
 	list_for_each_entry(vma, &obj->vma_list, obj_link)
 		vma->node.color = cache_level;
-	obj->cache_level = cache_level;
-	obj->cache_coherent = i915_gem_object_is_coherent(obj);
+	i915_gem_object_set_cache_coherency(obj, cache_level);
 	obj->cache_dirty = true; /* Always invalidate stale cachelines */
 
 	return 0;
@@ -4260,6 +4260,7 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
 	INIT_LIST_HEAD(&obj->global_link);
 	INIT_LIST_HEAD(&obj->userfault_link);
 	INIT_LIST_HEAD(&obj->vma_list);
+	INIT_LIST_HEAD(&obj->lut_list);
 	INIT_LIST_HEAD(&obj->batch_pool_link);
 
 	obj->ops = ops;
@@ -4292,6 +4293,7 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 {
 	struct drm_i915_gem_object *obj;
 	struct address_space *mapping;
+	unsigned int cache_level;
 	gfp_t mask;
 	int ret;
 
@@ -4330,7 +4332,7 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 
-	if (HAS_LLC(dev_priv)) {
+	if (HAS_LLC(dev_priv))
 		/* On some devices, we can have the GPU use the LLC (the CPU
 		 * cache) for about a 10% performance improvement
 		 * compared to uncached.  Graphics requests other than
@@ -4343,12 +4345,11 @@ i915_gem_object_create(struct drm_i915_private *dev_priv, u64 size)
 		 * However, we maintain the display planes as UC, and so
 		 * need to rebind when first used as such.
 		 */
-		obj->cache_level = I915_CACHE_LLC;
-	} else
-		obj->cache_level = I915_CACHE_NONE;
+		cache_level = I915_CACHE_LLC;
+	else
+		cache_level = I915_CACHE_NONE;
 
-	obj->cache_coherent = i915_gem_object_is_coherent(obj);
-	obj->cache_dirty = !obj->cache_coherent;
+	i915_gem_object_set_cache_coherency(obj, cache_level);
 
 	trace_i915_gem_object_create(obj);
 
@@ -4503,8 +4504,8 @@ void __i915_gem_object_release_unless_active(struct drm_i915_gem_object *obj)
 {
 	lockdep_assert_held(&obj->base.dev->struct_mutex);
 
-	GEM_BUG_ON(i915_gem_object_has_active_reference(obj));
-	if (i915_gem_object_is_active(obj))
+	if (!i915_gem_object_has_active_reference(obj) &&
+	    i915_gem_object_is_active(obj))
 		i915_gem_object_set_active_reference(obj);
 	else
 		i915_gem_object_put(obj);
@@ -4565,7 +4566,7 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 		goto err_unlock;
 
 	assert_kernel_context_is_current(dev_priv);
-	i915_gem_context_lost(dev_priv);
+	i915_gem_contexts_lost(dev_priv);
 	mutex_unlock(&dev->struct_mutex);
 
 	intel_guc_suspend(dev_priv);
@@ -4579,8 +4580,6 @@ int i915_gem_suspend(struct drm_i915_private *dev_priv)
 	while (flush_delayed_work(&dev_priv->gt.idle_work))
 		;
 
-	i915_gem_drain_freed_objects(dev_priv);
-
 	/* Assert that we sucessfully flushed all the work and
 	 * reset the GPU back to its idle, low power state.
 	 */
@@ -4812,7 +4811,7 @@ int i915_gem_init(struct drm_i915_private *dev_priv)
 	if (ret)
 		goto out_unlock;
 
-	ret = i915_gem_context_init(dev_priv);
+	ret = i915_gem_contexts_init(dev_priv);
 	if (ret)
 		goto out_unlock;
 
@@ -4898,12 +4897,16 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 	if (!dev_priv->vmas)
 		goto err_objects;
 
+	dev_priv->luts = KMEM_CACHE(i915_lut_handle, 0);
+	if (!dev_priv->luts)
+		goto err_vmas;
+
 	dev_priv->requests = KMEM_CACHE(drm_i915_gem_request,
 					SLAB_HWCACHE_ALIGN |
 					SLAB_RECLAIM_ACCOUNT |
 					SLAB_TYPESAFE_BY_RCU);
 	if (!dev_priv->requests)
-		goto err_vmas;
+		goto err_luts;
 
 	dev_priv->dependencies = KMEM_CACHE(i915_dependency,
 					    SLAB_HWCACHE_ALIGN |
@@ -4922,7 +4925,6 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 	if (err)
 		goto err_priorities;
 
-	INIT_LIST_HEAD(&dev_priv->context_list);
 	INIT_WORK(&dev_priv->mm.free_work, __i915_gem_free_work);
 	init_llist_head(&dev_priv->mm.free_list);
 	INIT_LIST_HEAD(&dev_priv->mm.unbound_list);
@@ -4936,8 +4938,6 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 	init_waitqueue_head(&dev_priv->gpu_error.wait_queue);
 	init_waitqueue_head(&dev_priv->gpu_error.reset_queue);
 
-	init_waitqueue_head(&dev_priv->pending_flip_queue);
-
 	atomic_set(&dev_priv->mm.bsd_engine_dispatch_index, 0);
 
 	spin_lock_init(&dev_priv->fb_tracking.lock);
@@ -4950,6 +4950,8 @@ i915_gem_load_init(struct drm_i915_private *dev_priv)
 	kmem_cache_destroy(dev_priv->dependencies);
 err_requests:
 	kmem_cache_destroy(dev_priv->requests);
+err_luts:
+	kmem_cache_destroy(dev_priv->luts);
 err_vmas:
 	kmem_cache_destroy(dev_priv->vmas);
 err_objects:
@@ -4972,6 +4974,7 @@ void i915_gem_load_cleanup(struct drm_i915_private *dev_priv)
 	kmem_cache_destroy(dev_priv->priorities);
 	kmem_cache_destroy(dev_priv->dependencies);
 	kmem_cache_destroy(dev_priv->requests);
+	kmem_cache_destroy(dev_priv->luts);
 	kmem_cache_destroy(dev_priv->vmas);
 	kmem_cache_destroy(dev_priv->objects);
 
@@ -5038,15 +5041,9 @@ void i915_gem_release(struct drm_device *dev, struct drm_file *file)
 	list_for_each_entry(request, &file_priv->mm.request_list, client_link)
 		request->file_priv = NULL;
 	spin_unlock(&file_priv->mm.lock);
-
-	if (!list_empty(&file_priv->rps.link)) {
-		spin_lock(&to_i915(dev)->rps.client_lock);
-		list_del(&file_priv->rps.link);
-		spin_unlock(&to_i915(dev)->rps.client_lock);
-	}
 }
 
-int i915_gem_open(struct drm_device *dev, struct drm_file *file)
+int i915_gem_open(struct drm_i915_private *i915, struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv;
 	int ret;
@@ -5058,16 +5055,15 @@ int i915_gem_open(struct drm_device *dev, struct drm_file *file)
 		return -ENOMEM;
 
 	file->driver_priv = file_priv;
-	file_priv->dev_priv = to_i915(dev);
+	file_priv->dev_priv = i915;
 	file_priv->file = file;
-	INIT_LIST_HEAD(&file_priv->rps.link);
 
 	spin_lock_init(&file_priv->mm.lock);
 	INIT_LIST_HEAD(&file_priv->mm.request_list);
 
 	file_priv->bsd_engine = -1;
 
-	ret = i915_gem_context_open(dev, file);
+	ret = i915_gem_context_open(i915, file);
 	if (ret)
 		kfree(file_priv);
 
@@ -5311,6 +5307,64 @@ i915_gem_object_get_dma_address(struct drm_i915_gem_object *obj,
 	return sg_dma_address(sg) + (offset << PAGE_SHIFT);
 }
 
+int i915_gem_object_attach_phys(struct drm_i915_gem_object *obj, int align)
+{
+	struct sg_table *pages;
+	int err;
+
+	if (align > obj->base.size)
+		return -EINVAL;
+
+	if (obj->ops == &i915_gem_phys_ops)
+		return 0;
+
+	if (obj->ops != &i915_gem_object_ops)
+		return -EINVAL;
+
+	err = i915_gem_object_unbind(obj);
+	if (err)
+		return err;
+
+	mutex_lock(&obj->mm.lock);
+
+	if (obj->mm.madv != I915_MADV_WILLNEED) {
+		err = -EFAULT;
+		goto err_unlock;
+	}
+
+	if (obj->mm.quirked) {
+		err = -EFAULT;
+		goto err_unlock;
+	}
+
+	if (obj->mm.mapping) {
+		err = -EBUSY;
+		goto err_unlock;
+	}
+
+	pages = obj->mm.pages;
+	obj->ops = &i915_gem_phys_ops;
+
+	err = ____i915_gem_object_get_pages(obj);
+	if (err)
+		goto err_xfer;
+
+	/* Perma-pin (until release) the physical set of pages */
+	__i915_gem_object_pin_pages(obj);
+
+	if (!IS_ERR_OR_NULL(pages))
+		i915_gem_object_ops.put_pages(obj, pages);
+	mutex_unlock(&obj->mm.lock);
+	return 0;
+
+err_xfer:
+	obj->ops = &i915_gem_object_ops;
+	obj->mm.pages = pages;
+err_unlock:
+	mutex_unlock(&obj->mm.lock);
+	return err;
+}
+
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
 #include "selftests/scatterlist.c"
 #include "selftests/mock_gem_device.c"
diff --git a/drivers/gpu/drm/i915/i915_gem_clflush.c b/drivers/gpu/drm/i915/i915_gem_clflush.c
index 348b29a..8a04d33 100644
--- a/drivers/gpu/drm/i915/i915_gem_clflush.c
+++ b/drivers/gpu/drm/i915/i915_gem_clflush.c
@@ -139,7 +139,8 @@ bool i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 	 * snooping behaviour occurs naturally as the result of our domain
 	 * tracking.
 	 */
-	if (!(flags & I915_CLFLUSH_FORCE) && obj->cache_coherent)
+	if (!(flags & I915_CLFLUSH_FORCE) &&
+	    obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ)
 		return false;
 
 	trace_i915_gem_object_clflush(obj);
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index e1e971e..58a2a44 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -93,81 +93,37 @@
 
 #define ALL_L3_SLICES(dev) (1 << NUM_L3_SLICES(dev)) - 1
 
-/* Initial size (as log2) to preallocate the handle->object hashtable */
-#define VMA_HT_BITS 2u /* 4 x 2 pointers, 64 bytes minimum */
-
-static void resize_vma_ht(struct work_struct *work)
+static void lut_close(struct i915_gem_context *ctx)
 {
-	struct i915_gem_context_vma_lut *lut =
-		container_of(work, typeof(*lut), resize);
-	unsigned int bits, new_bits, size, i;
-	struct hlist_head *new_ht;
+	struct i915_lut_handle *lut, *ln;
+	struct radix_tree_iter iter;
+	void __rcu **slot;
 
-	GEM_BUG_ON(!(lut->ht_size & I915_CTX_RESIZE_IN_PROGRESS));
-
-	bits = 1 + ilog2(4*lut->ht_count/3 + 1);
-	new_bits = min_t(unsigned int,
-			 max(bits, VMA_HT_BITS),
-			 sizeof(unsigned int) * BITS_PER_BYTE - 1);
-	if (new_bits == lut->ht_bits)
-		goto out;
-
-	new_ht = kzalloc(sizeof(*new_ht)<<new_bits, GFP_KERNEL | __GFP_NOWARN);
-	if (!new_ht)
-		new_ht = vzalloc(sizeof(*new_ht)<<new_bits);
-	if (!new_ht)
-		/* Pretend resize succeeded and stop calling us for a bit! */
-		goto out;
-
-	size = BIT(lut->ht_bits);
-	for (i = 0; i < size; i++) {
-		struct i915_vma *vma;
-		struct hlist_node *tmp;
-
-		hlist_for_each_entry_safe(vma, tmp, &lut->ht[i], ctx_node)
-			hlist_add_head(&vma->ctx_node,
-				       &new_ht[hash_32(vma->ctx_handle,
-						       new_bits)]);
+	list_for_each_entry_safe(lut, ln, &ctx->handles_list, ctx_link) {
+		list_del(&lut->obj_link);
+		kmem_cache_free(ctx->i915->luts, lut);
 	}
-	kvfree(lut->ht);
-	lut->ht = new_ht;
-	lut->ht_bits = new_bits;
-out:
-	smp_store_release(&lut->ht_size, BIT(bits));
-	GEM_BUG_ON(lut->ht_size & I915_CTX_RESIZE_IN_PROGRESS);
+
+	radix_tree_for_each_slot(slot, &ctx->handles_vma, &iter, 0) {
+		struct i915_vma *vma = rcu_dereference_raw(*slot);
+		struct drm_i915_gem_object *obj = vma->obj;
+
+		radix_tree_iter_delete(&ctx->handles_vma, &iter, slot);
+
+		if (!i915_vma_is_ggtt(vma))
+			i915_vma_close(vma);
+
+		__i915_gem_object_release_unless_active(obj);
+	}
 }
 
-static void vma_lut_free(struct i915_gem_context *ctx)
+static void i915_gem_context_free(struct i915_gem_context *ctx)
 {
-	struct i915_gem_context_vma_lut *lut = &ctx->vma_lut;
-	unsigned int i, size;
-
-	if (lut->ht_size & I915_CTX_RESIZE_IN_PROGRESS)
-		cancel_work_sync(&lut->resize);
-
-	size = BIT(lut->ht_bits);
-	for (i = 0; i < size; i++) {
-		struct i915_vma *vma;
-
-		hlist_for_each_entry(vma, &lut->ht[i], ctx_node) {
-			vma->obj->vma_hashed = NULL;
-			vma->ctx = NULL;
-			i915_vma_put(vma);
-		}
-	}
-	kvfree(lut->ht);
-}
-
-void i915_gem_context_free(struct kref *ctx_ref)
-{
-	struct i915_gem_context *ctx = container_of(ctx_ref, typeof(*ctx), ref);
 	int i;
 
 	lockdep_assert_held(&ctx->i915->drm.struct_mutex);
-	trace_i915_context_free(ctx);
 	GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
 
-	vma_lut_free(ctx);
 	i915_ppgtt_put(ctx->ppgtt);
 
 	for (i = 0; i < I915_NUM_ENGINES; i++) {
@@ -188,15 +144,64 @@ void i915_gem_context_free(struct kref *ctx_ref)
 
 	list_del(&ctx->link);
 
-	ida_simple_remove(&ctx->i915->context_hw_ida, ctx->hw_id);
-	kfree(ctx);
+	ida_simple_remove(&ctx->i915->contexts.hw_ida, ctx->hw_id);
+	kfree_rcu(ctx, rcu);
+}
+
+static void contexts_free(struct drm_i915_private *i915)
+{
+	struct llist_node *freed = llist_del_all(&i915->contexts.free_list);
+	struct i915_gem_context *ctx, *cn;
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	llist_for_each_entry_safe(ctx, cn, freed, free_link)
+		i915_gem_context_free(ctx);
+}
+
+static void contexts_free_first(struct drm_i915_private *i915)
+{
+	struct i915_gem_context *ctx;
+	struct llist_node *freed;
+
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	freed = llist_del_first(&i915->contexts.free_list);
+	if (!freed)
+		return;
+
+	ctx = container_of(freed, typeof(*ctx), free_link);
+	i915_gem_context_free(ctx);
+}
+
+static void contexts_free_worker(struct work_struct *work)
+{
+	struct drm_i915_private *i915 =
+		container_of(work, typeof(*i915), contexts.free_work);
+
+	mutex_lock(&i915->drm.struct_mutex);
+	contexts_free(i915);
+	mutex_unlock(&i915->drm.struct_mutex);
+}
+
+void i915_gem_context_release(struct kref *ref)
+{
+	struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
+	struct drm_i915_private *i915 = ctx->i915;
+
+	trace_i915_context_free(ctx);
+	if (llist_add(&ctx->free_link, &i915->contexts.free_list))
+		queue_work(i915->wq, &i915->contexts.free_work);
 }
 
 static void context_close(struct i915_gem_context *ctx)
 {
 	i915_gem_context_set_closed(ctx);
+
+	lut_close(ctx);
 	if (ctx->ppgtt)
 		i915_ppgtt_close(&ctx->ppgtt->base);
+
 	ctx->file_priv = ERR_PTR(-EBADF);
 	i915_gem_context_put(ctx);
 }
@@ -205,7 +210,7 @@ static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
 {
 	int ret;
 
-	ret = ida_simple_get(&dev_priv->context_hw_ida,
+	ret = ida_simple_get(&dev_priv->contexts.hw_ida,
 			     0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
 	if (ret < 0) {
 		/* Contexts are only released when no longer active.
@@ -213,7 +218,7 @@ static int assign_hw_id(struct drm_i915_private *dev_priv, unsigned *out)
 		 * stale contexts and try again.
 		 */
 		i915_gem_retire_requests(dev_priv);
-		ret = ida_simple_get(&dev_priv->context_hw_ida,
+		ret = ida_simple_get(&dev_priv->contexts.hw_ida,
 				     0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
 		if (ret < 0)
 			return ret;
@@ -265,20 +270,12 @@ __create_hw_context(struct drm_i915_private *dev_priv,
 	}
 
 	kref_init(&ctx->ref);
-	list_add_tail(&ctx->link, &dev_priv->context_list);
+	list_add_tail(&ctx->link, &dev_priv->contexts.list);
 	ctx->i915 = dev_priv;
 	ctx->priority = I915_PRIORITY_NORMAL;
 
-	ctx->vma_lut.ht_bits = VMA_HT_BITS;
-	ctx->vma_lut.ht_size = BIT(VMA_HT_BITS);
-	BUILD_BUG_ON(BIT(VMA_HT_BITS) == I915_CTX_RESIZE_IN_PROGRESS);
-	ctx->vma_lut.ht = kcalloc(ctx->vma_lut.ht_size,
-				  sizeof(*ctx->vma_lut.ht),
-				  GFP_KERNEL);
-	if (!ctx->vma_lut.ht)
-		goto err_out;
-
-	INIT_WORK(&ctx->vma_lut.resize, resize_vma_ht);
+	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
+	INIT_LIST_HEAD(&ctx->handles_list);
 
 	/* Default context will never have a file_priv */
 	ret = DEFAULT_CONTEXT_HANDLE;
@@ -328,8 +325,6 @@ __create_hw_context(struct drm_i915_private *dev_priv,
 	put_pid(ctx->pid);
 	idr_remove(&file_priv->context_idr, ctx->user_handle);
 err_lut:
-	kvfree(ctx->vma_lut.ht);
-err_out:
 	context_close(ctx);
 	return ERR_PTR(ret);
 }
@@ -354,6 +349,9 @@ i915_gem_create_context(struct drm_i915_private *dev_priv,
 
 	lockdep_assert_held(&dev_priv->drm.struct_mutex);
 
+	/* Reap the most stale context */
+	contexts_free_first(dev_priv);
+
 	ctx = __create_hw_context(dev_priv, file_priv);
 	if (IS_ERR(ctx))
 		return ctx;
@@ -418,7 +416,7 @@ i915_gem_context_create_gvt(struct drm_device *dev)
 	return ctx;
 }
 
-int i915_gem_context_init(struct drm_i915_private *dev_priv)
+int i915_gem_contexts_init(struct drm_i915_private *dev_priv)
 {
 	struct i915_gem_context *ctx;
 
@@ -427,6 +425,10 @@ int i915_gem_context_init(struct drm_i915_private *dev_priv)
 	if (WARN_ON(dev_priv->kernel_context))
 		return 0;
 
+	INIT_LIST_HEAD(&dev_priv->contexts.list);
+	INIT_WORK(&dev_priv->contexts.free_work, contexts_free_worker);
+	init_llist_head(&dev_priv->contexts.free_list);
+
 	if (intel_vgpu_active(dev_priv) &&
 	    HAS_LOGICAL_RING_CONTEXTS(dev_priv)) {
 		if (!i915.enable_execlists) {
@@ -437,7 +439,7 @@ int i915_gem_context_init(struct drm_i915_private *dev_priv)
 
 	/* Using the simple ida interface, the max is limited by sizeof(int) */
 	BUILD_BUG_ON(MAX_CONTEXT_HW_ID > INT_MAX);
-	ida_init(&dev_priv->context_hw_ida);
+	ida_init(&dev_priv->contexts.hw_ida);
 
 	ctx = i915_gem_create_context(dev_priv, NULL);
 	if (IS_ERR(ctx)) {
@@ -463,7 +465,7 @@ int i915_gem_context_init(struct drm_i915_private *dev_priv)
 	return 0;
 }
 
-void i915_gem_context_lost(struct drm_i915_private *dev_priv)
+void i915_gem_contexts_lost(struct drm_i915_private *dev_priv)
 {
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id;
@@ -484,7 +486,7 @@ void i915_gem_context_lost(struct drm_i915_private *dev_priv)
 	if (!i915.enable_execlists) {
 		struct i915_gem_context *ctx;
 
-		list_for_each_entry(ctx, &dev_priv->context_list, link) {
+		list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
 			if (!i915_gem_context_is_default(ctx))
 				continue;
 
@@ -503,18 +505,20 @@ void i915_gem_context_lost(struct drm_i915_private *dev_priv)
 	}
 }
 
-void i915_gem_context_fini(struct drm_i915_private *dev_priv)
+void i915_gem_contexts_fini(struct drm_i915_private *i915)
 {
-	struct i915_gem_context *dctx = dev_priv->kernel_context;
+	struct i915_gem_context *ctx;
 
-	lockdep_assert_held(&dev_priv->drm.struct_mutex);
+	lockdep_assert_held(&i915->drm.struct_mutex);
 
-	GEM_BUG_ON(!i915_gem_context_is_kernel(dctx));
+	/* Keep the context so that we can free it immediately ourselves */
+	ctx = i915_gem_context_get(fetch_and_zero(&i915->kernel_context));
+	GEM_BUG_ON(!i915_gem_context_is_kernel(ctx));
+	context_close(ctx);
+	i915_gem_context_free(ctx);
 
-	context_close(dctx);
-	dev_priv->kernel_context = NULL;
-
-	ida_destroy(&dev_priv->context_hw_ida);
+	/* Must free all deferred contexts (via flush_workqueue) first */
+	ida_destroy(&i915->contexts.hw_ida);
 }
 
 static int context_idr_cleanup(int id, void *p, void *data)
@@ -525,32 +529,32 @@ static int context_idr_cleanup(int id, void *p, void *data)
 	return 0;
 }
 
-int i915_gem_context_open(struct drm_device *dev, struct drm_file *file)
+int i915_gem_context_open(struct drm_i915_private *i915,
+			  struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct i915_gem_context *ctx;
 
 	idr_init(&file_priv->context_idr);
 
-	mutex_lock(&dev->struct_mutex);
-	ctx = i915_gem_create_context(to_i915(dev), file_priv);
-	mutex_unlock(&dev->struct_mutex);
-
-	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
-
+	mutex_lock(&i915->drm.struct_mutex);
+	ctx = i915_gem_create_context(i915, file_priv);
+	mutex_unlock(&i915->drm.struct_mutex);
 	if (IS_ERR(ctx)) {
 		idr_destroy(&file_priv->context_idr);
 		return PTR_ERR(ctx);
 	}
 
+	GEM_BUG_ON(i915_gem_context_is_kernel(ctx));
+
 	return 0;
 }
 
-void i915_gem_context_close(struct drm_device *dev, struct drm_file *file)
+void i915_gem_context_close(struct drm_file *file)
 {
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 
-	lockdep_assert_held(&dev->struct_mutex);
+	lockdep_assert_held(&file_priv->dev_priv->drm.struct_mutex);
 
 	idr_for_each(&file_priv->context_idr, context_idr_cleanup, NULL);
 	idr_destroy(&file_priv->context_idr);
@@ -925,7 +929,7 @@ int i915_gem_switch_to_kernel_context(struct drm_i915_private *dev_priv)
 
 static bool client_is_banned(struct drm_i915_file_private *file_priv)
 {
-	return file_priv->context_bans > I915_MAX_CLIENT_CONTEXT_BANS;
+	return atomic_read(&file_priv->context_bans) > I915_MAX_CLIENT_CONTEXT_BANS;
 }
 
 int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
@@ -982,20 +986,19 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
 	if (args->ctx_id == DEFAULT_CONTEXT_HANDLE)
 		return -ENOENT;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
-
 	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (IS_ERR(ctx)) {
-		mutex_unlock(&dev->struct_mutex);
-		return PTR_ERR(ctx);
-	}
+	if (!ctx)
+		return -ENOENT;
+
+	ret = mutex_lock_interruptible(&dev->struct_mutex);
+	if (ret)
+		goto out;
 
 	__destroy_hw_context(ctx, file_priv);
 	mutex_unlock(&dev->struct_mutex);
 
-	DRM_DEBUG("HW context %d destroyed\n", args->ctx_id);
+out:
+	i915_gem_context_put(ctx);
-	return 0;
+	return ret;
 }
 
@@ -1005,17 +1008,11 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 	struct drm_i915_file_private *file_priv = file->driver_priv;
 	struct drm_i915_gem_context_param *args = data;
 	struct i915_gem_context *ctx;
-	int ret;
-
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
+	int ret = 0;
 
 	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (IS_ERR(ctx)) {
-		mutex_unlock(&dev->struct_mutex);
-		return PTR_ERR(ctx);
-	}
+	if (!ctx)
+		return -ENOENT;
 
 	args->size = 0;
 	switch (args->param) {
@@ -1043,8 +1040,8 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
 		ret = -EINVAL;
 		break;
 	}
-	mutex_unlock(&dev->struct_mutex);
 
+	i915_gem_context_put(ctx);
 	return ret;
 }
 
@@ -1056,15 +1053,13 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 	struct i915_gem_context *ctx;
 	int ret;
 
+	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
+	if (!ctx)
+		return -ENOENT;
+
 	ret = i915_mutex_lock_interruptible(dev);
 	if (ret)
-		return ret;
-
-	ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
-	if (IS_ERR(ctx)) {
-		mutex_unlock(&dev->struct_mutex);
-		return PTR_ERR(ctx);
-	}
+		goto out;
 
 	switch (args->param) {
 	case I915_CONTEXT_PARAM_BAN_PERIOD:
@@ -1102,6 +1097,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 	}
 	mutex_unlock(&dev->struct_mutex);
 
+out:
+	i915_gem_context_put(ctx);
 	return ret;
 }
 
@@ -1116,27 +1113,31 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 	if (args->flags || args->pad)
 		return -EINVAL;
 
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		return ret;
+	ret = -ENOENT;
+	rcu_read_lock();
+	ctx = __i915_gem_context_lookup_rcu(file->driver_priv, args->ctx_id);
+	if (!ctx)
+		goto out;
 
-	ctx = i915_gem_context_lookup(file->driver_priv, args->ctx_id);
-	if (IS_ERR(ctx)) {
-		mutex_unlock(&dev->struct_mutex);
-		return PTR_ERR(ctx);
-	}
+	/*
+	 * We opt for unserialised reads here. This may result in tearing
+	 * in the extremely unlikely event of a GPU hang on this context
+	 * while we are querying its hangstats. If we need that extra layer
+	 * of protection, we should wrap the hangstats with a seqlock.
+	 */
 
 	if (capable(CAP_SYS_ADMIN))
 		args->reset_count = i915_reset_count(&dev_priv->gpu_error);
 	else
 		args->reset_count = 0;
 
-	args->batch_active = ctx->guilty_count;
-	args->batch_pending = ctx->active_count;
+	args->batch_active = atomic_read(&ctx->guilty_count);
+	args->batch_pending = atomic_read(&ctx->active_count);
 
-	mutex_unlock(&dev->struct_mutex);
-
-	return 0;
+	ret = 0;
+out:
+	rcu_read_unlock();
+	return ret;
 }
 
 #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
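
Note on the deferred-free scheme above: i915_gem_context_release() pushes the context onto contexts.free_list and queues the worker only when llist_add() reports that the list was previously empty; the worker then detaches the whole list and frees each entry under struct_mutex. The following is a minimal user-space sketch of that push/drain pattern using C11 atomics. It is not the kernel's llist implementation: struct ctx, struct free_list, and all free_list_*() and ctx_new() names are hypothetical stand-ins, and locking and the workqueue are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct i915_gem_context. */
struct ctx {
	struct ctx *free_next;	/* plays the role of free_link */
	int id;
};

/* Hypothetical stand-in for i915->contexts.free_list. */
struct free_list {
	_Atomic(struct ctx *) first;
};

/*
 * Push one node onto the head of the list. Returns true if the list was
 * empty beforehand, which is the condition the diff uses to decide
 * whether the free worker must be queued.
 */
static bool free_list_add(struct free_list *fl, struct ctx *c)
{
	struct ctx *old = atomic_load(&fl->first);

	do {
		c->free_next = old;
	} while (!atomic_compare_exchange_weak(&fl->first, &old, c));

	return old == NULL;
}

/* Detach the whole chain in one step, leaving the list empty. */
static struct ctx *free_list_del_all(struct free_list *fl)
{
	return atomic_exchange(&fl->first, NULL);
}

/* Worker side: drain the list and free every node; returns the count. */
static int free_list_drain(struct free_list *fl)
{
	struct ctx *c = free_list_del_all(fl);
	int n = 0;

	while (c) {
		struct ctx *next = c->free_next; /* read before freeing */

		free(c);
		c = next;
		n++;
	}
	return n;
}

/* Demo helper: allocate a node with the given id. */
static struct ctx *ctx_new(int id)
{
	struct ctx *c = malloc(sizeof(*c));

	if (c) {
		c->free_next = NULL;
		c->id = id;
	}
	return c;
}
```

Only the caller that sees free_list_add() return true needs to schedule a drain; later callers can rely on the drain that is already pending.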
diff --git a/drivers/gpu/drm/i915/i915_gem_context.h b/drivers/gpu/drm/i915/i915_gem_context.h
index 82c99ba..44688e2 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/i915_gem_context.h
@@ -27,6 +27,7 @@
 
 #include <linux/bitops.h>
 #include <linux/list.h>
+#include <linux/radix-tree.h>
 
 struct pid;
 
@@ -86,6 +87,7 @@ struct i915_gem_context {
 
 	/** link: place with &drm_i915_private.context_list */
 	struct list_head link;
+	struct llist_node free_link;
 
 	/**
 	 * @ref: reference count
@@ -99,6 +101,11 @@ struct i915_gem_context {
 	struct kref ref;
 
 	/**
+	 * @rcu: rcu_head for deferred freeing.
+	 */
+	struct rcu_head rcu;
+
+	/**
 	 * @flags: small set of booleans
 	 */
 	unsigned long flags;
@@ -143,32 +150,6 @@ struct i915_gem_context {
 	/** ggtt_offset_bias: placement restriction for context objects */
 	u32 ggtt_offset_bias;
 
-	struct i915_gem_context_vma_lut {
-		/** ht_size: last request size to allocate the hashtable for. */
-		unsigned int ht_size;
-#define I915_CTX_RESIZE_IN_PROGRESS BIT(0)
-		/** ht_bits: real log2(size) of hashtable. */
-		unsigned int ht_bits;
-		/** ht_count: current number of entries inside the hashtable */
-		unsigned int ht_count;
-
-		/** ht: the array of buckets comprising the simple hashtable */
-		struct hlist_head *ht;
-
-		/**
-		 * resize: After an execbuf completes, we check the load factor
-		 * of the hashtable. If the hashtable is too full, or too empty,
-		 * we schedule a task to resize the hashtable. During the
-		 * resize, the entries are moved between different buckets and
-		 * so we cannot simultaneously read the hashtable as it is
-		 * being resized (unlike rhashtable). Therefore we treat the
-		 * active work as a strong barrier, pausing a subsequent
-		 * execbuf to wait for the resize worker to complete, if
-		 * required.
-		 */
-		struct work_struct resize;
-	} vma_lut;
-
 	/** engine: per-engine logical HW state */
 	struct intel_context {
 		struct i915_vma *state;
@@ -185,20 +166,32 @@ struct i915_gem_context {
 	u32 desc_template;
 
 	/** guilty_count: How many times this context has caused a GPU hang. */
-	unsigned int guilty_count;
+	atomic_t guilty_count;
 	/**
 	 * @active_count: How many times this context was active during a GPU
 	 * hang, but did not cause it.
 	 */
-	unsigned int active_count;
+	atomic_t active_count;
 
 #define CONTEXT_SCORE_GUILTY		10
 #define CONTEXT_SCORE_BAN_THRESHOLD	40
 	/** ban_score: Accumulated score of all hangs caused by this context. */
-	int ban_score;
+	atomic_t ban_score;
 
 	/** remap_slice: Bitmask of cache lines that need remapping */
 	u8 remap_slice;
+
+	/** handles_vma: radix tree to look up our context specific obj/vma for
+	 * the user handle. (user handles are per fd, but the binding is
+	 * per vm, which may be one per context or shared with the global GTT)
+	 */
+	struct radix_tree_root handles_vma;
+
+	/** handles_list: reverse list of all the radix tree entries in use for
+	 * this context, which allows us to free all the allocations on
+	 * context close.
+	 */
+	struct list_head handles_list;
 };
 
 static inline bool i915_gem_context_is_closed(const struct i915_gem_context *ctx)
@@ -273,14 +266,18 @@ static inline bool i915_gem_context_is_kernel(struct i915_gem_context *ctx)
 }
 
 /* i915_gem_context.c */
-int __must_check i915_gem_context_init(struct drm_i915_private *dev_priv);
-void i915_gem_context_lost(struct drm_i915_private *dev_priv);
-void i915_gem_context_fini(struct drm_i915_private *dev_priv);
-int i915_gem_context_open(struct drm_device *dev, struct drm_file *file);
-void i915_gem_context_close(struct drm_device *dev, struct drm_file *file);
+int __must_check i915_gem_contexts_init(struct drm_i915_private *dev_priv);
+void i915_gem_contexts_lost(struct drm_i915_private *dev_priv);
+void i915_gem_contexts_fini(struct drm_i915_private *dev_priv);
+
+int i915_gem_context_open(struct drm_i915_private *i915,
+			  struct drm_file *file);
+void i915_gem_context_close(struct drm_file *file);
+
 int i915_switch_context(struct drm_i915_gem_request *req);
 int i915_gem_switch_to_kernel_context(struct drm_i915_private *dev_priv);
-void i915_gem_context_free(struct kref *ctx_ref);
+
+void i915_gem_context_release(struct kref *ctx_ref);
 struct i915_gem_context *
 i915_gem_context_create_gvt(struct drm_device *dev);
 
@@ -295,4 +292,16 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
 
+static inline struct i915_gem_context *
+i915_gem_context_get(struct i915_gem_context *ctx)
+{
+	kref_get(&ctx->ref);
+	return ctx;
+}
+
+static inline void i915_gem_context_put(struct i915_gem_context *ctx)
+{
+	kref_put(&ctx->ref, i915_gem_context_release);
+}
+
 #endif /* !__I915_GEM_CONTEXT_H__ */
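
Note on the get/put helpers added above: i915_gem_context_get() takes a reference and returns the same pointer so it can be used inline, while i915_gem_context_put() drops a reference and runs i915_gem_context_release() when the last one goes away. Below is a user-space sketch of that pattern under stated assumptions: struct ref and the ref_*() and ctx_*() names are hypothetical, the release callback only records that it ran, and ctx_put() returns an int purely so the behaviour can be observed (the helper in the diff returns void).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical minimal reference counter, loosely modelled on kref. */
struct ref {
	atomic_int count;
};

/* Hypothetical stand-in for struct i915_gem_context. */
struct ctx {
	struct ref ref;
	int released;	/* set by the release callback, for the demo only */
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static void ref_init(struct ref *r)
{
	atomic_init(&r->count, 1);	/* the creator holds the first reference */
}

static void ref_get(struct ref *r)
{
	atomic_fetch_add(&r->count, 1);
}

/* Drop one reference; returns 1 if it was the last and release() ran. */
static int ref_put(struct ref *r, void (*release)(struct ref *))
{
	if (atomic_fetch_sub(&r->count, 1) == 1) {
		release(r);
		return 1;
	}
	return 0;
}

/* Release callback: a real one would hand the object to a free path. */
static void ctx_release(struct ref *r)
{
	struct ctx *c = container_of(r, struct ctx, ref);

	c->released = 1;
}

/* Returns its argument so a reference can be taken inline. */
static struct ctx *ctx_get(struct ctx *c)
{
	ref_get(&c->ref);
	return c;
}

static int ctx_put(struct ctx *c)
{
	return ref_put(&c->ref, ctx_release);
}
```

Returning the pointer from the get helper is what allows call sites such as `ctx = i915_gem_context_get(fetch_and_zero(&i915->kernel_context));` in the diff to take the reference and assign in a single expression.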
diff --git a/drivers/gpu/drm/i915/i915_gem_evict.c b/drivers/gpu/drm/i915/i915_gem_evict.c
index a193f1b..4df039e 100644
--- a/drivers/gpu/drm/i915/i915_gem_evict.c
+++ b/drivers/gpu/drm/i915/i915_gem_evict.c
@@ -318,8 +318,8 @@ int i915_gem_evict_for_node(struct i915_address_space *vm,
 		/* Overlap of objects in the same batch? */
 		if (i915_vma_is_pinned(vma)) {
 			ret = -ENOSPC;
-			if (vma->exec_entry &&
-			    vma->exec_entry->flags & EXEC_OBJECT_PINNED)
+			if (vma->exec_flags &&
+			    *vma->exec_flags & EXEC_OBJECT_PINNED)
 				ret = -EINVAL;
 			break;
 		}
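
Note on vma->exec_flags: with this series a vma no longer points at its execobject entry. It holds a pointer into a per-execbuf flags array that runs parallel to exec[] and vma[], so a NULL exec_flags means "not part of the current execbuf", and the matching exec[] entry can be recovered by pointer subtraction (the exec_entry() macro in i915_gem_execbuffer.c). A user-space sketch of that layout follows. All type names, eb_track(), eb_entry(), vma_is_user_pinned() and the OBJ_PINNED bit are hypothetical simplifications, not the driver's definitions.

```c
#include <assert.h>
#include <stddef.h>

#define OBJ_PINNED (1u << 4)	/* hypothetical flag bit for the demo */

/* Hypothetical, heavily reduced stand-ins for the driver structures. */
struct exec_obj {
	unsigned int handle;
	unsigned long long offset;
};

struct vma {
	unsigned int *exec_flags;	/* NULL when not in an execbuf */
};

struct execbuf {
	struct exec_obj *exec;	/* user-supplied entries */
	struct vma **vma;	/* vma[i] belongs to exec[i] */
	unsigned int *flags;	/* flags[i] belongs to exec[i] */
};

/* Record vma i and point it back at its slot in the flags array. */
static void eb_track(struct execbuf *eb, size_t i, struct vma *v,
		     unsigned int flags)
{
	eb->vma[i] = v;
	eb->flags[i] = flags;
	v->exec_flags = &eb->flags[i];
}

/*
 * Recover the exec[] entry for a vma: the distance of exec_flags from
 * the start of flags[] is the shared index into all three arrays.
 */
static struct exec_obj *eb_entry(const struct execbuf *eb,
				 const struct vma *v)
{
	return &eb->exec[v->exec_flags - eb->flags];
}

/* Mirrors the shape of the check in i915_gem_evict_for_node(). */
static int vma_is_user_pinned(const struct vma *v)
{
	return v->exec_flags && (*v->exec_flags & OBJ_PINNED);
}
```

Keeping the kernel-private flag bits in a separate array means they are never written back into the user-supplied exec[] entries, while the vma still needs only one pointer to reach both its flags and, via the index, its entry.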
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index e9503f6..4c20162 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -32,6 +32,7 @@
 #include <linux/uaccess.h>
 
 #include <drm/drmP.h>
+#include <drm/drm_syncobj.h>
 #include <drm/i915_drm.h>
 
 #include "i915_drv.h"
@@ -191,6 +192,8 @@ struct i915_execbuffer {
 	struct drm_file *file; /** per-file lookup tables and limits */
 	struct drm_i915_gem_execbuffer2 *args; /** ioctl parameters */
 	struct drm_i915_gem_exec_object2 *exec; /** ioctl execobj[] */
+	struct i915_vma **vma;
+	unsigned int *flags;
 
 	struct intel_engine_cs *engine; /** engine to queue the request to */
 	struct i915_gem_context *ctx; /** context for building the request */
@@ -244,13 +247,7 @@ struct i915_execbuffer {
 	struct hlist_head *buckets; /** ht for relocation handles */
 };
 
-/*
- * As an alternative to creating a hashtable of handle-to-vma for a batch,
- * we used the last available reserved field in the execobject[] and stash
- * a link from the execobj to its vma.
- */
-#define __exec_to_vma(ee) (ee)->rsvd2
-#define exec_to_vma(ee) u64_to_ptr(struct i915_vma, __exec_to_vma(ee))
+#define exec_entry(EB, VMA) (&(EB)->exec[(VMA)->exec_flags - (EB)->flags])
 
 /*
  * Used to convert any address to canonical form.
@@ -319,85 +316,82 @@ static int eb_create(struct i915_execbuffer *eb)
 
 static bool
 eb_vma_misplaced(const struct drm_i915_gem_exec_object2 *entry,
-		 const struct i915_vma *vma)
+		 const struct i915_vma *vma,
+		 unsigned int flags)
 {
-	if (!(entry->flags & __EXEC_OBJECT_HAS_PIN))
-		return true;
-
 	if (vma->node.size < entry->pad_to_size)
 		return true;
 
 	if (entry->alignment && !IS_ALIGNED(vma->node.start, entry->alignment))
 		return true;
 
-	if (entry->flags & EXEC_OBJECT_PINNED &&
+	if (flags & EXEC_OBJECT_PINNED &&
 	    vma->node.start != entry->offset)
 		return true;
 
-	if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS &&
+	if (flags & __EXEC_OBJECT_NEEDS_BIAS &&
 	    vma->node.start < BATCH_OFFSET_BIAS)
 		return true;
 
-	if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
+	if (!(flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS) &&
 	    (vma->node.start + vma->node.size - 1) >> 32)
 		return true;
 
 	return false;
 }
 
-static inline void
+static inline bool
 eb_pin_vma(struct i915_execbuffer *eb,
-	   struct drm_i915_gem_exec_object2 *entry,
+	   const struct drm_i915_gem_exec_object2 *entry,
 	   struct i915_vma *vma)
 {
-	u64 flags;
+	unsigned int exec_flags = *vma->exec_flags;
+	u64 pin_flags;
 
 	if (vma->node.size)
-		flags = vma->node.start;
+		pin_flags = vma->node.start;
 	else
-		flags = entry->offset & PIN_OFFSET_MASK;
+		pin_flags = entry->offset & PIN_OFFSET_MASK;
 
-	flags |= PIN_USER | PIN_NOEVICT | PIN_OFFSET_FIXED;
-	if (unlikely(entry->flags & EXEC_OBJECT_NEEDS_GTT))
-		flags |= PIN_GLOBAL;
+	pin_flags |= PIN_USER | PIN_NOEVICT | PIN_OFFSET_FIXED;
+	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_GTT))
+		pin_flags |= PIN_GLOBAL;
 
-	if (unlikely(i915_vma_pin(vma, 0, 0, flags)))
-		return;
+	if (unlikely(i915_vma_pin(vma, 0, 0, pin_flags)))
+		return false;
 
-	if (unlikely(entry->flags & EXEC_OBJECT_NEEDS_FENCE)) {
+	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
 		if (unlikely(i915_vma_get_fence(vma))) {
 			i915_vma_unpin(vma);
-			return;
+			return false;
 		}
 
 		if (i915_vma_pin_fence(vma))
-			entry->flags |= __EXEC_OBJECT_HAS_FENCE;
+			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
 
-	entry->flags |= __EXEC_OBJECT_HAS_PIN;
+	*vma->exec_flags = exec_flags | __EXEC_OBJECT_HAS_PIN;
+	return !eb_vma_misplaced(entry, vma, exec_flags);
 }
 
-static inline void
-__eb_unreserve_vma(struct i915_vma *vma,
-		   const struct drm_i915_gem_exec_object2 *entry)
+static inline void __eb_unreserve_vma(struct i915_vma *vma, unsigned int flags)
 {
-	GEM_BUG_ON(!(entry->flags & __EXEC_OBJECT_HAS_PIN));
+	GEM_BUG_ON(!(flags & __EXEC_OBJECT_HAS_PIN));
 
-	if (unlikely(entry->flags & __EXEC_OBJECT_HAS_FENCE))
+	if (unlikely(flags & __EXEC_OBJECT_HAS_FENCE))
 		i915_vma_unpin_fence(vma);
 
 	__i915_vma_unpin(vma);
 }
 
 static inline void
-eb_unreserve_vma(struct i915_vma *vma,
-		 struct drm_i915_gem_exec_object2 *entry)
+eb_unreserve_vma(struct i915_vma *vma, unsigned int *flags)
 {
-	if (!(entry->flags & __EXEC_OBJECT_HAS_PIN))
+	if (!(*flags & __EXEC_OBJECT_HAS_PIN))
 		return;
 
-	__eb_unreserve_vma(vma, entry);
-	entry->flags &= ~__EXEC_OBJECT_RESERVED;
+	__eb_unreserve_vma(vma, *flags);
+	*flags &= ~__EXEC_OBJECT_RESERVED;
 }
 
 static int
@@ -427,7 +421,7 @@ eb_validate_vma(struct i915_execbuffer *eb,
 		entry->pad_to_size = 0;
 	}
 
-	if (unlikely(vma->exec_entry)) {
+	if (unlikely(vma->exec_flags)) {
 		DRM_DEBUG("Object [handle %d, index %d] appears more than once in object list\n",
 			  entry->handle, (int)(entry - eb->exec));
 		return -EINVAL;
@@ -440,14 +434,25 @@ eb_validate_vma(struct i915_execbuffer *eb,
 	 */
 	entry->offset = gen8_noncanonical_addr(entry->offset);
 
+	if (!eb->reloc_cache.has_fence) {
+		entry->flags &= ~EXEC_OBJECT_NEEDS_FENCE;
+	} else {
+		if ((entry->flags & EXEC_OBJECT_NEEDS_FENCE ||
+		     eb->reloc_cache.needs_unfenced) &&
+		    i915_gem_object_is_tiled(vma->obj))
+			entry->flags |= EXEC_OBJECT_NEEDS_GTT | __EXEC_OBJECT_NEEDS_MAP;
+	}
+
+	if (!(entry->flags & EXEC_OBJECT_PINNED))
+		entry->flags |= eb->context_flags;
+
 	return 0;
 }
 
 static int
-eb_add_vma(struct i915_execbuffer *eb,
-	   struct drm_i915_gem_exec_object2 *entry,
-	   struct i915_vma *vma)
+eb_add_vma(struct i915_execbuffer *eb, unsigned int i, struct i915_vma *vma)
 {
+	struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
 	int err;
 
 	GEM_BUG_ON(i915_vma_is_closed(vma));
@@ -468,40 +473,28 @@ eb_add_vma(struct i915_execbuffer *eb,
 	if (entry->relocation_count)
 		list_add_tail(&vma->reloc_link, &eb->relocs);
 
-	if (!eb->reloc_cache.has_fence) {
-		entry->flags &= ~EXEC_OBJECT_NEEDS_FENCE;
-	} else {
-		if ((entry->flags & EXEC_OBJECT_NEEDS_FENCE ||
-		     eb->reloc_cache.needs_unfenced) &&
-		    i915_gem_object_is_tiled(vma->obj))
-			entry->flags |= EXEC_OBJECT_NEEDS_GTT | __EXEC_OBJECT_NEEDS_MAP;
-	}
-
-	if (!(entry->flags & EXEC_OBJECT_PINNED))
-		entry->flags |= eb->context_flags;
-
 	/*
 	 * Stash a pointer from the vma to execobj, so we can query its flags,
 	 * size, alignment etc as provided by the user. Also we stash a pointer
 	 * to the vma inside the execobj so that we can use a direct lookup
 	 * to find the right target VMA when doing relocations.
 	 */
-	vma->exec_entry = entry;
-	__exec_to_vma(entry) = (uintptr_t)vma;
+	eb->vma[i] = vma;
+	eb->flags[i] = entry->flags;
+	vma->exec_flags = &eb->flags[i];
 
 	err = 0;
-	eb_pin_vma(eb, entry, vma);
-	if (eb_vma_misplaced(entry, vma)) {
-		eb_unreserve_vma(vma, entry);
-
-		list_add_tail(&vma->exec_link, &eb->unbound);
-		if (drm_mm_node_allocated(&vma->node))
-			err = i915_vma_unbind(vma);
-	} else {
+	if (eb_pin_vma(eb, entry, vma)) {
 		if (entry->offset != vma->node.start) {
 			entry->offset = vma->node.start | UPDATE;
 			eb->args->flags |= __EXEC_HAS_RELOC;
 		}
+	} else {
+		eb_unreserve_vma(vma, vma->exec_flags);
+
+		list_add_tail(&vma->exec_link, &eb->unbound);
+		if (drm_mm_node_allocated(&vma->node))
+			err = i915_vma_unbind(vma);
 	}
 	return err;
 }
@@ -526,32 +519,35 @@ static inline int use_cpu_reloc(const struct reloc_cache *cache,
 static int eb_reserve_vma(const struct i915_execbuffer *eb,
 			  struct i915_vma *vma)
 {
-	struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
-	u64 flags;
+	struct drm_i915_gem_exec_object2 *entry = exec_entry(eb, vma);
+	unsigned int exec_flags = *vma->exec_flags;
+	u64 pin_flags;
 	int err;
 
-	flags = PIN_USER | PIN_NONBLOCK;
-	if (entry->flags & EXEC_OBJECT_NEEDS_GTT)
-		flags |= PIN_GLOBAL;
+	pin_flags = PIN_USER | PIN_NONBLOCK;
+	if (exec_flags & EXEC_OBJECT_NEEDS_GTT)
+		pin_flags |= PIN_GLOBAL;
 
 	/*
 	 * Wa32bitGeneralStateOffset & Wa32bitInstructionBaseOffset,
 	 * limit address to the first 4GBs for unflagged objects.
 	 */
-	if (!(entry->flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS))
-		flags |= PIN_ZONE_4G;
+	if (!(exec_flags & EXEC_OBJECT_SUPPORTS_48B_ADDRESS))
+		pin_flags |= PIN_ZONE_4G;
 
-	if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
-		flags |= PIN_MAPPABLE;
+	if (exec_flags & __EXEC_OBJECT_NEEDS_MAP)
+		pin_flags |= PIN_MAPPABLE;
 
-	if (entry->flags & EXEC_OBJECT_PINNED) {
-		flags |= entry->offset | PIN_OFFSET_FIXED;
-		flags &= ~PIN_NONBLOCK; /* force overlapping PINNED checks */
-	} else if (entry->flags & __EXEC_OBJECT_NEEDS_BIAS) {
-		flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
+	if (exec_flags & EXEC_OBJECT_PINNED) {
+		pin_flags |= entry->offset | PIN_OFFSET_FIXED;
+		pin_flags &= ~PIN_NONBLOCK; /* force overlapping checks */
+	} else if (exec_flags & __EXEC_OBJECT_NEEDS_BIAS) {
+		pin_flags |= BATCH_OFFSET_BIAS | PIN_OFFSET_BIAS;
 	}
 
-	err = i915_vma_pin(vma, entry->pad_to_size, entry->alignment, flags);
+	err = i915_vma_pin(vma,
+			   entry->pad_to_size, entry->alignment,
+			   pin_flags);
 	if (err)
 		return err;
 
@@ -560,7 +556,7 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
 		eb->args->flags |= __EXEC_HAS_RELOC;
 	}
 
-	if (unlikely(entry->flags & EXEC_OBJECT_NEEDS_FENCE)) {
+	if (unlikely(exec_flags & EXEC_OBJECT_NEEDS_FENCE)) {
 		err = i915_vma_get_fence(vma);
 		if (unlikely(err)) {
 			i915_vma_unpin(vma);
@@ -568,11 +564,11 @@ static int eb_reserve_vma(const struct i915_execbuffer *eb,
 		}
 
 		if (i915_vma_pin_fence(vma))
-			entry->flags |= __EXEC_OBJECT_HAS_FENCE;
+			exec_flags |= __EXEC_OBJECT_HAS_FENCE;
 	}
 
-	entry->flags |= __EXEC_OBJECT_HAS_PIN;
-	GEM_BUG_ON(eb_vma_misplaced(entry, vma));
+	*vma->exec_flags = exec_flags | __EXEC_OBJECT_HAS_PIN;
+	GEM_BUG_ON(eb_vma_misplaced(entry, vma, exec_flags));
 
 	return 0;
 }
@@ -614,18 +610,18 @@ static int eb_reserve(struct i915_execbuffer *eb)
 		INIT_LIST_HEAD(&eb->unbound);
 		INIT_LIST_HEAD(&last);
 		for (i = 0; i < count; i++) {
-			struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
+			unsigned int flags = eb->flags[i];
+			struct i915_vma *vma = eb->vma[i];
 
-			if (entry->flags & EXEC_OBJECT_PINNED &&
-			    entry->flags & __EXEC_OBJECT_HAS_PIN)
+			if (flags & EXEC_OBJECT_PINNED &&
+			    flags & __EXEC_OBJECT_HAS_PIN)
 				continue;
 
-			vma = exec_to_vma(entry);
-			eb_unreserve_vma(vma, entry);
+			eb_unreserve_vma(vma, &eb->flags[i]);
 
-			if (entry->flags & EXEC_OBJECT_PINNED)
+			if (flags & EXEC_OBJECT_PINNED)
 				list_add(&vma->exec_link, &eb->unbound);
-			else if (entry->flags & __EXEC_OBJECT_NEEDS_MAP)
+			else if (flags & __EXEC_OBJECT_NEEDS_MAP)
 				list_add_tail(&vma->exec_link, &eb->unbound);
 			else
 				list_add_tail(&vma->exec_link, &last);
@@ -649,19 +645,6 @@ static int eb_reserve(struct i915_execbuffer *eb)
 	} while (1);
 }
 
-static inline struct hlist_head *
-ht_head(const  struct i915_gem_context_vma_lut *lut, u32 handle)
-{
-	return &lut->ht[hash_32(handle, lut->ht_bits)];
-}
-
-static inline bool
-ht_needs_resize(const struct i915_gem_context_vma_lut *lut)
-{
-	return (4*lut->ht_count > 3*lut->ht_size ||
-		4*lut->ht_count + 1 < lut->ht_size);
-}
-
 static unsigned int eb_batch_index(const struct i915_execbuffer *eb)
 {
 	if (eb->args->flags & I915_EXEC_BATCH_FIRST)
@@ -675,16 +658,10 @@ static int eb_select_context(struct i915_execbuffer *eb)
 	struct i915_gem_context *ctx;
 
 	ctx = i915_gem_context_lookup(eb->file->driver_priv, eb->args->rsvd1);
-	if (unlikely(IS_ERR(ctx)))
-		return PTR_ERR(ctx);
+	if (unlikely(!ctx))
+		return -ENOENT;
 
-	if (unlikely(i915_gem_context_is_banned(ctx))) {
-		DRM_DEBUG("Context %u tried to submit while banned\n",
-			  ctx->user_handle);
-		return -EIO;
-	}
-
-	eb->ctx = i915_gem_context_get(ctx);
+	eb->ctx = ctx;
 	eb->vm = ctx->ppgtt ? &ctx->ppgtt->base : &eb->i915->ggtt.base;
 
 	eb->context_flags = 0;
@@ -696,132 +673,74 @@ static int eb_select_context(struct i915_execbuffer *eb)
 
 static int eb_lookup_vmas(struct i915_execbuffer *eb)
 {
-#define INTERMEDIATE BIT(0)
-	const unsigned int count = eb->buffer_count;
-	struct i915_gem_context_vma_lut *lut = &eb->ctx->vma_lut;
-	struct i915_vma *vma;
-	struct idr *idr;
+	struct radix_tree_root *handles_vma = &eb->ctx->handles_vma;
+	struct drm_i915_gem_object *uninitialized_var(obj);
 	unsigned int i;
-	int slow_pass = -1;
 	int err;
 
+	if (unlikely(i915_gem_context_is_closed(eb->ctx)))
+		return -ENOENT;
+
+	if (unlikely(i915_gem_context_is_banned(eb->ctx)))
+		return -EIO;
+
 	INIT_LIST_HEAD(&eb->relocs);
 	INIT_LIST_HEAD(&eb->unbound);
 
-	if (unlikely(lut->ht_size & I915_CTX_RESIZE_IN_PROGRESS))
-		flush_work(&lut->resize);
-	GEM_BUG_ON(lut->ht_size & I915_CTX_RESIZE_IN_PROGRESS);
+	for (i = 0; i < eb->buffer_count; i++) {
+		u32 handle = eb->exec[i].handle;
+		struct i915_lut_handle *lut;
+		struct i915_vma *vma;
 
-	for (i = 0; i < count; i++) {
-		__exec_to_vma(&eb->exec[i]) = 0;
+		vma = radix_tree_lookup(handles_vma, handle);
+		if (likely(vma))
+			goto add_vma;
 
-		hlist_for_each_entry(vma,
-				     ht_head(lut, eb->exec[i].handle),
-				     ctx_node) {
-			if (vma->ctx_handle != eb->exec[i].handle)
-				continue;
-
-			err = eb_add_vma(eb, &eb->exec[i], vma);
-			if (unlikely(err))
-				return err;
-
-			goto next_vma;
-		}
-
-		if (slow_pass < 0)
-			slow_pass = i;
-next_vma: ;
-	}
-
-	if (slow_pass < 0)
-		goto out;
-
-	spin_lock(&eb->file->table_lock);
-	/*
-	 * Grab a reference to the object and release the lock so we can lookup
-	 * or create the VMA without using GFP_ATOMIC
-	 */
-	idr = &eb->file->object_idr;
-	for (i = slow_pass; i < count; i++) {
-		struct drm_i915_gem_object *obj;
-
-		if (__exec_to_vma(&eb->exec[i]))
-			continue;
-
-		obj = to_intel_bo(idr_find(idr, eb->exec[i].handle));
+		obj = i915_gem_object_lookup(eb->file, handle);
 		if (unlikely(!obj)) {
-			spin_unlock(&eb->file->table_lock);
-			DRM_DEBUG("Invalid object handle %d at index %d\n",
-				  eb->exec[i].handle, i);
 			err = -ENOENT;
-			goto err;
+			goto err_vma;
 		}
 
-		__exec_to_vma(&eb->exec[i]) = INTERMEDIATE | (uintptr_t)obj;
-	}
-	spin_unlock(&eb->file->table_lock);
-
-	for (i = slow_pass; i < count; i++) {
-		struct drm_i915_gem_object *obj;
-
-		if (!(__exec_to_vma(&eb->exec[i]) & INTERMEDIATE))
-			continue;
-
-		/*
-		 * NOTE: We can leak any vmas created here when something fails
-		 * later on. But that's no issue since vma_unbind can deal with
-		 * vmas which are not actually bound. And since only
-		 * lookup_or_create exists as an interface to get at the vma
-		 * from the (obj, vm) we don't run the risk of creating
-		 * duplicated vmas for the same vm.
-		 */
-		obj = u64_to_ptr(typeof(*obj),
-				 __exec_to_vma(&eb->exec[i]) & ~INTERMEDIATE);
 		vma = i915_vma_instance(obj, eb->vm, NULL);
 		if (unlikely(IS_ERR(vma))) {
-			DRM_DEBUG("Failed to lookup VMA\n");
 			err = PTR_ERR(vma);
-			goto err;
+			goto err_obj;
 		}
 
-		/* First come, first served */
-		if (!vma->ctx) {
-			vma->ctx = eb->ctx;
-			vma->ctx_handle = eb->exec[i].handle;
-			hlist_add_head(&vma->ctx_node,
-				       ht_head(lut, eb->exec[i].handle));
-			lut->ht_count++;
-			lut->ht_size |= I915_CTX_RESIZE_IN_PROGRESS;
-			if (i915_vma_is_ggtt(vma)) {
-				GEM_BUG_ON(obj->vma_hashed);
-				obj->vma_hashed = vma;
-			}
-
-			i915_vma_get(vma);
+		lut = kmem_cache_alloc(eb->i915->luts, GFP_KERNEL);
+		if (unlikely(!lut)) {
+			err = -ENOMEM;
+			goto err_obj;
 		}
 
-		err = eb_add_vma(eb, &eb->exec[i], vma);
+		err = radix_tree_insert(handles_vma, handle, vma);
+		if (unlikely(err)) {
+			kfree(lut);
+			goto err_obj;
+		}
+
+		list_add(&lut->obj_link, &obj->lut_list);
+		list_add(&lut->ctx_link, &eb->ctx->handles_list);
+		lut->ctx = eb->ctx;
+		lut->handle = handle;
+
+		/* transfer ref to ctx */
+		obj = NULL;
+
+add_vma:
+		err = eb_add_vma(eb, i, vma);
 		if (unlikely(err))
-			goto err;
+			goto err_obj;
 
-		/* Only after we validated the user didn't use our bits */
-		if (vma->ctx != eb->ctx) {
-			i915_vma_get(vma);
-			eb->exec[i].flags |= __EXEC_OBJECT_HAS_REF;
-		}
+		GEM_BUG_ON(vma != eb->vma[i]);
+		GEM_BUG_ON(vma->exec_flags != &eb->flags[i]);
 	}
 
-	if (lut->ht_size & I915_CTX_RESIZE_IN_PROGRESS) {
-		if (ht_needs_resize(lut))
-			queue_work(system_highpri_wq, &lut->resize);
-		else
-			lut->ht_size &= ~I915_CTX_RESIZE_IN_PROGRESS;
-	}
-
-out:
 	/* take note of the batch buffer before we might reorder the lists */
 	i = eb_batch_index(eb);
-	eb->batch = exec_to_vma(&eb->exec[i]);
+	eb->batch = eb->vma[i];
+	GEM_BUG_ON(eb->batch->exec_flags != &eb->flags[i]);
 
 	/*
 	 * SNA is doing fancy tricks with compressing batch buffers, which leads
@@ -832,22 +751,20 @@ next_vma: ;
 	 * Note that actual hangs have only been observed on gen7, but for
 	 * paranoia do it everywhere.
 	 */
-	if (!(eb->exec[i].flags & EXEC_OBJECT_PINNED))
-		eb->exec[i].flags |= __EXEC_OBJECT_NEEDS_BIAS;
+	if (!(eb->flags[i] & EXEC_OBJECT_PINNED))
+		eb->flags[i] |= __EXEC_OBJECT_NEEDS_BIAS;
 	if (eb->reloc_cache.has_fence)
-		eb->exec[i].flags |= EXEC_OBJECT_NEEDS_FENCE;
+		eb->flags[i] |= EXEC_OBJECT_NEEDS_FENCE;
 
 	eb->args->flags |= __EXEC_VALIDATED;
 	return eb_reserve(eb);
 
-err:
-	for (i = slow_pass; i < count; i++) {
-		if (__exec_to_vma(&eb->exec[i]) & INTERMEDIATE)
-			__exec_to_vma(&eb->exec[i]) = 0;
-	}
-	lut->ht_size &= ~I915_CTX_RESIZE_IN_PROGRESS;
+err_obj:
+	if (obj)
+		i915_gem_object_put(obj);
+err_vma:
+	eb->vma[i] = NULL;
 	return err;
-#undef INTERMEDIATE
 }
 
 static struct i915_vma *
@@ -856,7 +773,7 @@ eb_get_vma(const struct i915_execbuffer *eb, unsigned long handle)
 	if (eb->lut_size < 0) {
 		if (handle >= -eb->lut_size)
 			return NULL;
-		return exec_to_vma(&eb->exec[handle]);
+		return eb->vma[handle];
 	} else {
 		struct hlist_head *head;
 		struct i915_vma *vma;
@@ -876,24 +793,21 @@ static void eb_release_vmas(const struct i915_execbuffer *eb)
 	unsigned int i;
 
 	for (i = 0; i < count; i++) {
-		struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
-		struct i915_vma *vma = exec_to_vma(entry);
+		struct i915_vma *vma = eb->vma[i];
+		unsigned int flags = eb->flags[i];
 
 		if (!vma)
-			continue;
+			break;
 
-		GEM_BUG_ON(vma->exec_entry != entry);
-		vma->exec_entry = NULL;
-		__exec_to_vma(entry) = 0;
+		GEM_BUG_ON(vma->exec_flags != &eb->flags[i]);
+		vma->exec_flags = NULL;
+		eb->vma[i] = NULL;
 
-		if (entry->flags & __EXEC_OBJECT_HAS_PIN)
-			__eb_unreserve_vma(vma, entry);
+		if (flags & __EXEC_OBJECT_HAS_PIN)
+			__eb_unreserve_vma(vma, flags);
 
-		if (entry->flags & __EXEC_OBJECT_HAS_REF)
+		if (flags & __EXEC_OBJECT_HAS_REF)
 			i915_vma_put(vma);
-
-		entry->flags &=
-			~(__EXEC_OBJECT_RESERVED | __EXEC_OBJECT_HAS_REF);
 	}
 }
 
@@ -1266,7 +1180,9 @@ relocate_entry(struct i915_vma *vma,
 
 	if (!eb->reloc_cache.vaddr &&
 	    (DBG_FORCE_RELOC == FORCE_GPU_RELOC ||
-	     !reservation_object_test_signaled_rcu(vma->resv, true))) {
+	     !reservation_object_test_signaled_rcu(vma->resv, true)) &&
+	    __intel_engine_can_store_dword(eb->reloc_cache.gen,
+					   eb->engine->class)) {
 		const unsigned int gen = eb->reloc_cache.gen;
 		unsigned int len;
 		u32 *batch;
@@ -1276,10 +1192,8 @@ relocate_entry(struct i915_vma *vma,
 			len = offset & 7 ? 8 : 5;
 		else if (gen >= 4)
 			len = 4;
-		else if (gen >= 3)
+		else
 			len = 3;
-		else /* On gen2 MI_STORE_DWORD_IMM uses a physical address */
-			goto repeat;
 
 		batch = reloc_gpu(eb, vma, len);
 		if (IS_ERR(batch))
@@ -1382,7 +1296,7 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 	}
 
 	if (reloc->write_domain) {
-		target->exec_entry->flags |= EXEC_OBJECT_WRITE;
+		*target->exec_flags |= EXEC_OBJECT_WRITE;
 
 		/*
 		 * Sandybridge PPGTT errata: We need a global gtt mapping
@@ -1432,9 +1346,9 @@ eb_relocate_entry(struct i915_execbuffer *eb,
 	 * patching using the GPU (though that should be serialised by the
 	 * timeline). To be completely sure, and since we are required to
 	 * do relocations we are already stalling, disable the user's opt
-	 * of our synchronisation.
+	 * out of our synchronisation.
 	 */
-	vma->exec_entry->flags &= ~EXEC_OBJECT_ASYNC;
+	*vma->exec_flags &= ~EXEC_OBJECT_ASYNC;
 
 	/* and update the user's relocation entry */
 	return relocate_entry(vma, reloc, eb, target);
@@ -1445,7 +1359,7 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct i915_vma *vma)
 #define N_RELOC(x) ((x) / sizeof(struct drm_i915_gem_relocation_entry))
 	struct drm_i915_gem_relocation_entry stack[N_RELOC(512)];
 	struct drm_i915_gem_relocation_entry __user *urelocs;
-	const struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
+	const struct drm_i915_gem_exec_object2 *entry = exec_entry(eb, vma);
 	unsigned int remain;
 
 	urelocs = u64_to_user_ptr(entry->relocs_ptr);
@@ -1528,7 +1442,7 @@ static int eb_relocate_vma(struct i915_execbuffer *eb, struct i915_vma *vma)
 static int
 eb_relocate_vma_slow(struct i915_execbuffer *eb, struct i915_vma *vma)
 {
-	const struct drm_i915_gem_exec_object2 *entry = vma->exec_entry;
+	const struct drm_i915_gem_exec_object2 *entry = exec_entry(eb, vma);
 	struct drm_i915_gem_relocation_entry *relocs =
 		u64_to_ptr(typeof(*relocs), entry->relocs_ptr);
 	unsigned int i;
@@ -1732,6 +1646,8 @@ static noinline int eb_relocate_slow(struct i915_execbuffer *eb)
 	if (err)
 		goto err;
 
+	GEM_BUG_ON(!eb->batch);
+
 	list_for_each_entry(vma, &eb->relocs, reloc_link) {
 		if (!have_copy) {
 			pagefault_disable();
@@ -1825,11 +1741,11 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 	int err;
 
 	for (i = 0; i < count; i++) {
-		struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
-		struct i915_vma *vma = exec_to_vma(entry);
+		unsigned int flags = eb->flags[i];
+		struct i915_vma *vma = eb->vma[i];
 		struct drm_i915_gem_object *obj = vma->obj;
 
-		if (entry->flags & EXEC_OBJECT_CAPTURE) {
+		if (flags & EXEC_OBJECT_CAPTURE) {
 			struct i915_gem_capture_list *capture;
 
 			capture = kmalloc(sizeof(*capture), GFP_KERNEL);
@@ -1837,35 +1753,47 @@ static int eb_move_to_gpu(struct i915_execbuffer *eb)
 				return -ENOMEM;
 
 			capture->next = eb->request->capture_list;
-			capture->vma = vma;
+			capture->vma = eb->vma[i];
 			eb->request->capture_list = capture;
 		}
 
-		if (unlikely(obj->cache_dirty && !obj->cache_coherent)) {
+		/*
+		 * If the GPU is not _reading_ through the CPU cache, we need
+		 * to make sure that any writes (both previous GPU writes from
+		 * before a change in snooping levels and normal CPU writes)
+		 * caught in that cache are flushed to main memory.
+		 *
+		 * We want to say
+		 *   obj->cache_dirty &&
+		 *   !(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_READ)
+		 * but gcc's optimiser doesn't handle that as well and emits
+		 * two jumps instead of one. Maybe one day...
+		 */
+		if (unlikely(obj->cache_dirty & ~obj->cache_coherent)) {
 			if (i915_gem_clflush_object(obj, 0))
-				entry->flags &= ~EXEC_OBJECT_ASYNC;
+				flags &= ~EXEC_OBJECT_ASYNC;
 		}
 
-		if (entry->flags & EXEC_OBJECT_ASYNC)
-			goto skip_flushes;
+		if (flags & EXEC_OBJECT_ASYNC)
+			continue;
 
 		err = i915_gem_request_await_object
-			(eb->request, obj, entry->flags & EXEC_OBJECT_WRITE);
+			(eb->request, obj, flags & EXEC_OBJECT_WRITE);
 		if (err)
 			return err;
-
-skip_flushes:
-		i915_vma_move_to_active(vma, eb->request, entry->flags);
-		__eb_unreserve_vma(vma, entry);
-		vma->exec_entry = NULL;
 	}
 
 	for (i = 0; i < count; i++) {
-		const struct drm_i915_gem_exec_object2 *entry = &eb->exec[i];
-		struct i915_vma *vma = exec_to_vma(entry);
+		unsigned int flags = eb->flags[i];
+		struct i915_vma *vma = eb->vma[i];
 
-		eb_export_fence(vma, eb->request, entry->flags);
-		if (unlikely(entry->flags & __EXEC_OBJECT_HAS_REF))
+		i915_vma_move_to_active(vma, eb->request, flags);
+		eb_export_fence(vma, eb->request, flags);
+
+		__eb_unreserve_vma(vma, flags);
+		vma->exec_flags = NULL;
+
+		if (unlikely(flags & __EXEC_OBJECT_HAS_REF))
 			i915_vma_put(vma);
 	}
 	eb->exec = NULL;
@@ -1883,8 +1811,10 @@ static bool i915_gem_check_execbuffer(struct drm_i915_gem_execbuffer2 *exec)
 		return false;
 
 	/* Kernel clipping was a DRI1 misfeature */
-	if (exec->num_cliprects || exec->cliprects_ptr)
-		return false;
+	if (!(exec->flags & I915_EXEC_FENCE_ARRAY)) {
+		if (exec->num_cliprects || exec->cliprects_ptr)
+			return false;
+	}
 
 	if (exec->DR4 == 0xffffffff) {
 		DRM_DEBUG("UXA submitting garbage DR4, fixing up\n");
@@ -1992,11 +1922,11 @@ static struct i915_vma *eb_parse(struct i915_execbuffer *eb, bool is_master)
 	if (IS_ERR(vma))
 		goto out;
 
-	vma->exec_entry =
-		memset(&eb->exec[eb->buffer_count++],
-		       0, sizeof(*vma->exec_entry));
-	vma->exec_entry->flags = __EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_REF;
-	__exec_to_vma(vma->exec_entry) = (uintptr_t)i915_vma_get(vma);
+	eb->vma[eb->buffer_count] = i915_vma_get(vma);
+	eb->flags[eb->buffer_count] =
+		__EXEC_OBJECT_HAS_PIN | __EXEC_OBJECT_HAS_REF;
+	vma->exec_flags = &eb->flags[eb->buffer_count];
+	eb->buffer_count++;
 
 out:
 	i915_gem_object_unpin_pages(shadow_batch_obj);
@@ -2115,11 +2045,129 @@ eb_select_engine(struct drm_i915_private *dev_priv,
 	return engine;
 }
 
+static void
+__free_fence_array(struct drm_syncobj **fences, unsigned int n)
+{
+	while (n--)
+		drm_syncobj_put(ptr_mask_bits(fences[n], 2));
+	kvfree(fences);
+}
+
+static struct drm_syncobj **
+get_fence_array(struct drm_i915_gem_execbuffer2 *args,
+		struct drm_file *file)
+{
+	const unsigned int nfences = args->num_cliprects;
+	struct drm_i915_gem_exec_fence __user *user;
+	struct drm_syncobj **fences;
+	unsigned int n;
+	int err;
+
+	if (!(args->flags & I915_EXEC_FENCE_ARRAY))
+		return NULL;
+
+	if (nfences > SIZE_MAX / sizeof(*fences))
+		return ERR_PTR(-EINVAL);
+
+	user = u64_to_user_ptr(args->cliprects_ptr);
+	if (!access_ok(VERIFY_READ, user, nfences * 2 * sizeof(u32)))
+		return ERR_PTR(-EFAULT);
+
+	fences = kvmalloc_array(args->num_cliprects, sizeof(*fences),
+				__GFP_NOWARN | GFP_TEMPORARY);
+	if (!fences)
+		return ERR_PTR(-ENOMEM);
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_i915_gem_exec_fence fence;
+		struct drm_syncobj *syncobj;
+
+		if (__copy_from_user(&fence, user++, sizeof(fence))) {
+			err = -EFAULT;
+			goto err;
+		}
+
+		syncobj = drm_syncobj_find(file, fence.handle);
+		if (!syncobj) {
+			DRM_DEBUG("Invalid syncobj handle provided\n");
+			err = -ENOENT;
+			goto err;
+		}
+
+		fences[n] = ptr_pack_bits(syncobj, fence.flags, 2);
+	}
+
+	return fences;
+
+err:
+	__free_fence_array(fences, n);
+	return ERR_PTR(err);
+}
+
+static void
+put_fence_array(struct drm_i915_gem_execbuffer2 *args,
+		struct drm_syncobj **fences)
+{
+	if (fences)
+		__free_fence_array(fences, args->num_cliprects);
+}
+
+static int
+await_fence_array(struct i915_execbuffer *eb,
+		  struct drm_syncobj **fences)
+{
+	const unsigned int nfences = eb->args->num_cliprects;
+	unsigned int n;
+	int err;
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_syncobj *syncobj;
+		struct dma_fence *fence;
+		unsigned int flags;
+
+		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		if (!(flags & I915_EXEC_FENCE_WAIT))
+			continue;
+
+		fence = drm_syncobj_fence_get(syncobj);
+		if (!fence)
+			return -EINVAL;
+
+		err = i915_gem_request_await_dma_fence(eb->request, fence);
+		dma_fence_put(fence);
+		if (err < 0)
+			return err;
+	}
+
+	return 0;
+}
+
+static void
+signal_fence_array(struct i915_execbuffer *eb,
+		   struct drm_syncobj **fences)
+{
+	const unsigned int nfences = eb->args->num_cliprects;
+	struct dma_fence * const fence = &eb->request->fence;
+	unsigned int n;
+
+	for (n = 0; n < nfences; n++) {
+		struct drm_syncobj *syncobj;
+		unsigned int flags;
+
+		syncobj = ptr_unpack_bits(fences[n], &flags, 2);
+		if (!(flags & I915_EXEC_FENCE_SIGNAL))
+			continue;
+
+		drm_syncobj_replace_fence(syncobj, fence);
+	}
+}
+
 static int
 i915_gem_do_execbuffer(struct drm_device *dev,
 		       struct drm_file *file,
 		       struct drm_i915_gem_execbuffer2 *args,
-		       struct drm_i915_gem_exec_object2 *exec)
+		       struct drm_i915_gem_exec_object2 *exec,
+		       struct drm_syncobj **fences)
 {
 	struct i915_execbuffer eb;
 	struct dma_fence *in_fence = NULL;
@@ -2135,8 +2183,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	eb.args = args;
 	if (DBG_FORCE_RELOC || !(args->flags & I915_EXEC_NO_RELOC))
 		args->flags |= __EXEC_HAS_RELOC;
+
 	eb.exec = exec;
-	eb.ctx = NULL;
+	eb.vma = (struct i915_vma **)(exec + args->buffer_count + 1);
+	eb.vma[0] = NULL;
+	eb.flags = (unsigned int *)(eb.vma + args->buffer_count + 1);
+
 	eb.invalid_flags = __EXEC_OBJECT_UNKNOWN_FLAGS;
 	if (USES_FULL_PPGTT(eb.i915))
 		eb.invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
@@ -2194,6 +2246,10 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 
 	GEM_BUG_ON(!eb.lut_size);
 
+	err = eb_select_context(&eb);
+	if (unlikely(err))
+		goto err_destroy;
+
 	/*
 	 * Take a local wakeref for preparing to dispatch the execbuf as
 	 * we expect to access the hardware fairly frequently in the
@@ -2202,14 +2258,11 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	 * 100ms.
 	 */
 	intel_runtime_pm_get(eb.i915);
+
 	err = i915_mutex_lock_interruptible(dev);
 	if (err)
 		goto err_rpm;
 
-	err = eb_select_context(&eb);
-	if (unlikely(err))
-		goto err_unlock;
-
 	err = eb_relocate(&eb);
 	if (err) {
 		/*
@@ -2223,7 +2276,7 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 		goto err_vma;
 	}
 
-	if (unlikely(eb.batch->exec_entry->flags & EXEC_OBJECT_WRITE)) {
+	if (unlikely(*eb.batch->exec_flags & EXEC_OBJECT_WRITE)) {
 		DRM_DEBUG("Attempting to use self-modifying batch buffer\n");
 		err = -EINVAL;
 		goto err_vma;
@@ -2305,6 +2358,12 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 			goto err_request;
 	}
 
+	if (fences) {
+		err = await_fence_array(&eb, fences);
+		if (err)
+			goto err_request;
+	}
+
 	if (out_fence_fd != -1) {
 		out_fence = sync_file_create(&eb.request->fence);
 		if (!out_fence) {
@@ -2328,6 +2387,9 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 	__i915_add_request(eb.request, err == 0);
 	add_to_client(eb.request, file);
 
+	if (fences)
+		signal_fence_array(&eb, fences);
+
 	if (out_fence) {
 		if (err == 0) {
 			fd_install(out_fence_fd, out_fence->file);
@@ -2345,11 +2407,11 @@ i915_gem_do_execbuffer(struct drm_device *dev,
 err_vma:
 	if (eb.exec)
 		eb_release_vmas(&eb);
-	i915_gem_context_put(eb.ctx);
-err_unlock:
 	mutex_unlock(&dev->struct_mutex);
 err_rpm:
 	intel_runtime_pm_put(eb.i915);
+	i915_gem_context_put(eb.ctx);
+err_destroy:
 	eb_destroy(&eb);
 err_out_fence:
 	if (out_fence_fd != -1)
@@ -2367,7 +2429,9 @@ int
 i915_gem_execbuffer(struct drm_device *dev, void *data,
 		    struct drm_file *file)
 {
-	const size_t sz = sizeof(struct drm_i915_gem_exec_object2);
+	const size_t sz = (sizeof(struct drm_i915_gem_exec_object2) +
+			   sizeof(struct i915_vma *) +
+			   sizeof(unsigned int));
 	struct drm_i915_gem_execbuffer *args = data;
 	struct drm_i915_gem_execbuffer2 exec2;
 	struct drm_i915_gem_exec_object *exec_list = NULL;
@@ -2429,7 +2493,7 @@ i915_gem_execbuffer(struct drm_device *dev, void *data,
 			exec2_list[i].flags = 0;
 	}
 
-	err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list);
+	err = i915_gem_do_execbuffer(dev, file, &exec2, exec2_list, NULL);
 	if (exec2.flags & __EXEC_HAS_RELOC) {
 		struct drm_i915_gem_exec_object __user *user_exec_list =
 			u64_to_user_ptr(args->buffers_ptr);
@@ -2458,9 +2522,12 @@ int
 i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		     struct drm_file *file)
 {
-	const size_t sz = sizeof(struct drm_i915_gem_exec_object2);
+	const size_t sz = (sizeof(struct drm_i915_gem_exec_object2) +
+			   sizeof(struct i915_vma *) +
+			   sizeof(unsigned int));
 	struct drm_i915_gem_execbuffer2 *args = data;
 	struct drm_i915_gem_exec_object2 *exec2_list;
+	struct drm_syncobj **fences = NULL;
 	int err;
 
 	if (args->buffer_count < 1 || args->buffer_count > SIZE_MAX / sz - 1) {
@@ -2487,7 +2554,15 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 		return -EFAULT;
 	}
 
-	err = i915_gem_do_execbuffer(dev, file, args, exec2_list);
+	if (args->flags & I915_EXEC_FENCE_ARRAY) {
+		fences = get_fence_array(args, file);
+		if (IS_ERR(fences)) {
+			kvfree(exec2_list);
+			return PTR_ERR(fences);
+		}
+	}
+
+	err = i915_gem_do_execbuffer(dev, file, args, exec2_list, fences);
 
 	/*
 	 * Now that we have begun execution of the batchbuffer, we ignore
@@ -2517,6 +2592,7 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
 	}
 
 	args->flags &= ~__I915_EXEC_UNKNOWN_FLAGS;
+	put_fence_array(args, fences);
 	kvfree(exec2_list);
 	return err;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.c b/drivers/gpu/drm/i915/i915_gem_gtt.c
index 61fc7e9..d60f38a 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.c
@@ -144,9 +144,9 @@ int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv,
 	has_full_48bit_ppgtt = dev_priv->info.has_full_48bit_ppgtt;
 
 	if (intel_vgpu_active(dev_priv)) {
-		/* emulation is too hard */
+		/* GVT-g has no support for 32bit ppgtt */
 		has_full_ppgtt = false;
-		has_full_48bit_ppgtt = false;
+		has_full_48bit_ppgtt = intel_vgpu_has_full_48bit_ppgtt(dev_priv);
 	}
 
 	if (!has_aliasing_ppgtt)
@@ -180,10 +180,15 @@ int intel_sanitize_enable_ppgtt(struct drm_i915_private *dev_priv,
 		return 0;
 	}
 
-	if (INTEL_GEN(dev_priv) >= 8 && i915.enable_execlists && has_full_ppgtt)
-		return has_full_48bit_ppgtt ? 3 : 2;
-	else
-		return has_aliasing_ppgtt ? 1 : 0;
+	if (INTEL_GEN(dev_priv) >= 8 && i915.enable_execlists) {
+		if (has_full_48bit_ppgtt)
+			return 3;
+
+		if (has_full_ppgtt)
+			return 2;
+	}
+
+	return has_aliasing_ppgtt ? 1 : 0;
 }
 
 static int ppgtt_bind_vma(struct i915_vma *vma,
@@ -207,8 +212,7 @@ static int ppgtt_bind_vma(struct i915_vma *vma,
 	if (vma->obj->gt_ro)
 		pte_flags |= PTE_READ_ONLY;
 
-	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
-				cache_level, pte_flags);
+	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 
 	return 0;
 }
@@ -907,37 +911,35 @@ gen8_ppgtt_insert_pte_entries(struct i915_hw_ppgtt *ppgtt,
 }
 
 static void gen8_ppgtt_insert_3lvl(struct i915_address_space *vm,
-				   struct sg_table *pages,
-				   u64 start,
+				   struct i915_vma *vma,
 				   enum i915_cache_level cache_level,
 				   u32 unused)
 {
 	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
 	struct sgt_dma iter = {
-		.sg = pages->sgl,
+		.sg = vma->pages->sgl,
 		.dma = sg_dma_address(iter.sg),
 		.max = iter.dma + iter.sg->length,
 	};
-	struct gen8_insert_pte idx = gen8_insert_pte(start);
+	struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
 
 	gen8_ppgtt_insert_pte_entries(ppgtt, &ppgtt->pdp, &iter, &idx,
 				      cache_level);
 }
 
 static void gen8_ppgtt_insert_4lvl(struct i915_address_space *vm,
-				   struct sg_table *pages,
-				   u64 start,
+				   struct i915_vma *vma,
 				   enum i915_cache_level cache_level,
 				   u32 unused)
 {
 	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
 	struct sgt_dma iter = {
-		.sg = pages->sgl,
+		.sg = vma->pages->sgl,
 		.dma = sg_dma_address(iter.sg),
 		.max = iter.dma + iter.sg->length,
 	};
 	struct i915_page_directory_pointer **pdps = ppgtt->pml4.pdps;
-	struct gen8_insert_pte idx = gen8_insert_pte(start);
+	struct gen8_insert_pte idx = gen8_insert_pte(vma->node.start);
 
 	while (gen8_ppgtt_insert_pte_entries(ppgtt, pdps[idx.pml4e++], &iter,
 					     &idx, cache_level))
@@ -1621,13 +1623,12 @@ static void gen6_ppgtt_clear_range(struct i915_address_space *vm,
 }
 
 static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
-				      struct sg_table *pages,
-				      u64 start,
+				      struct i915_vma *vma,
 				      enum i915_cache_level cache_level,
 				      u32 flags)
 {
 	struct i915_hw_ppgtt *ppgtt = i915_vm_to_ppgtt(vm);
-	unsigned first_entry = start >> PAGE_SHIFT;
+	unsigned first_entry = vma->node.start >> PAGE_SHIFT;
 	unsigned act_pt = first_entry / GEN6_PTES;
 	unsigned act_pte = first_entry % GEN6_PTES;
 	const u32 pte_encode = vm->pte_encode(0, cache_level, flags);
@@ -1635,7 +1636,7 @@ static void gen6_ppgtt_insert_entries(struct i915_address_space *vm,
 	gen6_pte_t *vaddr;
 
 	vaddr = kmap_atomic_px(ppgtt->pd.page_table[act_pt]);
-	iter.sg = pages->sgl;
+	iter.sg = vma->pages->sgl;
 	iter.dma = sg_dma_address(iter.sg);
 	iter.max = iter.dma + iter.sg->length;
 	do {
@@ -2090,8 +2091,7 @@ static void gen8_ggtt_insert_page(struct i915_address_space *vm,
 }
 
 static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     u64 start,
+				     struct i915_vma *vma,
 				     enum i915_cache_level level,
 				     u32 unused)
 {
@@ -2102,8 +2102,8 @@ static void gen8_ggtt_insert_entries(struct i915_address_space *vm,
 	dma_addr_t addr;
 
 	gtt_entries = (gen8_pte_t __iomem *)ggtt->gsm;
-	gtt_entries += start >> PAGE_SHIFT;
-	for_each_sgt_dma(addr, sgt_iter, st)
+	gtt_entries += vma->node.start >> PAGE_SHIFT;
+	for_each_sgt_dma(addr, sgt_iter, vma->pages)
 		gen8_set_pte(gtt_entries++, pte_encode | addr);
 
 	wmb();
@@ -2137,17 +2137,16 @@ static void gen6_ggtt_insert_page(struct i915_address_space *vm,
  * mapped BAR (dev_priv->mm.gtt->gtt).
  */
 static void gen6_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *st,
-				     u64 start,
+				     struct i915_vma *vma,
 				     enum i915_cache_level level,
 				     u32 flags)
 {
 	struct i915_ggtt *ggtt = i915_vm_to_ggtt(vm);
 	gen6_pte_t __iomem *entries = (gen6_pte_t __iomem *)ggtt->gsm;
-	unsigned int i = start >> PAGE_SHIFT;
+	unsigned int i = vma->node.start >> PAGE_SHIFT;
 	struct sgt_iter iter;
 	dma_addr_t addr;
-	for_each_sgt_dma(addr, iter, st)
+	for_each_sgt_dma(addr, iter, vma->pages)
 		iowrite32(vm->pte_encode(addr, level, flags), &entries[i++]);
 	wmb();
 
@@ -2229,8 +2228,7 @@ static void bxt_vtd_ggtt_insert_page__BKL(struct i915_address_space *vm,
 
 struct insert_entries {
 	struct i915_address_space *vm;
-	struct sg_table *st;
-	u64 start;
+	struct i915_vma *vma;
 	enum i915_cache_level level;
 };
 
@@ -2238,19 +2236,18 @@ static int bxt_vtd_ggtt_insert_entries__cb(void *_arg)
 {
 	struct insert_entries *arg = _arg;
 
-	gen8_ggtt_insert_entries(arg->vm, arg->st, arg->start, arg->level, 0);
+	gen8_ggtt_insert_entries(arg->vm, arg->vma, arg->level, 0);
 	bxt_vtd_ggtt_wa(arg->vm);
 
 	return 0;
 }
 
 static void bxt_vtd_ggtt_insert_entries__BKL(struct i915_address_space *vm,
-					     struct sg_table *st,
-					     u64 start,
+					     struct i915_vma *vma,
 					     enum i915_cache_level level,
 					     u32 unused)
 {
-	struct insert_entries arg = { vm, st, start, level };
+	struct insert_entries arg = { vm, vma, level };
 
 	stop_machine(bxt_vtd_ggtt_insert_entries__cb, &arg, NULL);
 }
@@ -2316,15 +2313,15 @@ static void i915_ggtt_insert_page(struct i915_address_space *vm,
 }
 
 static void i915_ggtt_insert_entries(struct i915_address_space *vm,
-				     struct sg_table *pages,
-				     u64 start,
+				     struct i915_vma *vma,
 				     enum i915_cache_level cache_level,
 				     u32 unused)
 {
 	unsigned int flags = (cache_level == I915_CACHE_NONE) ?
 		AGP_USER_MEMORY : AGP_USER_CACHED_MEMORY;
 
-	intel_gtt_insert_sg_entries(pages, start >> PAGE_SHIFT, flags);
+	intel_gtt_insert_sg_entries(vma->pages, vma->node.start >> PAGE_SHIFT,
+				    flags);
 }
 
 static void i915_ggtt_clear_range(struct i915_address_space *vm,
@@ -2353,8 +2350,7 @@ static int ggtt_bind_vma(struct i915_vma *vma,
 		pte_flags |= PTE_READ_ONLY;
 
 	intel_runtime_pm_get(i915);
-	vma->vm->insert_entries(vma->vm, vma->pages, vma->node.start,
-				cache_level, pte_flags);
+	vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 	intel_runtime_pm_put(i915);
 
 	/*
@@ -2407,16 +2403,13 @@ static int aliasing_gtt_bind_vma(struct i915_vma *vma,
 				goto err_pages;
 		}
 
-		appgtt->base.insert_entries(&appgtt->base,
-					    vma->pages, vma->node.start,
-					    cache_level, pte_flags);
+		appgtt->base.insert_entries(&appgtt->base, vma, cache_level,
+					    pte_flags);
 	}
 
 	if (flags & I915_VMA_GLOBAL_BIND) {
 		intel_runtime_pm_get(i915);
-		vma->vm->insert_entries(vma->vm,
-					vma->pages, vma->node.start,
-					cache_level, pte_flags);
+		vma->vm->insert_entries(vma->vm, vma, cache_level, pte_flags);
 		intel_runtime_pm_put(i915);
 	}
 
@@ -2749,6 +2742,24 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
 	return 0;
 }
 
+static void cnl_setup_private_ppat(struct drm_i915_private *dev_priv)
+{
+	/* XXX: spec is unclear if this is still needed for CNL+ */
+	if (!USES_PPGTT(dev_priv)) {
+		I915_WRITE(GEN10_PAT_INDEX(0), GEN8_PPAT_UC);
+		return;
+	}
+
+	I915_WRITE(GEN10_PAT_INDEX(0), GEN8_PPAT_WB | GEN8_PPAT_LLC);
+	I915_WRITE(GEN10_PAT_INDEX(1), GEN8_PPAT_WC | GEN8_PPAT_LLCELLC);
+	I915_WRITE(GEN10_PAT_INDEX(2), GEN8_PPAT_WT | GEN8_PPAT_LLCELLC);
+	I915_WRITE(GEN10_PAT_INDEX(3), GEN8_PPAT_UC);
+	I915_WRITE(GEN10_PAT_INDEX(4), GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(0));
+	I915_WRITE(GEN10_PAT_INDEX(5), GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(1));
+	I915_WRITE(GEN10_PAT_INDEX(6), GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(2));
+	I915_WRITE(GEN10_PAT_INDEX(7), GEN8_PPAT_LLCELLC | GEN8_PPAT_AGE(3));
+}
+
 /* The GGTT and PPGTT need a private PPAT setup in order to handle cacheability
  * bits. When using advanced contexts each context stores its own PAT, but
  * writing this data shouldn't be harmful even in those cases. */
@@ -2863,7 +2874,9 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
 
 	ggtt->base.total = (size / sizeof(gen8_pte_t)) << PAGE_SHIFT;
 
-	if (IS_CHERRYVIEW(dev_priv) || IS_GEN9_LP(dev_priv))
+	if (INTEL_GEN(dev_priv) >= 10)
+		cnl_setup_private_ppat(dev_priv);
+	else if (IS_CHERRYVIEW(dev_priv) || IS_GEN9_LP(dev_priv))
 		chv_setup_private_ppat(dev_priv);
 	else
 		bdw_setup_private_ppat(dev_priv);
@@ -3145,7 +3158,9 @@ void i915_gem_restore_gtt_mappings(struct drm_i915_private *dev_priv)
 	ggtt->base.closed = false;
 
 	if (INTEL_GEN(dev_priv) >= 8) {
-		if (IS_CHERRYVIEW(dev_priv) || IS_GEN9_LP(dev_priv))
+		if (INTEL_GEN(dev_priv) >= 10)
+			cnl_setup_private_ppat(dev_priv);
+		else if (IS_CHERRYVIEW(dev_priv) || IS_GEN9_LP(dev_priv))
 			chv_setup_private_ppat(dev_priv);
 		else
 			bdw_setup_private_ppat(dev_priv);
diff --git a/drivers/gpu/drm/i915/i915_gem_gtt.h b/drivers/gpu/drm/i915/i915_gem_gtt.h
index 1b2a56c..b4e3aa7 100644
--- a/drivers/gpu/drm/i915/i915_gem_gtt.h
+++ b/drivers/gpu/drm/i915/i915_gem_gtt.h
@@ -313,8 +313,7 @@ struct i915_address_space {
 			    enum i915_cache_level cache_level,
 			    u32 flags);
 	void (*insert_entries)(struct i915_address_space *vm,
-			       struct sg_table *st,
-			       u64 start,
+			       struct i915_vma *vma,
 			       enum i915_cache_level cache_level,
 			       u32 flags);
 	void (*cleanup)(struct i915_address_space *vm);
diff --git a/drivers/gpu/drm/i915/i915_gem_internal.c b/drivers/gpu/drm/i915/i915_gem_internal.c
index 568bf83..c1f64dd 100644
--- a/drivers/gpu/drm/i915/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/i915_gem_internal.c
@@ -174,6 +174,7 @@ i915_gem_object_create_internal(struct drm_i915_private *i915,
 				phys_addr_t size)
 {
 	struct drm_i915_gem_object *obj;
+	unsigned int cache_level;
 
 	GEM_BUG_ON(!size);
 	GEM_BUG_ON(!IS_ALIGNED(size, PAGE_SIZE));
@@ -190,9 +191,9 @@ i915_gem_object_create_internal(struct drm_i915_private *i915,
 
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
-	obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
-	obj->cache_coherent = i915_gem_object_is_coherent(obj);
-	obj->cache_dirty = !obj->cache_coherent;
+
+	cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
+	i915_gem_object_set_cache_coherency(obj, cache_level);
 
 	return obj;
 }
diff --git a/drivers/gpu/drm/i915/i915_gem_object.c b/drivers/gpu/drm/i915/i915_gem_object.c
new file mode 100644
index 0000000..aab8cdd
--- /dev/null
+++ b/drivers/gpu/drm/i915/i915_gem_object.c
@@ -0,0 +1,48 @@
+/*
+ * Copyright © 2017 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ */
+
+#include "i915_drv.h"
+#include "i915_gem_object.h"
+
+/**
+ * Mark up the object's coherency levels for a given cache_level
+ * @obj: #drm_i915_gem_object
+ * @cache_level: cache level
+ */
+void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
+					 unsigned int cache_level)
+{
+	obj->cache_level = cache_level;
+
+	if (cache_level != I915_CACHE_NONE)
+		obj->cache_coherent = (I915_BO_CACHE_COHERENT_FOR_READ |
+				       I915_BO_CACHE_COHERENT_FOR_WRITE);
+	else if (HAS_LLC(to_i915(obj->base.dev)))
+		obj->cache_coherent = I915_BO_CACHE_COHERENT_FOR_READ;
+	else
+		obj->cache_coherent = 0;
+
+	obj->cache_dirty =
+		!(obj->cache_coherent & I915_BO_CACHE_COHERENT_FOR_WRITE);
+}
diff --git a/drivers/gpu/drm/i915/i915_gem_object.h b/drivers/gpu/drm/i915/i915_gem_object.h
index 5b19a49..c30d8f8 100644
--- a/drivers/gpu/drm/i915/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/i915_gem_object.h
@@ -33,8 +33,24 @@
 
 #include <drm/i915_drm.h>
 
+#include "i915_gem_request.h"
 #include "i915_selftest.h"
 
+struct drm_i915_gem_object;
+
+/*
+ * struct i915_lut_handle tracks the fast lookups from handle to vma used
+ * for execbuf. Although we use a radixtree for that mapping, in order to
+ * remove them as the object or context is closed, we need a secondary list
+ * and a translation entry (i915_lut_handle).
+ */
+struct i915_lut_handle {
+	struct list_head obj_link;
+	struct list_head ctx_link;
+	struct i915_gem_context *ctx;
+	u32 handle;
+};
+
 struct drm_i915_gem_object_ops {
 	unsigned int flags;
 #define I915_GEM_OBJECT_HAS_STRUCT_PAGE BIT(0)
@@ -86,7 +102,15 @@ struct drm_i915_gem_object {
 	 * They are also added to @vma_list for easy iteration.
 	 */
 	struct rb_root vma_tree;
-	struct i915_vma *vma_hashed;
+
+	/**
+	 * @lut_list: List of vma lookup entries in use for this object.
+	 *
+	 * If this object is closed, we need to remove all of its VMA from
+	 * the fast lookup index in associated contexts; @lut_list provides
+	 * this translation from object to context->handles_vma.
+	 */
+	struct list_head lut_list;
 
 	/** Stolen memory for this object, instead of being backed by shmem. */
 	struct drm_mm_node *stolen;
@@ -118,8 +142,10 @@ struct drm_i915_gem_object {
 	 */
 	unsigned long gt_ro:1;
 	unsigned int cache_level:3;
+	unsigned int cache_coherent:2;
+#define I915_BO_CACHE_COHERENT_FOR_READ BIT(0)
+#define I915_BO_CACHE_COHERENT_FOR_WRITE BIT(1)
 	unsigned int cache_dirty:1;
-	unsigned int cache_coherent:1;
 
 	atomic_t frontbuffer_bits;
 	unsigned int frontbuffer_ggtt_origin; /* write once */
@@ -391,6 +417,8 @@ i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj)
 	return engine;
 }
 
+void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
+					 unsigned int cache_level);
 void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_gem_request.c b/drivers/gpu/drm/i915/i915_gem_request.c
index 8c59c79..813a3b5 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.c
+++ b/drivers/gpu/drm/i915/i915_gem_request.c
@@ -213,6 +213,10 @@ static int reset_all_global_seqno(struct drm_i915_private *i915, u32 seqno)
 				cond_resched();
 		}
 
+		/* Check we are idle before we fiddle with hw state! */
+		GEM_BUG_ON(!intel_engine_is_idle(engine));
+		GEM_BUG_ON(i915_gem_active_isset(&engine->timeline->last_request));
+
 		/* Finally reset hw state */
 		intel_engine_init_global_seqno(engine, seqno);
 		tl->seqno = seqno;
@@ -240,27 +244,60 @@ int i915_gem_set_global_seqno(struct drm_device *dev, u32 seqno)
 	return reset_all_global_seqno(dev_priv, seqno - 1);
 }
 
-static int reserve_seqno(struct intel_engine_cs *engine)
+static void mark_busy(struct drm_i915_private *i915)
 {
+	if (i915->gt.awake)
+		return;
+
+	GEM_BUG_ON(!i915->gt.active_requests);
+
+	intel_runtime_pm_get_noresume(i915);
+	i915->gt.awake = true;
+
+	intel_enable_gt_powersave(i915);
+	i915_update_gfx_val(i915);
+	if (INTEL_GEN(i915) >= 6)
+		gen6_rps_busy(i915);
+
+	queue_delayed_work(i915->wq,
+			   &i915->gt.retire_work,
+			   round_jiffies_up_relative(HZ));
+}
+
+static int reserve_engine(struct intel_engine_cs *engine)
+{
+	struct drm_i915_private *i915 = engine->i915;
 	u32 active = ++engine->timeline->inflight_seqnos;
 	u32 seqno = engine->timeline->seqno;
 	int ret;
 
 	/* Reservation is fine until we need to wrap around */
-	if (likely(!add_overflows(seqno, active)))
-		return 0;
-
-	ret = reset_all_global_seqno(engine->i915, 0);
-	if (ret) {
-		engine->timeline->inflight_seqnos--;
-		return ret;
+	if (unlikely(add_overflows(seqno, active))) {
+		ret = reset_all_global_seqno(i915, 0);
+		if (ret) {
+			engine->timeline->inflight_seqnos--;
+			return ret;
+		}
 	}
 
+	if (!i915->gt.active_requests++)
+		mark_busy(i915);
+
 	return 0;
 }
 
-static void unreserve_seqno(struct intel_engine_cs *engine)
+static void unreserve_engine(struct intel_engine_cs *engine)
 {
+	struct drm_i915_private *i915 = engine->i915;
+
+	if (!--i915->gt.active_requests) {
+		/* Cancel the mark_busy() from our reserve_engine() */
+		GEM_BUG_ON(!i915->gt.awake);
+		mod_delayed_work(i915->wq,
+				 &i915->gt.idle_work,
+				 msecs_to_jiffies(100));
+	}
+
 	GEM_BUG_ON(!engine->timeline->inflight_seqnos);
 	engine->timeline->inflight_seqnos--;
 }
@@ -329,13 +366,7 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	list_del_init(&request->link);
 	spin_unlock_irq(&engine->timeline->lock);
 
-	if (!--request->i915->gt.active_requests) {
-		GEM_BUG_ON(!request->i915->gt.awake);
-		mod_delayed_work(request->i915->wq,
-				 &request->i915->gt.idle_work,
-				 msecs_to_jiffies(100));
-	}
-	unreserve_seqno(request->engine);
+	unreserve_engine(request->engine);
 	advance_ring(request);
 
 	free_capture_list(request);
@@ -370,8 +401,7 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 	i915_gem_request_remove_from_client(request);
 
 	/* Retirement decays the ban score as it is a sign of ctx progress */
-	if (request->ctx->ban_score > 0)
-		request->ctx->ban_score--;
+	atomic_dec_if_positive(&request->ctx->ban_score);
 
 	/* The backing object for the context is done after switching to the
 	 * *next* context. Therefore we cannot retire the previous context until
@@ -384,7 +414,11 @@ static void i915_gem_request_retire(struct drm_i915_gem_request *request)
 		engine->context_unpin(engine, engine->last_retired_context);
 	engine->last_retired_context = request->ctx;
 
-	dma_fence_signal(&request->fence);
+	spin_lock_irq(&request->lock);
+	if (request->waitboost)
+		atomic_dec(&request->i915->rps.num_waiters);
+	dma_fence_signal_locked(&request->fence);
+	spin_unlock_irq(&request->lock);
 
 	i915_priotree_fini(request->i915, &request->priotree);
 	i915_gem_request_put(request);
@@ -568,7 +602,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 		return ERR_CAST(ring);
 	GEM_BUG_ON(!ring);
 
-	ret = reserve_seqno(engine);
+	ret = reserve_engine(engine);
 	if (ret)
 		goto err_unpin;
 
@@ -639,6 +673,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 	req->file_priv = NULL;
 	req->batch = NULL;
 	req->capture_list = NULL;
+	req->waitboost = false;
 
 	/*
 	 * Reserve space in the ring buffer for all the commands required to
@@ -673,7 +708,7 @@ i915_gem_request_alloc(struct intel_engine_cs *engine,
 
 	kmem_cache_free(dev_priv->requests, req);
 err_unreserve:
-	unreserve_seqno(engine);
+	unreserve_engine(engine);
 err_unpin:
 	engine->context_unpin(engine, ctx);
 	return ERR_PTR(ret);
@@ -855,28 +890,6 @@ i915_gem_request_await_object(struct drm_i915_gem_request *to,
 	return ret;
 }
 
-static void i915_gem_mark_busy(const struct intel_engine_cs *engine)
-{
-	struct drm_i915_private *dev_priv = engine->i915;
-
-	if (dev_priv->gt.awake)
-		return;
-
-	GEM_BUG_ON(!dev_priv->gt.active_requests);
-
-	intel_runtime_pm_get_noresume(dev_priv);
-	dev_priv->gt.awake = true;
-
-	intel_enable_gt_powersave(dev_priv);
-	i915_update_gfx_val(dev_priv);
-	if (INTEL_GEN(dev_priv) >= 6)
-		gen6_rps_busy(dev_priv);
-
-	queue_delayed_work(dev_priv->wq,
-			   &dev_priv->gt.retire_work,
-			   round_jiffies_up_relative(HZ));
-}
-
 /*
  * NB: This function is not allowed to fail. Doing so would mean the the
  * request is not being tracked for completion but the work itself is
@@ -958,9 +971,6 @@ void __i915_add_request(struct drm_i915_gem_request *request, bool flush_caches)
 	list_add_tail(&request->ring_link, &ring->request_list);
 	request->emitted_jiffies = jiffies;
 
-	if (!request->i915->gt.active_requests++)
-		i915_gem_mark_busy(engine);
-
 	/* Let the backend know a new request has arrived that may need
 	 * to adjust the existing execution schedule due to a high priority
 	 * request - i.e. we may want to preempt the current request in order
@@ -1063,7 +1073,7 @@ static bool __i915_wait_request_check_and_reset(struct drm_i915_gem_request *req
 		return false;
 
 	__set_current_state(TASK_RUNNING);
-	i915_reset(request->i915);
+	i915_reset(request->i915, 0);
 	return true;
 }
 
diff --git a/drivers/gpu/drm/i915/i915_gem_request.h b/drivers/gpu/drm/i915/i915_gem_request.h
index 7579b97..49a4c89 100644
--- a/drivers/gpu/drm/i915/i915_gem_request.h
+++ b/drivers/gpu/drm/i915/i915_gem_request.h
@@ -184,6 +184,8 @@ struct drm_i915_gem_request {
 	/** Time at which this request was emitted, in jiffies. */
 	unsigned long emitted_jiffies;
 
+	bool waitboost;
+
 	/** engine->request_list entry for this request */
 	struct list_head link;
 
diff --git a/drivers/gpu/drm/i915/i915_gem_stolen.c b/drivers/gpu/drm/i915/i915_gem_stolen.c
index a817b3e..507c9f0 100644
--- a/drivers/gpu/drm/i915/i915_gem_stolen.c
+++ b/drivers/gpu/drm/i915/i915_gem_stolen.c
@@ -254,9 +254,10 @@ static dma_addr_t i915_stolen_to_dma(struct drm_i915_private *dev_priv)
 		 * This is a BIOS w/a: Some BIOS wrap stolen in the root
 		 * PCI bus, but have an off-by-one error. Hence retry the
 		 * reservation starting from 1 instead of 0.
+		 * Some BIOSes also have an off-by-one error at the other end.
 		 */
 		r = devm_request_mem_region(dev_priv->drm.dev, base + 1,
-					    ggtt->stolen_size - 1,
+					    ggtt->stolen_size - 2,
 					    "Graphics Stolen Memory");
 		/*
 		 * GEN3 firmware likes to smash pci bridges into the stolen
@@ -579,6 +580,7 @@ _i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 			       struct drm_mm_node *stolen)
 {
 	struct drm_i915_gem_object *obj;
+	unsigned int cache_level;
 
 	obj = i915_gem_object_alloc(dev_priv);
 	if (obj == NULL)
@@ -589,8 +591,8 @@ _i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
 
 	obj->stolen = stolen;
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;
-	obj->cache_level = HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE;
-	obj->cache_coherent = true; /* assumptions! more like cache_oblivious */
+	cache_level = HAS_LLC(dev_priv) ? I915_CACHE_LLC : I915_CACHE_NONE;
+	i915_gem_object_set_cache_coherency(obj, cache_level);
 
 	if (i915_gem_object_pin_pages(obj))
 		goto cleanup;
diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index ccd09e8..f152a38 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -804,9 +804,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev, void *data, struct drm_file *file
 	i915_gem_object_init(obj, &i915_gem_userptr_ops);
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
-	obj->cache_level = I915_CACHE_LLC;
-	obj->cache_coherent = i915_gem_object_is_coherent(obj);
-	obj->cache_dirty = !obj->cache_coherent;
+	i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);
 
 	obj->userptr.ptr = args->user_ptr;
 	obj->userptr.read_only = !!(args->flags & I915_USERPTR_READ_ONLY);
diff --git a/drivers/gpu/drm/i915/i915_gpu_error.c b/drivers/gpu/drm/i915/i915_gpu_error.c
index e18f350..ed5a1eb 100644
--- a/drivers/gpu/drm/i915/i915_gpu_error.c
+++ b/drivers/gpu/drm/i915/i915_gpu_error.c
@@ -463,6 +463,7 @@ static void error_print_engine(struct drm_i915_error_state_buf *m,
 	err_printf(m, "  hangcheck action timestamp: %lu, %u ms ago\n",
 		   ee->hangcheck_timestamp,
 		   jiffies_to_msecs(jiffies - ee->hangcheck_timestamp));
+	err_printf(m, "  engine reset count: %u\n", ee->reset_count);
 
 	error_print_request(m, "  ELSP[0]: ", &ee->execlist[0]);
 	error_print_request(m, "  ELSP[1]: ", &ee->execlist[1]);
@@ -1236,6 +1237,8 @@ static void error_record_engine_registers(struct i915_gpu_state *error,
 	ee->hangcheck_timestamp = engine->hangcheck.action_timestamp;
 	ee->hangcheck_action = engine->hangcheck.action;
 	ee->hangcheck_stalled = engine->hangcheck.stalled;
+	ee->reset_count = i915_reset_engine_count(&dev_priv->gpu_error,
+						  engine);
 
 	if (USES_PPGTT(dev_priv)) {
 		int i;
@@ -1263,7 +1266,7 @@ static void record_request(struct drm_i915_gem_request *request,
 			   struct drm_i915_error_request *erq)
 {
 	erq->context = request->ctx->hw_id;
-	erq->ban_score = request->ctx->ban_score;
+	erq->ban_score = atomic_read(&request->ctx->ban_score);
 	erq->seqno = request->global_seqno;
 	erq->jiffies = request->emitted_jiffies;
 	erq->head = request->head;
@@ -1354,9 +1357,9 @@ static void record_context(struct drm_i915_error_context *e,
 
 	e->handle = ctx->user_handle;
 	e->hw_id = ctx->hw_id;
-	e->ban_score = ctx->ban_score;
-	e->guilty = ctx->guilty_count;
-	e->active = ctx->active_count;
+	e->ban_score = atomic_read(&ctx->ban_score);
+	e->guilty = atomic_read(&ctx->guilty_count);
+	e->active = atomic_read(&ctx->active_count);
 }
 
 static void request_record_user_bo(struct drm_i915_gem_request *request,
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 4cd9ee1..e21ce9c 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -275,17 +275,17 @@ void gen5_disable_gt_irq(struct drm_i915_private *dev_priv, uint32_t mask)
 
 static i915_reg_t gen6_pm_iir(struct drm_i915_private *dev_priv)
 {
-	return INTEL_INFO(dev_priv)->gen >= 8 ? GEN8_GT_IIR(2) : GEN6_PMIIR;
+	return INTEL_GEN(dev_priv) >= 8 ? GEN8_GT_IIR(2) : GEN6_PMIIR;
 }
 
 static i915_reg_t gen6_pm_imr(struct drm_i915_private *dev_priv)
 {
-	return INTEL_INFO(dev_priv)->gen >= 8 ? GEN8_GT_IMR(2) : GEN6_PMIMR;
+	return INTEL_GEN(dev_priv) >= 8 ? GEN8_GT_IMR(2) : GEN6_PMIMR;
 }
 
 static i915_reg_t gen6_pm_ier(struct drm_i915_private *dev_priv)
 {
-	return INTEL_INFO(dev_priv)->gen >= 8 ? GEN8_GT_IER(2) : GEN6_PMIER;
+	return INTEL_GEN(dev_priv) >= 8 ? GEN8_GT_IER(2) : GEN6_PMIER;
 }
 
 /**
@@ -1091,18 +1091,6 @@ static u32 vlv_wa_c0_ei(struct drm_i915_private *dev_priv, u32 pm_iir)
 	return events;
 }
 
-static bool any_waiters(struct drm_i915_private *dev_priv)
-{
-	struct intel_engine_cs *engine;
-	enum intel_engine_id id;
-
-	for_each_engine(engine, dev_priv, id)
-		if (intel_engine_has_waiter(engine))
-			return true;
-
-	return false;
-}
-
 static void gen6_pm_rps_work(struct work_struct *work)
 {
 	struct drm_i915_private *dev_priv =
@@ -1114,7 +1102,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	spin_lock_irq(&dev_priv->irq_lock);
 	if (dev_priv->rps.interrupts_enabled) {
 		pm_iir = fetch_and_zero(&dev_priv->rps.pm_iir);
-		client_boost = fetch_and_zero(&dev_priv->rps.client_boost);
+		client_boost = atomic_read(&dev_priv->rps.num_waiters);
 	}
 	spin_unlock_irq(&dev_priv->irq_lock);
 
@@ -1131,7 +1119,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	new_delay = dev_priv->rps.cur_freq;
 	min = dev_priv->rps.min_freq_softlimit;
 	max = dev_priv->rps.max_freq_softlimit;
-	if (client_boost || any_waiters(dev_priv))
+	if (client_boost)
 		max = dev_priv->rps.max_freq;
 	if (client_boost && new_delay < dev_priv->rps.boost_freq) {
 		new_delay = dev_priv->rps.boost_freq;
@@ -1144,7 +1132,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
 
 		if (new_delay >= dev_priv->rps.max_freq_softlimit)
 			adj = 0;
-	} else if (client_boost || any_waiters(dev_priv)) {
+	} else if (client_boost) {
 		adj = 0;
 	} else if (pm_iir & GEN6_PM_RP_DOWN_TIMEOUT) {
 		if (dev_priv->rps.cur_freq > dev_priv->rps.efficient_freq)
@@ -1513,7 +1501,8 @@ static void intel_get_hpd_pins(u32 *pin_mask, u32 *long_mask,
 
 		*pin_mask |= BIT(i);
 
-		if (!intel_hpd_pin_to_port(i, &port))
+		port = intel_hpd_pin_to_port(i);
+		if (port == PORT_NONE)
 			continue;
 
 		if (long_pulse_detect(port, dig_hotplug_reg))
@@ -1603,7 +1592,7 @@ static void display_pipe_crc_irq_handler(struct drm_i915_private *dev_priv,
 		crcs[3] = crc3;
 		crcs[4] = crc4;
 		drm_crtc_add_crc_entry(&crtc->base, true,
-				       drm_accurate_vblank_count(&crtc->base),
+				       drm_crtc_accurate_vblank_count(&crtc->base),
 				       crcs);
 	}
 }
@@ -1673,7 +1662,7 @@ static void gen6_rps_irq_handler(struct drm_i915_private *dev_priv, u32 pm_iir)
 		spin_unlock(&dev_priv->irq_lock);
 	}
 
-	if (INTEL_INFO(dev_priv)->gen >= 8)
+	if (INTEL_GEN(dev_priv) >= 8)
 		return;
 
 	if (HAS_VEBOX(dev_priv)) {
@@ -1720,18 +1709,6 @@ static void gen9_guc_irq_handler(struct drm_i915_private *dev_priv, u32 gt_iir)
 	}
 }
 
-static bool intel_pipe_handle_vblank(struct drm_i915_private *dev_priv,
-				     enum pipe pipe)
-{
-	bool ret;
-
-	ret = drm_handle_vblank(&dev_priv->drm, pipe);
-	if (ret)
-		intel_finish_page_flip_mmio(dev_priv, pipe);
-
-	return ret;
-}
-
 static void valleyview_pipestat_irq_ack(struct drm_i915_private *dev_priv,
 					u32 iir, u32 pipe_stats[I915_MAX_PIPES])
 {
@@ -1796,12 +1773,8 @@ static void valleyview_pipestat_irq_handler(struct drm_i915_private *dev_priv,
 	enum pipe pipe;
 
 	for_each_pipe(dev_priv, pipe) {
-		if (pipe_stats[pipe] & PIPE_START_VBLANK_INTERRUPT_STATUS &&
-		    intel_pipe_handle_vblank(dev_priv, pipe))
-			intel_check_page_flip(dev_priv, pipe);
-
-		if (pipe_stats[pipe] & PLANE_FLIP_DONE_INT_STATUS_VLV)
-			intel_finish_page_flip_cs(dev_priv, pipe);
+		if (pipe_stats[pipe] & PIPE_START_VBLANK_INTERRUPT_STATUS)
+			drm_handle_vblank(&dev_priv->drm, pipe);
 
 		if (pipe_stats[pipe] & PIPE_CRC_DONE_INTERRUPT_STATUS)
 			i9xx_pipe_crc_irq_handler(dev_priv, pipe);
@@ -2098,10 +2071,10 @@ static void ibx_irq_handler(struct drm_i915_private *dev_priv, u32 pch_iir)
 		DRM_DEBUG_DRIVER("PCH transcoder CRC error interrupt\n");
 
 	if (pch_iir & SDE_TRANSA_FIFO_UNDER)
-		intel_pch_fifo_underrun_irq_handler(dev_priv, TRANSCODER_A);
+		intel_pch_fifo_underrun_irq_handler(dev_priv, PIPE_A);
 
 	if (pch_iir & SDE_TRANSB_FIFO_UNDER)
-		intel_pch_fifo_underrun_irq_handler(dev_priv, TRANSCODER_B);
+		intel_pch_fifo_underrun_irq_handler(dev_priv, PIPE_B);
 }
 
 static void ivb_err_int_handler(struct drm_i915_private *dev_priv)
@@ -2135,13 +2108,13 @@ static void cpt_serr_int_handler(struct drm_i915_private *dev_priv)
 		DRM_ERROR("PCH poison interrupt\n");
 
 	if (serr_int & SERR_INT_TRANS_A_FIFO_UNDERRUN)
-		intel_pch_fifo_underrun_irq_handler(dev_priv, TRANSCODER_A);
+		intel_pch_fifo_underrun_irq_handler(dev_priv, PIPE_A);
 
 	if (serr_int & SERR_INT_TRANS_B_FIFO_UNDERRUN)
-		intel_pch_fifo_underrun_irq_handler(dev_priv, TRANSCODER_B);
+		intel_pch_fifo_underrun_irq_handler(dev_priv, PIPE_B);
 
 	if (serr_int & SERR_INT_TRANS_C_FIFO_UNDERRUN)
-		intel_pch_fifo_underrun_irq_handler(dev_priv, TRANSCODER_C);
+		intel_pch_fifo_underrun_irq_handler(dev_priv, PIPE_C);
 
 	I915_WRITE(SERR_INT, serr_int);
 }
@@ -2253,19 +2226,14 @@ static void ilk_display_irq_handler(struct drm_i915_private *dev_priv,
 		DRM_ERROR("Poison interrupt\n");
 
 	for_each_pipe(dev_priv, pipe) {
-		if (de_iir & DE_PIPE_VBLANK(pipe) &&
-		    intel_pipe_handle_vblank(dev_priv, pipe))
-			intel_check_page_flip(dev_priv, pipe);
+		if (de_iir & DE_PIPE_VBLANK(pipe))
+			drm_handle_vblank(&dev_priv->drm, pipe);
 
 		if (de_iir & DE_PIPE_FIFO_UNDERRUN(pipe))
 			intel_cpu_fifo_underrun_irq_handler(dev_priv, pipe);
 
 		if (de_iir & DE_PIPE_CRC_DONE(pipe))
 			i9xx_pipe_crc_irq_handler(dev_priv, pipe);
-
-		/* plane/pipes map 1:1 on ilk+ */
-		if (de_iir & DE_PLANE_FLIP_DONE(pipe))
-			intel_finish_page_flip_cs(dev_priv, pipe);
 	}
 
 	/* check event from PCH */
@@ -2304,13 +2272,8 @@ static void ivb_display_irq_handler(struct drm_i915_private *dev_priv,
 		intel_opregion_asle_intr(dev_priv);
 
 	for_each_pipe(dev_priv, pipe) {
-		if (de_iir & (DE_PIPE_VBLANK_IVB(pipe)) &&
-		    intel_pipe_handle_vblank(dev_priv, pipe))
-			intel_check_page_flip(dev_priv, pipe);
-
-		/* plane/pipes map 1:1 on ilk+ */
-		if (de_iir & DE_PLANE_FLIP_DONE_IVB(pipe))
-			intel_finish_page_flip_cs(dev_priv, pipe);
+		if (de_iir & (DE_PIPE_VBLANK_IVB(pipe)))
+			drm_handle_vblank(&dev_priv->drm, pipe);
 	}
 
 	/* check event from PCH */
@@ -2452,7 +2415,7 @@ gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
 			ret = IRQ_HANDLED;
 
 			tmp_mask = GEN8_AUX_CHANNEL_A;
-			if (INTEL_INFO(dev_priv)->gen >= 9)
+			if (INTEL_GEN(dev_priv) >= 9)
 				tmp_mask |= GEN9_AUX_CHANNEL_B |
 					    GEN9_AUX_CHANNEL_C |
 					    GEN9_AUX_CHANNEL_D;
@@ -2491,7 +2454,7 @@ gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
 	}
 
 	for_each_pipe(dev_priv, pipe) {
-		u32 flip_done, fault_errors;
+		u32 fault_errors;
 
 		if (!(master_ctl & GEN8_DE_PIPE_IRQ(pipe)))
 			continue;
@@ -2505,18 +2468,8 @@ gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
 		ret = IRQ_HANDLED;
 		I915_WRITE(GEN8_DE_PIPE_IIR(pipe), iir);
 
-		if (iir & GEN8_PIPE_VBLANK &&
-		    intel_pipe_handle_vblank(dev_priv, pipe))
-			intel_check_page_flip(dev_priv, pipe);
-
-		flip_done = iir;
-		if (INTEL_INFO(dev_priv)->gen >= 9)
-			flip_done &= GEN9_PIPE_PLANE1_FLIP_DONE;
-		else
-			flip_done &= GEN8_PIPE_PRIMARY_FLIP_DONE;
-
-		if (flip_done)
-			intel_finish_page_flip_cs(dev_priv, pipe);
+		if (iir & GEN8_PIPE_VBLANK)
+			drm_handle_vblank(&dev_priv->drm, pipe);
 
 		if (iir & GEN8_PIPE_CDCLK_CRC_DONE)
 			hsw_pipe_crc_irq_handler(dev_priv, pipe);
@@ -2525,7 +2478,7 @@ gen8_de_irq_handler(struct drm_i915_private *dev_priv, u32 master_ctl)
 			intel_cpu_fifo_underrun_irq_handler(dev_priv, pipe);
 
 		fault_errors = iir;
-		if (INTEL_INFO(dev_priv)->gen >= 9)
+		if (INTEL_GEN(dev_priv) >= 9)
 			fault_errors &= GEN9_DE_PIPE_IRQ_FAULT_ERRORS;
 		else
 			fault_errors &= GEN8_DE_PIPE_IRQ_FAULT_ERRORS;
@@ -2599,86 +2552,93 @@ static irqreturn_t gen8_irq_handler(int irq, void *arg)
 	return ret;
 }
 
+struct wedge_me {
+	struct delayed_work work;
+	struct drm_i915_private *i915;
+	const char *name;
+};
+
+static void wedge_me(struct work_struct *work)
+{
+	struct wedge_me *w = container_of(work, typeof(*w), work.work);
+
+	dev_err(w->i915->drm.dev,
+		"%s timed out, cancelling all in-flight rendering.\n",
+		w->name);
+	i915_gem_set_wedged(w->i915);
+}
+
+static void __init_wedge(struct wedge_me *w,
+			 struct drm_i915_private *i915,
+			 long timeout,
+			 const char *name)
+{
+	w->i915 = i915;
+	w->name = name;
+
+	INIT_DELAYED_WORK_ONSTACK(&w->work, wedge_me);
+	schedule_delayed_work(&w->work, timeout);
+}
+
+static void __fini_wedge(struct wedge_me *w)
+{
+	cancel_delayed_work_sync(&w->work);
+	destroy_delayed_work_on_stack(&w->work);
+	w->i915 = NULL;
+}
+
+#define i915_wedge_on_timeout(W, DEV, TIMEOUT)				\
+	for (__init_wedge((W), (DEV), (TIMEOUT), __func__);		\
+	     (W)->i915;							\
+	     __fini_wedge((W)))
+
 /**
- * i915_reset_and_wakeup - do process context error handling work
+ * i915_reset_device - do process context error handling work
  * @dev_priv: i915 device private
  *
  * Fire an error uevent so userspace can see that a hang or error
  * was detected.
  */
-static void i915_reset_and_wakeup(struct drm_i915_private *dev_priv)
+static void i915_reset_device(struct drm_i915_private *dev_priv)
 {
 	struct kobject *kobj = &dev_priv->drm.primary->kdev->kobj;
 	char *error_event[] = { I915_ERROR_UEVENT "=1", NULL };
 	char *reset_event[] = { I915_RESET_UEVENT "=1", NULL };
 	char *reset_done_event[] = { I915_ERROR_UEVENT "=0", NULL };
+	struct wedge_me w;
 
 	kobject_uevent_env(kobj, KOBJ_CHANGE, error_event);
 
 	DRM_DEBUG_DRIVER("resetting chip\n");
 	kobject_uevent_env(kobj, KOBJ_CHANGE, reset_event);
 
-	intel_prepare_reset(dev_priv);
+	/* Use a watchdog to ensure that our reset completes */
+	i915_wedge_on_timeout(&w, dev_priv, 5*HZ) {
+		intel_prepare_reset(dev_priv);
 
-	set_bit(I915_RESET_HANDOFF, &dev_priv->gpu_error.flags);
-	wake_up_all(&dev_priv->gpu_error.wait_queue);
+		/* Signal that locked waiters should reset the GPU */
+		set_bit(I915_RESET_HANDOFF, &dev_priv->gpu_error.flags);
+		wake_up_all(&dev_priv->gpu_error.wait_queue);
 
-	do {
-		/*
-		 * All state reset _must_ be completed before we update the
-		 * reset counter, for otherwise waiters might miss the reset
-		 * pending state and not properly drop locks, resulting in
-		 * deadlocks with the reset work.
+		/* Wait for anyone holding the lock to wake up, without
+		 * blocking indefinitely on struct_mutex.
 		 */
-		if (mutex_trylock(&dev_priv->drm.struct_mutex)) {
-			i915_reset(dev_priv);
-			mutex_unlock(&dev_priv->drm.struct_mutex);
-		}
+		do {
+			if (mutex_trylock(&dev_priv->drm.struct_mutex)) {
+				i915_reset(dev_priv, 0);
+				mutex_unlock(&dev_priv->drm.struct_mutex);
+			}
+		} while (wait_on_bit_timeout(&dev_priv->gpu_error.flags,
+					     I915_RESET_HANDOFF,
+					     TASK_UNINTERRUPTIBLE,
+					     1));
 
-		/* We need to wait for anyone holding the lock to wakeup */
-	} while (wait_on_bit_timeout(&dev_priv->gpu_error.flags,
-				     I915_RESET_HANDOFF,
-				     TASK_UNINTERRUPTIBLE,
-				     HZ));
-
-	intel_finish_reset(dev_priv);
+		intel_finish_reset(dev_priv);
+	}
 
 	if (!test_bit(I915_WEDGED, &dev_priv->gpu_error.flags))
 		kobject_uevent_env(kobj,
 				   KOBJ_CHANGE, reset_done_event);
-
-	/*
-	 * Note: The wake_up also serves as a memory barrier so that
-	 * waiters see the updated value of the dev_priv->gpu_error.
-	 */
-	clear_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags);
-	wake_up_all(&dev_priv->gpu_error.reset_queue);
-}
-
-static inline void
-i915_err_print_instdone(struct drm_i915_private *dev_priv,
-			struct intel_instdone *instdone)
-{
-	int slice;
-	int subslice;
-
-	pr_err("  INSTDONE: 0x%08x\n", instdone->instdone);
-
-	if (INTEL_GEN(dev_priv) <= 3)
-		return;
-
-	pr_err("  SC_INSTDONE: 0x%08x\n", instdone->slice_common);
-
-	if (INTEL_GEN(dev_priv) <= 6)
-		return;
-
-	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
-		pr_err("  SAMPLER_INSTDONE[%d][%d]: 0x%08x\n",
-		       slice, subslice, instdone->sampler[slice][subslice]);
-
-	for_each_instdone_slice_subslice(dev_priv, slice, subslice)
-		pr_err("  ROW_INSTDONE[%d][%d]: 0x%08x\n",
-		       slice, subslice, instdone->row[slice][subslice]);
 }
 
 static void i915_clear_error_registers(struct drm_i915_private *dev_priv)
@@ -2722,6 +2682,8 @@ void i915_handle_error(struct drm_i915_private *dev_priv,
 		       u32 engine_mask,
 		       const char *fmt, ...)
 {
+	struct intel_engine_cs *engine;
+	unsigned int tmp;
 	va_list args;
 	char error_msg[80];
 
@@ -2741,14 +2703,56 @@ void i915_handle_error(struct drm_i915_private *dev_priv,
 	i915_capture_error_state(dev_priv, engine_mask, error_msg);
 	i915_clear_error_registers(dev_priv);
 
+	/*
+	 * Try a per-engine reset when available. We fall back to a full reset
+	 * if the engine reset fails.
+	 */
+	if (intel_has_reset_engine(dev_priv)) {
+		for_each_engine_masked(engine, dev_priv, engine_mask, tmp) {
+			BUILD_BUG_ON(I915_RESET_MODESET >= I915_RESET_ENGINE);
+			if (test_and_set_bit(I915_RESET_ENGINE + engine->id,
+					     &dev_priv->gpu_error.flags))
+				continue;
+
+			if (i915_reset_engine(engine, 0) == 0)
+				engine_mask &= ~intel_engine_flag(engine);
+
+			clear_bit(I915_RESET_ENGINE + engine->id,
+				  &dev_priv->gpu_error.flags);
+			wake_up_bit(&dev_priv->gpu_error.flags,
+				    I915_RESET_ENGINE + engine->id);
+		}
+	}
+
 	if (!engine_mask)
 		goto out;
 
-	if (test_and_set_bit(I915_RESET_BACKOFF,
-			     &dev_priv->gpu_error.flags))
+	/* Full reset needs the mutex, stop any other user trying to do so. */
+	if (test_and_set_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags)) {
+		wait_event(dev_priv->gpu_error.reset_queue,
+			   !test_bit(I915_RESET_BACKOFF,
+				     &dev_priv->gpu_error.flags));
 		goto out;
+	}
 
-	i915_reset_and_wakeup(dev_priv);
+	/* Prevent any other reset-engine attempt. */
+	for_each_engine(engine, dev_priv, tmp) {
+		while (test_and_set_bit(I915_RESET_ENGINE + engine->id,
+					&dev_priv->gpu_error.flags))
+			wait_on_bit(&dev_priv->gpu_error.flags,
+				    I915_RESET_ENGINE + engine->id,
+				    TASK_UNINTERRUPTIBLE);
+	}
+
+	i915_reset_device(dev_priv);
+
+	for_each_engine(engine, dev_priv, tmp) {
+		clear_bit(I915_RESET_ENGINE + engine->id,
+			  &dev_priv->gpu_error.flags);
+	}
+
+	clear_bit(I915_RESET_BACKOFF, &dev_priv->gpu_error.flags);
+	wake_up_all(&dev_priv->gpu_error.reset_queue);
 
 out:
 	intel_runtime_pm_put(dev_priv);
@@ -3009,7 +3013,7 @@ static void gen8_irq_reset(struct drm_device *dev)
 }
 
 void gen8_irq_power_well_post_enable(struct drm_i915_private *dev_priv,
-				     unsigned int pipe_mask)
+				     u8 pipe_mask)
 {
 	uint32_t extra_ier = GEN8_PIPE_VBLANK | GEN8_PIPE_FIFO_UNDERRUN;
 	enum pipe pipe;
@@ -3023,7 +3027,7 @@ void gen8_irq_power_well_post_enable(struct drm_i915_private *dev_priv,
 }
 
 void gen8_irq_power_well_pre_disable(struct drm_i915_private *dev_priv,
-				     unsigned int pipe_mask)
+				     u8 pipe_mask)
 {
 	enum pipe pipe;
 
@@ -3427,7 +3431,7 @@ static void gen8_de_irq_postinstall(struct drm_i915_private *dev_priv)
 	u32 de_misc_masked = GEN8_DE_MISC_GSE;
 	enum pipe pipe;
 
-	if (INTEL_INFO(dev_priv)->gen >= 9) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		de_pipe_masked |= GEN9_PIPE_PLANE1_FLIP_DONE |
 				  GEN9_DE_PIPE_IRQ_FAULT_ERRORS;
 		de_port_masked |= GEN9_AUX_CHANNEL_B | GEN9_AUX_CHANNEL_C |
@@ -3610,34 +3614,6 @@ static int i8xx_irq_postinstall(struct drm_device *dev)
 /*
  * Returns true when a page flip has completed.
  */
-static bool i8xx_handle_vblank(struct drm_i915_private *dev_priv,
-			       int plane, int pipe, u32 iir)
-{
-	u16 flip_pending = DISPLAY_PLANE_FLIP_PENDING(plane);
-
-	if (!intel_pipe_handle_vblank(dev_priv, pipe))
-		return false;
-
-	if ((iir & flip_pending) == 0)
-		goto check_page_flip;
-
-	/* We detect FlipDone by looking for the change in PendingFlip from '1'
-	 * to '0' on the following vblank, i.e. IIR has the Pendingflip
-	 * asserted following the MI_DISPLAY_FLIP, but ISR is deasserted, hence
-	 * the flip is completed (no longer pending). Since this doesn't raise
-	 * an interrupt per se, we watch for the change at vblank.
-	 */
-	if (I915_READ16(ISR) & flip_pending)
-		goto check_page_flip;
-
-	intel_finish_page_flip_cs(dev_priv, pipe);
-	return true;
-
-check_page_flip:
-	intel_check_page_flip(dev_priv, pipe);
-	return false;
-}
-
 static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = arg;
@@ -3645,9 +3621,6 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 	u16 iir, new_iir;
 	u32 pipe_stats[2];
 	int pipe;
-	u16 flip_mask =
-		I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT |
-		I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT;
 	irqreturn_t ret;
 
 	if (!intel_irqs_enabled(dev_priv))
@@ -3661,7 +3634,7 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 	if (iir == 0)
 		goto out;
 
-	while (iir & ~flip_mask) {
+	while (iir) {
 		/* Can't rely on pipestat interrupt bit in iir as it might
 		 * have been cleared after the pipestat interrupt was received.
 		 * It doesn't set the bit in iir again, but it still produces
@@ -3683,7 +3656,7 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 		}
 		spin_unlock(&dev_priv->irq_lock);
 
-		I915_WRITE16(IIR, iir & ~flip_mask);
+		I915_WRITE16(IIR, iir);
 		new_iir = I915_READ16(IIR); /* Flush posted writes */
 
 		if (iir & I915_USER_INTERRUPT)
@@ -3694,9 +3667,8 @@ static irqreturn_t i8xx_irq_handler(int irq, void *arg)
 			if (HAS_FBC(dev_priv))
 				plane = !plane;
 
-			if (pipe_stats[pipe] & PIPE_VBLANK_INTERRUPT_STATUS &&
-			    i8xx_handle_vblank(dev_priv, plane, pipe, iir))
-				flip_mask &= ~DISPLAY_PLANE_FLIP_PENDING(plane);
+			if (pipe_stats[pipe] & PIPE_VBLANK_INTERRUPT_STATUS)
+				drm_handle_vblank(&dev_priv->drm, pipe);
 
 			if (pipe_stats[pipe] & PIPE_CRC_DONE_INTERRUPT_STATUS)
 				i9xx_pipe_crc_irq_handler(dev_priv, pipe);
@@ -3796,45 +3768,11 @@ static int i915_irq_postinstall(struct drm_device *dev)
 	return 0;
 }
 
-/*
- * Returns true when a page flip has completed.
- */
-static bool i915_handle_vblank(struct drm_i915_private *dev_priv,
-			       int plane, int pipe, u32 iir)
-{
-	u32 flip_pending = DISPLAY_PLANE_FLIP_PENDING(plane);
-
-	if (!intel_pipe_handle_vblank(dev_priv, pipe))
-		return false;
-
-	if ((iir & flip_pending) == 0)
-		goto check_page_flip;
-
-	/* We detect FlipDone by looking for the change in PendingFlip from '1'
-	 * to '0' on the following vblank, i.e. IIR has the Pendingflip
-	 * asserted following the MI_DISPLAY_FLIP, but ISR is deasserted, hence
-	 * the flip is completed (no longer pending). Since this doesn't raise
-	 * an interrupt per se, we watch for the change at vblank.
-	 */
-	if (I915_READ(ISR) & flip_pending)
-		goto check_page_flip;
-
-	intel_finish_page_flip_cs(dev_priv, pipe);
-	return true;
-
-check_page_flip:
-	intel_check_page_flip(dev_priv, pipe);
-	return false;
-}
-
 static irqreturn_t i915_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = arg;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	u32 iir, new_iir, pipe_stats[I915_MAX_PIPES];
-	u32 flip_mask =
-		I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT |
-		I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT;
 	int pipe, ret = IRQ_NONE;
 
 	if (!intel_irqs_enabled(dev_priv))
@@ -3845,7 +3783,7 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
 
 	iir = I915_READ(IIR);
 	do {
-		bool irq_received = (iir & ~flip_mask) != 0;
+		bool irq_received = (iir) != 0;
 		bool blc_event = false;
 
 		/* Can't rely on pipestat interrupt bit in iir as it might
@@ -3880,7 +3818,7 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
 				i9xx_hpd_irq_handler(dev_priv, hotplug_status);
 		}
 
-		I915_WRITE(IIR, iir & ~flip_mask);
+		I915_WRITE(IIR, iir);
 		new_iir = I915_READ(IIR); /* Flush posted writes */
 
 		if (iir & I915_USER_INTERRUPT)
@@ -3891,9 +3829,8 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
 			if (HAS_FBC(dev_priv))
 				plane = !plane;
 
-			if (pipe_stats[pipe] & PIPE_VBLANK_INTERRUPT_STATUS &&
-			    i915_handle_vblank(dev_priv, plane, pipe, iir))
-				flip_mask &= ~DISPLAY_PLANE_FLIP_PENDING(plane);
+			if (pipe_stats[pipe] & PIPE_VBLANK_INTERRUPT_STATUS)
+				drm_handle_vblank(&dev_priv->drm, pipe);
 
 			if (pipe_stats[pipe] & PIPE_LEGACY_BLC_EVENT_STATUS)
 				blc_event = true;
@@ -3926,7 +3863,7 @@ static irqreturn_t i915_irq_handler(int irq, void *arg)
 		 */
 		ret = IRQ_HANDLED;
 		iir = new_iir;
-	} while (iir & ~flip_mask);
+	} while (iir);
 
 	enable_rpm_wakeref_asserts(dev_priv);
 
@@ -4061,9 +3998,6 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 	u32 iir, new_iir;
 	u32 pipe_stats[I915_MAX_PIPES];
 	int ret = IRQ_NONE, pipe;
-	u32 flip_mask =
-		I915_DISPLAY_PLANE_A_FLIP_PENDING_INTERRUPT |
-		I915_DISPLAY_PLANE_B_FLIP_PENDING_INTERRUPT;
 
 	if (!intel_irqs_enabled(dev_priv))
 		return IRQ_NONE;
@@ -4074,7 +4008,7 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 	iir = I915_READ(IIR);
 
 	for (;;) {
-		bool irq_received = (iir & ~flip_mask) != 0;
+		bool irq_received = (iir) != 0;
 		bool blc_event = false;
 
 		/* Can't rely on pipestat interrupt bit in iir as it might
@@ -4112,7 +4046,7 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 				i9xx_hpd_irq_handler(dev_priv, hotplug_status);
 		}
 
-		I915_WRITE(IIR, iir & ~flip_mask);
+		I915_WRITE(IIR, iir);
 		new_iir = I915_READ(IIR); /* Flush posted writes */
 
 		if (iir & I915_USER_INTERRUPT)
@@ -4121,9 +4055,8 @@ static irqreturn_t i965_irq_handler(int irq, void *arg)
 			notify_ring(dev_priv->engine[VCS]);
 
 		for_each_pipe(dev_priv, pipe) {
-			if (pipe_stats[pipe] & PIPE_START_VBLANK_INTERRUPT_STATUS &&
-			    i915_handle_vblank(dev_priv, pipe, pipe, iir))
-				flip_mask &= ~DISPLAY_PLANE_FLIP_PENDING(pipe);
+			if (pipe_stats[pipe] & PIPE_START_VBLANK_INTERRUPT_STATUS)
+				drm_handle_vblank(&dev_priv->drm, pipe);
 
 			if (pipe_stats[pipe] & PIPE_LEGACY_BLC_EVENT_STATUS)
 				blc_event = true;
@@ -4225,16 +4158,16 @@ void intel_irq_init(struct drm_i915_private *dev_priv)
 	 *
 	 * TODO: verify if this can be reproduced on VLV,CHV.
 	 */
-	if (INTEL_INFO(dev_priv)->gen <= 7)
+	if (INTEL_GEN(dev_priv) <= 7)
 		dev_priv->rps.pm_intrmsk_mbz |= GEN6_PM_RP_UP_EI_EXPIRED;
 
-	if (INTEL_INFO(dev_priv)->gen >= 8)
+	if (INTEL_GEN(dev_priv) >= 8)
 		dev_priv->rps.pm_intrmsk_mbz |= GEN8_PMINTR_DISABLE_REDIRECT_TO_GUC;
 
 	if (IS_GEN2(dev_priv)) {
 		/* Gen2 doesn't have a hardware frame counter */
 		dev->max_vblank_count = 0;
-	} else if (IS_G4X(dev_priv) || INTEL_INFO(dev_priv)->gen >= 5) {
+	} else if (IS_G4X(dev_priv) || INTEL_GEN(dev_priv) >= 5) {
 		dev->max_vblank_count = 0xffffffff; /* full 32 bit counter */
 		dev->driver->get_vblank_counter = g4x_get_vblank_counter;
 	} else {
@@ -4281,7 +4214,7 @@ void intel_irq_init(struct drm_i915_private *dev_priv)
 		dev->driver->enable_vblank = i965_enable_vblank;
 		dev->driver->disable_vblank = i965_disable_vblank;
 		dev_priv->display.hpd_irq_setup = i915_hpd_irq_setup;
-	} else if (INTEL_INFO(dev_priv)->gen >= 8) {
+	} else if (INTEL_GEN(dev_priv) >= 8) {
 		dev->driver->irq_handler = gen8_irq_handler;
 		dev->driver->irq_preinstall = gen8_irq_reset;
 		dev->driver->irq_postinstall = gen8_irq_postinstall;
diff --git a/drivers/gpu/drm/i915/i915_oa_bdw.c b/drivers/gpu/drm/i915/i915_oa_bdw.c
index d4462c2a..abdf4d0 100644
--- a/drivers/gpu/drm/i915/i915_oa_bdw.c
+++ b/drivers/gpu/drm/i915/i915_oa_bdw.c
@@ -31,3981 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_bdw.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_COMPUTE_L3_CACHE,
-	METRIC_SET_ID_DATA_PORT_READS_COALESCING,
-	METRIC_SET_ID_DATA_PORT_WRITES_COALESCING,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_L3_2,
-	METRIC_SET_ID_L3_3,
-	METRIC_SET_ID_L3_4,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER_1,
-	METRIC_SET_ID_SAMPLER_2,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_COMPUTE_EXTRA,
-	METRIC_SET_ID_VME_PIPE,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_bdw = 22;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic_0_slices_0x01[] = {
-	{ _MMIO(0x9888), 0x143f000f },
-	{ _MMIO(0x9888), 0x14110014 },
-	{ _MMIO(0x9888), 0x14310014 },
-	{ _MMIO(0x9888), 0x14bf000f },
-	{ _MMIO(0x9888), 0x118a0317 },
-	{ _MMIO(0x9888), 0x13837be0 },
-	{ _MMIO(0x9888), 0x3b800060 },
-	{ _MMIO(0x9888), 0x3d800005 },
-	{ _MMIO(0x9888), 0x005c4000 },
-	{ _MMIO(0x9888), 0x065c8000 },
-	{ _MMIO(0x9888), 0x085cc000 },
-	{ _MMIO(0x9888), 0x003d8000 },
-	{ _MMIO(0x9888), 0x183d0800 },
-	{ _MMIO(0x9888), 0x0a3f0023 },
-	{ _MMIO(0x9888), 0x103f0000 },
-	{ _MMIO(0x9888), 0x00584000 },
-	{ _MMIO(0x9888), 0x08584000 },
-	{ _MMIO(0x9888), 0x0a5a4000 },
-	{ _MMIO(0x9888), 0x005b4000 },
-	{ _MMIO(0x9888), 0x0e5b8000 },
-	{ _MMIO(0x9888), 0x185b2400 },
-	{ _MMIO(0x9888), 0x0a1d4000 },
-	{ _MMIO(0x9888), 0x0c1f0800 },
-	{ _MMIO(0x9888), 0x0e1faa00 },
-	{ _MMIO(0x9888), 0x00384000 },
-	{ _MMIO(0x9888), 0x0e384000 },
-	{ _MMIO(0x9888), 0x16384000 },
-	{ _MMIO(0x9888), 0x18380001 },
-	{ _MMIO(0x9888), 0x00392000 },
-	{ _MMIO(0x9888), 0x06398000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a391000 },
-	{ _MMIO(0x9888), 0x00104000 },
-	{ _MMIO(0x9888), 0x08104000 },
-	{ _MMIO(0x9888), 0x00110030 },
-	{ _MMIO(0x9888), 0x08110031 },
-	{ _MMIO(0x9888), 0x10110000 },
-	{ _MMIO(0x9888), 0x00134000 },
-	{ _MMIO(0x9888), 0x16130020 },
-	{ _MMIO(0x9888), 0x06308000 },
-	{ _MMIO(0x9888), 0x08308000 },
-	{ _MMIO(0x9888), 0x06311800 },
-	{ _MMIO(0x9888), 0x08311880 },
-	{ _MMIO(0x9888), 0x10310000 },
-	{ _MMIO(0x9888), 0x0e334000 },
-	{ _MMIO(0x9888), 0x16330080 },
-	{ _MMIO(0x9888), 0x0abf1180 },
-	{ _MMIO(0x9888), 0x10bf0000 },
-	{ _MMIO(0x9888), 0x0ada8000 },
-	{ _MMIO(0x9888), 0x0a9d8000 },
-	{ _MMIO(0x9888), 0x109f0002 },
-	{ _MMIO(0x9888), 0x0ab94000 },
-	{ _MMIO(0x9888), 0x0d888000 },
-	{ _MMIO(0x9888), 0x038a0380 },
-	{ _MMIO(0x9888), 0x058a000e },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8a00a0 },
-	{ _MMIO(0x9888), 0x078a0000 },
-	{ _MMIO(0x9888), 0x098a0000 },
-	{ _MMIO(0x9888), 0x238b2820 },
-	{ _MMIO(0x9888), 0x258b2550 },
-	{ _MMIO(0x9888), 0x198c1000 },
-	{ _MMIO(0x9888), 0x0b8d8000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa0 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x0d831021 },
-	{ _MMIO(0x9888), 0x0f83572f },
-	{ _MMIO(0x9888), 0x01835680 },
-	{ _MMIO(0x9888), 0x0383002c },
-	{ _MMIO(0x9888), 0x11830000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830001 },
-	{ _MMIO(0x9888), 0x05830000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x05844000 },
-	{ _MMIO(0x9888), 0x1b80c137 },
-	{ _MMIO(0x9888), 0x1d80c147 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x15804000 },
-	{ _MMIO(0x9888), 0x4d801110 },
-	{ _MMIO(0x9888), 0x4f800331 },
-	{ _MMIO(0x9888), 0x43800802 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45801465 },
-	{ _MMIO(0x9888), 0x53801111 },
-	{ _MMIO(0x9888), 0x478014a5 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800ca5 },
-	{ _MMIO(0x9888), 0x41800003 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic_1_slices_0x02[] = {
-	{ _MMIO(0x9888), 0x143f000f },
-	{ _MMIO(0x9888), 0x14bf000f },
-	{ _MMIO(0x9888), 0x14910014 },
-	{ _MMIO(0x9888), 0x14b10014 },
-	{ _MMIO(0x9888), 0x118a0317 },
-	{ _MMIO(0x9888), 0x13837be0 },
-	{ _MMIO(0x9888), 0x3b800060 },
-	{ _MMIO(0x9888), 0x3d800005 },
-	{ _MMIO(0x9888), 0x0a3f0023 },
-	{ _MMIO(0x9888), 0x103f0000 },
-	{ _MMIO(0x9888), 0x0a5a4000 },
-	{ _MMIO(0x9888), 0x0a1d4000 },
-	{ _MMIO(0x9888), 0x0e1f8000 },
-	{ _MMIO(0x9888), 0x0a391000 },
-	{ _MMIO(0x9888), 0x00dc4000 },
-	{ _MMIO(0x9888), 0x06dc8000 },
-	{ _MMIO(0x9888), 0x08dcc000 },
-	{ _MMIO(0x9888), 0x00bd8000 },
-	{ _MMIO(0x9888), 0x18bd0800 },
-	{ _MMIO(0x9888), 0x0abf1180 },
-	{ _MMIO(0x9888), 0x10bf0000 },
-	{ _MMIO(0x9888), 0x00d84000 },
-	{ _MMIO(0x9888), 0x08d84000 },
-	{ _MMIO(0x9888), 0x0ada8000 },
-	{ _MMIO(0x9888), 0x00db4000 },
-	{ _MMIO(0x9888), 0x0edb8000 },
-	{ _MMIO(0x9888), 0x18db2400 },
-	{ _MMIO(0x9888), 0x0a9d8000 },
-	{ _MMIO(0x9888), 0x0c9f0800 },
-	{ _MMIO(0x9888), 0x0e9f2a00 },
-	{ _MMIO(0x9888), 0x109f0002 },
-	{ _MMIO(0x9888), 0x00b84000 },
-	{ _MMIO(0x9888), 0x0eb84000 },
-	{ _MMIO(0x9888), 0x16b84000 },
-	{ _MMIO(0x9888), 0x18b80001 },
-	{ _MMIO(0x9888), 0x00b92000 },
-	{ _MMIO(0x9888), 0x06b98000 },
-	{ _MMIO(0x9888), 0x08b9a000 },
-	{ _MMIO(0x9888), 0x0ab94000 },
-	{ _MMIO(0x9888), 0x00904000 },
-	{ _MMIO(0x9888), 0x08904000 },
-	{ _MMIO(0x9888), 0x00910030 },
-	{ _MMIO(0x9888), 0x08910031 },
-	{ _MMIO(0x9888), 0x10910000 },
-	{ _MMIO(0x9888), 0x00934000 },
-	{ _MMIO(0x9888), 0x16930020 },
-	{ _MMIO(0x9888), 0x06b08000 },
-	{ _MMIO(0x9888), 0x08b08000 },
-	{ _MMIO(0x9888), 0x06b11800 },
-	{ _MMIO(0x9888), 0x08b11880 },
-	{ _MMIO(0x9888), 0x10b10000 },
-	{ _MMIO(0x9888), 0x0eb34000 },
-	{ _MMIO(0x9888), 0x16b30080 },
-	{ _MMIO(0x9888), 0x01888000 },
-	{ _MMIO(0x9888), 0x0d88b800 },
-	{ _MMIO(0x9888), 0x038a0380 },
-	{ _MMIO(0x9888), 0x058a000e },
-	{ _MMIO(0x9888), 0x1b8a0080 },
-	{ _MMIO(0x9888), 0x078a0000 },
-	{ _MMIO(0x9888), 0x098a0000 },
-	{ _MMIO(0x9888), 0x238b2840 },
-	{ _MMIO(0x9888), 0x258b26a0 },
-	{ _MMIO(0x9888), 0x018c4000 },
-	{ _MMIO(0x9888), 0x0f8c4000 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c1100 },
-	{ _MMIO(0x9888), 0x018d2000 },
-	{ _MMIO(0x9888), 0x078d8000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8d8000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa0 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x0d831021 },
-	{ _MMIO(0x9888), 0x0f83572f },
-	{ _MMIO(0x9888), 0x01835680 },
-	{ _MMIO(0x9888), 0x0383002c },
-	{ _MMIO(0x9888), 0x11830000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830001 },
-	{ _MMIO(0x9888), 0x05830000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x05844000 },
-	{ _MMIO(0x9888), 0x1b80c137 },
-	{ _MMIO(0x9888), 0x1d80c147 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x15804000 },
-	{ _MMIO(0x9888), 0x4d801550 },
-	{ _MMIO(0x9888), 0x4f800331 },
-	{ _MMIO(0x9888), 0x43800802 },
-	{ _MMIO(0x9888), 0x51800400 },
-	{ _MMIO(0x9888), 0x458004a1 },
-	{ _MMIO(0x9888), 0x53805555 },
-	{ _MMIO(0x9888), 0x47800421 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f801421 },
-	{ _MMIO(0x9888), 0x41800845 },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 2);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 2);
-
-	if (INTEL_INFO(dev_priv)->sseu.slice_mask & 0x01) {
-		regs[n] = mux_config_render_basic_0_slices_0x01;
-		lens[n] = ARRAY_SIZE(mux_config_render_basic_0_slices_0x01);
-		n++;
-	}
-	if (INTEL_INFO(dev_priv)->sseu.slice_mask & 0x02) {
-		regs[n] = mux_config_render_basic_1_slices_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_render_basic_1_slices_0x02);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic_0_slices_0x01[] = {
-	{ _MMIO(0x9888), 0x105c00e0 },
-	{ _MMIO(0x9888), 0x105800e0 },
-	{ _MMIO(0x9888), 0x103800e0 },
-	{ _MMIO(0x9888), 0x3580001a },
-	{ _MMIO(0x9888), 0x3b800060 },
-	{ _MMIO(0x9888), 0x3d800005 },
-	{ _MMIO(0x9888), 0x065c2100 },
-	{ _MMIO(0x9888), 0x0a5c0041 },
-	{ _MMIO(0x9888), 0x0c5c6600 },
-	{ _MMIO(0x9888), 0x005c6580 },
-	{ _MMIO(0x9888), 0x085c8000 },
-	{ _MMIO(0x9888), 0x0e5c8000 },
-	{ _MMIO(0x9888), 0x00580042 },
-	{ _MMIO(0x9888), 0x08582080 },
-	{ _MMIO(0x9888), 0x0c58004c },
-	{ _MMIO(0x9888), 0x0e582580 },
-	{ _MMIO(0x9888), 0x005b4000 },
-	{ _MMIO(0x9888), 0x185b1000 },
-	{ _MMIO(0x9888), 0x1a5b0104 },
-	{ _MMIO(0x9888), 0x0c1fa800 },
-	{ _MMIO(0x9888), 0x0e1faa00 },
-	{ _MMIO(0x9888), 0x101f02aa },
-	{ _MMIO(0x9888), 0x08380042 },
-	{ _MMIO(0x9888), 0x0a382080 },
-	{ _MMIO(0x9888), 0x0e38404c },
-	{ _MMIO(0x9888), 0x0238404b },
-	{ _MMIO(0x9888), 0x00384000 },
-	{ _MMIO(0x9888), 0x16380000 },
-	{ _MMIO(0x9888), 0x18381145 },
-	{ _MMIO(0x9888), 0x04380000 },
-	{ _MMIO(0x9888), 0x0039a000 },
-	{ _MMIO(0x9888), 0x06398000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0c39a000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x02392000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8aaaa0 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x238b02a0 },
-	{ _MMIO(0x9888), 0x258b5550 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x1f850a80 },
-	{ _MMIO(0x9888), 0x2185aaa0 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x03844000 },
-	{ _MMIO(0x9888), 0x17808137 },
-	{ _MMIO(0x9888), 0x1980c147 },
-	{ _MMIO(0x9888), 0x1b80c0e5 },
-	{ _MMIO(0x9888), 0x1d80c0e3 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x13804000 },
-	{ _MMIO(0x9888), 0x15800000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d801000 },
-	{ _MMIO(0x9888), 0x4f800111 },
-	{ _MMIO(0x9888), 0x43800062 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800062 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800062 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f801062 },
-	{ _MMIO(0x9888), 0x41801084 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic_2_slices_0x02[] = {
-	{ _MMIO(0x9888), 0x10dc00e0 },
-	{ _MMIO(0x9888), 0x10d800e0 },
-	{ _MMIO(0x9888), 0x10b800e0 },
-	{ _MMIO(0x9888), 0x3580001a },
-	{ _MMIO(0x9888), 0x3b800060 },
-	{ _MMIO(0x9888), 0x3d800005 },
-	{ _MMIO(0x9888), 0x06dc2100 },
-	{ _MMIO(0x9888), 0x0adc0041 },
-	{ _MMIO(0x9888), 0x0cdc6600 },
-	{ _MMIO(0x9888), 0x00dc6580 },
-	{ _MMIO(0x9888), 0x08dc8000 },
-	{ _MMIO(0x9888), 0x0edc8000 },
-	{ _MMIO(0x9888), 0x00d80042 },
-	{ _MMIO(0x9888), 0x08d82080 },
-	{ _MMIO(0x9888), 0x0cd8004c },
-	{ _MMIO(0x9888), 0x0ed82580 },
-	{ _MMIO(0x9888), 0x00db4000 },
-	{ _MMIO(0x9888), 0x18db1000 },
-	{ _MMIO(0x9888), 0x1adb0104 },
-	{ _MMIO(0x9888), 0x0c9fa800 },
-	{ _MMIO(0x9888), 0x0e9faa00 },
-	{ _MMIO(0x9888), 0x109f02aa },
-	{ _MMIO(0x9888), 0x08b80042 },
-	{ _MMIO(0x9888), 0x0ab82080 },
-	{ _MMIO(0x9888), 0x0eb8404c },
-	{ _MMIO(0x9888), 0x02b8404b },
-	{ _MMIO(0x9888), 0x00b84000 },
-	{ _MMIO(0x9888), 0x16b80000 },
-	{ _MMIO(0x9888), 0x18b81145 },
-	{ _MMIO(0x9888), 0x04b80000 },
-	{ _MMIO(0x9888), 0x00b9a000 },
-	{ _MMIO(0x9888), 0x06b98000 },
-	{ _MMIO(0x9888), 0x08b9a000 },
-	{ _MMIO(0x9888), 0x0ab9a000 },
-	{ _MMIO(0x9888), 0x0cb9a000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x02b92000 },
-	{ _MMIO(0x9888), 0x01888000 },
-	{ _MMIO(0x9888), 0x0d88f800 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x238b0540 },
-	{ _MMIO(0x9888), 0x258baaa0 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x018c4000 },
-	{ _MMIO(0x9888), 0x0f8c4000 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c5500 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x018da000 },
-	{ _MMIO(0x9888), 0x078d8000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x038d2000 },
-	{ _MMIO(0x9888), 0x1f850a80 },
-	{ _MMIO(0x9888), 0x2185aaa0 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x03844000 },
-	{ _MMIO(0x9888), 0x17808137 },
-	{ _MMIO(0x9888), 0x1980c147 },
-	{ _MMIO(0x9888), 0x1b80c0e5 },
-	{ _MMIO(0x9888), 0x1d80c0e3 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x13804000 },
-	{ _MMIO(0x9888), 0x15800000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d805000 },
-	{ _MMIO(0x9888), 0x4f800555 },
-	{ _MMIO(0x9888), 0x43800062 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800062 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800062 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800062 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 2);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 2);
-
-	if (INTEL_INFO(dev_priv)->sseu.slice_mask & 0x01) {
-		regs[n] = mux_config_compute_basic_0_slices_0x01;
-		lens[n] = ARRAY_SIZE(mux_config_compute_basic_0_slices_0x01);
-		n++;
-	}
-	if (INTEL_INFO(dev_priv)->sseu.slice_mask & 0x02) {
-		regs[n] = mux_config_compute_basic_2_slices_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_compute_basic_2_slices_0x02);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile[] = {
-	{ _MMIO(0x9888), 0x0a1e0000 },
-	{ _MMIO(0x9888), 0x0c1f000f },
-	{ _MMIO(0x9888), 0x10176800 },
-	{ _MMIO(0x9888), 0x1191001f },
-	{ _MMIO(0x9888), 0x0b880320 },
-	{ _MMIO(0x9888), 0x01890c40 },
-	{ _MMIO(0x9888), 0x118a1c00 },
-	{ _MMIO(0x9888), 0x118d7c00 },
-	{ _MMIO(0x9888), 0x118e0020 },
-	{ _MMIO(0x9888), 0x118f4c00 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x13900001 },
-	{ _MMIO(0x9888), 0x065c4000 },
-	{ _MMIO(0x9888), 0x0c3d8000 },
-	{ _MMIO(0x9888), 0x06584000 },
-	{ _MMIO(0x9888), 0x0c5b4000 },
-	{ _MMIO(0x9888), 0x081e0040 },
-	{ _MMIO(0x9888), 0x0e1e0000 },
-	{ _MMIO(0x9888), 0x021f5400 },
-	{ _MMIO(0x9888), 0x001f0000 },
-	{ _MMIO(0x9888), 0x101f0010 },
-	{ _MMIO(0x9888), 0x0e1f0080 },
-	{ _MMIO(0x9888), 0x0c384000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x0c13c000 },
-	{ _MMIO(0x9888), 0x06164000 },
-	{ _MMIO(0x9888), 0x06170012 },
-	{ _MMIO(0x9888), 0x00170000 },
-	{ _MMIO(0x9888), 0x01910005 },
-	{ _MMIO(0x9888), 0x07880002 },
-	{ _MMIO(0x9888), 0x01880c00 },
-	{ _MMIO(0x9888), 0x0f880000 },
-	{ _MMIO(0x9888), 0x0d880000 },
-	{ _MMIO(0x9888), 0x05880000 },
-	{ _MMIO(0x9888), 0x09890032 },
-	{ _MMIO(0x9888), 0x078a0800 },
-	{ _MMIO(0x9888), 0x0f8a0a00 },
-	{ _MMIO(0x9888), 0x198a4000 },
-	{ _MMIO(0x9888), 0x1b8a2000 },
-	{ _MMIO(0x9888), 0x1d8a0000 },
-	{ _MMIO(0x9888), 0x038a4000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x238b54c0 },
-	{ _MMIO(0x9888), 0x258baa55 },
-	{ _MMIO(0x9888), 0x278b0019 },
-	{ _MMIO(0x9888), 0x198c0100 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x0f8d0015 },
-	{ _MMIO(0x9888), 0x018d1000 },
-	{ _MMIO(0x9888), 0x098d8000 },
-	{ _MMIO(0x9888), 0x0b8df000 },
-	{ _MMIO(0x9888), 0x0d8d3000 },
-	{ _MMIO(0x9888), 0x038de000 },
-	{ _MMIO(0x9888), 0x058d3000 },
-	{ _MMIO(0x9888), 0x0d8e0004 },
-	{ _MMIO(0x9888), 0x058e000c },
-	{ _MMIO(0x9888), 0x098e0000 },
-	{ _MMIO(0x9888), 0x078e0000 },
-	{ _MMIO(0x9888), 0x038e0000 },
-	{ _MMIO(0x9888), 0x0b8f0020 },
-	{ _MMIO(0x9888), 0x198f0c00 },
-	{ _MMIO(0x9888), 0x078f8000 },
-	{ _MMIO(0x9888), 0x098f4000 },
-	{ _MMIO(0x9888), 0x0b900980 },
-	{ _MMIO(0x9888), 0x03900d80 },
-	{ _MMIO(0x9888), 0x01900000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d801111 },
-	{ _MMIO(0x9888), 0x3d800800 },
-	{ _MMIO(0x9888), 0x4f801011 },
-	{ _MMIO(0x9888), 0x43800443 },
-	{ _MMIO(0x9888), 0x51801111 },
-	{ _MMIO(0x9888), 0x45800422 },
-	{ _MMIO(0x9888), 0x53801111 },
-	{ _MMIO(0x9888), 0x47800c60 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800422 },
-	{ _MMIO(0x9888), 0x41800021 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_pipe_profile;
-	lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads[] = {
-	{ _MMIO(0x9888), 0x198b0343 },
-	{ _MMIO(0x9888), 0x13845800 },
-	{ _MMIO(0x9888), 0x15840018 },
-	{ _MMIO(0x9888), 0x3580001a },
-	{ _MMIO(0x9888), 0x038b6300 },
-	{ _MMIO(0x9888), 0x058b6b62 },
-	{ _MMIO(0x9888), 0x078b006a },
-	{ _MMIO(0x9888), 0x118b0000 },
-	{ _MMIO(0x9888), 0x238b0000 },
-	{ _MMIO(0x9888), 0x258b0000 },
-	{ _MMIO(0x9888), 0x1f85a080 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385000a },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x01840018 },
-	{ _MMIO(0x9888), 0x07844c80 },
-	{ _MMIO(0x9888), 0x09840d9a },
-	{ _MMIO(0x9888), 0x0b840e9c },
-	{ _MMIO(0x9888), 0x0d840f9e },
-	{ _MMIO(0x9888), 0x0f840010 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x03848000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x2f8000e5 },
-	{ _MMIO(0x9888), 0x138080e3 },
-	{ _MMIO(0x9888), 0x1580c0e1 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x11804000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f804000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3d800800 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800842 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800842 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47801042 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800084 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_reads;
-	lens[n] = ARRAY_SIZE(mux_config_memory_reads);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes[] = {
-	{ _MMIO(0x9888), 0x198b0343 },
-	{ _MMIO(0x9888), 0x13845400 },
-	{ _MMIO(0x9888), 0x3580001a },
-	{ _MMIO(0x9888), 0x3d800805 },
-	{ _MMIO(0x9888), 0x038b6300 },
-	{ _MMIO(0x9888), 0x058b6b62 },
-	{ _MMIO(0x9888), 0x078b006a },
-	{ _MMIO(0x9888), 0x118b0000 },
-	{ _MMIO(0x9888), 0x238b0000 },
-	{ _MMIO(0x9888), 0x258b0000 },
-	{ _MMIO(0x9888), 0x1f85a080 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x23850002 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x01840010 },
-	{ _MMIO(0x9888), 0x07844880 },
-	{ _MMIO(0x9888), 0x09840992 },
-	{ _MMIO(0x9888), 0x0b840a94 },
-	{ _MMIO(0x9888), 0x0d840b96 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x03848000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x2d800147 },
-	{ _MMIO(0x9888), 0x2f8000e5 },
-	{ _MMIO(0x9888), 0x138080e3 },
-	{ _MMIO(0x9888), 0x1580c0e1 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x11804000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f800000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800842 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800842 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47801082 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800084 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_writes;
-	lens[n] = ARRAY_SIZE(mux_config_memory_writes);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fc2a },
-	{ _MMIO(0x2774), 0x0000bf00 },
-	{ _MMIO(0x2778), 0x0007fc6a },
-	{ _MMIO(0x277c), 0x0000bf00 },
-	{ _MMIO(0x2780), 0x0007fc92 },
-	{ _MMIO(0x2784), 0x0000bf00 },
-	{ _MMIO(0x2788), 0x0007fca2 },
-	{ _MMIO(0x278c), 0x0000bf00 },
-	{ _MMIO(0x2790), 0x0007fc32 },
-	{ _MMIO(0x2794), 0x0000bf00 },
-	{ _MMIO(0x2798), 0x0007fc9a },
-	{ _MMIO(0x279c), 0x0000bf00 },
-	{ _MMIO(0x27a0), 0x0007fe6a },
-	{ _MMIO(0x27a4), 0x0000bf00 },
-	{ _MMIO(0x27a8), 0x0007fe7a },
-	{ _MMIO(0x27ac), 0x0000bf00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended_0_subslices_0x01[] = {
-	{ _MMIO(0x9888), 0x143d0160 },
-	{ _MMIO(0x9888), 0x163d2800 },
-	{ _MMIO(0x9888), 0x183d0120 },
-	{ _MMIO(0x9888), 0x105800e0 },
-	{ _MMIO(0x9888), 0x005cc000 },
-	{ _MMIO(0x9888), 0x065c8000 },
-	{ _MMIO(0x9888), 0x085cc000 },
-	{ _MMIO(0x9888), 0x0a5cc000 },
-	{ _MMIO(0x9888), 0x0c5cc000 },
-	{ _MMIO(0x9888), 0x0e5cc000 },
-	{ _MMIO(0x9888), 0x025cc000 },
-	{ _MMIO(0x9888), 0x045cc000 },
-	{ _MMIO(0x9888), 0x003d0011 },
-	{ _MMIO(0x9888), 0x063d0900 },
-	{ _MMIO(0x9888), 0x083d0a13 },
-	{ _MMIO(0x9888), 0x0a3d0b15 },
-	{ _MMIO(0x9888), 0x0c3d2317 },
-	{ _MMIO(0x9888), 0x043d21b7 },
-	{ _MMIO(0x9888), 0x103d0000 },
-	{ _MMIO(0x9888), 0x0e3d0000 },
-	{ _MMIO(0x9888), 0x1a3d0000 },
-	{ _MMIO(0x9888), 0x0e5825c1 },
-	{ _MMIO(0x9888), 0x00586100 },
-	{ _MMIO(0x9888), 0x0258204c },
-	{ _MMIO(0x9888), 0x06588000 },
-	{ _MMIO(0x9888), 0x0858c000 },
-	{ _MMIO(0x9888), 0x0a58c000 },
-	{ _MMIO(0x9888), 0x0c58c000 },
-	{ _MMIO(0x9888), 0x0458c000 },
-	{ _MMIO(0x9888), 0x005b4000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x185b5400 },
-	{ _MMIO(0x9888), 0x1a5b0155 },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x045b4000 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5b4000 },
-	{ _MMIO(0x9888), 0x0c1fa800 },
-	{ _MMIO(0x9888), 0x0e1faa2a },
-	{ _MMIO(0x9888), 0x101f02aa },
-	{ _MMIO(0x9888), 0x00384000 },
-	{ _MMIO(0x9888), 0x0e384000 },
-	{ _MMIO(0x9888), 0x16384000 },
-	{ _MMIO(0x9888), 0x18381555 },
-	{ _MMIO(0x9888), 0x02384000 },
-	{ _MMIO(0x9888), 0x04384000 },
-	{ _MMIO(0x9888), 0x06384000 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0039a000 },
-	{ _MMIO(0x9888), 0x06398000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0c39a000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8aaaa0 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x238b2aa0 },
-	{ _MMIO(0x9888), 0x258b5551 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa2 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800420 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800421 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended_2_subslices_0x02[] = {
-	{ _MMIO(0x9888), 0x105c00e0 },
-	{ _MMIO(0x9888), 0x145b0160 },
-	{ _MMIO(0x9888), 0x165b2800 },
-	{ _MMIO(0x9888), 0x185b0120 },
-	{ _MMIO(0x9888), 0x0e5c25c1 },
-	{ _MMIO(0x9888), 0x005c6100 },
-	{ _MMIO(0x9888), 0x025c204c },
-	{ _MMIO(0x9888), 0x065c8000 },
-	{ _MMIO(0x9888), 0x085cc000 },
-	{ _MMIO(0x9888), 0x0a5cc000 },
-	{ _MMIO(0x9888), 0x0c5cc000 },
-	{ _MMIO(0x9888), 0x045cc000 },
-	{ _MMIO(0x9888), 0x005b0011 },
-	{ _MMIO(0x9888), 0x065b0900 },
-	{ _MMIO(0x9888), 0x085b0a13 },
-	{ _MMIO(0x9888), 0x0a5b0b15 },
-	{ _MMIO(0x9888), 0x0c5b2317 },
-	{ _MMIO(0x9888), 0x045b21b7 },
-	{ _MMIO(0x9888), 0x105b0000 },
-	{ _MMIO(0x9888), 0x0e5b0000 },
-	{ _MMIO(0x9888), 0x1a5b0000 },
-	{ _MMIO(0x9888), 0x0c1fa800 },
-	{ _MMIO(0x9888), 0x0e1faa2a },
-	{ _MMIO(0x9888), 0x101f02aa },
-	{ _MMIO(0x9888), 0x00384000 },
-	{ _MMIO(0x9888), 0x0e384000 },
-	{ _MMIO(0x9888), 0x16384000 },
-	{ _MMIO(0x9888), 0x18381555 },
-	{ _MMIO(0x9888), 0x02384000 },
-	{ _MMIO(0x9888), 0x04384000 },
-	{ _MMIO(0x9888), 0x06384000 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0039a000 },
-	{ _MMIO(0x9888), 0x06398000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0c39a000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8aaaa0 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x238b2aa0 },
-	{ _MMIO(0x9888), 0x258b5551 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa2 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800420 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800421 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended_4_subslices_0x04[] = {
-	{ _MMIO(0x9888), 0x103800e0 },
-	{ _MMIO(0x9888), 0x143a0160 },
-	{ _MMIO(0x9888), 0x163a2800 },
-	{ _MMIO(0x9888), 0x183a0120 },
-	{ _MMIO(0x9888), 0x0c1fa800 },
-	{ _MMIO(0x9888), 0x0e1faa2a },
-	{ _MMIO(0x9888), 0x101f02aa },
-	{ _MMIO(0x9888), 0x0e38a5c1 },
-	{ _MMIO(0x9888), 0x0038a100 },
-	{ _MMIO(0x9888), 0x0238204c },
-	{ _MMIO(0x9888), 0x16388000 },
-	{ _MMIO(0x9888), 0x183802aa },
-	{ _MMIO(0x9888), 0x04380000 },
-	{ _MMIO(0x9888), 0x06380000 },
-	{ _MMIO(0x9888), 0x08388000 },
-	{ _MMIO(0x9888), 0x0a388000 },
-	{ _MMIO(0x9888), 0x0039a000 },
-	{ _MMIO(0x9888), 0x06398000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0c39a000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x003a0011 },
-	{ _MMIO(0x9888), 0x063a0900 },
-	{ _MMIO(0x9888), 0x083a0a13 },
-	{ _MMIO(0x9888), 0x0a3a0b15 },
-	{ _MMIO(0x9888), 0x0c3a2317 },
-	{ _MMIO(0x9888), 0x043a21b7 },
-	{ _MMIO(0x9888), 0x103a0000 },
-	{ _MMIO(0x9888), 0x0e3a0000 },
-	{ _MMIO(0x9888), 0x1a3a0000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8aaaa0 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x238b2aa0 },
-	{ _MMIO(0x9888), 0x258b5551 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa2 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800420 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800421 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended_1_subslices_0x08[] = {
-	{ _MMIO(0x9888), 0x14bd0160 },
-	{ _MMIO(0x9888), 0x16bd2800 },
-	{ _MMIO(0x9888), 0x18bd0120 },
-	{ _MMIO(0x9888), 0x10d800e0 },
-	{ _MMIO(0x9888), 0x00dcc000 },
-	{ _MMIO(0x9888), 0x06dc8000 },
-	{ _MMIO(0x9888), 0x08dcc000 },
-	{ _MMIO(0x9888), 0x0adcc000 },
-	{ _MMIO(0x9888), 0x0cdcc000 },
-	{ _MMIO(0x9888), 0x0edcc000 },
-	{ _MMIO(0x9888), 0x02dcc000 },
-	{ _MMIO(0x9888), 0x04dcc000 },
-	{ _MMIO(0x9888), 0x00bd0011 },
-	{ _MMIO(0x9888), 0x06bd0900 },
-	{ _MMIO(0x9888), 0x08bd0a13 },
-	{ _MMIO(0x9888), 0x0abd0b15 },
-	{ _MMIO(0x9888), 0x0cbd2317 },
-	{ _MMIO(0x9888), 0x04bd21b7 },
-	{ _MMIO(0x9888), 0x10bd0000 },
-	{ _MMIO(0x9888), 0x0ebd0000 },
-	{ _MMIO(0x9888), 0x1abd0000 },
-	{ _MMIO(0x9888), 0x0ed825c1 },
-	{ _MMIO(0x9888), 0x00d86100 },
-	{ _MMIO(0x9888), 0x02d8204c },
-	{ _MMIO(0x9888), 0x06d88000 },
-	{ _MMIO(0x9888), 0x08d8c000 },
-	{ _MMIO(0x9888), 0x0ad8c000 },
-	{ _MMIO(0x9888), 0x0cd8c000 },
-	{ _MMIO(0x9888), 0x04d8c000 },
-	{ _MMIO(0x9888), 0x00db4000 },
-	{ _MMIO(0x9888), 0x0edb4000 },
-	{ _MMIO(0x9888), 0x18db5400 },
-	{ _MMIO(0x9888), 0x1adb0155 },
-	{ _MMIO(0x9888), 0x02db4000 },
-	{ _MMIO(0x9888), 0x04db4000 },
-	{ _MMIO(0x9888), 0x06db4000 },
-	{ _MMIO(0x9888), 0x08db4000 },
-	{ _MMIO(0x9888), 0x0adb4000 },
-	{ _MMIO(0x9888), 0x0c9fa800 },
-	{ _MMIO(0x9888), 0x0e9faa2a },
-	{ _MMIO(0x9888), 0x109f02aa },
-	{ _MMIO(0x9888), 0x00b84000 },
-	{ _MMIO(0x9888), 0x0eb84000 },
-	{ _MMIO(0x9888), 0x16b84000 },
-	{ _MMIO(0x9888), 0x18b81555 },
-	{ _MMIO(0x9888), 0x02b84000 },
-	{ _MMIO(0x9888), 0x04b84000 },
-	{ _MMIO(0x9888), 0x06b84000 },
-	{ _MMIO(0x9888), 0x08b84000 },
-	{ _MMIO(0x9888), 0x0ab84000 },
-	{ _MMIO(0x9888), 0x00b9a000 },
-	{ _MMIO(0x9888), 0x06b98000 },
-	{ _MMIO(0x9888), 0x08b9a000 },
-	{ _MMIO(0x9888), 0x0ab9a000 },
-	{ _MMIO(0x9888), 0x0cb9a000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x02b9a000 },
-	{ _MMIO(0x9888), 0x04b9a000 },
-	{ _MMIO(0x9888), 0x01888000 },
-	{ _MMIO(0x9888), 0x0d88f800 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x238b5540 },
-	{ _MMIO(0x9888), 0x258baaa2 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x018c4000 },
-	{ _MMIO(0x9888), 0x0f8c4000 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c5500 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x018da000 },
-	{ _MMIO(0x9888), 0x078d8000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa2 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800420 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800421 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended_3_subslices_0x10[] = {
-	{ _MMIO(0x9888), 0x10dc00e0 },
-	{ _MMIO(0x9888), 0x14db0160 },
-	{ _MMIO(0x9888), 0x16db2800 },
-	{ _MMIO(0x9888), 0x18db0120 },
-	{ _MMIO(0x9888), 0x0edc25c1 },
-	{ _MMIO(0x9888), 0x00dc6100 },
-	{ _MMIO(0x9888), 0x02dc204c },
-	{ _MMIO(0x9888), 0x06dc8000 },
-	{ _MMIO(0x9888), 0x08dcc000 },
-	{ _MMIO(0x9888), 0x0adcc000 },
-	{ _MMIO(0x9888), 0x0cdcc000 },
-	{ _MMIO(0x9888), 0x04dcc000 },
-	{ _MMIO(0x9888), 0x00db0011 },
-	{ _MMIO(0x9888), 0x06db0900 },
-	{ _MMIO(0x9888), 0x08db0a13 },
-	{ _MMIO(0x9888), 0x0adb0b15 },
-	{ _MMIO(0x9888), 0x0cdb2317 },
-	{ _MMIO(0x9888), 0x04db21b7 },
-	{ _MMIO(0x9888), 0x10db0000 },
-	{ _MMIO(0x9888), 0x0edb0000 },
-	{ _MMIO(0x9888), 0x1adb0000 },
-	{ _MMIO(0x9888), 0x0c9fa800 },
-	{ _MMIO(0x9888), 0x0e9faa2a },
-	{ _MMIO(0x9888), 0x109f02aa },
-	{ _MMIO(0x9888), 0x00b84000 },
-	{ _MMIO(0x9888), 0x0eb84000 },
-	{ _MMIO(0x9888), 0x16b84000 },
-	{ _MMIO(0x9888), 0x18b81555 },
-	{ _MMIO(0x9888), 0x02b84000 },
-	{ _MMIO(0x9888), 0x04b84000 },
-	{ _MMIO(0x9888), 0x06b84000 },
-	{ _MMIO(0x9888), 0x08b84000 },
-	{ _MMIO(0x9888), 0x0ab84000 },
-	{ _MMIO(0x9888), 0x00b9a000 },
-	{ _MMIO(0x9888), 0x06b98000 },
-	{ _MMIO(0x9888), 0x08b9a000 },
-	{ _MMIO(0x9888), 0x0ab9a000 },
-	{ _MMIO(0x9888), 0x0cb9a000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x02b9a000 },
-	{ _MMIO(0x9888), 0x04b9a000 },
-	{ _MMIO(0x9888), 0x01888000 },
-	{ _MMIO(0x9888), 0x0d88f800 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x238b5540 },
-	{ _MMIO(0x9888), 0x258baaa2 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x018c4000 },
-	{ _MMIO(0x9888), 0x0f8c4000 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c5500 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x018da000 },
-	{ _MMIO(0x9888), 0x078d8000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa2 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800420 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800421 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended_5_subslices_0x20[] = {
-	{ _MMIO(0x9888), 0x10b800e0 },
-	{ _MMIO(0x9888), 0x14ba0160 },
-	{ _MMIO(0x9888), 0x16ba2800 },
-	{ _MMIO(0x9888), 0x18ba0120 },
-	{ _MMIO(0x9888), 0x0c9fa800 },
-	{ _MMIO(0x9888), 0x0e9faa2a },
-	{ _MMIO(0x9888), 0x109f02aa },
-	{ _MMIO(0x9888), 0x0eb8a5c1 },
-	{ _MMIO(0x9888), 0x00b8a100 },
-	{ _MMIO(0x9888), 0x02b8204c },
-	{ _MMIO(0x9888), 0x16b88000 },
-	{ _MMIO(0x9888), 0x18b802aa },
-	{ _MMIO(0x9888), 0x04b80000 },
-	{ _MMIO(0x9888), 0x06b80000 },
-	{ _MMIO(0x9888), 0x08b88000 },
-	{ _MMIO(0x9888), 0x0ab88000 },
-	{ _MMIO(0x9888), 0x00b9a000 },
-	{ _MMIO(0x9888), 0x06b98000 },
-	{ _MMIO(0x9888), 0x08b9a000 },
-	{ _MMIO(0x9888), 0x0ab9a000 },
-	{ _MMIO(0x9888), 0x0cb9a000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x02b9a000 },
-	{ _MMIO(0x9888), 0x04b9a000 },
-	{ _MMIO(0x9888), 0x00ba0011 },
-	{ _MMIO(0x9888), 0x06ba0900 },
-	{ _MMIO(0x9888), 0x08ba0a13 },
-	{ _MMIO(0x9888), 0x0aba0b15 },
-	{ _MMIO(0x9888), 0x0cba2317 },
-	{ _MMIO(0x9888), 0x04ba21b7 },
-	{ _MMIO(0x9888), 0x10ba0000 },
-	{ _MMIO(0x9888), 0x0eba0000 },
-	{ _MMIO(0x9888), 0x1aba0000 },
-	{ _MMIO(0x9888), 0x01888000 },
-	{ _MMIO(0x9888), 0x0d88f800 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x238b5540 },
-	{ _MMIO(0x9888), 0x258baaa2 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x018c4000 },
-	{ _MMIO(0x9888), 0x0f8c4000 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c5500 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x018da000 },
-	{ _MMIO(0x9888), 0x078d8000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa2 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800420 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800421 },
-	{ _MMIO(0x9888), 0x41800000 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 6);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 6);
-
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x01) {
-		regs[n] = mux_config_compute_extended_0_subslices_0x01;
-		lens[n] = ARRAY_SIZE(mux_config_compute_extended_0_subslices_0x01);
-		n++;
-	}
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x08) {
-		regs[n] = mux_config_compute_extended_1_subslices_0x08;
-		lens[n] = ARRAY_SIZE(mux_config_compute_extended_1_subslices_0x08);
-		n++;
-	}
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x02) {
-		regs[n] = mux_config_compute_extended_2_subslices_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_compute_extended_2_subslices_0x02);
-		n++;
-	}
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x10) {
-		regs[n] = mux_config_compute_extended_3_subslices_0x10;
-		lens[n] = ARRAY_SIZE(mux_config_compute_extended_3_subslices_0x10);
-		n++;
-	}
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x04) {
-		regs[n] = mux_config_compute_extended_4_subslices_0x04;
-		lens[n] = ARRAY_SIZE(mux_config_compute_extended_4_subslices_0x04);
-		n++;
-	}
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x20) {
-		regs[n] = mux_config_compute_extended_5_subslices_0x20;
-		lens[n] = ARRAY_SIZE(mux_config_compute_extended_5_subslices_0x20);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_l3_cache[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fffa },
-	{ _MMIO(0x2774), 0x0000fefe },
-	{ _MMIO(0x2778), 0x0007fffa },
-	{ _MMIO(0x277c), 0x0000fefd },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000fbef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000fbdf },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_l3_cache[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00101100 },
-	{ _MMIO(0xe45c), 0x00201200 },
-	{ _MMIO(0xe55c), 0x00301300 },
-	{ _MMIO(0xe65c), 0x00401400 },
-};
-
-static const struct i915_oa_reg mux_config_compute_l3_cache[] = {
-	{ _MMIO(0x9888), 0x143f00b3 },
-	{ _MMIO(0x9888), 0x14bf00b3 },
-	{ _MMIO(0x9888), 0x138303c0 },
-	{ _MMIO(0x9888), 0x3b800060 },
-	{ _MMIO(0x9888), 0x3d800805 },
-	{ _MMIO(0x9888), 0x003f0029 },
-	{ _MMIO(0x9888), 0x063f1400 },
-	{ _MMIO(0x9888), 0x083f1225 },
-	{ _MMIO(0x9888), 0x0e3f1327 },
-	{ _MMIO(0x9888), 0x103f0000 },
-	{ _MMIO(0x9888), 0x005a4000 },
-	{ _MMIO(0x9888), 0x065a8000 },
-	{ _MMIO(0x9888), 0x085ac000 },
-	{ _MMIO(0x9888), 0x0e5ac000 },
-	{ _MMIO(0x9888), 0x001d4000 },
-	{ _MMIO(0x9888), 0x061d8000 },
-	{ _MMIO(0x9888), 0x081dc000 },
-	{ _MMIO(0x9888), 0x0e1dc000 },
-	{ _MMIO(0x9888), 0x0c1f0800 },
-	{ _MMIO(0x9888), 0x0e1f2a00 },
-	{ _MMIO(0x9888), 0x101f0280 },
-	{ _MMIO(0x9888), 0x00391000 },
-	{ _MMIO(0x9888), 0x06394000 },
-	{ _MMIO(0x9888), 0x08395000 },
-	{ _MMIO(0x9888), 0x0e395000 },
-	{ _MMIO(0x9888), 0x0abf1429 },
-	{ _MMIO(0x9888), 0x0cbf1225 },
-	{ _MMIO(0x9888), 0x00bf1380 },
-	{ _MMIO(0x9888), 0x02bf0026 },
-	{ _MMIO(0x9888), 0x10bf0000 },
-	{ _MMIO(0x9888), 0x0adac000 },
-	{ _MMIO(0x9888), 0x0cdac000 },
-	{ _MMIO(0x9888), 0x00da8000 },
-	{ _MMIO(0x9888), 0x02da4000 },
-	{ _MMIO(0x9888), 0x0a9dc000 },
-	{ _MMIO(0x9888), 0x0c9dc000 },
-	{ _MMIO(0x9888), 0x009d8000 },
-	{ _MMIO(0x9888), 0x029d4000 },
-	{ _MMIO(0x9888), 0x0e9f8000 },
-	{ _MMIO(0x9888), 0x109f002a },
-	{ _MMIO(0x9888), 0x0c9fa000 },
-	{ _MMIO(0x9888), 0x0ab95000 },
-	{ _MMIO(0x9888), 0x0cb95000 },
-	{ _MMIO(0x9888), 0x00b94000 },
-	{ _MMIO(0x9888), 0x02b91000 },
-	{ _MMIO(0x9888), 0x0d88c000 },
-	{ _MMIO(0x9888), 0x0f880003 },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8a8020 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x238b0520 },
-	{ _MMIO(0x9888), 0x258ba950 },
-	{ _MMIO(0x9888), 0x278b0016 },
-	{ _MMIO(0x9888), 0x198c5400 },
-	{ _MMIO(0x9888), 0x1b8c0001 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x018d8000 },
-	{ _MMIO(0x9888), 0x038d2000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaa0 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x03835180 },
-	{ _MMIO(0x9888), 0x05834022 },
-	{ _MMIO(0x9888), 0x11830000 },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x07830000 },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x05844000 },
-	{ _MMIO(0x9888), 0x1b80c137 },
-	{ _MMIO(0x9888), 0x1d80c147 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x15804000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d801000 },
-	{ _MMIO(0x9888), 0x4f800111 },
-	{ _MMIO(0x9888), 0x43800842 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800840 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800800 },
-	{ _MMIO(0x9888), 0x418014a2 },
-};
-
-static int
-get_compute_l3_cache_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_l3_cache;
-	lens[n] = ARRAY_SIZE(mux_config_compute_l3_cache);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_data_port_reads_coalescing[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0xba98ba98 },
-	{ _MMIO(0x2748), 0xba98ba98 },
-	{ _MMIO(0x2744), 0x00003377 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fff2 },
-	{ _MMIO(0x2774), 0x00007ff0 },
-	{ _MMIO(0x2778), 0x0007ffe2 },
-	{ _MMIO(0x277c), 0x00007ff0 },
-	{ _MMIO(0x2780), 0x0007ffc2 },
-	{ _MMIO(0x2784), 0x00007ff0 },
-	{ _MMIO(0x2788), 0x0007ff82 },
-	{ _MMIO(0x278c), 0x00007ff0 },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000bfef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000bfdf },
-	{ _MMIO(0x27a0), 0x0007fffa },
-	{ _MMIO(0x27a4), 0x0000bfbf },
-	{ _MMIO(0x27a8), 0x0007fffa },
-	{ _MMIO(0x27ac), 0x0000bf7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_data_port_reads_coalescing[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_data_port_reads_coalescing_0_subslices_0x01[] = {
-	{ _MMIO(0x9888), 0x103d0005 },
-	{ _MMIO(0x9888), 0x163d240b },
-	{ _MMIO(0x9888), 0x1058022f },
-	{ _MMIO(0x9888), 0x185b5520 },
-	{ _MMIO(0x9888), 0x198b0003 },
-	{ _MMIO(0x9888), 0x005cc000 },
-	{ _MMIO(0x9888), 0x065cc000 },
-	{ _MMIO(0x9888), 0x085cc000 },
-	{ _MMIO(0x9888), 0x0a5cc000 },
-	{ _MMIO(0x9888), 0x0c5cc000 },
-	{ _MMIO(0x9888), 0x0e5cc000 },
-	{ _MMIO(0x9888), 0x025c4000 },
-	{ _MMIO(0x9888), 0x045c8000 },
-	{ _MMIO(0x9888), 0x003d0000 },
-	{ _MMIO(0x9888), 0x063d00b0 },
-	{ _MMIO(0x9888), 0x083d0182 },
-	{ _MMIO(0x9888), 0x0a3d10a0 },
-	{ _MMIO(0x9888), 0x0c3d11a2 },
-	{ _MMIO(0x9888), 0x0e3d0000 },
-	{ _MMIO(0x9888), 0x183d0000 },
-	{ _MMIO(0x9888), 0x1a3d0000 },
-	{ _MMIO(0x9888), 0x0e582242 },
-	{ _MMIO(0x9888), 0x00586700 },
-	{ _MMIO(0x9888), 0x0258004f },
-	{ _MMIO(0x9888), 0x0658c000 },
-	{ _MMIO(0x9888), 0x0858c000 },
-	{ _MMIO(0x9888), 0x0a58c000 },
-	{ _MMIO(0x9888), 0x0c58c000 },
-	{ _MMIO(0x9888), 0x045b6300 },
-	{ _MMIO(0x9888), 0x105b0000 },
-	{ _MMIO(0x9888), 0x005b4000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x1a5b0155 },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x0a5b0000 },
-	{ _MMIO(0x9888), 0x0c5b4000 },
-	{ _MMIO(0x9888), 0x0c1fa800 },
-	{ _MMIO(0x9888), 0x0e1faaa0 },
-	{ _MMIO(0x9888), 0x101f02aa },
-	{ _MMIO(0x9888), 0x00384000 },
-	{ _MMIO(0x9888), 0x0e384000 },
-	{ _MMIO(0x9888), 0x16384000 },
-	{ _MMIO(0x9888), 0x18381555 },
-	{ _MMIO(0x9888), 0x02384000 },
-	{ _MMIO(0x9888), 0x04384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0c384000 },
-	{ _MMIO(0x9888), 0x0039a000 },
-	{ _MMIO(0x9888), 0x0639a000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0c39a000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x02392000 },
-	{ _MMIO(0x9888), 0x04398000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8aaaa0 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x038b6300 },
-	{ _MMIO(0x9888), 0x058b0062 },
-	{ _MMIO(0x9888), 0x118b0000 },
-	{ _MMIO(0x9888), 0x238b02a0 },
-	{ _MMIO(0x9888), 0x258b5555 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d801000 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f800001 },
-	{ _MMIO(0x9888), 0x43800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800420 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800421 },
-	{ _MMIO(0x9888), 0x41800041 },
-};
-
-static int
-get_data_port_reads_coalescing_mux_config(struct drm_i915_private *dev_priv,
-					  const struct i915_oa_reg **regs,
-					  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x01) {
-		regs[n] = mux_config_data_port_reads_coalescing_0_subslices_0x01;
-		lens[n] = ARRAY_SIZE(mux_config_data_port_reads_coalescing_0_subslices_0x01);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_data_port_writes_coalescing[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0xba98ba98 },
-	{ _MMIO(0x2748), 0xba98ba98 },
-	{ _MMIO(0x2744), 0x00003377 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ff72 },
-	{ _MMIO(0x2774), 0x0000bfd0 },
-	{ _MMIO(0x2778), 0x0007ff62 },
-	{ _MMIO(0x277c), 0x0000bfd0 },
-	{ _MMIO(0x2780), 0x0007ff42 },
-	{ _MMIO(0x2784), 0x0000bfd0 },
-	{ _MMIO(0x2788), 0x0007ff02 },
-	{ _MMIO(0x278c), 0x0000bfd0 },
-	{ _MMIO(0x2790), 0x0005fff2 },
-	{ _MMIO(0x2794), 0x0000bfd0 },
-	{ _MMIO(0x2798), 0x0005ffe2 },
-	{ _MMIO(0x279c), 0x0000bfd0 },
-	{ _MMIO(0x27a0), 0x0005ffc2 },
-	{ _MMIO(0x27a4), 0x0000bfd0 },
-	{ _MMIO(0x27a8), 0x0005ff82 },
-	{ _MMIO(0x27ac), 0x0000bfd0 },
-};
-
-static const struct i915_oa_reg flex_eu_config_data_port_writes_coalescing[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_data_port_writes_coalescing_0_subslices_0x01[] = {
-	{ _MMIO(0x9888), 0x103d0005 },
-	{ _MMIO(0x9888), 0x143d0120 },
-	{ _MMIO(0x9888), 0x163d2400 },
-	{ _MMIO(0x9888), 0x1058022f },
-	{ _MMIO(0x9888), 0x105b0000 },
-	{ _MMIO(0x9888), 0x198b0003 },
-	{ _MMIO(0x9888), 0x005cc000 },
-	{ _MMIO(0x9888), 0x065cc000 },
-	{ _MMIO(0x9888), 0x085cc000 },
-	{ _MMIO(0x9888), 0x0a5cc000 },
-	{ _MMIO(0x9888), 0x0e5cc000 },
-	{ _MMIO(0x9888), 0x025c4000 },
-	{ _MMIO(0x9888), 0x045c8000 },
-	{ _MMIO(0x9888), 0x003d0000 },
-	{ _MMIO(0x9888), 0x063d0094 },
-	{ _MMIO(0x9888), 0x083d0182 },
-	{ _MMIO(0x9888), 0x0a3d1814 },
-	{ _MMIO(0x9888), 0x0e3d0000 },
-	{ _MMIO(0x9888), 0x183d0000 },
-	{ _MMIO(0x9888), 0x1a3d0000 },
-	{ _MMIO(0x9888), 0x0c3d0000 },
-	{ _MMIO(0x9888), 0x0e582242 },
-	{ _MMIO(0x9888), 0x00586700 },
-	{ _MMIO(0x9888), 0x0258004f },
-	{ _MMIO(0x9888), 0x0658c000 },
-	{ _MMIO(0x9888), 0x0858c000 },
-	{ _MMIO(0x9888), 0x0a58c000 },
-	{ _MMIO(0x9888), 0x045b6a80 },
-	{ _MMIO(0x9888), 0x005b4000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x185b5400 },
-	{ _MMIO(0x9888), 0x1a5b0141 },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x0a5b0000 },
-	{ _MMIO(0x9888), 0x0c5b4000 },
-	{ _MMIO(0x9888), 0x0c1fa800 },
-	{ _MMIO(0x9888), 0x0e1faaa0 },
-	{ _MMIO(0x9888), 0x101f0282 },
-	{ _MMIO(0x9888), 0x00384000 },
-	{ _MMIO(0x9888), 0x0e384000 },
-	{ _MMIO(0x9888), 0x16384000 },
-	{ _MMIO(0x9888), 0x18381415 },
-	{ _MMIO(0x9888), 0x02384000 },
-	{ _MMIO(0x9888), 0x04384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0c384000 },
-	{ _MMIO(0x9888), 0x0039a000 },
-	{ _MMIO(0x9888), 0x0639a000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x02392000 },
-	{ _MMIO(0x9888), 0x04398000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8a82a0 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x038b6300 },
-	{ _MMIO(0x9888), 0x058b0062 },
-	{ _MMIO(0x9888), 0x118b0000 },
-	{ _MMIO(0x9888), 0x238b02a0 },
-	{ _MMIO(0x9888), 0x258b1555 },
-	{ _MMIO(0x9888), 0x278b0014 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x21852aaa },
-	{ _MMIO(0x9888), 0x23850028 },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830141 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0xd24), 0x00000000 },
-	{ _MMIO(0x9888), 0x4d801000 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f800001 },
-	{ _MMIO(0x9888), 0x43800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800420 },
-	{ _MMIO(0x9888), 0x3f800421 },
-	{ _MMIO(0x9888), 0x41800041 },
-};
-
-static int
-get_data_port_writes_coalescing_mux_config(struct drm_i915_private *dev_priv,
-					   const struct i915_oa_reg **regs,
-					   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x01) {
-		regs[n] = mux_config_data_port_writes_coalescing_0_subslices_0x01;
-		lens[n] = ARRAY_SIZE(mux_config_data_port_writes_coalescing_0_subslices_0x01);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fff7 },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x105c0232 },
-	{ _MMIO(0x9888), 0x10580232 },
-	{ _MMIO(0x9888), 0x10380232 },
-	{ _MMIO(0x9888), 0x10dc0232 },
-	{ _MMIO(0x9888), 0x10d80232 },
-	{ _MMIO(0x9888), 0x10b80232 },
-	{ _MMIO(0x9888), 0x118e4400 },
-	{ _MMIO(0x9888), 0x025c6080 },
-	{ _MMIO(0x9888), 0x045c004b },
-	{ _MMIO(0x9888), 0x005c8000 },
-	{ _MMIO(0x9888), 0x00582080 },
-	{ _MMIO(0x9888), 0x0258004b },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x045b4000 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f00aa },
-	{ _MMIO(0x9888), 0x04386080 },
-	{ _MMIO(0x9888), 0x0638404b },
-	{ _MMIO(0x9888), 0x02384000 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a380000 },
-	{ _MMIO(0x9888), 0x0c380000 },
-	{ _MMIO(0x9888), 0x00398000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x0cdc25c1 },
-	{ _MMIO(0x9888), 0x0adcc000 },
-	{ _MMIO(0x9888), 0x0ad825c1 },
-	{ _MMIO(0x9888), 0x18db4000 },
-	{ _MMIO(0x9888), 0x1adb0001 },
-	{ _MMIO(0x9888), 0x0e9f8000 },
-	{ _MMIO(0x9888), 0x109f02aa },
-	{ _MMIO(0x9888), 0x0eb825c1 },
-	{ _MMIO(0x9888), 0x18b80154 },
-	{ _MMIO(0x9888), 0x0ab9a000 },
-	{ _MMIO(0x9888), 0x0cb9a000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x0d88c000 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x258baa05 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x198c5400 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x098dc000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x098e05c0 },
-	{ _MMIO(0x9888), 0x058e0000 },
-	{ _MMIO(0x9888), 0x198f0020 },
-	{ _MMIO(0x9888), 0x2185aa0a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x19835000 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x09848000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x19808000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x51800040 },
-	{ _MMIO(0x9888), 0x43800400 },
-	{ _MMIO(0x9888), 0x45800800 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800c62 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f801042 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x418014a4 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1[] = {
-	{ _MMIO(0x9888), 0x10bf03da },
-	{ _MMIO(0x9888), 0x14bf0001 },
-	{ _MMIO(0x9888), 0x12980340 },
-	{ _MMIO(0x9888), 0x12990340 },
-	{ _MMIO(0x9888), 0x0cbf1187 },
-	{ _MMIO(0x9888), 0x0ebf1205 },
-	{ _MMIO(0x9888), 0x00bf0500 },
-	{ _MMIO(0x9888), 0x02bf042b },
-	{ _MMIO(0x9888), 0x04bf002c },
-	{ _MMIO(0x9888), 0x0cdac000 },
-	{ _MMIO(0x9888), 0x0edac000 },
-	{ _MMIO(0x9888), 0x00da8000 },
-	{ _MMIO(0x9888), 0x02dac000 },
-	{ _MMIO(0x9888), 0x04da4000 },
-	{ _MMIO(0x9888), 0x04983400 },
-	{ _MMIO(0x9888), 0x10980000 },
-	{ _MMIO(0x9888), 0x06990034 },
-	{ _MMIO(0x9888), 0x10990000 },
-	{ _MMIO(0x9888), 0x0c9dc000 },
-	{ _MMIO(0x9888), 0x0e9dc000 },
-	{ _MMIO(0x9888), 0x009d8000 },
-	{ _MMIO(0x9888), 0x029dc000 },
-	{ _MMIO(0x9888), 0x049d4000 },
-	{ _MMIO(0x9888), 0x109f02a8 },
-	{ _MMIO(0x9888), 0x0c9fa000 },
-	{ _MMIO(0x9888), 0x0e9f00ba },
-	{ _MMIO(0x9888), 0x0cb88000 },
-	{ _MMIO(0x9888), 0x0cb95000 },
-	{ _MMIO(0x9888), 0x0eb95000 },
-	{ _MMIO(0x9888), 0x00b94000 },
-	{ _MMIO(0x9888), 0x02b95000 },
-	{ _MMIO(0x9888), 0x04b91000 },
-	{ _MMIO(0x9888), 0x06b92000 },
-	{ _MMIO(0x9888), 0x0cba4000 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x258b800a },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b5500 },
-	{ _MMIO(0x9888), 0x198c4000 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x018d8000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2185800a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x1b830154 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x47800000 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800060 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_1;
-	lens[n] = ARRAY_SIZE(mux_config_l3_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_2[] = {
-	{ _MMIO(0x9888), 0x103f03da },
-	{ _MMIO(0x9888), 0x143f0001 },
-	{ _MMIO(0x9888), 0x12180340 },
-	{ _MMIO(0x9888), 0x12190340 },
-	{ _MMIO(0x9888), 0x0c3f1187 },
-	{ _MMIO(0x9888), 0x0e3f1205 },
-	{ _MMIO(0x9888), 0x003f0500 },
-	{ _MMIO(0x9888), 0x023f042b },
-	{ _MMIO(0x9888), 0x043f002c },
-	{ _MMIO(0x9888), 0x0c5ac000 },
-	{ _MMIO(0x9888), 0x0e5ac000 },
-	{ _MMIO(0x9888), 0x005a8000 },
-	{ _MMIO(0x9888), 0x025ac000 },
-	{ _MMIO(0x9888), 0x045a4000 },
-	{ _MMIO(0x9888), 0x04183400 },
-	{ _MMIO(0x9888), 0x10180000 },
-	{ _MMIO(0x9888), 0x06190034 },
-	{ _MMIO(0x9888), 0x10190000 },
-	{ _MMIO(0x9888), 0x0c1dc000 },
-	{ _MMIO(0x9888), 0x0e1dc000 },
-	{ _MMIO(0x9888), 0x001d8000 },
-	{ _MMIO(0x9888), 0x021dc000 },
-	{ _MMIO(0x9888), 0x041d4000 },
-	{ _MMIO(0x9888), 0x101f02a8 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f00ba },
-	{ _MMIO(0x9888), 0x0c388000 },
-	{ _MMIO(0x9888), 0x0c395000 },
-	{ _MMIO(0x9888), 0x0e395000 },
-	{ _MMIO(0x9888), 0x00394000 },
-	{ _MMIO(0x9888), 0x02395000 },
-	{ _MMIO(0x9888), 0x04391000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x0c3a4000 },
-	{ _MMIO(0x9888), 0x1b8aa800 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x258b4005 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x2185800a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x1b830154 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x47800000 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800060 },
-};
-
-static int
-get_l3_2_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_2;
-	lens[n] = ARRAY_SIZE(mux_config_l3_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_3[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_3[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_3[] = {
-	{ _MMIO(0x9888), 0x121b0340 },
-	{ _MMIO(0x9888), 0x103f0274 },
-	{ _MMIO(0x9888), 0x123f0000 },
-	{ _MMIO(0x9888), 0x129b0340 },
-	{ _MMIO(0x9888), 0x10bf0274 },
-	{ _MMIO(0x9888), 0x12bf0000 },
-	{ _MMIO(0x9888), 0x041b3400 },
-	{ _MMIO(0x9888), 0x101b0000 },
-	{ _MMIO(0x9888), 0x045c8000 },
-	{ _MMIO(0x9888), 0x0a3d4000 },
-	{ _MMIO(0x9888), 0x003f0080 },
-	{ _MMIO(0x9888), 0x023f0793 },
-	{ _MMIO(0x9888), 0x043f0014 },
-	{ _MMIO(0x9888), 0x04588000 },
-	{ _MMIO(0x9888), 0x005a8000 },
-	{ _MMIO(0x9888), 0x025ac000 },
-	{ _MMIO(0x9888), 0x045a4000 },
-	{ _MMIO(0x9888), 0x0a5b4000 },
-	{ _MMIO(0x9888), 0x001d8000 },
-	{ _MMIO(0x9888), 0x021dc000 },
-	{ _MMIO(0x9888), 0x041d4000 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f002a },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x00394000 },
-	{ _MMIO(0x9888), 0x02395000 },
-	{ _MMIO(0x9888), 0x04399000 },
-	{ _MMIO(0x9888), 0x069b0034 },
-	{ _MMIO(0x9888), 0x109b0000 },
-	{ _MMIO(0x9888), 0x06dc4000 },
-	{ _MMIO(0x9888), 0x0cbd4000 },
-	{ _MMIO(0x9888), 0x0cbf0981 },
-	{ _MMIO(0x9888), 0x0ebf0a0f },
-	{ _MMIO(0x9888), 0x06d84000 },
-	{ _MMIO(0x9888), 0x0cdac000 },
-	{ _MMIO(0x9888), 0x0edac000 },
-	{ _MMIO(0x9888), 0x0cdb4000 },
-	{ _MMIO(0x9888), 0x0c9dc000 },
-	{ _MMIO(0x9888), 0x0e9dc000 },
-	{ _MMIO(0x9888), 0x109f02a8 },
-	{ _MMIO(0x9888), 0x0e9f0080 },
-	{ _MMIO(0x9888), 0x0cb84000 },
-	{ _MMIO(0x9888), 0x0cb95000 },
-	{ _MMIO(0x9888), 0x0eb95000 },
-	{ _MMIO(0x9888), 0x06b92000 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x258b8009 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x198c4000 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2185800a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x1b830154 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x45800c00 },
-	{ _MMIO(0x9888), 0x47800c63 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f8014a5 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800045 },
-};
-
-static int
-get_l3_3_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_3;
-	lens[n] = ARRAY_SIZE(mux_config_l3_3);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_4[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_4[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_4[] = {
-	{ _MMIO(0x9888), 0x121a0340 },
-	{ _MMIO(0x9888), 0x103f0017 },
-	{ _MMIO(0x9888), 0x123f0020 },
-	{ _MMIO(0x9888), 0x129a0340 },
-	{ _MMIO(0x9888), 0x10bf0017 },
-	{ _MMIO(0x9888), 0x12bf0020 },
-	{ _MMIO(0x9888), 0x041a3400 },
-	{ _MMIO(0x9888), 0x101a0000 },
-	{ _MMIO(0x9888), 0x043b8000 },
-	{ _MMIO(0x9888), 0x0a3e0010 },
-	{ _MMIO(0x9888), 0x003f0200 },
-	{ _MMIO(0x9888), 0x023f0113 },
-	{ _MMIO(0x9888), 0x043f0014 },
-	{ _MMIO(0x9888), 0x02592000 },
-	{ _MMIO(0x9888), 0x005a8000 },
-	{ _MMIO(0x9888), 0x025ac000 },
-	{ _MMIO(0x9888), 0x045a4000 },
-	{ _MMIO(0x9888), 0x0a1c8000 },
-	{ _MMIO(0x9888), 0x001d8000 },
-	{ _MMIO(0x9888), 0x021dc000 },
-	{ _MMIO(0x9888), 0x041d4000 },
-	{ _MMIO(0x9888), 0x0a1e8000 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f001a },
-	{ _MMIO(0x9888), 0x00394000 },
-	{ _MMIO(0x9888), 0x02395000 },
-	{ _MMIO(0x9888), 0x04391000 },
-	{ _MMIO(0x9888), 0x069a0034 },
-	{ _MMIO(0x9888), 0x109a0000 },
-	{ _MMIO(0x9888), 0x06bb4000 },
-	{ _MMIO(0x9888), 0x0abe0040 },
-	{ _MMIO(0x9888), 0x0cbf0984 },
-	{ _MMIO(0x9888), 0x0ebf0a02 },
-	{ _MMIO(0x9888), 0x02d94000 },
-	{ _MMIO(0x9888), 0x0cdac000 },
-	{ _MMIO(0x9888), 0x0edac000 },
-	{ _MMIO(0x9888), 0x0c9c0400 },
-	{ _MMIO(0x9888), 0x0c9dc000 },
-	{ _MMIO(0x9888), 0x0e9dc000 },
-	{ _MMIO(0x9888), 0x0c9e0400 },
-	{ _MMIO(0x9888), 0x109f02a8 },
-	{ _MMIO(0x9888), 0x0e9f0040 },
-	{ _MMIO(0x9888), 0x0cb95000 },
-	{ _MMIO(0x9888), 0x0eb95000 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x258b8009 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x198c4000 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2185800a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x1b830154 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x45800800 },
-	{ _MMIO(0x9888), 0x47800842 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f801084 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800044 },
-};
-
-static int
-get_l3_4_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_4;
-	lens[n] = ARRAY_SIZE(mux_config_l3_4);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00006000 },
-	{ _MMIO(0x2774), 0x0000f3ff },
-	{ _MMIO(0x2778), 0x00001800 },
-	{ _MMIO(0x277c), 0x0000fcff },
-	{ _MMIO(0x2780), 0x00000600 },
-	{ _MMIO(0x2784), 0x0000ff3f },
-	{ _MMIO(0x2788), 0x00000180 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000060 },
-	{ _MMIO(0x2794), 0x0000fff3 },
-	{ _MMIO(0x2798), 0x00000018 },
-	{ _MMIO(0x279c), 0x0000fffc },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x143b000e },
-	{ _MMIO(0x9888), 0x043c55c0 },
-	{ _MMIO(0x9888), 0x0a1e0280 },
-	{ _MMIO(0x9888), 0x0c1e0408 },
-	{ _MMIO(0x9888), 0x10390000 },
-	{ _MMIO(0x9888), 0x12397a1f },
-	{ _MMIO(0x9888), 0x14bb000e },
-	{ _MMIO(0x9888), 0x04bc5000 },
-	{ _MMIO(0x9888), 0x0a9e0296 },
-	{ _MMIO(0x9888), 0x0c9e0008 },
-	{ _MMIO(0x9888), 0x10b90000 },
-	{ _MMIO(0x9888), 0x12b97a1f },
-	{ _MMIO(0x9888), 0x063b0042 },
-	{ _MMIO(0x9888), 0x103b0000 },
-	{ _MMIO(0x9888), 0x083c0000 },
-	{ _MMIO(0x9888), 0x0a3e0040 },
-	{ _MMIO(0x9888), 0x043f8000 },
-	{ _MMIO(0x9888), 0x02594000 },
-	{ _MMIO(0x9888), 0x045a8000 },
-	{ _MMIO(0x9888), 0x0c1c0400 },
-	{ _MMIO(0x9888), 0x041d8000 },
-	{ _MMIO(0x9888), 0x081e02c0 },
-	{ _MMIO(0x9888), 0x0e1e0000 },
-	{ _MMIO(0x9888), 0x0c1fa800 },
-	{ _MMIO(0x9888), 0x0e1f0260 },
-	{ _MMIO(0x9888), 0x101f0014 },
-	{ _MMIO(0x9888), 0x003905e0 },
-	{ _MMIO(0x9888), 0x06390bc0 },
-	{ _MMIO(0x9888), 0x02390018 },
-	{ _MMIO(0x9888), 0x04394000 },
-	{ _MMIO(0x9888), 0x04bb0042 },
-	{ _MMIO(0x9888), 0x10bb0000 },
-	{ _MMIO(0x9888), 0x02bc05c0 },
-	{ _MMIO(0x9888), 0x08bc0000 },
-	{ _MMIO(0x9888), 0x0abe0004 },
-	{ _MMIO(0x9888), 0x02bf8000 },
-	{ _MMIO(0x9888), 0x02d91000 },
-	{ _MMIO(0x9888), 0x02da8000 },
-	{ _MMIO(0x9888), 0x089c8000 },
-	{ _MMIO(0x9888), 0x029d8000 },
-	{ _MMIO(0x9888), 0x089e8000 },
-	{ _MMIO(0x9888), 0x0e9e0000 },
-	{ _MMIO(0x9888), 0x0e9fa806 },
-	{ _MMIO(0x9888), 0x109f0142 },
-	{ _MMIO(0x9888), 0x08b90617 },
-	{ _MMIO(0x9888), 0x0ab90be0 },
-	{ _MMIO(0x9888), 0x02b94000 },
-	{ _MMIO(0x9888), 0x0d88f000 },
-	{ _MMIO(0x9888), 0x0f88000c },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x1b8a2800 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x238b52a0 },
-	{ _MMIO(0x9888), 0x258b6a95 },
-	{ _MMIO(0x9888), 0x278b0029 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c1500 },
-	{ _MMIO(0x9888), 0x1b8c0014 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x038d8000 },
-	{ _MMIO(0x9888), 0x058d2000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x4d800444 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f804000 },
-	{ _MMIO(0x9888), 0x43801080 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800084 },
-	{ _MMIO(0x9888), 0x53800044 },
-	{ _MMIO(0x9888), 0x47801080 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800000 },
-	{ _MMIO(0x9888), 0x41800840 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler_1[] = {
-	{ _MMIO(0x9888), 0x18921400 },
-	{ _MMIO(0x9888), 0x149500ab },
-	{ _MMIO(0x9888), 0x18b21400 },
-	{ _MMIO(0x9888), 0x14b500ab },
-	{ _MMIO(0x9888), 0x18d21400 },
-	{ _MMIO(0x9888), 0x14d500ab },
-	{ _MMIO(0x9888), 0x0cdc8000 },
-	{ _MMIO(0x9888), 0x0edc4000 },
-	{ _MMIO(0x9888), 0x02dcc000 },
-	{ _MMIO(0x9888), 0x04dcc000 },
-	{ _MMIO(0x9888), 0x1abd00a0 },
-	{ _MMIO(0x9888), 0x0abd8000 },
-	{ _MMIO(0x9888), 0x0cd88000 },
-	{ _MMIO(0x9888), 0x0ed84000 },
-	{ _MMIO(0x9888), 0x04d88000 },
-	{ _MMIO(0x9888), 0x1adb0050 },
-	{ _MMIO(0x9888), 0x04db8000 },
-	{ _MMIO(0x9888), 0x06db8000 },
-	{ _MMIO(0x9888), 0x08db8000 },
-	{ _MMIO(0x9888), 0x0adb4000 },
-	{ _MMIO(0x9888), 0x109f02a0 },
-	{ _MMIO(0x9888), 0x0c9fa000 },
-	{ _MMIO(0x9888), 0x0e9f00aa },
-	{ _MMIO(0x9888), 0x18b82500 },
-	{ _MMIO(0x9888), 0x02b88000 },
-	{ _MMIO(0x9888), 0x04b84000 },
-	{ _MMIO(0x9888), 0x06b84000 },
-	{ _MMIO(0x9888), 0x08b84000 },
-	{ _MMIO(0x9888), 0x0ab84000 },
-	{ _MMIO(0x9888), 0x0cb88000 },
-	{ _MMIO(0x9888), 0x0cb98000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x00b98000 },
-	{ _MMIO(0x9888), 0x02b9a000 },
-	{ _MMIO(0x9888), 0x04b9a000 },
-	{ _MMIO(0x9888), 0x06b92000 },
-	{ _MMIO(0x9888), 0x1aba0200 },
-	{ _MMIO(0x9888), 0x02ba8000 },
-	{ _MMIO(0x9888), 0x0cba8000 },
-	{ _MMIO(0x9888), 0x04908000 },
-	{ _MMIO(0x9888), 0x04918000 },
-	{ _MMIO(0x9888), 0x04927300 },
-	{ _MMIO(0x9888), 0x10920000 },
-	{ _MMIO(0x9888), 0x1893000a },
-	{ _MMIO(0x9888), 0x0a934000 },
-	{ _MMIO(0x9888), 0x0a946000 },
-	{ _MMIO(0x9888), 0x0c959000 },
-	{ _MMIO(0x9888), 0x0e950098 },
-	{ _MMIO(0x9888), 0x10950000 },
-	{ _MMIO(0x9888), 0x04b04000 },
-	{ _MMIO(0x9888), 0x04b14000 },
-	{ _MMIO(0x9888), 0x04b20073 },
-	{ _MMIO(0x9888), 0x10b20000 },
-	{ _MMIO(0x9888), 0x04b38000 },
-	{ _MMIO(0x9888), 0x06b38000 },
-	{ _MMIO(0x9888), 0x08b34000 },
-	{ _MMIO(0x9888), 0x04b4c000 },
-	{ _MMIO(0x9888), 0x02b59890 },
-	{ _MMIO(0x9888), 0x10b50000 },
-	{ _MMIO(0x9888), 0x06d04000 },
-	{ _MMIO(0x9888), 0x06d14000 },
-	{ _MMIO(0x9888), 0x06d20073 },
-	{ _MMIO(0x9888), 0x10d20000 },
-	{ _MMIO(0x9888), 0x18d30020 },
-	{ _MMIO(0x9888), 0x02d38000 },
-	{ _MMIO(0x9888), 0x0cd34000 },
-	{ _MMIO(0x9888), 0x0ad48000 },
-	{ _MMIO(0x9888), 0x04d42000 },
-	{ _MMIO(0x9888), 0x0ed59000 },
-	{ _MMIO(0x9888), 0x00d59800 },
-	{ _MMIO(0x9888), 0x10d50000 },
-	{ _MMIO(0x9888), 0x0f88000e },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b5500 },
-	{ _MMIO(0x9888), 0x258b000a },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x0d8d8000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x018d8000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x2185000a },
-	{ _MMIO(0x9888), 0x1b830150 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d848000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d808000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47801021 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800c64 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800c02 },
-};
-
-static int
-get_sampler_1_mux_config(struct drm_i915_private *dev_priv,
-			 const struct i915_oa_reg **regs,
-			 int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler_1;
-	lens[n] = ARRAY_SIZE(mux_config_sampler_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler_2[] = {
-	{ _MMIO(0x9888), 0x18121400 },
-	{ _MMIO(0x9888), 0x141500ab },
-	{ _MMIO(0x9888), 0x18321400 },
-	{ _MMIO(0x9888), 0x143500ab },
-	{ _MMIO(0x9888), 0x18521400 },
-	{ _MMIO(0x9888), 0x145500ab },
-	{ _MMIO(0x9888), 0x0c5c8000 },
-	{ _MMIO(0x9888), 0x0e5c4000 },
-	{ _MMIO(0x9888), 0x025cc000 },
-	{ _MMIO(0x9888), 0x045cc000 },
-	{ _MMIO(0x9888), 0x1a3d00a0 },
-	{ _MMIO(0x9888), 0x0a3d8000 },
-	{ _MMIO(0x9888), 0x0c588000 },
-	{ _MMIO(0x9888), 0x0e584000 },
-	{ _MMIO(0x9888), 0x04588000 },
-	{ _MMIO(0x9888), 0x1a5b0050 },
-	{ _MMIO(0x9888), 0x045b8000 },
-	{ _MMIO(0x9888), 0x065b8000 },
-	{ _MMIO(0x9888), 0x085b8000 },
-	{ _MMIO(0x9888), 0x0a5b4000 },
-	{ _MMIO(0x9888), 0x101f02a0 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f00aa },
-	{ _MMIO(0x9888), 0x18382500 },
-	{ _MMIO(0x9888), 0x02388000 },
-	{ _MMIO(0x9888), 0x04384000 },
-	{ _MMIO(0x9888), 0x06384000 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0c388000 },
-	{ _MMIO(0x9888), 0x0c398000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x00398000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x1a3a0200 },
-	{ _MMIO(0x9888), 0x023a8000 },
-	{ _MMIO(0x9888), 0x0c3a8000 },
-	{ _MMIO(0x9888), 0x04108000 },
-	{ _MMIO(0x9888), 0x04118000 },
-	{ _MMIO(0x9888), 0x04127300 },
-	{ _MMIO(0x9888), 0x10120000 },
-	{ _MMIO(0x9888), 0x1813000a },
-	{ _MMIO(0x9888), 0x0a134000 },
-	{ _MMIO(0x9888), 0x0a146000 },
-	{ _MMIO(0x9888), 0x0c159000 },
-	{ _MMIO(0x9888), 0x0e150098 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x04304000 },
-	{ _MMIO(0x9888), 0x04314000 },
-	{ _MMIO(0x9888), 0x04320073 },
-	{ _MMIO(0x9888), 0x10320000 },
-	{ _MMIO(0x9888), 0x04338000 },
-	{ _MMIO(0x9888), 0x06338000 },
-	{ _MMIO(0x9888), 0x08334000 },
-	{ _MMIO(0x9888), 0x0434c000 },
-	{ _MMIO(0x9888), 0x02359890 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x06504000 },
-	{ _MMIO(0x9888), 0x06514000 },
-	{ _MMIO(0x9888), 0x06520073 },
-	{ _MMIO(0x9888), 0x10520000 },
-	{ _MMIO(0x9888), 0x18530020 },
-	{ _MMIO(0x9888), 0x02538000 },
-	{ _MMIO(0x9888), 0x0c534000 },
-	{ _MMIO(0x9888), 0x0a548000 },
-	{ _MMIO(0x9888), 0x04542000 },
-	{ _MMIO(0x9888), 0x0e559000 },
-	{ _MMIO(0x9888), 0x00559800 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x1b8aa000 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x258b0005 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x2185000a },
-	{ _MMIO(0x9888), 0x1b830150 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d848000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d808000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47801021 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800c64 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800c02 },
-};
-
-static int
-get_sampler_2_mux_config(struct drm_i915_private *dev_priv,
-			 const struct i915_oa_reg **regs,
-			 int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler_2;
-	lens[n] = ARRAY_SIZE(mux_config_sampler_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x0000fe7f },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000ffbf },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fff7 },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fff9 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x16154d60 },
-	{ _MMIO(0x9888), 0x16352e60 },
-	{ _MMIO(0x9888), 0x16554d60 },
-	{ _MMIO(0x9888), 0x16950000 },
-	{ _MMIO(0x9888), 0x16b50000 },
-	{ _MMIO(0x9888), 0x16d50000 },
-	{ _MMIO(0x9888), 0x005c8000 },
-	{ _MMIO(0x9888), 0x045cc000 },
-	{ _MMIO(0x9888), 0x065c4000 },
-	{ _MMIO(0x9888), 0x083d8000 },
-	{ _MMIO(0x9888), 0x0a3d8000 },
-	{ _MMIO(0x9888), 0x0458c000 },
-	{ _MMIO(0x9888), 0x025b8000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5b4000 },
-	{ _MMIO(0x9888), 0x0c5b8000 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f00aa },
-	{ _MMIO(0x9888), 0x02384000 },
-	{ _MMIO(0x9888), 0x04388000 },
-	{ _MMIO(0x9888), 0x06388000 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0c384000 },
-	{ _MMIO(0x9888), 0x00398000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x043a8000 },
-	{ _MMIO(0x9888), 0x063a8000 },
-	{ _MMIO(0x9888), 0x08138000 },
-	{ _MMIO(0x9888), 0x0a138000 },
-	{ _MMIO(0x9888), 0x06143000 },
-	{ _MMIO(0x9888), 0x0415cfc7 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x02338000 },
-	{ _MMIO(0x9888), 0x0c338000 },
-	{ _MMIO(0x9888), 0x04342000 },
-	{ _MMIO(0x9888), 0x06344000 },
-	{ _MMIO(0x9888), 0x0035c700 },
-	{ _MMIO(0x9888), 0x063500cf },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x04538000 },
-	{ _MMIO(0x9888), 0x06538000 },
-	{ _MMIO(0x9888), 0x0454c000 },
-	{ _MMIO(0x9888), 0x0255cfc7 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x06dc8000 },
-	{ _MMIO(0x9888), 0x08dc4000 },
-	{ _MMIO(0x9888), 0x0cdcc000 },
-	{ _MMIO(0x9888), 0x0edcc000 },
-	{ _MMIO(0x9888), 0x1abd00a8 },
-	{ _MMIO(0x9888), 0x0cd8c000 },
-	{ _MMIO(0x9888), 0x0ed84000 },
-	{ _MMIO(0x9888), 0x0edb8000 },
-	{ _MMIO(0x9888), 0x18db0800 },
-	{ _MMIO(0x9888), 0x1adb0254 },
-	{ _MMIO(0x9888), 0x0e9faa00 },
-	{ _MMIO(0x9888), 0x109f02aa },
-	{ _MMIO(0x9888), 0x0eb84000 },
-	{ _MMIO(0x9888), 0x16b84000 },
-	{ _MMIO(0x9888), 0x18b8156a },
-	{ _MMIO(0x9888), 0x06b98000 },
-	{ _MMIO(0x9888), 0x08b9a000 },
-	{ _MMIO(0x9888), 0x0ab9a000 },
-	{ _MMIO(0x9888), 0x0cb9a000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x18baa000 },
-	{ _MMIO(0x9888), 0x1aba0002 },
-	{ _MMIO(0x9888), 0x16934000 },
-	{ _MMIO(0x9888), 0x1893000a },
-	{ _MMIO(0x9888), 0x0a947000 },
-	{ _MMIO(0x9888), 0x0c95c5c1 },
-	{ _MMIO(0x9888), 0x0e9500c3 },
-	{ _MMIO(0x9888), 0x10950000 },
-	{ _MMIO(0x9888), 0x0eb38000 },
-	{ _MMIO(0x9888), 0x16b30040 },
-	{ _MMIO(0x9888), 0x18b30020 },
-	{ _MMIO(0x9888), 0x06b48000 },
-	{ _MMIO(0x9888), 0x08b41000 },
-	{ _MMIO(0x9888), 0x0ab48000 },
-	{ _MMIO(0x9888), 0x06b5c500 },
-	{ _MMIO(0x9888), 0x08b500c3 },
-	{ _MMIO(0x9888), 0x0eb5c100 },
-	{ _MMIO(0x9888), 0x10b50000 },
-	{ _MMIO(0x9888), 0x16d31500 },
-	{ _MMIO(0x9888), 0x08d4e000 },
-	{ _MMIO(0x9888), 0x08d5c100 },
-	{ _MMIO(0x9888), 0x0ad5c3c5 },
-	{ _MMIO(0x9888), 0x10d50000 },
-	{ _MMIO(0x9888), 0x0d88f800 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x258baaa5 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x0f8c4000 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c5500 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x078d8000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800c42 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800063 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800800 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f8014a4 },
-	{ _MMIO(0x9888), 0x41801042 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x0000fe7f },
-	{ _MMIO(0x2780), 0x00000000 },
-	{ _MMIO(0x2784), 0x0000ff9f },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000ffe7 },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fffb },
-	{ _MMIO(0x2798), 0x00000002 },
-	{ _MMIO(0x279c), 0x0000fffd },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x16150000 },
-	{ _MMIO(0x9888), 0x16350000 },
-	{ _MMIO(0x9888), 0x16550000 },
-	{ _MMIO(0x9888), 0x16952e60 },
-	{ _MMIO(0x9888), 0x16b54d60 },
-	{ _MMIO(0x9888), 0x16d52e60 },
-	{ _MMIO(0x9888), 0x065c8000 },
-	{ _MMIO(0x9888), 0x085cc000 },
-	{ _MMIO(0x9888), 0x0a5cc000 },
-	{ _MMIO(0x9888), 0x0c5c4000 },
-	{ _MMIO(0x9888), 0x0e3d8000 },
-	{ _MMIO(0x9888), 0x183da000 },
-	{ _MMIO(0x9888), 0x06588000 },
-	{ _MMIO(0x9888), 0x08588000 },
-	{ _MMIO(0x9888), 0x0a584000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x185b5800 },
-	{ _MMIO(0x9888), 0x1a5b000a },
-	{ _MMIO(0x9888), 0x0e1faa00 },
-	{ _MMIO(0x9888), 0x101f02aa },
-	{ _MMIO(0x9888), 0x0e384000 },
-	{ _MMIO(0x9888), 0x16384000 },
-	{ _MMIO(0x9888), 0x18382a55 },
-	{ _MMIO(0x9888), 0x06398000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0c39a000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x1a3a02a0 },
-	{ _MMIO(0x9888), 0x0e138000 },
-	{ _MMIO(0x9888), 0x16130500 },
-	{ _MMIO(0x9888), 0x06148000 },
-	{ _MMIO(0x9888), 0x08146000 },
-	{ _MMIO(0x9888), 0x0615c100 },
-	{ _MMIO(0x9888), 0x0815c500 },
-	{ _MMIO(0x9888), 0x0a1500c3 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x16335040 },
-	{ _MMIO(0x9888), 0x08349000 },
-	{ _MMIO(0x9888), 0x0a341000 },
-	{ _MMIO(0x9888), 0x083500c1 },
-	{ _MMIO(0x9888), 0x0a35c500 },
-	{ _MMIO(0x9888), 0x0c3500c3 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x1853002a },
-	{ _MMIO(0x9888), 0x0a54e000 },
-	{ _MMIO(0x9888), 0x0c55c500 },
-	{ _MMIO(0x9888), 0x0e55c1c3 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x00dc8000 },
-	{ _MMIO(0x9888), 0x02dcc000 },
-	{ _MMIO(0x9888), 0x04dc4000 },
-	{ _MMIO(0x9888), 0x04bd8000 },
-	{ _MMIO(0x9888), 0x06bd8000 },
-	{ _MMIO(0x9888), 0x02d8c000 },
-	{ _MMIO(0x9888), 0x02db8000 },
-	{ _MMIO(0x9888), 0x04db4000 },
-	{ _MMIO(0x9888), 0x06db4000 },
-	{ _MMIO(0x9888), 0x08db8000 },
-	{ _MMIO(0x9888), 0x0c9fa000 },
-	{ _MMIO(0x9888), 0x0e9f00aa },
-	{ _MMIO(0x9888), 0x02b84000 },
-	{ _MMIO(0x9888), 0x04b84000 },
-	{ _MMIO(0x9888), 0x06b84000 },
-	{ _MMIO(0x9888), 0x08b84000 },
-	{ _MMIO(0x9888), 0x0ab88000 },
-	{ _MMIO(0x9888), 0x0cb88000 },
-	{ _MMIO(0x9888), 0x00b98000 },
-	{ _MMIO(0x9888), 0x02b9a000 },
-	{ _MMIO(0x9888), 0x04b9a000 },
-	{ _MMIO(0x9888), 0x06b92000 },
-	{ _MMIO(0x9888), 0x0aba8000 },
-	{ _MMIO(0x9888), 0x0cba8000 },
-	{ _MMIO(0x9888), 0x04938000 },
-	{ _MMIO(0x9888), 0x06938000 },
-	{ _MMIO(0x9888), 0x0494c000 },
-	{ _MMIO(0x9888), 0x0295cfc7 },
-	{ _MMIO(0x9888), 0x10950000 },
-	{ _MMIO(0x9888), 0x02b38000 },
-	{ _MMIO(0x9888), 0x08b38000 },
-	{ _MMIO(0x9888), 0x04b42000 },
-	{ _MMIO(0x9888), 0x06b41000 },
-	{ _MMIO(0x9888), 0x00b5c700 },
-	{ _MMIO(0x9888), 0x04b500cf },
-	{ _MMIO(0x9888), 0x10b50000 },
-	{ _MMIO(0x9888), 0x0ad38000 },
-	{ _MMIO(0x9888), 0x0cd38000 },
-	{ _MMIO(0x9888), 0x06d46000 },
-	{ _MMIO(0x9888), 0x04d5c700 },
-	{ _MMIO(0x9888), 0x06d500cf },
-	{ _MMIO(0x9888), 0x10d50000 },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8aaaa0 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x258b555a },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x238b5500 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x018d8000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800882 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45801082 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x478014a5 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800002 },
-	{ _MMIO(0x9888), 0x41800c62 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extra[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extra[] = {
-	{ _MMIO(0xe458), 0x00001000 },
-	{ _MMIO(0xe558), 0x00003002 },
-	{ _MMIO(0xe658), 0x00005004 },
-	{ _MMIO(0xe758), 0x00011010 },
-	{ _MMIO(0xe45c), 0x00050012 },
-	{ _MMIO(0xe55c), 0x00052051 },
-	{ _MMIO(0xe65c), 0x00000008 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extra[] = {
-	{ _MMIO(0x9888), 0x161503e0 },
-	{ _MMIO(0x9888), 0x163503e0 },
-	{ _MMIO(0x9888), 0x165503e0 },
-	{ _MMIO(0x9888), 0x169503e0 },
-	{ _MMIO(0x9888), 0x16b503e0 },
-	{ _MMIO(0x9888), 0x16d503e0 },
-	{ _MMIO(0x9888), 0x045cc000 },
-	{ _MMIO(0x9888), 0x083d8000 },
-	{ _MMIO(0x9888), 0x04584000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5b8000 },
-	{ _MMIO(0x9888), 0x0e1f00a8 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0c388000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x0c3a8000 },
-	{ _MMIO(0x9888), 0x08138000 },
-	{ _MMIO(0x9888), 0x06141000 },
-	{ _MMIO(0x9888), 0x041500c3 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x0a338000 },
-	{ _MMIO(0x9888), 0x06342000 },
-	{ _MMIO(0x9888), 0x0435c300 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x0c538000 },
-	{ _MMIO(0x9888), 0x06544000 },
-	{ _MMIO(0x9888), 0x065500c3 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x00dc8000 },
-	{ _MMIO(0x9888), 0x02dc4000 },
-	{ _MMIO(0x9888), 0x02bd8000 },
-	{ _MMIO(0x9888), 0x00d88000 },
-	{ _MMIO(0x9888), 0x02db4000 },
-	{ _MMIO(0x9888), 0x04db8000 },
-	{ _MMIO(0x9888), 0x0c9fa000 },
-	{ _MMIO(0x9888), 0x0e9f0002 },
-	{ _MMIO(0x9888), 0x02b84000 },
-	{ _MMIO(0x9888), 0x04b84000 },
-	{ _MMIO(0x9888), 0x06b88000 },
-	{ _MMIO(0x9888), 0x00b98000 },
-	{ _MMIO(0x9888), 0x02b9a000 },
-	{ _MMIO(0x9888), 0x06ba8000 },
-	{ _MMIO(0x9888), 0x02938000 },
-	{ _MMIO(0x9888), 0x04942000 },
-	{ _MMIO(0x9888), 0x0095c300 },
-	{ _MMIO(0x9888), 0x10950000 },
-	{ _MMIO(0x9888), 0x04b38000 },
-	{ _MMIO(0x9888), 0x04b44000 },
-	{ _MMIO(0x9888), 0x02b500c3 },
-	{ _MMIO(0x9888), 0x10b50000 },
-	{ _MMIO(0x9888), 0x06d38000 },
-	{ _MMIO(0x9888), 0x04d48000 },
-	{ _MMIO(0x9888), 0x02d5c300 },
-	{ _MMIO(0x9888), 0x10d50000 },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x238b3500 },
-	{ _MMIO(0x9888), 0x258b0005 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x018d8000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x2185000a },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800c40 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41801482 },
-	{ _MMIO(0x9888), 0x31800000 },
-};
-
-static int
-get_compute_extra_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extra;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extra);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_vme_pipe[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00100030 },
-	{ _MMIO(0x2774), 0x0000fff9 },
-	{ _MMIO(0x2778), 0x00000002 },
-	{ _MMIO(0x277c), 0x0000fffc },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000fff3 },
-	{ _MMIO(0x2788), 0x00100180 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000ffcf },
-	{ _MMIO(0x2798), 0x00000002 },
-	{ _MMIO(0x279c), 0x0000ff3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_vme_pipe[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00008003 },
-};
-
-static const struct i915_oa_reg mux_config_vme_pipe[] = {
-	{ _MMIO(0x9888), 0x14100812 },
-	{ _MMIO(0x9888), 0x14125800 },
-	{ _MMIO(0x9888), 0x161200c0 },
-	{ _MMIO(0x9888), 0x14300812 },
-	{ _MMIO(0x9888), 0x14325800 },
-	{ _MMIO(0x9888), 0x163200c0 },
-	{ _MMIO(0x9888), 0x005c4000 },
-	{ _MMIO(0x9888), 0x065c8000 },
-	{ _MMIO(0x9888), 0x085cc000 },
-	{ _MMIO(0x9888), 0x0a5cc000 },
-	{ _MMIO(0x9888), 0x0c5cc000 },
-	{ _MMIO(0x9888), 0x003d8000 },
-	{ _MMIO(0x9888), 0x0e3d8000 },
-	{ _MMIO(0x9888), 0x183d2800 },
-	{ _MMIO(0x9888), 0x00584000 },
-	{ _MMIO(0x9888), 0x06588000 },
-	{ _MMIO(0x9888), 0x0858c000 },
-	{ _MMIO(0x9888), 0x005b4000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x185b9400 },
-	{ _MMIO(0x9888), 0x1a5b002a },
-	{ _MMIO(0x9888), 0x0c1f0800 },
-	{ _MMIO(0x9888), 0x0e1faa00 },
-	{ _MMIO(0x9888), 0x101f002a },
-	{ _MMIO(0x9888), 0x00384000 },
-	{ _MMIO(0x9888), 0x0e384000 },
-	{ _MMIO(0x9888), 0x16384000 },
-	{ _MMIO(0x9888), 0x18380155 },
-	{ _MMIO(0x9888), 0x00392000 },
-	{ _MMIO(0x9888), 0x06398000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0c39a000 },
-	{ _MMIO(0x9888), 0x00100047 },
-	{ _MMIO(0x9888), 0x06101a80 },
-	{ _MMIO(0x9888), 0x10100000 },
-	{ _MMIO(0x9888), 0x0810c000 },
-	{ _MMIO(0x9888), 0x0811c000 },
-	{ _MMIO(0x9888), 0x08126151 },
-	{ _MMIO(0x9888), 0x10120000 },
-	{ _MMIO(0x9888), 0x00134000 },
-	{ _MMIO(0x9888), 0x0e134000 },
-	{ _MMIO(0x9888), 0x161300a0 },
-	{ _MMIO(0x9888), 0x0a301ac7 },
-	{ _MMIO(0x9888), 0x10300000 },
-	{ _MMIO(0x9888), 0x0c30c000 },
-	{ _MMIO(0x9888), 0x0c31c000 },
-	{ _MMIO(0x9888), 0x0c326151 },
-	{ _MMIO(0x9888), 0x10320000 },
-	{ _MMIO(0x9888), 0x16332a00 },
-	{ _MMIO(0x9888), 0x18330001 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8a2aa0 },
-	{ _MMIO(0x9888), 0x238b0020 },
-	{ _MMIO(0x9888), 0x258b5550 },
-	{ _MMIO(0x9888), 0x278b0001 },
-	{ _MMIO(0x9888), 0x1f850080 },
-	{ _MMIO(0x9888), 0x2185aaa0 },
-	{ _MMIO(0x9888), 0x23850002 },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830015 },
-	{ _MMIO(0x9888), 0x01844000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x11804000 },
-	{ _MMIO(0x9888), 0x17808000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3d800800 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800002 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800884 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800002 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-};
-
-static int
-get_vme_pipe_mux_config(struct drm_i915_private *dev_priv,
-			const struct i915_oa_reg **regs,
-			int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_vme_pipe;
-	lens[n] = ARRAY_SIZE(mux_config_vme_pipe);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2744), 0x00800000 },
@@ -4035,6 +60,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x000000a0 },
 	{ _MMIO(0x9888), 0x198b0000 },
 	{ _MMIO(0x9888), 0x078b0066 },
 	{ _MMIO(0x9888), 0x118b0000 },
@@ -4047,1330 +73,38 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x4f800000 },
 	{ _MMIO(0x9888), 0x41800000 },
 	{ _MMIO(0x9888), 0x31800000 },
-};
-
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_bdw(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_L3_CACHE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_l3_cache_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_L3_CACHE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_l3_cache;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_l3_cache);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_l3_cache;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_l3_cache);
-
-		return 0;
-	case METRIC_SET_ID_DATA_PORT_READS_COALESCING:
-		dev_priv->perf.oa.n_mux_configs =
-			get_data_port_reads_coalescing_mux_config(dev_priv,
-								  dev_priv->perf.oa.mux_regs,
-								  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"DATA_PORT_READS_COALESCING\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_data_port_reads_coalescing;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_data_port_reads_coalescing);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_data_port_reads_coalescing;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_data_port_reads_coalescing);
-
-		return 0;
-	case METRIC_SET_ID_DATA_PORT_WRITES_COALESCING:
-		dev_priv->perf.oa.n_mux_configs =
-			get_data_port_writes_coalescing_mux_config(dev_priv,
-								   dev_priv->perf.oa.mux_regs,
-								   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"DATA_PORT_WRITES_COALESCING\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_data_port_writes_coalescing;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_data_port_writes_coalescing);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_data_port_writes_coalescing;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_data_port_writes_coalescing);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_L3_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_2_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_2);
-
-		return 0;
-	case METRIC_SET_ID_L3_3:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_3_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_3\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_3;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_3);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_3;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_3);
-
-		return 0;
-	case METRIC_SET_ID_L3_4:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_4_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_4\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_4;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_4);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_4;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_4);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_1_mux_config(dev_priv,
-						 dev_priv->perf.oa.mux_regs,
-						 dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler_1);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_2_mux_config(dev_priv,
-						 dev_priv->perf.oa.mux_regs,
-						 dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler_2);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTRA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extra_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTRA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extra;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extra);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extra;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extra);
-
-		return 0;
-	case METRIC_SET_ID_VME_PIPE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_vme_pipe_mux_config(dev_priv,
-						dev_priv->perf.oa.mux_regs,
-						dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"VME_PIPE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_vme_pipe;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_vme_pipe);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_vme_pipe;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_vme_pipe);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "b541bd57-0e0f-4154-b4c0-5858010a2bf7",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "35fbc9b2-a891-40a6-a38d-022bb7057552",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "233d0544-fff7-4281-8291-e02f222aff72",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "2b255d48-2117-4fef-a8f7-f151e1d25a2c",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "f7fd3220-b466-4a4d-9f98-b0caf3f2394c",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "e99ccaca-821c-4df9-97a7-96bdb7204e43",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_compute_l3_cache_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_L3_CACHE);
-}
-
-static struct device_attribute dev_attr_compute_l3_cache_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_l3_cache_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_l3_cache[] = {
-	&dev_attr_compute_l3_cache_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_l3_cache = {
-	.name = "27a364dc-8225-4ecb-b607-d6f1925598d9",
-	.attrs =  attrs_compute_l3_cache,
-};
-
-static ssize_t
-show_data_port_reads_coalescing_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_DATA_PORT_READS_COALESCING);
-}
-
-static struct device_attribute dev_attr_data_port_reads_coalescing_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_data_port_reads_coalescing_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_data_port_reads_coalescing[] = {
-	&dev_attr_data_port_reads_coalescing_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_data_port_reads_coalescing = {
-	.name = "857fc630-2f09-4804-85f1-084adfadd5ab",
-	.attrs =  attrs_data_port_reads_coalescing,
-};
-
-static ssize_t
-show_data_port_writes_coalescing_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_DATA_PORT_WRITES_COALESCING);
-}
-
-static struct device_attribute dev_attr_data_port_writes_coalescing_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_data_port_writes_coalescing_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_data_port_writes_coalescing[] = {
-	&dev_attr_data_port_writes_coalescing_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_data_port_writes_coalescing = {
-	.name = "343ebc99-4a55-414c-8c17-d8e259cf5e20",
-	.attrs =  attrs_data_port_writes_coalescing,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "7bdafd88-a4fa-4ed5-bc09-1a977aa5be3e",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "9385ebb2-f34f-4aa5-aec5-7e9cbbea0f0b",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_l3_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_2);
-}
-
-static struct device_attribute dev_attr_l3_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_2[] = {
-	&dev_attr_l3_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_2 = {
-	.name = "446ae59b-ff2e-41c9-b49e-0184a54bf00a",
-	.attrs =  attrs_l3_2,
-};
-
-static ssize_t
-show_l3_3_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_3);
-}
-
-static struct device_attribute dev_attr_l3_3_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_3_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_3[] = {
-	&dev_attr_l3_3_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_3 = {
-	.name = "84a7956f-1ea4-4d0d-837f-e39a0376e38c",
-	.attrs =  attrs_l3_3,
-};
-
-static ssize_t
-show_l3_4_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_4);
-}
-
-static struct device_attribute dev_attr_l3_4_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_4_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_4[] = {
-	&dev_attr_l3_4_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_4 = {
-	.name = "92b493d9-df18-4bed-be06-5cac6f2a6f5f",
-	.attrs =  attrs_l3_4,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "14345c35-cc46-40d0-bb04-6ed1fbb43679",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER_1);
-}
-
-static struct device_attribute dev_attr_sampler_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler_1[] = {
-	&dev_attr_sampler_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler_1 = {
-	.name = "f0c6ba37-d3d3-4211-91b5-226730312a54",
-	.attrs =  attrs_sampler_1,
-};
-
-static ssize_t
-show_sampler_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER_2);
-}
-
-static struct device_attribute dev_attr_sampler_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler_2[] = {
-	&dev_attr_sampler_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler_2 = {
-	.name = "30bf3702-48cf-4bca-b412-7cf50bb2f564",
-	.attrs =  attrs_sampler_2,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "238bec85-df05-44f3-b905-d166712f2451",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "24bf02cd-8693-4583-981c-c4165b33da01",
-	.attrs =  attrs_tdl_2,
-};
-
-static ssize_t
-show_compute_extra_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTRA);
-}
-
-static struct device_attribute dev_attr_compute_extra_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extra_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extra[] = {
-	&dev_attr_compute_extra_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extra = {
-	.name = "8fb61ba2-2fbb-454c-a136-2dec5a8a595e",
-	.attrs =  attrs_compute_extra,
-};
-
-static ssize_t
-show_vme_pipe_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_VME_PIPE);
-}
-
-static struct device_attribute dev_attr_vme_pipe_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_vme_pipe_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_vme_pipe[] = {
-	&dev_attr_vme_pipe_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_vme_pipe = {
-	.name = "e1743ca0-7fc8-410b-a066-de7bbb9280b7",
-	.attrs =  attrs_vme_pipe,
+	{ _MMIO(0x9840), 0x00000080 },
 };
 
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "d6de6f55-e526-4f79-a6a6-d7315c09044e",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_bdw(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-		if (ret)
-			goto error_compute_l3_cache;
-	}
-	if (get_data_port_reads_coalescing_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_data_port_reads_coalescing);
-		if (ret)
-			goto error_data_port_reads_coalescing;
-	}
-	if (get_data_port_writes_coalescing_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_data_port_writes_coalescing);
-		if (ret)
-			goto error_data_port_writes_coalescing;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-		if (ret)
-			goto error_l3_2;
-	}
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-		if (ret)
-			goto error_l3_3;
-	}
-	if (get_l3_4_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_4);
-		if (ret)
-			goto error_l3_4;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler_1);
-		if (ret)
-			goto error_sampler_1;
-	}
-	if (get_sampler_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler_2);
-		if (ret)
-			goto error_sampler_2;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-		if (ret)
-			goto error_compute_extra;
-	}
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-		if (ret)
-			goto error_vme_pipe;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-error_vme_pipe:
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-error_compute_extra:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_2);
-error_sampler_2:
-	if (get_sampler_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_1);
-error_sampler_1:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_4_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_4);
-error_l3_4:
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-error_l3_3:
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-error_l3_2:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_data_port_writes_coalescing_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_data_port_writes_coalescing);
-error_data_port_writes_coalescing:
-	if (get_data_port_reads_coalescing_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_data_port_reads_coalescing);
-error_data_port_reads_coalescing:
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-error_compute_l3_cache:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_bdw(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_bdw(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"d6de6f55-e526-4f79-a6a6-d7315c09044e",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-	if (get_data_port_reads_coalescing_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_data_port_reads_coalescing);
-	if (get_data_port_writes_coalescing_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_data_port_writes_coalescing);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-	if (get_l3_4_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_4);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_1);
-	if (get_sampler_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_2);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "d6de6f55-e526-4f79-a6a6-d7315c09044e";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_bdw.h b/drivers/gpu/drm/i915/i915_oa_bdw.h
index 6363ff9..b812d16 100644
--- a/drivers/gpu/drm/i915/i915_oa_bdw.h
+++ b/drivers/gpu/drm/i915/i915_oa_bdw.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_BDW_H__
 #define __I915_OA_BDW_H__
 
-extern int i915_oa_n_builtin_metric_sets_bdw;
-
-extern int i915_oa_select_metric_set_bdw(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_bdw(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_bdw(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_bdw(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_bxt.c b/drivers/gpu/drm/i915/i915_oa_bxt.c
index 93864d8..b69b900 100644
--- a/drivers/gpu/drm/i915/i915_oa_bxt.c
+++ b/drivers/gpu/drm/i915/i915_oa_bxt.c
@@ -31,1702 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_bxt.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_COMPUTE_L3_CACHE,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_COMPUTE_EXTRA,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_bxt = 15;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic_0_sku_gte_0x03[] = {
-	{ _MMIO(0x9888), 0x166c00f0 },
-	{ _MMIO(0x9888), 0x12120280 },
-	{ _MMIO(0x9888), 0x12320280 },
-	{ _MMIO(0x9888), 0x11930317 },
-	{ _MMIO(0x9888), 0x159303df },
-	{ _MMIO(0x9888), 0x3f900c00 },
-	{ _MMIO(0x9888), 0x419000a0 },
-	{ _MMIO(0x9888), 0x002d1000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x082d5000 },
-	{ _MMIO(0x9888), 0x0a2d1000 },
-	{ _MMIO(0x9888), 0x0c2e0800 },
-	{ _MMIO(0x9888), 0x0e2e5900 },
-	{ _MMIO(0x9888), 0x0a4c8000 },
-	{ _MMIO(0x9888), 0x0c4c8000 },
-	{ _MMIO(0x9888), 0x0e4c4000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e2000 },
-	{ _MMIO(0x9888), 0x1c4f0010 },
-	{ _MMIO(0x9888), 0x0a6c0053 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1a0fcc00 },
-	{ _MMIO(0x9888), 0x1c0f0002 },
-	{ _MMIO(0x9888), 0x1c2c0040 },
-	{ _MMIO(0x9888), 0x00101000 },
-	{ _MMIO(0x9888), 0x04101000 },
-	{ _MMIO(0x9888), 0x00114000 },
-	{ _MMIO(0x9888), 0x08114000 },
-	{ _MMIO(0x9888), 0x00120020 },
-	{ _MMIO(0x9888), 0x08120021 },
-	{ _MMIO(0x9888), 0x00141000 },
-	{ _MMIO(0x9888), 0x08141000 },
-	{ _MMIO(0x9888), 0x02308000 },
-	{ _MMIO(0x9888), 0x04302000 },
-	{ _MMIO(0x9888), 0x06318000 },
-	{ _MMIO(0x9888), 0x08318000 },
-	{ _MMIO(0x9888), 0x06320800 },
-	{ _MMIO(0x9888), 0x08320840 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x06344000 },
-	{ _MMIO(0x9888), 0x08344000 },
-	{ _MMIO(0x9888), 0x0d931831 },
-	{ _MMIO(0x9888), 0x0f939f3f },
-	{ _MMIO(0x9888), 0x01939e80 },
-	{ _MMIO(0x9888), 0x039303bc },
-	{ _MMIO(0x9888), 0x0593000e },
-	{ _MMIO(0x9888), 0x1993002a },
-	{ _MMIO(0x9888), 0x07930000 },
-	{ _MMIO(0x9888), 0x09930000 },
-	{ _MMIO(0x9888), 0x1d900177 },
-	{ _MMIO(0x9888), 0x1f900187 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25904000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x53901110 },
-	{ _MMIO(0x9888), 0x43900423 },
-	{ _MMIO(0x9888), 0x55900111 },
-	{ _MMIO(0x9888), 0x47900c02 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900020 },
-	{ _MMIO(0x9888), 0x59901111 },
-	{ _MMIO(0x9888), 0x4b900421 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x45900821 },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	if (dev_priv->drm.pdev->revision >= 0x03) {
-		regs[n] = mux_config_render_basic_0_sku_gte_0x03;
-		lens[n] = ARRAY_SIZE(mux_config_render_basic_0_sku_gte_0x03);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x124f1c00 },
-	{ _MMIO(0x9888), 0x39900340 },
-	{ _MMIO(0x9888), 0x3f900c00 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x002d5000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x082d4000 },
-	{ _MMIO(0x9888), 0x0a2d1000 },
-	{ _MMIO(0x9888), 0x0c2d5000 },
-	{ _MMIO(0x9888), 0x0e2d4000 },
-	{ _MMIO(0x9888), 0x0c2e1400 },
-	{ _MMIO(0x9888), 0x0e2e5100 },
-	{ _MMIO(0x9888), 0x102e0114 },
-	{ _MMIO(0x9888), 0x044cc000 },
-	{ _MMIO(0x9888), 0x0a4c8000 },
-	{ _MMIO(0x9888), 0x0c4c8000 },
-	{ _MMIO(0x9888), 0x0e4c4000 },
-	{ _MMIO(0x9888), 0x104c8000 },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x004ea000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e2000 },
-	{ _MMIO(0x9888), 0x0c4ea000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x004f6b42 },
-	{ _MMIO(0x9888), 0x064f6200 },
-	{ _MMIO(0x9888), 0x084f4100 },
-	{ _MMIO(0x9888), 0x0a4f0061 },
-	{ _MMIO(0x9888), 0x0c4f6c4c },
-	{ _MMIO(0x9888), 0x0e4f4b00 },
-	{ _MMIO(0x9888), 0x1a4f0000 },
-	{ _MMIO(0x9888), 0x1c4f0000 },
-	{ _MMIO(0x9888), 0x180f5000 },
-	{ _MMIO(0x9888), 0x1a0f8800 },
-	{ _MMIO(0x9888), 0x1c0f08a2 },
-	{ _MMIO(0x9888), 0x182c4000 },
-	{ _MMIO(0x9888), 0x1c2c1451 },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x1a2c0010 },
-	{ _MMIO(0x9888), 0x01938000 },
-	{ _MMIO(0x9888), 0x0f938000 },
-	{ _MMIO(0x9888), 0x19938a28 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x19900177 },
-	{ _MMIO(0x9888), 0x1b900178 },
-	{ _MMIO(0x9888), 0x1d900125 },
-	{ _MMIO(0x9888), 0x1f900123 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x25904000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x53901000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x55900111 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_basic;
-	lens[n] = ARRAY_SIZE(mux_config_compute_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile[] = {
-	{ _MMIO(0x9888), 0x0c2e001f },
-	{ _MMIO(0x9888), 0x0a2f0000 },
-	{ _MMIO(0x9888), 0x10186800 },
-	{ _MMIO(0x9888), 0x11810019 },
-	{ _MMIO(0x9888), 0x15810013 },
-	{ _MMIO(0x9888), 0x13820020 },
-	{ _MMIO(0x9888), 0x11830020 },
-	{ _MMIO(0x9888), 0x17840000 },
-	{ _MMIO(0x9888), 0x11860007 },
-	{ _MMIO(0x9888), 0x21860000 },
-	{ _MMIO(0x9888), 0x178703e0 },
-	{ _MMIO(0x9888), 0x0c2d8000 },
-	{ _MMIO(0x9888), 0x042d4000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x022e5400 },
-	{ _MMIO(0x9888), 0x002e0000 },
-	{ _MMIO(0x9888), 0x0e2e0080 },
-	{ _MMIO(0x9888), 0x082f0040 },
-	{ _MMIO(0x9888), 0x002f0000 },
-	{ _MMIO(0x9888), 0x06143000 },
-	{ _MMIO(0x9888), 0x06174000 },
-	{ _MMIO(0x9888), 0x06180012 },
-	{ _MMIO(0x9888), 0x00180000 },
-	{ _MMIO(0x9888), 0x0d804000 },
-	{ _MMIO(0x9888), 0x0f804000 },
-	{ _MMIO(0x9888), 0x05804000 },
-	{ _MMIO(0x9888), 0x09810200 },
-	{ _MMIO(0x9888), 0x0b810030 },
-	{ _MMIO(0x9888), 0x03810003 },
-	{ _MMIO(0x9888), 0x21819140 },
-	{ _MMIO(0x9888), 0x23819050 },
-	{ _MMIO(0x9888), 0x25810018 },
-	{ _MMIO(0x9888), 0x0b820980 },
-	{ _MMIO(0x9888), 0x03820d80 },
-	{ _MMIO(0x9888), 0x11820000 },
-	{ _MMIO(0x9888), 0x0182c000 },
-	{ _MMIO(0x9888), 0x07828000 },
-	{ _MMIO(0x9888), 0x09824000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x0d830004 },
-	{ _MMIO(0x9888), 0x0583000c },
-	{ _MMIO(0x9888), 0x0f831000 },
-	{ _MMIO(0x9888), 0x01848072 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x09844000 },
-	{ _MMIO(0x9888), 0x0f848000 },
-	{ _MMIO(0x9888), 0x07860000 },
-	{ _MMIO(0x9888), 0x09860092 },
-	{ _MMIO(0x9888), 0x0f860400 },
-	{ _MMIO(0x9888), 0x01869100 },
-	{ _MMIO(0x9888), 0x0f870065 },
-	{ _MMIO(0x9888), 0x01870000 },
-	{ _MMIO(0x9888), 0x19930800 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x1b952000 },
-	{ _MMIO(0x9888), 0x1d955055 },
-	{ _MMIO(0x9888), 0x1f951455 },
-	{ _MMIO(0x9888), 0x0992a000 },
-	{ _MMIO(0x9888), 0x0f928000 },
-	{ _MMIO(0x9888), 0x1192a800 },
-	{ _MMIO(0x9888), 0x1392028a },
-	{ _MMIO(0x9888), 0x0b92a000 },
-	{ _MMIO(0x9888), 0x0d922000 },
-	{ _MMIO(0x9888), 0x13908000 },
-	{ _MMIO(0x9888), 0x21908000 },
-	{ _MMIO(0x9888), 0x23908000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27908000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f908000 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x15908000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900c01 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900863 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900061 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900000 },
-	{ _MMIO(0x9888), 0x45900c22 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_pipe_profile;
-	lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads[] = {
-	{ _MMIO(0x9888), 0x19800343 },
-	{ _MMIO(0x9888), 0x39900340 },
-	{ _MMIO(0x9888), 0x3f901000 },
-	{ _MMIO(0x9888), 0x41900003 },
-	{ _MMIO(0x9888), 0x03803180 },
-	{ _MMIO(0x9888), 0x058035e2 },
-	{ _MMIO(0x9888), 0x0780006a },
-	{ _MMIO(0x9888), 0x11800000 },
-	{ _MMIO(0x9888), 0x2181a000 },
-	{ _MMIO(0x9888), 0x2381000a },
-	{ _MMIO(0x9888), 0x1d950550 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d92a000 },
-	{ _MMIO(0x9888), 0x0f922000 },
-	{ _MMIO(0x9888), 0x13900170 },
-	{ _MMIO(0x9888), 0x21900171 },
-	{ _MMIO(0x9888), 0x23900172 },
-	{ _MMIO(0x9888), 0x25900173 },
-	{ _MMIO(0x9888), 0x27900174 },
-	{ _MMIO(0x9888), 0x29900175 },
-	{ _MMIO(0x9888), 0x2b900176 },
-	{ _MMIO(0x9888), 0x2d900177 },
-	{ _MMIO(0x9888), 0x2f90017f },
-	{ _MMIO(0x9888), 0x31900125 },
-	{ _MMIO(0x9888), 0x15900123 },
-	{ _MMIO(0x9888), 0x17900121 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43901084 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47901080 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49901084 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b901084 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900004 },
-	{ _MMIO(0x9888), 0x45900000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_reads;
-	lens[n] = ARRAY_SIZE(mux_config_memory_reads);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes[] = {
-	{ _MMIO(0x9888), 0x19800343 },
-	{ _MMIO(0x9888), 0x39900340 },
-	{ _MMIO(0x9888), 0x3f900000 },
-	{ _MMIO(0x9888), 0x41900080 },
-	{ _MMIO(0x9888), 0x03803180 },
-	{ _MMIO(0x9888), 0x058035e2 },
-	{ _MMIO(0x9888), 0x0780006a },
-	{ _MMIO(0x9888), 0x11800000 },
-	{ _MMIO(0x9888), 0x2181a000 },
-	{ _MMIO(0x9888), 0x2381000a },
-	{ _MMIO(0x9888), 0x1d950550 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d92a000 },
-	{ _MMIO(0x9888), 0x0f922000 },
-	{ _MMIO(0x9888), 0x13900180 },
-	{ _MMIO(0x9888), 0x21900181 },
-	{ _MMIO(0x9888), 0x23900182 },
-	{ _MMIO(0x9888), 0x25900183 },
-	{ _MMIO(0x9888), 0x27900184 },
-	{ _MMIO(0x9888), 0x29900185 },
-	{ _MMIO(0x9888), 0x2b900186 },
-	{ _MMIO(0x9888), 0x2d900187 },
-	{ _MMIO(0x9888), 0x2f900170 },
-	{ _MMIO(0x9888), 0x31900125 },
-	{ _MMIO(0x9888), 0x15900123 },
-	{ _MMIO(0x9888), 0x17900121 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43901084 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47901080 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49901084 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b901084 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900004 },
-	{ _MMIO(0x9888), 0x45900000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_writes;
-	lens[n] = ARRAY_SIZE(mux_config_memory_writes);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fc2a },
-	{ _MMIO(0x2774), 0x0000bf00 },
-	{ _MMIO(0x2778), 0x0007fc6a },
-	{ _MMIO(0x277c), 0x0000bf00 },
-	{ _MMIO(0x2780), 0x0007fc92 },
-	{ _MMIO(0x2784), 0x0000bf00 },
-	{ _MMIO(0x2788), 0x0007fca2 },
-	{ _MMIO(0x278c), 0x0000bf00 },
-	{ _MMIO(0x2790), 0x0007fc32 },
-	{ _MMIO(0x2794), 0x0000bf00 },
-	{ _MMIO(0x2798), 0x0007fc9a },
-	{ _MMIO(0x279c), 0x0000bf00 },
-	{ _MMIO(0x27a0), 0x0007fe6a },
-	{ _MMIO(0x27a4), 0x0000bf00 },
-	{ _MMIO(0x27a8), 0x0007fe7a },
-	{ _MMIO(0x27ac), 0x0000bf00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x141c0160 },
-	{ _MMIO(0x9888), 0x161c0015 },
-	{ _MMIO(0x9888), 0x181c0120 },
-	{ _MMIO(0x9888), 0x002d5000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x082d5000 },
-	{ _MMIO(0x9888), 0x0a2d5000 },
-	{ _MMIO(0x9888), 0x0c2d5000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x0c2e5400 },
-	{ _MMIO(0x9888), 0x0e2e5515 },
-	{ _MMIO(0x9888), 0x102e0155 },
-	{ _MMIO(0x9888), 0x044cc000 },
-	{ _MMIO(0x9888), 0x0a4c8000 },
-	{ _MMIO(0x9888), 0x0c4cc000 },
-	{ _MMIO(0x9888), 0x0e4cc000 },
-	{ _MMIO(0x9888), 0x104c8000 },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x004ea000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084ea000 },
-	{ _MMIO(0x9888), 0x0a4ea000 },
-	{ _MMIO(0x9888), 0x0c4ea000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x0e4f4b41 },
-	{ _MMIO(0x9888), 0x004f4200 },
-	{ _MMIO(0x9888), 0x024f404c },
-	{ _MMIO(0x9888), 0x1c4f0000 },
-	{ _MMIO(0x9888), 0x1a4f0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x061b8000 },
-	{ _MMIO(0x9888), 0x081bc000 },
-	{ _MMIO(0x9888), 0x0a1bc000 },
-	{ _MMIO(0x9888), 0x0c1bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x001c0031 },
-	{ _MMIO(0x9888), 0x061c1900 },
-	{ _MMIO(0x9888), 0x081c1a33 },
-	{ _MMIO(0x9888), 0x0a1c1b35 },
-	{ _MMIO(0x9888), 0x0c1c3337 },
-	{ _MMIO(0x9888), 0x041c31c7 },
-	{ _MMIO(0x9888), 0x180f5000 },
-	{ _MMIO(0x9888), 0x1a0fa8aa },
-	{ _MMIO(0x9888), 0x1c0f0aaa },
-	{ _MMIO(0x9888), 0x182c8000 },
-	{ _MMIO(0x9888), 0x1c2c6aaa },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x1a2c2950 },
-	{ _MMIO(0x9888), 0x01938000 },
-	{ _MMIO(0x9888), 0x0f938000 },
-	{ _MMIO(0x9888), 0x1993aaaa },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25904000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x29904000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900420 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900400 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x45900001 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extended;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extended);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_l3_cache[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fffa },
-	{ _MMIO(0x2774), 0x0000fefe },
-	{ _MMIO(0x2778), 0x0007fffa },
-	{ _MMIO(0x277c), 0x0000fefd },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000fbef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000fbdf },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_l3_cache[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00101100 },
-	{ _MMIO(0xe45c), 0x00201200 },
-	{ _MMIO(0xe55c), 0x00301300 },
-	{ _MMIO(0xe65c), 0x00401400 },
-};
-
-static const struct i915_oa_reg mux_config_compute_l3_cache[] = {
-	{ _MMIO(0x9888), 0x166c03b0 },
-	{ _MMIO(0x9888), 0x1593001e },
-	{ _MMIO(0x9888), 0x3f900c00 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x002d1000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x082d5000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x0c2e0400 },
-	{ _MMIO(0x9888), 0x0e2e1500 },
-	{ _MMIO(0x9888), 0x102e0140 },
-	{ _MMIO(0x9888), 0x044c4000 },
-	{ _MMIO(0x9888), 0x0a4c8000 },
-	{ _MMIO(0x9888), 0x0c4cc000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x004e2000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084ea000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x1a4f4001 },
-	{ _MMIO(0x9888), 0x1c4f5005 },
-	{ _MMIO(0x9888), 0x006c0051 },
-	{ _MMIO(0x9888), 0x066c5000 },
-	{ _MMIO(0x9888), 0x086c5c5d },
-	{ _MMIO(0x9888), 0x0e6c5e5f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x146c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x180f1000 },
-	{ _MMIO(0x9888), 0x1a0fa800 },
-	{ _MMIO(0x9888), 0x1c0f0a00 },
-	{ _MMIO(0x9888), 0x182c4000 },
-	{ _MMIO(0x9888), 0x1c2c4015 },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x03931980 },
-	{ _MMIO(0x9888), 0x05930032 },
-	{ _MMIO(0x9888), 0x11930000 },
-	{ _MMIO(0x9888), 0x01938000 },
-	{ _MMIO(0x9888), 0x0f938000 },
-	{ _MMIO(0x9888), 0x1993a00a },
-	{ _MMIO(0x9888), 0x07930000 },
-	{ _MMIO(0x9888), 0x09930000 },
-	{ _MMIO(0x9888), 0x1d900177 },
-	{ _MMIO(0x9888), 0x1f900178 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x53901000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x55900111 },
-	{ _MMIO(0x9888), 0x47900001 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x4d900000 },
-	{ _MMIO(0x9888), 0x45900400 },
-};
-
-static int
-get_compute_l3_cache_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_l3_cache;
-	lens[n] = ARRAY_SIZE(mux_config_compute_l3_cache);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x104f0232 },
-	{ _MMIO(0x9888), 0x124f4640 },
-	{ _MMIO(0x9888), 0x11834400 },
-	{ _MMIO(0x9888), 0x022d4000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x0e2e0055 },
-	{ _MMIO(0x9888), 0x064c8000 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x024f6100 },
-	{ _MMIO(0x9888), 0x044f416b },
-	{ _MMIO(0x9888), 0x064f004b },
-	{ _MMIO(0x9888), 0x1a4f0000 },
-	{ _MMIO(0x9888), 0x1a0f02a8 },
-	{ _MMIO(0x9888), 0x1a2c5500 },
-	{ _MMIO(0x9888), 0x0f808000 },
-	{ _MMIO(0x9888), 0x25810020 },
-	{ _MMIO(0x9888), 0x0f8305c0 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x1f951000 },
-	{ _MMIO(0x9888), 0x13920200 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4d900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1_0_sku_gte_0x03[] = {
-	{ _MMIO(0x9888), 0x12643400 },
-	{ _MMIO(0x9888), 0x12653400 },
-	{ _MMIO(0x9888), 0x106c6800 },
-	{ _MMIO(0x9888), 0x126c001e },
-	{ _MMIO(0x9888), 0x166c0010 },
-	{ _MMIO(0x9888), 0x0c2d5000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x102e0154 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e0055 },
-	{ _MMIO(0x9888), 0x104c8000 },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x0c4ea000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x1c4f5500 },
-	{ _MMIO(0x9888), 0x1a4f1554 },
-	{ _MMIO(0x9888), 0x0a640024 },
-	{ _MMIO(0x9888), 0x10640000 },
-	{ _MMIO(0x9888), 0x04640000 },
-	{ _MMIO(0x9888), 0x0c650024 },
-	{ _MMIO(0x9888), 0x10650000 },
-	{ _MMIO(0x9888), 0x06650000 },
-	{ _MMIO(0x9888), 0x0c6c5327 },
-	{ _MMIO(0x9888), 0x0e6c5425 },
-	{ _MMIO(0x9888), 0x006c2a00 },
-	{ _MMIO(0x9888), 0x026c285b },
-	{ _MMIO(0x9888), 0x046c005c },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0900 },
-	{ _MMIO(0x9888), 0x1c0f0aa0 },
-	{ _MMIO(0x9888), 0x180f4000 },
-	{ _MMIO(0x9888), 0x1a0f02aa },
-	{ _MMIO(0x9888), 0x1c2c5400 },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x1a2c5550 },
-	{ _MMIO(0x9888), 0x1993aa00 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900421 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900420 },
-	{ _MMIO(0x9888), 0x45900021 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1_0_sku_lt_0x03[] = {
-	{ _MMIO(0x9888), 0x14640340 },
-	{ _MMIO(0x9888), 0x14650340 },
-	{ _MMIO(0x9888), 0x106c6800 },
-	{ _MMIO(0x9888), 0x126c001e },
-	{ _MMIO(0x9888), 0x166c0010 },
-	{ _MMIO(0x9888), 0x0c2d5000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x102e0154 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e0055 },
-	{ _MMIO(0x9888), 0x104c8000 },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x0c4ea000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x1c4f5500 },
-	{ _MMIO(0x9888), 0x1a4f1554 },
-	{ _MMIO(0x9888), 0x04642400 },
-	{ _MMIO(0x9888), 0x22640000 },
-	{ _MMIO(0x9888), 0x1a640000 },
-	{ _MMIO(0x9888), 0x06650024 },
-	{ _MMIO(0x9888), 0x22650000 },
-	{ _MMIO(0x9888), 0x1c650000 },
-	{ _MMIO(0x9888), 0x0c6c5327 },
-	{ _MMIO(0x9888), 0x0e6c5425 },
-	{ _MMIO(0x9888), 0x006c2a00 },
-	{ _MMIO(0x9888), 0x026c285b },
-	{ _MMIO(0x9888), 0x046c005c },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0900 },
-	{ _MMIO(0x9888), 0x1c0f0aa0 },
-	{ _MMIO(0x9888), 0x180f4000 },
-	{ _MMIO(0x9888), 0x1a0f02aa },
-	{ _MMIO(0x9888), 0x1c2c5400 },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x1a2c5550 },
-	{ _MMIO(0x9888), 0x1993aa00 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900421 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900420 },
-	{ _MMIO(0x9888), 0x45900021 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 2);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 2);
-
-	if (dev_priv->drm.pdev->revision >= 0x03) {
-		regs[n] = mux_config_l3_1_0_sku_gte_0x03;
-		lens[n] = ARRAY_SIZE(mux_config_l3_1_0_sku_gte_0x03);
-		n++;
-	}
-	if (dev_priv->drm.pdev->revision < 0x03) {
-		regs[n] = mux_config_l3_1_0_sku_lt_0x03;
-		lens[n] = ARRAY_SIZE(mux_config_l3_1_0_sku_lt_0x03);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000efff },
-	{ _MMIO(0x2778), 0x00006000 },
-	{ _MMIO(0x277c), 0x0000f3ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x102d7800 },
-	{ _MMIO(0x9888), 0x122d79e0 },
-	{ _MMIO(0x9888), 0x0c2f0004 },
-	{ _MMIO(0x9888), 0x100e3800 },
-	{ _MMIO(0x9888), 0x180f0005 },
-	{ _MMIO(0x9888), 0x002d0940 },
-	{ _MMIO(0x9888), 0x022d802f },
-	{ _MMIO(0x9888), 0x042d4013 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x0e2e0050 },
-	{ _MMIO(0x9888), 0x022f0010 },
-	{ _MMIO(0x9888), 0x002f0000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x040e0480 },
-	{ _MMIO(0x9888), 0x000e0000 },
-	{ _MMIO(0x9888), 0x060f0027 },
-	{ _MMIO(0x9888), 0x100f0000 },
-	{ _MMIO(0x9888), 0x1a0f0040 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x439014a0 },
-	{ _MMIO(0x9888), 0x459000a4 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler[] = {
-	{ _MMIO(0x9888), 0x121300a0 },
-	{ _MMIO(0x9888), 0x141600ab },
-	{ _MMIO(0x9888), 0x123300a0 },
-	{ _MMIO(0x9888), 0x143600ab },
-	{ _MMIO(0x9888), 0x125300a0 },
-	{ _MMIO(0x9888), 0x145600ab },
-	{ _MMIO(0x9888), 0x0c2d4000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x102e01a0 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e0065 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x084c4000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x044e2000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x1c0f0800 },
-	{ _MMIO(0x9888), 0x180f4000 },
-	{ _MMIO(0x9888), 0x1a0f023f },
-	{ _MMIO(0x9888), 0x1e2c0003 },
-	{ _MMIO(0x9888), 0x1a2cc030 },
-	{ _MMIO(0x9888), 0x04132180 },
-	{ _MMIO(0x9888), 0x02130000 },
-	{ _MMIO(0x9888), 0x0c148000 },
-	{ _MMIO(0x9888), 0x0e142000 },
-	{ _MMIO(0x9888), 0x04148000 },
-	{ _MMIO(0x9888), 0x1e150140 },
-	{ _MMIO(0x9888), 0x1c150040 },
-	{ _MMIO(0x9888), 0x0c163000 },
-	{ _MMIO(0x9888), 0x0e160068 },
-	{ _MMIO(0x9888), 0x10160000 },
-	{ _MMIO(0x9888), 0x18160000 },
-	{ _MMIO(0x9888), 0x0a164000 },
-	{ _MMIO(0x9888), 0x04330043 },
-	{ _MMIO(0x9888), 0x02330000 },
-	{ _MMIO(0x9888), 0x0234a000 },
-	{ _MMIO(0x9888), 0x04342000 },
-	{ _MMIO(0x9888), 0x1c350015 },
-	{ _MMIO(0x9888), 0x02363460 },
-	{ _MMIO(0x9888), 0x10360000 },
-	{ _MMIO(0x9888), 0x04360000 },
-	{ _MMIO(0x9888), 0x06360000 },
-	{ _MMIO(0x9888), 0x08364000 },
-	{ _MMIO(0x9888), 0x06530043 },
-	{ _MMIO(0x9888), 0x02530000 },
-	{ _MMIO(0x9888), 0x0e548000 },
-	{ _MMIO(0x9888), 0x00548000 },
-	{ _MMIO(0x9888), 0x06542000 },
-	{ _MMIO(0x9888), 0x1e550400 },
-	{ _MMIO(0x9888), 0x1a552000 },
-	{ _MMIO(0x9888), 0x1c550100 },
-	{ _MMIO(0x9888), 0x0e563000 },
-	{ _MMIO(0x9888), 0x00563400 },
-	{ _MMIO(0x9888), 0x10560000 },
-	{ _MMIO(0x9888), 0x18560000 },
-	{ _MMIO(0x9888), 0x02560000 },
-	{ _MMIO(0x9888), 0x0c564000 },
-	{ _MMIO(0x9888), 0x1993a800 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b9014a0 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900820 },
-	{ _MMIO(0x9888), 0x45901022 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-};
-
-static int
-get_sampler_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler;
-	lens[n] = ARRAY_SIZE(mux_config_sampler);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x00007fff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x00009fff },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000efff },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000f3ff },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fdff },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x141a0000 },
-	{ _MMIO(0x9888), 0x143a0000 },
-	{ _MMIO(0x9888), 0x145a0000 },
-	{ _MMIO(0x9888), 0x0c2d4000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x102e0150 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e006a },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064c4000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024e2000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x1c0f0bc0 },
-	{ _MMIO(0x9888), 0x180f4000 },
-	{ _MMIO(0x9888), 0x1a0f0302 },
-	{ _MMIO(0x9888), 0x1e2c0003 },
-	{ _MMIO(0x9888), 0x1a2c00f0 },
-	{ _MMIO(0x9888), 0x021a3080 },
-	{ _MMIO(0x9888), 0x041a31e5 },
-	{ _MMIO(0x9888), 0x02148000 },
-	{ _MMIO(0x9888), 0x0414a000 },
-	{ _MMIO(0x9888), 0x1c150054 },
-	{ _MMIO(0x9888), 0x06168000 },
-	{ _MMIO(0x9888), 0x08168000 },
-	{ _MMIO(0x9888), 0x0a168000 },
-	{ _MMIO(0x9888), 0x0c3a3280 },
-	{ _MMIO(0x9888), 0x0e3a0063 },
-	{ _MMIO(0x9888), 0x063a0061 },
-	{ _MMIO(0x9888), 0x023a0000 },
-	{ _MMIO(0x9888), 0x0c348000 },
-	{ _MMIO(0x9888), 0x0e342000 },
-	{ _MMIO(0x9888), 0x06342000 },
-	{ _MMIO(0x9888), 0x1e350140 },
-	{ _MMIO(0x9888), 0x1c350100 },
-	{ _MMIO(0x9888), 0x18360028 },
-	{ _MMIO(0x9888), 0x0c368000 },
-	{ _MMIO(0x9888), 0x0e5a3080 },
-	{ _MMIO(0x9888), 0x005a3280 },
-	{ _MMIO(0x9888), 0x025a0063 },
-	{ _MMIO(0x9888), 0x0e548000 },
-	{ _MMIO(0x9888), 0x00548000 },
-	{ _MMIO(0x9888), 0x02542000 },
-	{ _MMIO(0x9888), 0x1e550400 },
-	{ _MMIO(0x9888), 0x1a552000 },
-	{ _MMIO(0x9888), 0x1c550001 },
-	{ _MMIO(0x9888), 0x18560080 },
-	{ _MMIO(0x9888), 0x02568000 },
-	{ _MMIO(0x9888), 0x04568000 },
-	{ _MMIO(0x9888), 0x1993a800 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900420 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x45901084 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900001 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x141a026b },
-	{ _MMIO(0x9888), 0x143a0173 },
-	{ _MMIO(0x9888), 0x145a026b },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e0069 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x180f6000 },
-	{ _MMIO(0x9888), 0x1a0f030a },
-	{ _MMIO(0x9888), 0x1a2c03c0 },
-	{ _MMIO(0x9888), 0x041a37e7 },
-	{ _MMIO(0x9888), 0x021a0000 },
-	{ _MMIO(0x9888), 0x0414a000 },
-	{ _MMIO(0x9888), 0x1c150050 },
-	{ _MMIO(0x9888), 0x08168000 },
-	{ _MMIO(0x9888), 0x0a168000 },
-	{ _MMIO(0x9888), 0x003a3380 },
-	{ _MMIO(0x9888), 0x063a006f },
-	{ _MMIO(0x9888), 0x023a0000 },
-	{ _MMIO(0x9888), 0x00348000 },
-	{ _MMIO(0x9888), 0x06342000 },
-	{ _MMIO(0x9888), 0x1a352000 },
-	{ _MMIO(0x9888), 0x1c350100 },
-	{ _MMIO(0x9888), 0x02368000 },
-	{ _MMIO(0x9888), 0x0c368000 },
-	{ _MMIO(0x9888), 0x025a37e7 },
-	{ _MMIO(0x9888), 0x0254a000 },
-	{ _MMIO(0x9888), 0x1c550005 },
-	{ _MMIO(0x9888), 0x04568000 },
-	{ _MMIO(0x9888), 0x06568000 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900020 },
-	{ _MMIO(0x9888), 0x45901080 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extra[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extra[] = {
-	{ _MMIO(0xe458), 0x00001000 },
-	{ _MMIO(0xe558), 0x00003002 },
-	{ _MMIO(0xe658), 0x00005004 },
-	{ _MMIO(0xe758), 0x00011010 },
-	{ _MMIO(0xe45c), 0x00050012 },
-	{ _MMIO(0xe55c), 0x00052051 },
-	{ _MMIO(0xe65c), 0x00000008 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extra[] = {
-	{ _MMIO(0x9888), 0x141a001f },
-	{ _MMIO(0x9888), 0x143a001f },
-	{ _MMIO(0x9888), 0x145a001f },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x0e2e0094 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x1a0f00e0 },
-	{ _MMIO(0x9888), 0x1a2c0c00 },
-	{ _MMIO(0x9888), 0x061a0063 },
-	{ _MMIO(0x9888), 0x021a0000 },
-	{ _MMIO(0x9888), 0x06142000 },
-	{ _MMIO(0x9888), 0x1c150100 },
-	{ _MMIO(0x9888), 0x0c168000 },
-	{ _MMIO(0x9888), 0x043a3180 },
-	{ _MMIO(0x9888), 0x023a0000 },
-	{ _MMIO(0x9888), 0x04348000 },
-	{ _MMIO(0x9888), 0x1c350040 },
-	{ _MMIO(0x9888), 0x0a368000 },
-	{ _MMIO(0x9888), 0x045a0063 },
-	{ _MMIO(0x9888), 0x025a0000 },
-	{ _MMIO(0x9888), 0x04542000 },
-	{ _MMIO(0x9888), 0x1c550010 },
-	{ _MMIO(0x9888), 0x08568000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900400 },
-	{ _MMIO(0x9888), 0x47900004 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_compute_extra_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extra;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extra);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2744), 0x00800000 },
@@ -1756,6 +60,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x00000080 },
 	{ _MMIO(0x9888), 0x19800000 },
 	{ _MMIO(0x9888), 0x07800063 },
 	{ _MMIO(0x9888), 0x11800000 },
@@ -1769,922 +74,35 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x33900000 },
 };
 
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_bxt(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_L3_CACHE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_l3_cache_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_L3_CACHE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_l3_cache;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_l3_cache);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_l3_cache;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_l3_cache);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTRA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extra_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTRA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extra;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extra);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extra;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extra);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "22b9519a-e9ba-4c41-8b54-f4f8ca14fa0a",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "012d72cf-82a9-4d25-8ddf-74076fd30797",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "ce416533-e49e-4211-80af-ec513590a914",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "398e2452-18d7-42d0-b241-e4d0a9148ada",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "d324a0d6-7269-4847-a5c2-6f71ddc7fed5",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "caf3596a-7bb1-4dec-b3b3-2a080d283b49",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_compute_l3_cache_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_L3_CACHE);
-}
-
-static struct device_attribute dev_attr_compute_l3_cache_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_l3_cache_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_l3_cache[] = {
-	&dev_attr_compute_l3_cache_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_l3_cache = {
-	.name = "49b956e2-d5b9-47e0-9d8a-cee5e8cec527",
-	.attrs =  attrs_compute_l3_cache,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "f64ef50a-bdba-4b35-8f09-203c13d8ee5a",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "00ad5a41-7eab-4f7a-9103-49d411c67219",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "46dc44ca-491c-4cc1-a951-e7b3e62bf02b",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER);
-}
-
-static struct device_attribute dev_attr_sampler_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler[] = {
-	&dev_attr_sampler_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler = {
-	.name = "8364e2a8-af63-40af-b0d5-42969a255654",
-	.attrs =  attrs_sampler,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "175c8092-cb25-4d1e-8dc7-b4fdd39e2d92",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "d260f03f-b34d-4b49-a44e-436819117332",
-	.attrs =  attrs_tdl_2,
-};
-
-static ssize_t
-show_compute_extra_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTRA);
-}
-
-static struct device_attribute dev_attr_compute_extra_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extra_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extra[] = {
-	&dev_attr_compute_extra_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extra = {
-	.name = "fa6ecf21-2cb8-4d0b-9308-6e4a7b4ca87a",
-	.attrs =  attrs_compute_extra,
-};
-
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "5ee72f5c-092f-421e-8b70-225f7c3e9612",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_bxt(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-		if (ret)
-			goto error_compute_l3_cache;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler);
-		if (ret)
-			goto error_sampler;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-		if (ret)
-			goto error_compute_extra;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-error_compute_extra:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-error_sampler:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-error_compute_l3_cache:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_bxt(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_bxt(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"5ee72f5c-092f-421e-8b70-225f7c3e9612",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "5ee72f5c-092f-421e-8b70-225f7c3e9612";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_bxt.h b/drivers/gpu/drm/i915/i915_oa_bxt.h
index 6cf7ba7..690b963 100644
--- a/drivers/gpu/drm/i915/i915_oa_bxt.h
+++ b/drivers/gpu/drm/i915/i915_oa_bxt.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_BXT_H__
 #define __I915_OA_BXT_H__
 
-extern int i915_oa_n_builtin_metric_sets_bxt;
-
-extern int i915_oa_select_metric_set_bxt(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_bxt(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_bxt(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_bxt(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_chv.c b/drivers/gpu/drm/i915/i915_oa_chv.c
index aa6bece..322a3f9 100644
--- a/drivers/gpu/drm/i915/i915_oa_chv.c
+++ b/drivers/gpu/drm/i915/i915_oa_chv.c
@@ -31,1943 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_chv.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_L3_2,
-	METRIC_SET_ID_L3_3,
-	METRIC_SET_ID_L3_4,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER_1,
-	METRIC_SET_ID_SAMPLER_2,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_chv = 14;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic[] = {
-	{ _MMIO(0x9888), 0x59800000 },
-	{ _MMIO(0x9888), 0x59800001 },
-	{ _MMIO(0x9888), 0x285a0006 },
-	{ _MMIO(0x9888), 0x2c110014 },
-	{ _MMIO(0x9888), 0x2e110000 },
-	{ _MMIO(0x9888), 0x2c310014 },
-	{ _MMIO(0x9888), 0x2e310000 },
-	{ _MMIO(0x9888), 0x2b8303df },
-	{ _MMIO(0x9888), 0x3580024f },
-	{ _MMIO(0x9888), 0x00580888 },
-	{ _MMIO(0x9888), 0x1e5a0015 },
-	{ _MMIO(0x9888), 0x205a0014 },
-	{ _MMIO(0x9888), 0x045a0000 },
-	{ _MMIO(0x9888), 0x025a0000 },
-	{ _MMIO(0x9888), 0x02180500 },
-	{ _MMIO(0x9888), 0x00190555 },
-	{ _MMIO(0x9888), 0x021d0500 },
-	{ _MMIO(0x9888), 0x021f0a00 },
-	{ _MMIO(0x9888), 0x00380444 },
-	{ _MMIO(0x9888), 0x02390500 },
-	{ _MMIO(0x9888), 0x003a0666 },
-	{ _MMIO(0x9888), 0x00100111 },
-	{ _MMIO(0x9888), 0x06110030 },
-	{ _MMIO(0x9888), 0x0a110031 },
-	{ _MMIO(0x9888), 0x0e110046 },
-	{ _MMIO(0x9888), 0x04110000 },
-	{ _MMIO(0x9888), 0x00110000 },
-	{ _MMIO(0x9888), 0x00130111 },
-	{ _MMIO(0x9888), 0x00300444 },
-	{ _MMIO(0x9888), 0x08310030 },
-	{ _MMIO(0x9888), 0x0c310031 },
-	{ _MMIO(0x9888), 0x10310046 },
-	{ _MMIO(0x9888), 0x04310000 },
-	{ _MMIO(0x9888), 0x00310000 },
-	{ _MMIO(0x9888), 0x00330444 },
-	{ _MMIO(0x9888), 0x038a0a00 },
-	{ _MMIO(0x9888), 0x018b0fff },
-	{ _MMIO(0x9888), 0x038b0a00 },
-	{ _MMIO(0x9888), 0x01855000 },
-	{ _MMIO(0x9888), 0x03850055 },
-	{ _MMIO(0x9888), 0x13830021 },
-	{ _MMIO(0x9888), 0x15830020 },
-	{ _MMIO(0x9888), 0x1783002f },
-	{ _MMIO(0x9888), 0x1983002e },
-	{ _MMIO(0x9888), 0x1b83002d },
-	{ _MMIO(0x9888), 0x1d83002c },
-	{ _MMIO(0x9888), 0x05830000 },
-	{ _MMIO(0x9888), 0x01840555 },
-	{ _MMIO(0x9888), 0x03840500 },
-	{ _MMIO(0x9888), 0x23800074 },
-	{ _MMIO(0x9888), 0x2580007d },
-	{ _MMIO(0x9888), 0x05800000 },
-	{ _MMIO(0x9888), 0x01805000 },
-	{ _MMIO(0x9888), 0x03800055 },
-	{ _MMIO(0x9888), 0x01865000 },
-	{ _MMIO(0x9888), 0x03860055 },
-	{ _MMIO(0x9888), 0x01875000 },
-	{ _MMIO(0x9888), 0x03870055 },
-	{ _MMIO(0x9888), 0x418000aa },
-	{ _MMIO(0x9888), 0x4380000a },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x4780000a },
-	{ _MMIO(0x9888), 0x49800000 },
-	{ _MMIO(0x9888), 0x4b800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x55800000 },
-	{ _MMIO(0x9888), 0x57800000 },
-	{ _MMIO(0x9888), 0x59800000 },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_basic;
-	lens[n] = ARRAY_SIZE(mux_config_render_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic[] = {
-	{ _MMIO(0x9888), 0x59800000 },
-	{ _MMIO(0x9888), 0x59800001 },
-	{ _MMIO(0x9888), 0x2e5800e0 },
-	{ _MMIO(0x9888), 0x2e3800e0 },
-	{ _MMIO(0x9888), 0x3580024f },
-	{ _MMIO(0x9888), 0x3d800140 },
-	{ _MMIO(0x9888), 0x08580042 },
-	{ _MMIO(0x9888), 0x0c580040 },
-	{ _MMIO(0x9888), 0x1058004c },
-	{ _MMIO(0x9888), 0x1458004b },
-	{ _MMIO(0x9888), 0x04580000 },
-	{ _MMIO(0x9888), 0x00580000 },
-	{ _MMIO(0x9888), 0x00195555 },
-	{ _MMIO(0x9888), 0x06380042 },
-	{ _MMIO(0x9888), 0x0a380040 },
-	{ _MMIO(0x9888), 0x0e38004c },
-	{ _MMIO(0x9888), 0x1238004b },
-	{ _MMIO(0x9888), 0x04380000 },
-	{ _MMIO(0x9888), 0x00384444 },
-	{ _MMIO(0x9888), 0x003a5555 },
-	{ _MMIO(0x9888), 0x018bffff },
-	{ _MMIO(0x9888), 0x01845555 },
-	{ _MMIO(0x9888), 0x17800074 },
-	{ _MMIO(0x9888), 0x1980007d },
-	{ _MMIO(0x9888), 0x1b80007c },
-	{ _MMIO(0x9888), 0x1d8000b6 },
-	{ _MMIO(0x9888), 0x1f8000b7 },
-	{ _MMIO(0x9888), 0x05800000 },
-	{ _MMIO(0x9888), 0x03800000 },
-	{ _MMIO(0x9888), 0x418000aa },
-	{ _MMIO(0x9888), 0x438000aa },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x47800000 },
-	{ _MMIO(0x9888), 0x4980012a },
-	{ _MMIO(0x9888), 0x4b80012a },
-	{ _MMIO(0x9888), 0x4d80012a },
-	{ _MMIO(0x9888), 0x4f80012a },
-	{ _MMIO(0x9888), 0x518001ce },
-	{ _MMIO(0x9888), 0x538001ce },
-	{ _MMIO(0x9888), 0x5580000e },
-	{ _MMIO(0x9888), 0x59800000 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_basic;
-	lens[n] = ARRAY_SIZE(mux_config_compute_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile[] = {
-	{ _MMIO(0x9888), 0x59800000 },
-	{ _MMIO(0x9888), 0x59800001 },
-	{ _MMIO(0x9888), 0x261e0000 },
-	{ _MMIO(0x9888), 0x281f000f },
-	{ _MMIO(0x9888), 0x2817001a },
-	{ _MMIO(0x9888), 0x2791001f },
-	{ _MMIO(0x9888), 0x27880019 },
-	{ _MMIO(0x9888), 0x2d890000 },
-	{ _MMIO(0x9888), 0x278a0007 },
-	{ _MMIO(0x9888), 0x298d001f },
-	{ _MMIO(0x9888), 0x278e0020 },
-	{ _MMIO(0x9888), 0x2b8f0012 },
-	{ _MMIO(0x9888), 0x29900000 },
-	{ _MMIO(0x9888), 0x00184000 },
-	{ _MMIO(0x9888), 0x02181000 },
-	{ _MMIO(0x9888), 0x02194000 },
-	{ _MMIO(0x9888), 0x141e0002 },
-	{ _MMIO(0x9888), 0x041e0000 },
-	{ _MMIO(0x9888), 0x001e0000 },
-	{ _MMIO(0x9888), 0x221f0015 },
-	{ _MMIO(0x9888), 0x041f0000 },
-	{ _MMIO(0x9888), 0x001f4000 },
-	{ _MMIO(0x9888), 0x021f0000 },
-	{ _MMIO(0x9888), 0x023a8000 },
-	{ _MMIO(0x9888), 0x0213c000 },
-	{ _MMIO(0x9888), 0x02164000 },
-	{ _MMIO(0x9888), 0x24170012 },
-	{ _MMIO(0x9888), 0x04170000 },
-	{ _MMIO(0x9888), 0x07910005 },
-	{ _MMIO(0x9888), 0x05910000 },
-	{ _MMIO(0x9888), 0x01911500 },
-	{ _MMIO(0x9888), 0x03910501 },
-	{ _MMIO(0x9888), 0x0d880002 },
-	{ _MMIO(0x9888), 0x1d880003 },
-	{ _MMIO(0x9888), 0x05880000 },
-	{ _MMIO(0x9888), 0x0b890032 },
-	{ _MMIO(0x9888), 0x1b890031 },
-	{ _MMIO(0x9888), 0x05890000 },
-	{ _MMIO(0x9888), 0x01890040 },
-	{ _MMIO(0x9888), 0x03890040 },
-	{ _MMIO(0x9888), 0x098a0000 },
-	{ _MMIO(0x9888), 0x198a0004 },
-	{ _MMIO(0x9888), 0x058a0000 },
-	{ _MMIO(0x9888), 0x018a8050 },
-	{ _MMIO(0x9888), 0x038a2050 },
-	{ _MMIO(0x9888), 0x018b95a9 },
-	{ _MMIO(0x9888), 0x038be5a9 },
-	{ _MMIO(0x9888), 0x018c1500 },
-	{ _MMIO(0x9888), 0x038c0501 },
-	{ _MMIO(0x9888), 0x178d0015 },
-	{ _MMIO(0x9888), 0x058d0000 },
-	{ _MMIO(0x9888), 0x138e0004 },
-	{ _MMIO(0x9888), 0x218e000c },
-	{ _MMIO(0x9888), 0x058e0000 },
-	{ _MMIO(0x9888), 0x018e0500 },
-	{ _MMIO(0x9888), 0x038e0101 },
-	{ _MMIO(0x9888), 0x0f8f0027 },
-	{ _MMIO(0x9888), 0x058f0000 },
-	{ _MMIO(0x9888), 0x018f0000 },
-	{ _MMIO(0x9888), 0x038f0001 },
-	{ _MMIO(0x9888), 0x11900013 },
-	{ _MMIO(0x9888), 0x1f900017 },
-	{ _MMIO(0x9888), 0x05900000 },
-	{ _MMIO(0x9888), 0x01900100 },
-	{ _MMIO(0x9888), 0x03900001 },
-	{ _MMIO(0x9888), 0x01845555 },
-	{ _MMIO(0x9888), 0x03845555 },
-	{ _MMIO(0x9888), 0x418000aa },
-	{ _MMIO(0x9888), 0x438000aa },
-	{ _MMIO(0x9888), 0x458000aa },
-	{ _MMIO(0x9888), 0x478000aa },
-	{ _MMIO(0x9888), 0x4980018c },
-	{ _MMIO(0x9888), 0x4b80014b },
-	{ _MMIO(0x9888), 0x4d800128 },
-	{ _MMIO(0x9888), 0x4f80012a },
-	{ _MMIO(0x9888), 0x51800187 },
-	{ _MMIO(0x9888), 0x5380014b },
-	{ _MMIO(0x9888), 0x55800149 },
-	{ _MMIO(0x9888), 0x5780010a },
-	{ _MMIO(0x9888), 0x59800000 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_pipe_profile;
-	lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fff7 },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x105c0232 },
-	{ _MMIO(0x9888), 0x10580232 },
-	{ _MMIO(0x9888), 0x10380232 },
-	{ _MMIO(0x9888), 0x10dc0232 },
-	{ _MMIO(0x9888), 0x10d80232 },
-	{ _MMIO(0x9888), 0x10b80232 },
-	{ _MMIO(0x9888), 0x118e4400 },
-	{ _MMIO(0x9888), 0x025c6080 },
-	{ _MMIO(0x9888), 0x045c004b },
-	{ _MMIO(0x9888), 0x005c8000 },
-	{ _MMIO(0x9888), 0x00582080 },
-	{ _MMIO(0x9888), 0x0258004b },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x045b4000 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f00aa },
-	{ _MMIO(0x9888), 0x04386080 },
-	{ _MMIO(0x9888), 0x0638404b },
-	{ _MMIO(0x9888), 0x02384000 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a380000 },
-	{ _MMIO(0x9888), 0x0c380000 },
-	{ _MMIO(0x9888), 0x00398000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x0cdc25c1 },
-	{ _MMIO(0x9888), 0x0adcc000 },
-	{ _MMIO(0x9888), 0x0ad825c1 },
-	{ _MMIO(0x9888), 0x18db4000 },
-	{ _MMIO(0x9888), 0x1adb0001 },
-	{ _MMIO(0x9888), 0x0e9f8000 },
-	{ _MMIO(0x9888), 0x109f02aa },
-	{ _MMIO(0x9888), 0x0eb825c1 },
-	{ _MMIO(0x9888), 0x18b80154 },
-	{ _MMIO(0x9888), 0x0ab9a000 },
-	{ _MMIO(0x9888), 0x0cb9a000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x0d88c000 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x258baa05 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x198c5400 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x098dc000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x098e05c0 },
-	{ _MMIO(0x9888), 0x058e0000 },
-	{ _MMIO(0x9888), 0x198f0020 },
-	{ _MMIO(0x9888), 0x2185aa0a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x19835000 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x09848000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x19808000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x51800040 },
-	{ _MMIO(0x9888), 0x43800400 },
-	{ _MMIO(0x9888), 0x45800800 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800c62 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f801042 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x418014a4 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1[] = {
-	{ _MMIO(0x9888), 0x10bf03da },
-	{ _MMIO(0x9888), 0x14bf0001 },
-	{ _MMIO(0x9888), 0x12980340 },
-	{ _MMIO(0x9888), 0x12990340 },
-	{ _MMIO(0x9888), 0x0cbf1187 },
-	{ _MMIO(0x9888), 0x0ebf1205 },
-	{ _MMIO(0x9888), 0x00bf0500 },
-	{ _MMIO(0x9888), 0x02bf042b },
-	{ _MMIO(0x9888), 0x04bf002c },
-	{ _MMIO(0x9888), 0x0cdac000 },
-	{ _MMIO(0x9888), 0x0edac000 },
-	{ _MMIO(0x9888), 0x00da8000 },
-	{ _MMIO(0x9888), 0x02dac000 },
-	{ _MMIO(0x9888), 0x04da4000 },
-	{ _MMIO(0x9888), 0x04983400 },
-	{ _MMIO(0x9888), 0x10980000 },
-	{ _MMIO(0x9888), 0x06990034 },
-	{ _MMIO(0x9888), 0x10990000 },
-	{ _MMIO(0x9888), 0x0c9dc000 },
-	{ _MMIO(0x9888), 0x0e9dc000 },
-	{ _MMIO(0x9888), 0x009d8000 },
-	{ _MMIO(0x9888), 0x029dc000 },
-	{ _MMIO(0x9888), 0x049d4000 },
-	{ _MMIO(0x9888), 0x109f02a8 },
-	{ _MMIO(0x9888), 0x0c9fa000 },
-	{ _MMIO(0x9888), 0x0e9f00ba },
-	{ _MMIO(0x9888), 0x0cb88000 },
-	{ _MMIO(0x9888), 0x0cb95000 },
-	{ _MMIO(0x9888), 0x0eb95000 },
-	{ _MMIO(0x9888), 0x00b94000 },
-	{ _MMIO(0x9888), 0x02b95000 },
-	{ _MMIO(0x9888), 0x04b91000 },
-	{ _MMIO(0x9888), 0x06b92000 },
-	{ _MMIO(0x9888), 0x0cba4000 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x258b800a },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b5500 },
-	{ _MMIO(0x9888), 0x198c4000 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x018d8000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2185800a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x1b830154 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x47800000 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800060 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_1;
-	lens[n] = ARRAY_SIZE(mux_config_l3_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_2[] = {
-	{ _MMIO(0x9888), 0x103f03da },
-	{ _MMIO(0x9888), 0x143f0001 },
-	{ _MMIO(0x9888), 0x12180340 },
-	{ _MMIO(0x9888), 0x12190340 },
-	{ _MMIO(0x9888), 0x0c3f1187 },
-	{ _MMIO(0x9888), 0x0e3f1205 },
-	{ _MMIO(0x9888), 0x003f0500 },
-	{ _MMIO(0x9888), 0x023f042b },
-	{ _MMIO(0x9888), 0x043f002c },
-	{ _MMIO(0x9888), 0x0c5ac000 },
-	{ _MMIO(0x9888), 0x0e5ac000 },
-	{ _MMIO(0x9888), 0x005a8000 },
-	{ _MMIO(0x9888), 0x025ac000 },
-	{ _MMIO(0x9888), 0x045a4000 },
-	{ _MMIO(0x9888), 0x04183400 },
-	{ _MMIO(0x9888), 0x10180000 },
-	{ _MMIO(0x9888), 0x06190034 },
-	{ _MMIO(0x9888), 0x10190000 },
-	{ _MMIO(0x9888), 0x0c1dc000 },
-	{ _MMIO(0x9888), 0x0e1dc000 },
-	{ _MMIO(0x9888), 0x001d8000 },
-	{ _MMIO(0x9888), 0x021dc000 },
-	{ _MMIO(0x9888), 0x041d4000 },
-	{ _MMIO(0x9888), 0x101f02a8 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f00ba },
-	{ _MMIO(0x9888), 0x0c388000 },
-	{ _MMIO(0x9888), 0x0c395000 },
-	{ _MMIO(0x9888), 0x0e395000 },
-	{ _MMIO(0x9888), 0x00394000 },
-	{ _MMIO(0x9888), 0x02395000 },
-	{ _MMIO(0x9888), 0x04391000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x0c3a4000 },
-	{ _MMIO(0x9888), 0x1b8aa800 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x258b4005 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x2185800a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x1b830154 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x45800000 },
-	{ _MMIO(0x9888), 0x47800000 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800060 },
-};
-
-static int
-get_l3_2_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_2;
-	lens[n] = ARRAY_SIZE(mux_config_l3_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_3[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_3[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_3[] = {
-	{ _MMIO(0x9888), 0x121b0340 },
-	{ _MMIO(0x9888), 0x103f0274 },
-	{ _MMIO(0x9888), 0x123f0000 },
-	{ _MMIO(0x9888), 0x129b0340 },
-	{ _MMIO(0x9888), 0x10bf0274 },
-	{ _MMIO(0x9888), 0x12bf0000 },
-	{ _MMIO(0x9888), 0x041b3400 },
-	{ _MMIO(0x9888), 0x101b0000 },
-	{ _MMIO(0x9888), 0x045c8000 },
-	{ _MMIO(0x9888), 0x0a3d4000 },
-	{ _MMIO(0x9888), 0x003f0080 },
-	{ _MMIO(0x9888), 0x023f0793 },
-	{ _MMIO(0x9888), 0x043f0014 },
-	{ _MMIO(0x9888), 0x04588000 },
-	{ _MMIO(0x9888), 0x005a8000 },
-	{ _MMIO(0x9888), 0x025ac000 },
-	{ _MMIO(0x9888), 0x045a4000 },
-	{ _MMIO(0x9888), 0x0a5b4000 },
-	{ _MMIO(0x9888), 0x001d8000 },
-	{ _MMIO(0x9888), 0x021dc000 },
-	{ _MMIO(0x9888), 0x041d4000 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f002a },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x00394000 },
-	{ _MMIO(0x9888), 0x02395000 },
-	{ _MMIO(0x9888), 0x04399000 },
-	{ _MMIO(0x9888), 0x069b0034 },
-	{ _MMIO(0x9888), 0x109b0000 },
-	{ _MMIO(0x9888), 0x06dc4000 },
-	{ _MMIO(0x9888), 0x0cbd4000 },
-	{ _MMIO(0x9888), 0x0cbf0981 },
-	{ _MMIO(0x9888), 0x0ebf0a0f },
-	{ _MMIO(0x9888), 0x06d84000 },
-	{ _MMIO(0x9888), 0x0cdac000 },
-	{ _MMIO(0x9888), 0x0edac000 },
-	{ _MMIO(0x9888), 0x0cdb4000 },
-	{ _MMIO(0x9888), 0x0c9dc000 },
-	{ _MMIO(0x9888), 0x0e9dc000 },
-	{ _MMIO(0x9888), 0x109f02a8 },
-	{ _MMIO(0x9888), 0x0e9f0080 },
-	{ _MMIO(0x9888), 0x0cb84000 },
-	{ _MMIO(0x9888), 0x0cb95000 },
-	{ _MMIO(0x9888), 0x0eb95000 },
-	{ _MMIO(0x9888), 0x06b92000 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x258b8009 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x198c4000 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2185800a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x1b830154 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x45800c00 },
-	{ _MMIO(0x9888), 0x47800c63 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f8014a5 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800045 },
-};
-
-static int
-get_l3_3_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_3;
-	lens[n] = ARRAY_SIZE(mux_config_l3_3);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_4[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_4[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_4[] = {
-	{ _MMIO(0x9888), 0x121a0340 },
-	{ _MMIO(0x9888), 0x103f0017 },
-	{ _MMIO(0x9888), 0x123f0020 },
-	{ _MMIO(0x9888), 0x129a0340 },
-	{ _MMIO(0x9888), 0x10bf0017 },
-	{ _MMIO(0x9888), 0x12bf0020 },
-	{ _MMIO(0x9888), 0x041a3400 },
-	{ _MMIO(0x9888), 0x101a0000 },
-	{ _MMIO(0x9888), 0x043b8000 },
-	{ _MMIO(0x9888), 0x0a3e0010 },
-	{ _MMIO(0x9888), 0x003f0200 },
-	{ _MMIO(0x9888), 0x023f0113 },
-	{ _MMIO(0x9888), 0x043f0014 },
-	{ _MMIO(0x9888), 0x02592000 },
-	{ _MMIO(0x9888), 0x005a8000 },
-	{ _MMIO(0x9888), 0x025ac000 },
-	{ _MMIO(0x9888), 0x045a4000 },
-	{ _MMIO(0x9888), 0x0a1c8000 },
-	{ _MMIO(0x9888), 0x001d8000 },
-	{ _MMIO(0x9888), 0x021dc000 },
-	{ _MMIO(0x9888), 0x041d4000 },
-	{ _MMIO(0x9888), 0x0a1e8000 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f001a },
-	{ _MMIO(0x9888), 0x00394000 },
-	{ _MMIO(0x9888), 0x02395000 },
-	{ _MMIO(0x9888), 0x04391000 },
-	{ _MMIO(0x9888), 0x069a0034 },
-	{ _MMIO(0x9888), 0x109a0000 },
-	{ _MMIO(0x9888), 0x06bb4000 },
-	{ _MMIO(0x9888), 0x0abe0040 },
-	{ _MMIO(0x9888), 0x0cbf0984 },
-	{ _MMIO(0x9888), 0x0ebf0a02 },
-	{ _MMIO(0x9888), 0x02d94000 },
-	{ _MMIO(0x9888), 0x0cdac000 },
-	{ _MMIO(0x9888), 0x0edac000 },
-	{ _MMIO(0x9888), 0x0c9c0400 },
-	{ _MMIO(0x9888), 0x0c9dc000 },
-	{ _MMIO(0x9888), 0x0e9dc000 },
-	{ _MMIO(0x9888), 0x0c9e0400 },
-	{ _MMIO(0x9888), 0x109f02a8 },
-	{ _MMIO(0x9888), 0x0e9f0040 },
-	{ _MMIO(0x9888), 0x0cb95000 },
-	{ _MMIO(0x9888), 0x0eb95000 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x258b8009 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x198c4000 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2185800a },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x1b830154 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x45800800 },
-	{ _MMIO(0x9888), 0x47800842 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f801084 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800044 },
-};
-
-static int
-get_l3_4_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_4;
-	lens[n] = ARRAY_SIZE(mux_config_l3_4);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00006000 },
-	{ _MMIO(0x2774), 0x0000f3ff },
-	{ _MMIO(0x2778), 0x00001800 },
-	{ _MMIO(0x277c), 0x0000fcff },
-	{ _MMIO(0x2780), 0x00000600 },
-	{ _MMIO(0x2784), 0x0000ff3f },
-	{ _MMIO(0x2788), 0x00000180 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000060 },
-	{ _MMIO(0x2794), 0x0000fff3 },
-	{ _MMIO(0x2798), 0x00000018 },
-	{ _MMIO(0x279c), 0x0000fffc },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x143b000e },
-	{ _MMIO(0x9888), 0x043c55c0 },
-	{ _MMIO(0x9888), 0x0a1e0280 },
-	{ _MMIO(0x9888), 0x0c1e0408 },
-	{ _MMIO(0x9888), 0x10390000 },
-	{ _MMIO(0x9888), 0x12397a1f },
-	{ _MMIO(0x9888), 0x14bb000e },
-	{ _MMIO(0x9888), 0x04bc5000 },
-	{ _MMIO(0x9888), 0x0a9e0296 },
-	{ _MMIO(0x9888), 0x0c9e0008 },
-	{ _MMIO(0x9888), 0x10b90000 },
-	{ _MMIO(0x9888), 0x12b97a1f },
-	{ _MMIO(0x9888), 0x063b0042 },
-	{ _MMIO(0x9888), 0x103b0000 },
-	{ _MMIO(0x9888), 0x083c0000 },
-	{ _MMIO(0x9888), 0x0a3e0040 },
-	{ _MMIO(0x9888), 0x043f8000 },
-	{ _MMIO(0x9888), 0x02594000 },
-	{ _MMIO(0x9888), 0x045a8000 },
-	{ _MMIO(0x9888), 0x0c1c0400 },
-	{ _MMIO(0x9888), 0x041d8000 },
-	{ _MMIO(0x9888), 0x081e02c0 },
-	{ _MMIO(0x9888), 0x0e1e0000 },
-	{ _MMIO(0x9888), 0x0c1fa800 },
-	{ _MMIO(0x9888), 0x0e1f0260 },
-	{ _MMIO(0x9888), 0x101f0014 },
-	{ _MMIO(0x9888), 0x003905e0 },
-	{ _MMIO(0x9888), 0x06390bc0 },
-	{ _MMIO(0x9888), 0x02390018 },
-	{ _MMIO(0x9888), 0x04394000 },
-	{ _MMIO(0x9888), 0x04bb0042 },
-	{ _MMIO(0x9888), 0x10bb0000 },
-	{ _MMIO(0x9888), 0x02bc05c0 },
-	{ _MMIO(0x9888), 0x08bc0000 },
-	{ _MMIO(0x9888), 0x0abe0004 },
-	{ _MMIO(0x9888), 0x02bf8000 },
-	{ _MMIO(0x9888), 0x02d91000 },
-	{ _MMIO(0x9888), 0x02da8000 },
-	{ _MMIO(0x9888), 0x089c8000 },
-	{ _MMIO(0x9888), 0x029d8000 },
-	{ _MMIO(0x9888), 0x089e8000 },
-	{ _MMIO(0x9888), 0x0e9e0000 },
-	{ _MMIO(0x9888), 0x0e9fa806 },
-	{ _MMIO(0x9888), 0x109f0142 },
-	{ _MMIO(0x9888), 0x08b90617 },
-	{ _MMIO(0x9888), 0x0ab90be0 },
-	{ _MMIO(0x9888), 0x02b94000 },
-	{ _MMIO(0x9888), 0x0d88f000 },
-	{ _MMIO(0x9888), 0x0f88000c },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x018a8000 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x1b8a2800 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x238b52a0 },
-	{ _MMIO(0x9888), 0x258b6a95 },
-	{ _MMIO(0x9888), 0x278b0029 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c1500 },
-	{ _MMIO(0x9888), 0x1b8c0014 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x038d8000 },
-	{ _MMIO(0x9888), 0x058d2000 },
-	{ _MMIO(0x9888), 0x1f85aa80 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x01834000 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0184c000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1180c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x4d800444 },
-	{ _MMIO(0x9888), 0x3d800000 },
-	{ _MMIO(0x9888), 0x4f804000 },
-	{ _MMIO(0x9888), 0x43801080 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800084 },
-	{ _MMIO(0x9888), 0x53800044 },
-	{ _MMIO(0x9888), 0x47801080 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x3f800000 },
-	{ _MMIO(0x9888), 0x41800840 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler_1[] = {
-	{ _MMIO(0x9888), 0x18921400 },
-	{ _MMIO(0x9888), 0x149500ab },
-	{ _MMIO(0x9888), 0x18b21400 },
-	{ _MMIO(0x9888), 0x14b500ab },
-	{ _MMIO(0x9888), 0x18d21400 },
-	{ _MMIO(0x9888), 0x14d500ab },
-	{ _MMIO(0x9888), 0x0cdc8000 },
-	{ _MMIO(0x9888), 0x0edc4000 },
-	{ _MMIO(0x9888), 0x02dcc000 },
-	{ _MMIO(0x9888), 0x04dcc000 },
-	{ _MMIO(0x9888), 0x1abd00a0 },
-	{ _MMIO(0x9888), 0x0abd8000 },
-	{ _MMIO(0x9888), 0x0cd88000 },
-	{ _MMIO(0x9888), 0x0ed84000 },
-	{ _MMIO(0x9888), 0x04d88000 },
-	{ _MMIO(0x9888), 0x1adb0050 },
-	{ _MMIO(0x9888), 0x04db8000 },
-	{ _MMIO(0x9888), 0x06db8000 },
-	{ _MMIO(0x9888), 0x08db8000 },
-	{ _MMIO(0x9888), 0x0adb4000 },
-	{ _MMIO(0x9888), 0x109f02a0 },
-	{ _MMIO(0x9888), 0x0c9fa000 },
-	{ _MMIO(0x9888), 0x0e9f00aa },
-	{ _MMIO(0x9888), 0x18b82500 },
-	{ _MMIO(0x9888), 0x02b88000 },
-	{ _MMIO(0x9888), 0x04b84000 },
-	{ _MMIO(0x9888), 0x06b84000 },
-	{ _MMIO(0x9888), 0x08b84000 },
-	{ _MMIO(0x9888), 0x0ab84000 },
-	{ _MMIO(0x9888), 0x0cb88000 },
-	{ _MMIO(0x9888), 0x0cb98000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x00b98000 },
-	{ _MMIO(0x9888), 0x02b9a000 },
-	{ _MMIO(0x9888), 0x04b9a000 },
-	{ _MMIO(0x9888), 0x06b92000 },
-	{ _MMIO(0x9888), 0x1aba0200 },
-	{ _MMIO(0x9888), 0x02ba8000 },
-	{ _MMIO(0x9888), 0x0cba8000 },
-	{ _MMIO(0x9888), 0x04908000 },
-	{ _MMIO(0x9888), 0x04918000 },
-	{ _MMIO(0x9888), 0x04927300 },
-	{ _MMIO(0x9888), 0x10920000 },
-	{ _MMIO(0x9888), 0x1893000a },
-	{ _MMIO(0x9888), 0x0a934000 },
-	{ _MMIO(0x9888), 0x0a946000 },
-	{ _MMIO(0x9888), 0x0c959000 },
-	{ _MMIO(0x9888), 0x0e950098 },
-	{ _MMIO(0x9888), 0x10950000 },
-	{ _MMIO(0x9888), 0x04b04000 },
-	{ _MMIO(0x9888), 0x04b14000 },
-	{ _MMIO(0x9888), 0x04b20073 },
-	{ _MMIO(0x9888), 0x10b20000 },
-	{ _MMIO(0x9888), 0x04b38000 },
-	{ _MMIO(0x9888), 0x06b38000 },
-	{ _MMIO(0x9888), 0x08b34000 },
-	{ _MMIO(0x9888), 0x04b4c000 },
-	{ _MMIO(0x9888), 0x02b59890 },
-	{ _MMIO(0x9888), 0x10b50000 },
-	{ _MMIO(0x9888), 0x06d04000 },
-	{ _MMIO(0x9888), 0x06d14000 },
-	{ _MMIO(0x9888), 0x06d20073 },
-	{ _MMIO(0x9888), 0x10d20000 },
-	{ _MMIO(0x9888), 0x18d30020 },
-	{ _MMIO(0x9888), 0x02d38000 },
-	{ _MMIO(0x9888), 0x0cd34000 },
-	{ _MMIO(0x9888), 0x0ad48000 },
-	{ _MMIO(0x9888), 0x04d42000 },
-	{ _MMIO(0x9888), 0x0ed59000 },
-	{ _MMIO(0x9888), 0x00d59800 },
-	{ _MMIO(0x9888), 0x10d50000 },
-	{ _MMIO(0x9888), 0x0f88000e },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b5500 },
-	{ _MMIO(0x9888), 0x258b000a },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x0d8d8000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x018d8000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x2185000a },
-	{ _MMIO(0x9888), 0x1b830150 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d848000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d808000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47801021 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800c64 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800c02 },
-};
-
-static int
-get_sampler_1_mux_config(struct drm_i915_private *dev_priv,
-			 const struct i915_oa_reg **regs,
-			 int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler_1;
-	lens[n] = ARRAY_SIZE(mux_config_sampler_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler_2[] = {
-	{ _MMIO(0x9888), 0x18121400 },
-	{ _MMIO(0x9888), 0x141500ab },
-	{ _MMIO(0x9888), 0x18321400 },
-	{ _MMIO(0x9888), 0x143500ab },
-	{ _MMIO(0x9888), 0x18521400 },
-	{ _MMIO(0x9888), 0x145500ab },
-	{ _MMIO(0x9888), 0x0c5c8000 },
-	{ _MMIO(0x9888), 0x0e5c4000 },
-	{ _MMIO(0x9888), 0x025cc000 },
-	{ _MMIO(0x9888), 0x045cc000 },
-	{ _MMIO(0x9888), 0x1a3d00a0 },
-	{ _MMIO(0x9888), 0x0a3d8000 },
-	{ _MMIO(0x9888), 0x0c588000 },
-	{ _MMIO(0x9888), 0x0e584000 },
-	{ _MMIO(0x9888), 0x04588000 },
-	{ _MMIO(0x9888), 0x1a5b0050 },
-	{ _MMIO(0x9888), 0x045b8000 },
-	{ _MMIO(0x9888), 0x065b8000 },
-	{ _MMIO(0x9888), 0x085b8000 },
-	{ _MMIO(0x9888), 0x0a5b4000 },
-	{ _MMIO(0x9888), 0x101f02a0 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f00aa },
-	{ _MMIO(0x9888), 0x18382500 },
-	{ _MMIO(0x9888), 0x02388000 },
-	{ _MMIO(0x9888), 0x04384000 },
-	{ _MMIO(0x9888), 0x06384000 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0c388000 },
-	{ _MMIO(0x9888), 0x0c398000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x00398000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x1a3a0200 },
-	{ _MMIO(0x9888), 0x023a8000 },
-	{ _MMIO(0x9888), 0x0c3a8000 },
-	{ _MMIO(0x9888), 0x04108000 },
-	{ _MMIO(0x9888), 0x04118000 },
-	{ _MMIO(0x9888), 0x04127300 },
-	{ _MMIO(0x9888), 0x10120000 },
-	{ _MMIO(0x9888), 0x1813000a },
-	{ _MMIO(0x9888), 0x0a134000 },
-	{ _MMIO(0x9888), 0x0a146000 },
-	{ _MMIO(0x9888), 0x0c159000 },
-	{ _MMIO(0x9888), 0x0e150098 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x04304000 },
-	{ _MMIO(0x9888), 0x04314000 },
-	{ _MMIO(0x9888), 0x04320073 },
-	{ _MMIO(0x9888), 0x10320000 },
-	{ _MMIO(0x9888), 0x04338000 },
-	{ _MMIO(0x9888), 0x06338000 },
-	{ _MMIO(0x9888), 0x08334000 },
-	{ _MMIO(0x9888), 0x0434c000 },
-	{ _MMIO(0x9888), 0x02359890 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x06504000 },
-	{ _MMIO(0x9888), 0x06514000 },
-	{ _MMIO(0x9888), 0x06520073 },
-	{ _MMIO(0x9888), 0x10520000 },
-	{ _MMIO(0x9888), 0x18530020 },
-	{ _MMIO(0x9888), 0x02538000 },
-	{ _MMIO(0x9888), 0x0c534000 },
-	{ _MMIO(0x9888), 0x0a548000 },
-	{ _MMIO(0x9888), 0x04542000 },
-	{ _MMIO(0x9888), 0x0e559000 },
-	{ _MMIO(0x9888), 0x00559800 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x1b8aa000 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x258b0005 },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x2185000a },
-	{ _MMIO(0x9888), 0x1b830150 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0d848000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x07844000 },
-	{ _MMIO(0x9888), 0x1d808000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x17804000 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47801021 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800c64 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x41800c02 },
-};
-
-static int
-get_sampler_2_mux_config(struct drm_i915_private *dev_priv,
-			 const struct i915_oa_reg **regs,
-			 int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler_2;
-	lens[n] = ARRAY_SIZE(mux_config_sampler_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x0000fe7f },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000ffbf },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fff7 },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fff9 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x16154d60 },
-	{ _MMIO(0x9888), 0x16352e60 },
-	{ _MMIO(0x9888), 0x16554d60 },
-	{ _MMIO(0x9888), 0x16950000 },
-	{ _MMIO(0x9888), 0x16b50000 },
-	{ _MMIO(0x9888), 0x16d50000 },
-	{ _MMIO(0x9888), 0x005c8000 },
-	{ _MMIO(0x9888), 0x045cc000 },
-	{ _MMIO(0x9888), 0x065c4000 },
-	{ _MMIO(0x9888), 0x083d8000 },
-	{ _MMIO(0x9888), 0x0a3d8000 },
-	{ _MMIO(0x9888), 0x0458c000 },
-	{ _MMIO(0x9888), 0x025b8000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5b4000 },
-	{ _MMIO(0x9888), 0x0c5b8000 },
-	{ _MMIO(0x9888), 0x0c1fa000 },
-	{ _MMIO(0x9888), 0x0e1f00aa },
-	{ _MMIO(0x9888), 0x02384000 },
-	{ _MMIO(0x9888), 0x04388000 },
-	{ _MMIO(0x9888), 0x06388000 },
-	{ _MMIO(0x9888), 0x08384000 },
-	{ _MMIO(0x9888), 0x0a384000 },
-	{ _MMIO(0x9888), 0x0c384000 },
-	{ _MMIO(0x9888), 0x00398000 },
-	{ _MMIO(0x9888), 0x0239a000 },
-	{ _MMIO(0x9888), 0x0439a000 },
-	{ _MMIO(0x9888), 0x06392000 },
-	{ _MMIO(0x9888), 0x043a8000 },
-	{ _MMIO(0x9888), 0x063a8000 },
-	{ _MMIO(0x9888), 0x08138000 },
-	{ _MMIO(0x9888), 0x0a138000 },
-	{ _MMIO(0x9888), 0x06143000 },
-	{ _MMIO(0x9888), 0x0415cfc7 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x02338000 },
-	{ _MMIO(0x9888), 0x0c338000 },
-	{ _MMIO(0x9888), 0x04342000 },
-	{ _MMIO(0x9888), 0x06344000 },
-	{ _MMIO(0x9888), 0x0035c700 },
-	{ _MMIO(0x9888), 0x063500cf },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x04538000 },
-	{ _MMIO(0x9888), 0x06538000 },
-	{ _MMIO(0x9888), 0x0454c000 },
-	{ _MMIO(0x9888), 0x0255cfc7 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x06dc8000 },
-	{ _MMIO(0x9888), 0x08dc4000 },
-	{ _MMIO(0x9888), 0x0cdcc000 },
-	{ _MMIO(0x9888), 0x0edcc000 },
-	{ _MMIO(0x9888), 0x1abd00a8 },
-	{ _MMIO(0x9888), 0x0cd8c000 },
-	{ _MMIO(0x9888), 0x0ed84000 },
-	{ _MMIO(0x9888), 0x0edb8000 },
-	{ _MMIO(0x9888), 0x18db0800 },
-	{ _MMIO(0x9888), 0x1adb0254 },
-	{ _MMIO(0x9888), 0x0e9faa00 },
-	{ _MMIO(0x9888), 0x109f02aa },
-	{ _MMIO(0x9888), 0x0eb84000 },
-	{ _MMIO(0x9888), 0x16b84000 },
-	{ _MMIO(0x9888), 0x18b8156a },
-	{ _MMIO(0x9888), 0x06b98000 },
-	{ _MMIO(0x9888), 0x08b9a000 },
-	{ _MMIO(0x9888), 0x0ab9a000 },
-	{ _MMIO(0x9888), 0x0cb9a000 },
-	{ _MMIO(0x9888), 0x0eb9a000 },
-	{ _MMIO(0x9888), 0x18baa000 },
-	{ _MMIO(0x9888), 0x1aba0002 },
-	{ _MMIO(0x9888), 0x16934000 },
-	{ _MMIO(0x9888), 0x1893000a },
-	{ _MMIO(0x9888), 0x0a947000 },
-	{ _MMIO(0x9888), 0x0c95c5c1 },
-	{ _MMIO(0x9888), 0x0e9500c3 },
-	{ _MMIO(0x9888), 0x10950000 },
-	{ _MMIO(0x9888), 0x0eb38000 },
-	{ _MMIO(0x9888), 0x16b30040 },
-	{ _MMIO(0x9888), 0x18b30020 },
-	{ _MMIO(0x9888), 0x06b48000 },
-	{ _MMIO(0x9888), 0x08b41000 },
-	{ _MMIO(0x9888), 0x0ab48000 },
-	{ _MMIO(0x9888), 0x06b5c500 },
-	{ _MMIO(0x9888), 0x08b500c3 },
-	{ _MMIO(0x9888), 0x0eb5c100 },
-	{ _MMIO(0x9888), 0x10b50000 },
-	{ _MMIO(0x9888), 0x16d31500 },
-	{ _MMIO(0x9888), 0x08d4e000 },
-	{ _MMIO(0x9888), 0x08d5c100 },
-	{ _MMIO(0x9888), 0x0ad5c3c5 },
-	{ _MMIO(0x9888), 0x10d50000 },
-	{ _MMIO(0x9888), 0x0d88f800 },
-	{ _MMIO(0x9888), 0x0f88000f },
-	{ _MMIO(0x9888), 0x038a8000 },
-	{ _MMIO(0x9888), 0x058a8000 },
-	{ _MMIO(0x9888), 0x078a8000 },
-	{ _MMIO(0x9888), 0x098a8000 },
-	{ _MMIO(0x9888), 0x0b8a8000 },
-	{ _MMIO(0x9888), 0x0d8a8000 },
-	{ _MMIO(0x9888), 0x258baaa5 },
-	{ _MMIO(0x9888), 0x278b002a },
-	{ _MMIO(0x9888), 0x238b2a80 },
-	{ _MMIO(0x9888), 0x0f8c4000 },
-	{ _MMIO(0x9888), 0x178c2000 },
-	{ _MMIO(0x9888), 0x198c5500 },
-	{ _MMIO(0x9888), 0x1b8c0015 },
-	{ _MMIO(0x9888), 0x078d8000 },
-	{ _MMIO(0x9888), 0x098da000 },
-	{ _MMIO(0x9888), 0x0b8da000 },
-	{ _MMIO(0x9888), 0x0d8da000 },
-	{ _MMIO(0x9888), 0x0f8da000 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800c42 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45800063 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x47800800 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f8014a4 },
-	{ _MMIO(0x9888), 0x41801042 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x0000fe7f },
-	{ _MMIO(0x2780), 0x00000000 },
-	{ _MMIO(0x2784), 0x0000ff9f },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000ffe7 },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fffb },
-	{ _MMIO(0x2798), 0x00000002 },
-	{ _MMIO(0x279c), 0x0000fffd },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x16150000 },
-	{ _MMIO(0x9888), 0x16350000 },
-	{ _MMIO(0x9888), 0x16550000 },
-	{ _MMIO(0x9888), 0x16952e60 },
-	{ _MMIO(0x9888), 0x16b54d60 },
-	{ _MMIO(0x9888), 0x16d52e60 },
-	{ _MMIO(0x9888), 0x065c8000 },
-	{ _MMIO(0x9888), 0x085cc000 },
-	{ _MMIO(0x9888), 0x0a5cc000 },
-	{ _MMIO(0x9888), 0x0c5c4000 },
-	{ _MMIO(0x9888), 0x0e3d8000 },
-	{ _MMIO(0x9888), 0x183da000 },
-	{ _MMIO(0x9888), 0x06588000 },
-	{ _MMIO(0x9888), 0x08588000 },
-	{ _MMIO(0x9888), 0x0a584000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x185b5800 },
-	{ _MMIO(0x9888), 0x1a5b000a },
-	{ _MMIO(0x9888), 0x0e1faa00 },
-	{ _MMIO(0x9888), 0x101f02aa },
-	{ _MMIO(0x9888), 0x0e384000 },
-	{ _MMIO(0x9888), 0x16384000 },
-	{ _MMIO(0x9888), 0x18382a55 },
-	{ _MMIO(0x9888), 0x06398000 },
-	{ _MMIO(0x9888), 0x0839a000 },
-	{ _MMIO(0x9888), 0x0a39a000 },
-	{ _MMIO(0x9888), 0x0c39a000 },
-	{ _MMIO(0x9888), 0x0e39a000 },
-	{ _MMIO(0x9888), 0x1a3a02a0 },
-	{ _MMIO(0x9888), 0x0e138000 },
-	{ _MMIO(0x9888), 0x16130500 },
-	{ _MMIO(0x9888), 0x06148000 },
-	{ _MMIO(0x9888), 0x08146000 },
-	{ _MMIO(0x9888), 0x0615c100 },
-	{ _MMIO(0x9888), 0x0815c500 },
-	{ _MMIO(0x9888), 0x0a1500c3 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x16335040 },
-	{ _MMIO(0x9888), 0x08349000 },
-	{ _MMIO(0x9888), 0x0a341000 },
-	{ _MMIO(0x9888), 0x083500c1 },
-	{ _MMIO(0x9888), 0x0a35c500 },
-	{ _MMIO(0x9888), 0x0c3500c3 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x1853002a },
-	{ _MMIO(0x9888), 0x0a54e000 },
-	{ _MMIO(0x9888), 0x0c55c500 },
-	{ _MMIO(0x9888), 0x0e55c1c3 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x00dc8000 },
-	{ _MMIO(0x9888), 0x02dcc000 },
-	{ _MMIO(0x9888), 0x04dc4000 },
-	{ _MMIO(0x9888), 0x04bd8000 },
-	{ _MMIO(0x9888), 0x06bd8000 },
-	{ _MMIO(0x9888), 0x02d8c000 },
-	{ _MMIO(0x9888), 0x02db8000 },
-	{ _MMIO(0x9888), 0x04db4000 },
-	{ _MMIO(0x9888), 0x06db4000 },
-	{ _MMIO(0x9888), 0x08db8000 },
-	{ _MMIO(0x9888), 0x0c9fa000 },
-	{ _MMIO(0x9888), 0x0e9f00aa },
-	{ _MMIO(0x9888), 0x02b84000 },
-	{ _MMIO(0x9888), 0x04b84000 },
-	{ _MMIO(0x9888), 0x06b84000 },
-	{ _MMIO(0x9888), 0x08b84000 },
-	{ _MMIO(0x9888), 0x0ab88000 },
-	{ _MMIO(0x9888), 0x0cb88000 },
-	{ _MMIO(0x9888), 0x00b98000 },
-	{ _MMIO(0x9888), 0x02b9a000 },
-	{ _MMIO(0x9888), 0x04b9a000 },
-	{ _MMIO(0x9888), 0x06b92000 },
-	{ _MMIO(0x9888), 0x0aba8000 },
-	{ _MMIO(0x9888), 0x0cba8000 },
-	{ _MMIO(0x9888), 0x04938000 },
-	{ _MMIO(0x9888), 0x06938000 },
-	{ _MMIO(0x9888), 0x0494c000 },
-	{ _MMIO(0x9888), 0x0295cfc7 },
-	{ _MMIO(0x9888), 0x10950000 },
-	{ _MMIO(0x9888), 0x02b38000 },
-	{ _MMIO(0x9888), 0x08b38000 },
-	{ _MMIO(0x9888), 0x04b42000 },
-	{ _MMIO(0x9888), 0x06b41000 },
-	{ _MMIO(0x9888), 0x00b5c700 },
-	{ _MMIO(0x9888), 0x04b500cf },
-	{ _MMIO(0x9888), 0x10b50000 },
-	{ _MMIO(0x9888), 0x0ad38000 },
-	{ _MMIO(0x9888), 0x0cd38000 },
-	{ _MMIO(0x9888), 0x06d46000 },
-	{ _MMIO(0x9888), 0x04d5c700 },
-	{ _MMIO(0x9888), 0x06d500cf },
-	{ _MMIO(0x9888), 0x10d50000 },
-	{ _MMIO(0x9888), 0x03888000 },
-	{ _MMIO(0x9888), 0x05888000 },
-	{ _MMIO(0x9888), 0x07888000 },
-	{ _MMIO(0x9888), 0x09888000 },
-	{ _MMIO(0x9888), 0x0b888000 },
-	{ _MMIO(0x9888), 0x0d880400 },
-	{ _MMIO(0x9888), 0x0f8a8000 },
-	{ _MMIO(0x9888), 0x198a8000 },
-	{ _MMIO(0x9888), 0x1b8aaaa0 },
-	{ _MMIO(0x9888), 0x1d8a0002 },
-	{ _MMIO(0x9888), 0x258b555a },
-	{ _MMIO(0x9888), 0x278b0015 },
-	{ _MMIO(0x9888), 0x238b5500 },
-	{ _MMIO(0x9888), 0x038c4000 },
-	{ _MMIO(0x9888), 0x058c4000 },
-	{ _MMIO(0x9888), 0x078c4000 },
-	{ _MMIO(0x9888), 0x098c4000 },
-	{ _MMIO(0x9888), 0x0b8c4000 },
-	{ _MMIO(0x9888), 0x0d8c4000 },
-	{ _MMIO(0x9888), 0x018d8000 },
-	{ _MMIO(0x9888), 0x038da000 },
-	{ _MMIO(0x9888), 0x058da000 },
-	{ _MMIO(0x9888), 0x078d2000 },
-	{ _MMIO(0x9888), 0x2185aaaa },
-	{ _MMIO(0x9888), 0x2385002a },
-	{ _MMIO(0x9888), 0x1f85aa00 },
-	{ _MMIO(0x9888), 0x0f834000 },
-	{ _MMIO(0x9888), 0x19835400 },
-	{ _MMIO(0x9888), 0x1b830155 },
-	{ _MMIO(0x9888), 0x03834000 },
-	{ _MMIO(0x9888), 0x05834000 },
-	{ _MMIO(0x9888), 0x07834000 },
-	{ _MMIO(0x9888), 0x09834000 },
-	{ _MMIO(0x9888), 0x0b834000 },
-	{ _MMIO(0x9888), 0x0d834000 },
-	{ _MMIO(0x9888), 0x0784c000 },
-	{ _MMIO(0x9888), 0x0984c000 },
-	{ _MMIO(0x9888), 0x0b84c000 },
-	{ _MMIO(0x9888), 0x0d84c000 },
-	{ _MMIO(0x9888), 0x0f84c000 },
-	{ _MMIO(0x9888), 0x01848000 },
-	{ _MMIO(0x9888), 0x0384c000 },
-	{ _MMIO(0x9888), 0x0584c000 },
-	{ _MMIO(0x9888), 0x1780c000 },
-	{ _MMIO(0x9888), 0x1980c000 },
-	{ _MMIO(0x9888), 0x1b80c000 },
-	{ _MMIO(0x9888), 0x1d80c000 },
-	{ _MMIO(0x9888), 0x1f80c000 },
-	{ _MMIO(0x9888), 0x11808000 },
-	{ _MMIO(0x9888), 0x1380c000 },
-	{ _MMIO(0x9888), 0x1580c000 },
-	{ _MMIO(0x9888), 0x4f800000 },
-	{ _MMIO(0x9888), 0x43800882 },
-	{ _MMIO(0x9888), 0x51800000 },
-	{ _MMIO(0x9888), 0x45801082 },
-	{ _MMIO(0x9888), 0x53800000 },
-	{ _MMIO(0x9888), 0x478014a5 },
-	{ _MMIO(0x9888), 0x21800000 },
-	{ _MMIO(0x9888), 0x31800000 },
-	{ _MMIO(0x9888), 0x4d800000 },
-	{ _MMIO(0x9888), 0x3f800002 },
-	{ _MMIO(0x9888), 0x41800c62 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2744), 0x00800000 },
@@ -1997,6 +60,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x000000a0 },
 	{ _MMIO(0x9888), 0x59800000 },
 	{ _MMIO(0x9888), 0x59800001 },
 	{ _MMIO(0x9888), 0x338b0000 },
@@ -2008,866 +72,38 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x57800000 },
 	{ _MMIO(0x1823a4), 0x00000000 },
 	{ _MMIO(0x9888), 0x59800000 },
-};
-
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_chv(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_L3_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_2_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_2);
-
-		return 0;
-	case METRIC_SET_ID_L3_3:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_3_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_3\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_3;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_3);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_3;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_3);
-
-		return 0;
-	case METRIC_SET_ID_L3_4:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_4_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_4\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_4;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_4);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_4;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_4);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_1_mux_config(dev_priv,
-						 dev_priv->perf.oa.mux_regs,
-						 dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler_1);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_2_mux_config(dev_priv,
-						 dev_priv->perf.oa.mux_regs,
-						 dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler_2);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "9d8a3af5-c02c-4a4a-b947-f1672469e0fb",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "f522a89c-ecd1-4522-8331-3383c54af5f5",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "a9ccc03d-a943-4e6b-9cd6-13e063075927",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "2cf0c064-68df-4fac-9b3f-57f51ca8a069",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "78a87ff9-543a-49ce-95ea-26d86071ea93",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_l3_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_2);
-}
-
-static struct device_attribute dev_attr_l3_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_2[] = {
-	&dev_attr_l3_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_2 = {
-	.name = "9f2cece5-7bfe-4320-ad66-8c7cc526bec5",
-	.attrs =  attrs_l3_2,
-};
-
-static ssize_t
-show_l3_3_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_3);
-}
-
-static struct device_attribute dev_attr_l3_3_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_3_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_3[] = {
-	&dev_attr_l3_3_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_3 = {
-	.name = "d890ef38-d309-47e4-b8b5-aa779bb19ab0",
-	.attrs =  attrs_l3_3,
-};
-
-static ssize_t
-show_l3_4_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_4);
-}
-
-static struct device_attribute dev_attr_l3_4_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_4_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_4[] = {
-	&dev_attr_l3_4_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_4 = {
-	.name = "5fdff4a6-9dc8-45e1-bfda-ef54869fbdd4",
-	.attrs =  attrs_l3_4,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "2c0e45e1-7e2c-4a14-ae00-0b7ec868b8aa",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER_1);
-}
-
-static struct device_attribute dev_attr_sampler_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler_1[] = {
-	&dev_attr_sampler_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler_1 = {
-	.name = "71148d78-baf5-474f-878a-e23158d0265d",
-	.attrs =  attrs_sampler_1,
-};
-
-static ssize_t
-show_sampler_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER_2);
-}
-
-static struct device_attribute dev_attr_sampler_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler_2[] = {
-	&dev_attr_sampler_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler_2 = {
-	.name = "b996a2b7-c59c-492d-877a-8cd54fd6df84",
-	.attrs =  attrs_sampler_2,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "eb2fecba-b431-42e7-8261-fe9429a6e67a",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "60749470-a648-4a4b-9f10-dbfe1e36e44d",
-	.attrs =  attrs_tdl_2,
+	{ _MMIO(0x9840), 0x00000080 },
 };
 
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "4a534b07-cba3-414d-8d60-874830e883aa",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_chv(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-		if (ret)
-			goto error_l3_2;
-	}
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-		if (ret)
-			goto error_l3_3;
-	}
-	if (get_l3_4_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_4);
-		if (ret)
-			goto error_l3_4;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler_1);
-		if (ret)
-			goto error_sampler_1;
-	}
-	if (get_sampler_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler_2);
-		if (ret)
-			goto error_sampler_2;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_2);
-error_sampler_2:
-	if (get_sampler_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_1);
-error_sampler_1:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_4_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_4);
-error_l3_4:
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-error_l3_3:
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-error_l3_2:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_chv(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_chv(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"4a534b07-cba3-414d-8d60-874830e883aa",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-	if (get_l3_4_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_4);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_1);
-	if (get_sampler_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_2);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "4a534b07-cba3-414d-8d60-874830e883aa";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_chv.h b/drivers/gpu/drm/i915/i915_oa_chv.h
index 8b8bdc2..b962249 100644
--- a/drivers/gpu/drm/i915/i915_oa_chv.h
+++ b/drivers/gpu/drm/i915/i915_oa_chv.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_CHV_H__
 #define __I915_OA_CHV_H__
 
-extern int i915_oa_n_builtin_metric_sets_chv;
-
-extern int i915_oa_select_metric_set_chv(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_chv(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_chv(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_chv(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_glk.c b/drivers/gpu/drm/i915/i915_oa_glk.c
index 2f356d5..4ee527e 100644
--- a/drivers/gpu/drm/i915/i915_oa_glk.c
+++ b/drivers/gpu/drm/i915/i915_oa_glk.c
@@ -31,1614 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_glk.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_COMPUTE_L3_CACHE,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_COMPUTE_EXTRA,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_glk = 15;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic[] = {
-	{ _MMIO(0x9888), 0x166c00f0 },
-	{ _MMIO(0x9888), 0x12120280 },
-	{ _MMIO(0x9888), 0x12320280 },
-	{ _MMIO(0x9888), 0x11930317 },
-	{ _MMIO(0x9888), 0x159303df },
-	{ _MMIO(0x9888), 0x3f900c00 },
-	{ _MMIO(0x9888), 0x419000a0 },
-	{ _MMIO(0x9888), 0x002d1000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x082d5000 },
-	{ _MMIO(0x9888), 0x0a2d1000 },
-	{ _MMIO(0x9888), 0x0c2e0800 },
-	{ _MMIO(0x9888), 0x0e2e5900 },
-	{ _MMIO(0x9888), 0x0a4c8000 },
-	{ _MMIO(0x9888), 0x0c4c8000 },
-	{ _MMIO(0x9888), 0x0e4c4000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e2000 },
-	{ _MMIO(0x9888), 0x1c4f0010 },
-	{ _MMIO(0x9888), 0x0a6c0053 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1a0fcc00 },
-	{ _MMIO(0x9888), 0x1c0f0002 },
-	{ _MMIO(0x9888), 0x1c2c0040 },
-	{ _MMIO(0x9888), 0x00101000 },
-	{ _MMIO(0x9888), 0x04101000 },
-	{ _MMIO(0x9888), 0x00114000 },
-	{ _MMIO(0x9888), 0x08114000 },
-	{ _MMIO(0x9888), 0x00120020 },
-	{ _MMIO(0x9888), 0x08120021 },
-	{ _MMIO(0x9888), 0x00141000 },
-	{ _MMIO(0x9888), 0x08141000 },
-	{ _MMIO(0x9888), 0x02308000 },
-	{ _MMIO(0x9888), 0x04302000 },
-	{ _MMIO(0x9888), 0x06318000 },
-	{ _MMIO(0x9888), 0x08318000 },
-	{ _MMIO(0x9888), 0x06320800 },
-	{ _MMIO(0x9888), 0x08320840 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x06344000 },
-	{ _MMIO(0x9888), 0x08344000 },
-	{ _MMIO(0x9888), 0x0d931831 },
-	{ _MMIO(0x9888), 0x0f939f3f },
-	{ _MMIO(0x9888), 0x01939e80 },
-	{ _MMIO(0x9888), 0x039303bc },
-	{ _MMIO(0x9888), 0x0593000e },
-	{ _MMIO(0x9888), 0x1993002a },
-	{ _MMIO(0x9888), 0x07930000 },
-	{ _MMIO(0x9888), 0x09930000 },
-	{ _MMIO(0x9888), 0x1d900177 },
-	{ _MMIO(0x9888), 0x1f900187 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25904000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x53901110 },
-	{ _MMIO(0x9888), 0x43900423 },
-	{ _MMIO(0x9888), 0x55900111 },
-	{ _MMIO(0x9888), 0x47900c02 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900020 },
-	{ _MMIO(0x9888), 0x59901111 },
-	{ _MMIO(0x9888), 0x4b900421 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x45900821 },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_basic;
-	lens[n] = ARRAY_SIZE(mux_config_render_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x124f1c00 },
-	{ _MMIO(0x9888), 0x39900340 },
-	{ _MMIO(0x9888), 0x3f900c00 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x002d5000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x082d4000 },
-	{ _MMIO(0x9888), 0x0a2d1000 },
-	{ _MMIO(0x9888), 0x0c2d5000 },
-	{ _MMIO(0x9888), 0x0e2d4000 },
-	{ _MMIO(0x9888), 0x0c2e1400 },
-	{ _MMIO(0x9888), 0x0e2e5100 },
-	{ _MMIO(0x9888), 0x102e0114 },
-	{ _MMIO(0x9888), 0x044cc000 },
-	{ _MMIO(0x9888), 0x0a4c8000 },
-	{ _MMIO(0x9888), 0x0c4c8000 },
-	{ _MMIO(0x9888), 0x0e4c4000 },
-	{ _MMIO(0x9888), 0x104c8000 },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x004ea000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e2000 },
-	{ _MMIO(0x9888), 0x0c4ea000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x004f6b42 },
-	{ _MMIO(0x9888), 0x064f6200 },
-	{ _MMIO(0x9888), 0x084f4100 },
-	{ _MMIO(0x9888), 0x0a4f0061 },
-	{ _MMIO(0x9888), 0x0c4f6c4c },
-	{ _MMIO(0x9888), 0x0e4f4b00 },
-	{ _MMIO(0x9888), 0x1a4f0000 },
-	{ _MMIO(0x9888), 0x1c4f0000 },
-	{ _MMIO(0x9888), 0x180f5000 },
-	{ _MMIO(0x9888), 0x1a0f8800 },
-	{ _MMIO(0x9888), 0x1c0f08a2 },
-	{ _MMIO(0x9888), 0x182c4000 },
-	{ _MMIO(0x9888), 0x1c2c1451 },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x1a2c0010 },
-	{ _MMIO(0x9888), 0x01938000 },
-	{ _MMIO(0x9888), 0x0f938000 },
-	{ _MMIO(0x9888), 0x19938a28 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x19900177 },
-	{ _MMIO(0x9888), 0x1b900178 },
-	{ _MMIO(0x9888), 0x1d900125 },
-	{ _MMIO(0x9888), 0x1f900123 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x25904000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x53901000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x55900111 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_basic;
-	lens[n] = ARRAY_SIZE(mux_config_compute_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile[] = {
-	{ _MMIO(0x9888), 0x0c2e001f },
-	{ _MMIO(0x9888), 0x0a2f0000 },
-	{ _MMIO(0x9888), 0x10186800 },
-	{ _MMIO(0x9888), 0x11810019 },
-	{ _MMIO(0x9888), 0x15810013 },
-	{ _MMIO(0x9888), 0x13820020 },
-	{ _MMIO(0x9888), 0x11830020 },
-	{ _MMIO(0x9888), 0x17840000 },
-	{ _MMIO(0x9888), 0x11860007 },
-	{ _MMIO(0x9888), 0x21860000 },
-	{ _MMIO(0x9888), 0x178703e0 },
-	{ _MMIO(0x9888), 0x0c2d8000 },
-	{ _MMIO(0x9888), 0x042d4000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x022e5400 },
-	{ _MMIO(0x9888), 0x002e0000 },
-	{ _MMIO(0x9888), 0x0e2e0080 },
-	{ _MMIO(0x9888), 0x082f0040 },
-	{ _MMIO(0x9888), 0x002f0000 },
-	{ _MMIO(0x9888), 0x06143000 },
-	{ _MMIO(0x9888), 0x06174000 },
-	{ _MMIO(0x9888), 0x06180012 },
-	{ _MMIO(0x9888), 0x00180000 },
-	{ _MMIO(0x9888), 0x0d804000 },
-	{ _MMIO(0x9888), 0x0f804000 },
-	{ _MMIO(0x9888), 0x05804000 },
-	{ _MMIO(0x9888), 0x09810200 },
-	{ _MMIO(0x9888), 0x0b810030 },
-	{ _MMIO(0x9888), 0x03810003 },
-	{ _MMIO(0x9888), 0x21819140 },
-	{ _MMIO(0x9888), 0x23819050 },
-	{ _MMIO(0x9888), 0x25810018 },
-	{ _MMIO(0x9888), 0x0b820980 },
-	{ _MMIO(0x9888), 0x03820d80 },
-	{ _MMIO(0x9888), 0x11820000 },
-	{ _MMIO(0x9888), 0x0182c000 },
-	{ _MMIO(0x9888), 0x07828000 },
-	{ _MMIO(0x9888), 0x09824000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x0d830004 },
-	{ _MMIO(0x9888), 0x0583000c },
-	{ _MMIO(0x9888), 0x0f831000 },
-	{ _MMIO(0x9888), 0x01848072 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x07848000 },
-	{ _MMIO(0x9888), 0x09844000 },
-	{ _MMIO(0x9888), 0x0f848000 },
-	{ _MMIO(0x9888), 0x07860000 },
-	{ _MMIO(0x9888), 0x09860092 },
-	{ _MMIO(0x9888), 0x0f860400 },
-	{ _MMIO(0x9888), 0x01869100 },
-	{ _MMIO(0x9888), 0x0f870065 },
-	{ _MMIO(0x9888), 0x01870000 },
-	{ _MMIO(0x9888), 0x19930800 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x1b952000 },
-	{ _MMIO(0x9888), 0x1d955055 },
-	{ _MMIO(0x9888), 0x1f951455 },
-	{ _MMIO(0x9888), 0x0992a000 },
-	{ _MMIO(0x9888), 0x0f928000 },
-	{ _MMIO(0x9888), 0x1192a800 },
-	{ _MMIO(0x9888), 0x1392028a },
-	{ _MMIO(0x9888), 0x0b92a000 },
-	{ _MMIO(0x9888), 0x0d922000 },
-	{ _MMIO(0x9888), 0x13908000 },
-	{ _MMIO(0x9888), 0x21908000 },
-	{ _MMIO(0x9888), 0x23908000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27908000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f908000 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x15908000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900c01 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900863 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900061 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900000 },
-	{ _MMIO(0x9888), 0x45900c22 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_pipe_profile;
-	lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads[] = {
-	{ _MMIO(0x9888), 0x19800343 },
-	{ _MMIO(0x9888), 0x39900340 },
-	{ _MMIO(0x9888), 0x3f901000 },
-	{ _MMIO(0x9888), 0x41900003 },
-	{ _MMIO(0x9888), 0x03803180 },
-	{ _MMIO(0x9888), 0x058035e2 },
-	{ _MMIO(0x9888), 0x0780006a },
-	{ _MMIO(0x9888), 0x11800000 },
-	{ _MMIO(0x9888), 0x2181a000 },
-	{ _MMIO(0x9888), 0x2381000a },
-	{ _MMIO(0x9888), 0x1d950550 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d92a000 },
-	{ _MMIO(0x9888), 0x0f922000 },
-	{ _MMIO(0x9888), 0x13900170 },
-	{ _MMIO(0x9888), 0x21900171 },
-	{ _MMIO(0x9888), 0x23900172 },
-	{ _MMIO(0x9888), 0x25900173 },
-	{ _MMIO(0x9888), 0x27900174 },
-	{ _MMIO(0x9888), 0x29900175 },
-	{ _MMIO(0x9888), 0x2b900176 },
-	{ _MMIO(0x9888), 0x2d900177 },
-	{ _MMIO(0x9888), 0x2f90017f },
-	{ _MMIO(0x9888), 0x31900125 },
-	{ _MMIO(0x9888), 0x15900123 },
-	{ _MMIO(0x9888), 0x17900121 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43901084 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47901080 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49901084 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b901084 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900004 },
-	{ _MMIO(0x9888), 0x45900000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_reads;
-	lens[n] = ARRAY_SIZE(mux_config_memory_reads);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes[] = {
-	{ _MMIO(0x9888), 0x19800343 },
-	{ _MMIO(0x9888), 0x39900340 },
-	{ _MMIO(0x9888), 0x3f900000 },
-	{ _MMIO(0x9888), 0x41900080 },
-	{ _MMIO(0x9888), 0x03803180 },
-	{ _MMIO(0x9888), 0x058035e2 },
-	{ _MMIO(0x9888), 0x0780006a },
-	{ _MMIO(0x9888), 0x11800000 },
-	{ _MMIO(0x9888), 0x2181a000 },
-	{ _MMIO(0x9888), 0x2381000a },
-	{ _MMIO(0x9888), 0x1d950550 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d92a000 },
-	{ _MMIO(0x9888), 0x0f922000 },
-	{ _MMIO(0x9888), 0x13900180 },
-	{ _MMIO(0x9888), 0x21900181 },
-	{ _MMIO(0x9888), 0x23900182 },
-	{ _MMIO(0x9888), 0x25900183 },
-	{ _MMIO(0x9888), 0x27900184 },
-	{ _MMIO(0x9888), 0x29900185 },
-	{ _MMIO(0x9888), 0x2b900186 },
-	{ _MMIO(0x9888), 0x2d900187 },
-	{ _MMIO(0x9888), 0x2f900170 },
-	{ _MMIO(0x9888), 0x31900125 },
-	{ _MMIO(0x9888), 0x15900123 },
-	{ _MMIO(0x9888), 0x17900121 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43901084 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47901080 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49901084 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b901084 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900004 },
-	{ _MMIO(0x9888), 0x45900000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_writes;
-	lens[n] = ARRAY_SIZE(mux_config_memory_writes);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fc2a },
-	{ _MMIO(0x2774), 0x0000bf00 },
-	{ _MMIO(0x2778), 0x0007fc6a },
-	{ _MMIO(0x277c), 0x0000bf00 },
-	{ _MMIO(0x2780), 0x0007fc92 },
-	{ _MMIO(0x2784), 0x0000bf00 },
-	{ _MMIO(0x2788), 0x0007fca2 },
-	{ _MMIO(0x278c), 0x0000bf00 },
-	{ _MMIO(0x2790), 0x0007fc32 },
-	{ _MMIO(0x2794), 0x0000bf00 },
-	{ _MMIO(0x2798), 0x0007fc9a },
-	{ _MMIO(0x279c), 0x0000bf00 },
-	{ _MMIO(0x27a0), 0x0007fe6a },
-	{ _MMIO(0x27a4), 0x0000bf00 },
-	{ _MMIO(0x27a8), 0x0007fe7a },
-	{ _MMIO(0x27ac), 0x0000bf00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x141c0160 },
-	{ _MMIO(0x9888), 0x161c0015 },
-	{ _MMIO(0x9888), 0x181c0120 },
-	{ _MMIO(0x9888), 0x002d5000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x082d5000 },
-	{ _MMIO(0x9888), 0x0a2d5000 },
-	{ _MMIO(0x9888), 0x0c2d5000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x0c2e5400 },
-	{ _MMIO(0x9888), 0x0e2e5515 },
-	{ _MMIO(0x9888), 0x102e0155 },
-	{ _MMIO(0x9888), 0x044cc000 },
-	{ _MMIO(0x9888), 0x0a4c8000 },
-	{ _MMIO(0x9888), 0x0c4cc000 },
-	{ _MMIO(0x9888), 0x0e4cc000 },
-	{ _MMIO(0x9888), 0x104c8000 },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x004ea000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084ea000 },
-	{ _MMIO(0x9888), 0x0a4ea000 },
-	{ _MMIO(0x9888), 0x0c4ea000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x0e4f4b41 },
-	{ _MMIO(0x9888), 0x004f4200 },
-	{ _MMIO(0x9888), 0x024f404c },
-	{ _MMIO(0x9888), 0x1c4f0000 },
-	{ _MMIO(0x9888), 0x1a4f0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x061b8000 },
-	{ _MMIO(0x9888), 0x081bc000 },
-	{ _MMIO(0x9888), 0x0a1bc000 },
-	{ _MMIO(0x9888), 0x0c1bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x001c0031 },
-	{ _MMIO(0x9888), 0x061c1900 },
-	{ _MMIO(0x9888), 0x081c1a33 },
-	{ _MMIO(0x9888), 0x0a1c1b35 },
-	{ _MMIO(0x9888), 0x0c1c3337 },
-	{ _MMIO(0x9888), 0x041c31c7 },
-	{ _MMIO(0x9888), 0x180f5000 },
-	{ _MMIO(0x9888), 0x1a0fa8aa },
-	{ _MMIO(0x9888), 0x1c0f0aaa },
-	{ _MMIO(0x9888), 0x182c8000 },
-	{ _MMIO(0x9888), 0x1c2c6aaa },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x1a2c2950 },
-	{ _MMIO(0x9888), 0x01938000 },
-	{ _MMIO(0x9888), 0x0f938000 },
-	{ _MMIO(0x9888), 0x1993aaaa },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25904000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x29904000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900420 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900400 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x45900001 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extended;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extended);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_l3_cache[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fffa },
-	{ _MMIO(0x2774), 0x0000fefe },
-	{ _MMIO(0x2778), 0x0007fffa },
-	{ _MMIO(0x277c), 0x0000fefd },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000fbef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000fbdf },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_l3_cache[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00101100 },
-	{ _MMIO(0xe45c), 0x00201200 },
-	{ _MMIO(0xe55c), 0x00301300 },
-	{ _MMIO(0xe65c), 0x00401400 },
-};
-
-static const struct i915_oa_reg mux_config_compute_l3_cache[] = {
-	{ _MMIO(0x9888), 0x166c03b0 },
-	{ _MMIO(0x9888), 0x1593001e },
-	{ _MMIO(0x9888), 0x3f900c00 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x002d1000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x082d5000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x0c2e0400 },
-	{ _MMIO(0x9888), 0x0e2e1500 },
-	{ _MMIO(0x9888), 0x102e0140 },
-	{ _MMIO(0x9888), 0x044c4000 },
-	{ _MMIO(0x9888), 0x0a4c8000 },
-	{ _MMIO(0x9888), 0x0c4cc000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x004e2000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084ea000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x1a4f4001 },
-	{ _MMIO(0x9888), 0x1c4f5005 },
-	{ _MMIO(0x9888), 0x006c0051 },
-	{ _MMIO(0x9888), 0x066c5000 },
-	{ _MMIO(0x9888), 0x086c5c5d },
-	{ _MMIO(0x9888), 0x0e6c5e5f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x146c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x180f1000 },
-	{ _MMIO(0x9888), 0x1a0fa800 },
-	{ _MMIO(0x9888), 0x1c0f0a00 },
-	{ _MMIO(0x9888), 0x182c4000 },
-	{ _MMIO(0x9888), 0x1c2c4015 },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x03931980 },
-	{ _MMIO(0x9888), 0x05930032 },
-	{ _MMIO(0x9888), 0x11930000 },
-	{ _MMIO(0x9888), 0x01938000 },
-	{ _MMIO(0x9888), 0x0f938000 },
-	{ _MMIO(0x9888), 0x1993a00a },
-	{ _MMIO(0x9888), 0x07930000 },
-	{ _MMIO(0x9888), 0x09930000 },
-	{ _MMIO(0x9888), 0x1d900177 },
-	{ _MMIO(0x9888), 0x1f900178 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x53901000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x55900111 },
-	{ _MMIO(0x9888), 0x47900001 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x4d900000 },
-	{ _MMIO(0x9888), 0x45900400 },
-};
-
-static int
-get_compute_l3_cache_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_l3_cache;
-	lens[n] = ARRAY_SIZE(mux_config_compute_l3_cache);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x104f0232 },
-	{ _MMIO(0x9888), 0x124f4640 },
-	{ _MMIO(0x9888), 0x11834400 },
-	{ _MMIO(0x9888), 0x022d4000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x0e2e0055 },
-	{ _MMIO(0x9888), 0x064c8000 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x024f6100 },
-	{ _MMIO(0x9888), 0x044f416b },
-	{ _MMIO(0x9888), 0x064f004b },
-	{ _MMIO(0x9888), 0x1a4f0000 },
-	{ _MMIO(0x9888), 0x1a0f02a8 },
-	{ _MMIO(0x9888), 0x1a2c5500 },
-	{ _MMIO(0x9888), 0x0f808000 },
-	{ _MMIO(0x9888), 0x25810020 },
-	{ _MMIO(0x9888), 0x0f8305c0 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x1f951000 },
-	{ _MMIO(0x9888), 0x13920200 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4d900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1[] = {
-	{ _MMIO(0x9888), 0x12643400 },
-	{ _MMIO(0x9888), 0x12653400 },
-	{ _MMIO(0x9888), 0x106c6800 },
-	{ _MMIO(0x9888), 0x126c001e },
-	{ _MMIO(0x9888), 0x166c0010 },
-	{ _MMIO(0x9888), 0x0c2d5000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x102e0154 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e0055 },
-	{ _MMIO(0x9888), 0x104c8000 },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x0c4ea000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x1c4f5500 },
-	{ _MMIO(0x9888), 0x1a4f1554 },
-	{ _MMIO(0x9888), 0x0a640024 },
-	{ _MMIO(0x9888), 0x10640000 },
-	{ _MMIO(0x9888), 0x04640000 },
-	{ _MMIO(0x9888), 0x0c650024 },
-	{ _MMIO(0x9888), 0x10650000 },
-	{ _MMIO(0x9888), 0x06650000 },
-	{ _MMIO(0x9888), 0x0c6c5327 },
-	{ _MMIO(0x9888), 0x0e6c5425 },
-	{ _MMIO(0x9888), 0x006c2a00 },
-	{ _MMIO(0x9888), 0x026c285b },
-	{ _MMIO(0x9888), 0x046c005c },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0900 },
-	{ _MMIO(0x9888), 0x1c0f0aa0 },
-	{ _MMIO(0x9888), 0x180f4000 },
-	{ _MMIO(0x9888), 0x1a0f02aa },
-	{ _MMIO(0x9888), 0x1c2c5400 },
-	{ _MMIO(0x9888), 0x1e2c0001 },
-	{ _MMIO(0x9888), 0x1a2c5550 },
-	{ _MMIO(0x9888), 0x1993aa00 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900421 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900420 },
-	{ _MMIO(0x9888), 0x45900021 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_1;
-	lens[n] = ARRAY_SIZE(mux_config_l3_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000efff },
-	{ _MMIO(0x2778), 0x00006000 },
-	{ _MMIO(0x277c), 0x0000f3ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x102d7800 },
-	{ _MMIO(0x9888), 0x122d79e0 },
-	{ _MMIO(0x9888), 0x0c2f0004 },
-	{ _MMIO(0x9888), 0x100e3800 },
-	{ _MMIO(0x9888), 0x180f0005 },
-	{ _MMIO(0x9888), 0x002d0940 },
-	{ _MMIO(0x9888), 0x022d802f },
-	{ _MMIO(0x9888), 0x042d4013 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x0e2e0050 },
-	{ _MMIO(0x9888), 0x022f0010 },
-	{ _MMIO(0x9888), 0x002f0000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x040e0480 },
-	{ _MMIO(0x9888), 0x000e0000 },
-	{ _MMIO(0x9888), 0x060f0027 },
-	{ _MMIO(0x9888), 0x100f0000 },
-	{ _MMIO(0x9888), 0x1a0f0040 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x439014a0 },
-	{ _MMIO(0x9888), 0x459000a4 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler[] = {
-	{ _MMIO(0x9888), 0x121300a0 },
-	{ _MMIO(0x9888), 0x141600ab },
-	{ _MMIO(0x9888), 0x123300a0 },
-	{ _MMIO(0x9888), 0x143600ab },
-	{ _MMIO(0x9888), 0x125300a0 },
-	{ _MMIO(0x9888), 0x145600ab },
-	{ _MMIO(0x9888), 0x0c2d4000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x102e01a0 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e0065 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x084c4000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x044e2000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x1c0f0800 },
-	{ _MMIO(0x9888), 0x180f4000 },
-	{ _MMIO(0x9888), 0x1a0f023f },
-	{ _MMIO(0x9888), 0x1e2c0003 },
-	{ _MMIO(0x9888), 0x1a2cc030 },
-	{ _MMIO(0x9888), 0x04132180 },
-	{ _MMIO(0x9888), 0x02130000 },
-	{ _MMIO(0x9888), 0x0c148000 },
-	{ _MMIO(0x9888), 0x0e142000 },
-	{ _MMIO(0x9888), 0x04148000 },
-	{ _MMIO(0x9888), 0x1e150140 },
-	{ _MMIO(0x9888), 0x1c150040 },
-	{ _MMIO(0x9888), 0x0c163000 },
-	{ _MMIO(0x9888), 0x0e160068 },
-	{ _MMIO(0x9888), 0x10160000 },
-	{ _MMIO(0x9888), 0x18160000 },
-	{ _MMIO(0x9888), 0x0a164000 },
-	{ _MMIO(0x9888), 0x04330043 },
-	{ _MMIO(0x9888), 0x02330000 },
-	{ _MMIO(0x9888), 0x0234a000 },
-	{ _MMIO(0x9888), 0x04342000 },
-	{ _MMIO(0x9888), 0x1c350015 },
-	{ _MMIO(0x9888), 0x02363460 },
-	{ _MMIO(0x9888), 0x10360000 },
-	{ _MMIO(0x9888), 0x04360000 },
-	{ _MMIO(0x9888), 0x06360000 },
-	{ _MMIO(0x9888), 0x08364000 },
-	{ _MMIO(0x9888), 0x06530043 },
-	{ _MMIO(0x9888), 0x02530000 },
-	{ _MMIO(0x9888), 0x0e548000 },
-	{ _MMIO(0x9888), 0x00548000 },
-	{ _MMIO(0x9888), 0x06542000 },
-	{ _MMIO(0x9888), 0x1e550400 },
-	{ _MMIO(0x9888), 0x1a552000 },
-	{ _MMIO(0x9888), 0x1c550100 },
-	{ _MMIO(0x9888), 0x0e563000 },
-	{ _MMIO(0x9888), 0x00563400 },
-	{ _MMIO(0x9888), 0x10560000 },
-	{ _MMIO(0x9888), 0x18560000 },
-	{ _MMIO(0x9888), 0x02560000 },
-	{ _MMIO(0x9888), 0x0c564000 },
-	{ _MMIO(0x9888), 0x1993a800 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b9014a0 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900001 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900820 },
-	{ _MMIO(0x9888), 0x45901022 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-};
-
-static int
-get_sampler_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler;
-	lens[n] = ARRAY_SIZE(mux_config_sampler);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x00007fff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x00009fff },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000efff },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000f3ff },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fdff },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x141a0000 },
-	{ _MMIO(0x9888), 0x143a0000 },
-	{ _MMIO(0x9888), 0x145a0000 },
-	{ _MMIO(0x9888), 0x0c2d4000 },
-	{ _MMIO(0x9888), 0x0e2d5000 },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x102e0150 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e006a },
-	{ _MMIO(0x9888), 0x124c8000 },
-	{ _MMIO(0x9888), 0x144c8000 },
-	{ _MMIO(0x9888), 0x164c2000 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064c4000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x0e4ea000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024e2000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x1c0f0bc0 },
-	{ _MMIO(0x9888), 0x180f4000 },
-	{ _MMIO(0x9888), 0x1a0f0302 },
-	{ _MMIO(0x9888), 0x1e2c0003 },
-	{ _MMIO(0x9888), 0x1a2c00f0 },
-	{ _MMIO(0x9888), 0x021a3080 },
-	{ _MMIO(0x9888), 0x041a31e5 },
-	{ _MMIO(0x9888), 0x02148000 },
-	{ _MMIO(0x9888), 0x0414a000 },
-	{ _MMIO(0x9888), 0x1c150054 },
-	{ _MMIO(0x9888), 0x06168000 },
-	{ _MMIO(0x9888), 0x08168000 },
-	{ _MMIO(0x9888), 0x0a168000 },
-	{ _MMIO(0x9888), 0x0c3a3280 },
-	{ _MMIO(0x9888), 0x0e3a0063 },
-	{ _MMIO(0x9888), 0x063a0061 },
-	{ _MMIO(0x9888), 0x023a0000 },
-	{ _MMIO(0x9888), 0x0c348000 },
-	{ _MMIO(0x9888), 0x0e342000 },
-	{ _MMIO(0x9888), 0x06342000 },
-	{ _MMIO(0x9888), 0x1e350140 },
-	{ _MMIO(0x9888), 0x1c350100 },
-	{ _MMIO(0x9888), 0x18360028 },
-	{ _MMIO(0x9888), 0x0c368000 },
-	{ _MMIO(0x9888), 0x0e5a3080 },
-	{ _MMIO(0x9888), 0x005a3280 },
-	{ _MMIO(0x9888), 0x025a0063 },
-	{ _MMIO(0x9888), 0x0e548000 },
-	{ _MMIO(0x9888), 0x00548000 },
-	{ _MMIO(0x9888), 0x02542000 },
-	{ _MMIO(0x9888), 0x1e550400 },
-	{ _MMIO(0x9888), 0x1a552000 },
-	{ _MMIO(0x9888), 0x1c550001 },
-	{ _MMIO(0x9888), 0x18560080 },
-	{ _MMIO(0x9888), 0x02568000 },
-	{ _MMIO(0x9888), 0x04568000 },
-	{ _MMIO(0x9888), 0x1993a800 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x2d904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900420 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4d900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x45901084 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900001 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x141a026b },
-	{ _MMIO(0x9888), 0x143a0173 },
-	{ _MMIO(0x9888), 0x145a026b },
-	{ _MMIO(0x9888), 0x002d4000 },
-	{ _MMIO(0x9888), 0x022d5000 },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x0c2e5000 },
-	{ _MMIO(0x9888), 0x0e2e0069 },
-	{ _MMIO(0x9888), 0x044c8000 },
-	{ _MMIO(0x9888), 0x064cc000 },
-	{ _MMIO(0x9888), 0x0a4c4000 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x024ea000 },
-	{ _MMIO(0x9888), 0x064e2000 },
-	{ _MMIO(0x9888), 0x180f6000 },
-	{ _MMIO(0x9888), 0x1a0f030a },
-	{ _MMIO(0x9888), 0x1a2c03c0 },
-	{ _MMIO(0x9888), 0x041a37e7 },
-	{ _MMIO(0x9888), 0x021a0000 },
-	{ _MMIO(0x9888), 0x0414a000 },
-	{ _MMIO(0x9888), 0x1c150050 },
-	{ _MMIO(0x9888), 0x08168000 },
-	{ _MMIO(0x9888), 0x0a168000 },
-	{ _MMIO(0x9888), 0x003a3380 },
-	{ _MMIO(0x9888), 0x063a006f },
-	{ _MMIO(0x9888), 0x023a0000 },
-	{ _MMIO(0x9888), 0x00348000 },
-	{ _MMIO(0x9888), 0x06342000 },
-	{ _MMIO(0x9888), 0x1a352000 },
-	{ _MMIO(0x9888), 0x1c350100 },
-	{ _MMIO(0x9888), 0x02368000 },
-	{ _MMIO(0x9888), 0x0c368000 },
-	{ _MMIO(0x9888), 0x025a37e7 },
-	{ _MMIO(0x9888), 0x0254a000 },
-	{ _MMIO(0x9888), 0x1c550005 },
-	{ _MMIO(0x9888), 0x04568000 },
-	{ _MMIO(0x9888), 0x06568000 },
-	{ _MMIO(0x9888), 0x03938000 },
-	{ _MMIO(0x9888), 0x05938000 },
-	{ _MMIO(0x9888), 0x07938000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17904000 },
-	{ _MMIO(0x9888), 0x19904000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900020 },
-	{ _MMIO(0x9888), 0x45901080 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x47900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extra[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extra[] = {
-	{ _MMIO(0xe458), 0x00001000 },
-	{ _MMIO(0xe558), 0x00003002 },
-	{ _MMIO(0xe658), 0x00005004 },
-	{ _MMIO(0xe758), 0x00011010 },
-	{ _MMIO(0xe45c), 0x00050012 },
-	{ _MMIO(0xe55c), 0x00052051 },
-	{ _MMIO(0xe65c), 0x00000008 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extra[] = {
-	{ _MMIO(0x9888), 0x141a001f },
-	{ _MMIO(0x9888), 0x143a001f },
-	{ _MMIO(0x9888), 0x145a001f },
-	{ _MMIO(0x9888), 0x042d5000 },
-	{ _MMIO(0x9888), 0x062d1000 },
-	{ _MMIO(0x9888), 0x0e2e0094 },
-	{ _MMIO(0x9888), 0x084cc000 },
-	{ _MMIO(0x9888), 0x044ea000 },
-	{ _MMIO(0x9888), 0x1a0f00e0 },
-	{ _MMIO(0x9888), 0x1a2c0c00 },
-	{ _MMIO(0x9888), 0x061a0063 },
-	{ _MMIO(0x9888), 0x021a0000 },
-	{ _MMIO(0x9888), 0x06142000 },
-	{ _MMIO(0x9888), 0x1c150100 },
-	{ _MMIO(0x9888), 0x0c168000 },
-	{ _MMIO(0x9888), 0x043a3180 },
-	{ _MMIO(0x9888), 0x023a0000 },
-	{ _MMIO(0x9888), 0x04348000 },
-	{ _MMIO(0x9888), 0x1c350040 },
-	{ _MMIO(0x9888), 0x0a368000 },
-	{ _MMIO(0x9888), 0x045a0063 },
-	{ _MMIO(0x9888), 0x025a0000 },
-	{ _MMIO(0x9888), 0x04542000 },
-	{ _MMIO(0x9888), 0x1c550010 },
-	{ _MMIO(0x9888), 0x08568000 },
-	{ _MMIO(0x9888), 0x09938000 },
-	{ _MMIO(0x9888), 0x0b938000 },
-	{ _MMIO(0x9888), 0x0d938000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1d904000 },
-	{ _MMIO(0x9888), 0x1f904000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900400 },
-	{ _MMIO(0x9888), 0x47900004 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_compute_extra_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extra;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extra);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2744), 0x00800000 },
@@ -1668,6 +60,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x00000080 },
 	{ _MMIO(0x9888), 0x19800000 },
 	{ _MMIO(0x9888), 0x07800063 },
 	{ _MMIO(0x9888), 0x11800000 },
@@ -1681,922 +74,35 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x33900000 },
 };
 
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_glk(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_L3_CACHE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_l3_cache_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_L3_CACHE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_l3_cache;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_l3_cache);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_l3_cache;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_l3_cache);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTRA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extra_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTRA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extra;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extra);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extra;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extra);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "d72df5c7-5b4a-4274-a43f-00b0fd51fc68",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "814285f6-354d-41d2-ba49-e24e622714a0",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "07d397a6-b3e6-49f6-9433-a4f293d55978",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "1a356946-5428-450b-a2f0-89f8783a302d",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "5299be9d-7a61-4c99-9f81-f87e6c5aaca9",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "bc9bcff2-459a-4cbc-986d-a84b077153f3",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_compute_l3_cache_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_L3_CACHE);
-}
-
-static struct device_attribute dev_attr_compute_l3_cache_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_l3_cache_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_l3_cache[] = {
-	&dev_attr_compute_l3_cache_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_l3_cache = {
-	.name = "88ec931f-5b4a-453a-9db6-a61232b6143d",
-	.attrs =  attrs_compute_l3_cache,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "530d176d-2a18-4014-adf8-1500c6c60835",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "fdee5a5a-f23c-43d1-aa73-f6257c71671d",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "6617623e-ca73-4791-b2b7-ddedd0846a0c",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER);
-}
-
-static struct device_attribute dev_attr_sampler_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler[] = {
-	&dev_attr_sampler_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler = {
-	.name = "f3b2ea63-e82e-4234-b418-44dd20dd34d0",
-	.attrs =  attrs_sampler,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "14411d35-cbf6-4f5e-b68b-190faf9a1a83",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "ffa3f263-0478-4724-8c9f-c911c5ec0f1d",
-	.attrs =  attrs_tdl_2,
-};
-
-static ssize_t
-show_compute_extra_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTRA);
-}
-
-static struct device_attribute dev_attr_compute_extra_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extra_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extra[] = {
-	&dev_attr_compute_extra_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extra = {
-	.name = "15274c82-27d2-4819-876a-7cb1a2c59ba4",
-	.attrs =  attrs_compute_extra,
-};
-
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "dd3fd789-e783-4204-8cd0-b671bbccb0cf",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_glk(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-		if (ret)
-			goto error_compute_l3_cache;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler);
-		if (ret)
-			goto error_sampler;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-		if (ret)
-			goto error_compute_extra;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-error_compute_extra:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-error_sampler:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-error_compute_l3_cache:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_glk(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_glk(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"dd3fd789-e783-4204-8cd0-b671bbccb0cf",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "dd3fd789-e783-4204-8cd0-b671bbccb0cf";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_glk.h b/drivers/gpu/drm/i915/i915_oa_glk.h
index 5511bb1..63bd113 100644
--- a/drivers/gpu/drm/i915/i915_oa_glk.h
+++ b/drivers/gpu/drm/i915/i915_oa_glk.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_GLK_H__
 #define __I915_OA_GLK_H__
 
-extern int i915_oa_n_builtin_metric_sets_glk;
-
-extern int i915_oa_select_metric_set_glk(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_glk(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_glk(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_glk(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_hsw.c b/drivers/gpu/drm/i915/i915_oa_hsw.c
index 10f169f..56b0377 100644
--- a/drivers/gpu/drm/i915/i915_oa_hsw.c
+++ b/drivers/gpu/drm/i915/i915_oa_hsw.c
@@ -31,17 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_hsw.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_SAMPLER_BALANCE,
-};
-
-int i915_oa_n_builtin_metric_sets_hsw = 6;
-
 static const struct i915_oa_reg b_counter_config_render_basic[] = {
 	{ _MMIO(0x2724), 0x00800000 },
 	{ _MMIO(0x2720), 0x00000000 },
@@ -53,6 +42,7 @@ static const struct i915_oa_reg flex_eu_config_render_basic[] = {
 };
 
 static const struct i915_oa_reg mux_config_render_basic[] = {
+	{ _MMIO(0x9840), 0x00000080 },
 	{ _MMIO(0x253a4), 0x01600000 },
 	{ _MMIO(0x25440), 0x00100000 },
 	{ _MMIO(0x25128), 0x00000000 },
@@ -114,750 +104,35 @@ static const struct i915_oa_reg mux_config_render_basic[] = {
 	{ _MMIO(0x25428), 0x00042049 },
 };
 
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_basic;
-	lens[n] = ARRAY_SIZE(mux_config_render_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2718), 0xaaaaaaaa },
-	{ _MMIO(0x271c), 0xaaaaaaaa },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2728), 0xaaaaaaaa },
-	{ _MMIO(0x272c), 0xaaaaaaaa },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00000000 },
-	{ _MMIO(0x2748), 0x00000000 },
-	{ _MMIO(0x274c), 0x00000000 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2754), 0x00000000 },
-	{ _MMIO(0x2758), 0x00000000 },
-	{ _MMIO(0x275c), 0x00000000 },
-	{ _MMIO(0x236c), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-};
-
-static const struct i915_oa_reg mux_config_compute_basic[] = {
-	{ _MMIO(0x253a4), 0x00000000 },
-	{ _MMIO(0x2681c), 0x01f00800 },
-	{ _MMIO(0x26820), 0x00001000 },
-	{ _MMIO(0x2781c), 0x01f00800 },
-	{ _MMIO(0x26520), 0x00000007 },
-	{ _MMIO(0x265a0), 0x00000007 },
-	{ _MMIO(0x25380), 0x00000010 },
-	{ _MMIO(0x2538c), 0x00300000 },
-	{ _MMIO(0x25384), 0xaa8aaaaa },
-	{ _MMIO(0x25404), 0xffffffff },
-	{ _MMIO(0x26800), 0x00004202 },
-	{ _MMIO(0x26808), 0x00605817 },
-	{ _MMIO(0x2680c), 0x10001005 },
-	{ _MMIO(0x26804), 0x00000000 },
-	{ _MMIO(0x27800), 0x00000102 },
-	{ _MMIO(0x27808), 0x0c0701e0 },
-	{ _MMIO(0x2780c), 0x000200a0 },
-	{ _MMIO(0x27804), 0x00000000 },
-	{ _MMIO(0x26484), 0x44000000 },
-	{ _MMIO(0x26704), 0x44000000 },
-	{ _MMIO(0x26500), 0x00000006 },
-	{ _MMIO(0x26510), 0x00000001 },
-	{ _MMIO(0x26504), 0x88000000 },
-	{ _MMIO(0x26580), 0x00000006 },
-	{ _MMIO(0x26590), 0x00000020 },
-	{ _MMIO(0x26584), 0x00000000 },
-	{ _MMIO(0x26104), 0x55822222 },
-	{ _MMIO(0x26184), 0xaa866666 },
-	{ _MMIO(0x25420), 0x08320c83 },
-	{ _MMIO(0x25424), 0x06820c83 },
-	{ _MMIO(0x2541c), 0x00000000 },
-	{ _MMIO(0x25428), 0x00000c03 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_basic;
-	lens[n] = ARRAY_SIZE(mux_config_compute_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fe2a },
-	{ _MMIO(0x2774), 0x0000ff00 },
-	{ _MMIO(0x2778), 0x0007fe6a },
-	{ _MMIO(0x277c), 0x0000ff00 },
-	{ _MMIO(0x2780), 0x0007fe92 },
-	{ _MMIO(0x2784), 0x0000ff00 },
-	{ _MMIO(0x2788), 0x0007fea2 },
-	{ _MMIO(0x278c), 0x0000ff00 },
-	{ _MMIO(0x2790), 0x0007fe32 },
-	{ _MMIO(0x2794), 0x0000ff00 },
-	{ _MMIO(0x2798), 0x0007fe9a },
-	{ _MMIO(0x279c), 0x0000ff00 },
-	{ _MMIO(0x27a0), 0x0007ff23 },
-	{ _MMIO(0x27a4), 0x0000ff00 },
-	{ _MMIO(0x27a8), 0x0007fff3 },
-	{ _MMIO(0x27ac), 0x0000fffe },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-};
-
-static const struct i915_oa_reg mux_config_compute_extended[] = {
-	{ _MMIO(0x2681c), 0x3eb00800 },
-	{ _MMIO(0x26820), 0x00900000 },
-	{ _MMIO(0x25384), 0x02aaaaaa },
-	{ _MMIO(0x25404), 0x03ffffff },
-	{ _MMIO(0x26800), 0x00142284 },
-	{ _MMIO(0x26808), 0x0e629062 },
-	{ _MMIO(0x2680c), 0x3f6f55cb },
-	{ _MMIO(0x26810), 0x00000014 },
-	{ _MMIO(0x26804), 0x00000000 },
-	{ _MMIO(0x26104), 0x02aaaaaa },
-	{ _MMIO(0x26184), 0x02aaaaaa },
-	{ _MMIO(0x25420), 0x00000000 },
-	{ _MMIO(0x25424), 0x00000000 },
-	{ _MMIO(0x2541c), 0x00000000 },
-	{ _MMIO(0x25428), 0x00000000 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extended;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extended);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x76543298 },
-	{ _MMIO(0x2748), 0x98989898 },
-	{ _MMIO(0x2744), 0x000000e4 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x98a98a98 },
-	{ _MMIO(0x2758), 0x88888888 },
-	{ _MMIO(0x2754), 0x000c5500 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fc00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fc00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fc00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fc00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fc00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fc00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fc00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fc00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-};
-
-static const struct i915_oa_reg mux_config_memory_reads[] = {
-	{ _MMIO(0x253a4), 0x34300000 },
-	{ _MMIO(0x25440), 0x2d800000 },
-	{ _MMIO(0x25444), 0x00000008 },
-	{ _MMIO(0x25128), 0x0e600000 },
-	{ _MMIO(0x25380), 0x00000450 },
-	{ _MMIO(0x25390), 0x00052c43 },
-	{ _MMIO(0x25384), 0x00000000 },
-	{ _MMIO(0x25400), 0x00006144 },
-	{ _MMIO(0x25408), 0x0a418820 },
-	{ _MMIO(0x2540c), 0x000820e6 },
-	{ _MMIO(0x25404), 0xff500000 },
-	{ _MMIO(0x25100), 0x000005d6 },
-	{ _MMIO(0x2510c), 0x0ef00000 },
-	{ _MMIO(0x25104), 0x00000000 },
-	{ _MMIO(0x25420), 0x02108421 },
-	{ _MMIO(0x25424), 0x00008421 },
-	{ _MMIO(0x2541c), 0x00000000 },
-	{ _MMIO(0x25428), 0x00000000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_reads;
-	lens[n] = ARRAY_SIZE(mux_config_memory_reads);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x76543298 },
-	{ _MMIO(0x2748), 0x98989898 },
-	{ _MMIO(0x2744), 0x000000e4 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0xbabababa },
-	{ _MMIO(0x2758), 0x88888888 },
-	{ _MMIO(0x2754), 0x000c5500 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fc00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fc00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fc00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fc00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fc00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fc00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fc00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fc00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-};
-
-static const struct i915_oa_reg mux_config_memory_writes[] = {
-	{ _MMIO(0x253a4), 0x34300000 },
-	{ _MMIO(0x25440), 0x01500000 },
-	{ _MMIO(0x25444), 0x00000120 },
-	{ _MMIO(0x25128), 0x0c200000 },
-	{ _MMIO(0x25380), 0x00000450 },
-	{ _MMIO(0x25390), 0x00052c43 },
-	{ _MMIO(0x25384), 0x00000000 },
-	{ _MMIO(0x25400), 0x00007184 },
-	{ _MMIO(0x25408), 0x0a418820 },
-	{ _MMIO(0x2540c), 0x000820e6 },
-	{ _MMIO(0x25404), 0xff500000 },
-	{ _MMIO(0x25100), 0x000005d6 },
-	{ _MMIO(0x2510c), 0x1e700000 },
-	{ _MMIO(0x25104), 0x00000000 },
-	{ _MMIO(0x25420), 0x02108421 },
-	{ _MMIO(0x25424), 0x00008421 },
-	{ _MMIO(0x2541c), 0x00000000 },
-	{ _MMIO(0x25428), 0x00000000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_writes;
-	lens[n] = ARRAY_SIZE(mux_config_memory_writes);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler_balance[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler_balance[] = {
-};
-
-static const struct i915_oa_reg mux_config_sampler_balance[] = {
-	{ _MMIO(0x2eb9c), 0x01906400 },
-	{ _MMIO(0x2fb9c), 0x01906400 },
-	{ _MMIO(0x253a4), 0x00000000 },
-	{ _MMIO(0x26b9c), 0x01906400 },
-	{ _MMIO(0x27b9c), 0x01906400 },
-	{ _MMIO(0x27104), 0x00a00000 },
-	{ _MMIO(0x27184), 0x00a50000 },
-	{ _MMIO(0x2e804), 0x00500000 },
-	{ _MMIO(0x2e984), 0x00500000 },
-	{ _MMIO(0x2eb04), 0x00500000 },
-	{ _MMIO(0x2eb80), 0x00000084 },
-	{ _MMIO(0x2eb8c), 0x14200000 },
-	{ _MMIO(0x2eb84), 0x00000000 },
-	{ _MMIO(0x2f804), 0x00050000 },
-	{ _MMIO(0x2f984), 0x00050000 },
-	{ _MMIO(0x2fb04), 0x00050000 },
-	{ _MMIO(0x2fb80), 0x00000084 },
-	{ _MMIO(0x2fb8c), 0x00050800 },
-	{ _MMIO(0x2fb84), 0x00000000 },
-	{ _MMIO(0x25380), 0x00000010 },
-	{ _MMIO(0x2538c), 0x000000c0 },
-	{ _MMIO(0x25384), 0xaa550000 },
-	{ _MMIO(0x25404), 0xffffc000 },
-	{ _MMIO(0x26804), 0x50000000 },
-	{ _MMIO(0x26984), 0x50000000 },
-	{ _MMIO(0x26b04), 0x50000000 },
-	{ _MMIO(0x26b80), 0x00000084 },
-	{ _MMIO(0x26b90), 0x00050800 },
-	{ _MMIO(0x26b84), 0x00000000 },
-	{ _MMIO(0x27804), 0x05000000 },
-	{ _MMIO(0x27984), 0x05000000 },
-	{ _MMIO(0x27b04), 0x05000000 },
-	{ _MMIO(0x27b80), 0x00000084 },
-	{ _MMIO(0x27b90), 0x00000142 },
-	{ _MMIO(0x27b84), 0x00000000 },
-	{ _MMIO(0x26104), 0xa0000000 },
-	{ _MMIO(0x26184), 0xa5000000 },
-	{ _MMIO(0x25424), 0x00008620 },
-	{ _MMIO(0x2541c), 0x00000000 },
-	{ _MMIO(0x25428), 0x0004a54a },
-};
-
-static int
-get_sampler_balance_mux_config(struct drm_i915_private *dev_priv,
-			       const struct i915_oa_reg **regs,
-			       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler_balance;
-	lens[n] = ARRAY_SIZE(mux_config_sampler_balance);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_hsw(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER_BALANCE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_balance_mux_config(dev_priv,
-						       dev_priv->perf.oa.mux_regs,
-						       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER_BALANCE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler_balance;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler_balance);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler_balance;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler_balance);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
 static ssize_t
 show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "403d8832-1a27-4aa6-a64e-f5389ce7b212",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "39ad14bc-2380-45c4-91eb-fbcb3aa7ae7b",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "3865be28-6982-49fe-9494-e4d1b4795413",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "bb5ed49b-2497-4095-94f6-26ba294db88a",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "3358d639-9b5f-45ab-976d-9b08cbfc6240",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_sampler_balance_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER_BALANCE);
-}
-
-static struct device_attribute dev_attr_sampler_balance_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_balance_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler_balance[] = {
-	&dev_attr_sampler_balance_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler_balance = {
-	.name = "bc274488-b4b6-40c7-90da-b77d7ad16189",
-	.attrs =  attrs_sampler_balance,
-};
-
-int
-i915_perf_register_sysfs_hsw(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_sampler_balance_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler_balance);
-		if (ret)
-			goto error_sampler_balance;
-	}
-
-	return 0;
-
-error_sampler_balance:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_hsw(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_hsw(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"403d8832-1a27-4aa6-a64e-f5389ce7b212",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_sampler_balance_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler_balance);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_render_basic;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_render_basic);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_render_basic;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_render_basic);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_render_basic;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_render_basic);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "403d8832-1a27-4aa6-a64e-f5389ce7b212";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_render_basic_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_hsw.h b/drivers/gpu/drm/i915/i915_oa_hsw.h
index 6fe7e06..74d0343 100644
--- a/drivers/gpu/drm/i915/i915_oa_hsw.h
+++ b/drivers/gpu/drm/i915/i915_oa_hsw.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_HSW_H__
 #define __I915_OA_HSW_H__
 
-extern int i915_oa_n_builtin_metric_sets_hsw;
-
-extern int i915_oa_select_metric_set_hsw(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_hsw(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_hsw(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_hsw(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_kblgt2.c b/drivers/gpu/drm/i915/i915_oa_kblgt2.c
index 87dbd0a..b6e7cc7 100644
--- a/drivers/gpu/drm/i915/i915_oa_kblgt2.c
+++ b/drivers/gpu/drm/i915/i915_oa_kblgt2.c
@@ -31,1828 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_kblgt2.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_COMPUTE_L3_CACHE,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_L3_2,
-	METRIC_SET_ID_L3_3,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_COMPUTE_EXTRA,
-	METRIC_SET_ID_VME_PIPE,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_kblgt2 = 18;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic[] = {
-	{ _MMIO(0x9888), 0x166c01e0 },
-	{ _MMIO(0x9888), 0x12170280 },
-	{ _MMIO(0x9888), 0x12370280 },
-	{ _MMIO(0x9888), 0x11930317 },
-	{ _MMIO(0x9888), 0x159303df },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x1a4e0080 },
-	{ _MMIO(0x9888), 0x0a6c0053 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x0a1b4000 },
-	{ _MMIO(0x9888), 0x1c1c0001 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x042f1000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c8400 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0d2000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f6600 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x162c2200 },
-	{ _MMIO(0x9888), 0x062d8000 },
-	{ _MMIO(0x9888), 0x082d8000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x08133000 },
-	{ _MMIO(0x9888), 0x00170020 },
-	{ _MMIO(0x9888), 0x08170021 },
-	{ _MMIO(0x9888), 0x10170000 },
-	{ _MMIO(0x9888), 0x0633c000 },
-	{ _MMIO(0x9888), 0x0833c000 },
-	{ _MMIO(0x9888), 0x06370800 },
-	{ _MMIO(0x9888), 0x08370840 },
-	{ _MMIO(0x9888), 0x10370000 },
-	{ _MMIO(0x9888), 0x0d933031 },
-	{ _MMIO(0x9888), 0x0f933e3f },
-	{ _MMIO(0x9888), 0x01933d00 },
-	{ _MMIO(0x9888), 0x0393073c },
-	{ _MMIO(0x9888), 0x0593000e },
-	{ _MMIO(0x9888), 0x1d930000 },
-	{ _MMIO(0x9888), 0x19930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x2f908000 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x15908000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190001f },
-	{ _MMIO(0x9888), 0x51904400 },
-	{ _MMIO(0x9888), 0x41900020 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c21 },
-	{ _MMIO(0x9888), 0x47900061 },
-	{ _MMIO(0x9888), 0x57904440 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900004 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x53904444 },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_basic;
-	lens[n] = ARRAY_SIZE(mux_config_render_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x124f1c00 },
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x1a4e0820 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x064f0900 },
-	{ _MMIO(0x9888), 0x084f0032 },
-	{ _MMIO(0x9888), 0x0a4f1891 },
-	{ _MMIO(0x9888), 0x0c4f0e00 },
-	{ _MMIO(0x9888), 0x0e4f003c },
-	{ _MMIO(0x9888), 0x004f0d80 },
-	{ _MMIO(0x9888), 0x024f003b },
-	{ _MMIO(0x9888), 0x006c0002 },
-	{ _MMIO(0x9888), 0x086c0100 },
-	{ _MMIO(0x9888), 0x0c6c000c },
-	{ _MMIO(0x9888), 0x0e6c0b00 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x081b8000 },
-	{ _MMIO(0x9888), 0x0c1b4000 },
-	{ _MMIO(0x9888), 0x0e1b8000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1c8000 },
-	{ _MMIO(0x9888), 0x1c1c0024 },
-	{ _MMIO(0x9888), 0x065b8000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5bc000 },
-	{ _MMIO(0x9888), 0x0c5b8000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x1a5c6000 },
-	{ _MMIO(0x9888), 0x1c5c001b },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2000 },
-	{ _MMIO(0x9888), 0x0c4c0208 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020d2000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2cc000 },
-	{ _MMIO(0x9888), 0x162cfb00 },
-	{ _MMIO(0x9888), 0x182c00be },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x19900157 },
-	{ _MMIO(0x9888), 0x1b900158 },
-	{ _MMIO(0x9888), 0x1d900105 },
-	{ _MMIO(0x9888), 0x1f900103 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x11900fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900821 },
-	{ _MMIO(0x9888), 0x47900802 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900802 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900002 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900422 },
-	{ _MMIO(0x9888), 0x53904444 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_basic;
-	lens[n] = ARRAY_SIZE(mux_config_compute_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile[] = {
-	{ _MMIO(0x9888), 0x0c0e001f },
-	{ _MMIO(0x9888), 0x0a0f0000 },
-	{ _MMIO(0x9888), 0x10116800 },
-	{ _MMIO(0x9888), 0x178a03e0 },
-	{ _MMIO(0x9888), 0x11824c00 },
-	{ _MMIO(0x9888), 0x11830020 },
-	{ _MMIO(0x9888), 0x13840020 },
-	{ _MMIO(0x9888), 0x11850019 },
-	{ _MMIO(0x9888), 0x11860007 },
-	{ _MMIO(0x9888), 0x01870c40 },
-	{ _MMIO(0x9888), 0x17880000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0a4c0040 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x040d4000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020e5400 },
-	{ _MMIO(0x9888), 0x000e0000 },
-	{ _MMIO(0x9888), 0x080f0040 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x100f0000 },
-	{ _MMIO(0x9888), 0x0e0f0040 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06110012 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x01898000 },
-	{ _MMIO(0x9888), 0x0d890100 },
-	{ _MMIO(0x9888), 0x03898000 },
-	{ _MMIO(0x9888), 0x09808000 },
-	{ _MMIO(0x9888), 0x0b808000 },
-	{ _MMIO(0x9888), 0x0380c000 },
-	{ _MMIO(0x9888), 0x0f8a0075 },
-	{ _MMIO(0x9888), 0x1d8a0000 },
-	{ _MMIO(0x9888), 0x118a8000 },
-	{ _MMIO(0x9888), 0x1b8a4000 },
-	{ _MMIO(0x9888), 0x138a8000 },
-	{ _MMIO(0x9888), 0x1d81a000 },
-	{ _MMIO(0x9888), 0x15818000 },
-	{ _MMIO(0x9888), 0x17818000 },
-	{ _MMIO(0x9888), 0x0b820030 },
-	{ _MMIO(0x9888), 0x07828000 },
-	{ _MMIO(0x9888), 0x0d824000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x05824000 },
-	{ _MMIO(0x9888), 0x0d830003 },
-	{ _MMIO(0x9888), 0x0583000c },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x03838000 },
-	{ _MMIO(0x9888), 0x07838000 },
-	{ _MMIO(0x9888), 0x0b840980 },
-	{ _MMIO(0x9888), 0x03844d80 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x09848000 },
-	{ _MMIO(0x9888), 0x09850080 },
-	{ _MMIO(0x9888), 0x03850003 },
-	{ _MMIO(0x9888), 0x01850000 },
-	{ _MMIO(0x9888), 0x07860000 },
-	{ _MMIO(0x9888), 0x0f860400 },
-	{ _MMIO(0x9888), 0x09870032 },
-	{ _MMIO(0x9888), 0x01888052 },
-	{ _MMIO(0x9888), 0x11880000 },
-	{ _MMIO(0x9888), 0x09884000 },
-	{ _MMIO(0x9888), 0x1b931001 },
-	{ _MMIO(0x9888), 0x1d930001 },
-	{ _MMIO(0x9888), 0x19934000 },
-	{ _MMIO(0x9888), 0x1b958000 },
-	{ _MMIO(0x9888), 0x1d950094 },
-	{ _MMIO(0x9888), 0x19958000 },
-	{ _MMIO(0x9888), 0x09e58000 },
-	{ _MMIO(0x9888), 0x0be58000 },
-	{ _MMIO(0x9888), 0x03e5c000 },
-	{ _MMIO(0x9888), 0x0592c000 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d924000 },
-	{ _MMIO(0x9888), 0x0f924000 },
-	{ _MMIO(0x9888), 0x11928000 },
-	{ _MMIO(0x9888), 0x1392c000 },
-	{ _MMIO(0x9888), 0x09924000 },
-	{ _MMIO(0x9888), 0x01985000 },
-	{ _MMIO(0x9888), 0x07988000 },
-	{ _MMIO(0x9888), 0x09981000 },
-	{ _MMIO(0x9888), 0x0b982000 },
-	{ _MMIO(0x9888), 0x0d982000 },
-	{ _MMIO(0x9888), 0x0f989000 },
-	{ _MMIO(0x9888), 0x05982000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1190c080 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900440 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900400 },
-	{ _MMIO(0x9888), 0x47900c21 },
-	{ _MMIO(0x9888), 0x57900400 },
-	{ _MMIO(0x9888), 0x49900042 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900024 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900841 },
-	{ _MMIO(0x9888), 0x53900400 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_pipe_profile;
-	lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900064 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900150 },
-	{ _MMIO(0x9888), 0x21900151 },
-	{ _MMIO(0x9888), 0x23900152 },
-	{ _MMIO(0x9888), 0x25900153 },
-	{ _MMIO(0x9888), 0x27900154 },
-	{ _MMIO(0x9888), 0x29900155 },
-	{ _MMIO(0x9888), 0x2b900156 },
-	{ _MMIO(0x9888), 0x2d900157 },
-	{ _MMIO(0x9888), 0x2f90015f },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_reads;
-	lens[n] = ARRAY_SIZE(mux_config_memory_reads);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901000 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900160 },
-	{ _MMIO(0x9888), 0x21900161 },
-	{ _MMIO(0x9888), 0x23900162 },
-	{ _MMIO(0x9888), 0x25900163 },
-	{ _MMIO(0x9888), 0x27900164 },
-	{ _MMIO(0x9888), 0x29900165 },
-	{ _MMIO(0x9888), 0x2b900166 },
-	{ _MMIO(0x9888), 0x2d900167 },
-	{ _MMIO(0x9888), 0x2f900150 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_writes;
-	lens[n] = ARRAY_SIZE(mux_config_memory_writes);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fc2a },
-	{ _MMIO(0x2774), 0x0000bf00 },
-	{ _MMIO(0x2778), 0x0007fc6a },
-	{ _MMIO(0x277c), 0x0000bf00 },
-	{ _MMIO(0x2780), 0x0007fc92 },
-	{ _MMIO(0x2784), 0x0000bf00 },
-	{ _MMIO(0x2788), 0x0007fca2 },
-	{ _MMIO(0x278c), 0x0000bf00 },
-	{ _MMIO(0x2790), 0x0007fc32 },
-	{ _MMIO(0x2794), 0x0000bf00 },
-	{ _MMIO(0x2798), 0x0007fc9a },
-	{ _MMIO(0x279c), 0x0000bf00 },
-	{ _MMIO(0x27a0), 0x0007fe6a },
-	{ _MMIO(0x27a4), 0x0000bf00 },
-	{ _MMIO(0x27a8), 0x0007fe7a },
-	{ _MMIO(0x27ac), 0x0000bf00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended[] = {
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x141c8160 },
-	{ _MMIO(0x9888), 0x161c8015 },
-	{ _MMIO(0x9888), 0x181c0120 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4eaaa0 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0e6c0b01 },
-	{ _MMIO(0x9888), 0x006c0200 },
-	{ _MMIO(0x9888), 0x026c000c },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x001c0041 },
-	{ _MMIO(0x9888), 0x061c4200 },
-	{ _MMIO(0x9888), 0x081c4443 },
-	{ _MMIO(0x9888), 0x0a1c4645 },
-	{ _MMIO(0x9888), 0x0c1c7647 },
-	{ _MMIO(0x9888), 0x041c7357 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x101c0000 },
-	{ _MMIO(0x9888), 0x1a1c0000 },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4caa2a },
-	{ _MMIO(0x9888), 0x0c4c02aa },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5515 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x11907fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900040 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900802 },
-	{ _MMIO(0x9888), 0x47900842 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900842 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900800 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extended;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extended);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_l3_cache[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fffa },
-	{ _MMIO(0x2774), 0x0000fefe },
-	{ _MMIO(0x2778), 0x0007fffa },
-	{ _MMIO(0x277c), 0x0000fefd },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000fbef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000fbdf },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_l3_cache[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00101100 },
-	{ _MMIO(0xe45c), 0x00201200 },
-	{ _MMIO(0xe55c), 0x00301300 },
-	{ _MMIO(0xe65c), 0x00401400 },
-};
-
-static const struct i915_oa_reg mux_config_compute_l3_cache[] = {
-	{ _MMIO(0x9888), 0x166c0760 },
-	{ _MMIO(0x9888), 0x1593001e },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4e8020 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x006c0051 },
-	{ _MMIO(0x9888), 0x066c5000 },
-	{ _MMIO(0x9888), 0x086c5c5d },
-	{ _MMIO(0x9888), 0x0e6c5e5f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x061b8000 },
-	{ _MMIO(0x9888), 0x081bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1ce000 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2a00 },
-	{ _MMIO(0x9888), 0x0c4c0280 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f1500 },
-	{ _MMIO(0x9888), 0x100f0140 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162c0a00 },
-	{ _MMIO(0x9888), 0x182c00a0 },
-	{ _MMIO(0x9888), 0x03933300 },
-	{ _MMIO(0x9888), 0x05930032 },
-	{ _MMIO(0x9888), 0x11930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190030f },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900021 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x53904444 },
-	{ _MMIO(0x9888), 0x43900000 },
-};
-
-static int
-get_compute_l3_cache_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_l3_cache;
-	lens[n] = ARRAY_SIZE(mux_config_compute_l3_cache);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x104f0232 },
-	{ _MMIO(0x9888), 0x124f4640 },
-	{ _MMIO(0x9888), 0x106c0232 },
-	{ _MMIO(0x9888), 0x11834400 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x004f1880 },
-	{ _MMIO(0x9888), 0x024f08bb },
-	{ _MMIO(0x9888), 0x044f001b },
-	{ _MMIO(0x9888), 0x046c0100 },
-	{ _MMIO(0x9888), 0x066c000b },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x041b8000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025bc000 },
-	{ _MMIO(0x9888), 0x045b4000 },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x165c8000 },
-	{ _MMIO(0x9888), 0x185c8000 },
-	{ _MMIO(0x9888), 0x0a4c00a0 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x062cc000 },
-	{ _MMIO(0x9888), 0x082cc000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x0f8305c0 },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x07830000 },
-	{ _MMIO(0x9888), 0x1d950080 },
-	{ _MMIO(0x9888), 0x13928000 },
-	{ _MMIO(0x9888), 0x0f988000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900040 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1[] = {
-	{ _MMIO(0x9888), 0x126c7b40 },
-	{ _MMIO(0x9888), 0x166c0020 },
-	{ _MMIO(0x9888), 0x0a603444 },
-	{ _MMIO(0x9888), 0x0a613400 },
-	{ _MMIO(0x9888), 0x1a4ea800 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x0c6c5327 },
-	{ _MMIO(0x9888), 0x0e6c5425 },
-	{ _MMIO(0x9888), 0x006c2a00 },
-	{ _MMIO(0x9888), 0x026c285b },
-	{ _MMIO(0x9888), 0x046c005c },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0800 },
-	{ _MMIO(0x9888), 0x0c1bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x1c1c003c },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x10600000 },
-	{ _MMIO(0x9888), 0x04600000 },
-	{ _MMIO(0x9888), 0x0c610044 },
-	{ _MMIO(0x9888), 0x10610000 },
-	{ _MMIO(0x9888), 0x06610000 },
-	{ _MMIO(0x9888), 0x0c4c02a8 },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0154 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190ffc0 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900420 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900021 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900400 },
-	{ _MMIO(0x9888), 0x43900421 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_1;
-	lens[n] = ARRAY_SIZE(mux_config_l3_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_2[] = {
-	{ _MMIO(0x9888), 0x126c02e0 },
-	{ _MMIO(0x9888), 0x146c0001 },
-	{ _MMIO(0x9888), 0x0a623400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x026c3324 },
-	{ _MMIO(0x9888), 0x046c3422 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x06614000 },
-	{ _MMIO(0x9888), 0x0c620044 },
-	{ _MMIO(0x9888), 0x10620000 },
-	{ _MMIO(0x9888), 0x06620000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_2_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_2;
-	lens[n] = ARRAY_SIZE(mux_config_l3_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_3[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_3[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_3[] = {
-	{ _MMIO(0x9888), 0x126c4e80 },
-	{ _MMIO(0x9888), 0x146c0000 },
-	{ _MMIO(0x9888), 0x0a633400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x026c3321 },
-	{ _MMIO(0x9888), 0x046c342f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c2000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x06604000 },
-	{ _MMIO(0x9888), 0x0c630044 },
-	{ _MMIO(0x9888), 0x10630000 },
-	{ _MMIO(0x9888), 0x06630000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c00aa },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900002 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_3_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_3;
-	lens[n] = ARRAY_SIZE(mux_config_l3_3);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000efff },
-	{ _MMIO(0x2778), 0x00006000 },
-	{ _MMIO(0x277c), 0x0000f3ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x102f3800 },
-	{ _MMIO(0x9888), 0x144d0500 },
-	{ _MMIO(0x9888), 0x120d03c0 },
-	{ _MMIO(0x9888), 0x140d03cf },
-	{ _MMIO(0x9888), 0x0c0f0004 },
-	{ _MMIO(0x9888), 0x0c4e4000 },
-	{ _MMIO(0x9888), 0x042f0480 },
-	{ _MMIO(0x9888), 0x082f0000 },
-	{ _MMIO(0x9888), 0x022f0000 },
-	{ _MMIO(0x9888), 0x0a4c0090 },
-	{ _MMIO(0x9888), 0x064d0027 },
-	{ _MMIO(0x9888), 0x004d0000 },
-	{ _MMIO(0x9888), 0x000d0d40 },
-	{ _MMIO(0x9888), 0x020d803f },
-	{ _MMIO(0x9888), 0x040d8023 },
-	{ _MMIO(0x9888), 0x100d0000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020f0010 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x0e0f0050 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41901400 },
-	{ _MMIO(0x9888), 0x43901485 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler[] = {
-	{ _MMIO(0x9888), 0x14152c00 },
-	{ _MMIO(0x9888), 0x16150005 },
-	{ _MMIO(0x9888), 0x121600a0 },
-	{ _MMIO(0x9888), 0x14352c00 },
-	{ _MMIO(0x9888), 0x16350005 },
-	{ _MMIO(0x9888), 0x123600a0 },
-	{ _MMIO(0x9888), 0x14552c00 },
-	{ _MMIO(0x9888), 0x16550005 },
-	{ _MMIO(0x9888), 0x125600a0 },
-	{ _MMIO(0x9888), 0x062f6000 },
-	{ _MMIO(0x9888), 0x022f2000 },
-	{ _MMIO(0x9888), 0x0c4c0050 },
-	{ _MMIO(0x9888), 0x0a4c0010 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0350 },
-	{ _MMIO(0x9888), 0x0c0fb000 },
-	{ _MMIO(0x9888), 0x0e0f00da },
-	{ _MMIO(0x9888), 0x182c0028 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x022dc000 },
-	{ _MMIO(0x9888), 0x042d4000 },
-	{ _MMIO(0x9888), 0x0c138000 },
-	{ _MMIO(0x9888), 0x0e132000 },
-	{ _MMIO(0x9888), 0x0413c000 },
-	{ _MMIO(0x9888), 0x1c140018 },
-	{ _MMIO(0x9888), 0x0c157000 },
-	{ _MMIO(0x9888), 0x0e150078 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x04162180 },
-	{ _MMIO(0x9888), 0x02160000 },
-	{ _MMIO(0x9888), 0x04174000 },
-	{ _MMIO(0x9888), 0x0233a000 },
-	{ _MMIO(0x9888), 0x04333000 },
-	{ _MMIO(0x9888), 0x14348000 },
-	{ _MMIO(0x9888), 0x16348000 },
-	{ _MMIO(0x9888), 0x02357870 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x04360043 },
-	{ _MMIO(0x9888), 0x02360000 },
-	{ _MMIO(0x9888), 0x04371000 },
-	{ _MMIO(0x9888), 0x0e538000 },
-	{ _MMIO(0x9888), 0x00538000 },
-	{ _MMIO(0x9888), 0x06533000 },
-	{ _MMIO(0x9888), 0x1c540020 },
-	{ _MMIO(0x9888), 0x12548000 },
-	{ _MMIO(0x9888), 0x0e557000 },
-	{ _MMIO(0x9888), 0x00557800 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x06560043 },
-	{ _MMIO(0x9888), 0x02560000 },
-	{ _MMIO(0x9888), 0x06571000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900060 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900060 },
-};
-
-static int
-get_sampler_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler;
-	lens[n] = ARRAY_SIZE(mux_config_sampler);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x00007fff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x00009fff },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000efff },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000f3ff },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fdff },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x12120000 },
-	{ _MMIO(0x9888), 0x12320000 },
-	{ _MMIO(0x9888), 0x12520000 },
-	{ _MMIO(0x9888), 0x002f8000 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0015 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f03a0 },
-	{ _MMIO(0x9888), 0x0c0ff000 },
-	{ _MMIO(0x9888), 0x0e0f0095 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2d8000 },
-	{ _MMIO(0x9888), 0x0e2d4000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x02108000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x02118000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x02121880 },
-	{ _MMIO(0x9888), 0x041219b5 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x02134000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x0c308000 },
-	{ _MMIO(0x9888), 0x0e304000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x0c318000 },
-	{ _MMIO(0x9888), 0x0e314000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x0c321a80 },
-	{ _MMIO(0x9888), 0x0e320033 },
-	{ _MMIO(0x9888), 0x06320031 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x0c334000 },
-	{ _MMIO(0x9888), 0x0e331000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0e508000 },
-	{ _MMIO(0x9888), 0x00508000 },
-	{ _MMIO(0x9888), 0x02504000 },
-	{ _MMIO(0x9888), 0x0e518000 },
-	{ _MMIO(0x9888), 0x00518000 },
-	{ _MMIO(0x9888), 0x02514000 },
-	{ _MMIO(0x9888), 0x0e521880 },
-	{ _MMIO(0x9888), 0x00521a80 },
-	{ _MMIO(0x9888), 0x02520033 },
-	{ _MMIO(0x9888), 0x0e534000 },
-	{ _MMIO(0x9888), 0x00534000 },
-	{ _MMIO(0x9888), 0x02531000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900062 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x12124d60 },
-	{ _MMIO(0x9888), 0x12322e60 },
-	{ _MMIO(0x9888), 0x12524d60 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0014 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0fe000 },
-	{ _MMIO(0x9888), 0x0e0f0097 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x002d8000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x04121fb7 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x00308000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x00318000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x00321b80 },
-	{ _MMIO(0x9888), 0x0632003f },
-	{ _MMIO(0x9888), 0x00334000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0250c000 },
-	{ _MMIO(0x9888), 0x0251c000 },
-	{ _MMIO(0x9888), 0x02521fb7 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x02535000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900063 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extra[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extra[] = {
-	{ _MMIO(0xe458), 0x00001000 },
-	{ _MMIO(0xe558), 0x00003002 },
-	{ _MMIO(0xe658), 0x00005004 },
-	{ _MMIO(0xe758), 0x00011010 },
-	{ _MMIO(0xe45c), 0x00050012 },
-	{ _MMIO(0xe55c), 0x00052051 },
-	{ _MMIO(0xe65c), 0x00000008 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extra[] = {
-	{ _MMIO(0x9888), 0x121203e0 },
-	{ _MMIO(0x9888), 0x123203e0 },
-	{ _MMIO(0x9888), 0x125203e0 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0a4c0040 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0e0f006c },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x042d8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06114000 },
-	{ _MMIO(0x9888), 0x06120033 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x04308000 },
-	{ _MMIO(0x9888), 0x04318000 },
-	{ _MMIO(0x9888), 0x04321980 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x04334000 },
-	{ _MMIO(0x9888), 0x04504000 },
-	{ _MMIO(0x9888), 0x04514000 },
-	{ _MMIO(0x9888), 0x04520033 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x04531000 },
-	{ _MMIO(0x9888), 0x1190e000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900c00 },
-	{ _MMIO(0x9888), 0x45900002 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_compute_extra_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extra;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extra);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_vme_pipe[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00100030 },
-	{ _MMIO(0x2774), 0x0000fff9 },
-	{ _MMIO(0x2778), 0x00000002 },
-	{ _MMIO(0x277c), 0x0000fffc },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000fff3 },
-	{ _MMIO(0x2788), 0x00100180 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000ffcf },
-	{ _MMIO(0x2798), 0x00000002 },
-	{ _MMIO(0x279c), 0x0000ff3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_vme_pipe[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00008003 },
-};
-
-static const struct i915_oa_reg mux_config_vme_pipe[] = {
-	{ _MMIO(0x9888), 0x141a5800 },
-	{ _MMIO(0x9888), 0x161a00c0 },
-	{ _MMIO(0x9888), 0x12180240 },
-	{ _MMIO(0x9888), 0x14180002 },
-	{ _MMIO(0x9888), 0x143a5800 },
-	{ _MMIO(0x9888), 0x163a00c0 },
-	{ _MMIO(0x9888), 0x12380240 },
-	{ _MMIO(0x9888), 0x14380002 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x022f8000 },
-	{ _MMIO(0x9888), 0x042f3000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c1500 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f9500 },
-	{ _MMIO(0x9888), 0x100f002a },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162c0a00 },
-	{ _MMIO(0x9888), 0x0a2dc000 },
-	{ _MMIO(0x9888), 0x0c2dc000 },
-	{ _MMIO(0x9888), 0x04193000 },
-	{ _MMIO(0x9888), 0x081a28c1 },
-	{ _MMIO(0x9888), 0x001a0000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x0613c000 },
-	{ _MMIO(0x9888), 0x0813f000 },
-	{ _MMIO(0x9888), 0x00172000 },
-	{ _MMIO(0x9888), 0x06178000 },
-	{ _MMIO(0x9888), 0x0817a000 },
-	{ _MMIO(0x9888), 0x00180037 },
-	{ _MMIO(0x9888), 0x06180940 },
-	{ _MMIO(0x9888), 0x08180000 },
-	{ _MMIO(0x9888), 0x02180000 },
-	{ _MMIO(0x9888), 0x04183000 },
-	{ _MMIO(0x9888), 0x06393000 },
-	{ _MMIO(0x9888), 0x0c3a28c1 },
-	{ _MMIO(0x9888), 0x003a0000 },
-	{ _MMIO(0x9888), 0x0a33f000 },
-	{ _MMIO(0x9888), 0x0c33f000 },
-	{ _MMIO(0x9888), 0x0a37a000 },
-	{ _MMIO(0x9888), 0x0c37a000 },
-	{ _MMIO(0x9888), 0x0a380977 },
-	{ _MMIO(0x9888), 0x08380000 },
-	{ _MMIO(0x9888), 0x04380000 },
-	{ _MMIO(0x9888), 0x06383000 },
-	{ _MMIO(0x9888), 0x119000ff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900040 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900800 },
-	{ _MMIO(0x9888), 0x47901000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900844 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_vme_pipe_mux_config(struct drm_i915_private *dev_priv,
-			const struct i915_oa_reg **regs,
-			int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_vme_pipe;
-	lens[n] = ARRAY_SIZE(mux_config_vme_pipe);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2744), 0x00800000 },
@@ -1882,6 +60,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x00000080 },
 	{ _MMIO(0x9888), 0x11810000 },
 	{ _MMIO(0x9888), 0x07810013 },
 	{ _MMIO(0x9888), 0x1f810000 },
@@ -1896,1096 +75,35 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x33900000 },
 };
 
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_kblgt2(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_L3_CACHE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_l3_cache_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_L3_CACHE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_l3_cache;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_l3_cache);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_l3_cache;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_l3_cache);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_L3_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_2_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_2);
-
-		return 0;
-	case METRIC_SET_ID_L3_3:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_3_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_3\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_3;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_3);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_3;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_3);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTRA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extra_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTRA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extra;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extra);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extra;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extra);
-
-		return 0;
-	case METRIC_SET_ID_VME_PIPE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_vme_pipe_mux_config(dev_priv,
-						dev_priv->perf.oa.mux_regs,
-						dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"VME_PIPE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_vme_pipe;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_vme_pipe);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_vme_pipe;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_vme_pipe);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "f8d677e9-ff6f-4df1-9310-0334c6efacce",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "e17fc42a-e614-41b6-90c4-1074841a6c77",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "d7a17a3a-ca71-40d2-a919-ace80d50633f",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "57b59202-172b-477a-87de-33f85572c589",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "3addf8ef-8e9b-40f5-a448-3dbb5d5128b0",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "4af0400a-81c3-47db-a6b6-deddbd75680e",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_compute_l3_cache_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_L3_CACHE);
-}
-
-static struct device_attribute dev_attr_compute_l3_cache_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_l3_cache_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_l3_cache[] = {
-	&dev_attr_compute_l3_cache_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_l3_cache = {
-	.name = "0e22f995-79ca-4f67-83ab-e9d9772488d8",
-	.attrs =  attrs_compute_l3_cache,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "bc2a00f7-cb8a-4ff2-8ad0-e241dad16937",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "d2bbe790-f058-42d9-81c6-cdedcf655bc2",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_l3_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_2);
-}
-
-static struct device_attribute dev_attr_l3_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_2[] = {
-	&dev_attr_l3_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_2 = {
-	.name = "2f8e32e4-5956-46e2-af31-c8ea95887332",
-	.attrs =  attrs_l3_2,
-};
-
-static ssize_t
-show_l3_3_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_3);
-}
-
-static struct device_attribute dev_attr_l3_3_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_3_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_3[] = {
-	&dev_attr_l3_3_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_3 = {
-	.name = "ca046aad-b5fb-4101-adce-6473ee6e5b14",
-	.attrs =  attrs_l3_3,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "605f388f-24bb-455c-88e3-8d57ae0d7e9f",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER);
-}
-
-static struct device_attribute dev_attr_sampler_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler[] = {
-	&dev_attr_sampler_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler = {
-	.name = "31dd157c-bf4e-4bab-bf2b-f5c8174af1af",
-	.attrs =  attrs_sampler,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "105db928-5542-466b-9128-e1f3c91426cb",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "03db94d2-b37f-4c58-a791-0d2067b013bb",
-	.attrs =  attrs_tdl_2,
-};
-
-static ssize_t
-show_compute_extra_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTRA);
-}
-
-static struct device_attribute dev_attr_compute_extra_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extra_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extra[] = {
-	&dev_attr_compute_extra_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extra = {
-	.name = "aa7a3fb9-22fb-43ff-a32d-0ab6c13bbd16",
-	.attrs =  attrs_compute_extra,
-};
-
-static ssize_t
-show_vme_pipe_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_VME_PIPE);
-}
-
-static struct device_attribute dev_attr_vme_pipe_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_vme_pipe_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_vme_pipe[] = {
-	&dev_attr_vme_pipe_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_vme_pipe = {
-	.name = "398a4268-ef6f-4ffc-b55f-3c7b5363ce61",
-	.attrs =  attrs_vme_pipe,
-};
-
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "baa3c7e4-52b6-4b85-801e-465a94b746dd",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_kblgt2(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-		if (ret)
-			goto error_compute_l3_cache;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-		if (ret)
-			goto error_l3_2;
-	}
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-		if (ret)
-			goto error_l3_3;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler);
-		if (ret)
-			goto error_sampler;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-		if (ret)
-			goto error_compute_extra;
-	}
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-		if (ret)
-			goto error_vme_pipe;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-error_vme_pipe:
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-error_compute_extra:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-error_sampler:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-error_l3_3:
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-error_l3_2:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-error_compute_l3_cache:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_kblgt2(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_kblgt2(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"baa3c7e4-52b6-4b85-801e-465a94b746dd",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "baa3c7e4-52b6-4b85-801e-465a94b746dd";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_kblgt2.h b/drivers/gpu/drm/i915/i915_oa_kblgt2.h
index 7e61bfc..25b80354 100644
--- a/drivers/gpu/drm/i915/i915_oa_kblgt2.h
+++ b/drivers/gpu/drm/i915/i915_oa_kblgt2.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_KBLGT2_H__
 #define __I915_OA_KBLGT2_H__
 
-extern int i915_oa_n_builtin_metric_sets_kblgt2;
-
-extern int i915_oa_select_metric_set_kblgt2(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_kblgt2(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_kblgt2(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_kblgt2(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_kblgt3.c b/drivers/gpu/drm/i915/i915_oa_kblgt3.c
index 6ed0925..5576afd 100644
--- a/drivers/gpu/drm/i915/i915_oa_kblgt3.c
+++ b/drivers/gpu/drm/i915/i915_oa_kblgt3.c
@@ -31,1877 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_kblgt3.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_COMPUTE_L3_CACHE,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_L3_2,
-	METRIC_SET_ID_L3_3,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_COMPUTE_EXTRA,
-	METRIC_SET_ID_VME_PIPE,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_kblgt3 = 18;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic[] = {
-	{ _MMIO(0x9888), 0x166c01e0 },
-	{ _MMIO(0x9888), 0x12170280 },
-	{ _MMIO(0x9888), 0x12370280 },
-	{ _MMIO(0x9888), 0x16ec01e0 },
-	{ _MMIO(0x9888), 0x11930317 },
-	{ _MMIO(0x9888), 0x159303df },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x1a4e0380 },
-	{ _MMIO(0x9888), 0x0a6c0053 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x0a1b4000 },
-	{ _MMIO(0x9888), 0x1c1c0001 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x042f1000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c8400 },
-	{ _MMIO(0x9888), 0x0c4c0002 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f6600 },
-	{ _MMIO(0x9888), 0x100f0001 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x162ca200 },
-	{ _MMIO(0x9888), 0x062d8000 },
-	{ _MMIO(0x9888), 0x082d8000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x08133000 },
-	{ _MMIO(0x9888), 0x00170020 },
-	{ _MMIO(0x9888), 0x08170021 },
-	{ _MMIO(0x9888), 0x10170000 },
-	{ _MMIO(0x9888), 0x0633c000 },
-	{ _MMIO(0x9888), 0x0833c000 },
-	{ _MMIO(0x9888), 0x06370800 },
-	{ _MMIO(0x9888), 0x08370840 },
-	{ _MMIO(0x9888), 0x10370000 },
-	{ _MMIO(0x9888), 0x1ace0200 },
-	{ _MMIO(0x9888), 0x0aec5300 },
-	{ _MMIO(0x9888), 0x10ec0000 },
-	{ _MMIO(0x9888), 0x1cec0000 },
-	{ _MMIO(0x9888), 0x0a9b8000 },
-	{ _MMIO(0x9888), 0x1c9c0002 },
-	{ _MMIO(0x9888), 0x0ccc0002 },
-	{ _MMIO(0x9888), 0x0a8d8000 },
-	{ _MMIO(0x9888), 0x108f0001 },
-	{ _MMIO(0x9888), 0x16ac8000 },
-	{ _MMIO(0x9888), 0x0d933031 },
-	{ _MMIO(0x9888), 0x0f933e3f },
-	{ _MMIO(0x9888), 0x01933d00 },
-	{ _MMIO(0x9888), 0x0393073c },
-	{ _MMIO(0x9888), 0x0593000e },
-	{ _MMIO(0x9888), 0x1d930000 },
-	{ _MMIO(0x9888), 0x19930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x2f908000 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x15908000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190003f },
-	{ _MMIO(0x9888), 0x51902240 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x55900242 },
-	{ _MMIO(0x9888), 0x45900084 },
-	{ _MMIO(0x9888), 0x47901400 },
-	{ _MMIO(0x9888), 0x57902220 },
-	{ _MMIO(0x9888), 0x49900c60 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900002 },
-	{ _MMIO(0x9888), 0x43900c63 },
-	{ _MMIO(0x9888), 0x53902222 },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_basic;
-	lens[n] = ARRAY_SIZE(mux_config_render_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x124f1c00 },
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x1a4e0820 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x064f0900 },
-	{ _MMIO(0x9888), 0x084f0032 },
-	{ _MMIO(0x9888), 0x0a4f1891 },
-	{ _MMIO(0x9888), 0x0c4f0e00 },
-	{ _MMIO(0x9888), 0x0e4f003c },
-	{ _MMIO(0x9888), 0x004f0d80 },
-	{ _MMIO(0x9888), 0x024f003b },
-	{ _MMIO(0x9888), 0x006c0002 },
-	{ _MMIO(0x9888), 0x086c0100 },
-	{ _MMIO(0x9888), 0x0c6c000c },
-	{ _MMIO(0x9888), 0x0e6c0b00 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x081b8000 },
-	{ _MMIO(0x9888), 0x0c1b4000 },
-	{ _MMIO(0x9888), 0x0e1b8000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1c8000 },
-	{ _MMIO(0x9888), 0x1c1c0024 },
-	{ _MMIO(0x9888), 0x065b8000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5bc000 },
-	{ _MMIO(0x9888), 0x0c5b8000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x1a5c6000 },
-	{ _MMIO(0x9888), 0x1c5c001b },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2000 },
-	{ _MMIO(0x9888), 0x0c4c0208 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020d2000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2cc000 },
-	{ _MMIO(0x9888), 0x162cfb00 },
-	{ _MMIO(0x9888), 0x182c00be },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x19900157 },
-	{ _MMIO(0x9888), 0x1b900158 },
-	{ _MMIO(0x9888), 0x1d900105 },
-	{ _MMIO(0x9888), 0x1f900103 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x11900fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900821 },
-	{ _MMIO(0x9888), 0x47900802 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900802 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900002 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900422 },
-	{ _MMIO(0x9888), 0x53904444 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_basic;
-	lens[n] = ARRAY_SIZE(mux_config_compute_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile[] = {
-	{ _MMIO(0x9888), 0x0c0e001f },
-	{ _MMIO(0x9888), 0x0a0f0000 },
-	{ _MMIO(0x9888), 0x10116800 },
-	{ _MMIO(0x9888), 0x178a03e0 },
-	{ _MMIO(0x9888), 0x11824c00 },
-	{ _MMIO(0x9888), 0x11830020 },
-	{ _MMIO(0x9888), 0x13840020 },
-	{ _MMIO(0x9888), 0x11850019 },
-	{ _MMIO(0x9888), 0x11860007 },
-	{ _MMIO(0x9888), 0x01870c40 },
-	{ _MMIO(0x9888), 0x17880000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0a4c0040 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x040d4000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020e5400 },
-	{ _MMIO(0x9888), 0x000e0000 },
-	{ _MMIO(0x9888), 0x080f0040 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x100f0000 },
-	{ _MMIO(0x9888), 0x0e0f0040 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06110012 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x01898000 },
-	{ _MMIO(0x9888), 0x0d890100 },
-	{ _MMIO(0x9888), 0x03898000 },
-	{ _MMIO(0x9888), 0x09808000 },
-	{ _MMIO(0x9888), 0x0b808000 },
-	{ _MMIO(0x9888), 0x0380c000 },
-	{ _MMIO(0x9888), 0x0f8a0075 },
-	{ _MMIO(0x9888), 0x1d8a0000 },
-	{ _MMIO(0x9888), 0x118a8000 },
-	{ _MMIO(0x9888), 0x1b8a4000 },
-	{ _MMIO(0x9888), 0x138a8000 },
-	{ _MMIO(0x9888), 0x1d81a000 },
-	{ _MMIO(0x9888), 0x15818000 },
-	{ _MMIO(0x9888), 0x17818000 },
-	{ _MMIO(0x9888), 0x0b820030 },
-	{ _MMIO(0x9888), 0x07828000 },
-	{ _MMIO(0x9888), 0x0d824000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x05824000 },
-	{ _MMIO(0x9888), 0x0d830003 },
-	{ _MMIO(0x9888), 0x0583000c },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x03838000 },
-	{ _MMIO(0x9888), 0x07838000 },
-	{ _MMIO(0x9888), 0x0b840980 },
-	{ _MMIO(0x9888), 0x03844d80 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x09848000 },
-	{ _MMIO(0x9888), 0x09850080 },
-	{ _MMIO(0x9888), 0x03850003 },
-	{ _MMIO(0x9888), 0x01850000 },
-	{ _MMIO(0x9888), 0x07860000 },
-	{ _MMIO(0x9888), 0x0f860400 },
-	{ _MMIO(0x9888), 0x09870032 },
-	{ _MMIO(0x9888), 0x01888052 },
-	{ _MMIO(0x9888), 0x11880000 },
-	{ _MMIO(0x9888), 0x09884000 },
-	{ _MMIO(0x9888), 0x1b931001 },
-	{ _MMIO(0x9888), 0x1d930001 },
-	{ _MMIO(0x9888), 0x19934000 },
-	{ _MMIO(0x9888), 0x1b958000 },
-	{ _MMIO(0x9888), 0x1d950094 },
-	{ _MMIO(0x9888), 0x19958000 },
-	{ _MMIO(0x9888), 0x09e58000 },
-	{ _MMIO(0x9888), 0x0be58000 },
-	{ _MMIO(0x9888), 0x03e5c000 },
-	{ _MMIO(0x9888), 0x0592c000 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d924000 },
-	{ _MMIO(0x9888), 0x0f924000 },
-	{ _MMIO(0x9888), 0x11928000 },
-	{ _MMIO(0x9888), 0x1392c000 },
-	{ _MMIO(0x9888), 0x09924000 },
-	{ _MMIO(0x9888), 0x01985000 },
-	{ _MMIO(0x9888), 0x07988000 },
-	{ _MMIO(0x9888), 0x09981000 },
-	{ _MMIO(0x9888), 0x0b982000 },
-	{ _MMIO(0x9888), 0x0d982000 },
-	{ _MMIO(0x9888), 0x0f989000 },
-	{ _MMIO(0x9888), 0x05982000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1190c080 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900440 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900400 },
-	{ _MMIO(0x9888), 0x47900c21 },
-	{ _MMIO(0x9888), 0x57900400 },
-	{ _MMIO(0x9888), 0x49900042 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900024 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900841 },
-	{ _MMIO(0x9888), 0x53900400 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_pipe_profile;
-	lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900064 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900150 },
-	{ _MMIO(0x9888), 0x21900151 },
-	{ _MMIO(0x9888), 0x23900152 },
-	{ _MMIO(0x9888), 0x25900153 },
-	{ _MMIO(0x9888), 0x27900154 },
-	{ _MMIO(0x9888), 0x29900155 },
-	{ _MMIO(0x9888), 0x2b900156 },
-	{ _MMIO(0x9888), 0x2d900157 },
-	{ _MMIO(0x9888), 0x2f90015f },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_reads;
-	lens[n] = ARRAY_SIZE(mux_config_memory_reads);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901000 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900160 },
-	{ _MMIO(0x9888), 0x21900161 },
-	{ _MMIO(0x9888), 0x23900162 },
-	{ _MMIO(0x9888), 0x25900163 },
-	{ _MMIO(0x9888), 0x27900164 },
-	{ _MMIO(0x9888), 0x29900165 },
-	{ _MMIO(0x9888), 0x2b900166 },
-	{ _MMIO(0x9888), 0x2d900167 },
-	{ _MMIO(0x9888), 0x2f900150 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_writes;
-	lens[n] = ARRAY_SIZE(mux_config_memory_writes);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fc2a },
-	{ _MMIO(0x2774), 0x0000bf00 },
-	{ _MMIO(0x2778), 0x0007fc6a },
-	{ _MMIO(0x277c), 0x0000bf00 },
-	{ _MMIO(0x2780), 0x0007fc92 },
-	{ _MMIO(0x2784), 0x0000bf00 },
-	{ _MMIO(0x2788), 0x0007fca2 },
-	{ _MMIO(0x278c), 0x0000bf00 },
-	{ _MMIO(0x2790), 0x0007fc32 },
-	{ _MMIO(0x2794), 0x0000bf00 },
-	{ _MMIO(0x2798), 0x0007fc9a },
-	{ _MMIO(0x279c), 0x0000bf00 },
-	{ _MMIO(0x27a0), 0x0007fe6a },
-	{ _MMIO(0x27a4), 0x0000bf00 },
-	{ _MMIO(0x27a8), 0x0007fe7a },
-	{ _MMIO(0x27ac), 0x0000bf00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended[] = {
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x141c8160 },
-	{ _MMIO(0x9888), 0x161c8015 },
-	{ _MMIO(0x9888), 0x181c0120 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4eaaa0 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0e6c0b01 },
-	{ _MMIO(0x9888), 0x006c0200 },
-	{ _MMIO(0x9888), 0x026c000c },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x001c0041 },
-	{ _MMIO(0x9888), 0x061c4200 },
-	{ _MMIO(0x9888), 0x081c4443 },
-	{ _MMIO(0x9888), 0x0a1c4645 },
-	{ _MMIO(0x9888), 0x0c1c7647 },
-	{ _MMIO(0x9888), 0x041c7357 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x101c0000 },
-	{ _MMIO(0x9888), 0x1a1c0000 },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4caa2a },
-	{ _MMIO(0x9888), 0x0c4c02aa },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5515 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x11907fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900040 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900802 },
-	{ _MMIO(0x9888), 0x47900842 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900842 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900800 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extended;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extended);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_l3_cache[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fffa },
-	{ _MMIO(0x2774), 0x0000fefe },
-	{ _MMIO(0x2778), 0x0007fffa },
-	{ _MMIO(0x277c), 0x0000fefd },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000fbef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000fbdf },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_l3_cache[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00101100 },
-	{ _MMIO(0xe45c), 0x00201200 },
-	{ _MMIO(0xe55c), 0x00301300 },
-	{ _MMIO(0xe65c), 0x00401400 },
-};
-
-static const struct i915_oa_reg mux_config_compute_l3_cache[] = {
-	{ _MMIO(0x9888), 0x166c0760 },
-	{ _MMIO(0x9888), 0x1593001e },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4e8020 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x006c0051 },
-	{ _MMIO(0x9888), 0x066c5000 },
-	{ _MMIO(0x9888), 0x086c5c5d },
-	{ _MMIO(0x9888), 0x0e6c5e5f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x061b8000 },
-	{ _MMIO(0x9888), 0x081bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1ce000 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2a00 },
-	{ _MMIO(0x9888), 0x0c4c0280 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f1500 },
-	{ _MMIO(0x9888), 0x100f0140 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162c0a00 },
-	{ _MMIO(0x9888), 0x182c00a0 },
-	{ _MMIO(0x9888), 0x03933300 },
-	{ _MMIO(0x9888), 0x05930032 },
-	{ _MMIO(0x9888), 0x11930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190030f },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900021 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x53904444 },
-	{ _MMIO(0x9888), 0x43900000 },
-};
-
-static int
-get_compute_l3_cache_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_l3_cache;
-	lens[n] = ARRAY_SIZE(mux_config_compute_l3_cache);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x104f0232 },
-	{ _MMIO(0x9888), 0x124f4640 },
-	{ _MMIO(0x9888), 0x106c0232 },
-	{ _MMIO(0x9888), 0x11834400 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x004f1880 },
-	{ _MMIO(0x9888), 0x024f08bb },
-	{ _MMIO(0x9888), 0x044f001b },
-	{ _MMIO(0x9888), 0x046c0100 },
-	{ _MMIO(0x9888), 0x066c000b },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x041b8000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025bc000 },
-	{ _MMIO(0x9888), 0x045b4000 },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x165c8000 },
-	{ _MMIO(0x9888), 0x185c8000 },
-	{ _MMIO(0x9888), 0x0a4c00a0 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x062cc000 },
-	{ _MMIO(0x9888), 0x082cc000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x0f8305c0 },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x07830000 },
-	{ _MMIO(0x9888), 0x1d950080 },
-	{ _MMIO(0x9888), 0x13928000 },
-	{ _MMIO(0x9888), 0x0f988000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b900040 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1[] = {
-	{ _MMIO(0x9888), 0x126c7b40 },
-	{ _MMIO(0x9888), 0x166c0020 },
-	{ _MMIO(0x9888), 0x0a603444 },
-	{ _MMIO(0x9888), 0x0a613400 },
-	{ _MMIO(0x9888), 0x1a4ea800 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x0c6c5327 },
-	{ _MMIO(0x9888), 0x0e6c5425 },
-	{ _MMIO(0x9888), 0x006c2a00 },
-	{ _MMIO(0x9888), 0x026c285b },
-	{ _MMIO(0x9888), 0x046c005c },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0800 },
-	{ _MMIO(0x9888), 0x0c1bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x1c1c003c },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x10600000 },
-	{ _MMIO(0x9888), 0x04600000 },
-	{ _MMIO(0x9888), 0x0c610044 },
-	{ _MMIO(0x9888), 0x10610000 },
-	{ _MMIO(0x9888), 0x06610000 },
-	{ _MMIO(0x9888), 0x0c4c02a8 },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0154 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190ffc0 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900420 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900021 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900400 },
-	{ _MMIO(0x9888), 0x43900421 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_1;
-	lens[n] = ARRAY_SIZE(mux_config_l3_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_2[] = {
-	{ _MMIO(0x9888), 0x126c02e0 },
-	{ _MMIO(0x9888), 0x146c0001 },
-	{ _MMIO(0x9888), 0x0a623400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x026c3324 },
-	{ _MMIO(0x9888), 0x046c3422 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x06614000 },
-	{ _MMIO(0x9888), 0x0c620044 },
-	{ _MMIO(0x9888), 0x10620000 },
-	{ _MMIO(0x9888), 0x06620000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_2_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_2;
-	lens[n] = ARRAY_SIZE(mux_config_l3_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_3[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_3[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_3[] = {
-	{ _MMIO(0x9888), 0x126c4e80 },
-	{ _MMIO(0x9888), 0x146c0000 },
-	{ _MMIO(0x9888), 0x0a633400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x026c3321 },
-	{ _MMIO(0x9888), 0x046c342f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c2000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x06604000 },
-	{ _MMIO(0x9888), 0x0c630044 },
-	{ _MMIO(0x9888), 0x10630000 },
-	{ _MMIO(0x9888), 0x06630000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c00aa },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900002 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_3_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_3;
-	lens[n] = ARRAY_SIZE(mux_config_l3_3);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000efff },
-	{ _MMIO(0x2778), 0x00006000 },
-	{ _MMIO(0x277c), 0x0000f3ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x102f3800 },
-	{ _MMIO(0x9888), 0x144d0500 },
-	{ _MMIO(0x9888), 0x120d03c0 },
-	{ _MMIO(0x9888), 0x140d03cf },
-	{ _MMIO(0x9888), 0x0c0f0004 },
-	{ _MMIO(0x9888), 0x0c4e4000 },
-	{ _MMIO(0x9888), 0x042f0480 },
-	{ _MMIO(0x9888), 0x082f0000 },
-	{ _MMIO(0x9888), 0x022f0000 },
-	{ _MMIO(0x9888), 0x0a4c0090 },
-	{ _MMIO(0x9888), 0x064d0027 },
-	{ _MMIO(0x9888), 0x004d0000 },
-	{ _MMIO(0x9888), 0x000d0d40 },
-	{ _MMIO(0x9888), 0x020d803f },
-	{ _MMIO(0x9888), 0x040d8023 },
-	{ _MMIO(0x9888), 0x100d0000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020f0010 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x0e0f0050 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41901400 },
-	{ _MMIO(0x9888), 0x43901485 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler[] = {
-	{ _MMIO(0x9888), 0x14152c00 },
-	{ _MMIO(0x9888), 0x16150005 },
-	{ _MMIO(0x9888), 0x121600a0 },
-	{ _MMIO(0x9888), 0x14352c00 },
-	{ _MMIO(0x9888), 0x16350005 },
-	{ _MMIO(0x9888), 0x123600a0 },
-	{ _MMIO(0x9888), 0x14552c00 },
-	{ _MMIO(0x9888), 0x16550005 },
-	{ _MMIO(0x9888), 0x125600a0 },
-	{ _MMIO(0x9888), 0x062f6000 },
-	{ _MMIO(0x9888), 0x022f2000 },
-	{ _MMIO(0x9888), 0x0c4c0050 },
-	{ _MMIO(0x9888), 0x0a4c0010 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0350 },
-	{ _MMIO(0x9888), 0x0c0fb000 },
-	{ _MMIO(0x9888), 0x0e0f00da },
-	{ _MMIO(0x9888), 0x182c0028 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x022dc000 },
-	{ _MMIO(0x9888), 0x042d4000 },
-	{ _MMIO(0x9888), 0x0c138000 },
-	{ _MMIO(0x9888), 0x0e132000 },
-	{ _MMIO(0x9888), 0x0413c000 },
-	{ _MMIO(0x9888), 0x1c140018 },
-	{ _MMIO(0x9888), 0x0c157000 },
-	{ _MMIO(0x9888), 0x0e150078 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x04162180 },
-	{ _MMIO(0x9888), 0x02160000 },
-	{ _MMIO(0x9888), 0x04174000 },
-	{ _MMIO(0x9888), 0x0233a000 },
-	{ _MMIO(0x9888), 0x04333000 },
-	{ _MMIO(0x9888), 0x14348000 },
-	{ _MMIO(0x9888), 0x16348000 },
-	{ _MMIO(0x9888), 0x02357870 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x04360043 },
-	{ _MMIO(0x9888), 0x02360000 },
-	{ _MMIO(0x9888), 0x04371000 },
-	{ _MMIO(0x9888), 0x0e538000 },
-	{ _MMIO(0x9888), 0x00538000 },
-	{ _MMIO(0x9888), 0x06533000 },
-	{ _MMIO(0x9888), 0x1c540020 },
-	{ _MMIO(0x9888), 0x12548000 },
-	{ _MMIO(0x9888), 0x0e557000 },
-	{ _MMIO(0x9888), 0x00557800 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x06560043 },
-	{ _MMIO(0x9888), 0x02560000 },
-	{ _MMIO(0x9888), 0x06571000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900060 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900060 },
-};
-
-static int
-get_sampler_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler;
-	lens[n] = ARRAY_SIZE(mux_config_sampler);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x00007fff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x00009fff },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000efff },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000f3ff },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fdff },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x12120000 },
-	{ _MMIO(0x9888), 0x12320000 },
-	{ _MMIO(0x9888), 0x12520000 },
-	{ _MMIO(0x9888), 0x002f8000 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0015 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f03a0 },
-	{ _MMIO(0x9888), 0x0c0ff000 },
-	{ _MMIO(0x9888), 0x0e0f0095 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2d8000 },
-	{ _MMIO(0x9888), 0x0e2d4000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x02108000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x02118000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x02121880 },
-	{ _MMIO(0x9888), 0x041219b5 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x02134000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x0c308000 },
-	{ _MMIO(0x9888), 0x0e304000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x0c318000 },
-	{ _MMIO(0x9888), 0x0e314000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x0c321a80 },
-	{ _MMIO(0x9888), 0x0e320033 },
-	{ _MMIO(0x9888), 0x06320031 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x0c334000 },
-	{ _MMIO(0x9888), 0x0e331000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0e508000 },
-	{ _MMIO(0x9888), 0x00508000 },
-	{ _MMIO(0x9888), 0x02504000 },
-	{ _MMIO(0x9888), 0x0e518000 },
-	{ _MMIO(0x9888), 0x00518000 },
-	{ _MMIO(0x9888), 0x02514000 },
-	{ _MMIO(0x9888), 0x0e521880 },
-	{ _MMIO(0x9888), 0x00521a80 },
-	{ _MMIO(0x9888), 0x02520033 },
-	{ _MMIO(0x9888), 0x0e534000 },
-	{ _MMIO(0x9888), 0x00534000 },
-	{ _MMIO(0x9888), 0x02531000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900062 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x12124d60 },
-	{ _MMIO(0x9888), 0x12322e60 },
-	{ _MMIO(0x9888), 0x12524d60 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0014 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0fe000 },
-	{ _MMIO(0x9888), 0x0e0f0097 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x002d8000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x04121fb7 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x00308000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x00318000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x00321b80 },
-	{ _MMIO(0x9888), 0x0632003f },
-	{ _MMIO(0x9888), 0x00334000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0250c000 },
-	{ _MMIO(0x9888), 0x0251c000 },
-	{ _MMIO(0x9888), 0x02521fb7 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x02535000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900063 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extra[] = {
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extra[] = {
-};
-
-static const struct i915_oa_reg mux_config_compute_extra[] = {
-	{ _MMIO(0x9888), 0x121203e0 },
-	{ _MMIO(0x9888), 0x123203e0 },
-	{ _MMIO(0x9888), 0x125203e0 },
-	{ _MMIO(0x9888), 0x129203e0 },
-	{ _MMIO(0x9888), 0x12b203e0 },
-	{ _MMIO(0x9888), 0x12d203e0 },
-	{ _MMIO(0x9888), 0x024ec000 },
-	{ _MMIO(0x9888), 0x044ec000 },
-	{ _MMIO(0x9888), 0x064ec000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x0a4c0042 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f006d },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x042d8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06114000 },
-	{ _MMIO(0x9888), 0x06120033 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x04308000 },
-	{ _MMIO(0x9888), 0x04318000 },
-	{ _MMIO(0x9888), 0x04321980 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x04334000 },
-	{ _MMIO(0x9888), 0x04504000 },
-	{ _MMIO(0x9888), 0x04514000 },
-	{ _MMIO(0x9888), 0x04520033 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x04531000 },
-	{ _MMIO(0x9888), 0x00af8000 },
-	{ _MMIO(0x9888), 0x0acc0001 },
-	{ _MMIO(0x9888), 0x008d8000 },
-	{ _MMIO(0x9888), 0x028da000 },
-	{ _MMIO(0x9888), 0x0c8fb000 },
-	{ _MMIO(0x9888), 0x0e8f0001 },
-	{ _MMIO(0x9888), 0x06ac8000 },
-	{ _MMIO(0x9888), 0x02ad4000 },
-	{ _MMIO(0x9888), 0x02908000 },
-	{ _MMIO(0x9888), 0x02918000 },
-	{ _MMIO(0x9888), 0x02921980 },
-	{ _MMIO(0x9888), 0x00920000 },
-	{ _MMIO(0x9888), 0x02934000 },
-	{ _MMIO(0x9888), 0x02b04000 },
-	{ _MMIO(0x9888), 0x02b14000 },
-	{ _MMIO(0x9888), 0x02b20033 },
-	{ _MMIO(0x9888), 0x00b20000 },
-	{ _MMIO(0x9888), 0x02b31000 },
-	{ _MMIO(0x9888), 0x00d08000 },
-	{ _MMIO(0x9888), 0x00d18000 },
-	{ _MMIO(0x9888), 0x00d21980 },
-	{ _MMIO(0x9888), 0x00d34000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900002 },
-	{ _MMIO(0x9888), 0x53900420 },
-	{ _MMIO(0x9888), 0x459000a1 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_compute_extra_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extra;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extra);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_vme_pipe[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00100030 },
-	{ _MMIO(0x2774), 0x0000fff9 },
-	{ _MMIO(0x2778), 0x00000002 },
-	{ _MMIO(0x277c), 0x0000fffc },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000fff3 },
-	{ _MMIO(0x2788), 0x00100180 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000ffcf },
-	{ _MMIO(0x2798), 0x00000002 },
-	{ _MMIO(0x279c), 0x0000ff3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_vme_pipe[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00008003 },
-};
-
-static const struct i915_oa_reg mux_config_vme_pipe[] = {
-	{ _MMIO(0x9888), 0x141a5800 },
-	{ _MMIO(0x9888), 0x161a00c0 },
-	{ _MMIO(0x9888), 0x12180240 },
-	{ _MMIO(0x9888), 0x14180002 },
-	{ _MMIO(0x9888), 0x149a5800 },
-	{ _MMIO(0x9888), 0x169a00c0 },
-	{ _MMIO(0x9888), 0x12980240 },
-	{ _MMIO(0x9888), 0x14980002 },
-	{ _MMIO(0x9888), 0x1a4e3fc0 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x022f8000 },
-	{ _MMIO(0x9888), 0x042f3000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c9500 },
-	{ _MMIO(0x9888), 0x0c4c002a },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0015 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x182c000a },
-	{ _MMIO(0x9888), 0x04193000 },
-	{ _MMIO(0x9888), 0x081a28c1 },
-	{ _MMIO(0x9888), 0x001a0000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x0613c000 },
-	{ _MMIO(0x9888), 0x0813f000 },
-	{ _MMIO(0x9888), 0x00172000 },
-	{ _MMIO(0x9888), 0x06178000 },
-	{ _MMIO(0x9888), 0x0817a000 },
-	{ _MMIO(0x9888), 0x00180037 },
-	{ _MMIO(0x9888), 0x06180940 },
-	{ _MMIO(0x9888), 0x08180000 },
-	{ _MMIO(0x9888), 0x02180000 },
-	{ _MMIO(0x9888), 0x04183000 },
-	{ _MMIO(0x9888), 0x04afc000 },
-	{ _MMIO(0x9888), 0x06af3000 },
-	{ _MMIO(0x9888), 0x0acc4000 },
-	{ _MMIO(0x9888), 0x0ccc0015 },
-	{ _MMIO(0x9888), 0x0a8da000 },
-	{ _MMIO(0x9888), 0x0c8da000 },
-	{ _MMIO(0x9888), 0x0e8f4000 },
-	{ _MMIO(0x9888), 0x108f0015 },
-	{ _MMIO(0x9888), 0x16aca000 },
-	{ _MMIO(0x9888), 0x18ac000a },
-	{ _MMIO(0x9888), 0x06993000 },
-	{ _MMIO(0x9888), 0x0c9a28c1 },
-	{ _MMIO(0x9888), 0x009a0000 },
-	{ _MMIO(0x9888), 0x0a93f000 },
-	{ _MMIO(0x9888), 0x0c93f000 },
-	{ _MMIO(0x9888), 0x0a97a000 },
-	{ _MMIO(0x9888), 0x0c97a000 },
-	{ _MMIO(0x9888), 0x0a980977 },
-	{ _MMIO(0x9888), 0x08980000 },
-	{ _MMIO(0x9888), 0x04980000 },
-	{ _MMIO(0x9888), 0x06983000 },
-	{ _MMIO(0x9888), 0x119000ff },
-	{ _MMIO(0x9888), 0x51900040 },
-	{ _MMIO(0x9888), 0x41900020 },
-	{ _MMIO(0x9888), 0x55900004 },
-	{ _MMIO(0x9888), 0x45900400 },
-	{ _MMIO(0x9888), 0x479008a5 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900002 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_vme_pipe_mux_config(struct drm_i915_private *dev_priv,
-			const struct i915_oa_reg **regs,
-			int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_vme_pipe;
-	lens[n] = ARRAY_SIZE(mux_config_vme_pipe);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2744), 0x00800000 },
@@ -1931,6 +60,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x00000080 },
 	{ _MMIO(0x9888), 0x11810000 },
 	{ _MMIO(0x9888), 0x07810013 },
 	{ _MMIO(0x9888), 0x1f810000 },
@@ -1945,1096 +75,35 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x33900000 },
 };
 
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_kblgt3(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_L3_CACHE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_l3_cache_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_L3_CACHE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_l3_cache;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_l3_cache);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_l3_cache;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_l3_cache);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_L3_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_2_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_2);
-
-		return 0;
-	case METRIC_SET_ID_L3_3:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_3_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_3\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_3;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_3);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_3;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_3);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTRA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extra_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTRA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extra;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extra);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extra;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extra);
-
-		return 0;
-	case METRIC_SET_ID_VME_PIPE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_vme_pipe_mux_config(dev_priv,
-						dev_priv->perf.oa.mux_regs,
-						dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"VME_PIPE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_vme_pipe;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_vme_pipe);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_vme_pipe;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_vme_pipe);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "0286c920-2f6d-493b-b22d-7a5280df43de",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "9823aaa1-b06f-40ce-884b-cd798c79f0c2",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "c7c735f3-ce58-45cf-aa04-30b183f1faff",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "96ec2219-040b-428a-856a-6bc03363a057",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "03372b64-4996-4d3b-aa18-790e75eeb9c2",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "31b4ce5a-bd61-4c1f-bb5d-f2e731412150",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_compute_l3_cache_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_L3_CACHE);
-}
-
-static struct device_attribute dev_attr_compute_l3_cache_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_l3_cache_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_l3_cache[] = {
-	&dev_attr_compute_l3_cache_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_l3_cache = {
-	.name = "2ce0911a-27fc-4887-96f0-11084fa807c3",
-	.attrs =  attrs_compute_l3_cache,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "546c4c1d-99b8-42fb-a107-5aaabb5314a8",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "4e93d156-9b39-4268-8544-a8e0480806d7",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_l3_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_2);
-}
-
-static struct device_attribute dev_attr_l3_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_2[] = {
-	&dev_attr_l3_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_2 = {
-	.name = "de1bec86-ca92-4b43-89fa-147653221cc0",
-	.attrs =  attrs_l3_2,
-};
-
-static ssize_t
-show_l3_3_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_3);
-}
-
-static struct device_attribute dev_attr_l3_3_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_3_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_3[] = {
-	&dev_attr_l3_3_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_3 = {
-	.name = "e63537bb-10be-4d4a-92c4-c6b0c65e02ef",
-	.attrs =  attrs_l3_3,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "7a03a9f8-ec5e-46bb-8b67-1f0ff1476281",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER);
-}
-
-static struct device_attribute dev_attr_sampler_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler[] = {
-	&dev_attr_sampler_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler = {
-	.name = "b25d2ebf-a6e0-4b29-96be-a9b010edeeda",
-	.attrs =  attrs_sampler,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "469a05e5-e299-46f7-9598-7b05f3c34991",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "52f925c6-786a-4ec6-86ce-cba85c83453a",
-	.attrs =  attrs_tdl_2,
-};
-
-static ssize_t
-show_compute_extra_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTRA);
-}
-
-static struct device_attribute dev_attr_compute_extra_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extra_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extra[] = {
-	&dev_attr_compute_extra_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extra = {
-	.name = "efc497ac-884e-4ee4-a4a8-15fba22aaf21",
-	.attrs =  attrs_compute_extra,
-};
-
-static ssize_t
-show_vme_pipe_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_VME_PIPE);
-}
-
-static struct device_attribute dev_attr_vme_pipe_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_vme_pipe_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_vme_pipe[] = {
-	&dev_attr_vme_pipe_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_vme_pipe = {
-	.name = "bfd9764d-2c5b-4c16-bfc1-89de3ca10917",
-	.attrs =  attrs_vme_pipe,
-};
-
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "f1792f32-6db2-4b50-b4b2-557128f1688d",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_kblgt3(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-		if (ret)
-			goto error_compute_l3_cache;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-		if (ret)
-			goto error_l3_2;
-	}
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-		if (ret)
-			goto error_l3_3;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler);
-		if (ret)
-			goto error_sampler;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-		if (ret)
-			goto error_compute_extra;
-	}
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-		if (ret)
-			goto error_vme_pipe;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-error_vme_pipe:
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-error_compute_extra:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-error_sampler:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-error_l3_3:
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-error_l3_2:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-error_compute_l3_cache:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_kblgt3(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_kblgt3(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"f1792f32-6db2-4b50-b4b2-557128f1688d",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "f1792f32-6db2-4b50-b4b2-557128f1688d";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_kblgt3.h b/drivers/gpu/drm/i915/i915_oa_kblgt3.h
index b0ca7f3..d5b5b5c 100644
--- a/drivers/gpu/drm/i915/i915_oa_kblgt3.h
+++ b/drivers/gpu/drm/i915/i915_oa_kblgt3.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_KBLGT3_H__
 #define __I915_OA_KBLGT3_H__
 
-extern int i915_oa_n_builtin_metric_sets_kblgt3;
-
-extern int i915_oa_select_metric_set_kblgt3(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_kblgt3(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_kblgt3(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_kblgt3(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_sklgt2.c b/drivers/gpu/drm/i915/i915_oa_sklgt2.c
index 1268bed..890d558 100644
--- a/drivers/gpu/drm/i915/i915_oa_sklgt2.c
+++ b/drivers/gpu/drm/i915/i915_oa_sklgt2.c
@@ -31,2317 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_sklgt2.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_COMPUTE_L3_CACHE,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_L3_2,
-	METRIC_SET_ID_L3_3,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_COMPUTE_EXTRA,
-	METRIC_SET_ID_VME_PIPE,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_sklgt2 = 18;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic_1_sku_gte_0x02[] = {
-	{ _MMIO(0x9888), 0x166c01e0 },
-	{ _MMIO(0x9888), 0x12170280 },
-	{ _MMIO(0x9888), 0x12370280 },
-	{ _MMIO(0x9888), 0x11930317 },
-	{ _MMIO(0x9888), 0x159303df },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x1a4e0080 },
-	{ _MMIO(0x9888), 0x0a6c0053 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x0a1b4000 },
-	{ _MMIO(0x9888), 0x1c1c0001 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x042f1000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c8400 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0d2000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f6600 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x162c2200 },
-	{ _MMIO(0x9888), 0x062d8000 },
-	{ _MMIO(0x9888), 0x082d8000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x08133000 },
-	{ _MMIO(0x9888), 0x00170020 },
-	{ _MMIO(0x9888), 0x08170021 },
-	{ _MMIO(0x9888), 0x10170000 },
-	{ _MMIO(0x9888), 0x0633c000 },
-	{ _MMIO(0x9888), 0x0833c000 },
-	{ _MMIO(0x9888), 0x06370800 },
-	{ _MMIO(0x9888), 0x08370840 },
-	{ _MMIO(0x9888), 0x10370000 },
-	{ _MMIO(0x9888), 0x0d933031 },
-	{ _MMIO(0x9888), 0x0f933e3f },
-	{ _MMIO(0x9888), 0x01933d00 },
-	{ _MMIO(0x9888), 0x0393073c },
-	{ _MMIO(0x9888), 0x0593000e },
-	{ _MMIO(0x9888), 0x1d930000 },
-	{ _MMIO(0x9888), 0x19930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x2f908000 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x15908000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190001f },
-	{ _MMIO(0x9888), 0x51904400 },
-	{ _MMIO(0x9888), 0x41900020 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c21 },
-	{ _MMIO(0x9888), 0x47900061 },
-	{ _MMIO(0x9888), 0x57904440 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900004 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x53904444 },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	if (dev_priv->drm.pdev->revision >= 0x02) {
-		regs[n] = mux_config_render_basic_1_sku_gte_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_render_basic_1_sku_gte_0x02);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic_0_slices_0x01_and_sku_lt_0x02[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x124f1c00 },
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901403 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4e8200 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x004f0db2 },
-	{ _MMIO(0x9888), 0x064f0900 },
-	{ _MMIO(0x9888), 0x084f1880 },
-	{ _MMIO(0x9888), 0x0a4f0011 },
-	{ _MMIO(0x9888), 0x0c4f0e3c },
-	{ _MMIO(0x9888), 0x0e4f1d80 },
-	{ _MMIO(0x9888), 0x086c0002 },
-	{ _MMIO(0x9888), 0x0a6c0100 },
-	{ _MMIO(0x9888), 0x0e6c000c },
-	{ _MMIO(0x9888), 0x026c000b },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x081b4000 },
-	{ _MMIO(0x9888), 0x0a1b8000 },
-	{ _MMIO(0x9888), 0x0e1b4000 },
-	{ _MMIO(0x9888), 0x021b4000 },
-	{ _MMIO(0x9888), 0x1a1c4000 },
-	{ _MMIO(0x9888), 0x1c1c0012 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x005bc000 },
-	{ _MMIO(0x9888), 0x065b8000 },
-	{ _MMIO(0x9888), 0x085b8000 },
-	{ _MMIO(0x9888), 0x0a5b4000 },
-	{ _MMIO(0x9888), 0x0c5bc000 },
-	{ _MMIO(0x9888), 0x0e5b8000 },
-	{ _MMIO(0x9888), 0x105c8000 },
-	{ _MMIO(0x9888), 0x1a5ca000 },
-	{ _MMIO(0x9888), 0x1c5c002d },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x0a4c0800 },
-	{ _MMIO(0x9888), 0x0c4c0082 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020d2000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002cc000 },
-	{ _MMIO(0x9888), 0x0e2cc000 },
-	{ _MMIO(0x9888), 0x162cbe00 },
-	{ _MMIO(0x9888), 0x182c00ef },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x19900157 },
-	{ _MMIO(0x9888), 0x1b900167 },
-	{ _MMIO(0x9888), 0x1d900105 },
-	{ _MMIO(0x9888), 0x1f900103 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0xd28), 0x00000000 },
-	{ _MMIO(0x9888), 0x11900fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900840 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900842 },
-	{ _MMIO(0x9888), 0x47900840 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900840 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900040 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900840 },
-	{ _MMIO(0x9888), 0x53901111 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic_0_slices_0x01_and_sku_gte_0x02[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x124f1c00 },
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901403 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x1a4e0820 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x064f0900 },
-	{ _MMIO(0x9888), 0x084f0032 },
-	{ _MMIO(0x9888), 0x0a4f1810 },
-	{ _MMIO(0x9888), 0x0c4f0e00 },
-	{ _MMIO(0x9888), 0x0e4f003c },
-	{ _MMIO(0x9888), 0x004f0d80 },
-	{ _MMIO(0x9888), 0x024f003b },
-	{ _MMIO(0x9888), 0x006c0002 },
-	{ _MMIO(0x9888), 0x086c0000 },
-	{ _MMIO(0x9888), 0x0c6c000c },
-	{ _MMIO(0x9888), 0x0e6c0b00 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x081b8000 },
-	{ _MMIO(0x9888), 0x0c1b4000 },
-	{ _MMIO(0x9888), 0x0e1b8000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1c8000 },
-	{ _MMIO(0x9888), 0x1c1c0024 },
-	{ _MMIO(0x9888), 0x065b8000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5bc000 },
-	{ _MMIO(0x9888), 0x0c5b8000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x1a5c6000 },
-	{ _MMIO(0x9888), 0x1c5c001b },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2000 },
-	{ _MMIO(0x9888), 0x0c4c0208 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020d2000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2cc000 },
-	{ _MMIO(0x9888), 0x162cfb00 },
-	{ _MMIO(0x9888), 0x182c00be },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x19900157 },
-	{ _MMIO(0x9888), 0x1b900167 },
-	{ _MMIO(0x9888), 0x1d900105 },
-	{ _MMIO(0x9888), 0x1f900103 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x11900fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900842 },
-	{ _MMIO(0x9888), 0x47900802 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900802 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900002 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53901111 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 2);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 2);
-
-	if ((INTEL_INFO(dev_priv)->sseu.slice_mask & 0x01) &&
-	    (dev_priv->drm.pdev->revision < 0x02)) {
-		regs[n] = mux_config_compute_basic_0_slices_0x01_and_sku_lt_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_compute_basic_0_slices_0x01_and_sku_lt_0x02);
-		n++;
-	}
-	if ((INTEL_INFO(dev_priv)->sseu.slice_mask & 0x01) &&
-		   (dev_priv->drm.pdev->revision >= 0x02)) {
-		regs[n] = mux_config_compute_basic_0_slices_0x01_and_sku_gte_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_compute_basic_0_slices_0x01_and_sku_gte_0x02);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile_0_sku_lt_0x02[] = {
-	{ _MMIO(0x9888), 0x0c0e001f },
-	{ _MMIO(0x9888), 0x0a0f0000 },
-	{ _MMIO(0x9888), 0x10116800 },
-	{ _MMIO(0x9888), 0x178a03e0 },
-	{ _MMIO(0x9888), 0x11824c00 },
-	{ _MMIO(0x9888), 0x11830020 },
-	{ _MMIO(0x9888), 0x13840020 },
-	{ _MMIO(0x9888), 0x11850019 },
-	{ _MMIO(0x9888), 0x11860007 },
-	{ _MMIO(0x9888), 0x01870c40 },
-	{ _MMIO(0x9888), 0x17880000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0a4c0040 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x040d4000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020e5400 },
-	{ _MMIO(0x9888), 0x000e0000 },
-	{ _MMIO(0x9888), 0x080f0040 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x100f0000 },
-	{ _MMIO(0x9888), 0x0e0f0040 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06110012 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x01898000 },
-	{ _MMIO(0x9888), 0x0d890100 },
-	{ _MMIO(0x9888), 0x03898000 },
-	{ _MMIO(0x9888), 0x09808000 },
-	{ _MMIO(0x9888), 0x0b808000 },
-	{ _MMIO(0x9888), 0x0380c000 },
-	{ _MMIO(0x9888), 0x0f8a0075 },
-	{ _MMIO(0x9888), 0x1d8a0000 },
-	{ _MMIO(0x9888), 0x118a8000 },
-	{ _MMIO(0x9888), 0x1b8a4000 },
-	{ _MMIO(0x9888), 0x138a8000 },
-	{ _MMIO(0x9888), 0x1d81a000 },
-	{ _MMIO(0x9888), 0x15818000 },
-	{ _MMIO(0x9888), 0x17818000 },
-	{ _MMIO(0x9888), 0x0b820030 },
-	{ _MMIO(0x9888), 0x07828000 },
-	{ _MMIO(0x9888), 0x0d824000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x05824000 },
-	{ _MMIO(0x9888), 0x0d830003 },
-	{ _MMIO(0x9888), 0x0583000c },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x03838000 },
-	{ _MMIO(0x9888), 0x07838000 },
-	{ _MMIO(0x9888), 0x0b840980 },
-	{ _MMIO(0x9888), 0x03844d80 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x09848000 },
-	{ _MMIO(0x9888), 0x09850080 },
-	{ _MMIO(0x9888), 0x03850003 },
-	{ _MMIO(0x9888), 0x01850000 },
-	{ _MMIO(0x9888), 0x07860000 },
-	{ _MMIO(0x9888), 0x0f860400 },
-	{ _MMIO(0x9888), 0x09870032 },
-	{ _MMIO(0x9888), 0x01888052 },
-	{ _MMIO(0x9888), 0x11880000 },
-	{ _MMIO(0x9888), 0x09884000 },
-	{ _MMIO(0x9888), 0x15968000 },
-	{ _MMIO(0x9888), 0x17968000 },
-	{ _MMIO(0x9888), 0x0f96c000 },
-	{ _MMIO(0x9888), 0x1f950011 },
-	{ _MMIO(0x9888), 0x1d950014 },
-	{ _MMIO(0x9888), 0x0592c000 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d924000 },
-	{ _MMIO(0x9888), 0x0f924000 },
-	{ _MMIO(0x9888), 0x11928000 },
-	{ _MMIO(0x9888), 0x1392c000 },
-	{ _MMIO(0x9888), 0x09924000 },
-	{ _MMIO(0x9888), 0x01985000 },
-	{ _MMIO(0x9888), 0x07988000 },
-	{ _MMIO(0x9888), 0x09981000 },
-	{ _MMIO(0x9888), 0x0b982000 },
-	{ _MMIO(0x9888), 0x0d982000 },
-	{ _MMIO(0x9888), 0x0f989000 },
-	{ _MMIO(0x9888), 0x05982000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x0b978000 },
-	{ _MMIO(0x9888), 0x0f974000 },
-	{ _MMIO(0x9888), 0x11974000 },
-	{ _MMIO(0x9888), 0x13978000 },
-	{ _MMIO(0x9888), 0x09974000 },
-	{ _MMIO(0xd28), 0x00000000 },
-	{ _MMIO(0x9888), 0x1190c080 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x419010a0 },
-	{ _MMIO(0x9888), 0x55904000 },
-	{ _MMIO(0x9888), 0x45901000 },
-	{ _MMIO(0x9888), 0x47900084 },
-	{ _MMIO(0x9888), 0x57904400 },
-	{ _MMIO(0x9888), 0x499000a5 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900081 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x439014a4 },
-	{ _MMIO(0x9888), 0x53900400 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile_0_sku_gte_0x02[] = {
-	{ _MMIO(0x9888), 0x0c0e001f },
-	{ _MMIO(0x9888), 0x0a0f0000 },
-	{ _MMIO(0x9888), 0x10116800 },
-	{ _MMIO(0x9888), 0x178a03e0 },
-	{ _MMIO(0x9888), 0x11824c00 },
-	{ _MMIO(0x9888), 0x11830020 },
-	{ _MMIO(0x9888), 0x13840020 },
-	{ _MMIO(0x9888), 0x11850019 },
-	{ _MMIO(0x9888), 0x11860007 },
-	{ _MMIO(0x9888), 0x01870c40 },
-	{ _MMIO(0x9888), 0x17880000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0a4c0040 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x040d4000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020e5400 },
-	{ _MMIO(0x9888), 0x000e0000 },
-	{ _MMIO(0x9888), 0x080f0040 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x100f0000 },
-	{ _MMIO(0x9888), 0x0e0f0040 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06110012 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x01898000 },
-	{ _MMIO(0x9888), 0x0d890100 },
-	{ _MMIO(0x9888), 0x03898000 },
-	{ _MMIO(0x9888), 0x09808000 },
-	{ _MMIO(0x9888), 0x0b808000 },
-	{ _MMIO(0x9888), 0x0380c000 },
-	{ _MMIO(0x9888), 0x0f8a0075 },
-	{ _MMIO(0x9888), 0x1d8a0000 },
-	{ _MMIO(0x9888), 0x118a8000 },
-	{ _MMIO(0x9888), 0x1b8a4000 },
-	{ _MMIO(0x9888), 0x138a8000 },
-	{ _MMIO(0x9888), 0x1d81a000 },
-	{ _MMIO(0x9888), 0x15818000 },
-	{ _MMIO(0x9888), 0x17818000 },
-	{ _MMIO(0x9888), 0x0b820030 },
-	{ _MMIO(0x9888), 0x07828000 },
-	{ _MMIO(0x9888), 0x0d824000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x05824000 },
-	{ _MMIO(0x9888), 0x0d830003 },
-	{ _MMIO(0x9888), 0x0583000c },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x03838000 },
-	{ _MMIO(0x9888), 0x07838000 },
-	{ _MMIO(0x9888), 0x0b840980 },
-	{ _MMIO(0x9888), 0x03844d80 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x09848000 },
-	{ _MMIO(0x9888), 0x09850080 },
-	{ _MMIO(0x9888), 0x03850003 },
-	{ _MMIO(0x9888), 0x01850000 },
-	{ _MMIO(0x9888), 0x07860000 },
-	{ _MMIO(0x9888), 0x0f860400 },
-	{ _MMIO(0x9888), 0x09870032 },
-	{ _MMIO(0x9888), 0x01888052 },
-	{ _MMIO(0x9888), 0x11880000 },
-	{ _MMIO(0x9888), 0x09884000 },
-	{ _MMIO(0x9888), 0x1b931001 },
-	{ _MMIO(0x9888), 0x1d930001 },
-	{ _MMIO(0x9888), 0x19934000 },
-	{ _MMIO(0x9888), 0x1b958000 },
-	{ _MMIO(0x9888), 0x1d950094 },
-	{ _MMIO(0x9888), 0x19958000 },
-	{ _MMIO(0x9888), 0x05e5a000 },
-	{ _MMIO(0x9888), 0x01e5c000 },
-	{ _MMIO(0x9888), 0x0592c000 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d924000 },
-	{ _MMIO(0x9888), 0x0f924000 },
-	{ _MMIO(0x9888), 0x11928000 },
-	{ _MMIO(0x9888), 0x1392c000 },
-	{ _MMIO(0x9888), 0x09924000 },
-	{ _MMIO(0x9888), 0x01985000 },
-	{ _MMIO(0x9888), 0x07988000 },
-	{ _MMIO(0x9888), 0x09981000 },
-	{ _MMIO(0x9888), 0x0b982000 },
-	{ _MMIO(0x9888), 0x0d982000 },
-	{ _MMIO(0x9888), 0x0f989000 },
-	{ _MMIO(0x9888), 0x05982000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1190c080 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x419010a0 },
-	{ _MMIO(0x9888), 0x55904000 },
-	{ _MMIO(0x9888), 0x45901000 },
-	{ _MMIO(0x9888), 0x47900084 },
-	{ _MMIO(0x9888), 0x57904400 },
-	{ _MMIO(0x9888), 0x499000a5 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900081 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x439014a4 },
-	{ _MMIO(0x9888), 0x53900400 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 2);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 2);
-
-	if (dev_priv->drm.pdev->revision < 0x02) {
-		regs[n] = mux_config_render_pipe_profile_0_sku_lt_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile_0_sku_lt_0x02);
-		n++;
-	}
-	if (dev_priv->drm.pdev->revision >= 0x02) {
-		regs[n] = mux_config_render_pipe_profile_0_sku_gte_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile_0_sku_gte_0x02);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads_0_slices_0x01_and_sku_lt_0x02[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x13946000 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x0f968000 },
-	{ _MMIO(0x9888), 0x1196c000 },
-	{ _MMIO(0x9888), 0x13964000 },
-	{ _MMIO(0x9888), 0x11938000 },
-	{ _MMIO(0x9888), 0x1b93fe00 },
-	{ _MMIO(0x9888), 0x01940010 },
-	{ _MMIO(0x9888), 0x07941100 },
-	{ _MMIO(0x9888), 0x09941312 },
-	{ _MMIO(0x9888), 0x0b941514 },
-	{ _MMIO(0x9888), 0x0d941716 },
-	{ _MMIO(0x9888), 0x11940000 },
-	{ _MMIO(0x9888), 0x19940000 },
-	{ _MMIO(0x9888), 0x1b940000 },
-	{ _MMIO(0x9888), 0x1d940000 },
-	{ _MMIO(0x9888), 0x1b954000 },
-	{ _MMIO(0x9888), 0x1d95a550 },
-	{ _MMIO(0x9888), 0x1f9502aa },
-	{ _MMIO(0x9888), 0x2f900157 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13908000 },
-	{ _MMIO(0x9888), 0x21908000 },
-	{ _MMIO(0x9888), 0x23908000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27908000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0xd28), 0x00000000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads_0_sku_lt_0x05_and_sku_gte_0x02[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x13946000 },
-	{ _MMIO(0x9888), 0x15940016 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x19930800 },
-	{ _MMIO(0x9888), 0x1b93aa55 },
-	{ _MMIO(0x9888), 0x1d9300aa },
-	{ _MMIO(0x9888), 0x01940010 },
-	{ _MMIO(0x9888), 0x07941100 },
-	{ _MMIO(0x9888), 0x09941312 },
-	{ _MMIO(0x9888), 0x0b941514 },
-	{ _MMIO(0x9888), 0x0d941716 },
-	{ _MMIO(0x9888), 0x0f940018 },
-	{ _MMIO(0x9888), 0x1b940000 },
-	{ _MMIO(0x9888), 0x11940000 },
-	{ _MMIO(0x9888), 0x01e58000 },
-	{ _MMIO(0x9888), 0x03e57000 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13908000 },
-	{ _MMIO(0x9888), 0x21908000 },
-	{ _MMIO(0x9888), 0x23908000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27908000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x2f908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c20 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900400 },
-	{ _MMIO(0x9888), 0x47900421 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900421 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900061 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads_0_sku_gte_0x05[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900064 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900150 },
-	{ _MMIO(0x9888), 0x21900151 },
-	{ _MMIO(0x9888), 0x23900152 },
-	{ _MMIO(0x9888), 0x25900153 },
-	{ _MMIO(0x9888), 0x27900154 },
-	{ _MMIO(0x9888), 0x29900155 },
-	{ _MMIO(0x9888), 0x2b900156 },
-	{ _MMIO(0x9888), 0x2d900157 },
-	{ _MMIO(0x9888), 0x2f90015f },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 3);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 3);
-
-	if ((INTEL_INFO(dev_priv)->sseu.slice_mask & 0x01) &&
-	    (dev_priv->drm.pdev->revision < 0x02)) {
-		regs[n] = mux_config_memory_reads_0_slices_0x01_and_sku_lt_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_memory_reads_0_slices_0x01_and_sku_lt_0x02);
-		n++;
-	}
-	if ((dev_priv->drm.pdev->revision < 0x05) &&
-		   (dev_priv->drm.pdev->revision >= 0x02)) {
-		regs[n] = mux_config_memory_reads_0_sku_lt_0x05_and_sku_gte_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_memory_reads_0_sku_lt_0x05_and_sku_gte_0x02);
-		n++;
-	}
-	if (dev_priv->drm.pdev->revision >= 0x05) {
-		regs[n] = mux_config_memory_reads_0_sku_gte_0x05;
-		lens[n] = ARRAY_SIZE(mux_config_memory_reads_0_sku_gte_0x05);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes_0_slices_0x01_and_sku_lt_0x02[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x13945400 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901400 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x0f968000 },
-	{ _MMIO(0x9888), 0x1196c000 },
-	{ _MMIO(0x9888), 0x13964000 },
-	{ _MMIO(0x9888), 0x11938000 },
-	{ _MMIO(0x9888), 0x1b93fe00 },
-	{ _MMIO(0x9888), 0x01940010 },
-	{ _MMIO(0x9888), 0x07941100 },
-	{ _MMIO(0x9888), 0x09941312 },
-	{ _MMIO(0x9888), 0x0b941514 },
-	{ _MMIO(0x9888), 0x0d941716 },
-	{ _MMIO(0x9888), 0x11940000 },
-	{ _MMIO(0x9888), 0x19940000 },
-	{ _MMIO(0x9888), 0x1b940000 },
-	{ _MMIO(0x9888), 0x1d940000 },
-	{ _MMIO(0x9888), 0x1b954000 },
-	{ _MMIO(0x9888), 0x1d95a550 },
-	{ _MMIO(0x9888), 0x1f9502aa },
-	{ _MMIO(0x9888), 0x2f900167 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13908000 },
-	{ _MMIO(0x9888), 0x21908000 },
-	{ _MMIO(0x9888), 0x23908000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27908000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0xd28), 0x00000000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes_0_sku_lt_0x05_and_sku_gte_0x02[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x13945400 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901400 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x19930800 },
-	{ _MMIO(0x9888), 0x1b93aa55 },
-	{ _MMIO(0x9888), 0x1d93002a },
-	{ _MMIO(0x9888), 0x01940010 },
-	{ _MMIO(0x9888), 0x07941100 },
-	{ _MMIO(0x9888), 0x09941312 },
-	{ _MMIO(0x9888), 0x0b941514 },
-	{ _MMIO(0x9888), 0x0d941716 },
-	{ _MMIO(0x9888), 0x1b940000 },
-	{ _MMIO(0x9888), 0x11940000 },
-	{ _MMIO(0x9888), 0x01e58000 },
-	{ _MMIO(0x9888), 0x03e57000 },
-	{ _MMIO(0x9888), 0x2f900167 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x13908000 },
-	{ _MMIO(0x9888), 0x21908000 },
-	{ _MMIO(0x9888), 0x23908000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27908000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c20 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900400 },
-	{ _MMIO(0x9888), 0x47900421 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900421 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes_0_sku_gte_0x05[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901000 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900160 },
-	{ _MMIO(0x9888), 0x21900161 },
-	{ _MMIO(0x9888), 0x23900162 },
-	{ _MMIO(0x9888), 0x25900163 },
-	{ _MMIO(0x9888), 0x27900164 },
-	{ _MMIO(0x9888), 0x29900165 },
-	{ _MMIO(0x9888), 0x2b900166 },
-	{ _MMIO(0x9888), 0x2d900167 },
-	{ _MMIO(0x9888), 0x2f900150 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 3);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 3);
-
-	if ((INTEL_INFO(dev_priv)->sseu.slice_mask & 0x01) &&
-	    (dev_priv->drm.pdev->revision < 0x02)) {
-		regs[n] = mux_config_memory_writes_0_slices_0x01_and_sku_lt_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_memory_writes_0_slices_0x01_and_sku_lt_0x02);
-		n++;
-	}
-	if ((dev_priv->drm.pdev->revision < 0x05) &&
-		   (dev_priv->drm.pdev->revision >= 0x02)) {
-		regs[n] = mux_config_memory_writes_0_sku_lt_0x05_and_sku_gte_0x02;
-		lens[n] = ARRAY_SIZE(mux_config_memory_writes_0_sku_lt_0x05_and_sku_gte_0x02);
-		n++;
-	}
-	if (dev_priv->drm.pdev->revision >= 0x05) {
-		regs[n] = mux_config_memory_writes_0_sku_gte_0x05;
-		lens[n] = ARRAY_SIZE(mux_config_memory_writes_0_sku_gte_0x05);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fc2a },
-	{ _MMIO(0x2774), 0x0000bf00 },
-	{ _MMIO(0x2778), 0x0007fc6a },
-	{ _MMIO(0x277c), 0x0000bf00 },
-	{ _MMIO(0x2780), 0x0007fc92 },
-	{ _MMIO(0x2784), 0x0000bf00 },
-	{ _MMIO(0x2788), 0x0007fca2 },
-	{ _MMIO(0x278c), 0x0000bf00 },
-	{ _MMIO(0x2790), 0x0007fc32 },
-	{ _MMIO(0x2794), 0x0000bf00 },
-	{ _MMIO(0x2798), 0x0007fc9a },
-	{ _MMIO(0x279c), 0x0000bf00 },
-	{ _MMIO(0x27a0), 0x0007fe6a },
-	{ _MMIO(0x27a4), 0x0000bf00 },
-	{ _MMIO(0x27a8), 0x0007fe7a },
-	{ _MMIO(0x27ac), 0x0000bf00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended_0_subslices_0x01[] = {
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x141c8160 },
-	{ _MMIO(0x9888), 0x161c8015 },
-	{ _MMIO(0x9888), 0x181c0120 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4eaaa0 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0e6c0b01 },
-	{ _MMIO(0x9888), 0x006c0200 },
-	{ _MMIO(0x9888), 0x026c000c },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x001c0041 },
-	{ _MMIO(0x9888), 0x061c4200 },
-	{ _MMIO(0x9888), 0x081c4443 },
-	{ _MMIO(0x9888), 0x0a1c4645 },
-	{ _MMIO(0x9888), 0x0c1c7647 },
-	{ _MMIO(0x9888), 0x041c7357 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x101c0000 },
-	{ _MMIO(0x9888), 0x1a1c0000 },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4caa2a },
-	{ _MMIO(0x9888), 0x0c4c02aa },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5515 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0xd28), 0x00000000 },
-	{ _MMIO(0x9888), 0x11907fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900040 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900802 },
-	{ _MMIO(0x9888), 0x47900842 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900842 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900800 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	if (INTEL_INFO(dev_priv)->sseu.subslice_mask & 0x01) {
-		regs[n] = mux_config_compute_extended_0_subslices_0x01;
-		lens[n] = ARRAY_SIZE(mux_config_compute_extended_0_subslices_0x01);
-		n++;
-	}
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_l3_cache[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fffa },
-	{ _MMIO(0x2774), 0x0000fefe },
-	{ _MMIO(0x2778), 0x0007fffa },
-	{ _MMIO(0x277c), 0x0000fefd },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000fbef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000fbdf },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_l3_cache[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00101100 },
-	{ _MMIO(0xe45c), 0x00201200 },
-	{ _MMIO(0xe55c), 0x00301300 },
-	{ _MMIO(0xe65c), 0x00401400 },
-};
-
-static const struct i915_oa_reg mux_config_compute_l3_cache[] = {
-	{ _MMIO(0x9888), 0x166c0760 },
-	{ _MMIO(0x9888), 0x1593001e },
-	{ _MMIO(0x9888), 0x3f901403 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4e8020 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x006c0051 },
-	{ _MMIO(0x9888), 0x066c5000 },
-	{ _MMIO(0x9888), 0x086c5c5d },
-	{ _MMIO(0x9888), 0x0e6c5e5f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x061b8000 },
-	{ _MMIO(0x9888), 0x081bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1ce000 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2a00 },
-	{ _MMIO(0x9888), 0x0c4c0280 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f1500 },
-	{ _MMIO(0x9888), 0x100f0140 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162c0a00 },
-	{ _MMIO(0x9888), 0x182c00a0 },
-	{ _MMIO(0x9888), 0x03933300 },
-	{ _MMIO(0x9888), 0x05930032 },
-	{ _MMIO(0x9888), 0x11930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900167 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190030f },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900042 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x53901111 },
-	{ _MMIO(0x9888), 0x43900420 },
-};
-
-static int
-get_compute_l3_cache_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_l3_cache;
-	lens[n] = ARRAY_SIZE(mux_config_compute_l3_cache);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x104f0232 },
-	{ _MMIO(0x9888), 0x124f4640 },
-	{ _MMIO(0x9888), 0x106c0232 },
-	{ _MMIO(0x9888), 0x11834400 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x004f1880 },
-	{ _MMIO(0x9888), 0x024f08bb },
-	{ _MMIO(0x9888), 0x044f001b },
-	{ _MMIO(0x9888), 0x046c0100 },
-	{ _MMIO(0x9888), 0x066c000b },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x041b8000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025bc000 },
-	{ _MMIO(0x9888), 0x045b4000 },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x165c8000 },
-	{ _MMIO(0x9888), 0x185c8000 },
-	{ _MMIO(0x9888), 0x0a4c00a0 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x062cc000 },
-	{ _MMIO(0x9888), 0x082cc000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x0f8305c0 },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x07830000 },
-	{ _MMIO(0x9888), 0x1d950080 },
-	{ _MMIO(0x9888), 0x13928000 },
-	{ _MMIO(0x9888), 0x0f988000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x4b9000a0 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1[] = {
-	{ _MMIO(0x9888), 0x126c7b40 },
-	{ _MMIO(0x9888), 0x166c0020 },
-	{ _MMIO(0x9888), 0x0a603444 },
-	{ _MMIO(0x9888), 0x0a613400 },
-	{ _MMIO(0x9888), 0x1a4ea800 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x0c6c5327 },
-	{ _MMIO(0x9888), 0x0e6c5425 },
-	{ _MMIO(0x9888), 0x006c2a00 },
-	{ _MMIO(0x9888), 0x026c285b },
-	{ _MMIO(0x9888), 0x046c005c },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0800 },
-	{ _MMIO(0x9888), 0x0c1bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x1c1c003c },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x10600000 },
-	{ _MMIO(0x9888), 0x04600000 },
-	{ _MMIO(0x9888), 0x0c610044 },
-	{ _MMIO(0x9888), 0x10610000 },
-	{ _MMIO(0x9888), 0x06610000 },
-	{ _MMIO(0x9888), 0x0c4c02a8 },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0154 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190ffc0 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900420 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900021 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900400 },
-	{ _MMIO(0x9888), 0x43900421 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_1;
-	lens[n] = ARRAY_SIZE(mux_config_l3_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_2[] = {
-	{ _MMIO(0x9888), 0x126c02e0 },
-	{ _MMIO(0x9888), 0x146c0001 },
-	{ _MMIO(0x9888), 0x0a623400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x026c3324 },
-	{ _MMIO(0x9888), 0x046c3422 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x06614000 },
-	{ _MMIO(0x9888), 0x0c620044 },
-	{ _MMIO(0x9888), 0x10620000 },
-	{ _MMIO(0x9888), 0x06620000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_2_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_2;
-	lens[n] = ARRAY_SIZE(mux_config_l3_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_3[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_3[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_3[] = {
-	{ _MMIO(0x9888), 0x126c4e80 },
-	{ _MMIO(0x9888), 0x146c0000 },
-	{ _MMIO(0x9888), 0x0a633400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x026c3321 },
-	{ _MMIO(0x9888), 0x046c342f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c2000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x06604000 },
-	{ _MMIO(0x9888), 0x0c630044 },
-	{ _MMIO(0x9888), 0x10630000 },
-	{ _MMIO(0x9888), 0x06630000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c00aa },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900002 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_3_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_3;
-	lens[n] = ARRAY_SIZE(mux_config_l3_3);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000efff },
-	{ _MMIO(0x2778), 0x00006000 },
-	{ _MMIO(0x277c), 0x0000f3ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x102f3800 },
-	{ _MMIO(0x9888), 0x144d0500 },
-	{ _MMIO(0x9888), 0x120d03c0 },
-	{ _MMIO(0x9888), 0x140d03cf },
-	{ _MMIO(0x9888), 0x0c0f0004 },
-	{ _MMIO(0x9888), 0x0c4e4000 },
-	{ _MMIO(0x9888), 0x042f0480 },
-	{ _MMIO(0x9888), 0x082f0000 },
-	{ _MMIO(0x9888), 0x022f0000 },
-	{ _MMIO(0x9888), 0x0a4c0090 },
-	{ _MMIO(0x9888), 0x064d0027 },
-	{ _MMIO(0x9888), 0x004d0000 },
-	{ _MMIO(0x9888), 0x000d0d40 },
-	{ _MMIO(0x9888), 0x020d803f },
-	{ _MMIO(0x9888), 0x040d8023 },
-	{ _MMIO(0x9888), 0x100d0000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020f0010 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x0e0f0050 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41901400 },
-	{ _MMIO(0x9888), 0x43901485 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler[] = {
-	{ _MMIO(0x9888), 0x14152c00 },
-	{ _MMIO(0x9888), 0x16150005 },
-	{ _MMIO(0x9888), 0x121600a0 },
-	{ _MMIO(0x9888), 0x14352c00 },
-	{ _MMIO(0x9888), 0x16350005 },
-	{ _MMIO(0x9888), 0x123600a0 },
-	{ _MMIO(0x9888), 0x14552c00 },
-	{ _MMIO(0x9888), 0x16550005 },
-	{ _MMIO(0x9888), 0x125600a0 },
-	{ _MMIO(0x9888), 0x062f6000 },
-	{ _MMIO(0x9888), 0x022f2000 },
-	{ _MMIO(0x9888), 0x0c4c0050 },
-	{ _MMIO(0x9888), 0x0a4c0010 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0350 },
-	{ _MMIO(0x9888), 0x0c0fb000 },
-	{ _MMIO(0x9888), 0x0e0f00da },
-	{ _MMIO(0x9888), 0x182c0028 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x022dc000 },
-	{ _MMIO(0x9888), 0x042d4000 },
-	{ _MMIO(0x9888), 0x0c138000 },
-	{ _MMIO(0x9888), 0x0e132000 },
-	{ _MMIO(0x9888), 0x0413c000 },
-	{ _MMIO(0x9888), 0x1c140018 },
-	{ _MMIO(0x9888), 0x0c157000 },
-	{ _MMIO(0x9888), 0x0e150078 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x04162180 },
-	{ _MMIO(0x9888), 0x02160000 },
-	{ _MMIO(0x9888), 0x04174000 },
-	{ _MMIO(0x9888), 0x0233a000 },
-	{ _MMIO(0x9888), 0x04333000 },
-	{ _MMIO(0x9888), 0x14348000 },
-	{ _MMIO(0x9888), 0x16348000 },
-	{ _MMIO(0x9888), 0x02357870 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x04360043 },
-	{ _MMIO(0x9888), 0x02360000 },
-	{ _MMIO(0x9888), 0x04371000 },
-	{ _MMIO(0x9888), 0x0e538000 },
-	{ _MMIO(0x9888), 0x00538000 },
-	{ _MMIO(0x9888), 0x06533000 },
-	{ _MMIO(0x9888), 0x1c540020 },
-	{ _MMIO(0x9888), 0x12548000 },
-	{ _MMIO(0x9888), 0x0e557000 },
-	{ _MMIO(0x9888), 0x00557800 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x06560043 },
-	{ _MMIO(0x9888), 0x02560000 },
-	{ _MMIO(0x9888), 0x06571000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900060 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900060 },
-};
-
-static int
-get_sampler_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler;
-	lens[n] = ARRAY_SIZE(mux_config_sampler);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x00007fff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x00009fff },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000efff },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000f3ff },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fdff },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x12120000 },
-	{ _MMIO(0x9888), 0x12320000 },
-	{ _MMIO(0x9888), 0x12520000 },
-	{ _MMIO(0x9888), 0x002f8000 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0015 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f03a0 },
-	{ _MMIO(0x9888), 0x0c0ff000 },
-	{ _MMIO(0x9888), 0x0e0f0095 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2d8000 },
-	{ _MMIO(0x9888), 0x0e2d4000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x02108000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x02118000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x02121880 },
-	{ _MMIO(0x9888), 0x041219b5 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x02134000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x0c308000 },
-	{ _MMIO(0x9888), 0x0e304000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x0c318000 },
-	{ _MMIO(0x9888), 0x0e314000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x0c321a80 },
-	{ _MMIO(0x9888), 0x0e320033 },
-	{ _MMIO(0x9888), 0x06320031 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x0c334000 },
-	{ _MMIO(0x9888), 0x0e331000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0e508000 },
-	{ _MMIO(0x9888), 0x00508000 },
-	{ _MMIO(0x9888), 0x02504000 },
-	{ _MMIO(0x9888), 0x0e518000 },
-	{ _MMIO(0x9888), 0x00518000 },
-	{ _MMIO(0x9888), 0x02514000 },
-	{ _MMIO(0x9888), 0x0e521880 },
-	{ _MMIO(0x9888), 0x00521a80 },
-	{ _MMIO(0x9888), 0x02520033 },
-	{ _MMIO(0x9888), 0x0e534000 },
-	{ _MMIO(0x9888), 0x00534000 },
-	{ _MMIO(0x9888), 0x02531000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900062 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x12124d60 },
-	{ _MMIO(0x9888), 0x12322e60 },
-	{ _MMIO(0x9888), 0x12524d60 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0014 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0fe000 },
-	{ _MMIO(0x9888), 0x0e0f0097 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x002d8000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x04121fb7 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x00308000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x00318000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x00321b80 },
-	{ _MMIO(0x9888), 0x0632003f },
-	{ _MMIO(0x9888), 0x00334000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0250c000 },
-	{ _MMIO(0x9888), 0x0251c000 },
-	{ _MMIO(0x9888), 0x02521fb7 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x02535000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900063 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extra[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extra[] = {
-	{ _MMIO(0xe458), 0x00001000 },
-	{ _MMIO(0xe558), 0x00003002 },
-	{ _MMIO(0xe658), 0x00005004 },
-	{ _MMIO(0xe758), 0x00011010 },
-	{ _MMIO(0xe45c), 0x00050012 },
-	{ _MMIO(0xe55c), 0x00052051 },
-	{ _MMIO(0xe65c), 0x00000008 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extra[] = {
-	{ _MMIO(0x9888), 0x121203e0 },
-	{ _MMIO(0x9888), 0x123203e0 },
-	{ _MMIO(0x9888), 0x125203e0 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0a4c0040 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0e0f006c },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x042d8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06114000 },
-	{ _MMIO(0x9888), 0x06120033 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x04308000 },
-	{ _MMIO(0x9888), 0x04318000 },
-	{ _MMIO(0x9888), 0x04321980 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x04334000 },
-	{ _MMIO(0x9888), 0x04504000 },
-	{ _MMIO(0x9888), 0x04514000 },
-	{ _MMIO(0x9888), 0x04520033 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x04531000 },
-	{ _MMIO(0x9888), 0x1190e000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x43900c00 },
-	{ _MMIO(0x9888), 0x45900002 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_compute_extra_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extra;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extra);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_vme_pipe[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00100030 },
-	{ _MMIO(0x2774), 0x0000fff9 },
-	{ _MMIO(0x2778), 0x00000002 },
-	{ _MMIO(0x277c), 0x0000fffc },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000fff3 },
-	{ _MMIO(0x2788), 0x00100180 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000ffcf },
-	{ _MMIO(0x2798), 0x00000002 },
-	{ _MMIO(0x279c), 0x0000ff3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_vme_pipe[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00008003 },
-};
-
-static const struct i915_oa_reg mux_config_vme_pipe[] = {
-	{ _MMIO(0x9888), 0x141a5800 },
-	{ _MMIO(0x9888), 0x161a00c0 },
-	{ _MMIO(0x9888), 0x12180240 },
-	{ _MMIO(0x9888), 0x14180002 },
-	{ _MMIO(0x9888), 0x143a5800 },
-	{ _MMIO(0x9888), 0x163a00c0 },
-	{ _MMIO(0x9888), 0x12380240 },
-	{ _MMIO(0x9888), 0x14380002 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x022f8000 },
-	{ _MMIO(0x9888), 0x042f3000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c1500 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f9500 },
-	{ _MMIO(0x9888), 0x100f002a },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162c0a00 },
-	{ _MMIO(0x9888), 0x0a2dc000 },
-	{ _MMIO(0x9888), 0x0c2dc000 },
-	{ _MMIO(0x9888), 0x04193000 },
-	{ _MMIO(0x9888), 0x081a28c1 },
-	{ _MMIO(0x9888), 0x001a0000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x0613c000 },
-	{ _MMIO(0x9888), 0x0813f000 },
-	{ _MMIO(0x9888), 0x00172000 },
-	{ _MMIO(0x9888), 0x06178000 },
-	{ _MMIO(0x9888), 0x0817a000 },
-	{ _MMIO(0x9888), 0x00180037 },
-	{ _MMIO(0x9888), 0x06180940 },
-	{ _MMIO(0x9888), 0x08180000 },
-	{ _MMIO(0x9888), 0x02180000 },
-	{ _MMIO(0x9888), 0x04183000 },
-	{ _MMIO(0x9888), 0x06393000 },
-	{ _MMIO(0x9888), 0x0c3a28c1 },
-	{ _MMIO(0x9888), 0x003a0000 },
-	{ _MMIO(0x9888), 0x0a33f000 },
-	{ _MMIO(0x9888), 0x0c33f000 },
-	{ _MMIO(0x9888), 0x0a37a000 },
-	{ _MMIO(0x9888), 0x0c37a000 },
-	{ _MMIO(0x9888), 0x0a380977 },
-	{ _MMIO(0x9888), 0x08380000 },
-	{ _MMIO(0x9888), 0x04380000 },
-	{ _MMIO(0x9888), 0x06383000 },
-	{ _MMIO(0x9888), 0x119000ff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900040 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900800 },
-	{ _MMIO(0x9888), 0x47901000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900844 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_vme_pipe_mux_config(struct drm_i915_private *dev_priv,
-			const struct i915_oa_reg **regs,
-			int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_vme_pipe;
-	lens[n] = ARRAY_SIZE(mux_config_vme_pipe);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2714), 0xf0800000 },
@@ -2370,6 +59,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x00000080 },
 	{ _MMIO(0x9888), 0x11810000 },
 	{ _MMIO(0x9888), 0x07810016 },
 	{ _MMIO(0x9888), 0x1f810000 },
@@ -2384,1096 +74,35 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x33900000 },
 };
 
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_sklgt2(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_L3_CACHE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_l3_cache_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_L3_CACHE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_l3_cache;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_l3_cache);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_l3_cache;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_l3_cache);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_L3_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_2_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_2);
-
-		return 0;
-	case METRIC_SET_ID_L3_3:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_3_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_3\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_3;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_3);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_3;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_3);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTRA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extra_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTRA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extra;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extra);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extra;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extra);
-
-		return 0;
-	case METRIC_SET_ID_VME_PIPE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_vme_pipe_mux_config(dev_priv,
-						dev_priv->perf.oa.mux_regs,
-						dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"VME_PIPE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_vme_pipe;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_vme_pipe);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_vme_pipe;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_vme_pipe);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "f519e481-24d2-4d42-87c9-3fdd12c00202",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "fe47b29d-ae51-423e-bff4-27d965a95b60",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "e0ad5ae0-84ba-4f29-a723-1906c12cb774",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "9bc436dd-6130-4add-affc-283eb6eaa864",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "2ea0da8f-3527-4669-9d9d-13099a7435bf",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "d97d16af-028b-4cd1-a672-6210cb5513dd",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_compute_l3_cache_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_L3_CACHE);
-}
-
-static struct device_attribute dev_attr_compute_l3_cache_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_l3_cache_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_l3_cache[] = {
-	&dev_attr_compute_l3_cache_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_l3_cache = {
-	.name = "9fb22842-e708-43f7-9752-e0e41670c39e",
-	.attrs =  attrs_compute_l3_cache,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "5378e2a1-4248-4188-a4ae-da25a794c603",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "f42cdd6a-b000-42cb-870f-5eb423a7f514",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_l3_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_2);
-}
-
-static struct device_attribute dev_attr_l3_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_2[] = {
-	&dev_attr_l3_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_2 = {
-	.name = "b9bf2423-d88c-4a7b-a051-627611d00dcc",
-	.attrs =  attrs_l3_2,
-};
-
-static ssize_t
-show_l3_3_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_3);
-}
-
-static struct device_attribute dev_attr_l3_3_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_3_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_3[] = {
-	&dev_attr_l3_3_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_3 = {
-	.name = "2414a93d-d84f-406e-99c0-472161194b40",
-	.attrs =  attrs_l3_3,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "53a45d2d-170b-4cf5-b7bb-585120c8e2f5",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER);
-}
-
-static struct device_attribute dev_attr_sampler_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler[] = {
-	&dev_attr_sampler_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler = {
-	.name = "b4cff514-a91e-4798-a0b3-426ca13fc9c1",
-	.attrs =  attrs_sampler,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "7821d13b-9b8b-4405-9618-78cd56b62cce",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "893f1a4d-919d-4388-8cb7-746d73ea7259",
-	.attrs =  attrs_tdl_2,
-};
-
-static ssize_t
-show_compute_extra_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTRA);
-}
-
-static struct device_attribute dev_attr_compute_extra_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extra_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extra[] = {
-	&dev_attr_compute_extra_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extra = {
-	.name = "41a24047-7484-4ead-ae37-de907e5ff2b2",
-	.attrs =  attrs_compute_extra,
-};
-
-static ssize_t
-show_vme_pipe_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_VME_PIPE);
-}
-
-static struct device_attribute dev_attr_vme_pipe_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_vme_pipe_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_vme_pipe[] = {
-	&dev_attr_vme_pipe_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_vme_pipe = {
-	.name = "95910492-943f-44bd-9461-390240f243fd",
-	.attrs =  attrs_vme_pipe,
-};
-
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "1651949f-0ac0-4cb1-a06f-dafd74a407d1",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_sklgt2(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-		if (ret)
-			goto error_compute_l3_cache;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-		if (ret)
-			goto error_l3_2;
-	}
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-		if (ret)
-			goto error_l3_3;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler);
-		if (ret)
-			goto error_sampler;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-		if (ret)
-			goto error_compute_extra;
-	}
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-		if (ret)
-			goto error_vme_pipe;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-error_vme_pipe:
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-error_compute_extra:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-error_sampler:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-error_l3_3:
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-error_l3_2:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-error_compute_l3_cache:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_sklgt2(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_sklgt2(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"1651949f-0ac0-4cb1-a06f-dafd74a407d1",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "1651949f-0ac0-4cb1-a06f-dafd74a407d1";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_sklgt2.h b/drivers/gpu/drm/i915/i915_oa_sklgt2.h
index f4397ba..fe1aa2c 100644
--- a/drivers/gpu/drm/i915/i915_oa_sklgt2.h
+++ b/drivers/gpu/drm/i915/i915_oa_sklgt2.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_SKLGT2_H__
 #define __I915_OA_SKLGT2_H__
 
-extern int i915_oa_n_builtin_metric_sets_sklgt2;
-
-extern int i915_oa_select_metric_set_sklgt2(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_sklgt2(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_sklgt2(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_sklgt2(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_sklgt3.c b/drivers/gpu/drm/i915/i915_oa_sklgt3.c
index 7765e22..85e51ad 100644
--- a/drivers/gpu/drm/i915/i915_oa_sklgt3.c
+++ b/drivers/gpu/drm/i915/i915_oa_sklgt3.c
@@ -31,1876 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_sklgt3.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_COMPUTE_L3_CACHE,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_L3_2,
-	METRIC_SET_ID_L3_3,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_COMPUTE_EXTRA,
-	METRIC_SET_ID_VME_PIPE,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_sklgt3 = 18;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic[] = {
-	{ _MMIO(0x9888), 0x166c01e0 },
-	{ _MMIO(0x9888), 0x12170280 },
-	{ _MMIO(0x9888), 0x12370280 },
-	{ _MMIO(0x9888), 0x16ec01e0 },
-	{ _MMIO(0x9888), 0x11930317 },
-	{ _MMIO(0x9888), 0x159303df },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x1a4e0380 },
-	{ _MMIO(0x9888), 0x0a6c0053 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x0a1b4000 },
-	{ _MMIO(0x9888), 0x1c1c0001 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x042f1000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c8400 },
-	{ _MMIO(0x9888), 0x0c4c0002 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f6600 },
-	{ _MMIO(0x9888), 0x100f0001 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x162ca200 },
-	{ _MMIO(0x9888), 0x062d8000 },
-	{ _MMIO(0x9888), 0x082d8000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x08133000 },
-	{ _MMIO(0x9888), 0x00170020 },
-	{ _MMIO(0x9888), 0x08170021 },
-	{ _MMIO(0x9888), 0x10170000 },
-	{ _MMIO(0x9888), 0x0633c000 },
-	{ _MMIO(0x9888), 0x0833c000 },
-	{ _MMIO(0x9888), 0x06370800 },
-	{ _MMIO(0x9888), 0x08370840 },
-	{ _MMIO(0x9888), 0x10370000 },
-	{ _MMIO(0x9888), 0x1ace0200 },
-	{ _MMIO(0x9888), 0x0aec5300 },
-	{ _MMIO(0x9888), 0x10ec0000 },
-	{ _MMIO(0x9888), 0x1cec0000 },
-	{ _MMIO(0x9888), 0x0a9b8000 },
-	{ _MMIO(0x9888), 0x1c9c0002 },
-	{ _MMIO(0x9888), 0x0ccc0002 },
-	{ _MMIO(0x9888), 0x0a8d8000 },
-	{ _MMIO(0x9888), 0x108f0001 },
-	{ _MMIO(0x9888), 0x16ac8000 },
-	{ _MMIO(0x9888), 0x0d933031 },
-	{ _MMIO(0x9888), 0x0f933e3f },
-	{ _MMIO(0x9888), 0x01933d00 },
-	{ _MMIO(0x9888), 0x0393073c },
-	{ _MMIO(0x9888), 0x0593000e },
-	{ _MMIO(0x9888), 0x1d930000 },
-	{ _MMIO(0x9888), 0x19930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x2f908000 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x15908000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190003f },
-	{ _MMIO(0x9888), 0x51907710 },
-	{ _MMIO(0x9888), 0x419020a0 },
-	{ _MMIO(0x9888), 0x55901515 },
-	{ _MMIO(0x9888), 0x45900529 },
-	{ _MMIO(0x9888), 0x47901025 },
-	{ _MMIO(0x9888), 0x57907770 },
-	{ _MMIO(0x9888), 0x49902100 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900108 },
-	{ _MMIO(0x9888), 0x59900007 },
-	{ _MMIO(0x9888), 0x43902108 },
-	{ _MMIO(0x9888), 0x53907777 },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_basic;
-	lens[n] = ARRAY_SIZE(mux_config_render_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x124f1c00 },
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x1a4e0820 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x064f0900 },
-	{ _MMIO(0x9888), 0x084f0032 },
-	{ _MMIO(0x9888), 0x0a4f1891 },
-	{ _MMIO(0x9888), 0x0c4f0e00 },
-	{ _MMIO(0x9888), 0x0e4f003c },
-	{ _MMIO(0x9888), 0x004f0d80 },
-	{ _MMIO(0x9888), 0x024f003b },
-	{ _MMIO(0x9888), 0x006c0002 },
-	{ _MMIO(0x9888), 0x086c0100 },
-	{ _MMIO(0x9888), 0x0c6c000c },
-	{ _MMIO(0x9888), 0x0e6c0b00 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x081b8000 },
-	{ _MMIO(0x9888), 0x0c1b4000 },
-	{ _MMIO(0x9888), 0x0e1b8000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1c8000 },
-	{ _MMIO(0x9888), 0x1c1c0024 },
-	{ _MMIO(0x9888), 0x065b8000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5bc000 },
-	{ _MMIO(0x9888), 0x0c5b8000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x1a5c6000 },
-	{ _MMIO(0x9888), 0x1c5c001b },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2000 },
-	{ _MMIO(0x9888), 0x0c4c0208 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020d2000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2cc000 },
-	{ _MMIO(0x9888), 0x162cfb00 },
-	{ _MMIO(0x9888), 0x182c00be },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x19900157 },
-	{ _MMIO(0x9888), 0x1b900158 },
-	{ _MMIO(0x9888), 0x1d900105 },
-	{ _MMIO(0x9888), 0x1f900103 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x11900fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900863 },
-	{ _MMIO(0x9888), 0x47900802 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900802 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900002 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900c62 },
-	{ _MMIO(0x9888), 0x53903333 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_basic;
-	lens[n] = ARRAY_SIZE(mux_config_compute_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile[] = {
-	{ _MMIO(0x9888), 0x0c0e001f },
-	{ _MMIO(0x9888), 0x0a0f0000 },
-	{ _MMIO(0x9888), 0x10116800 },
-	{ _MMIO(0x9888), 0x178a03e0 },
-	{ _MMIO(0x9888), 0x11824c00 },
-	{ _MMIO(0x9888), 0x11830020 },
-	{ _MMIO(0x9888), 0x13840020 },
-	{ _MMIO(0x9888), 0x11850019 },
-	{ _MMIO(0x9888), 0x11860007 },
-	{ _MMIO(0x9888), 0x01870c40 },
-	{ _MMIO(0x9888), 0x17880000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0a4c0040 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x040d4000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020e5400 },
-	{ _MMIO(0x9888), 0x000e0000 },
-	{ _MMIO(0x9888), 0x080f0040 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x100f0000 },
-	{ _MMIO(0x9888), 0x0e0f0040 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06110012 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x01898000 },
-	{ _MMIO(0x9888), 0x0d890100 },
-	{ _MMIO(0x9888), 0x03898000 },
-	{ _MMIO(0x9888), 0x09808000 },
-	{ _MMIO(0x9888), 0x0b808000 },
-	{ _MMIO(0x9888), 0x0380c000 },
-	{ _MMIO(0x9888), 0x0f8a0075 },
-	{ _MMIO(0x9888), 0x1d8a0000 },
-	{ _MMIO(0x9888), 0x118a8000 },
-	{ _MMIO(0x9888), 0x1b8a4000 },
-	{ _MMIO(0x9888), 0x138a8000 },
-	{ _MMIO(0x9888), 0x1d81a000 },
-	{ _MMIO(0x9888), 0x15818000 },
-	{ _MMIO(0x9888), 0x17818000 },
-	{ _MMIO(0x9888), 0x0b820030 },
-	{ _MMIO(0x9888), 0x07828000 },
-	{ _MMIO(0x9888), 0x0d824000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x05824000 },
-	{ _MMIO(0x9888), 0x0d830003 },
-	{ _MMIO(0x9888), 0x0583000c },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x03838000 },
-	{ _MMIO(0x9888), 0x07838000 },
-	{ _MMIO(0x9888), 0x0b840980 },
-	{ _MMIO(0x9888), 0x03844d80 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x09848000 },
-	{ _MMIO(0x9888), 0x09850080 },
-	{ _MMIO(0x9888), 0x03850003 },
-	{ _MMIO(0x9888), 0x01850000 },
-	{ _MMIO(0x9888), 0x07860000 },
-	{ _MMIO(0x9888), 0x0f860400 },
-	{ _MMIO(0x9888), 0x09870032 },
-	{ _MMIO(0x9888), 0x01888052 },
-	{ _MMIO(0x9888), 0x11880000 },
-	{ _MMIO(0x9888), 0x09884000 },
-	{ _MMIO(0x9888), 0x1b931001 },
-	{ _MMIO(0x9888), 0x1d930001 },
-	{ _MMIO(0x9888), 0x19934000 },
-	{ _MMIO(0x9888), 0x1b958000 },
-	{ _MMIO(0x9888), 0x1d950094 },
-	{ _MMIO(0x9888), 0x19958000 },
-	{ _MMIO(0x9888), 0x09e58000 },
-	{ _MMIO(0x9888), 0x0be58000 },
-	{ _MMIO(0x9888), 0x03e5c000 },
-	{ _MMIO(0x9888), 0x0592c000 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d924000 },
-	{ _MMIO(0x9888), 0x0f924000 },
-	{ _MMIO(0x9888), 0x11928000 },
-	{ _MMIO(0x9888), 0x1392c000 },
-	{ _MMIO(0x9888), 0x09924000 },
-	{ _MMIO(0x9888), 0x01985000 },
-	{ _MMIO(0x9888), 0x07988000 },
-	{ _MMIO(0x9888), 0x09981000 },
-	{ _MMIO(0x9888), 0x0b982000 },
-	{ _MMIO(0x9888), 0x0d982000 },
-	{ _MMIO(0x9888), 0x0f989000 },
-	{ _MMIO(0x9888), 0x05982000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1190c080 },
-	{ _MMIO(0x9888), 0x51901150 },
-	{ _MMIO(0x9888), 0x41901400 },
-	{ _MMIO(0x9888), 0x55905111 },
-	{ _MMIO(0x9888), 0x45901400 },
-	{ _MMIO(0x9888), 0x479004a5 },
-	{ _MMIO(0x9888), 0x57903455 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b9000a0 },
-	{ _MMIO(0x9888), 0x59900001 },
-	{ _MMIO(0x9888), 0x43900005 },
-	{ _MMIO(0x9888), 0x53900455 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_pipe_profile;
-	lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900064 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900150 },
-	{ _MMIO(0x9888), 0x21900151 },
-	{ _MMIO(0x9888), 0x23900152 },
-	{ _MMIO(0x9888), 0x25900153 },
-	{ _MMIO(0x9888), 0x27900154 },
-	{ _MMIO(0x9888), 0x29900155 },
-	{ _MMIO(0x9888), 0x2b900156 },
-	{ _MMIO(0x9888), 0x2d900157 },
-	{ _MMIO(0x9888), 0x2f90015f },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_reads;
-	lens[n] = ARRAY_SIZE(mux_config_memory_reads);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901000 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900160 },
-	{ _MMIO(0x9888), 0x21900161 },
-	{ _MMIO(0x9888), 0x23900162 },
-	{ _MMIO(0x9888), 0x25900163 },
-	{ _MMIO(0x9888), 0x27900164 },
-	{ _MMIO(0x9888), 0x29900165 },
-	{ _MMIO(0x9888), 0x2b900166 },
-	{ _MMIO(0x9888), 0x2d900167 },
-	{ _MMIO(0x9888), 0x2f900150 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_writes;
-	lens[n] = ARRAY_SIZE(mux_config_memory_writes);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fc2a },
-	{ _MMIO(0x2774), 0x0000bf00 },
-	{ _MMIO(0x2778), 0x0007fc6a },
-	{ _MMIO(0x277c), 0x0000bf00 },
-	{ _MMIO(0x2780), 0x0007fc92 },
-	{ _MMIO(0x2784), 0x0000bf00 },
-	{ _MMIO(0x2788), 0x0007fca2 },
-	{ _MMIO(0x278c), 0x0000bf00 },
-	{ _MMIO(0x2790), 0x0007fc32 },
-	{ _MMIO(0x2794), 0x0000bf00 },
-	{ _MMIO(0x2798), 0x0007fc9a },
-	{ _MMIO(0x279c), 0x0000bf00 },
-	{ _MMIO(0x27a0), 0x0007fe6a },
-	{ _MMIO(0x27a4), 0x0000bf00 },
-	{ _MMIO(0x27a8), 0x0007fe7a },
-	{ _MMIO(0x27ac), 0x0000bf00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended[] = {
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x141c8160 },
-	{ _MMIO(0x9888), 0x161c8015 },
-	{ _MMIO(0x9888), 0x181c0120 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4eaaa0 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0e6c0b01 },
-	{ _MMIO(0x9888), 0x006c0200 },
-	{ _MMIO(0x9888), 0x026c000c },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x001c0041 },
-	{ _MMIO(0x9888), 0x061c4200 },
-	{ _MMIO(0x9888), 0x081c4443 },
-	{ _MMIO(0x9888), 0x0a1c4645 },
-	{ _MMIO(0x9888), 0x0c1c7647 },
-	{ _MMIO(0x9888), 0x041c7357 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x101c0000 },
-	{ _MMIO(0x9888), 0x1a1c0000 },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4caa2a },
-	{ _MMIO(0x9888), 0x0c4c02aa },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5515 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x11907fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900040 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900802 },
-	{ _MMIO(0x9888), 0x47900842 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900842 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900800 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extended;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extended);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_l3_cache[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fffa },
-	{ _MMIO(0x2774), 0x0000fefe },
-	{ _MMIO(0x2778), 0x0007fffa },
-	{ _MMIO(0x277c), 0x0000fefd },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000fbef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000fbdf },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_l3_cache[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00101100 },
-	{ _MMIO(0xe45c), 0x00201200 },
-	{ _MMIO(0xe55c), 0x00301300 },
-	{ _MMIO(0xe65c), 0x00401400 },
-};
-
-static const struct i915_oa_reg mux_config_compute_l3_cache[] = {
-	{ _MMIO(0x9888), 0x166c0760 },
-	{ _MMIO(0x9888), 0x1593001e },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4e8020 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x006c0051 },
-	{ _MMIO(0x9888), 0x066c5000 },
-	{ _MMIO(0x9888), 0x086c5c5d },
-	{ _MMIO(0x9888), 0x0e6c5e5f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x061b8000 },
-	{ _MMIO(0x9888), 0x081bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1ce000 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2a00 },
-	{ _MMIO(0x9888), 0x0c4c0280 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f1500 },
-	{ _MMIO(0x9888), 0x100f0140 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162c0a00 },
-	{ _MMIO(0x9888), 0x182c00a0 },
-	{ _MMIO(0x9888), 0x03933300 },
-	{ _MMIO(0x9888), 0x05930032 },
-	{ _MMIO(0x9888), 0x11930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190030f },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900063 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x53903333 },
-	{ _MMIO(0x9888), 0x43900840 },
-};
-
-static int
-get_compute_l3_cache_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_l3_cache;
-	lens[n] = ARRAY_SIZE(mux_config_compute_l3_cache);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x104f0232 },
-	{ _MMIO(0x9888), 0x124f4640 },
-	{ _MMIO(0x9888), 0x106c0232 },
-	{ _MMIO(0x9888), 0x11834400 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x004f1880 },
-	{ _MMIO(0x9888), 0x024f08bb },
-	{ _MMIO(0x9888), 0x044f001b },
-	{ _MMIO(0x9888), 0x046c0100 },
-	{ _MMIO(0x9888), 0x066c000b },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x041b8000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025bc000 },
-	{ _MMIO(0x9888), 0x045b4000 },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x165c8000 },
-	{ _MMIO(0x9888), 0x185c8000 },
-	{ _MMIO(0x9888), 0x0a4c00a0 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x062cc000 },
-	{ _MMIO(0x9888), 0x082cc000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x0f8305c0 },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x07830000 },
-	{ _MMIO(0x9888), 0x1d950080 },
-	{ _MMIO(0x9888), 0x13928000 },
-	{ _MMIO(0x9888), 0x0f988000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x59900005 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1[] = {
-	{ _MMIO(0x9888), 0x126c7b40 },
-	{ _MMIO(0x9888), 0x166c0020 },
-	{ _MMIO(0x9888), 0x0a603444 },
-	{ _MMIO(0x9888), 0x0a613400 },
-	{ _MMIO(0x9888), 0x1a4ea800 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x0c6c5327 },
-	{ _MMIO(0x9888), 0x0e6c5425 },
-	{ _MMIO(0x9888), 0x006c2a00 },
-	{ _MMIO(0x9888), 0x026c285b },
-	{ _MMIO(0x9888), 0x046c005c },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0800 },
-	{ _MMIO(0x9888), 0x0c1bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x1c1c003c },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x10600000 },
-	{ _MMIO(0x9888), 0x04600000 },
-	{ _MMIO(0x9888), 0x0c610044 },
-	{ _MMIO(0x9888), 0x10610000 },
-	{ _MMIO(0x9888), 0x06610000 },
-	{ _MMIO(0x9888), 0x0c4c02a8 },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0154 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190ffc0 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900420 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900021 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900400 },
-	{ _MMIO(0x9888), 0x43900421 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_1;
-	lens[n] = ARRAY_SIZE(mux_config_l3_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_2[] = {
-	{ _MMIO(0x9888), 0x126c02e0 },
-	{ _MMIO(0x9888), 0x146c0001 },
-	{ _MMIO(0x9888), 0x0a623400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x026c3324 },
-	{ _MMIO(0x9888), 0x046c3422 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x06614000 },
-	{ _MMIO(0x9888), 0x0c620044 },
-	{ _MMIO(0x9888), 0x10620000 },
-	{ _MMIO(0x9888), 0x06620000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_2_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_2;
-	lens[n] = ARRAY_SIZE(mux_config_l3_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_3[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_3[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_3[] = {
-	{ _MMIO(0x9888), 0x126c4e80 },
-	{ _MMIO(0x9888), 0x146c0000 },
-	{ _MMIO(0x9888), 0x0a633400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x026c3321 },
-	{ _MMIO(0x9888), 0x046c342f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c2000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x06604000 },
-	{ _MMIO(0x9888), 0x0c630044 },
-	{ _MMIO(0x9888), 0x10630000 },
-	{ _MMIO(0x9888), 0x06630000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c00aa },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900002 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_3_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_3;
-	lens[n] = ARRAY_SIZE(mux_config_l3_3);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000efff },
-	{ _MMIO(0x2778), 0x00006000 },
-	{ _MMIO(0x277c), 0x0000f3ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x102f3800 },
-	{ _MMIO(0x9888), 0x144d0500 },
-	{ _MMIO(0x9888), 0x120d03c0 },
-	{ _MMIO(0x9888), 0x140d03cf },
-	{ _MMIO(0x9888), 0x0c0f0004 },
-	{ _MMIO(0x9888), 0x0c4e4000 },
-	{ _MMIO(0x9888), 0x042f0480 },
-	{ _MMIO(0x9888), 0x082f0000 },
-	{ _MMIO(0x9888), 0x022f0000 },
-	{ _MMIO(0x9888), 0x0a4c0090 },
-	{ _MMIO(0x9888), 0x064d0027 },
-	{ _MMIO(0x9888), 0x004d0000 },
-	{ _MMIO(0x9888), 0x000d0d40 },
-	{ _MMIO(0x9888), 0x020d803f },
-	{ _MMIO(0x9888), 0x040d8023 },
-	{ _MMIO(0x9888), 0x100d0000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020f0010 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x0e0f0050 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41901400 },
-	{ _MMIO(0x9888), 0x43901485 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler[] = {
-	{ _MMIO(0x9888), 0x14152c00 },
-	{ _MMIO(0x9888), 0x16150005 },
-	{ _MMIO(0x9888), 0x121600a0 },
-	{ _MMIO(0x9888), 0x14352c00 },
-	{ _MMIO(0x9888), 0x16350005 },
-	{ _MMIO(0x9888), 0x123600a0 },
-	{ _MMIO(0x9888), 0x14552c00 },
-	{ _MMIO(0x9888), 0x16550005 },
-	{ _MMIO(0x9888), 0x125600a0 },
-	{ _MMIO(0x9888), 0x062f6000 },
-	{ _MMIO(0x9888), 0x022f2000 },
-	{ _MMIO(0x9888), 0x0c4c0050 },
-	{ _MMIO(0x9888), 0x0a4c0010 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0350 },
-	{ _MMIO(0x9888), 0x0c0fb000 },
-	{ _MMIO(0x9888), 0x0e0f00da },
-	{ _MMIO(0x9888), 0x182c0028 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x022dc000 },
-	{ _MMIO(0x9888), 0x042d4000 },
-	{ _MMIO(0x9888), 0x0c138000 },
-	{ _MMIO(0x9888), 0x0e132000 },
-	{ _MMIO(0x9888), 0x0413c000 },
-	{ _MMIO(0x9888), 0x1c140018 },
-	{ _MMIO(0x9888), 0x0c157000 },
-	{ _MMIO(0x9888), 0x0e150078 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x04162180 },
-	{ _MMIO(0x9888), 0x02160000 },
-	{ _MMIO(0x9888), 0x04174000 },
-	{ _MMIO(0x9888), 0x0233a000 },
-	{ _MMIO(0x9888), 0x04333000 },
-	{ _MMIO(0x9888), 0x14348000 },
-	{ _MMIO(0x9888), 0x16348000 },
-	{ _MMIO(0x9888), 0x02357870 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x04360043 },
-	{ _MMIO(0x9888), 0x02360000 },
-	{ _MMIO(0x9888), 0x04371000 },
-	{ _MMIO(0x9888), 0x0e538000 },
-	{ _MMIO(0x9888), 0x00538000 },
-	{ _MMIO(0x9888), 0x06533000 },
-	{ _MMIO(0x9888), 0x1c540020 },
-	{ _MMIO(0x9888), 0x12548000 },
-	{ _MMIO(0x9888), 0x0e557000 },
-	{ _MMIO(0x9888), 0x00557800 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x06560043 },
-	{ _MMIO(0x9888), 0x02560000 },
-	{ _MMIO(0x9888), 0x06571000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900060 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900060 },
-};
-
-static int
-get_sampler_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler;
-	lens[n] = ARRAY_SIZE(mux_config_sampler);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x00007fff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x00009fff },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000efff },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000f3ff },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fdff },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x12120000 },
-	{ _MMIO(0x9888), 0x12320000 },
-	{ _MMIO(0x9888), 0x12520000 },
-	{ _MMIO(0x9888), 0x002f8000 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0015 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f03a0 },
-	{ _MMIO(0x9888), 0x0c0ff000 },
-	{ _MMIO(0x9888), 0x0e0f0095 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2d8000 },
-	{ _MMIO(0x9888), 0x0e2d4000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x02108000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x02118000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x02121880 },
-	{ _MMIO(0x9888), 0x041219b5 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x02134000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x0c308000 },
-	{ _MMIO(0x9888), 0x0e304000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x0c318000 },
-	{ _MMIO(0x9888), 0x0e314000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x0c321a80 },
-	{ _MMIO(0x9888), 0x0e320033 },
-	{ _MMIO(0x9888), 0x06320031 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x0c334000 },
-	{ _MMIO(0x9888), 0x0e331000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0e508000 },
-	{ _MMIO(0x9888), 0x00508000 },
-	{ _MMIO(0x9888), 0x02504000 },
-	{ _MMIO(0x9888), 0x0e518000 },
-	{ _MMIO(0x9888), 0x00518000 },
-	{ _MMIO(0x9888), 0x02514000 },
-	{ _MMIO(0x9888), 0x0e521880 },
-	{ _MMIO(0x9888), 0x00521a80 },
-	{ _MMIO(0x9888), 0x02520033 },
-	{ _MMIO(0x9888), 0x0e534000 },
-	{ _MMIO(0x9888), 0x00534000 },
-	{ _MMIO(0x9888), 0x02531000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900062 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x12124d60 },
-	{ _MMIO(0x9888), 0x12322e60 },
-	{ _MMIO(0x9888), 0x12524d60 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0014 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0fe000 },
-	{ _MMIO(0x9888), 0x0e0f0097 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x002d8000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x04121fb7 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x00308000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x00318000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x00321b80 },
-	{ _MMIO(0x9888), 0x0632003f },
-	{ _MMIO(0x9888), 0x00334000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0250c000 },
-	{ _MMIO(0x9888), 0x0251c000 },
-	{ _MMIO(0x9888), 0x02521fb7 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x02535000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900063 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extra[] = {
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extra[] = {
-};
-
-static const struct i915_oa_reg mux_config_compute_extra[] = {
-	{ _MMIO(0x9888), 0x121203e0 },
-	{ _MMIO(0x9888), 0x123203e0 },
-	{ _MMIO(0x9888), 0x125203e0 },
-	{ _MMIO(0x9888), 0x129203e0 },
-	{ _MMIO(0x9888), 0x12b203e0 },
-	{ _MMIO(0x9888), 0x12d203e0 },
-	{ _MMIO(0x9888), 0x024ec000 },
-	{ _MMIO(0x9888), 0x044ec000 },
-	{ _MMIO(0x9888), 0x064ec000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x0a4c0042 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f006d },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x042d8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06114000 },
-	{ _MMIO(0x9888), 0x06120033 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x04308000 },
-	{ _MMIO(0x9888), 0x04318000 },
-	{ _MMIO(0x9888), 0x04321980 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x04334000 },
-	{ _MMIO(0x9888), 0x04504000 },
-	{ _MMIO(0x9888), 0x04514000 },
-	{ _MMIO(0x9888), 0x04520033 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x04531000 },
-	{ _MMIO(0x9888), 0x00af8000 },
-	{ _MMIO(0x9888), 0x0acc0001 },
-	{ _MMIO(0x9888), 0x008d8000 },
-	{ _MMIO(0x9888), 0x028da000 },
-	{ _MMIO(0x9888), 0x0c8fb000 },
-	{ _MMIO(0x9888), 0x0e8f0001 },
-	{ _MMIO(0x9888), 0x06ac8000 },
-	{ _MMIO(0x9888), 0x02ad4000 },
-	{ _MMIO(0x9888), 0x02908000 },
-	{ _MMIO(0x9888), 0x02918000 },
-	{ _MMIO(0x9888), 0x02921980 },
-	{ _MMIO(0x9888), 0x00920000 },
-	{ _MMIO(0x9888), 0x02934000 },
-	{ _MMIO(0x9888), 0x02b04000 },
-	{ _MMIO(0x9888), 0x02b14000 },
-	{ _MMIO(0x9888), 0x02b20033 },
-	{ _MMIO(0x9888), 0x00b20000 },
-	{ _MMIO(0x9888), 0x02b31000 },
-	{ _MMIO(0x9888), 0x00d08000 },
-	{ _MMIO(0x9888), 0x00d18000 },
-	{ _MMIO(0x9888), 0x00d21980 },
-	{ _MMIO(0x9888), 0x00d34000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900402 },
-	{ _MMIO(0x9888), 0x53901550 },
-	{ _MMIO(0x9888), 0x45900080 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_compute_extra_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extra;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extra);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_vme_pipe[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00100030 },
-	{ _MMIO(0x2774), 0x0000fff9 },
-	{ _MMIO(0x2778), 0x00000002 },
-	{ _MMIO(0x277c), 0x0000fffc },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000fff3 },
-	{ _MMIO(0x2788), 0x00100180 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000ffcf },
-	{ _MMIO(0x2798), 0x00000002 },
-	{ _MMIO(0x279c), 0x0000ff3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_vme_pipe[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00008003 },
-};
-
-static const struct i915_oa_reg mux_config_vme_pipe[] = {
-	{ _MMIO(0x9888), 0x141a5800 },
-	{ _MMIO(0x9888), 0x161a00c0 },
-	{ _MMIO(0x9888), 0x12180240 },
-	{ _MMIO(0x9888), 0x14180002 },
-	{ _MMIO(0x9888), 0x149a5800 },
-	{ _MMIO(0x9888), 0x169a00c0 },
-	{ _MMIO(0x9888), 0x12980240 },
-	{ _MMIO(0x9888), 0x14980002 },
-	{ _MMIO(0x9888), 0x1a4e3fc0 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x022f8000 },
-	{ _MMIO(0x9888), 0x042f3000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c9500 },
-	{ _MMIO(0x9888), 0x0c4c002a },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0015 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x182c000a },
-	{ _MMIO(0x9888), 0x04193000 },
-	{ _MMIO(0x9888), 0x081a28c1 },
-	{ _MMIO(0x9888), 0x001a0000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x0613c000 },
-	{ _MMIO(0x9888), 0x0813f000 },
-	{ _MMIO(0x9888), 0x00172000 },
-	{ _MMIO(0x9888), 0x06178000 },
-	{ _MMIO(0x9888), 0x0817a000 },
-	{ _MMIO(0x9888), 0x00180037 },
-	{ _MMIO(0x9888), 0x06180940 },
-	{ _MMIO(0x9888), 0x08180000 },
-	{ _MMIO(0x9888), 0x02180000 },
-	{ _MMIO(0x9888), 0x04183000 },
-	{ _MMIO(0x9888), 0x04afc000 },
-	{ _MMIO(0x9888), 0x06af3000 },
-	{ _MMIO(0x9888), 0x0acc4000 },
-	{ _MMIO(0x9888), 0x0ccc0015 },
-	{ _MMIO(0x9888), 0x0a8da000 },
-	{ _MMIO(0x9888), 0x0c8da000 },
-	{ _MMIO(0x9888), 0x0e8f4000 },
-	{ _MMIO(0x9888), 0x108f0015 },
-	{ _MMIO(0x9888), 0x16aca000 },
-	{ _MMIO(0x9888), 0x18ac000a },
-	{ _MMIO(0x9888), 0x06993000 },
-	{ _MMIO(0x9888), 0x0c9a28c1 },
-	{ _MMIO(0x9888), 0x009a0000 },
-	{ _MMIO(0x9888), 0x0a93f000 },
-	{ _MMIO(0x9888), 0x0c93f000 },
-	{ _MMIO(0x9888), 0x0a97a000 },
-	{ _MMIO(0x9888), 0x0c97a000 },
-	{ _MMIO(0x9888), 0x0a980977 },
-	{ _MMIO(0x9888), 0x08980000 },
-	{ _MMIO(0x9888), 0x04980000 },
-	{ _MMIO(0x9888), 0x06983000 },
-	{ _MMIO(0x9888), 0x119000ff },
-	{ _MMIO(0x9888), 0x51900050 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x55900115 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x47900884 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900002 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_vme_pipe_mux_config(struct drm_i915_private *dev_priv,
-			const struct i915_oa_reg **regs,
-			int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_vme_pipe;
-	lens[n] = ARRAY_SIZE(mux_config_vme_pipe);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2744), 0x00800000 },
@@ -1930,6 +60,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x00000080 },
 	{ _MMIO(0x9888), 0x11810000 },
 	{ _MMIO(0x9888), 0x07810013 },
 	{ _MMIO(0x9888), 0x1f810000 },
@@ -1944,1096 +75,35 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x33900000 },
 };
 
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_sklgt3(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_L3_CACHE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_l3_cache_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_L3_CACHE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_l3_cache;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_l3_cache);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_l3_cache;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_l3_cache);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_L3_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_2_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_2);
-
-		return 0;
-	case METRIC_SET_ID_L3_3:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_3_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_3\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_3;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_3);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_3;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_3);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTRA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extra_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTRA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extra;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extra);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extra;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extra);
-
-		return 0;
-	case METRIC_SET_ID_VME_PIPE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_vme_pipe_mux_config(dev_priv,
-						dev_priv->perf.oa.mux_regs,
-						dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"VME_PIPE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_vme_pipe;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_vme_pipe);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_vme_pipe;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_vme_pipe);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "4616d450-2393-4836-8146-53c5ed84d359",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "4320492b-fd03-42ac-922f-dbe1ef3b7b58",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "bd2d9cae-b9ec-4f5b-9d2f-934bed398a2d",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "4ca0f3fe-7fd3-4924-98cb-1807d9879767",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "a0c0172c-ee13-403d-99ff-2bdf6936cf14",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "52435e0b-f188-42ea-8680-21a56ee20dee",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_compute_l3_cache_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_L3_CACHE);
-}
-
-static struct device_attribute dev_attr_compute_l3_cache_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_l3_cache_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_l3_cache[] = {
-	&dev_attr_compute_l3_cache_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_l3_cache = {
-	.name = "27076eeb-49f3-4fed-8423-c66506005c63",
-	.attrs =  attrs_compute_l3_cache,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "8071b409-c39a-4674-94d7-32962ecfb512",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "5e0b391e-9ea8-4901-b2ff-b64ff616c7ed",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_l3_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_2);
-}
-
-static struct device_attribute dev_attr_l3_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_2[] = {
-	&dev_attr_l3_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_2 = {
-	.name = "25dc828e-1d2d-426e-9546-a1d4233cdf16",
-	.attrs =  attrs_l3_2,
-};
-
-static ssize_t
-show_l3_3_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_3);
-}
-
-static struct device_attribute dev_attr_l3_3_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_3_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_3[] = {
-	&dev_attr_l3_3_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_3 = {
-	.name = "3dba9405-2d7e-4d70-8199-e734e82fd6bf",
-	.attrs =  attrs_l3_3,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "76935d7b-09c9-46bf-87f1-c18b4a86ebe5",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER);
-}
-
-static struct device_attribute dev_attr_sampler_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler[] = {
-	&dev_attr_sampler_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler = {
-	.name = "1b34c0d6-4f4c-4d7b-833f-4aaf236d87a6",
-	.attrs =  attrs_sampler,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "b375c985-9953-455b-bda2-b03f7594e9db",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "3e2be2bb-884a-49bb-82c5-2358e6bd5f2d",
-	.attrs =  attrs_tdl_2,
-};
-
-static ssize_t
-show_compute_extra_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTRA);
-}
-
-static struct device_attribute dev_attr_compute_extra_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extra_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extra[] = {
-	&dev_attr_compute_extra_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extra = {
-	.name = "2d80a648-7b5a-4e92-bbe7-3b5c76f2e221",
-	.attrs =  attrs_compute_extra,
-};
-
-static ssize_t
-show_vme_pipe_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_VME_PIPE);
-}
-
-static struct device_attribute dev_attr_vme_pipe_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_vme_pipe_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_vme_pipe[] = {
-	&dev_attr_vme_pipe_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_vme_pipe = {
-	.name = "cfae9232-6ffc-42cc-a703-9790016925f0",
-	.attrs =  attrs_vme_pipe,
-};
-
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "2b985803-d3c9-4629-8a4f-634bfecba0e8",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_sklgt3(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-		if (ret)
-			goto error_compute_l3_cache;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-		if (ret)
-			goto error_l3_2;
-	}
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-		if (ret)
-			goto error_l3_3;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler);
-		if (ret)
-			goto error_sampler;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-		if (ret)
-			goto error_compute_extra;
-	}
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-		if (ret)
-			goto error_vme_pipe;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-error_vme_pipe:
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-error_compute_extra:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-error_sampler:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-error_l3_3:
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-error_l3_2:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-error_compute_l3_cache:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_sklgt3(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_sklgt3(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"2b985803-d3c9-4629-8a4f-634bfecba0e8",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "2b985803-d3c9-4629-8a4f-634bfecba0e8";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_sklgt3.h b/drivers/gpu/drm/i915/i915_oa_sklgt3.h
index c0accb1..06746b2 100644
--- a/drivers/gpu/drm/i915/i915_oa_sklgt3.h
+++ b/drivers/gpu/drm/i915/i915_oa_sklgt3.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_SKLGT3_H__
 #define __I915_OA_SKLGT3_H__
 
-extern int i915_oa_n_builtin_metric_sets_sklgt3;
-
-extern int i915_oa_select_metric_set_sklgt3(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_sklgt3(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_sklgt3(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_sklgt3(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_oa_sklgt4.c b/drivers/gpu/drm/i915/i915_oa_sklgt4.c
index 9ddab43..bce031e 100644
--- a/drivers/gpu/drm/i915/i915_oa_sklgt4.c
+++ b/drivers/gpu/drm/i915/i915_oa_sklgt4.c
@@ -31,1930 +31,6 @@
 #include "i915_drv.h"
 #include "i915_oa_sklgt4.h"
 
-enum metric_set_id {
-	METRIC_SET_ID_RENDER_BASIC = 1,
-	METRIC_SET_ID_COMPUTE_BASIC,
-	METRIC_SET_ID_RENDER_PIPE_PROFILE,
-	METRIC_SET_ID_MEMORY_READS,
-	METRIC_SET_ID_MEMORY_WRITES,
-	METRIC_SET_ID_COMPUTE_EXTENDED,
-	METRIC_SET_ID_COMPUTE_L3_CACHE,
-	METRIC_SET_ID_HDC_AND_SF,
-	METRIC_SET_ID_L3_1,
-	METRIC_SET_ID_L3_2,
-	METRIC_SET_ID_L3_3,
-	METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND,
-	METRIC_SET_ID_SAMPLER,
-	METRIC_SET_ID_TDL_1,
-	METRIC_SET_ID_TDL_2,
-	METRIC_SET_ID_COMPUTE_EXTRA,
-	METRIC_SET_ID_VME_PIPE,
-	METRIC_SET_ID_TEST_OA,
-};
-
-int i915_oa_n_builtin_metric_sets_sklgt4 = 18;
-
-static const struct i915_oa_reg b_counter_config_render_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_render_basic[] = {
-	{ _MMIO(0x9888), 0x166c01e0 },
-	{ _MMIO(0x9888), 0x12170280 },
-	{ _MMIO(0x9888), 0x12370280 },
-	{ _MMIO(0x9888), 0x16ec01e0 },
-	{ _MMIO(0x9888), 0x176c01e0 },
-	{ _MMIO(0x9888), 0x11930317 },
-	{ _MMIO(0x9888), 0x159303df },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x1a4e03b0 },
-	{ _MMIO(0x9888), 0x0a6c0053 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x0a1b4000 },
-	{ _MMIO(0x9888), 0x1c1c0001 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x042f1000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4ca400 },
-	{ _MMIO(0x9888), 0x0c4c0002 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f5600 },
-	{ _MMIO(0x9888), 0x100f0001 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x062d8000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x08133000 },
-	{ _MMIO(0x9888), 0x00170020 },
-	{ _MMIO(0x9888), 0x08170021 },
-	{ _MMIO(0x9888), 0x10170000 },
-	{ _MMIO(0x9888), 0x0633c000 },
-	{ _MMIO(0x9888), 0x06370800 },
-	{ _MMIO(0x9888), 0x10370000 },
-	{ _MMIO(0x9888), 0x1ace0230 },
-	{ _MMIO(0x9888), 0x0aec5300 },
-	{ _MMIO(0x9888), 0x10ec0000 },
-	{ _MMIO(0x9888), 0x1cec0000 },
-	{ _MMIO(0x9888), 0x0a9b8000 },
-	{ _MMIO(0x9888), 0x1c9c0002 },
-	{ _MMIO(0x9888), 0x0acc2000 },
-	{ _MMIO(0x9888), 0x0ccc0002 },
-	{ _MMIO(0x9888), 0x088d8000 },
-	{ _MMIO(0x9888), 0x0a8d8000 },
-	{ _MMIO(0x9888), 0x0e8f1000 },
-	{ _MMIO(0x9888), 0x108f0001 },
-	{ _MMIO(0x9888), 0x16ac8800 },
-	{ _MMIO(0x9888), 0x1b4e0020 },
-	{ _MMIO(0x9888), 0x096c5300 },
-	{ _MMIO(0x9888), 0x116c0000 },
-	{ _MMIO(0x9888), 0x1d6c0000 },
-	{ _MMIO(0x9888), 0x091b8000 },
-	{ _MMIO(0x9888), 0x1b1c8000 },
-	{ _MMIO(0x9888), 0x0b4c2000 },
-	{ _MMIO(0x9888), 0x090d8000 },
-	{ _MMIO(0x9888), 0x0f0f1000 },
-	{ _MMIO(0x9888), 0x172c0800 },
-	{ _MMIO(0x9888), 0x0d933031 },
-	{ _MMIO(0x9888), 0x0f933e3f },
-	{ _MMIO(0x9888), 0x01933d00 },
-	{ _MMIO(0x9888), 0x0393073c },
-	{ _MMIO(0x9888), 0x0593000e },
-	{ _MMIO(0x9888), 0x1d930000 },
-	{ _MMIO(0x9888), 0x19930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x2b908000 },
-	{ _MMIO(0x9888), 0x2d908000 },
-	{ _MMIO(0x9888), 0x2f908000 },
-	{ _MMIO(0x9888), 0x31908000 },
-	{ _MMIO(0x9888), 0x15908000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190003f },
-	{ _MMIO(0x9888), 0x5190ff30 },
-	{ _MMIO(0x9888), 0x41900060 },
-	{ _MMIO(0x9888), 0x55903033 },
-	{ _MMIO(0x9888), 0x45901421 },
-	{ _MMIO(0x9888), 0x47900803 },
-	{ _MMIO(0x9888), 0x5790fff1 },
-	{ _MMIO(0x9888), 0x49900001 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x5990000f },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x5390ffff },
-};
-
-static int
-get_render_basic_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_basic;
-	lens[n] = ARRAY_SIZE(mux_config_render_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_basic[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_basic[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_basic[] = {
-	{ _MMIO(0x9888), 0x104f00e0 },
-	{ _MMIO(0x9888), 0x124f1c00 },
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x1a4e0820 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x064f0900 },
-	{ _MMIO(0x9888), 0x084f0032 },
-	{ _MMIO(0x9888), 0x0a4f1891 },
-	{ _MMIO(0x9888), 0x0c4f0e00 },
-	{ _MMIO(0x9888), 0x0e4f003c },
-	{ _MMIO(0x9888), 0x004f0d80 },
-	{ _MMIO(0x9888), 0x024f003b },
-	{ _MMIO(0x9888), 0x006c0002 },
-	{ _MMIO(0x9888), 0x086c0100 },
-	{ _MMIO(0x9888), 0x0c6c000c },
-	{ _MMIO(0x9888), 0x0e6c0b00 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x081b8000 },
-	{ _MMIO(0x9888), 0x0c1b4000 },
-	{ _MMIO(0x9888), 0x0e1b8000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1c8000 },
-	{ _MMIO(0x9888), 0x1c1c0024 },
-	{ _MMIO(0x9888), 0x065b8000 },
-	{ _MMIO(0x9888), 0x085b4000 },
-	{ _MMIO(0x9888), 0x0a5bc000 },
-	{ _MMIO(0x9888), 0x0c5b8000 },
-	{ _MMIO(0x9888), 0x0e5b4000 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025b4000 },
-	{ _MMIO(0x9888), 0x1a5c6000 },
-	{ _MMIO(0x9888), 0x1c5c001b },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2000 },
-	{ _MMIO(0x9888), 0x0c4c0208 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020d2000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2cc000 },
-	{ _MMIO(0x9888), 0x162cfb00 },
-	{ _MMIO(0x9888), 0x182c00be },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x19900157 },
-	{ _MMIO(0x9888), 0x1b900158 },
-	{ _MMIO(0x9888), 0x1d900105 },
-	{ _MMIO(0x9888), 0x1f900103 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x11900fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900821 },
-	{ _MMIO(0x9888), 0x47900802 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900802 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900002 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900422 },
-	{ _MMIO(0x9888), 0x53905555 },
-};
-
-static int
-get_compute_basic_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_basic;
-	lens[n] = ARRAY_SIZE(mux_config_compute_basic);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_render_pipe_profile[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007ffea },
-	{ _MMIO(0x2774), 0x00007ffc },
-	{ _MMIO(0x2778), 0x0007affa },
-	{ _MMIO(0x277c), 0x0000f5fd },
-	{ _MMIO(0x2780), 0x00079ffa },
-	{ _MMIO(0x2784), 0x0000f3fb },
-	{ _MMIO(0x2788), 0x0007bf7a },
-	{ _MMIO(0x278c), 0x0000f7e7 },
-	{ _MMIO(0x2790), 0x0007fefa },
-	{ _MMIO(0x2794), 0x0000f7cf },
-	{ _MMIO(0x2798), 0x00077ffa },
-	{ _MMIO(0x279c), 0x0000efdf },
-	{ _MMIO(0x27a0), 0x0006fffa },
-	{ _MMIO(0x27a4), 0x0000cfbf },
-	{ _MMIO(0x27a8), 0x0003fffa },
-	{ _MMIO(0x27ac), 0x00005f7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_render_pipe_profile[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_render_pipe_profile[] = {
-	{ _MMIO(0x9888), 0x0c0e001f },
-	{ _MMIO(0x9888), 0x0a0f0000 },
-	{ _MMIO(0x9888), 0x10116800 },
-	{ _MMIO(0x9888), 0x178a03e0 },
-	{ _MMIO(0x9888), 0x11824c00 },
-	{ _MMIO(0x9888), 0x11830020 },
-	{ _MMIO(0x9888), 0x13840020 },
-	{ _MMIO(0x9888), 0x11850019 },
-	{ _MMIO(0x9888), 0x11860007 },
-	{ _MMIO(0x9888), 0x01870c40 },
-	{ _MMIO(0x9888), 0x17880000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0a4c0040 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x040d4000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020e5400 },
-	{ _MMIO(0x9888), 0x000e0000 },
-	{ _MMIO(0x9888), 0x080f0040 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x100f0000 },
-	{ _MMIO(0x9888), 0x0e0f0040 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06110012 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x01898000 },
-	{ _MMIO(0x9888), 0x0d890100 },
-	{ _MMIO(0x9888), 0x03898000 },
-	{ _MMIO(0x9888), 0x09808000 },
-	{ _MMIO(0x9888), 0x0b808000 },
-	{ _MMIO(0x9888), 0x0380c000 },
-	{ _MMIO(0x9888), 0x0f8a0075 },
-	{ _MMIO(0x9888), 0x1d8a0000 },
-	{ _MMIO(0x9888), 0x118a8000 },
-	{ _MMIO(0x9888), 0x1b8a4000 },
-	{ _MMIO(0x9888), 0x138a8000 },
-	{ _MMIO(0x9888), 0x1d81a000 },
-	{ _MMIO(0x9888), 0x15818000 },
-	{ _MMIO(0x9888), 0x17818000 },
-	{ _MMIO(0x9888), 0x0b820030 },
-	{ _MMIO(0x9888), 0x07828000 },
-	{ _MMIO(0x9888), 0x0d824000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x05824000 },
-	{ _MMIO(0x9888), 0x0d830003 },
-	{ _MMIO(0x9888), 0x0583000c },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x03838000 },
-	{ _MMIO(0x9888), 0x07838000 },
-	{ _MMIO(0x9888), 0x0b840980 },
-	{ _MMIO(0x9888), 0x03844d80 },
-	{ _MMIO(0x9888), 0x11840000 },
-	{ _MMIO(0x9888), 0x09848000 },
-	{ _MMIO(0x9888), 0x09850080 },
-	{ _MMIO(0x9888), 0x03850003 },
-	{ _MMIO(0x9888), 0x01850000 },
-	{ _MMIO(0x9888), 0x07860000 },
-	{ _MMIO(0x9888), 0x0f860400 },
-	{ _MMIO(0x9888), 0x09870032 },
-	{ _MMIO(0x9888), 0x01888052 },
-	{ _MMIO(0x9888), 0x11880000 },
-	{ _MMIO(0x9888), 0x09884000 },
-	{ _MMIO(0x9888), 0x1b931001 },
-	{ _MMIO(0x9888), 0x1d930001 },
-	{ _MMIO(0x9888), 0x19934000 },
-	{ _MMIO(0x9888), 0x1b958000 },
-	{ _MMIO(0x9888), 0x1d950094 },
-	{ _MMIO(0x9888), 0x19958000 },
-	{ _MMIO(0x9888), 0x09e58000 },
-	{ _MMIO(0x9888), 0x0be58000 },
-	{ _MMIO(0x9888), 0x03e5c000 },
-	{ _MMIO(0x9888), 0x0592c000 },
-	{ _MMIO(0x9888), 0x0b928000 },
-	{ _MMIO(0x9888), 0x0d924000 },
-	{ _MMIO(0x9888), 0x0f924000 },
-	{ _MMIO(0x9888), 0x11928000 },
-	{ _MMIO(0x9888), 0x1392c000 },
-	{ _MMIO(0x9888), 0x09924000 },
-	{ _MMIO(0x9888), 0x01985000 },
-	{ _MMIO(0x9888), 0x07988000 },
-	{ _MMIO(0x9888), 0x09981000 },
-	{ _MMIO(0x9888), 0x0b982000 },
-	{ _MMIO(0x9888), 0x0d982000 },
-	{ _MMIO(0x9888), 0x0f989000 },
-	{ _MMIO(0x9888), 0x05982000 },
-	{ _MMIO(0x9888), 0x13904000 },
-	{ _MMIO(0x9888), 0x21904000 },
-	{ _MMIO(0x9888), 0x23904000 },
-	{ _MMIO(0x9888), 0x25908000 },
-	{ _MMIO(0x9888), 0x27904000 },
-	{ _MMIO(0x9888), 0x29908000 },
-	{ _MMIO(0x9888), 0x2b904000 },
-	{ _MMIO(0x9888), 0x2f904000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x15904000 },
-	{ _MMIO(0x9888), 0x17908000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b904000 },
-	{ _MMIO(0x9888), 0x1190c080 },
-	{ _MMIO(0x9888), 0x51901110 },
-	{ _MMIO(0x9888), 0x41900440 },
-	{ _MMIO(0x9888), 0x55901111 },
-	{ _MMIO(0x9888), 0x45900400 },
-	{ _MMIO(0x9888), 0x47900c21 },
-	{ _MMIO(0x9888), 0x57901411 },
-	{ _MMIO(0x9888), 0x49900042 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900024 },
-	{ _MMIO(0x9888), 0x59900001 },
-	{ _MMIO(0x9888), 0x43900841 },
-	{ _MMIO(0x9888), 0x53900411 },
-};
-
-static int
-get_render_pipe_profile_mux_config(struct drm_i915_private *dev_priv,
-				   const struct i915_oa_reg **regs,
-				   int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_render_pipe_profile;
-	lens[n] = ARRAY_SIZE(mux_config_render_pipe_profile);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_reads[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f872 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_reads[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_reads[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f900064 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900150 },
-	{ _MMIO(0x9888), 0x21900151 },
-	{ _MMIO(0x9888), 0x23900152 },
-	{ _MMIO(0x9888), 0x25900153 },
-	{ _MMIO(0x9888), 0x27900154 },
-	{ _MMIO(0x9888), 0x29900155 },
-	{ _MMIO(0x9888), 0x2b900156 },
-	{ _MMIO(0x9888), 0x2d900157 },
-	{ _MMIO(0x9888), 0x2f90015f },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_reads_mux_config(struct drm_i915_private *dev_priv,
-			    const struct i915_oa_reg **regs,
-			    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_reads;
-	lens[n] = ARRAY_SIZE(mux_config_memory_reads);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_memory_writes[] = {
-	{ _MMIO(0x272c), 0xffffffff },
-	{ _MMIO(0x2728), 0xffffffff },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x271c), 0xffffffff },
-	{ _MMIO(0x2718), 0xffffffff },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x274c), 0x86543210 },
-	{ _MMIO(0x2748), 0x86543210 },
-	{ _MMIO(0x2744), 0x00006667 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x275c), 0x86543210 },
-	{ _MMIO(0x2758), 0x86543210 },
-	{ _MMIO(0x2754), 0x00006465 },
-	{ _MMIO(0x2750), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007f81a },
-	{ _MMIO(0x2774), 0x0000fe00 },
-	{ _MMIO(0x2778), 0x0007f82a },
-	{ _MMIO(0x277c), 0x0000fe00 },
-	{ _MMIO(0x2780), 0x0007f822 },
-	{ _MMIO(0x2784), 0x0000fe00 },
-	{ _MMIO(0x2788), 0x0007f8ba },
-	{ _MMIO(0x278c), 0x0000fe00 },
-	{ _MMIO(0x2790), 0x0007f87a },
-	{ _MMIO(0x2794), 0x0000fe00 },
-	{ _MMIO(0x2798), 0x0007f8ea },
-	{ _MMIO(0x279c), 0x0000fe00 },
-	{ _MMIO(0x27a0), 0x0007f8e2 },
-	{ _MMIO(0x27a4), 0x0000fe00 },
-	{ _MMIO(0x27a8), 0x0007f8f2 },
-	{ _MMIO(0x27ac), 0x0000fe00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_memory_writes[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00015014 },
-	{ _MMIO(0xe658), 0x00025024 },
-	{ _MMIO(0xe758), 0x00035034 },
-	{ _MMIO(0xe45c), 0x00045044 },
-	{ _MMIO(0xe55c), 0x00055054 },
-	{ _MMIO(0xe65c), 0x00065064 },
-};
-
-static const struct i915_oa_reg mux_config_memory_writes[] = {
-	{ _MMIO(0x9888), 0x11810c00 },
-	{ _MMIO(0x9888), 0x1381001a },
-	{ _MMIO(0x9888), 0x37906800 },
-	{ _MMIO(0x9888), 0x3f901000 },
-	{ _MMIO(0x9888), 0x03811300 },
-	{ _MMIO(0x9888), 0x05811b12 },
-	{ _MMIO(0x9888), 0x0781001a },
-	{ _MMIO(0x9888), 0x1f810000 },
-	{ _MMIO(0x9888), 0x17810000 },
-	{ _MMIO(0x9888), 0x19810000 },
-	{ _MMIO(0x9888), 0x1b810000 },
-	{ _MMIO(0x9888), 0x1d810000 },
-	{ _MMIO(0x9888), 0x1b930055 },
-	{ _MMIO(0x9888), 0x03e58000 },
-	{ _MMIO(0x9888), 0x05e5c000 },
-	{ _MMIO(0x9888), 0x07e54000 },
-	{ _MMIO(0x9888), 0x13900160 },
-	{ _MMIO(0x9888), 0x21900161 },
-	{ _MMIO(0x9888), 0x23900162 },
-	{ _MMIO(0x9888), 0x25900163 },
-	{ _MMIO(0x9888), 0x27900164 },
-	{ _MMIO(0x9888), 0x29900165 },
-	{ _MMIO(0x9888), 0x2b900166 },
-	{ _MMIO(0x9888), 0x2d900167 },
-	{ _MMIO(0x9888), 0x2f900150 },
-	{ _MMIO(0x9888), 0x31900105 },
-	{ _MMIO(0x9888), 0x15900103 },
-	{ _MMIO(0x9888), 0x17900101 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1d908000 },
-	{ _MMIO(0x9888), 0x1f908000 },
-	{ _MMIO(0x9888), 0x11900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c60 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900c63 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c63 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900063 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_memory_writes_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_memory_writes;
-	lens[n] = ARRAY_SIZE(mux_config_memory_writes);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extended[] = {
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fc2a },
-	{ _MMIO(0x2774), 0x0000bf00 },
-	{ _MMIO(0x2778), 0x0007fc6a },
-	{ _MMIO(0x277c), 0x0000bf00 },
-	{ _MMIO(0x2780), 0x0007fc92 },
-	{ _MMIO(0x2784), 0x0000bf00 },
-	{ _MMIO(0x2788), 0x0007fca2 },
-	{ _MMIO(0x278c), 0x0000bf00 },
-	{ _MMIO(0x2790), 0x0007fc32 },
-	{ _MMIO(0x2794), 0x0000bf00 },
-	{ _MMIO(0x2798), 0x0007fc9a },
-	{ _MMIO(0x279c), 0x0000bf00 },
-	{ _MMIO(0x27a0), 0x0007fe6a },
-	{ _MMIO(0x27a4), 0x0000bf00 },
-	{ _MMIO(0x27a8), 0x0007fe7a },
-	{ _MMIO(0x27ac), 0x0000bf00 },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extended[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00778008 },
-	{ _MMIO(0xe45c), 0x00088078 },
-	{ _MMIO(0xe55c), 0x00808708 },
-	{ _MMIO(0xe65c), 0x00a08908 },
-};
-
-static const struct i915_oa_reg mux_config_compute_extended[] = {
-	{ _MMIO(0x9888), 0x106c00e0 },
-	{ _MMIO(0x9888), 0x141c8160 },
-	{ _MMIO(0x9888), 0x161c8015 },
-	{ _MMIO(0x9888), 0x181c0120 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4eaaa0 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0e6c0b01 },
-	{ _MMIO(0x9888), 0x006c0200 },
-	{ _MMIO(0x9888), 0x026c000c },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x001c0041 },
-	{ _MMIO(0x9888), 0x061c4200 },
-	{ _MMIO(0x9888), 0x081c4443 },
-	{ _MMIO(0x9888), 0x0a1c4645 },
-	{ _MMIO(0x9888), 0x0c1c7647 },
-	{ _MMIO(0x9888), 0x041c7357 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x101c0000 },
-	{ _MMIO(0x9888), 0x1a1c0000 },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4caa2a },
-	{ _MMIO(0x9888), 0x0c4c02aa },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x000da000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x0c0f5400 },
-	{ _MMIO(0x9888), 0x0e0f5515 },
-	{ _MMIO(0x9888), 0x100f0155 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x11907fff },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900040 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900802 },
-	{ _MMIO(0x9888), 0x47900842 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900842 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x43900800 },
-	{ _MMIO(0x9888), 0x53900000 },
-};
-
-static int
-get_compute_extended_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extended;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extended);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_l3_cache[] = {
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2770), 0x0007fffa },
-	{ _MMIO(0x2774), 0x0000fefe },
-	{ _MMIO(0x2778), 0x0007fffa },
-	{ _MMIO(0x277c), 0x0000fefd },
-	{ _MMIO(0x2790), 0x0007fffa },
-	{ _MMIO(0x2794), 0x0000fbef },
-	{ _MMIO(0x2798), 0x0007fffa },
-	{ _MMIO(0x279c), 0x0000fbdf },
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_l3_cache[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00000003 },
-	{ _MMIO(0xe658), 0x00002001 },
-	{ _MMIO(0xe758), 0x00101100 },
-	{ _MMIO(0xe45c), 0x00201200 },
-	{ _MMIO(0xe55c), 0x00301300 },
-	{ _MMIO(0xe65c), 0x00401400 },
-};
-
-static const struct i915_oa_reg mux_config_compute_l3_cache[] = {
-	{ _MMIO(0x9888), 0x166c0760 },
-	{ _MMIO(0x9888), 0x1593001e },
-	{ _MMIO(0x9888), 0x3f900003 },
-	{ _MMIO(0x9888), 0x004e8000 },
-	{ _MMIO(0x9888), 0x0e4e8000 },
-	{ _MMIO(0x9888), 0x184e8000 },
-	{ _MMIO(0x9888), 0x1a4e8020 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x006c0051 },
-	{ _MMIO(0x9888), 0x066c5000 },
-	{ _MMIO(0x9888), 0x086c5c5d },
-	{ _MMIO(0x9888), 0x0e6c5e5f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x186c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x001b4000 },
-	{ _MMIO(0x9888), 0x061b8000 },
-	{ _MMIO(0x9888), 0x081bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x101c8000 },
-	{ _MMIO(0x9888), 0x1a1ce000 },
-	{ _MMIO(0x9888), 0x1c1c0030 },
-	{ _MMIO(0x9888), 0x004c8000 },
-	{ _MMIO(0x9888), 0x0a4c2a00 },
-	{ _MMIO(0x9888), 0x0c4c0280 },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f1500 },
-	{ _MMIO(0x9888), 0x100f0140 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162c0a00 },
-	{ _MMIO(0x9888), 0x182c00a0 },
-	{ _MMIO(0x9888), 0x03933300 },
-	{ _MMIO(0x9888), 0x05930032 },
-	{ _MMIO(0x9888), 0x11930000 },
-	{ _MMIO(0x9888), 0x1b930000 },
-	{ _MMIO(0x9888), 0x1d900157 },
-	{ _MMIO(0x9888), 0x1f900158 },
-	{ _MMIO(0x9888), 0x35900000 },
-	{ _MMIO(0x9888), 0x19908000 },
-	{ _MMIO(0x9888), 0x1b908000 },
-	{ _MMIO(0x9888), 0x1190030f },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900000 },
-	{ _MMIO(0x9888), 0x55900000 },
-	{ _MMIO(0x9888), 0x45900021 },
-	{ _MMIO(0x9888), 0x47900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x4b900000 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x53905555 },
-	{ _MMIO(0x9888), 0x43900000 },
-};
-
-static int
-get_compute_l3_cache_mux_config(struct drm_i915_private *dev_priv,
-				const struct i915_oa_reg **regs,
-				int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_l3_cache;
-	lens[n] = ARRAY_SIZE(mux_config_compute_l3_cache);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_hdc_and_sf[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x10800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000fdff },
-};
-
-static const struct i915_oa_reg flex_eu_config_hdc_and_sf[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_hdc_and_sf[] = {
-	{ _MMIO(0x9888), 0x104f0232 },
-	{ _MMIO(0x9888), 0x124f4640 },
-	{ _MMIO(0x9888), 0x106c0232 },
-	{ _MMIO(0x9888), 0x11834400 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x004f1880 },
-	{ _MMIO(0x9888), 0x024f08bb },
-	{ _MMIO(0x9888), 0x044f001b },
-	{ _MMIO(0x9888), 0x046c0100 },
-	{ _MMIO(0x9888), 0x066c000b },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x041b8000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x005b8000 },
-	{ _MMIO(0x9888), 0x025bc000 },
-	{ _MMIO(0x9888), 0x045b4000 },
-	{ _MMIO(0x9888), 0x125c8000 },
-	{ _MMIO(0x9888), 0x145c8000 },
-	{ _MMIO(0x9888), 0x165c8000 },
-	{ _MMIO(0x9888), 0x185c8000 },
-	{ _MMIO(0x9888), 0x0a4c00a0 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x022cc000 },
-	{ _MMIO(0x9888), 0x042cc000 },
-	{ _MMIO(0x9888), 0x062cc000 },
-	{ _MMIO(0x9888), 0x082cc000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x0f828000 },
-	{ _MMIO(0x9888), 0x0f8305c0 },
-	{ _MMIO(0x9888), 0x09830000 },
-	{ _MMIO(0x9888), 0x07830000 },
-	{ _MMIO(0x9888), 0x1d950080 },
-	{ _MMIO(0x9888), 0x13928000 },
-	{ _MMIO(0x9888), 0x0f988000 },
-	{ _MMIO(0x9888), 0x31904000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x59900001 },
-	{ _MMIO(0x9888), 0x4b900040 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_hdc_and_sf_mux_config(struct drm_i915_private *dev_priv,
-			  const struct i915_oa_reg **regs,
-			  int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_hdc_and_sf;
-	lens[n] = ARRAY_SIZE(mux_config_hdc_and_sf);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0xf0800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00014002 },
-	{ _MMIO(0x277c), 0x0000c3ff },
-	{ _MMIO(0x2780), 0x00010002 },
-	{ _MMIO(0x2784), 0x0000c7ff },
-	{ _MMIO(0x2788), 0x00004002 },
-	{ _MMIO(0x278c), 0x0000d3ff },
-	{ _MMIO(0x2790), 0x00100700 },
-	{ _MMIO(0x2794), 0x0000ff1f },
-	{ _MMIO(0x2798), 0x00001402 },
-	{ _MMIO(0x279c), 0x0000fc3f },
-	{ _MMIO(0x27a0), 0x00001002 },
-	{ _MMIO(0x27a4), 0x0000fc7f },
-	{ _MMIO(0x27a8), 0x00000402 },
-	{ _MMIO(0x27ac), 0x0000fd3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_1[] = {
-	{ _MMIO(0x9888), 0x126c7b40 },
-	{ _MMIO(0x9888), 0x166c0020 },
-	{ _MMIO(0x9888), 0x0a603444 },
-	{ _MMIO(0x9888), 0x0a613400 },
-	{ _MMIO(0x9888), 0x1a4ea800 },
-	{ _MMIO(0x9888), 0x1c4e0002 },
-	{ _MMIO(0x9888), 0x024e8000 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x0c6c5327 },
-	{ _MMIO(0x9888), 0x0e6c5425 },
-	{ _MMIO(0x9888), 0x006c2a00 },
-	{ _MMIO(0x9888), 0x026c285b },
-	{ _MMIO(0x9888), 0x046c005c },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1c6c0000 },
-	{ _MMIO(0x9888), 0x1e6c0000 },
-	{ _MMIO(0x9888), 0x1a6c0800 },
-	{ _MMIO(0x9888), 0x0c1bc000 },
-	{ _MMIO(0x9888), 0x0e1bc000 },
-	{ _MMIO(0x9888), 0x001b8000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x1c1c003c },
-	{ _MMIO(0x9888), 0x121c8000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x10600000 },
-	{ _MMIO(0x9888), 0x04600000 },
-	{ _MMIO(0x9888), 0x0c610044 },
-	{ _MMIO(0x9888), 0x10610000 },
-	{ _MMIO(0x9888), 0x06610000 },
-	{ _MMIO(0x9888), 0x0c4c02a8 },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0154 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x182c00aa },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190ffc0 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900420 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900021 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900400 },
-	{ _MMIO(0x9888), 0x43900421 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_l3_1_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_1;
-	lens[n] = ARRAY_SIZE(mux_config_l3_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_2[] = {
-	{ _MMIO(0x9888), 0x126c02e0 },
-	{ _MMIO(0x9888), 0x146c0001 },
-	{ _MMIO(0x9888), 0x0a623400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x064f4000 },
-	{ _MMIO(0x9888), 0x026c3324 },
-	{ _MMIO(0x9888), 0x046c3422 },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c0000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c0800 },
-	{ _MMIO(0x9888), 0x065b4000 },
-	{ _MMIO(0x9888), 0x1a5c1000 },
-	{ _MMIO(0x9888), 0x06614000 },
-	{ _MMIO(0x9888), 0x0c620044 },
-	{ _MMIO(0x9888), 0x10620000 },
-	{ _MMIO(0x9888), 0x06620000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c002a },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2cc000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900000 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_2_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_2;
-	lens[n] = ARRAY_SIZE(mux_config_l3_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_l3_3[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00100070 },
-	{ _MMIO(0x2774), 0x0000fff1 },
-	{ _MMIO(0x2778), 0x00028002 },
-	{ _MMIO(0x277c), 0x000087ff },
-	{ _MMIO(0x2780), 0x00020002 },
-	{ _MMIO(0x2784), 0x00008fff },
-	{ _MMIO(0x2788), 0x00008002 },
-	{ _MMIO(0x278c), 0x0000a7ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_l3_3[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_l3_3[] = {
-	{ _MMIO(0x9888), 0x126c4e80 },
-	{ _MMIO(0x9888), 0x146c0000 },
-	{ _MMIO(0x9888), 0x0a633400 },
-	{ _MMIO(0x9888), 0x044e8000 },
-	{ _MMIO(0x9888), 0x064e8000 },
-	{ _MMIO(0x9888), 0x084e8000 },
-	{ _MMIO(0x9888), 0x0a4e8000 },
-	{ _MMIO(0x9888), 0x0c4e8000 },
-	{ _MMIO(0x9888), 0x026c3321 },
-	{ _MMIO(0x9888), 0x046c342f },
-	{ _MMIO(0x9888), 0x106c0000 },
-	{ _MMIO(0x9888), 0x1a6c2000 },
-	{ _MMIO(0x9888), 0x021bc000 },
-	{ _MMIO(0x9888), 0x041bc000 },
-	{ _MMIO(0x9888), 0x061b4000 },
-	{ _MMIO(0x9888), 0x141c8000 },
-	{ _MMIO(0x9888), 0x161c8000 },
-	{ _MMIO(0x9888), 0x181c8000 },
-	{ _MMIO(0x9888), 0x1a1c1800 },
-	{ _MMIO(0x9888), 0x06604000 },
-	{ _MMIO(0x9888), 0x0c630044 },
-	{ _MMIO(0x9888), 0x10630000 },
-	{ _MMIO(0x9888), 0x06630000 },
-	{ _MMIO(0x9888), 0x084c8000 },
-	{ _MMIO(0x9888), 0x0a4c00aa },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0f4000 },
-	{ _MMIO(0x9888), 0x0e0f0055 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190f800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900002 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_l3_3_mux_config(struct drm_i915_private *dev_priv,
-		    const struct i915_oa_reg **regs,
-		    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_l3_3;
-	lens[n] = ARRAY_SIZE(mux_config_l3_3);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x30800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x0000efff },
-	{ _MMIO(0x2778), 0x00006000 },
-	{ _MMIO(0x277c), 0x0000f3ff },
-};
-
-static const struct i915_oa_reg flex_eu_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_rasterizer_and_pixel_backend[] = {
-	{ _MMIO(0x9888), 0x102f3800 },
-	{ _MMIO(0x9888), 0x144d0500 },
-	{ _MMIO(0x9888), 0x120d03c0 },
-	{ _MMIO(0x9888), 0x140d03cf },
-	{ _MMIO(0x9888), 0x0c0f0004 },
-	{ _MMIO(0x9888), 0x0c4e4000 },
-	{ _MMIO(0x9888), 0x042f0480 },
-	{ _MMIO(0x9888), 0x082f0000 },
-	{ _MMIO(0x9888), 0x022f0000 },
-	{ _MMIO(0x9888), 0x0a4c0090 },
-	{ _MMIO(0x9888), 0x064d0027 },
-	{ _MMIO(0x9888), 0x004d0000 },
-	{ _MMIO(0x9888), 0x000d0d40 },
-	{ _MMIO(0x9888), 0x020d803f },
-	{ _MMIO(0x9888), 0x040d8023 },
-	{ _MMIO(0x9888), 0x100d0000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x020f0010 },
-	{ _MMIO(0x9888), 0x000f0000 },
-	{ _MMIO(0x9888), 0x0e0f0050 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41901400 },
-	{ _MMIO(0x9888), 0x43901485 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900001 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_rasterizer_and_pixel_backend_mux_config(struct drm_i915_private *dev_priv,
-					    const struct i915_oa_reg **regs,
-					    int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_rasterizer_and_pixel_backend;
-	lens[n] = ARRAY_SIZE(mux_config_rasterizer_and_pixel_backend);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_sampler[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x70800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-	{ _MMIO(0x2770), 0x0000c000 },
-	{ _MMIO(0x2774), 0x0000e7ff },
-	{ _MMIO(0x2778), 0x00003000 },
-	{ _MMIO(0x277c), 0x0000f9ff },
-	{ _MMIO(0x2780), 0x00000c00 },
-	{ _MMIO(0x2784), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_sampler[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_sampler[] = {
-	{ _MMIO(0x9888), 0x14152c00 },
-	{ _MMIO(0x9888), 0x16150005 },
-	{ _MMIO(0x9888), 0x121600a0 },
-	{ _MMIO(0x9888), 0x14352c00 },
-	{ _MMIO(0x9888), 0x16350005 },
-	{ _MMIO(0x9888), 0x123600a0 },
-	{ _MMIO(0x9888), 0x14552c00 },
-	{ _MMIO(0x9888), 0x16550005 },
-	{ _MMIO(0x9888), 0x125600a0 },
-	{ _MMIO(0x9888), 0x062f6000 },
-	{ _MMIO(0x9888), 0x022f2000 },
-	{ _MMIO(0x9888), 0x0c4c0050 },
-	{ _MMIO(0x9888), 0x0a4c0010 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0350 },
-	{ _MMIO(0x9888), 0x0c0fb000 },
-	{ _MMIO(0x9888), 0x0e0f00da },
-	{ _MMIO(0x9888), 0x182c0028 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x022dc000 },
-	{ _MMIO(0x9888), 0x042d4000 },
-	{ _MMIO(0x9888), 0x0c138000 },
-	{ _MMIO(0x9888), 0x0e132000 },
-	{ _MMIO(0x9888), 0x0413c000 },
-	{ _MMIO(0x9888), 0x1c140018 },
-	{ _MMIO(0x9888), 0x0c157000 },
-	{ _MMIO(0x9888), 0x0e150078 },
-	{ _MMIO(0x9888), 0x10150000 },
-	{ _MMIO(0x9888), 0x04162180 },
-	{ _MMIO(0x9888), 0x02160000 },
-	{ _MMIO(0x9888), 0x04174000 },
-	{ _MMIO(0x9888), 0x0233a000 },
-	{ _MMIO(0x9888), 0x04333000 },
-	{ _MMIO(0x9888), 0x14348000 },
-	{ _MMIO(0x9888), 0x16348000 },
-	{ _MMIO(0x9888), 0x02357870 },
-	{ _MMIO(0x9888), 0x10350000 },
-	{ _MMIO(0x9888), 0x04360043 },
-	{ _MMIO(0x9888), 0x02360000 },
-	{ _MMIO(0x9888), 0x04371000 },
-	{ _MMIO(0x9888), 0x0e538000 },
-	{ _MMIO(0x9888), 0x00538000 },
-	{ _MMIO(0x9888), 0x06533000 },
-	{ _MMIO(0x9888), 0x1c540020 },
-	{ _MMIO(0x9888), 0x12548000 },
-	{ _MMIO(0x9888), 0x0e557000 },
-	{ _MMIO(0x9888), 0x00557800 },
-	{ _MMIO(0x9888), 0x10550000 },
-	{ _MMIO(0x9888), 0x06560043 },
-	{ _MMIO(0x9888), 0x02560000 },
-	{ _MMIO(0x9888), 0x06571000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900000 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900060 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900842 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900060 },
-};
-
-static int
-get_sampler_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_sampler;
-	lens[n] = ARRAY_SIZE(mux_config_sampler);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_1[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00000002 },
-	{ _MMIO(0x2774), 0x00007fff },
-	{ _MMIO(0x2778), 0x00000000 },
-	{ _MMIO(0x277c), 0x00009fff },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000efff },
-	{ _MMIO(0x2788), 0x00000000 },
-	{ _MMIO(0x278c), 0x0000f3ff },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000fdff },
-	{ _MMIO(0x2798), 0x00000000 },
-	{ _MMIO(0x279c), 0x0000fe7f },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_1[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_1[] = {
-	{ _MMIO(0x9888), 0x12120000 },
-	{ _MMIO(0x9888), 0x12320000 },
-	{ _MMIO(0x9888), 0x12520000 },
-	{ _MMIO(0x9888), 0x002f8000 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0015 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f03a0 },
-	{ _MMIO(0x9888), 0x0c0ff000 },
-	{ _MMIO(0x9888), 0x0e0f0095 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x0c2d8000 },
-	{ _MMIO(0x9888), 0x0e2d4000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x02108000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x02118000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x02121880 },
-	{ _MMIO(0x9888), 0x041219b5 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x02134000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x0c308000 },
-	{ _MMIO(0x9888), 0x0e304000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x0c318000 },
-	{ _MMIO(0x9888), 0x0e314000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x0c321a80 },
-	{ _MMIO(0x9888), 0x0e320033 },
-	{ _MMIO(0x9888), 0x06320031 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x0c334000 },
-	{ _MMIO(0x9888), 0x0e331000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0e508000 },
-	{ _MMIO(0x9888), 0x00508000 },
-	{ _MMIO(0x9888), 0x02504000 },
-	{ _MMIO(0x9888), 0x0e518000 },
-	{ _MMIO(0x9888), 0x00518000 },
-	{ _MMIO(0x9888), 0x02514000 },
-	{ _MMIO(0x9888), 0x0e521880 },
-	{ _MMIO(0x9888), 0x00521a80 },
-	{ _MMIO(0x9888), 0x02520033 },
-	{ _MMIO(0x9888), 0x0e534000 },
-	{ _MMIO(0x9888), 0x00534000 },
-	{ _MMIO(0x9888), 0x02531000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900800 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900062 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900c00 },
-	{ _MMIO(0x9888), 0x43900003 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-};
-
-static int
-get_tdl_1_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_1;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_1);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_tdl_2[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2744), 0x00800000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0x00800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x00800000 },
-};
-
-static const struct i915_oa_reg flex_eu_config_tdl_2[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00010003 },
-	{ _MMIO(0xe658), 0x00012011 },
-	{ _MMIO(0xe758), 0x00015014 },
-	{ _MMIO(0xe45c), 0x00051050 },
-	{ _MMIO(0xe55c), 0x00053052 },
-	{ _MMIO(0xe65c), 0x00055054 },
-};
-
-static const struct i915_oa_reg mux_config_tdl_2[] = {
-	{ _MMIO(0x9888), 0x12124d60 },
-	{ _MMIO(0x9888), 0x12322e60 },
-	{ _MMIO(0x9888), 0x12524d60 },
-	{ _MMIO(0x9888), 0x022f3000 },
-	{ _MMIO(0x9888), 0x0a4c0014 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x0c0fe000 },
-	{ _MMIO(0x9888), 0x0e0f0097 },
-	{ _MMIO(0x9888), 0x082c8000 },
-	{ _MMIO(0x9888), 0x0a2c8000 },
-	{ _MMIO(0x9888), 0x002d8000 },
-	{ _MMIO(0x9888), 0x062d4000 },
-	{ _MMIO(0x9888), 0x0410c000 },
-	{ _MMIO(0x9888), 0x0411c000 },
-	{ _MMIO(0x9888), 0x04121fb7 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x04135000 },
-	{ _MMIO(0x9888), 0x00308000 },
-	{ _MMIO(0x9888), 0x06304000 },
-	{ _MMIO(0x9888), 0x00318000 },
-	{ _MMIO(0x9888), 0x06314000 },
-	{ _MMIO(0x9888), 0x00321b80 },
-	{ _MMIO(0x9888), 0x0632003f },
-	{ _MMIO(0x9888), 0x00334000 },
-	{ _MMIO(0x9888), 0x06331000 },
-	{ _MMIO(0x9888), 0x0250c000 },
-	{ _MMIO(0x9888), 0x0251c000 },
-	{ _MMIO(0x9888), 0x02521fb7 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x02535000 },
-	{ _MMIO(0x9888), 0x1190fc00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x51900000 },
-	{ _MMIO(0x9888), 0x41900800 },
-	{ _MMIO(0x9888), 0x43900063 },
-	{ _MMIO(0x9888), 0x53900000 },
-	{ _MMIO(0x9888), 0x45900040 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_tdl_2_mux_config(struct drm_i915_private *dev_priv,
-		     const struct i915_oa_reg **regs,
-		     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_tdl_2;
-	lens[n] = ARRAY_SIZE(mux_config_tdl_2);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_compute_extra[] = {
-};
-
-static const struct i915_oa_reg flex_eu_config_compute_extra[] = {
-};
-
-static const struct i915_oa_reg mux_config_compute_extra[] = {
-	{ _MMIO(0x9888), 0x121203e0 },
-	{ _MMIO(0x9888), 0x123203e0 },
-	{ _MMIO(0x9888), 0x125203e0 },
-	{ _MMIO(0x9888), 0x129203e0 },
-	{ _MMIO(0x9888), 0x12b203e0 },
-	{ _MMIO(0x9888), 0x12d203e0 },
-	{ _MMIO(0x9888), 0x131203e0 },
-	{ _MMIO(0x9888), 0x133203e0 },
-	{ _MMIO(0x9888), 0x135203e0 },
-	{ _MMIO(0x9888), 0x1a4ef000 },
-	{ _MMIO(0x9888), 0x1c4e0003 },
-	{ _MMIO(0x9888), 0x024ec000 },
-	{ _MMIO(0x9888), 0x044ec000 },
-	{ _MMIO(0x9888), 0x064ec000 },
-	{ _MMIO(0x9888), 0x022f4000 },
-	{ _MMIO(0x9888), 0x0c4c02a0 },
-	{ _MMIO(0x9888), 0x084ca000 },
-	{ _MMIO(0x9888), 0x0a4c0042 },
-	{ _MMIO(0x9888), 0x0c0d8000 },
-	{ _MMIO(0x9888), 0x0e0da000 },
-	{ _MMIO(0x9888), 0x000d8000 },
-	{ _MMIO(0x9888), 0x020da000 },
-	{ _MMIO(0x9888), 0x040da000 },
-	{ _MMIO(0x9888), 0x060d2000 },
-	{ _MMIO(0x9888), 0x100f0150 },
-	{ _MMIO(0x9888), 0x0c0f5000 },
-	{ _MMIO(0x9888), 0x0e0f006d },
-	{ _MMIO(0x9888), 0x182c00a8 },
-	{ _MMIO(0x9888), 0x022c8000 },
-	{ _MMIO(0x9888), 0x042c8000 },
-	{ _MMIO(0x9888), 0x062c8000 },
-	{ _MMIO(0x9888), 0x0c2c8000 },
-	{ _MMIO(0x9888), 0x042d8000 },
-	{ _MMIO(0x9888), 0x06104000 },
-	{ _MMIO(0x9888), 0x06114000 },
-	{ _MMIO(0x9888), 0x06120033 },
-	{ _MMIO(0x9888), 0x00120000 },
-	{ _MMIO(0x9888), 0x06131000 },
-	{ _MMIO(0x9888), 0x04308000 },
-	{ _MMIO(0x9888), 0x04318000 },
-	{ _MMIO(0x9888), 0x04321980 },
-	{ _MMIO(0x9888), 0x00320000 },
-	{ _MMIO(0x9888), 0x04334000 },
-	{ _MMIO(0x9888), 0x04504000 },
-	{ _MMIO(0x9888), 0x04514000 },
-	{ _MMIO(0x9888), 0x04520033 },
-	{ _MMIO(0x9888), 0x00520000 },
-	{ _MMIO(0x9888), 0x04531000 },
-	{ _MMIO(0x9888), 0x1acef000 },
-	{ _MMIO(0x9888), 0x1cce0003 },
-	{ _MMIO(0x9888), 0x00af8000 },
-	{ _MMIO(0x9888), 0x0ccc02a0 },
-	{ _MMIO(0x9888), 0x0acc0001 },
-	{ _MMIO(0x9888), 0x0c8d8000 },
-	{ _MMIO(0x9888), 0x0e8da000 },
-	{ _MMIO(0x9888), 0x008d8000 },
-	{ _MMIO(0x9888), 0x028da000 },
-	{ _MMIO(0x9888), 0x108f0150 },
-	{ _MMIO(0x9888), 0x0c8fb000 },
-	{ _MMIO(0x9888), 0x0e8f0001 },
-	{ _MMIO(0x9888), 0x18ac00a8 },
-	{ _MMIO(0x9888), 0x06ac8000 },
-	{ _MMIO(0x9888), 0x02ad4000 },
-	{ _MMIO(0x9888), 0x02908000 },
-	{ _MMIO(0x9888), 0x02918000 },
-	{ _MMIO(0x9888), 0x02921980 },
-	{ _MMIO(0x9888), 0x00920000 },
-	{ _MMIO(0x9888), 0x02934000 },
-	{ _MMIO(0x9888), 0x02b04000 },
-	{ _MMIO(0x9888), 0x02b14000 },
-	{ _MMIO(0x9888), 0x02b20033 },
-	{ _MMIO(0x9888), 0x00b20000 },
-	{ _MMIO(0x9888), 0x02b31000 },
-	{ _MMIO(0x9888), 0x00d08000 },
-	{ _MMIO(0x9888), 0x00d18000 },
-	{ _MMIO(0x9888), 0x00d21980 },
-	{ _MMIO(0x9888), 0x00d34000 },
-	{ _MMIO(0x9888), 0x072f8000 },
-	{ _MMIO(0x9888), 0x0d4c0100 },
-	{ _MMIO(0x9888), 0x0d0d8000 },
-	{ _MMIO(0x9888), 0x0f0da000 },
-	{ _MMIO(0x9888), 0x110f01b0 },
-	{ _MMIO(0x9888), 0x192c0080 },
-	{ _MMIO(0x9888), 0x0f2d4000 },
-	{ _MMIO(0x9888), 0x0f108000 },
-	{ _MMIO(0x9888), 0x0f118000 },
-	{ _MMIO(0x9888), 0x0f121980 },
-	{ _MMIO(0x9888), 0x01120000 },
-	{ _MMIO(0x9888), 0x0f134000 },
-	{ _MMIO(0x9888), 0x0f304000 },
-	{ _MMIO(0x9888), 0x0f314000 },
-	{ _MMIO(0x9888), 0x0f320033 },
-	{ _MMIO(0x9888), 0x01320000 },
-	{ _MMIO(0x9888), 0x0f331000 },
-	{ _MMIO(0x9888), 0x0d508000 },
-	{ _MMIO(0x9888), 0x0d518000 },
-	{ _MMIO(0x9888), 0x0d521980 },
-	{ _MMIO(0x9888), 0x01520000 },
-	{ _MMIO(0x9888), 0x0d534000 },
-	{ _MMIO(0x9888), 0x1190ff80 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900c00 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-	{ _MMIO(0x9888), 0x4b900002 },
-	{ _MMIO(0x9888), 0x59900000 },
-	{ _MMIO(0x9888), 0x51901100 },
-	{ _MMIO(0x9888), 0x41901000 },
-	{ _MMIO(0x9888), 0x43901423 },
-	{ _MMIO(0x9888), 0x53903331 },
-	{ _MMIO(0x9888), 0x45900044 },
-};
-
-static int
-get_compute_extra_mux_config(struct drm_i915_private *dev_priv,
-			     const struct i915_oa_reg **regs,
-			     int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_compute_extra;
-	lens[n] = ARRAY_SIZE(mux_config_compute_extra);
-	n++;
-
-	return n;
-}
-
-static const struct i915_oa_reg b_counter_config_vme_pipe[] = {
-	{ _MMIO(0x2740), 0x00000000 },
-	{ _MMIO(0x2710), 0x00000000 },
-	{ _MMIO(0x2714), 0xf0800000 },
-	{ _MMIO(0x2720), 0x00000000 },
-	{ _MMIO(0x2724), 0x30800000 },
-	{ _MMIO(0x2770), 0x00100030 },
-	{ _MMIO(0x2774), 0x0000fff9 },
-	{ _MMIO(0x2778), 0x00000002 },
-	{ _MMIO(0x277c), 0x0000fffc },
-	{ _MMIO(0x2780), 0x00000002 },
-	{ _MMIO(0x2784), 0x0000fff3 },
-	{ _MMIO(0x2788), 0x00100180 },
-	{ _MMIO(0x278c), 0x0000ffcf },
-	{ _MMIO(0x2790), 0x00000002 },
-	{ _MMIO(0x2794), 0x0000ffcf },
-	{ _MMIO(0x2798), 0x00000002 },
-	{ _MMIO(0x279c), 0x0000ff3f },
-};
-
-static const struct i915_oa_reg flex_eu_config_vme_pipe[] = {
-	{ _MMIO(0xe458), 0x00005004 },
-	{ _MMIO(0xe558), 0x00008003 },
-};
-
-static const struct i915_oa_reg mux_config_vme_pipe[] = {
-	{ _MMIO(0x9888), 0x141a5800 },
-	{ _MMIO(0x9888), 0x161a00c0 },
-	{ _MMIO(0x9888), 0x12180240 },
-	{ _MMIO(0x9888), 0x14180002 },
-	{ _MMIO(0x9888), 0x149a5800 },
-	{ _MMIO(0x9888), 0x169a00c0 },
-	{ _MMIO(0x9888), 0x12980240 },
-	{ _MMIO(0x9888), 0x14980002 },
-	{ _MMIO(0x9888), 0x1a4e3fc0 },
-	{ _MMIO(0x9888), 0x002f1000 },
-	{ _MMIO(0x9888), 0x022f8000 },
-	{ _MMIO(0x9888), 0x042f3000 },
-	{ _MMIO(0x9888), 0x004c4000 },
-	{ _MMIO(0x9888), 0x0a4c9500 },
-	{ _MMIO(0x9888), 0x0c4c002a },
-	{ _MMIO(0x9888), 0x000d2000 },
-	{ _MMIO(0x9888), 0x060d8000 },
-	{ _MMIO(0x9888), 0x080da000 },
-	{ _MMIO(0x9888), 0x0a0da000 },
-	{ _MMIO(0x9888), 0x0c0da000 },
-	{ _MMIO(0x9888), 0x0c0f0400 },
-	{ _MMIO(0x9888), 0x0e0f5500 },
-	{ _MMIO(0x9888), 0x100f0015 },
-	{ _MMIO(0x9888), 0x002c8000 },
-	{ _MMIO(0x9888), 0x0e2c8000 },
-	{ _MMIO(0x9888), 0x162caa00 },
-	{ _MMIO(0x9888), 0x182c000a },
-	{ _MMIO(0x9888), 0x04193000 },
-	{ _MMIO(0x9888), 0x081a28c1 },
-	{ _MMIO(0x9888), 0x001a0000 },
-	{ _MMIO(0x9888), 0x00133000 },
-	{ _MMIO(0x9888), 0x0613c000 },
-	{ _MMIO(0x9888), 0x0813f000 },
-	{ _MMIO(0x9888), 0x00172000 },
-	{ _MMIO(0x9888), 0x06178000 },
-	{ _MMIO(0x9888), 0x0817a000 },
-	{ _MMIO(0x9888), 0x00180037 },
-	{ _MMIO(0x9888), 0x06180940 },
-	{ _MMIO(0x9888), 0x08180000 },
-	{ _MMIO(0x9888), 0x02180000 },
-	{ _MMIO(0x9888), 0x04183000 },
-	{ _MMIO(0x9888), 0x04afc000 },
-	{ _MMIO(0x9888), 0x06af3000 },
-	{ _MMIO(0x9888), 0x0acc4000 },
-	{ _MMIO(0x9888), 0x0ccc0015 },
-	{ _MMIO(0x9888), 0x0a8da000 },
-	{ _MMIO(0x9888), 0x0c8da000 },
-	{ _MMIO(0x9888), 0x0e8f4000 },
-	{ _MMIO(0x9888), 0x108f0015 },
-	{ _MMIO(0x9888), 0x16aca000 },
-	{ _MMIO(0x9888), 0x18ac000a },
-	{ _MMIO(0x9888), 0x06993000 },
-	{ _MMIO(0x9888), 0x0c9a28c1 },
-	{ _MMIO(0x9888), 0x009a0000 },
-	{ _MMIO(0x9888), 0x0a93f000 },
-	{ _MMIO(0x9888), 0x0c93f000 },
-	{ _MMIO(0x9888), 0x0a97a000 },
-	{ _MMIO(0x9888), 0x0c97a000 },
-	{ _MMIO(0x9888), 0x0a980977 },
-	{ _MMIO(0x9888), 0x08980000 },
-	{ _MMIO(0x9888), 0x04980000 },
-	{ _MMIO(0x9888), 0x06983000 },
-	{ _MMIO(0x9888), 0x119000ff },
-	{ _MMIO(0x9888), 0x51900010 },
-	{ _MMIO(0x9888), 0x41900060 },
-	{ _MMIO(0x9888), 0x55900111 },
-	{ _MMIO(0x9888), 0x45900c00 },
-	{ _MMIO(0x9888), 0x47900821 },
-	{ _MMIO(0x9888), 0x57900000 },
-	{ _MMIO(0x9888), 0x49900002 },
-	{ _MMIO(0x9888), 0x37900000 },
-	{ _MMIO(0x9888), 0x33900000 },
-};
-
-static int
-get_vme_pipe_mux_config(struct drm_i915_private *dev_priv,
-			const struct i915_oa_reg **regs,
-			int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_vme_pipe;
-	lens[n] = ARRAY_SIZE(mux_config_vme_pipe);
-	n++;
-
-	return n;
-}
-
 static const struct i915_oa_reg b_counter_config_test_oa[] = {
 	{ _MMIO(0x2740), 0x00000000 },
 	{ _MMIO(0x2744), 0x00800000 },
@@ -1984,6 +60,7 @@ static const struct i915_oa_reg flex_eu_config_test_oa[] = {
 };
 
 static const struct i915_oa_reg mux_config_test_oa[] = {
+	{ _MMIO(0x9840), 0x00000080 },
 	{ _MMIO(0x9888), 0x11810000 },
 	{ _MMIO(0x9888), 0x07810013 },
 	{ _MMIO(0x9888), 0x1f810000 },
@@ -1998,1096 +75,35 @@ static const struct i915_oa_reg mux_config_test_oa[] = {
 	{ _MMIO(0x9888), 0x33900000 },
 };
 
-static int
-get_test_oa_mux_config(struct drm_i915_private *dev_priv,
-		       const struct i915_oa_reg **regs,
-		       int *lens)
-{
-	int n = 0;
-
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs) < 1);
-	BUILD_BUG_ON(ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens) < 1);
-
-	regs[n] = mux_config_test_oa;
-	lens[n] = ARRAY_SIZE(mux_config_test_oa);
-	n++;
-
-	return n;
-}
-
-int i915_oa_select_metric_set_sklgt4(struct drm_i915_private *dev_priv)
-{
-	dev_priv->perf.oa.n_mux_configs = 0;
-	dev_priv->perf.oa.b_counter_regs = NULL;
-	dev_priv->perf.oa.b_counter_regs_len = 0;
-	dev_priv->perf.oa.flex_regs = NULL;
-	dev_priv->perf.oa.flex_regs_len = 0;
-
-	switch (dev_priv->perf.oa.metrics_set) {
-	case METRIC_SET_ID_RENDER_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_basic_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_basic);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_BASIC:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_basic_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_BASIC\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_basic;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_basic);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_basic;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_basic);
-
-		return 0;
-	case METRIC_SET_ID_RENDER_PIPE_PROFILE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_render_pipe_profile_mux_config(dev_priv,
-							   dev_priv->perf.oa.mux_regs,
-							   dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RENDER_PIPE_PROFILE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_render_pipe_profile;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_render_pipe_profile);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_render_pipe_profile;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_render_pipe_profile);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_READS:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_reads_mux_config(dev_priv,
-						    dev_priv->perf.oa.mux_regs,
-						    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_READS\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_reads;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_reads);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_reads;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_reads);
-
-		return 0;
-	case METRIC_SET_ID_MEMORY_WRITES:
-		dev_priv->perf.oa.n_mux_configs =
-			get_memory_writes_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"MEMORY_WRITES\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_memory_writes;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_memory_writes);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_memory_writes;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_memory_writes);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTENDED:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extended_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTENDED\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extended;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extended);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extended;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extended);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_L3_CACHE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_l3_cache_mux_config(dev_priv,
-							dev_priv->perf.oa.mux_regs,
-							dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_L3_CACHE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_l3_cache;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_l3_cache);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_l3_cache;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_l3_cache);
-
-		return 0;
-	case METRIC_SET_ID_HDC_AND_SF:
-		dev_priv->perf.oa.n_mux_configs =
-			get_hdc_and_sf_mux_config(dev_priv,
-						  dev_priv->perf.oa.mux_regs,
-						  dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"HDC_AND_SF\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_hdc_and_sf;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_hdc_and_sf);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_hdc_and_sf;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_hdc_and_sf);
-
-		return 0;
-	case METRIC_SET_ID_L3_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_1_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_1);
-
-		return 0;
-	case METRIC_SET_ID_L3_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_2_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_2);
-
-		return 0;
-	case METRIC_SET_ID_L3_3:
-		dev_priv->perf.oa.n_mux_configs =
-			get_l3_3_mux_config(dev_priv,
-					    dev_priv->perf.oa.mux_regs,
-					    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"L3_3\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_l3_3;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_l3_3);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_l3_3;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_l3_3);
-
-		return 0;
-	case METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND:
-		dev_priv->perf.oa.n_mux_configs =
-			get_rasterizer_and_pixel_backend_mux_config(dev_priv,
-								    dev_priv->perf.oa.mux_regs,
-								    dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"RASTERIZER_AND_PIXEL_BACKEND\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_rasterizer_and_pixel_backend);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_rasterizer_and_pixel_backend;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_rasterizer_and_pixel_backend);
-
-		return 0;
-	case METRIC_SET_ID_SAMPLER:
-		dev_priv->perf.oa.n_mux_configs =
-			get_sampler_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"SAMPLER\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_sampler;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_sampler);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_sampler;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_sampler);
-
-		return 0;
-	case METRIC_SET_ID_TDL_1:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_1_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_1\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_1;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_1);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_1;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_1);
-
-		return 0;
-	case METRIC_SET_ID_TDL_2:
-		dev_priv->perf.oa.n_mux_configs =
-			get_tdl_2_mux_config(dev_priv,
-					     dev_priv->perf.oa.mux_regs,
-					     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TDL_2\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_tdl_2;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_tdl_2);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_tdl_2;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_tdl_2);
-
-		return 0;
-	case METRIC_SET_ID_COMPUTE_EXTRA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_compute_extra_mux_config(dev_priv,
-						     dev_priv->perf.oa.mux_regs,
-						     dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"COMPUTE_EXTRA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_compute_extra;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_compute_extra);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_compute_extra;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_compute_extra);
-
-		return 0;
-	case METRIC_SET_ID_VME_PIPE:
-		dev_priv->perf.oa.n_mux_configs =
-			get_vme_pipe_mux_config(dev_priv,
-						dev_priv->perf.oa.mux_regs,
-						dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"VME_PIPE\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_vme_pipe;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_vme_pipe);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_vme_pipe;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_vme_pipe);
-
-		return 0;
-	case METRIC_SET_ID_TEST_OA:
-		dev_priv->perf.oa.n_mux_configs =
-			get_test_oa_mux_config(dev_priv,
-					       dev_priv->perf.oa.mux_regs,
-					       dev_priv->perf.oa.mux_regs_lens);
-		if (dev_priv->perf.oa.n_mux_configs == 0) {
-			DRM_DEBUG_DRIVER("No suitable MUX config for \"TEST_OA\" metric set\n");
-
-			/* EINVAL because *_register_sysfs already checked this
-			 * and so it wouldn't have been advertised to userspace and
-			 * so shouldn't have been requested
-			 */
-			return -EINVAL;
-		}
-
-		dev_priv->perf.oa.b_counter_regs =
-			b_counter_config_test_oa;
-		dev_priv->perf.oa.b_counter_regs_len =
-			ARRAY_SIZE(b_counter_config_test_oa);
-
-		dev_priv->perf.oa.flex_regs =
-			flex_eu_config_test_oa;
-		dev_priv->perf.oa.flex_regs_len =
-			ARRAY_SIZE(flex_eu_config_test_oa);
-
-		return 0;
-	default:
-		return -ENODEV;
-	}
-}
-
-static ssize_t
-show_render_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_BASIC);
-}
-
-static struct device_attribute dev_attr_render_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_basic[] = {
-	&dev_attr_render_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_basic = {
-	.name = "bad77c24-cc64-480d-99bf-e7b740713800",
-	.attrs =  attrs_render_basic,
-};
-
-static ssize_t
-show_compute_basic_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_BASIC);
-}
-
-static struct device_attribute dev_attr_compute_basic_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_basic_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_basic[] = {
-	&dev_attr_compute_basic_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_basic = {
-	.name = "7277228f-e7f3-4743-945a-6a2049d11377",
-	.attrs =  attrs_compute_basic,
-};
-
-static ssize_t
-show_render_pipe_profile_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RENDER_PIPE_PROFILE);
-}
-
-static struct device_attribute dev_attr_render_pipe_profile_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_render_pipe_profile_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_render_pipe_profile[] = {
-	&dev_attr_render_pipe_profile_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_render_pipe_profile = {
-	.name = "463c668c-3f60-49b6-8f85-d995b635b3b2",
-	.attrs =  attrs_render_pipe_profile,
-};
-
-static ssize_t
-show_memory_reads_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_READS);
-}
-
-static struct device_attribute dev_attr_memory_reads_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_reads_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_reads[] = {
-	&dev_attr_memory_reads_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_reads = {
-	.name = "3ae6e74c-72c3-4040-9bd0-7961430b8cc8",
-	.attrs =  attrs_memory_reads,
-};
-
-static ssize_t
-show_memory_writes_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_MEMORY_WRITES);
-}
-
-static struct device_attribute dev_attr_memory_writes_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_memory_writes_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_memory_writes[] = {
-	&dev_attr_memory_writes_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_memory_writes = {
-	.name = "055f256d-4052-467c-8dec-6064a4806433",
-	.attrs =  attrs_memory_writes,
-};
-
-static ssize_t
-show_compute_extended_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTENDED);
-}
-
-static struct device_attribute dev_attr_compute_extended_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extended_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extended[] = {
-	&dev_attr_compute_extended_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extended = {
-	.name = "753972d4-87cd-4460-824d-754463ac5054",
-	.attrs =  attrs_compute_extended,
-};
-
-static ssize_t
-show_compute_l3_cache_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_L3_CACHE);
-}
-
-static struct device_attribute dev_attr_compute_l3_cache_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_l3_cache_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_l3_cache[] = {
-	&dev_attr_compute_l3_cache_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_l3_cache = {
-	.name = "4e4392e9-8f73-457b-ab44-b49f7a0c733b",
-	.attrs =  attrs_compute_l3_cache,
-};
-
-static ssize_t
-show_hdc_and_sf_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_HDC_AND_SF);
-}
-
-static struct device_attribute dev_attr_hdc_and_sf_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_hdc_and_sf_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_hdc_and_sf[] = {
-	&dev_attr_hdc_and_sf_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_hdc_and_sf = {
-	.name = "730d95dd-7da8-4e1c-ab8d-c0eb1e4c1805",
-	.attrs =  attrs_hdc_and_sf,
-};
-
-static ssize_t
-show_l3_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_1);
-}
-
-static struct device_attribute dev_attr_l3_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_1[] = {
-	&dev_attr_l3_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_1 = {
-	.name = "d9e86d70-462b-462a-851e-fd63e8c13d63",
-	.attrs =  attrs_l3_1,
-};
-
-static ssize_t
-show_l3_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_2);
-}
-
-static struct device_attribute dev_attr_l3_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_2[] = {
-	&dev_attr_l3_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_2 = {
-	.name = "52200424-6ee9-48b3-b7fa-0afcf1975e4d",
-	.attrs =  attrs_l3_2,
-};
-
-static ssize_t
-show_l3_3_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_L3_3);
-}
-
-static struct device_attribute dev_attr_l3_3_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_l3_3_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_l3_3[] = {
-	&dev_attr_l3_3_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_l3_3 = {
-	.name = "1988315f-0a26-44df-acb0-df7ec86b1456",
-	.attrs =  attrs_l3_3,
-};
-
-static ssize_t
-show_rasterizer_and_pixel_backend_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_RASTERIZER_AND_PIXEL_BACKEND);
-}
-
-static struct device_attribute dev_attr_rasterizer_and_pixel_backend_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_rasterizer_and_pixel_backend_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_rasterizer_and_pixel_backend[] = {
-	&dev_attr_rasterizer_and_pixel_backend_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_rasterizer_and_pixel_backend = {
-	.name = "f1f17ca7-286e-4ae5-9d15-9fccad6c665d",
-	.attrs =  attrs_rasterizer_and_pixel_backend,
-};
-
-static ssize_t
-show_sampler_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_SAMPLER);
-}
-
-static struct device_attribute dev_attr_sampler_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_sampler_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_sampler[] = {
-	&dev_attr_sampler_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_sampler = {
-	.name = "00a9e0fb-3d2e-4405-852c-dce6334ffb3b",
-	.attrs =  attrs_sampler,
-};
-
-static ssize_t
-show_tdl_1_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_1);
-}
-
-static struct device_attribute dev_attr_tdl_1_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_1_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_1[] = {
-	&dev_attr_tdl_1_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_1 = {
-	.name = "13dcc50a-7ec0-409b-99d6-a3f932cedcb3",
-	.attrs =  attrs_tdl_1,
-};
-
-static ssize_t
-show_tdl_2_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TDL_2);
-}
-
-static struct device_attribute dev_attr_tdl_2_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_tdl_2_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_tdl_2[] = {
-	&dev_attr_tdl_2_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_tdl_2 = {
-	.name = "97875e21-6624-4aee-9191-682feb3eae21",
-	.attrs =  attrs_tdl_2,
-};
-
-static ssize_t
-show_compute_extra_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_COMPUTE_EXTRA);
-}
-
-static struct device_attribute dev_attr_compute_extra_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_compute_extra_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_compute_extra[] = {
-	&dev_attr_compute_extra_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_compute_extra = {
-	.name = "a5aa857d-e8f0-4dfa-8981-ce340fa748fd",
-	.attrs =  attrs_compute_extra,
-};
-
-static ssize_t
-show_vme_pipe_id(struct device *kdev, struct device_attribute *attr, char *buf)
-{
-	return sprintf(buf, "%d\n", METRIC_SET_ID_VME_PIPE);
-}
-
-static struct device_attribute dev_attr_vme_pipe_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_vme_pipe_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_vme_pipe[] = {
-	&dev_attr_vme_pipe_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_vme_pipe = {
-	.name = "0e8d8b86-4ee7-4cdd-aaaa-58adc92cb29e",
-	.attrs =  attrs_vme_pipe,
-};
-
 static ssize_t
 show_test_oa_id(struct device *kdev, struct device_attribute *attr, char *buf)
 {
-	return sprintf(buf, "%d\n", METRIC_SET_ID_TEST_OA);
-}
-
-static struct device_attribute dev_attr_test_oa_id = {
-	.attr = { .name = "id", .mode = 0444 },
-	.show = show_test_oa_id,
-	.store = NULL,
-};
-
-static struct attribute *attrs_test_oa[] = {
-	&dev_attr_test_oa_id.attr,
-	NULL,
-};
-
-static struct attribute_group group_test_oa = {
-	.name = "882fa433-1f4a-4a67-a962-c741888fe5f5",
-	.attrs =  attrs_test_oa,
-};
-
-int
-i915_perf_register_sysfs_sklgt4(struct drm_i915_private *dev_priv)
-{
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
-	int ret = 0;
-
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-		if (ret)
-			goto error_render_basic;
-	}
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-		if (ret)
-			goto error_compute_basic;
-	}
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-		if (ret)
-			goto error_render_pipe_profile;
-	}
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-		if (ret)
-			goto error_memory_reads;
-	}
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-		if (ret)
-			goto error_memory_writes;
-	}
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-		if (ret)
-			goto error_compute_extended;
-	}
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-		if (ret)
-			goto error_compute_l3_cache;
-	}
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-		if (ret)
-			goto error_hdc_and_sf;
-	}
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-		if (ret)
-			goto error_l3_1;
-	}
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-		if (ret)
-			goto error_l3_2;
-	}
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-		if (ret)
-			goto error_l3_3;
-	}
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-		if (ret)
-			goto error_rasterizer_and_pixel_backend;
-	}
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_sampler);
-		if (ret)
-			goto error_sampler;
-	}
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-		if (ret)
-			goto error_tdl_1;
-	}
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-		if (ret)
-			goto error_tdl_2;
-	}
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-		if (ret)
-			goto error_compute_extra;
-	}
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-		if (ret)
-			goto error_vme_pipe;
-	}
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens)) {
-		ret = sysfs_create_group(dev_priv->perf.metrics_kobj, &group_test_oa);
-		if (ret)
-			goto error_test_oa;
-	}
-
-	return 0;
-
-error_test_oa:
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-error_vme_pipe:
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-error_compute_extra:
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-error_tdl_2:
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-error_tdl_1:
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-error_sampler:
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-error_rasterizer_and_pixel_backend:
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-error_l3_3:
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-error_l3_2:
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-error_l3_1:
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-error_hdc_and_sf:
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-error_compute_l3_cache:
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-error_compute_extended:
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-error_memory_writes:
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-error_memory_reads:
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-error_render_pipe_profile:
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-error_compute_basic:
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-error_render_basic:
-	return ret;
+	return sprintf(buf, "1\n");
 }
 
 void
-i915_perf_unregister_sysfs_sklgt4(struct drm_i915_private *dev_priv)
+i915_perf_load_test_config_sklgt4(struct drm_i915_private *dev_priv)
 {
-	const struct i915_oa_reg *mux_regs[ARRAY_SIZE(dev_priv->perf.oa.mux_regs)];
-	int mux_lens[ARRAY_SIZE(dev_priv->perf.oa.mux_regs_lens)];
+	strncpy(dev_priv->perf.oa.test_config.uuid,
+		"882fa433-1f4a-4a67-a962-c741888fe5f5",
+		UUID_STRING_LEN);
+	dev_priv->perf.oa.test_config.id = 1;
 
-	if (get_render_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_basic);
-	if (get_compute_basic_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_basic);
-	if (get_render_pipe_profile_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_render_pipe_profile);
-	if (get_memory_reads_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_reads);
-	if (get_memory_writes_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_memory_writes);
-	if (get_compute_extended_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extended);
-	if (get_compute_l3_cache_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_l3_cache);
-	if (get_hdc_and_sf_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_hdc_and_sf);
-	if (get_l3_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_1);
-	if (get_l3_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_2);
-	if (get_l3_3_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_l3_3);
-	if (get_rasterizer_and_pixel_backend_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_rasterizer_and_pixel_backend);
-	if (get_sampler_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_sampler);
-	if (get_tdl_1_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_1);
-	if (get_tdl_2_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_tdl_2);
-	if (get_compute_extra_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_compute_extra);
-	if (get_vme_pipe_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_vme_pipe);
-	if (get_test_oa_mux_config(dev_priv, mux_regs, mux_lens))
-		sysfs_remove_group(dev_priv->perf.metrics_kobj, &group_test_oa);
+	dev_priv->perf.oa.test_config.mux_regs = mux_config_test_oa;
+	dev_priv->perf.oa.test_config.mux_regs_len = ARRAY_SIZE(mux_config_test_oa);
+
+	dev_priv->perf.oa.test_config.b_counter_regs = b_counter_config_test_oa;
+	dev_priv->perf.oa.test_config.b_counter_regs_len = ARRAY_SIZE(b_counter_config_test_oa);
+
+	dev_priv->perf.oa.test_config.flex_regs = flex_eu_config_test_oa;
+	dev_priv->perf.oa.test_config.flex_regs_len = ARRAY_SIZE(flex_eu_config_test_oa);
+
+	dev_priv->perf.oa.test_config.sysfs_metric.name = "882fa433-1f4a-4a67-a962-c741888fe5f5";
+	dev_priv->perf.oa.test_config.sysfs_metric.attrs = dev_priv->perf.oa.test_config.attrs;
+
+	dev_priv->perf.oa.test_config.attrs[0] = &dev_priv->perf.oa.test_config.sysfs_metric_id.attr;
+
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.name = "id";
+	dev_priv->perf.oa.test_config.sysfs_metric_id.attr.mode = 0444;
+	dev_priv->perf.oa.test_config.sysfs_metric_id.show = show_test_oa_id;
 }
diff --git a/drivers/gpu/drm/i915/i915_oa_sklgt4.h b/drivers/gpu/drm/i915/i915_oa_sklgt4.h
index 1b718f1..944fd52 100644
--- a/drivers/gpu/drm/i915/i915_oa_sklgt4.h
+++ b/drivers/gpu/drm/i915/i915_oa_sklgt4.h
@@ -29,12 +29,6 @@
 #ifndef __I915_OA_SKLGT4_H__
 #define __I915_OA_SKLGT4_H__
 
-extern int i915_oa_n_builtin_metric_sets_sklgt4;
-
-extern int i915_oa_select_metric_set_sklgt4(struct drm_i915_private *dev_priv);
-
-extern int i915_perf_register_sysfs_sklgt4(struct drm_i915_private *dev_priv);
-
-extern void i915_perf_unregister_sysfs_sklgt4(struct drm_i915_private *dev_priv);
+extern void i915_perf_load_test_config_sklgt4(struct drm_i915_private *dev_priv);
 
 #endif
diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
index b6a7e36..8ab003d 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -46,7 +46,7 @@ struct i915_params i915 __read_mostly = {
 	.prefault_disable = 0,
 	.load_detect_test = 0,
 	.force_reset_modeset_test = 0,
-	.reset = true,
+	.reset = 2,
 	.error_capture = true,
 	.invert_brightness = 0,
 	.disable_display = 0,
@@ -115,8 +115,12 @@ MODULE_PARM_DESC(vbt_sdvo_panel_type,
 	"Override/Ignore selection of SDVO panel mode in the VBT "
 	"(-2=ignore, -1=auto [default], index in VBT BIOS table)");
 
-module_param_named_unsafe(reset, i915.reset, bool, 0600);
-MODULE_PARM_DESC(reset, "Attempt GPU resets (default: true)");
+module_param_named_unsafe(reset, i915.reset, int, 0600);
+MODULE_PARM_DESC(reset, "Attempt GPU resets (0=disabled, 1=full gpu reset, 2=engine reset [default])");
+
+module_param_named_unsafe(vbt_firmware, i915.vbt_firmware, charp, 0400);
+MODULE_PARM_DESC(vbt_firmware,
+		 "Load VBT from specified file under /lib/firmware");
 
 #if IS_ENABLED(CONFIG_DRM_I915_CAPTURE_ERROR)
 module_param_named(error_capture, i915.error_capture, bool, 0600);
diff --git a/drivers/gpu/drm/i915/i915_params.h b/drivers/gpu/drm/i915/i915_params.h
index 34148cc..ac84470 100644
--- a/drivers/gpu/drm/i915/i915_params.h
+++ b/drivers/gpu/drm/i915/i915_params.h
@@ -28,6 +28,7 @@
 #include <linux/cache.h> /* for __read_mostly */
 
 #define I915_PARAMS_FOR_EACH(func) \
+	func(char *, vbt_firmware); \
 	func(int, modeset); \
 	func(int, panel_ignore_lid); \
 	func(int, semaphores); \
@@ -51,6 +52,7 @@
 	func(int, use_mmio_flip); \
 	func(int, mmio_debug); \
 	func(int, edp_vswing); \
+	func(int, reset); \
 	func(unsigned int, inject_load_failure); \
 	/* leave bools at the end to not create holes */ \
 	func(bool, alpha_support); \
@@ -60,7 +62,6 @@
 	func(bool, prefault_disable); \
 	func(bool, load_detect_test); \
 	func(bool, force_reset_modeset_test); \
-	func(bool, reset); \
 	func(bool, error_capture); \
 	func(bool, disable_display); \
 	func(bool, verbose_state_checks); \
diff --git a/drivers/gpu/drm/i915/i915_pci.c b/drivers/gpu/drm/i915/i915_pci.c
index 506ec32..09d97e0 100644
--- a/drivers/gpu/drm/i915/i915_pci.c
+++ b/drivers/gpu/drm/i915/i915_pci.c
@@ -310,7 +310,8 @@ static const struct intel_device_info intel_haswell_info = {
 	BDW_COLORS, \
 	.has_logical_ring_contexts = 1, \
 	.has_full_48bit_ppgtt = 1, \
-	.has_64bit_reloc = 1
+	.has_64bit_reloc = 1, \
+	.has_reset_engine = 1
 
 #define BDW_PLATFORM \
 	BDW_FEATURES, \
@@ -342,6 +343,7 @@ static const struct intel_device_info intel_cherryview_info = {
 	.has_gmch_display = 1,
 	.has_aliasing_ppgtt = 1,
 	.has_full_ppgtt = 1,
+	.has_reset_engine = 1,
 	.display_mmio_offset = VLV_DISPLAY_BASE,
 	GEN_CHV_PIPEOFFSETS,
 	CURSOR_OFFSETS,
@@ -387,6 +389,7 @@ static const struct intel_device_info intel_skylake_gt3_info = {
 	.has_aliasing_ppgtt = 1, \
 	.has_full_ppgtt = 1, \
 	.has_full_48bit_ppgtt = 1, \
+	.has_reset_engine = 1, \
 	GEN_DEFAULT_PIPEOFFSETS, \
 	IVB_CURSOR_OFFSETS, \
 	BDW_COLORS
@@ -395,6 +398,7 @@ static const struct intel_device_info intel_broxton_info = {
 	GEN9_LP_FEATURES,
 	.platform = INTEL_BROXTON,
 	.ddb_size = 512,
+	.has_reset_engine = false,
 };
 
 static const struct intel_device_info intel_geminilake_info = {
@@ -446,6 +450,7 @@ static const struct intel_device_info intel_cannonlake_info = {
 	.gen = 10,
 	.ddb_size = 1024,
 	.has_csr = 1,
+	.color = { .degamma_lut_size = 0, .gamma_lut_size = 1024 }
 };
 
 /*
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index f33d902..94185d6 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -193,6 +193,7 @@
 
 #include <linux/anon_inodes.h>
 #include <linux/sizes.h>
+#include <linux/uuid.h>
 
 #include "i915_drv.h"
 #include "i915_oa_hsw.h"
@@ -357,6 +358,54 @@ struct perf_open_properties {
 	int oa_period_exponent;
 };
 
+static void free_oa_config(struct drm_i915_private *dev_priv,
+			   struct i915_oa_config *oa_config)
+{
+	if (!PTR_ERR(oa_config->flex_regs))
+		kfree(oa_config->flex_regs);
+	if (!PTR_ERR(oa_config->b_counter_regs))
+		kfree(oa_config->b_counter_regs);
+	if (!PTR_ERR(oa_config->mux_regs))
+		kfree(oa_config->mux_regs);
+	kfree(oa_config);
+}
+
+static void put_oa_config(struct drm_i915_private *dev_priv,
+			  struct i915_oa_config *oa_config)
+{
+	if (!atomic_dec_and_test(&oa_config->ref_count))
+		return;
+
+	free_oa_config(dev_priv, oa_config);
+}
+
+static int get_oa_config(struct drm_i915_private *dev_priv,
+			 int metrics_set,
+			 struct i915_oa_config **out_config)
+{
+	int ret;
+
+	if (metrics_set == 1) {
+		*out_config = &dev_priv->perf.oa.test_config;
+		atomic_inc(&dev_priv->perf.oa.test_config.ref_count);
+		return 0;
+	}
+
+	ret = mutex_lock_interruptible(&dev_priv->perf.metrics_lock);
+	if (ret)
+		return ret;
+
+	*out_config = idr_find(&dev_priv->perf.metrics_idr, metrics_set);
+	if (!*out_config)
+		ret = -EINVAL;
+	else
+		atomic_inc(&(*out_config)->ref_count);
+
+	mutex_unlock(&dev_priv->perf.metrics_lock);
+
+	return ret;
+}
+
 static u32 gen8_oa_hw_tail_read(struct drm_i915_private *dev_priv)
 {
 	return I915_READ(GEN8_OATAILPTR) & GEN8_OATAILPTR_MASK;
@@ -1246,10 +1295,12 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	BUG_ON(stream != dev_priv->perf.oa.exclusive_stream);
 
 	/*
-	 * Unset exclusive_stream first, it might be checked while
-	 * disabling the metric set on gen8+.
+	 * Unset exclusive_stream first, it will be checked while disabling
+	 * the metric set on gen8+.
 	 */
+	mutex_lock(&dev_priv->drm.struct_mutex);
 	dev_priv->perf.oa.exclusive_stream = NULL;
+	mutex_unlock(&dev_priv->drm.struct_mutex);
 
 	dev_priv->perf.oa.ops.disable_metric_set(dev_priv);
 
@@ -1261,6 +1312,8 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
 	if (stream->ctx)
 		oa_put_render_ctx_id(stream);
 
+	put_oa_config(dev_priv, stream->oa_config);
+
 	if (dev_priv->perf.oa.spurious_report_rs.missed) {
 		DRM_NOTE("%d spurious OA report notices suppressed due to ratelimiting\n",
 			 dev_priv->perf.oa.spurious_report_rs.missed);
@@ -1440,9 +1493,9 @@ static int alloc_oa_buffer(struct drm_i915_private *dev_priv)
 
 static void config_oa_regs(struct drm_i915_private *dev_priv,
 			   const struct i915_oa_reg *regs,
-			   int n_regs)
+			   u32 n_regs)
 {
-	int i;
+	u32 i;
 
 	for (i = 0; i < n_regs; i++) {
 		const struct i915_oa_reg *reg = regs + i;
@@ -1451,17 +1504,9 @@ static void config_oa_regs(struct drm_i915_private *dev_priv,
 	}
 }
 
-static int hsw_enable_metric_set(struct drm_i915_private *dev_priv)
+static int hsw_enable_metric_set(struct drm_i915_private *dev_priv,
+				 const struct i915_oa_config *oa_config)
 {
-	int ret = i915_oa_select_metric_set_hsw(dev_priv);
-	int i;
-
-	if (ret)
-		return ret;
-
-	I915_WRITE(GDT_CHICKEN_BITS, (I915_READ(GDT_CHICKEN_BITS) |
-				      GT_NOA_ENABLE));
-
 	/* PRM:
 	 *
 	 * OA unit is using “crclk” for its functionality. When trunk
@@ -1476,10 +1521,7 @@ static int hsw_enable_metric_set(struct drm_i915_private *dev_priv)
 	I915_WRITE(GEN6_UCGCTL1, (I915_READ(GEN6_UCGCTL1) |
 				  GEN6_CSUNIT_CLOCK_GATE_DISABLE));
 
-	for (i = 0; i < dev_priv->perf.oa.n_mux_configs; i++) {
-		config_oa_regs(dev_priv, dev_priv->perf.oa.mux_regs[i],
-			       dev_priv->perf.oa.mux_regs_lens[i]);
-	}
+	config_oa_regs(dev_priv, oa_config->mux_regs, oa_config->mux_regs_len);
 
 	/* It apparently takes a fairly long time for a new MUX
 	 * configuration to be be applied after these register writes.
@@ -1504,8 +1546,8 @@ static int hsw_enable_metric_set(struct drm_i915_private *dev_priv)
 	 */
 	usleep_range(15000, 20000);
 
-	config_oa_regs(dev_priv, dev_priv->perf.oa.b_counter_regs,
-		       dev_priv->perf.oa.b_counter_regs_len);
+	config_oa_regs(dev_priv, oa_config->b_counter_regs,
+		       oa_config->b_counter_regs_len);
 
 	return 0;
 }
@@ -1529,11 +1571,10 @@ static void hsw_disable_metric_set(struct drm_i915_private *dev_priv)
  * in the case that the OA unit has been disabled.
  */
 static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
-					   u32 *reg_state)
+					   u32 *reg_state,
+					   const struct i915_oa_config *oa_config)
 {
 	struct drm_i915_private *dev_priv = ctx->i915;
-	const struct i915_oa_reg *flex_regs = dev_priv->perf.oa.flex_regs;
-	int n_flex_regs = dev_priv->perf.oa.flex_regs_len;
 	u32 ctx_oactxctrl = dev_priv->perf.oa.ctx_oactxctrl_offset;
 	u32 ctx_flexeu0 = dev_priv->perf.oa.ctx_flexeu0_offset;
 	/* The MMIO offsets for Flex EU registers aren't contiguous */
@@ -1565,12 +1606,15 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
 		 * will be an explicit 'No Event' we can select, but not yet...
 		 */
 		u32 value = 0;
-		int j;
 
-		for (j = 0; j < n_flex_regs; j++) {
-			if (i915_mmio_reg_offset(flex_regs[j].addr) == mmio) {
-				value = flex_regs[j].value;
-				break;
+		if (oa_config) {
+			u32 j;
+
+			for (j = 0; j < oa_config->flex_regs_len; j++) {
+				if (i915_mmio_reg_offset(oa_config->flex_regs[j].addr) == mmio) {
+					value = oa_config->flex_regs[j].value;
+					break;
+				}
 			}
 		}
 
@@ -1583,11 +1627,10 @@ static void gen8_update_reg_state_unlocked(struct i915_gem_context *ctx,
  * Same as gen8_update_reg_state_unlocked only through the batchbuffer. This
  * is only used by the kernel context.
  */
-static int gen8_emit_oa_config(struct drm_i915_gem_request *req)
+static int gen8_emit_oa_config(struct drm_i915_gem_request *req,
+			       const struct i915_oa_config *oa_config)
 {
 	struct drm_i915_private *dev_priv = req->i915;
-	const struct i915_oa_reg *flex_regs = dev_priv->perf.oa.flex_regs;
-	int n_flex_regs = dev_priv->perf.oa.flex_regs_len;
 	/* The MMIO offsets for Flex EU registers aren't contiguous */
 	u32 flex_mmio[] = {
 		i915_mmio_reg_offset(EU_PERF_CNTL0),
@@ -1622,12 +1665,15 @@ static int gen8_emit_oa_config(struct drm_i915_gem_request *req)
 		 * yet...
 		 */
 		u32 value = 0;
-		int j;
 
-		for (j = 0; j < n_flex_regs; j++) {
-			if (i915_mmio_reg_offset(flex_regs[j].addr) == mmio) {
-				value = flex_regs[j].value;
-				break;
+		if (oa_config) {
+			u32 j;
+
+			for (j = 0; j < oa_config->flex_regs_len; j++) {
+				if (i915_mmio_reg_offset(oa_config->flex_regs[j].addr) == mmio) {
+					value = oa_config->flex_regs[j].value;
+					break;
+				}
 			}
 		}
 
@@ -1641,7 +1687,8 @@ static int gen8_emit_oa_config(struct drm_i915_gem_request *req)
 	return 0;
 }
 
-static int gen8_switch_to_updated_kernel_context(struct drm_i915_private *dev_priv)
+static int gen8_switch_to_updated_kernel_context(struct drm_i915_private *dev_priv,
+						 const struct i915_oa_config *oa_config)
 {
 	struct intel_engine_cs *engine = dev_priv->engine[RCS];
 	struct i915_gem_timeline *timeline;
@@ -1656,7 +1703,7 @@ static int gen8_switch_to_updated_kernel_context(struct drm_i915_private *dev_pr
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
-	ret = gen8_emit_oa_config(req);
+	ret = gen8_emit_oa_config(req, oa_config);
 	if (ret) {
 		i915_add_request(req);
 		return ret;
@@ -1707,6 +1754,7 @@ static int gen8_switch_to_updated_kernel_context(struct drm_i915_private *dev_pr
  * Note: it's only the RCS/Render context that has any OA state.
  */
 static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
+				       const struct i915_oa_config *oa_config,
 				       bool interruptible)
 {
 	struct i915_gem_context *ctx;
@@ -1724,7 +1772,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	}
 
 	/* Switch away from any user context. */
-	ret = gen8_switch_to_updated_kernel_context(dev_priv);
+	ret = gen8_switch_to_updated_kernel_context(dev_priv, oa_config);
 	if (ret)
 		goto out;
 
@@ -1746,7 +1794,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 		goto out;
 
 	/* Update all contexts now that we've stalled the submission. */
-	list_for_each_entry(ctx, &dev_priv->context_list, link) {
+	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
 		struct intel_context *ce = &ctx->engine[RCS];
 		u32 *regs;
 
@@ -1763,7 +1811,7 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 		ce->state->obj->mm.dirty = true;
 		regs += LRC_STATE_PN * PAGE_SIZE / sizeof(*regs);
 
-		gen8_update_reg_state_unlocked(ctx, regs);
+		gen8_update_reg_state_unlocked(ctx, regs, oa_config);
 
 		i915_gem_object_unpin_map(ce->state->obj);
 	}
@@ -1774,13 +1822,10 @@ static int gen8_configure_all_contexts(struct drm_i915_private *dev_priv,
 	return ret;
 }
 
-static int gen8_enable_metric_set(struct drm_i915_private *dev_priv)
+static int gen8_enable_metric_set(struct drm_i915_private *dev_priv,
+				  const struct i915_oa_config *oa_config)
 {
-	int ret = dev_priv->perf.oa.ops.select_metric_set(dev_priv);
-	int i;
-
-	if (ret)
-		return ret;
+	int ret;
 
 	/*
 	 * We disable slice/unslice clock ratio change reports on SKL since
@@ -1817,19 +1862,14 @@ static int gen8_enable_metric_set(struct drm_i915_private *dev_priv)
 	 * to make sure all slices/subslices are ON before writing to NOA
 	 * registers.
 	 */
-	ret = gen8_configure_all_contexts(dev_priv, true);
+	ret = gen8_configure_all_contexts(dev_priv, oa_config, true);
 	if (ret)
 		return ret;
 
-	I915_WRITE(GDT_CHICKEN_BITS, 0xA0);
-	for (i = 0; i < dev_priv->perf.oa.n_mux_configs; i++) {
-		config_oa_regs(dev_priv, dev_priv->perf.oa.mux_regs[i],
-			       dev_priv->perf.oa.mux_regs_lens[i]);
-	}
-	I915_WRITE(GDT_CHICKEN_BITS, 0x80);
+	config_oa_regs(dev_priv, oa_config->mux_regs, oa_config->mux_regs_len);
 
-	config_oa_regs(dev_priv, dev_priv->perf.oa.b_counter_regs,
-		       dev_priv->perf.oa.b_counter_regs_len);
+	config_oa_regs(dev_priv, oa_config->b_counter_regs,
+		       oa_config->b_counter_regs_len);
 
 	return 0;
 }
@@ -1837,7 +1877,11 @@ static int gen8_enable_metric_set(struct drm_i915_private *dev_priv)
 static void gen8_disable_metric_set(struct drm_i915_private *dev_priv)
 {
 	/* Reset all contexts' slices/subslices configurations. */
-	gen8_configure_all_contexts(dev_priv, false);
+	gen8_configure_all_contexts(dev_priv, NULL, false);
+
+	I915_WRITE(GDT_CHICKEN_BITS, (I915_READ(GDT_CHICKEN_BITS) &
+				      ~GT_NOA_ENABLE));
+
 }
 
 static void gen7_oa_enable(struct drm_i915_private *dev_priv)
@@ -2011,11 +2055,6 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 		return -EBUSY;
 	}
 
-	if (!props->metrics_set) {
-		DRM_DEBUG("OA metric set not specified\n");
-		return -EINVAL;
-	}
-
 	if (!props->oa_format) {
 		DRM_DEBUG("OA report format not specified\n");
 		return -EINVAL;
@@ -2055,8 +2094,6 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	dev_priv->perf.oa.oa_buffer.format =
 		dev_priv->perf.oa.oa_formats[props->oa_format].format;
 
-	dev_priv->perf.oa.metrics_set = props->metrics_set;
-
 	dev_priv->perf.oa.periodic = props->oa_periodic;
 	if (dev_priv->perf.oa.periodic)
 		dev_priv->perf.oa.period_exponent = props->oa_period_exponent;
@@ -2067,6 +2104,10 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 			return ret;
 	}
 
+	ret = get_oa_config(dev_priv, props->metrics_set, &stream->oa_config);
+	if (ret)
+		goto err_config;
+
 	/* PRM - observability performance counters:
 	 *
 	 *   OACONTROL, performance counter enable, note:
@@ -2086,22 +2127,39 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
 	if (ret)
 		goto err_oa_buf_alloc;
 
-	ret = dev_priv->perf.oa.ops.enable_metric_set(dev_priv);
+	ret = dev_priv->perf.oa.ops.enable_metric_set(dev_priv,
+						      stream->oa_config);
 	if (ret)
 		goto err_enable;
 
 	stream->ops = &i915_oa_stream_ops;
 
+	/* Lock device for exclusive_stream access late because
+	 * enable_metric_set() might lock as well on gen8+.
+	 */
+	ret = i915_mutex_lock_interruptible(&dev_priv->drm);
+	if (ret)
+		goto err_lock;
+
 	dev_priv->perf.oa.exclusive_stream = stream;
 
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+
 	return 0;
 
+err_lock:
+	dev_priv->perf.oa.ops.disable_metric_set(dev_priv);
+
 err_enable:
 	free_oa_buffer(dev_priv);
 
 err_oa_buf_alloc:
+	put_oa_config(dev_priv, stream->oa_config);
+
 	intel_uncore_forcewake_put(dev_priv, FORCEWAKE_ALL);
 	intel_runtime_pm_put(dev_priv);
+
+err_config:
 	if (stream->ctx)
 		oa_put_render_ctx_id(stream);
 
@@ -2112,15 +2170,14 @@ void i915_oa_init_reg_state(struct intel_engine_cs *engine,
 			    struct i915_gem_context *ctx,
 			    u32 *reg_state)
 {
-	struct drm_i915_private *dev_priv = engine->i915;
+	struct i915_perf_stream *stream;
 
 	if (engine->id != RCS)
 		return;
 
-	if (!dev_priv->perf.initialized)
-		return;
-
-	gen8_update_reg_state_unlocked(ctx, reg_state);
+	stream = engine->i915->perf.oa.exclusive_stream;
+	if (stream)
+		gen8_update_reg_state_unlocked(ctx, reg_state, stream->oa_config);
 }
 
 /**
@@ -2444,7 +2501,7 @@ static void i915_perf_destroy_locked(struct i915_perf_stream *stream)
 	list_del(&stream->link);
 
 	if (stream->ctx)
-		i915_gem_context_put_unlocked(stream->ctx);
+		i915_gem_context_put(stream->ctx);
 
 	kfree(stream);
 }
@@ -2483,27 +2540,6 @@ static const struct file_operations fops = {
 };
 
 
-static struct i915_gem_context *
-lookup_context(struct drm_i915_private *dev_priv,
-	       struct drm_i915_file_private *file_priv,
-	       u32 ctx_user_handle)
-{
-	struct i915_gem_context *ctx;
-	int ret;
-
-	ret = i915_mutex_lock_interruptible(&dev_priv->drm);
-	if (ret)
-		return ERR_PTR(ret);
-
-	ctx = i915_gem_context_lookup(file_priv, ctx_user_handle);
-	if (!IS_ERR(ctx))
-		i915_gem_context_get(ctx);
-
-	mutex_unlock(&dev_priv->drm.struct_mutex);
-
-	return ctx;
-}
-
 /**
  * i915_perf_open_ioctl_locked - DRM ioctl() for userspace to open a stream FD
  * @dev_priv: i915 device instance
@@ -2545,12 +2581,11 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 		u32 ctx_handle = props->ctx_handle;
 		struct drm_i915_file_private *file_priv = file->driver_priv;
 
-		specific_ctx = lookup_context(dev_priv, file_priv, ctx_handle);
-		if (IS_ERR(specific_ctx)) {
-			ret = PTR_ERR(specific_ctx);
-			if (ret != -EINTR)
-				DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n",
-					  ctx_handle);
+		specific_ctx = i915_gem_context_lookup(file_priv, ctx_handle);
+		if (!specific_ctx) {
+			DRM_DEBUG("Failed to look up context with ID %u for opening perf stream\n",
+				  ctx_handle);
+			ret = -ENOENT;
 			goto err;
 		}
 	}
@@ -2633,7 +2668,7 @@ i915_perf_open_ioctl_locked(struct drm_i915_private *dev_priv,
 	kfree(stream);
 err_ctx:
 	if (specific_ctx)
-		i915_gem_context_put_unlocked(specific_ctx);
+		i915_gem_context_put(specific_ctx);
 err:
 	return ret;
 }
@@ -2665,7 +2700,7 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 				    struct perf_open_properties *props)
 {
 	u64 __user *uprop = uprops;
-	int i;
+	u32 i;
 
 	memset(props, 0, sizeof(struct perf_open_properties));
 
@@ -2712,8 +2747,7 @@ static int read_properties_unlocked(struct drm_i915_private *dev_priv,
 			props->sample_flags |= SAMPLE_OA_REPORT;
 			break;
 		case DRM_I915_PERF_PROP_OA_METRICS_SET:
-			if (value == 0 ||
-			    value > dev_priv->perf.oa.n_builtin_sets) {
+			if (value == 0) {
 				DRM_DEBUG("Unknown OA metric set ID\n");
 				return -EINVAL;
 			}
@@ -2852,6 +2886,8 @@ int i915_perf_open_ioctl(struct drm_device *dev, void *data,
  */
 void i915_perf_register(struct drm_i915_private *dev_priv)
 {
+	int ret;
+
 	if (!dev_priv->perf.initialized)
 		return;
 
@@ -2867,44 +2903,42 @@ void i915_perf_register(struct drm_i915_private *dev_priv)
 	if (!dev_priv->perf.metrics_kobj)
 		goto exit;
 
+	sysfs_attr_init(&dev_priv->perf.oa.test_config.sysfs_metric_id.attr);
+
 	if (IS_HASWELL(dev_priv)) {
-		if (i915_perf_register_sysfs_hsw(dev_priv))
-			goto sysfs_error;
+		i915_perf_load_test_config_hsw(dev_priv);
 	} else if (IS_BROADWELL(dev_priv)) {
-		if (i915_perf_register_sysfs_bdw(dev_priv))
-			goto sysfs_error;
+		i915_perf_load_test_config_bdw(dev_priv);
 	} else if (IS_CHERRYVIEW(dev_priv)) {
-		if (i915_perf_register_sysfs_chv(dev_priv))
-			goto sysfs_error;
+		i915_perf_load_test_config_chv(dev_priv);
 	} else if (IS_SKYLAKE(dev_priv)) {
-		if (IS_SKL_GT2(dev_priv)) {
-			if (i915_perf_register_sysfs_sklgt2(dev_priv))
-				goto sysfs_error;
-		} else if (IS_SKL_GT3(dev_priv)) {
-			if (i915_perf_register_sysfs_sklgt3(dev_priv))
-				goto sysfs_error;
-		} else if (IS_SKL_GT4(dev_priv)) {
-			if (i915_perf_register_sysfs_sklgt4(dev_priv))
-				goto sysfs_error;
-		} else
-			goto sysfs_error;
+		if (IS_SKL_GT2(dev_priv))
+			i915_perf_load_test_config_sklgt2(dev_priv);
+		else if (IS_SKL_GT3(dev_priv))
+			i915_perf_load_test_config_sklgt3(dev_priv);
+		else if (IS_SKL_GT4(dev_priv))
+			i915_perf_load_test_config_sklgt4(dev_priv);
 	} else if (IS_BROXTON(dev_priv)) {
-		if (i915_perf_register_sysfs_bxt(dev_priv))
-			goto sysfs_error;
+		i915_perf_load_test_config_bxt(dev_priv);
 	} else if (IS_KABYLAKE(dev_priv)) {
-		if (IS_KBL_GT2(dev_priv)) {
-			if (i915_perf_register_sysfs_kblgt2(dev_priv))
-				goto sysfs_error;
-		} else if (IS_KBL_GT3(dev_priv)) {
-			if (i915_perf_register_sysfs_kblgt3(dev_priv))
-				goto sysfs_error;
-		} else
-			goto sysfs_error;
+		if (IS_KBL_GT2(dev_priv))
+			i915_perf_load_test_config_kblgt2(dev_priv);
+		else if (IS_KBL_GT3(dev_priv))
+			i915_perf_load_test_config_kblgt3(dev_priv);
 	} else if (IS_GEMINILAKE(dev_priv)) {
-		if (i915_perf_register_sysfs_glk(dev_priv))
-			goto sysfs_error;
+		i915_perf_load_test_config_glk(dev_priv);
 	}
 
+	if (dev_priv->perf.oa.test_config.id == 0)
+		goto sysfs_error;
+
+	ret = sysfs_create_group(dev_priv->perf.metrics_kobj,
+				 &dev_priv->perf.oa.test_config.sysfs_metric);
+	if (ret)
+		goto sysfs_error;
+
+	atomic_set(&dev_priv->perf.oa.test_config.ref_count, 1);
+
 	goto exit;
 
 sysfs_error:
@@ -2929,34 +2963,375 @@ void i915_perf_unregister(struct drm_i915_private *dev_priv)
 	if (!dev_priv->perf.metrics_kobj)
 		return;
 
-	if (IS_HASWELL(dev_priv))
-		i915_perf_unregister_sysfs_hsw(dev_priv);
-	else if (IS_BROADWELL(dev_priv))
-		i915_perf_unregister_sysfs_bdw(dev_priv);
-	else if (IS_CHERRYVIEW(dev_priv))
-		i915_perf_unregister_sysfs_chv(dev_priv);
-	else if (IS_SKYLAKE(dev_priv)) {
-		if (IS_SKL_GT2(dev_priv))
-			i915_perf_unregister_sysfs_sklgt2(dev_priv);
-		else if (IS_SKL_GT3(dev_priv))
-			i915_perf_unregister_sysfs_sklgt3(dev_priv);
-		else if (IS_SKL_GT4(dev_priv))
-			i915_perf_unregister_sysfs_sklgt4(dev_priv);
-	} else if (IS_BROXTON(dev_priv))
-		i915_perf_unregister_sysfs_bxt(dev_priv);
-	else if (IS_KABYLAKE(dev_priv)) {
-		if (IS_KBL_GT2(dev_priv))
-			i915_perf_unregister_sysfs_kblgt2(dev_priv);
-		else if (IS_KBL_GT3(dev_priv))
-			i915_perf_unregister_sysfs_kblgt3(dev_priv);
-	} else if (IS_GEMINILAKE(dev_priv))
-		i915_perf_unregister_sysfs_glk(dev_priv);
-
+	sysfs_remove_group(dev_priv->perf.metrics_kobj,
+			   &dev_priv->perf.oa.test_config.sysfs_metric);
 
 	kobject_put(dev_priv->perf.metrics_kobj);
 	dev_priv->perf.metrics_kobj = NULL;
 }
 
+static bool gen8_is_valid_flex_addr(struct drm_i915_private *dev_priv, u32 addr)
+{
+	static const i915_reg_t flex_eu_regs[] = {
+		EU_PERF_CNTL0,
+		EU_PERF_CNTL1,
+		EU_PERF_CNTL2,
+		EU_PERF_CNTL3,
+		EU_PERF_CNTL4,
+		EU_PERF_CNTL5,
+		EU_PERF_CNTL6,
+	};
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(flex_eu_regs); i++) {
+		if (flex_eu_regs[i].reg == addr)
+			return true;
+	}
+	return false;
+}
+
+static bool gen7_is_valid_b_counter_addr(struct drm_i915_private *dev_priv, u32 addr)
+{
+	return (addr >= OASTARTTRIG1.reg && addr <= OASTARTTRIG8.reg) ||
+		(addr >= OAREPORTTRIG1.reg && addr <= OAREPORTTRIG8.reg) ||
+		(addr >= OACEC0_0.reg && addr <= OACEC7_1.reg);
+}
+
+static bool gen7_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
+{
+	return addr == HALF_SLICE_CHICKEN2.reg ||
+		(addr >= MICRO_BP0_0.reg && addr <= NOA_WRITE.reg) ||
+		(addr >= OA_PERFCNT1_LO.reg && addr <= OA_PERFCNT2_HI.reg) ||
+		(addr >= OA_PERFMATRIX_LO.reg && addr <= OA_PERFMATRIX_HI.reg);
+}
+
+static bool gen8_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
+{
+	return gen7_is_valid_mux_addr(dev_priv, addr) ||
+		addr == WAIT_FOR_RC6_EXIT.reg ||
+		(addr >= RPM_CONFIG0.reg && addr <= NOA_CONFIG(8).reg);
+}
+
+static bool hsw_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
+{
+	return gen7_is_valid_mux_addr(dev_priv, addr) ||
+		(addr >= 0x25100 && addr <= 0x2FF90) ||
+		addr == 0x9ec0;
+}
+
+static bool chv_is_valid_mux_addr(struct drm_i915_private *dev_priv, u32 addr)
+{
+	return gen7_is_valid_mux_addr(dev_priv, addr) ||
+		(addr >= 0x182300 && addr <= 0x1823A4);
+}
+
+static uint32_t mask_reg_value(u32 reg, u32 val)
+{
+	/* HALF_SLICE_CHICKEN2 is programmed with the
+	 * WaDisableSTUnitPowerOptimization workaround. Make sure the value
+	 * programmed by userspace doesn't change this.
+	 */
+	if (HALF_SLICE_CHICKEN2.reg == reg)
+		val = val & ~_MASKED_BIT_ENABLE(GEN8_ST_PO_DISABLE);
+
+	/* WAIT_FOR_RC6_EXIT has only one bit fulfilling the function
+	 * indicated by its name and a bunch of selection fields used by OA
+	 * configs.
+	 */
+	if (WAIT_FOR_RC6_EXIT.reg == reg)
+		val = val & ~_MASKED_BIT_ENABLE(HSW_WAIT_FOR_RC6_EXIT_ENABLE);
+
+	return val;
+}
+
+static struct i915_oa_reg *alloc_oa_regs(struct drm_i915_private *dev_priv,
+					 bool (*is_valid)(struct drm_i915_private *dev_priv, u32 addr),
+					 u32 __user *regs,
+					 u32 n_regs)
+{
+	struct i915_oa_reg *oa_regs;
+	int err;
+	u32 i;
+
+	if (!n_regs)
+		return NULL;
+
+	if (!access_ok(VERIFY_READ, regs, n_regs * sizeof(u32) * 2))
+		return ERR_PTR(-EFAULT);
+
+	/* No is_valid function means we're not allowing any register to be programmed. */
+	GEM_BUG_ON(!is_valid);
+	if (!is_valid)
+		return ERR_PTR(-EINVAL);
+
+	oa_regs = kmalloc_array(n_regs, sizeof(*oa_regs), GFP_KERNEL);
+	if (!oa_regs)
+		return ERR_PTR(-ENOMEM);
+
+	for (i = 0; i < n_regs; i++) {
+		u32 addr, value;
+
+		err = get_user(addr, regs);
+		if (err)
+			goto addr_err;
+
+		if (!is_valid(dev_priv, addr)) {
+			DRM_DEBUG("Invalid oa_reg address: %X\n", addr);
+			err = -EINVAL;
+			goto addr_err;
+		}
+
+		err = get_user(value, regs + 1);
+		if (err)
+			goto addr_err;
+
+		oa_regs[i].addr = _MMIO(addr);
+		oa_regs[i].value = mask_reg_value(addr, value);
+
+		regs += 2;
+	}
+
+	return oa_regs;
+
+addr_err:
+	kfree(oa_regs);
+	return ERR_PTR(err);
+}
+
+static ssize_t show_dynamic_id(struct device *dev,
+			       struct device_attribute *attr,
+			       char *buf)
+{
+	struct i915_oa_config *oa_config =
+		container_of(attr, typeof(*oa_config), sysfs_metric_id);
+
+	return sprintf(buf, "%d\n", oa_config->id);
+}
+
+static int create_dynamic_oa_sysfs_entry(struct drm_i915_private *dev_priv,
+					 struct i915_oa_config *oa_config)
+{
+	sysfs_attr_init(&oa_config->sysfs_metric_id.attr);
+	oa_config->sysfs_metric_id.attr.name = "id";
+	oa_config->sysfs_metric_id.attr.mode = S_IRUGO;
+	oa_config->sysfs_metric_id.show = show_dynamic_id;
+	oa_config->sysfs_metric_id.store = NULL;
+
+	oa_config->attrs[0] = &oa_config->sysfs_metric_id.attr;
+	oa_config->attrs[1] = NULL;
+
+	oa_config->sysfs_metric.name = oa_config->uuid;
+	oa_config->sysfs_metric.attrs = oa_config->attrs;
+
+	return sysfs_create_group(dev_priv->perf.metrics_kobj,
+				  &oa_config->sysfs_metric);
+}
+
+/**
+ * i915_perf_add_config_ioctl - DRM ioctl() for userspace to add a new OA config
+ * @dev: drm device
+ * @data: ioctl data (pointer to struct drm_i915_perf_oa_config) copied from
+ *        userspace (unvalidated)
+ * @file: drm file
+ *
+ * Validates the submitted OA registers to be saved into a new OA config that
+ * can then be used for programming the OA unit and its NOA network.
+ *
+ * Returns: A newly allocated config number to be used with the perf open ioctl
+ * or a negative error code on failure.
+ */
+int i915_perf_add_config_ioctl(struct drm_device *dev, void *data,
+			       struct drm_file *file)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	struct drm_i915_perf_oa_config *args = data;
+	struct i915_oa_config *oa_config, *tmp;
+	int err, id;
+
+	if (!dev_priv->perf.initialized) {
+		DRM_DEBUG("i915 perf interface not available for this system\n");
+		return -ENOTSUPP;
+	}
+
+	if (!dev_priv->perf.metrics_kobj) {
+		DRM_DEBUG("OA metrics weren't advertised via sysfs\n");
+		return -EINVAL;
+	}
+
+	if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+		DRM_DEBUG("Insufficient privileges to add i915 OA config\n");
+		return -EACCES;
+	}
+
+	if ((!args->mux_regs_ptr || !args->n_mux_regs) &&
+	    (!args->boolean_regs_ptr || !args->n_boolean_regs) &&
+	    (!args->flex_regs_ptr || !args->n_flex_regs)) {
+		DRM_DEBUG("No OA registers given\n");
+		return -EINVAL;
+	}
+
+	oa_config = kzalloc(sizeof(*oa_config), GFP_KERNEL);
+	if (!oa_config) {
+		DRM_DEBUG("Failed to allocate memory for the OA config\n");
+		return -ENOMEM;
+	}
+
+	atomic_set(&oa_config->ref_count, 1);
+
+	if (!uuid_is_valid(args->uuid)) {
+		DRM_DEBUG("Invalid uuid format for OA config\n");
+		err = -EINVAL;
+		goto reg_err;
+	}
+
+	/* Last character in oa_config->uuid will be 0 because oa_config is
+	 * kzalloc'ed.
+	 */
+	memcpy(oa_config->uuid, args->uuid, sizeof(args->uuid));
+
+	oa_config->mux_regs_len = args->n_mux_regs;
+	oa_config->mux_regs =
+		alloc_oa_regs(dev_priv,
+			      dev_priv->perf.oa.ops.is_valid_mux_reg,
+			      u64_to_user_ptr(args->mux_regs_ptr),
+			      args->n_mux_regs);
+
+	if (IS_ERR(oa_config->mux_regs)) {
+		DRM_DEBUG("Failed to create OA config for mux_regs\n");
+		err = PTR_ERR(oa_config->mux_regs);
+		goto reg_err;
+	}
+
+	oa_config->b_counter_regs_len = args->n_boolean_regs;
+	oa_config->b_counter_regs =
+		alloc_oa_regs(dev_priv,
+			      dev_priv->perf.oa.ops.is_valid_b_counter_reg,
+			      u64_to_user_ptr(args->boolean_regs_ptr),
+			      args->n_boolean_regs);
+
+	if (IS_ERR(oa_config->b_counter_regs)) {
+		DRM_DEBUG("Failed to create OA config for b_counter_regs\n");
+		err = PTR_ERR(oa_config->b_counter_regs);
+		goto reg_err;
+	}
+
+	if (INTEL_GEN(dev_priv) < 8) {
+		if (args->n_flex_regs != 0) {
+			err = -EINVAL;
+			goto reg_err;
+		}
+	} else {
+		oa_config->flex_regs_len = args->n_flex_regs;
+		oa_config->flex_regs =
+			alloc_oa_regs(dev_priv,
+				      dev_priv->perf.oa.ops.is_valid_flex_reg,
+				      u64_to_user_ptr(args->flex_regs_ptr),
+				      args->n_flex_regs);
+
+		if (IS_ERR(oa_config->flex_regs)) {
+			DRM_DEBUG("Failed to create OA config for flex_regs\n");
+			err = PTR_ERR(oa_config->flex_regs);
+			goto reg_err;
+		}
+	}
+
+	err = mutex_lock_interruptible(&dev_priv->perf.metrics_lock);
+	if (err)
+		goto reg_err;
+
+	/* We shouldn't have too many configs, so this iteration shouldn't be
+	 * too costly.
+	 */
+	idr_for_each_entry(&dev_priv->perf.metrics_idr, tmp, id) {
+		if (!strcmp(tmp->uuid, oa_config->uuid)) {
+			DRM_DEBUG("OA config already exists with this uuid\n");
+			err = -EADDRINUSE;
+			goto sysfs_err;
+		}
+	}
+
+	err = create_dynamic_oa_sysfs_entry(dev_priv, oa_config);
+	if (err) {
+		DRM_DEBUG("Failed to create sysfs entry for OA config\n");
+		goto sysfs_err;
+	}
+
+	/* Config id 0 is invalid, id 1 for kernel stored test config. */
+	oa_config->id = idr_alloc(&dev_priv->perf.metrics_idr,
+				  oa_config, 2,
+				  0, GFP_KERNEL);
+	if (oa_config->id < 0) {
+		DRM_DEBUG("Failed to allocate id for OA config\n");
+		err = oa_config->id;
+		goto sysfs_err;
+	}
+
+	mutex_unlock(&dev_priv->perf.metrics_lock);
+
+	return oa_config->id;
+
+sysfs_err:
+	mutex_unlock(&dev_priv->perf.metrics_lock);
+reg_err:
+	put_oa_config(dev_priv, oa_config);
+	DRM_DEBUG("Failed to add new OA config\n");
+	return err;
+}
+
+/**
+ * i915_perf_remove_config_ioctl - DRM ioctl() for userspace to remove an OA config
+ * @dev: drm device
+ * @data: ioctl data (pointer to u64 integer) copied from userspace
+ * @file: drm file
+ *
+ * Configs can be removed while being used, they will stop appearing in sysfs
+ * and their content will be freed when the stream using the config is closed.
+ *
+ * Returns: 0 on success or a negative error code on failure.
+ */
+int i915_perf_remove_config_ioctl(struct drm_device *dev, void *data,
+				  struct drm_file *file)
+{
+	struct drm_i915_private *dev_priv = dev->dev_private;
+	u64 *arg = data;
+	struct i915_oa_config *oa_config;
+	int ret;
+
+	if (!dev_priv->perf.initialized) {
+		DRM_DEBUG("i915 perf interface not available for this system\n");
+		return -ENOTSUPP;
+	}
+
+	if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+		DRM_DEBUG("Insufficient privileges to remove i915 OA config\n");
+		return -EACCES;
+	}
+
+	ret = mutex_lock_interruptible(&dev_priv->perf.metrics_lock);
+	if (ret)
+		goto lock_err;
+
+	oa_config = idr_find(&dev_priv->perf.metrics_idr, *arg);
+	if (!oa_config) {
+		DRM_DEBUG("Failed to remove unknown OA config\n");
+		ret = -ENOENT;
+		goto config_err;
+	}
+
+	GEM_BUG_ON(*arg != oa_config->id);
+
+	sysfs_remove_group(dev_priv->perf.metrics_kobj,
+			   &oa_config->sysfs_metric);
+
+	idr_remove(&dev_priv->perf.metrics_idr, *arg);
+	put_oa_config(dev_priv, oa_config);
+
+config_err:
+	mutex_unlock(&dev_priv->perf.metrics_lock);
+lock_err:
+	return ret;
+}
+
 static struct ctl_table oa_table[] = {
 	{
 	 .procname = "perf_stream_paranoid",
@@ -3010,9 +3385,14 @@ static struct ctl_table dev_root[] = {
  */
 void i915_perf_init(struct drm_i915_private *dev_priv)
 {
-	dev_priv->perf.oa.n_builtin_sets = 0;
+	dev_priv->perf.oa.timestamp_frequency = 0;
 
 	if (IS_HASWELL(dev_priv)) {
+		dev_priv->perf.oa.ops.is_valid_b_counter_reg =
+			gen7_is_valid_b_counter_addr;
+		dev_priv->perf.oa.ops.is_valid_mux_reg =
+			hsw_is_valid_mux_addr;
+		dev_priv->perf.oa.ops.is_valid_flex_reg = NULL;
 		dev_priv->perf.oa.ops.init_oa_buffer = gen7_init_oa_buffer;
 		dev_priv->perf.oa.ops.enable_metric_set = hsw_enable_metric_set;
 		dev_priv->perf.oa.ops.disable_metric_set = hsw_disable_metric_set;
@@ -3025,9 +3405,6 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		dev_priv->perf.oa.timestamp_frequency = 12500000;
 
 		dev_priv->perf.oa.oa_formats = hsw_oa_formats;
-
-		dev_priv->perf.oa.n_builtin_sets =
-			i915_oa_n_builtin_metric_sets_hsw;
 	} else if (i915.enable_execlists) {
 		/* Note: that although we could theoretically also support the
 		 * legacy ringbuffer mode on BDW (and earlier iterations of
@@ -3035,6 +3412,22 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 		 * worth the complexity to maintain now that BDW+ enable
 		 * execlist mode by default.
 		 */
+		dev_priv->perf.oa.ops.is_valid_b_counter_reg =
+			gen7_is_valid_b_counter_addr;
+		dev_priv->perf.oa.ops.is_valid_mux_reg =
+			gen8_is_valid_mux_addr;
+		dev_priv->perf.oa.ops.is_valid_flex_reg =
+			gen8_is_valid_flex_addr;
+
+		dev_priv->perf.oa.ops.init_oa_buffer = gen8_init_oa_buffer;
+		dev_priv->perf.oa.ops.enable_metric_set = gen8_enable_metric_set;
+		dev_priv->perf.oa.ops.disable_metric_set = gen8_disable_metric_set;
+		dev_priv->perf.oa.ops.oa_enable = gen8_oa_enable;
+		dev_priv->perf.oa.ops.oa_disable = gen8_oa_disable;
+		dev_priv->perf.oa.ops.read = gen8_oa_read;
+		dev_priv->perf.oa.ops.oa_hw_tail_read = gen8_oa_hw_tail_read;
+
+		dev_priv->perf.oa.oa_formats = gen8_plus_oa_formats;
 
 		if (IS_GEN8(dev_priv)) {
 			dev_priv->perf.oa.ctx_oactxctrl_offset = 0x120;
@@ -3043,85 +3436,35 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 			dev_priv->perf.oa.timestamp_frequency = 12500000;
 
 			dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<25);
-
-			if (IS_BROADWELL(dev_priv)) {
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_bdw;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_bdw;
-			} else if (IS_CHERRYVIEW(dev_priv)) {
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_chv;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_chv;
+			if (IS_CHERRYVIEW(dev_priv)) {
+				dev_priv->perf.oa.ops.is_valid_mux_reg =
+					chv_is_valid_mux_addr;
 			}
 		} else if (IS_GEN9(dev_priv)) {
 			dev_priv->perf.oa.ctx_oactxctrl_offset = 0x128;
 			dev_priv->perf.oa.ctx_flexeu0_offset = 0x3de;
 
-			dev_priv->perf.oa.timestamp_frequency = 12000000;
-
 			dev_priv->perf.oa.gen8_valid_ctx_bit = (1<<16);
 
-			if (IS_SKL_GT2(dev_priv)) {
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_sklgt2;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_sklgt2;
-			} else if (IS_SKL_GT3(dev_priv)) {
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_sklgt3;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_sklgt3;
-			} else if (IS_SKL_GT4(dev_priv)) {
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_sklgt4;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_sklgt4;
-			} else if (IS_BROXTON(dev_priv)) {
+			switch (dev_priv->info.platform) {
+			case INTEL_BROXTON:
+			case INTEL_GEMINILAKE:
 				dev_priv->perf.oa.timestamp_frequency = 19200000;
-
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_bxt;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_bxt;
-			} else if (IS_KBL_GT2(dev_priv)) {
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_kblgt2;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_kblgt2;
-			} else if (IS_KBL_GT3(dev_priv)) {
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_kblgt3;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_kblgt3;
-			} else if (IS_GEMINILAKE(dev_priv)) {
-				dev_priv->perf.oa.timestamp_frequency = 19200000;
-
-				dev_priv->perf.oa.n_builtin_sets =
-					i915_oa_n_builtin_metric_sets_glk;
-				dev_priv->perf.oa.ops.select_metric_set =
-					i915_oa_select_metric_set_glk;
+				break;
+			case INTEL_SKYLAKE:
+			case INTEL_KABYLAKE:
+				dev_priv->perf.oa.timestamp_frequency = 12000000;
+				break;
+			default:
+				/* Leave timestamp_frequency at 0 so we can
+				 * detect unsupported platforms.
+				 */
+				break;
 			}
 		}
-
-		if (dev_priv->perf.oa.n_builtin_sets) {
-			dev_priv->perf.oa.ops.init_oa_buffer = gen8_init_oa_buffer;
-			dev_priv->perf.oa.ops.enable_metric_set =
-				gen8_enable_metric_set;
-			dev_priv->perf.oa.ops.disable_metric_set =
-				gen8_disable_metric_set;
-			dev_priv->perf.oa.ops.oa_enable = gen8_oa_enable;
-			dev_priv->perf.oa.ops.oa_disable = gen8_oa_disable;
-			dev_priv->perf.oa.ops.read = gen8_oa_read;
-			dev_priv->perf.oa.ops.oa_hw_tail_read =
-				gen8_oa_hw_tail_read;
-
-			dev_priv->perf.oa.oa_formats = gen8_plus_oa_formats;
-		}
 	}
 
-	if (dev_priv->perf.oa.n_builtin_sets) {
+	if (dev_priv->perf.oa.timestamp_frequency) {
 		hrtimer_init(&dev_priv->perf.oa.poll_check_timer,
 				CLOCK_MONOTONIC, HRTIMER_MODE_REL);
 		dev_priv->perf.oa.poll_check_timer.function = oa_poll_check_timer_cb;
@@ -3135,10 +3478,23 @@ void i915_perf_init(struct drm_i915_private *dev_priv)
 			dev_priv->perf.oa.timestamp_frequency / 2;
 		dev_priv->perf.sysctl_header = register_sysctl_table(dev_root);
 
+		mutex_init(&dev_priv->perf.metrics_lock);
+		idr_init(&dev_priv->perf.metrics_idr);
+
 		dev_priv->perf.initialized = true;
 	}
 }
 
+static int destroy_config(int id, void *p, void *data)
+{
+	struct drm_i915_private *dev_priv = data;
+	struct i915_oa_config *oa_config = p;
+
+	put_oa_config(dev_priv, oa_config);
+
+	return 0;
+}
+
 /**
  * i915_perf_fini - Counter part to i915_perf_init()
  * @dev_priv: i915 device instance
@@ -3148,6 +3504,9 @@ void i915_perf_fini(struct drm_i915_private *dev_priv)
 	if (!dev_priv->perf.initialized)
 		return;
 
+	idr_for_each(&dev_priv->perf.metrics_idr, destroy_config, dev_priv);
+	idr_destroy(&dev_priv->perf.metrics_idr);
+
 	unregister_sysctl_table(dev_priv->perf.sysctl_header);
 
 	memset(&dev_priv->perf.oa.ops, 0, sizeof(dev_priv->perf.oa.ops));
diff --git a/drivers/gpu/drm/i915/i915_pvinfo.h b/drivers/gpu/drm/i915/i915_pvinfo.h
index 2cfe96d3..0679a58 100644
--- a/drivers/gpu/drm/i915/i915_pvinfo.h
+++ b/drivers/gpu/drm/i915/i915_pvinfo.h
@@ -49,12 +49,18 @@ enum vgt_g2v_type {
 	VGT_G2V_MAX,
 };
 
+/*
+ * VGT capabilities type
+ */
+#define VGT_CAPS_FULL_48BIT_PPGTT	BIT(2)
+
 struct vgt_if {
 	u64 magic;		/* VGT_MAGIC */
 	u16 version_major;
 	u16 version_minor;
 	u32 vgt_id;		/* ID of vGT instance */
-	u32 rsv1[12];		/* pad to offset 0x40 */
+	u32 vgt_caps;		/* VGT capabilities */
+	u32 rsv1[11];		/* pad to offset 0x40 */
 	/*
 	 *  Data structure to describe the balooning info of resources.
 	 *  Each VM can only have one portion of continuous area for now.
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 64cc674..ed7cd9e 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -25,6 +25,97 @@
 #ifndef _I915_REG_H_
 #define _I915_REG_H_
 
+/**
+ * DOC: The i915 register macro definition style guide
+ *
+ * Follow the style described here for new macros, and while changing existing
+ * macros. Do **not** mass change existing definitions just to update the style.
+ *
+ * Layout
+ * ''''''
+ *
+ * Keep helper macros near the top. For example, _PIPE() and friends.
+ *
+ * Prefix macros that generally should not be used outside of this file with
+ * underscore '_'. For example, _PIPE() and friends, single instances of
+ * registers that are defined solely for the use by function-like macros.
+ *
+ * Avoid using the underscore prefixed macros outside of this file. There are
+ * exceptions, but keep them to a minimum.
+ *
+ * There are two basic types of register definitions: Single registers and
+ * register groups. Register groups are registers which have two or more
+ * instances, for example one per pipe, port, transcoder, etc. Register groups
+ * should be defined using function-like macros.
+ *
+ * For single registers, define the register offset first, followed by register
+ * contents.
+ *
+ * For register groups, define the register instance offsets first, prefixed
+ * with underscore, followed by a function-like macro choosing the right
+ * instance based on the parameter, followed by register contents.
+ *
+ * Define the register contents (i.e. bit and bit field macros) from most
+ * significant to least significant bit. Indent the register content macros
+ * using two extra spaces between ``#define`` and the macro name.
+ *
+ * For bit fields, define a ``_MASK`` and a ``_SHIFT`` macro. Define bit field
+ * contents so that they are already shifted in place, and can be directly
+ * OR'd. For convenience, function-like macros may be used to define bit fields,
+ * but do note that the macros may be needed to read as well as write the
+ * register contents.
+ *
+ * Define bits using ``(1 << N)`` instead of ``BIT(N)``. We may change this in
+ * the future, but this is the prevailing style. Do **not** add ``_BIT`` suffix
+ * to the name.
+ *
+ * Group the register and its contents together without blank lines, separate
+ * from other registers and their contents with one blank line.
+ *
+ * Indent macro values from macro names using TABs. Align values vertically. Use
+ * braces in macro values as needed to avoid unintended precedence after macro
+ * substitution. Use spaces in macro values according to kernel coding
+ * style. Use lower case in hexadecimal values.
+ *
+ * Naming
+ * ''''''
+ *
+ * Try to name registers according to the specs. If the register name changes in
+ * the specs from one platform to another, stick to the original name.
+ *
+ * Try to re-use existing register macro definitions. Only add new macros for
+ * new register offsets, or when the register contents have changed enough to
+ * warrant a full redefinition.
+ *
+ * When a register macro changes for a new platform, prefix the new macro using
+ * the platform acronym or generation. For example, ``SKL_`` or ``GEN8_``. The
+ * prefix signifies the start platform/generation using the register.
+ *
+ * When a bit (field) macro changes or gets added for a new platform, while
+ * retaining the existing register macro, add a platform acronym or generation
+ * suffix to the name. For example, ``_SKL`` or ``_GEN8``.
+ *
+ * Examples
+ * ''''''''
+ *
+ * (Note that the values in the example are indented using spaces instead of
+ * TABs to avoid misalignment in generated documentation. Use TABs in the
+ * definitions.)::
+ *
+ *  #define _FOO_A                      0xf000
+ *  #define _FOO_B                      0xf001
+ *  #define FOO(pipe)                   _MMIO_PIPE(pipe, _FOO_A, _FOO_B)
+ *  #define   FOO_ENABLE                (1 << 31)
+ *  #define   FOO_MODE_MASK             (0xf << 16)
+ *  #define   FOO_MODE_SHIFT            16
+ *  #define   FOO_MODE_BAR              (0 << 16)
+ *  #define   FOO_MODE_BAZ              (1 << 16)
+ *  #define   FOO_MODE_QUX_SNB          (2 << 16)
+ *
+ *  #define BAR                         _MMIO(0xb000)
+ *  #define GEN8_BAR                    _MMIO(0xb888)
+ */
+
 typedef struct {
 	uint32_t reg;
 } i915_reg_t;
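
A minimal sketch of how the ``_MASK``/``_SHIFT`` convention from the style comment above might be used for a read-modify-write of a bit field. The FOO macros are the ones from the comment's own example; the helpers foo_set_mode() and foo_get_mode() are hypothetical names introduced only for illustration, and the macros use 1u instead of 1 so that bit 31 is well defined in standalone C (the kernel header style uses plain ``(1 << N)``).

```c
#include <assert.h>
#include <stdint.h>

/* Register contents from the FOO example, most significant bit first. */
#define   FOO_ENABLE                (1u << 31)
#define   FOO_MODE_MASK             (0xfu << 16)
#define   FOO_MODE_SHIFT            16
#define   FOO_MODE_BAR              (0u << 16)
#define   FOO_MODE_BAZ              (1u << 16)
#define   FOO_MODE_QUX_SNB          (2u << 16)

/* Hypothetical helper: replace the mode field and keep all other bits. */
uint32_t foo_set_mode(uint32_t val, uint32_t mode)
{
	val &= ~FOO_MODE_MASK;
	val |= mode;	/* field values are already shifted in place */
	return val;
}

/* Hypothetical helper: extract the raw (unshifted) mode field value. */
uint32_t foo_get_mode(uint32_t val)
{
	return (val & FOO_MODE_MASK) >> FOO_MODE_SHIFT;
}

/* Usage: switch an enabled register value from BAZ to QUX_SNB mode. */
void foo_selfcheck(void)
{
	uint32_t val = FOO_ENABLE | FOO_MODE_BAZ;

	val = foo_set_mode(val, FOO_MODE_QUX_SNB);
	assert(val & FOO_ENABLE);
	assert(foo_get_mode(val) == 2);
}
```

Because the field values are defined pre-shifted, writing only needs the mask, while reading back a raw value needs both the mask and the shift. This is why the comment asks for both macros.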
@@ -229,6 +320,28 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   GEN8_RPCS_EU_MIN_SHIFT	0
 #define   GEN8_RPCS_EU_MIN_MASK		(0xf << GEN8_RPCS_EU_MIN_SHIFT)
 
+#define WAIT_FOR_RC6_EXIT		_MMIO(0x20CC)
+/* HSW only */
+#define   HSW_SELECTIVE_READ_ADDRESSING_SHIFT		2
+#define   HSW_SELECTIVE_READ_ADDRESSING_MASK		(0x3 << HSW_SELECTIVE_READ_ADDRESSING_SHIFT)
+#define   HSW_SELECTIVE_WRITE_ADDRESS_SHIFT		4
+#define   HSW_SELECTIVE_WRITE_ADDRESS_MASK		(0x7 << HSW_SELECTIVE_WRITE_ADDRESS_SHIFT)
+/* HSW+ */
+#define   HSW_WAIT_FOR_RC6_EXIT_ENABLE			(1 << 0)
+#define   HSW_RCS_CONTEXT_ENABLE			(1 << 7)
+#define   HSW_RCS_INHIBIT				(1 << 8)
+/* Gen8 */
+#define   GEN8_SELECTIVE_WRITE_ADDRESS_SHIFT		4
+#define   GEN8_SELECTIVE_WRITE_ADDRESS_MASK		(0x3 << GEN8_SELECTIVE_WRITE_ADDRESS_SHIFT)
+#define   GEN8_SELECTIVE_WRITE_ADDRESS_SHIFT		4
+#define   GEN8_SELECTIVE_WRITE_ADDRESS_MASK		(0x3 << GEN8_SELECTIVE_WRITE_ADDRESS_SHIFT)
+#define   GEN8_SELECTIVE_WRITE_ADDRESSING_ENABLE	(1 << 6)
+#define   GEN8_SELECTIVE_READ_SUBSLICE_SELECT_SHIFT	9
+#define   GEN8_SELECTIVE_READ_SUBSLICE_SELECT_MASK	(0x3 << GEN8_SELECTIVE_READ_SUBSLICE_SELECT_SHIFT)
+#define   GEN8_SELECTIVE_READ_SLICE_SELECT_SHIFT	11
+#define   GEN8_SELECTIVE_READ_SLICE_SELECT_MASK		(0x3 << GEN8_SELECTIVE_READ_SLICE_SELECT_SHIFT)
+#define   GEN8_SELECTIVE_READ_ADDRESSING_ENABLE	(1 << 13)
+
 #define GAM_ECOCHK			_MMIO(0x4090)
 #define   BDW_DISABLE_HDC_INVALIDATION	(1<<25)
 #define   ECOCHK_SNB_BIT		(1<<10)
@@ -729,119 +842,10 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define EU_PERF_CNTL5	    _MMIO(0xe55c)
 #define EU_PERF_CNTL6	    _MMIO(0xe65c)
 
-#define GDT_CHICKEN_BITS    _MMIO(0x9840)
-#define GT_NOA_ENABLE	    0x00000080
-
 /*
  * OA Boolean state
  */
 
-#define OAREPORTTRIG1 _MMIO(0x2740)
-#define OAREPORTTRIG1_THRESHOLD_MASK 0xffff
-#define OAREPORTTRIG1_EDGE_LEVEL_TRIGER_SELECT_MASK 0xffff0000 /* 0=level */
-
-#define OAREPORTTRIG2 _MMIO(0x2744)
-#define OAREPORTTRIG2_INVERT_A_0  (1<<0)
-#define OAREPORTTRIG2_INVERT_A_1  (1<<1)
-#define OAREPORTTRIG2_INVERT_A_2  (1<<2)
-#define OAREPORTTRIG2_INVERT_A_3  (1<<3)
-#define OAREPORTTRIG2_INVERT_A_4  (1<<4)
-#define OAREPORTTRIG2_INVERT_A_5  (1<<5)
-#define OAREPORTTRIG2_INVERT_A_6  (1<<6)
-#define OAREPORTTRIG2_INVERT_A_7  (1<<7)
-#define OAREPORTTRIG2_INVERT_A_8  (1<<8)
-#define OAREPORTTRIG2_INVERT_A_9  (1<<9)
-#define OAREPORTTRIG2_INVERT_A_10 (1<<10)
-#define OAREPORTTRIG2_INVERT_A_11 (1<<11)
-#define OAREPORTTRIG2_INVERT_A_12 (1<<12)
-#define OAREPORTTRIG2_INVERT_A_13 (1<<13)
-#define OAREPORTTRIG2_INVERT_A_14 (1<<14)
-#define OAREPORTTRIG2_INVERT_A_15 (1<<15)
-#define OAREPORTTRIG2_INVERT_B_0  (1<<16)
-#define OAREPORTTRIG2_INVERT_B_1  (1<<17)
-#define OAREPORTTRIG2_INVERT_B_2  (1<<18)
-#define OAREPORTTRIG2_INVERT_B_3  (1<<19)
-#define OAREPORTTRIG2_INVERT_C_0  (1<<20)
-#define OAREPORTTRIG2_INVERT_C_1  (1<<21)
-#define OAREPORTTRIG2_INVERT_D_0  (1<<22)
-#define OAREPORTTRIG2_THRESHOLD_ENABLE	    (1<<23)
-#define OAREPORTTRIG2_REPORT_TRIGGER_ENABLE (1<<31)
-
-#define OAREPORTTRIG3 _MMIO(0x2748)
-#define OAREPORTTRIG3_NOA_SELECT_MASK	    0xf
-#define OAREPORTTRIG3_NOA_SELECT_8_SHIFT    0
-#define OAREPORTTRIG3_NOA_SELECT_9_SHIFT    4
-#define OAREPORTTRIG3_NOA_SELECT_10_SHIFT   8
-#define OAREPORTTRIG3_NOA_SELECT_11_SHIFT   12
-#define OAREPORTTRIG3_NOA_SELECT_12_SHIFT   16
-#define OAREPORTTRIG3_NOA_SELECT_13_SHIFT   20
-#define OAREPORTTRIG3_NOA_SELECT_14_SHIFT   24
-#define OAREPORTTRIG3_NOA_SELECT_15_SHIFT   28
-
-#define OAREPORTTRIG4 _MMIO(0x274c)
-#define OAREPORTTRIG4_NOA_SELECT_MASK	    0xf
-#define OAREPORTTRIG4_NOA_SELECT_0_SHIFT    0
-#define OAREPORTTRIG4_NOA_SELECT_1_SHIFT    4
-#define OAREPORTTRIG4_NOA_SELECT_2_SHIFT    8
-#define OAREPORTTRIG4_NOA_SELECT_3_SHIFT    12
-#define OAREPORTTRIG4_NOA_SELECT_4_SHIFT    16
-#define OAREPORTTRIG4_NOA_SELECT_5_SHIFT    20
-#define OAREPORTTRIG4_NOA_SELECT_6_SHIFT    24
-#define OAREPORTTRIG4_NOA_SELECT_7_SHIFT    28
-
-#define OAREPORTTRIG5 _MMIO(0x2750)
-#define OAREPORTTRIG5_THRESHOLD_MASK 0xffff
-#define OAREPORTTRIG5_EDGE_LEVEL_TRIGER_SELECT_MASK 0xffff0000 /* 0=level */
-
-#define OAREPORTTRIG6 _MMIO(0x2754)
-#define OAREPORTTRIG6_INVERT_A_0  (1<<0)
-#define OAREPORTTRIG6_INVERT_A_1  (1<<1)
-#define OAREPORTTRIG6_INVERT_A_2  (1<<2)
-#define OAREPORTTRIG6_INVERT_A_3  (1<<3)
-#define OAREPORTTRIG6_INVERT_A_4  (1<<4)
-#define OAREPORTTRIG6_INVERT_A_5  (1<<5)
-#define OAREPORTTRIG6_INVERT_A_6  (1<<6)
-#define OAREPORTTRIG6_INVERT_A_7  (1<<7)
-#define OAREPORTTRIG6_INVERT_A_8  (1<<8)
-#define OAREPORTTRIG6_INVERT_A_9  (1<<9)
-#define OAREPORTTRIG6_INVERT_A_10 (1<<10)
-#define OAREPORTTRIG6_INVERT_A_11 (1<<11)
-#define OAREPORTTRIG6_INVERT_A_12 (1<<12)
-#define OAREPORTTRIG6_INVERT_A_13 (1<<13)
-#define OAREPORTTRIG6_INVERT_A_14 (1<<14)
-#define OAREPORTTRIG6_INVERT_A_15 (1<<15)
-#define OAREPORTTRIG6_INVERT_B_0  (1<<16)
-#define OAREPORTTRIG6_INVERT_B_1  (1<<17)
-#define OAREPORTTRIG6_INVERT_B_2  (1<<18)
-#define OAREPORTTRIG6_INVERT_B_3  (1<<19)
-#define OAREPORTTRIG6_INVERT_C_0  (1<<20)
-#define OAREPORTTRIG6_INVERT_C_1  (1<<21)
-#define OAREPORTTRIG6_INVERT_D_0  (1<<22)
-#define OAREPORTTRIG6_THRESHOLD_ENABLE	    (1<<23)
-#define OAREPORTTRIG6_REPORT_TRIGGER_ENABLE (1<<31)
-
-#define OAREPORTTRIG7 _MMIO(0x2758)
-#define OAREPORTTRIG7_NOA_SELECT_MASK	    0xf
-#define OAREPORTTRIG7_NOA_SELECT_8_SHIFT    0
-#define OAREPORTTRIG7_NOA_SELECT_9_SHIFT    4
-#define OAREPORTTRIG7_NOA_SELECT_10_SHIFT   8
-#define OAREPORTTRIG7_NOA_SELECT_11_SHIFT   12
-#define OAREPORTTRIG7_NOA_SELECT_12_SHIFT   16
-#define OAREPORTTRIG7_NOA_SELECT_13_SHIFT   20
-#define OAREPORTTRIG7_NOA_SELECT_14_SHIFT   24
-#define OAREPORTTRIG7_NOA_SELECT_15_SHIFT   28
-
-#define OAREPORTTRIG8 _MMIO(0x275c)
-#define OAREPORTTRIG8_NOA_SELECT_MASK	    0xf
-#define OAREPORTTRIG8_NOA_SELECT_0_SHIFT    0
-#define OAREPORTTRIG8_NOA_SELECT_1_SHIFT    4
-#define OAREPORTTRIG8_NOA_SELECT_2_SHIFT    8
-#define OAREPORTTRIG8_NOA_SELECT_3_SHIFT    12
-#define OAREPORTTRIG8_NOA_SELECT_4_SHIFT    16
-#define OAREPORTTRIG8_NOA_SELECT_5_SHIFT    20
-#define OAREPORTTRIG8_NOA_SELECT_6_SHIFT    24
-#define OAREPORTTRIG8_NOA_SELECT_7_SHIFT    28
-
 #define OASTARTTRIG1 _MMIO(0x2710)
 #define OASTARTTRIG1_THRESHOLD_COUNT_MASK_MBZ 0xffff0000
 #define OASTARTTRIG1_THRESHOLD_MASK	      0xffff
@@ -956,6 +960,112 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define OASTARTTRIG8_NOA_SELECT_6_SHIFT    24
 #define OASTARTTRIG8_NOA_SELECT_7_SHIFT    28
 
+#define OAREPORTTRIG1 _MMIO(0x2740)
+#define OAREPORTTRIG1_THRESHOLD_MASK 0xffff
+#define OAREPORTTRIG1_EDGE_LEVEL_TRIGER_SELECT_MASK 0xffff0000 /* 0=level */
+
+#define OAREPORTTRIG2 _MMIO(0x2744)
+#define OAREPORTTRIG2_INVERT_A_0  (1<<0)
+#define OAREPORTTRIG2_INVERT_A_1  (1<<1)
+#define OAREPORTTRIG2_INVERT_A_2  (1<<2)
+#define OAREPORTTRIG2_INVERT_A_3  (1<<3)
+#define OAREPORTTRIG2_INVERT_A_4  (1<<4)
+#define OAREPORTTRIG2_INVERT_A_5  (1<<5)
+#define OAREPORTTRIG2_INVERT_A_6  (1<<6)
+#define OAREPORTTRIG2_INVERT_A_7  (1<<7)
+#define OAREPORTTRIG2_INVERT_A_8  (1<<8)
+#define OAREPORTTRIG2_INVERT_A_9  (1<<9)
+#define OAREPORTTRIG2_INVERT_A_10 (1<<10)
+#define OAREPORTTRIG2_INVERT_A_11 (1<<11)
+#define OAREPORTTRIG2_INVERT_A_12 (1<<12)
+#define OAREPORTTRIG2_INVERT_A_13 (1<<13)
+#define OAREPORTTRIG2_INVERT_A_14 (1<<14)
+#define OAREPORTTRIG2_INVERT_A_15 (1<<15)
+#define OAREPORTTRIG2_INVERT_B_0  (1<<16)
+#define OAREPORTTRIG2_INVERT_B_1  (1<<17)
+#define OAREPORTTRIG2_INVERT_B_2  (1<<18)
+#define OAREPORTTRIG2_INVERT_B_3  (1<<19)
+#define OAREPORTTRIG2_INVERT_C_0  (1<<20)
+#define OAREPORTTRIG2_INVERT_C_1  (1<<21)
+#define OAREPORTTRIG2_INVERT_D_0  (1<<22)
+#define OAREPORTTRIG2_THRESHOLD_ENABLE	    (1<<23)
+#define OAREPORTTRIG2_REPORT_TRIGGER_ENABLE (1<<31)
+
+#define OAREPORTTRIG3 _MMIO(0x2748)
+#define OAREPORTTRIG3_NOA_SELECT_MASK	    0xf
+#define OAREPORTTRIG3_NOA_SELECT_8_SHIFT    0
+#define OAREPORTTRIG3_NOA_SELECT_9_SHIFT    4
+#define OAREPORTTRIG3_NOA_SELECT_10_SHIFT   8
+#define OAREPORTTRIG3_NOA_SELECT_11_SHIFT   12
+#define OAREPORTTRIG3_NOA_SELECT_12_SHIFT   16
+#define OAREPORTTRIG3_NOA_SELECT_13_SHIFT   20
+#define OAREPORTTRIG3_NOA_SELECT_14_SHIFT   24
+#define OAREPORTTRIG3_NOA_SELECT_15_SHIFT   28
+
+#define OAREPORTTRIG4 _MMIO(0x274c)
+#define OAREPORTTRIG4_NOA_SELECT_MASK	    0xf
+#define OAREPORTTRIG4_NOA_SELECT_0_SHIFT    0
+#define OAREPORTTRIG4_NOA_SELECT_1_SHIFT    4
+#define OAREPORTTRIG4_NOA_SELECT_2_SHIFT    8
+#define OAREPORTTRIG4_NOA_SELECT_3_SHIFT    12
+#define OAREPORTTRIG4_NOA_SELECT_4_SHIFT    16
+#define OAREPORTTRIG4_NOA_SELECT_5_SHIFT    20
+#define OAREPORTTRIG4_NOA_SELECT_6_SHIFT    24
+#define OAREPORTTRIG4_NOA_SELECT_7_SHIFT    28
+
+#define OAREPORTTRIG5 _MMIO(0x2750)
+#define OAREPORTTRIG5_THRESHOLD_MASK 0xffff
+#define OAREPORTTRIG5_EDGE_LEVEL_TRIGER_SELECT_MASK 0xffff0000 /* 0=level */
+
+#define OAREPORTTRIG6 _MMIO(0x2754)
+#define OAREPORTTRIG6_INVERT_A_0  (1<<0)
+#define OAREPORTTRIG6_INVERT_A_1  (1<<1)
+#define OAREPORTTRIG6_INVERT_A_2  (1<<2)
+#define OAREPORTTRIG6_INVERT_A_3  (1<<3)
+#define OAREPORTTRIG6_INVERT_A_4  (1<<4)
+#define OAREPORTTRIG6_INVERT_A_5  (1<<5)
+#define OAREPORTTRIG6_INVERT_A_6  (1<<6)
+#define OAREPORTTRIG6_INVERT_A_7  (1<<7)
+#define OAREPORTTRIG6_INVERT_A_8  (1<<8)
+#define OAREPORTTRIG6_INVERT_A_9  (1<<9)
+#define OAREPORTTRIG6_INVERT_A_10 (1<<10)
+#define OAREPORTTRIG6_INVERT_A_11 (1<<11)
+#define OAREPORTTRIG6_INVERT_A_12 (1<<12)
+#define OAREPORTTRIG6_INVERT_A_13 (1<<13)
+#define OAREPORTTRIG6_INVERT_A_14 (1<<14)
+#define OAREPORTTRIG6_INVERT_A_15 (1<<15)
+#define OAREPORTTRIG6_INVERT_B_0  (1<<16)
+#define OAREPORTTRIG6_INVERT_B_1  (1<<17)
+#define OAREPORTTRIG6_INVERT_B_2  (1<<18)
+#define OAREPORTTRIG6_INVERT_B_3  (1<<19)
+#define OAREPORTTRIG6_INVERT_C_0  (1<<20)
+#define OAREPORTTRIG6_INVERT_C_1  (1<<21)
+#define OAREPORTTRIG6_INVERT_D_0  (1<<22)
+#define OAREPORTTRIG6_THRESHOLD_ENABLE	    (1<<23)
+#define OAREPORTTRIG6_REPORT_TRIGGER_ENABLE (1<<31)
+
+#define OAREPORTTRIG7 _MMIO(0x2758)
+#define OAREPORTTRIG7_NOA_SELECT_MASK	    0xf
+#define OAREPORTTRIG7_NOA_SELECT_8_SHIFT    0
+#define OAREPORTTRIG7_NOA_SELECT_9_SHIFT    4
+#define OAREPORTTRIG7_NOA_SELECT_10_SHIFT   8
+#define OAREPORTTRIG7_NOA_SELECT_11_SHIFT   12
+#define OAREPORTTRIG7_NOA_SELECT_12_SHIFT   16
+#define OAREPORTTRIG7_NOA_SELECT_13_SHIFT   20
+#define OAREPORTTRIG7_NOA_SELECT_14_SHIFT   24
+#define OAREPORTTRIG7_NOA_SELECT_15_SHIFT   28
+
+#define OAREPORTTRIG8 _MMIO(0x275c)
+#define OAREPORTTRIG8_NOA_SELECT_MASK	    0xf
+#define OAREPORTTRIG8_NOA_SELECT_0_SHIFT    0
+#define OAREPORTTRIG8_NOA_SELECT_1_SHIFT    4
+#define OAREPORTTRIG8_NOA_SELECT_2_SHIFT    8
+#define OAREPORTTRIG8_NOA_SELECT_3_SHIFT    12
+#define OAREPORTTRIG8_NOA_SELECT_4_SHIFT    16
+#define OAREPORTTRIG8_NOA_SELECT_5_SHIFT    20
+#define OAREPORTTRIG8_NOA_SELECT_6_SHIFT    24
+#define OAREPORTTRIG8_NOA_SELECT_7_SHIFT    28
+
 /* CECX_0 */
 #define OACEC_COMPARE_LESS_OR_EQUAL	6
 #define OACEC_COMPARE_NOT_EQUAL		5
@@ -994,6 +1104,51 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define OACEC7_0 _MMIO(0x27a8)
 #define OACEC7_1 _MMIO(0x27ac)
 
+/* OA perf counters */
+#define OA_PERFCNT1_LO      _MMIO(0x91B8)
+#define OA_PERFCNT1_HI      _MMIO(0x91BC)
+#define OA_PERFCNT2_LO      _MMIO(0x91C0)
+#define OA_PERFCNT2_HI      _MMIO(0x91C4)
+
+#define OA_PERFMATRIX_LO    _MMIO(0x91C8)
+#define OA_PERFMATRIX_HI    _MMIO(0x91CC)
+
+/* RPM unit config (Gen8+) */
+#define RPM_CONFIG0	    _MMIO(0x0D00)
+#define RPM_CONFIG1	    _MMIO(0x0D04)
+
+/* RPC unit config (Gen8+) */
+#define RPM_CONFIG	    _MMIO(0x0D08)
+
+/* NOA (Gen8+) */
+#define NOA_CONFIG(i)	    _MMIO(0x0D0C + (i) * 4)
+
+#define MICRO_BP0_0	    _MMIO(0x9800)
+#define MICRO_BP0_2	    _MMIO(0x9804)
+#define MICRO_BP0_1	    _MMIO(0x9808)
+
+#define MICRO_BP1_0	    _MMIO(0x980C)
+#define MICRO_BP1_2	    _MMIO(0x9810)
+#define MICRO_BP1_1	    _MMIO(0x9814)
+
+#define MICRO_BP2_0	    _MMIO(0x9818)
+#define MICRO_BP2_2	    _MMIO(0x981C)
+#define MICRO_BP2_1	    _MMIO(0x9820)
+
+#define MICRO_BP3_0	    _MMIO(0x9824)
+#define MICRO_BP3_2	    _MMIO(0x9828)
+#define MICRO_BP3_1	    _MMIO(0x982C)
+
+#define MICRO_BP_TRIGGER		_MMIO(0x9830)
+#define MICRO_BP3_COUNT_STATUS01	_MMIO(0x9834)
+#define MICRO_BP3_COUNT_STATUS23	_MMIO(0x9838)
+#define MICRO_BP_FIRED_ARMED		_MMIO(0x983C)
+
+#define GDT_CHICKEN_BITS    _MMIO(0x9840)
+#define   GT_NOA_ENABLE	    0x00000080
+
+#define NOA_DATA	    _MMIO(0x986C)
+#define NOA_WRITE	    _MMIO(0x9888)
 
 #define _GEN7_PIPEA_DE_LOAD_SL	0x70068
 #define _GEN7_PIPEB_DE_LOAD_SL	0x71068
@@ -1063,9 +1218,26 @@ static inline bool i915_mmio_reg_valid(i915_reg_t reg)
 #define   DP_SSS_RESET(pipe)			_DP_SSS(0x2, (pipe))
 #define   DP_SSS_PWR_GATE(pipe)			_DP_SSS(0x3, (pipe))
 
-/* See the PUNIT HAS v0.8 for the below bits */
-enum punit_power_well {
-	/* These numbers are fixed and must match the position of the pw bits */
+/*
+ * i915_power_well_id:
+ *
+ * Platform specific IDs used to look up power wells and - except for custom
+ * power wells - to define request/status register flag bit positions. As such,
+ * the set of IDs on a given platform must be unique and, except for custom
+ * power wells, their values must stay fixed.
+ */
+enum i915_power_well_id {
+	/*
+	 * I830
+	 *  - custom power well
+	 */
+	I830_DISP_PW_PIPES = 0,
+
+	/*
+	 * VLV/CHV
+	 *  - PUNIT_REG_PWRGT_CTRL (bit: id*2),
+	 *    PUNIT_REG_PWRGT_STATUS (bit: id*2) (PUNIT HAS v0.8)
+	 */
 	PUNIT_POWER_WELL_RENDER			= 0,
 	PUNIT_POWER_WELL_MEDIA			= 1,
 	PUNIT_POWER_WELL_DISP2D			= 3,
@@ -1077,14 +1249,20 @@ enum punit_power_well {
 	PUNIT_POWER_WELL_DPIO_RX0		= 10,
 	PUNIT_POWER_WELL_DPIO_RX1		= 11,
 	PUNIT_POWER_WELL_DPIO_CMN_D		= 12,
+	/*  - custom power well */
+	CHV_DISP_PW_PIPE_A,			/* 13 */
 
-	/* Not actual bit groups. Used as IDs for lookup_power_well() */
-	PUNIT_POWER_WELL_ALWAYS_ON,
-};
+	/*
+	 * HSW/BDW
+	 *  - HSW_PWR_WELL_CTL_DRIVER(0) (status bit: id*2, req bit: id*2+1)
+	 */
+	HSW_DISP_PW_GLOBAL = 15,
 
-enum skl_disp_power_wells {
-	/* These numbers are fixed and must match the position of the pw bits */
-	SKL_DISP_PW_MISC_IO,
+	/*
+	 * GEN9+
+	 *  - HSW_PWR_WELL_CTL_DRIVER(0) (status bit: id*2, req bit: id*2+1)
+	 */
+	SKL_DISP_PW_MISC_IO = 0,
 	SKL_DISP_PW_DDI_A_E,
 	GLK_DISP_PW_DDI_A = SKL_DISP_PW_DDI_A_E,
 	CNL_DISP_PW_DDI_A = SKL_DISP_PW_DDI_A_E,
@@ -1103,17 +1281,19 @@ enum skl_disp_power_wells {
 	SKL_DISP_PW_1 = 14,
 	SKL_DISP_PW_2,
 
-	/* Not actual bit groups. Used as IDs for lookup_power_well() */
-	SKL_DISP_PW_ALWAYS_ON,
+	/* - custom power wells */
 	SKL_DISP_PW_DC_OFF,
-
 	BXT_DPIO_CMN_A,
 	BXT_DPIO_CMN_BC,
-	GLK_DPIO_CMN_C,
-};
+	GLK_DPIO_CMN_C,			/* 19 */
 
-#define SKL_POWER_WELL_STATE(pw) (1 << ((pw) * 2))
-#define SKL_POWER_WELL_REQ(pw) (1 << (((pw) * 2) + 1))
+	/*
+	 * Multiple platforms.
+	 * Must start following the highest ID of any platform.
+	 * - custom power wells
+	 */
+	I915_DISP_PW_ALWAYS_ON = 20,
+};
 
 #define PUNIT_REG_PWRGT_CTRL			0x60
 #define PUNIT_REG_PWRGT_STATUS			0x61
@@ -2156,6 +2336,7 @@ enum skl_disp_power_wells {
 #define DONE_REG		_MMIO(0x40b0)
 #define GEN8_PRIVATE_PAT_LO	_MMIO(0x40e0)
 #define GEN8_PRIVATE_PAT_HI	_MMIO(0x40e0 + 4)
+#define GEN10_PAT_INDEX(index)	_MMIO(0x40e0 + (index) * 4)
 #define BSD_HWS_PGA_GEN7	_MMIO(0x04180)
 #define BLT_HWS_PGA_GEN7	_MMIO(0x04280)
 #define VEBOX_HWS_PGA_GEN7	_MMIO(0x04380)
@@ -3522,7 +3703,7 @@ enum skl_disp_power_wells {
 #define INTERVAL_1_28_US(us)	roundup(((us) * 100) >> 7, 25)
 #define INTERVAL_1_33_US(us)	(((us) * 3)   >> 2)
 #define INTERVAL_0_833_US(us)	(((us) * 6) / 5)
-#define GT_INTERVAL_FROM_US(dev_priv, us) (IS_GEN9(dev_priv) ? \
+#define GT_INTERVAL_FROM_US(dev_priv, us) (INTEL_GEN(dev_priv) >= 9 ? \
 				(IS_GEN9_LP(dev_priv) ? \
 				INTERVAL_0_833_US(us) : \
 				INTERVAL_1_33_US(us)) : \
@@ -3531,7 +3712,7 @@ enum skl_disp_power_wells {
 #define INTERVAL_1_28_TO_US(interval)  (((interval) << 7) / 100)
 #define INTERVAL_1_33_TO_US(interval)  (((interval) << 2) / 3)
 #define INTERVAL_0_833_TO_US(interval) (((interval) * 5)  / 6)
-#define GT_PM_INTERVAL_TO_US(dev_priv, interval) (IS_GEN9(dev_priv) ? \
+#define GT_PM_INTERVAL_TO_US(dev_priv, interval) (INTEL_GEN(dev_priv) >= 9 ? \
                            (IS_GEN9_LP(dev_priv) ? \
                            INTERVAL_0_833_TO_US(interval) : \
                            INTERVAL_1_33_TO_US(interval)) : \
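
The interval helpers in the two hunks above convert between microseconds and hardware timer ticks. Judging only by the macro names, the tick lengths are about 1.28 us, 1.33 us and 0.833 us; that reading is an assumption. The sketch below copies the six arithmetic macros and wraps them in hypothetical functions so the rounding can be checked in isolation. ROUNDUP() is a stand-in for the kernel's roundup() and is assumed to behave the same for positive integers. The platform selection done by GT_INTERVAL_FROM_US() is left out because its last branch is not visible in the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the kernel's roundup(); assumption for standalone use. */
#define ROUNDUP(x, y)	((((x) + (y) - 1) / (y)) * (y))

/* Arithmetic copied from the hunk, with roundup() swapped for ROUNDUP(). */
#define INTERVAL_1_28_US(us)	ROUNDUP(((us) * 100) >> 7, 25)
#define INTERVAL_1_33_US(us)	(((us) * 3)   >> 2)
#define INTERVAL_0_833_US(us)	(((us) * 6) / 5)

#define INTERVAL_1_28_TO_US(interval)  (((interval) << 7) / 100)
#define INTERVAL_1_33_TO_US(interval)  (((interval) << 2) / 3)
#define INTERVAL_0_833_TO_US(interval) (((interval) * 5)  / 6)

/* Hypothetical wrappers, named only for this example. */
uint32_t us_to_ticks_1_28(uint32_t us)  { return INTERVAL_1_28_US(us); }
uint32_t us_to_ticks_1_33(uint32_t us)  { return INTERVAL_1_33_US(us); }
uint32_t us_to_ticks_0_833(uint32_t us) { return INTERVAL_0_833_US(us); }

uint32_t ticks_1_28_to_us(uint32_t t)  { return INTERVAL_1_28_TO_US(t); }
uint32_t ticks_1_33_to_us(uint32_t t)  { return INTERVAL_1_33_TO_US(t); }
uint32_t ticks_0_833_to_us(uint32_t t) { return INTERVAL_0_833_TO_US(t); }

/* Usage: 1000 us round-trips exactly only where no rounding is applied. */
void interval_selfcheck(void)
{
	assert(ticks_1_33_to_us(us_to_ticks_1_33(1000)) == 1000);
	assert(ticks_0_833_to_us(us_to_ticks_0_833(1000)) == 1000);
	/* The 1.28 us variant rounds up to a multiple of 25 ticks. */
	assert(ticks_1_28_to_us(us_to_ticks_1_28(1000)) == 1024);
}
```

The 1.28 us conversion is not an exact inverse because of the rounding to a multiple of 25, so a value read back may be slightly larger than the one requested.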
@@ -3783,6 +3964,7 @@ enum {
 #define EDP_PSR_CTL				_MMIO(dev_priv->psr_mmio_base + 0)
 #define   EDP_PSR_ENABLE			(1<<31)
 #define   BDW_PSR_SINGLE_FRAME			(1<<30)
+#define   EDP_PSR_RESTORE_PSR_ACTIVE_CTX_MASK	(1<<29) /* SW can't modify */
 #define   EDP_PSR_LINK_STANDBY			(1<<27)
 #define   EDP_PSR_MIN_LINK_ENTRY_TIME_MASK	(3<<25)
 #define   EDP_PSR_MIN_LINK_ENTRY_TIME_8_LINES	(0<<25)
@@ -5227,6 +5409,9 @@ enum {
 
 #define _PIPE_MISC_A			0x70030
 #define _PIPE_MISC_B			0x71030
+#define   PIPEMISC_YUV420_ENABLE	(1<<27)
+#define   PIPEMISC_YUV420_MODE_FULL_BLEND (1<<26)
+#define   PIPEMISC_OUTPUT_COLORSPACE_YUV  (1<<11)
 #define   PIPEMISC_DITHER_BPC_MASK	(7<<5)
 #define   PIPEMISC_DITHER_8_BPC		(0<<5)
 #define   PIPEMISC_DITHER_10_BPC	(1<<5)
@@ -6106,6 +6291,10 @@ enum {
 #define _PLANE_KEYMSK_2_A			0x70298
 #define _PLANE_KEYMAX_1_A			0x701a0
 #define _PLANE_KEYMAX_2_A			0x702a0
+#define _PLANE_AUX_DIST_1_A			0x701c0
+#define _PLANE_AUX_DIST_2_A			0x702c0
+#define _PLANE_AUX_OFFSET_1_A			0x701c4
+#define _PLANE_AUX_OFFSET_2_A			0x702c4
 #define _PLANE_COLOR_CTL_1_A			0x701CC /* GLK+ */
 #define _PLANE_COLOR_CTL_2_A			0x702CC /* GLK+ */
 #define _PLANE_COLOR_CTL_3_A			0x703CC /* GLK+ */
@@ -6212,6 +6401,24 @@ enum {
 #define PLANE_NV12_BUF_CFG(pipe, plane)	\
 	_MMIO_PLANE(plane, _PLANE_NV12_BUF_CFG_1(pipe), _PLANE_NV12_BUF_CFG_2(pipe))
 
+#define _PLANE_AUX_DIST_1_B		0x711c0
+#define _PLANE_AUX_DIST_2_B		0x712c0
+#define _PLANE_AUX_DIST_1(pipe) \
+			_PIPE(pipe, _PLANE_AUX_DIST_1_A, _PLANE_AUX_DIST_1_B)
+#define _PLANE_AUX_DIST_2(pipe) \
+			_PIPE(pipe, _PLANE_AUX_DIST_2_A, _PLANE_AUX_DIST_2_B)
+#define PLANE_AUX_DIST(pipe, plane)     \
+	_MMIO_PLANE(plane, _PLANE_AUX_DIST_1(pipe), _PLANE_AUX_DIST_2(pipe))
+
+#define _PLANE_AUX_OFFSET_1_B		0x711c4
+#define _PLANE_AUX_OFFSET_2_B		0x712c4
+#define _PLANE_AUX_OFFSET_1(pipe)       \
+		_PIPE(pipe, _PLANE_AUX_OFFSET_1_A, _PLANE_AUX_OFFSET_1_B)
+#define _PLANE_AUX_OFFSET_2(pipe)       \
+		_PIPE(pipe, _PLANE_AUX_OFFSET_2_A, _PLANE_AUX_OFFSET_2_B)
+#define PLANE_AUX_OFFSET(pipe, plane)   \
+	_MMIO_PLANE(plane, _PLANE_AUX_OFFSET_1(pipe), _PLANE_AUX_OFFSET_2(pipe))
+
 #define _PLANE_COLOR_CTL_1_B			0x711CC
 #define _PLANE_COLOR_CTL_2_B			0x712CC
 #define _PLANE_COLOR_CTL_3_B			0x713CC
@@ -6695,6 +6902,7 @@ enum {
 # define CHICKEN3_DGMG_DONE_FIX_DISABLE		(1 << 2)
 
 #define CHICKEN_PAR1_1		_MMIO(0x42080)
+#define  SKL_RC_HASH_OUTSIDE	(1 << 15)
 #define  DPA_MASK_VBLANK_SRD	(1 << 15)
 #define  FORCE_ARB_IDLE_PLANES	(1 << 14)
 #define  SKL_EDP_PSR_FIX_RDWRAP	(1 << 3)
@@ -6703,12 +6911,10 @@ enum {
 #define  KVM_CONFIG_CHANGE_NOTIFICATION_SELECT	(1 << 14)
 
 #define CHICKEN_MISC_2		_MMIO(0x42084)
-#define  GLK_CL0_PWR_DOWN	(1 << 10)
-#define  GLK_CL1_PWR_DOWN	(1 << 11)
+#define  CNL_COMP_PWR_DOWN	(1 << 23)
 #define  GLK_CL2_PWR_DOWN	(1 << 12)
-
-#define CHICKEN_MISC_2		_MMIO(0x42084)
-#define  COMP_PWR_DOWN		(1 << 23)
+#define  GLK_CL1_PWR_DOWN	(1 << 11)
+#define  GLK_CL0_PWR_DOWN	(1 << 10)
 
 #define _CHICKEN_PIPESL_1_A	0x420b0
 #define _CHICKEN_PIPESL_1_B	0x420b4
@@ -7984,12 +8190,31 @@ enum {
 #define   SKL_AUD_CODEC_WAKE_SIGNAL		(1 << 15)
 
 /* HSW Power Wells */
-#define HSW_PWR_WELL_BIOS			_MMIO(0x45400) /* CTL1 */
-#define HSW_PWR_WELL_DRIVER			_MMIO(0x45404) /* CTL2 */
-#define HSW_PWR_WELL_KVMR			_MMIO(0x45408) /* CTL3 */
-#define HSW_PWR_WELL_DEBUG			_MMIO(0x4540C) /* CTL4 */
-#define   HSW_PWR_WELL_ENABLE_REQUEST		(1<<31)
-#define   HSW_PWR_WELL_STATE_ENABLED		(1<<30)
+#define _HSW_PWR_WELL_CTL1			0x45400
+#define _HSW_PWR_WELL_CTL2			0x45404
+#define _HSW_PWR_WELL_CTL3			0x45408
+#define _HSW_PWR_WELL_CTL4			0x4540C
+
+/*
+ * Each power well control register contains up to 16 (request, status) HW
+ * flag tuples. The register index and HW flag shift are determined by the
+ * power well ID (see i915_power_well_id). There are 4 possible sources of
+ * power well requests, each source having its own set of control registers:
+ * BIOS, DRIVER, KVMR, DEBUG.
+ */
+#define _HSW_PW_REG_IDX(pw)			((pw) >> 4)
+#define _HSW_PW_SHIFT(pw)			(((pw) & 0xf) * 2)
+/* TODO: Add all PWR_WELL_CTL registers below for new platforms */
+#define HSW_PWR_WELL_CTL_BIOS(pw)	_MMIO(_PICK(_HSW_PW_REG_IDX(pw),       \
+						    _HSW_PWR_WELL_CTL1))
+#define HSW_PWR_WELL_CTL_DRIVER(pw)	_MMIO(_PICK(_HSW_PW_REG_IDX(pw),       \
+						    _HSW_PWR_WELL_CTL2))
+#define HSW_PWR_WELL_CTL_KVMR		_MMIO(_HSW_PWR_WELL_CTL3)
+#define HSW_PWR_WELL_CTL_DEBUG(pw)	_MMIO(_PICK(_HSW_PW_REG_IDX(pw),       \
+						    _HSW_PWR_WELL_CTL4))
+
+#define   HSW_PWR_WELL_CTL_REQ(pw)		(1 << (_HSW_PW_SHIFT(pw) + 1))
+#define   HSW_PWR_WELL_CTL_STATE(pw)		(1 << _HSW_PW_SHIFT(pw))
 #define HSW_PWR_WELL_CTL5			_MMIO(0x45410)
 #define   HSW_PWR_WELL_ENABLE_SINGLE_STEP	(1<<31)
 #define   HSW_PWR_WELL_PWR_GATE_OVERRIDE	(1<<20)
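
As a rough illustration of the ID-to-bit mapping described in the comment above, the sketch below copies the four helper macros and evaluates them for a few power well IDs taken from enum i915_power_well_id in this patch. The wrapper functions are hypothetical, 1u replaces the header's plain 1 so that bit 31 is well defined in standalone C, and ID 16 is a made-up value used only to show the register index changing.

```c
#include <assert.h>
#include <stdint.h>

/* Helper macros copied from the hunk above (1u instead of 1). */
#define _HSW_PW_REG_IDX(pw)		((pw) >> 4)
#define _HSW_PW_SHIFT(pw)		(((pw) & 0xf) * 2)
#define HSW_PWR_WELL_CTL_REQ(pw)	(1u << (_HSW_PW_SHIFT(pw) + 1))
#define HSW_PWR_WELL_CTL_STATE(pw)	(1u << _HSW_PW_SHIFT(pw))

/* Subset of the power well IDs defined by this patch. */
enum {
	SKL_DISP_PW_MISC_IO = 0,
	SKL_DISP_PW_1 = 14,
	HSW_DISP_PW_GLOBAL = 15,
};

/* Hypothetical wrappers so the macros can be called like functions. */
int pw_reg_idx(int pw)         { return _HSW_PW_REG_IDX(pw); }
uint32_t pw_req_mask(int pw)   { return HSW_PWR_WELL_CTL_REQ(pw); }
uint32_t pw_state_mask(int pw) { return HSW_PWR_WELL_CTL_STATE(pw); }

/* Usage: ID 15 lands on the bits the removed HSW macros hard-coded. */
void pw_selfcheck(void)
{
	assert(pw_req_mask(HSW_DISP_PW_GLOBAL) == (1u << 31));
	assert(pw_state_mask(HSW_DISP_PW_GLOBAL) == (1u << 30));
	assert(pw_reg_idx(HSW_DISP_PW_GLOBAL) == 0);
	assert(pw_reg_idx(16) == 1);	/* made-up ID: second register */
}
```

With HSW_DISP_PW_GLOBAL = 15 the request and state masks come out as bit 31 and bit 30, the same values as the removed HSW_PWR_WELL_ENABLE_REQUEST and HSW_PWR_WELL_STATE_ENABLED definitions, which appears to be why that ID was chosen.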
@@ -7997,11 +8222,17 @@ enum {
 #define HSW_PWR_WELL_CTL6			_MMIO(0x45414)
 
 /* SKL Fuse Status */
+enum skl_power_gate {
+	SKL_PG0,
+	SKL_PG1,
+	SKL_PG2,
+};
+
 #define SKL_FUSE_STATUS				_MMIO(0x42000)
-#define  SKL_FUSE_DOWNLOAD_STATUS              (1<<31)
-#define  SKL_FUSE_PG0_DIST_STATUS              (1<<27)
-#define  SKL_FUSE_PG1_DIST_STATUS              (1<<26)
-#define  SKL_FUSE_PG2_DIST_STATUS              (1<<25)
+#define  SKL_FUSE_DOWNLOAD_STATUS		(1<<31)
+/* PG0 (HW control->no power well ID), PG1..PG2 (SKL_DISP_PW_1..SKL_DISP_PW_2) */
+#define  SKL_PW_TO_PG(pw)			((pw) - SKL_DISP_PW_1 + SKL_PG1)
+#define  SKL_FUSE_PG_DIST_STATUS(pg)		(1 << (27 - (pg)))
 
 /* Per-pipe DDI Function Control */
 #define _TRANS_DDI_FUNC_CTL_A		0x60400
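
The SKL fuse status change above replaces three fixed bit macros with two parameterized ones. A small sketch, assuming only the definitions visible in this patch (SKL_DISP_PW_1 = 14 followed by SKL_DISP_PW_2), shows that the new macros reproduce the old bit positions. The function names are hypothetical and 1u is used in place of 1 for standalone C.

```c
#include <assert.h>
#include <stdint.h>

enum skl_power_gate {
	SKL_PG0,
	SKL_PG1,
	SKL_PG2,
};

/* Power well IDs as defined in enum i915_power_well_id by this patch. */
enum {
	SKL_DISP_PW_1 = 14,
	SKL_DISP_PW_2,
};

/* Macros copied from the hunk above (1u instead of 1). */
#define SKL_PW_TO_PG(pw)		((pw) - SKL_DISP_PW_1 + SKL_PG1)
#define SKL_FUSE_PG_DIST_STATUS(pg)	(1u << (27 - (pg)))

/* Hypothetical helper: fuse distribution status bit for a power well. */
uint32_t skl_pw_fuse_status_bit(int pw)
{
	return SKL_FUSE_PG_DIST_STATUS(SKL_PW_TO_PG(pw));
}

/* Hypothetical helper: fuse distribution status bit for a power gate. */
uint32_t skl_pg_fuse_status_bit(enum skl_power_gate pg)
{
	return SKL_FUSE_PG_DIST_STATUS(pg);
}
```

PG0 maps to bit 27, PG1 to bit 26 and PG2 to bit 25, matching the removed SKL_FUSE_PG0_DIST_STATUS, SKL_FUSE_PG1_DIST_STATUS and SKL_FUSE_PG2_DIST_STATUS values.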
@@ -8343,6 +8574,7 @@ enum {
 #define  DPLL_CFGCR0_LINK_RATE_3240	(6 << 25)
 #define  DPLL_CFGCR0_LINK_RATE_4050	(7 << 25)
 #define  DPLL_CFGCR0_DCO_FRACTION_MASK	(0x7fff << 10)
+#define  DPLL_CFGCR0_DCO_FRAC_SHIFT	(10)
 #define  DPLL_CFGCR0_DCO_FRACTION(x)	((x) << 10)
 #define  DPLL_CFGCR0_DCO_INTEGER_MASK	(0x3ff)
 #define CNL_DPLL_CFGCR0(pll)		_MMIO_PLL(pll, _CNL_DPLL0_CFGCR0, _CNL_DPLL1_CFGCR0)
@@ -8350,6 +8582,7 @@ enum {
 #define _CNL_DPLL0_CFGCR1		0x6C004
 #define _CNL_DPLL1_CFGCR1		0x6C084
 #define  DPLL_CFGCR1_QDIV_RATIO_MASK	(0xff << 10)
+#define  DPLL_CFGCR1_QDIV_RATIO_SHIFT	(10)
 #define  DPLL_CFGCR1_QDIV_RATIO(x)	((x) << 10)
 #define  DPLL_CFGCR1_QDIV_MODE(x)	((x) << 9)
 #define  DPLL_CFGCR1_KDIV_MASK		(7 << 6)
diff --git a/drivers/gpu/drm/i915/i915_selftest.h b/drivers/gpu/drm/i915/i915_selftest.h
index 9d7d86f..78e1a1b 100644
--- a/drivers/gpu/drm/i915/i915_selftest.h
+++ b/drivers/gpu/drm/i915/i915_selftest.h
@@ -101,6 +101,4 @@ bool __igt_timeout(unsigned long timeout, const char *fmt, ...);
 #define igt_timeout(t, fmt, ...) \
 	__igt_timeout((t), KERN_WARNING pr_fmt(fmt), ##__VA_ARGS__)
 
-#define igt_can_mi_store_dword_imm(D) (INTEL_GEN(D) > 2)
-
 #endif /* !__I915_SELFTEST_H__ */
diff --git a/drivers/gpu/drm/i915/i915_sysfs.c b/drivers/gpu/drm/i915/i915_sysfs.c
index 1eef3fa..d61c872 100644
--- a/drivers/gpu/drm/i915/i915_sysfs.c
+++ b/drivers/gpu/drm/i915/i915_sysfs.c
@@ -96,7 +96,7 @@ static struct attribute *rc6_attrs[] = {
 	NULL
 };
 
-static struct attribute_group rc6_attr_group = {
+static const struct attribute_group rc6_attr_group = {
 	.name = power_group_name,
 	.attrs =  rc6_attrs
 };
@@ -107,7 +107,7 @@ static struct attribute *rc6p_attrs[] = {
 	NULL
 };
 
-static struct attribute_group rc6p_attr_group = {
+static const struct attribute_group rc6p_attr_group = {
 	.name = power_group_name,
 	.attrs =  rc6p_attrs
 };
@@ -117,7 +117,7 @@ static struct attribute *media_rc6_attrs[] = {
 	NULL
 };
 
-static struct attribute_group media_rc6_attr_group = {
+static const struct attribute_group media_rc6_attr_group = {
 	.name = power_group_name,
 	.attrs =  media_rc6_attrs
 };
@@ -209,7 +209,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	memcpy(*remap_info + (offset/4), buf, count);
 
 	/* NB: We defer the remapping until we switch to the context */
-	list_for_each_entry(ctx, &dev_priv->context_list, link)
+	list_for_each_entry(ctx, &dev_priv->contexts.list, link)
 		ctx->remap_slice |= (1<<slice);
 
 	ret = count;
@@ -220,7 +220,7 @@ i915_l3_write(struct file *filp, struct kobject *kobj,
 	return ret;
 }
 
-static struct bin_attribute dpf_attrs = {
+static const struct bin_attribute dpf_attrs = {
 	.attr = {.name = "l3_parity", .mode = (S_IRUSR | S_IWUSR)},
 	.size = GEN7_L3LOG_SIZE,
 	.read = i915_l3_read,
@@ -229,7 +229,7 @@ static struct bin_attribute dpf_attrs = {
 	.private = (void *)0
 };
 
-static struct bin_attribute dpf_attrs_1 = {
+static const struct bin_attribute dpf_attrs_1 = {
 	.attr = {.name = "l3_parity_slice_1", .mode = (S_IRUSR | S_IWUSR)},
 	.size = GEN7_L3LOG_SIZE,
 	.read = i915_l3_read,
@@ -253,7 +253,7 @@ static ssize_t gt_act_freq_mhz_show(struct device *kdev,
 		ret = intel_gpu_freq(dev_priv, (freq >> 8) & 0xff);
 	} else {
 		u32 rpstat = I915_READ(GEN6_RPSTAT1);
-		if (IS_GEN9(dev_priv))
+		if (INTEL_GEN(dev_priv) >= 9)
 			ret = (rpstat & GEN9_CAGF_MASK) >> GEN9_CAGF_SHIFT;
 		else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
 			ret = (rpstat & HSW_CAGF_MASK) >> HSW_CAGF_SHIFT;
@@ -532,7 +532,7 @@ static ssize_t error_state_write(struct file *file, struct kobject *kobj,
 	return count;
 }
 
-static struct bin_attribute error_state_attr = {
+static const struct bin_attribute error_state_attr = {
 	.attr.name = "error",
 	.attr.mode = S_IRUSR | S_IWUSR,
 	.size = 0,
diff --git a/drivers/gpu/drm/i915/i915_vgpu.c b/drivers/gpu/drm/i915/i915_vgpu.c
index cf7a958..5fe9f3f 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.c
+++ b/drivers/gpu/drm/i915/i915_vgpu.c
@@ -75,10 +75,17 @@ void i915_check_vgpu(struct drm_i915_private *dev_priv)
 		return;
 	}
 
+	dev_priv->vgpu.caps = __raw_i915_read32(dev_priv, vgtif_reg(vgt_caps));
+
 	dev_priv->vgpu.active = true;
 	DRM_INFO("Virtual GPU for Intel GVT-g detected.\n");
 }
 
+bool intel_vgpu_has_full_48bit_ppgtt(struct drm_i915_private *dev_priv)
+{
+	return dev_priv->vgpu.caps & VGT_CAPS_FULL_48BIT_PPGTT;
+}
+
 struct _balloon_info_ {
 	/*
 	 * There are up to 2 regions per mappable/unmappable graphic
diff --git a/drivers/gpu/drm/i915/i915_vgpu.h b/drivers/gpu/drm/i915/i915_vgpu.h
index 3c3b2d2..b72bd29 100644
--- a/drivers/gpu/drm/i915/i915_vgpu.h
+++ b/drivers/gpu/drm/i915/i915_vgpu.h
@@ -27,6 +27,9 @@
 #include "i915_pvinfo.h"
 
 void i915_check_vgpu(struct drm_i915_private *dev_priv);
+
+bool intel_vgpu_has_full_48bit_ppgtt(struct drm_i915_private *dev_priv);
+
 int intel_vgt_balloon(struct drm_i915_private *dev_priv);
 void intel_vgt_deballoon(struct drm_i915_private *dev_priv);
 
diff --git a/drivers/gpu/drm/i915/i915_vma.c b/drivers/gpu/drm/i915/i915_vma.c
index 1cfe137..02d1a5e 100644
--- a/drivers/gpu/drm/i915/i915_vma.c
+++ b/drivers/gpu/drm/i915/i915_vma.c
@@ -579,11 +579,17 @@ int __i915_vma_do_pin(struct i915_vma *vma,
 
 static void i915_vma_destroy(struct i915_vma *vma)
 {
+	int i;
+
 	GEM_BUG_ON(vma->node.allocated);
 	GEM_BUG_ON(i915_vma_is_active(vma));
 	GEM_BUG_ON(!i915_vma_is_closed(vma));
 	GEM_BUG_ON(vma->fence);
 
+	for (i = 0; i < ARRAY_SIZE(vma->last_read); i++)
+		GEM_BUG_ON(i915_gem_active_isset(&vma->last_read[i]));
+	GEM_BUG_ON(i915_gem_active_isset(&vma->last_fence));
+
 	list_del(&vma->vm_link);
 	if (!i915_vma_is_ggtt(vma))
 		i915_ppgtt_put(i915_vm_to_ppgtt(vma->vm));
@@ -591,33 +597,11 @@ static void i915_vma_destroy(struct i915_vma *vma)
 	kmem_cache_free(to_i915(vma->obj->base.dev)->vmas, vma);
 }
 
-void i915_vma_unlink_ctx(struct i915_vma *vma)
-{
-	struct i915_gem_context *ctx = vma->ctx;
-
-	if (ctx->vma_lut.ht_size & I915_CTX_RESIZE_IN_PROGRESS) {
-		cancel_work_sync(&ctx->vma_lut.resize);
-		ctx->vma_lut.ht_size &= ~I915_CTX_RESIZE_IN_PROGRESS;
-	}
-
-	__hlist_del(&vma->ctx_node);
-	ctx->vma_lut.ht_count--;
-
-	if (i915_vma_is_ggtt(vma))
-		vma->obj->vma_hashed = NULL;
-	vma->ctx = NULL;
-
-	i915_vma_put(vma);
-}
-
 void i915_vma_close(struct i915_vma *vma)
 {
 	GEM_BUG_ON(i915_vma_is_closed(vma));
 	vma->flags |= I915_VMA_CLOSED;
 
-	if (vma->ctx)
-		i915_vma_unlink_ctx(vma);
-
 	list_del(&vma->obj_link);
 	rb_erase(&vma->obj_node, &vma->obj->vma_tree);
 
@@ -680,9 +664,8 @@ int i915_vma_unbind(struct i915_vma *vma)
 		__i915_vma_unpin(vma);
 		if (ret)
 			return ret;
-
-		GEM_BUG_ON(i915_vma_is_active(vma));
 	}
+	GEM_BUG_ON(i915_vma_is_active(vma));
 
 	if (i915_vma_is_pinned(vma))
 		return -EBUSY;
diff --git a/drivers/gpu/drm/i915/i915_vma.h b/drivers/gpu/drm/i915/i915_vma.h
index 20cf272..1fd61e8 100644
--- a/drivers/gpu/drm/i915/i915_vma.h
+++ b/drivers/gpu/drm/i915/i915_vma.h
@@ -112,13 +112,9 @@ struct i915_vma {
 	/**
 	 * Used for performing relocations during execbuffer insertion.
 	 */
-	struct drm_i915_gem_exec_object2 *exec_entry;
+	unsigned int *exec_flags;
 	struct hlist_node exec_node;
 	u32 exec_handle;
-
-	struct i915_gem_context *ctx;
-	struct hlist_node ctx_node;
-	u32 ctx_handle;
 };
 
 struct i915_vma *
diff --git a/drivers/gpu/drm/i915/intel_atomic_plane.c b/drivers/gpu/drm/i915/intel_atomic_plane.c
index 4325cb0..ee76fab 100644
--- a/drivers/gpu/drm/i915/intel_atomic_plane.c
+++ b/drivers/gpu/drm/i915/intel_atomic_plane.c
@@ -114,6 +114,8 @@ int intel_plane_atomic_check_with_state(struct intel_crtc_state *crtc_state,
 	struct drm_i915_private *dev_priv = to_i915(plane->dev);
 	struct drm_plane_state *state = &intel_state->base;
 	struct intel_plane *intel_plane = to_intel_plane(plane);
+	const struct drm_display_mode *adjusted_mode =
+		&crtc_state->base.adjusted_mode;
 	int ret;
 
 	/*
@@ -173,6 +175,19 @@ int intel_plane_atomic_check_with_state(struct intel_crtc_state *crtc_state,
 	if (ret)
 		return ret;
 
+	/*
+	 * Y-tiling is not supported in IF-ID Interlace mode in
+	 * GEN9 and above.
+	 */
+	if (state->fb && INTEL_GEN(dev_priv) >= 9 && crtc_state->base.enable &&
+	    adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE) {
+		if (state->fb->modifier == I915_FORMAT_MOD_Y_TILED ||
+		    state->fb->modifier == I915_FORMAT_MOD_Yf_TILED) {
+			DRM_DEBUG_KMS("Y/Yf tiling not supported in IF-ID mode\n");
+			return -EINVAL;
+		}
+	}
+
 	/* FIXME pre-g4x don't work like this */
 	if (intel_state->base.visible)
 		crtc_state->active_planes |= BIT(intel_plane->id);
diff --git a/drivers/gpu/drm/i915/intel_bios.c b/drivers/gpu/drm/i915/intel_bios.c
index 7ea7fd1..183e87e 100644
--- a/drivers/gpu/drm/i915/intel_bios.c
+++ b/drivers/gpu/drm/i915/intel_bios.c
@@ -1190,6 +1190,15 @@ static void parse_ddi_port(struct drm_i915_private *dev_priv, enum port port,
 	if (is_dvi) {
 		info->alternate_ddc_pin = ddc_pin;
 
+		/*
+		 * All VBTs that we have seen so far for B stepping have this
+		 * information wrong for Port D, so ignore it for now.
+		 */
+		if (IS_CNL_REVID(dev_priv, CNL_REVID_B0, CNL_REVID_B0) &&
+		    port == PORT_D) {
+			info->alternate_ddc_pin = 0;
+		}
+
 		sanitize_ddc_pin(dev_priv, port);
 	}
 
diff --git a/drivers/gpu/drm/i915/intel_color.c b/drivers/gpu/drm/i915/intel_color.c
index 17c4ae7..8e4e8296 100644
--- a/drivers/gpu/drm/i915/intel_color.c
+++ b/drivers/gpu/drm/i915/intel_color.c
@@ -41,6 +41,22 @@
 
 #define LEGACY_LUT_LENGTH		(sizeof(struct drm_color_lut) * 256)
 
+/* Post offset values for RGB->YCBCR conversion */
+#define POSTOFF_RGB_TO_YUV_HI 0x800
+#define POSTOFF_RGB_TO_YUV_ME 0x100
+#define POSTOFF_RGB_TO_YUV_LO 0x800
+
+/*
+ * These values are direct register values specified in the Bspec,
+ * for RGB->YUV conversion matrix (colorspace BT709)
+ */
+#define CSC_RGB_TO_YUV_RU_GU 0x2ba809d8
+#define CSC_RGB_TO_YUV_BU 0x37e80000
+#define CSC_RGB_TO_YUV_RY_GY 0x1e089cc0
+#define CSC_RGB_TO_YUV_BY 0xb5280000
+#define CSC_RGB_TO_YUV_RV_GV 0xbce89ad8
+#define CSC_RGB_TO_YUV_BV 0x1e080000
+
 /*
  * Extract the CSC coefficient from a CTM coefficient (in U32.32 fixed point
  * format). This macro takes the coefficient we want transformed and the
@@ -91,6 +107,30 @@ static void ctm_mult_by_limited(uint64_t *result, int64_t *input)
 	}
 }
 
+void i9xx_load_ycbcr_conversion_matrix(struct intel_crtc *intel_crtc)
+{
+	int pipe = intel_crtc->pipe;
+	struct drm_i915_private *dev_priv = to_i915(intel_crtc->base.dev);
+
+	I915_WRITE(PIPE_CSC_PREOFF_HI(pipe), 0);
+	I915_WRITE(PIPE_CSC_PREOFF_ME(pipe), 0);
+	I915_WRITE(PIPE_CSC_PREOFF_LO(pipe), 0);
+
+	I915_WRITE(PIPE_CSC_COEFF_RU_GU(pipe), CSC_RGB_TO_YUV_RU_GU);
+	I915_WRITE(PIPE_CSC_COEFF_BU(pipe), CSC_RGB_TO_YUV_BU);
+
+	I915_WRITE(PIPE_CSC_COEFF_RY_GY(pipe), CSC_RGB_TO_YUV_RY_GY);
+	I915_WRITE(PIPE_CSC_COEFF_BY(pipe), CSC_RGB_TO_YUV_BY);
+
+	I915_WRITE(PIPE_CSC_COEFF_RV_GV(pipe), CSC_RGB_TO_YUV_RV_GV);
+	I915_WRITE(PIPE_CSC_COEFF_BV(pipe), CSC_RGB_TO_YUV_BV);
+
+	I915_WRITE(PIPE_CSC_POSTOFF_HI(pipe), POSTOFF_RGB_TO_YUV_HI);
+	I915_WRITE(PIPE_CSC_POSTOFF_ME(pipe), POSTOFF_RGB_TO_YUV_ME);
+	I915_WRITE(PIPE_CSC_POSTOFF_LO(pipe), POSTOFF_RGB_TO_YUV_LO);
+	I915_WRITE(PIPE_CSC_MODE(pipe), 0);
+}
+
 /* Set up the pipe CSC unit. */
 static void i9xx_load_csc_matrix(struct drm_crtc_state *crtc_state)
 {
@@ -101,7 +141,10 @@ static void i9xx_load_csc_matrix(struct drm_crtc_state *crtc_state)
 	uint16_t coeffs[9] = { 0, };
 	struct intel_crtc_state *intel_crtc_state = to_intel_crtc_state(crtc_state);
 
-	if (crtc_state->ctm) {
+	if (intel_crtc_state->ycbcr420) {
+		i9xx_load_ycbcr_conversion_matrix(intel_crtc);
+		return;
+	} else if (crtc_state->ctm) {
 		struct drm_color_ctm *ctm =
 			(struct drm_color_ctm *)crtc_state->ctm->data;
 		uint64_t input[9] = { 0, };
@@ -616,7 +659,7 @@ void intel_color_init(struct drm_crtc *crtc)
 		   IS_BROXTON(dev_priv)) {
 		dev_priv->display.load_csc_matrix = i9xx_load_csc_matrix;
 		dev_priv->display.load_luts = broadwell_load_luts;
-	} else if (IS_GEMINILAKE(dev_priv)) {
+	} else if (IS_GEMINILAKE(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		dev_priv->display.load_csc_matrix = i9xx_load_csc_matrix;
 		dev_priv->display.load_luts = glk_load_luts;
 	} else {
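
Note on the intel_color.c change above: the CSC_RGB_TO_YUV_* constants are raw register encodings taken from the Bspec, so the diff does not show the arithmetic they stand for. As a rough illustration of what an RGB->YCbCr conversion in the BT.709 colorspace computes, here is a floating-point sketch. It assumes 8-bit limited-range output (Y in 16..235, Cb/Cr in 16..240) and the standard BT.709 luma weights. The struct and function names are hypothetical, and nothing here is driver code or a decoding of the register values.

```c
#include <assert.h>

/* Hypothetical 8-bit YCbCr triple; not a driver type. */
struct sketch_ycbcr {
	int y, cb, cr;
};

/* Round a non-negative value to the nearest integer without libm. */
static int sketch_round(double v)
{
	return (int)(v + 0.5);
}

/*
 * Convert normalized RGB (each component in 0.0..1.0) to 8-bit
 * limited-range YCbCr using the BT.709 luma weights. Limited range is
 * an assumption; the diff does not state which range the hardware
 * matrix targets.
 */
static struct sketch_ycbcr sketch_rgb_to_ycbcr_bt709(double r, double g, double b)
{
	const double kr = 0.2126, kg = 0.7152, kb = 0.0722;
	double luma, pb, pr;
	struct sketch_ycbcr out;

	assert(r >= 0.0 && r <= 1.0);
	assert(g >= 0.0 && g <= 1.0);
	assert(b >= 0.0 && b <= 1.0);

	luma = kr * r + kg * g + kb * b;
	pb = (b - luma) / (2.0 * (1.0 - kb));	/* scaled to -0.5..0.5 */
	pr = (r - luma) / (2.0 * (1.0 - kr));	/* scaled to -0.5..0.5 */

	out.y = sketch_round(16.0 + 219.0 * luma);
	out.cb = sketch_round(128.0 + 224.0 * pb);
	out.cr = sketch_round(128.0 + 224.0 * pr);
	return out;
}
```

With these assumptions, black and white both land on neutral chroma (Cb = Cr = 128), which is the role the post-offset registers play in the hardware path.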
diff --git a/drivers/gpu/drm/i915/intel_crt.c b/drivers/gpu/drm/i915/intel_crt.c
index 84a1f5e..70e0ff4 100644
--- a/drivers/gpu/drm/i915/intel_crt.c
+++ b/drivers/gpu/drm/i915/intel_crt.c
@@ -802,12 +802,10 @@ void intel_crt_reset(struct drm_encoder *encoder)
  */
 
 static const struct drm_connector_funcs intel_crt_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.late_register = intel_connector_register,
 	.early_unregister = intel_connector_unregister,
 	.destroy = intel_crt_destroy,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
 };
diff --git a/drivers/gpu/drm/i915/intel_ddi.c b/drivers/gpu/drm/i915/intel_ddi.c
index d3b3252..4b4fd1f 100644
--- a/drivers/gpu/drm/i915/intel_ddi.c
+++ b/drivers/gpu/drm/i915/intel_ddi.c
@@ -1103,6 +1103,62 @@ static int skl_calc_wrpll_link(struct drm_i915_private *dev_priv,
 	return dco_freq / (p0 * p1 * p2 * 5);
 }
 
+static int cnl_calc_wrpll_link(struct drm_i915_private *dev_priv,
+			       uint32_t pll_id)
+{
+	uint32_t cfgcr0, cfgcr1;
+	uint32_t p0, p1, p2, dco_freq, ref_clock;
+
+	cfgcr0 = I915_READ(CNL_DPLL_CFGCR0(pll_id));
+	cfgcr1 = I915_READ(CNL_DPLL_CFGCR1(pll_id));
+
+	p0 = cfgcr1 & DPLL_CFGCR1_PDIV_MASK;
+	p2 = cfgcr1 & DPLL_CFGCR1_KDIV_MASK;
+
+	if (cfgcr1 & DPLL_CFGCR1_QDIV_MODE(1))
+		p1 = (cfgcr1 & DPLL_CFGCR1_QDIV_RATIO_MASK) >>
+			DPLL_CFGCR1_QDIV_RATIO_SHIFT;
+	else
+		p1 = 1;
+
+
+	switch (p0) {
+	case DPLL_CFGCR1_PDIV_2:
+		p0 = 2;
+		break;
+	case DPLL_CFGCR1_PDIV_3:
+		p0 = 3;
+		break;
+	case DPLL_CFGCR1_PDIV_5:
+		p0 = 5;
+		break;
+	case DPLL_CFGCR1_PDIV_7:
+		p0 = 7;
+		break;
+	}
+
+	switch (p2) {
+	case DPLL_CFGCR1_KDIV_1:
+		p2 = 1;
+		break;
+	case DPLL_CFGCR1_KDIV_2:
+		p2 = 2;
+		break;
+	case DPLL_CFGCR1_KDIV_4:
+		p2 = 4;
+		break;
+	}
+
+	ref_clock = dev_priv->cdclk.hw.ref;
+
+	dco_freq = (cfgcr0 & DPLL_CFGCR0_DCO_INTEGER_MASK) * ref_clock;
+
+	dco_freq += (((cfgcr0 & DPLL_CFGCR0_DCO_FRACTION_MASK) >>
+		      DPLL_CFGCR0_DCO_FRAC_SHIFT) * ref_clock) / 0x8000;
+
+	return dco_freq / (p0 * p1 * p2 * 5);
+}
+
 static void ddi_dotclock_get(struct intel_crtc_state *pipe_config)
 {
 	int dotclock;
@@ -1118,12 +1174,68 @@ static void ddi_dotclock_get(struct intel_crtc_state *pipe_config)
 	else
 		dotclock = pipe_config->port_clock;
 
+	if (pipe_config->ycbcr420)
+		dotclock *= 2;
+
 	if (pipe_config->pixel_multiplier)
 		dotclock /= pipe_config->pixel_multiplier;
 
 	pipe_config->base.adjusted_mode.crtc_clock = dotclock;
 }
 
+static void cnl_ddi_clock_get(struct intel_encoder *encoder,
+			      struct intel_crtc_state *pipe_config)
+{
+	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
+	int link_clock = 0;
+	uint32_t cfgcr0, pll_id;
+
+	pll_id = intel_get_shared_dpll_id(dev_priv, pipe_config->shared_dpll);
+
+	cfgcr0 = I915_READ(CNL_DPLL_CFGCR0(pll_id));
+
+	if (cfgcr0 & DPLL_CFGCR0_HDMI_MODE) {
+		link_clock = cnl_calc_wrpll_link(dev_priv, pll_id);
+	} else {
+		link_clock = cfgcr0 & DPLL_CFGCR0_LINK_RATE_MASK;
+
+		switch (link_clock) {
+		case DPLL_CFGCR0_LINK_RATE_810:
+			link_clock = 81000;
+			break;
+		case DPLL_CFGCR0_LINK_RATE_1080:
+			link_clock = 108000;
+			break;
+		case DPLL_CFGCR0_LINK_RATE_1350:
+			link_clock = 135000;
+			break;
+		case DPLL_CFGCR0_LINK_RATE_1620:
+			link_clock = 162000;
+			break;
+		case DPLL_CFGCR0_LINK_RATE_2160:
+			link_clock = 216000;
+			break;
+		case DPLL_CFGCR0_LINK_RATE_2700:
+			link_clock = 270000;
+			break;
+		case DPLL_CFGCR0_LINK_RATE_3240:
+			link_clock = 324000;
+			break;
+		case DPLL_CFGCR0_LINK_RATE_4050:
+			link_clock = 405000;
+			break;
+		default:
+			WARN(1, "Unsupported link rate\n");
+			break;
+		}
+		link_clock *= 2;
+	}
+
+	pipe_config->port_clock = link_clock;
+
+	ddi_dotclock_get(pipe_config);
+}
+
 static void skl_ddi_clock_get(struct intel_encoder *encoder,
 				struct intel_crtc_state *pipe_config)
 {
@@ -1267,6 +1379,8 @@ void intel_ddi_clock_get(struct intel_encoder *encoder,
 		skl_ddi_clock_get(encoder, pipe_config);
 	else if (IS_GEN9_LP(dev_priv))
 		bxt_ddi_clock_get(encoder, pipe_config);
+	else if (IS_CANNONLAKE(dev_priv))
+		cnl_ddi_clock_get(encoder, pipe_config);
 }
 
 void intel_ddi_set_pipe_settings(const struct intel_crtc_state *crtc_state)
@@ -1868,9 +1982,12 @@ static void cnl_ddi_vswing_sequence(struct intel_encoder *encoder, u32 level)
 	if ((intel_dp) && (type == INTEL_OUTPUT_EDP || type == INTEL_OUTPUT_DP)) {
 		width = intel_dp->lane_count;
 		rate = intel_dp->link_rate;
-	} else {
+	} else if (type == INTEL_OUTPUT_HDMI) {
 		width = 4;
 		/* Rate is always < than 6GHz for HDMI */
+	} else {
+		MISSING_CASE(type);
+		return;
 	}
 
 	/*
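
Note on cnl_calc_wrpll_link() in the intel_ddi.c change above: once the register fields are decoded, the remaining arithmetic can be followed on its own. The sketch below repeats that arithmetic with the DCO fields and divider values passed in as plain integers. The function name and parameters are hypothetical, the register decoding is deliberately left out, and the result is in the same unit as the reference clock (kHz is assumed here).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-alone version of the link clock arithmetic:
 * DCO frequency = integer part * ref + fraction * ref / 0x8000,
 * then divided by the three dividers and a fixed factor of 5.
 */
static uint32_t sketch_wrpll_link(uint32_t dco_integer, uint32_t dco_fraction,
				  uint32_t ref_clock,
				  uint32_t p0, uint32_t p1, uint32_t p2)
{
	uint32_t dco_freq;

	/* Guard against a zero divider; this check is added for the sketch. */
	assert(p0 != 0 && p1 != 0 && p2 != 0);

	dco_freq = dco_integer * ref_clock;
	dco_freq += (dco_fraction * ref_clock) / 0x8000;

	return dco_freq / (p0 * p1 * p2 * 5);
}
```

For example, with made-up inputs chosen only to keep the arithmetic easy to check (integer 337, fraction 0x4000, a 24000 kHz reference, dividers 3, 1 and 2), the DCO comes out at 8100000 kHz and the function returns 270000.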
diff --git a/drivers/gpu/drm/i915/intel_device_info.c b/drivers/gpu/drm/i915/intel_device_info.c
index 77d3214..5f91ddc 100644
--- a/drivers/gpu/drm/i915/intel_device_info.c
+++ b/drivers/gpu/drm/i915/intel_device_info.c
@@ -363,7 +363,7 @@ void intel_device_info_runtime_init(struct drm_i915_private *dev_priv)
 		 */
 		if (fuse_strap & ILK_INTERNAL_DISPLAY_DISABLE ||
 		    sfuse_strap & SFUSE_STRAP_DISPLAY_DISABLED ||
-		    (dev_priv->pch_type == PCH_CPT &&
+		    (HAS_PCH_CPT(dev_priv) &&
 		     !(sfuse_strap & SFUSE_STRAP_FUSE_LOCK))) {
 			DRM_INFO("Display fused off, disabling\n");
 			info->num_pipes = 0;
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index cc484b5..0e93ec2 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -49,11 +49,6 @@
 #include <linux/dma_remapping.h>
 #include <linux/reservation.h>
 
-static bool is_mmio_work(struct intel_flip_work *work)
-{
-	return work->mmio_work.func;
-}
-
 /* Primary plane formats for gen <= 3 */
 static const uint32_t i8xx_primary_formats[] = {
 	DRM_FORMAT_C8,
@@ -72,6 +67,12 @@ static const uint32_t i965_primary_formats[] = {
 	DRM_FORMAT_XBGR2101010,
 };
 
+static const uint64_t i9xx_format_modifiers[] = {
+	I915_FORMAT_MOD_X_TILED,
+	DRM_FORMAT_MOD_LINEAR,
+	DRM_FORMAT_MOD_INVALID
+};
+
 static const uint32_t skl_primary_formats[] = {
 	DRM_FORMAT_C8,
 	DRM_FORMAT_RGB565,
@@ -87,11 +88,34 @@ static const uint32_t skl_primary_formats[] = {
 	DRM_FORMAT_VYUY,
 };
 
+static const uint64_t skl_format_modifiers_noccs[] = {
+	I915_FORMAT_MOD_Yf_TILED,
+	I915_FORMAT_MOD_Y_TILED,
+	I915_FORMAT_MOD_X_TILED,
+	DRM_FORMAT_MOD_LINEAR,
+	DRM_FORMAT_MOD_INVALID
+};
+
+static const uint64_t skl_format_modifiers_ccs[] = {
+	I915_FORMAT_MOD_Yf_TILED_CCS,
+	I915_FORMAT_MOD_Y_TILED_CCS,
+	I915_FORMAT_MOD_Yf_TILED,
+	I915_FORMAT_MOD_Y_TILED,
+	I915_FORMAT_MOD_X_TILED,
+	DRM_FORMAT_MOD_LINEAR,
+	DRM_FORMAT_MOD_INVALID
+};
+
 /* Cursor formats */
 static const uint32_t intel_cursor_formats[] = {
 	DRM_FORMAT_ARGB8888,
 };
 
+static const uint64_t cursor_format_modifiers[] = {
+	DRM_FORMAT_MOD_LINEAR,
+	DRM_FORMAT_MOD_INVALID
+};
+
 static void i9xx_crtc_clock_get(struct intel_crtc *crtc,
 				struct intel_crtc_state *pipe_config);
 static void ironlake_pch_clock_get(struct intel_crtc *crtc,
@@ -1777,7 +1801,7 @@ static void lpt_enable_pch_transcoder(struct drm_i915_private *dev_priv,
 
 	/* FDI must be feeding us bits for PCH ports */
 	assert_fdi_tx_enabled(dev_priv, (enum pipe) cpu_transcoder);
-	assert_fdi_rx_enabled(dev_priv, TRANSCODER_A);
+	assert_fdi_rx_enabled(dev_priv, PIPE_A);
 
 	/* Workaround: set timing override bit. */
 	val = I915_READ(TRANS_CHICKEN2(PIPE_A));
@@ -1853,16 +1877,16 @@ void lpt_disable_pch_transcoder(struct drm_i915_private *dev_priv)
 	I915_WRITE(TRANS_CHICKEN2(PIPE_A), val);
 }
 
-enum transcoder intel_crtc_pch_transcoder(struct intel_crtc *crtc)
+enum pipe intel_crtc_pch_transcoder(struct intel_crtc *crtc)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 
 	WARN_ON(!crtc->config->has_pch_encoder);
 
 	if (HAS_PCH_LPT(dev_priv))
-		return TRANSCODER_A;
+		return PIPE_A;
 	else
-		return (enum transcoder) crtc->pipe;
+		return crtc->pipe;
 }
 
 /**
@@ -1901,7 +1925,7 @@ static void intel_enable_pipe(struct intel_crtc *crtc)
 		if (crtc->config->has_pch_encoder) {
 			/* if driving the PCH, we need FDI enabled */
 			assert_fdi_rx_pll_enabled(dev_priv,
-						  (enum pipe) intel_crtc_pch_transcoder(crtc));
+						  intel_crtc_pch_transcoder(crtc));
 			assert_fdi_tx_pll_enabled(dev_priv,
 						  (enum pipe) cpu_transcoder);
 		}
@@ -1999,11 +2023,19 @@ intel_tile_width_bytes(const struct drm_framebuffer *fb, int plane)
 			return 128;
 		else
 			return 512;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+		if (plane == 1)
+			return 128;
+		/* fall through */
 	case I915_FORMAT_MOD_Y_TILED:
 		if (IS_GEN2(dev_priv) || HAS_128_BYTE_Y_TILING(dev_priv))
 			return 128;
 		else
 			return 512;
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		if (plane == 1)
+			return 128;
+		/* fall through */
 	case I915_FORMAT_MOD_Yf_TILED:
 		switch (cpp) {
 		case 1:
@@ -2110,7 +2142,7 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
 	struct drm_i915_private *dev_priv = to_i915(fb->dev);
 
 	/* AUX_DIST needs only 4K alignment */
-	if (fb->format->format == DRM_FORMAT_NV12 && plane == 1)
+	if (plane == 1)
 		return 4096;
 
 	switch (fb->modifier) {
@@ -2120,6 +2152,8 @@ static unsigned int intel_surf_alignment(const struct drm_framebuffer *fb,
 		if (INTEL_GEN(dev_priv) >= 9)
 			return 256 * 1024;
 		return 0;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		return 1 * 1024 * 1024;
@@ -2162,6 +2196,8 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
 	 */
 	intel_runtime_pm_get(dev_priv);
 
+	atomic_inc(&dev_priv->gpu_error.pending_fb_pin);
+
 	vma = i915_gem_object_pin_to_display_plane(obj, alignment, &view);
 	if (IS_ERR(vma))
 		goto err;
@@ -2189,6 +2225,8 @@ intel_pin_and_fence_fb_obj(struct drm_framebuffer *fb, unsigned int rotation)
 
 	i915_vma_get(vma);
 err:
+	atomic_dec(&dev_priv->gpu_error.pending_fb_pin);
+
 	intel_runtime_pm_put(dev_priv);
 	return vma;
 }
@@ -2427,12 +2465,48 @@ static unsigned int intel_fb_modifier_to_tiling(uint64_t fb_modifier)
 	case I915_FORMAT_MOD_X_TILED:
 		return I915_TILING_X;
 	case I915_FORMAT_MOD_Y_TILED:
+	case I915_FORMAT_MOD_Y_TILED_CCS:
 		return I915_TILING_Y;
 	default:
 		return I915_TILING_NONE;
 	}
 }
 
+static const struct drm_format_info ccs_formats[] = {
+	{ .format = DRM_FORMAT_XRGB8888, .depth = 24, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 8, .vsub = 16, },
+	{ .format = DRM_FORMAT_XBGR8888, .depth = 24, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 8, .vsub = 16, },
+	{ .format = DRM_FORMAT_ARGB8888, .depth = 32, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 8, .vsub = 16, },
+	{ .format = DRM_FORMAT_ABGR8888, .depth = 32, .num_planes = 2, .cpp = { 4, 1, }, .hsub = 8, .vsub = 16, },
+};
+
+static const struct drm_format_info *
+lookup_format_info(const struct drm_format_info formats[],
+		   int num_formats, u32 format)
+{
+	int i;
+
+	for (i = 0; i < num_formats; i++) {
+		if (formats[i].format == format)
+			return &formats[i];
+	}
+
+	return NULL;
+}
+
+static const struct drm_format_info *
+intel_get_format_info(const struct drm_mode_fb_cmd2 *cmd)
+{
+	switch (cmd->modifier[0]) {
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		return lookup_format_info(ccs_formats,
+					  ARRAY_SIZE(ccs_formats),
+					  cmd->pixel_format);
+	default:
+		return NULL;
+	}
+}
+
 static int
 intel_fill_fb_info(struct drm_i915_private *dev_priv,
 		   struct drm_framebuffer *fb)
@@ -2456,6 +2530,36 @@ intel_fill_fb_info(struct drm_i915_private *dev_priv,
 
 		intel_fb_offset_to_xy(&x, &y, fb, i);
 
+		if ((fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) && i == 1) {
+			int hsub = fb->format->hsub;
+			int vsub = fb->format->vsub;
+			int tile_width, tile_height;
+			int main_x, main_y;
+			int ccs_x, ccs_y;
+
+			intel_tile_dims(fb, i, &tile_width, &tile_height);
+
+			ccs_x = (x * hsub) % (tile_width * hsub);
+			ccs_y = (y * vsub) % (tile_height * vsub);
+			main_x = intel_fb->normal[0].x % (tile_width * hsub);
+			main_y = intel_fb->normal[0].y % (tile_height * vsub);
+
+			/*
+			 * CCS doesn't have its own x/y offset register, so the intra CCS tile
+			 * x/y offsets must match between CCS and the main surface.
+			 */
+			if (main_x != ccs_x || main_y != ccs_y) {
+				DRM_DEBUG_KMS("Bad CCS x/y (main %d,%d ccs %d,%d) full (main %d,%d ccs %d,%d)\n",
+					      main_x, main_y,
+					      ccs_x, ccs_y,
+					      intel_fb->normal[0].x,
+					      intel_fb->normal[0].y,
+					      x, y);
+				return -EINVAL;
+			}
+		}
+
 		/*
 		 * The fence (if used) is aligned to the start of the object
 		 * so having the framebuffer wrap around across the edge of the
@@ -2664,20 +2768,6 @@ intel_alloc_initial_plane_obj(struct intel_crtc *crtc,
 	return false;
 }
 
-/* Update plane->state->fb to match plane->fb after driver-internal updates */
-static void
-update_state_fb(struct drm_plane *plane)
-{
-	if (plane->fb == plane->state->fb)
-		return;
-
-	if (plane->state->fb)
-		drm_framebuffer_unreference(plane->state->fb);
-	plane->state->fb = plane->fb;
-	if (plane->state->fb)
-		drm_framebuffer_reference(plane->state->fb);
-}
-
 static void
 intel_set_plane_visible(struct intel_crtc_state *crtc_state,
 			struct intel_plane_state *plane_state,
@@ -2830,6 +2920,9 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
 			break;
 		}
 		break;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		/* FIXME AUX plane? */
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		switch (cpp) {
@@ -2852,6 +2945,44 @@ static int skl_max_plane_width(const struct drm_framebuffer *fb, int plane,
 	return 2048;
 }
 
+static bool skl_check_main_ccs_coordinates(struct intel_plane_state *plane_state,
+					   int main_x, int main_y, u32 main_offset)
+{
+	const struct drm_framebuffer *fb = plane_state->base.fb;
+	int hsub = fb->format->hsub;
+	int vsub = fb->format->vsub;
+	int aux_x = plane_state->aux.x;
+	int aux_y = plane_state->aux.y;
+	u32 aux_offset = plane_state->aux.offset;
+	u32 alignment = intel_surf_alignment(fb, 1);
+
+	while (aux_offset >= main_offset && aux_y <= main_y) {
+		int x, y;
+
+		if (aux_x == main_x && aux_y == main_y)
+			break;
+
+		if (aux_offset == 0)
+			break;
+
+		x = aux_x / hsub;
+		y = aux_y / vsub;
+		aux_offset = intel_adjust_tile_offset(&x, &y, plane_state, 1,
+						      aux_offset, aux_offset - alignment);
+		aux_x = x * hsub + aux_x % hsub;
+		aux_y = y * vsub + aux_y % vsub;
+	}
+
+	if (aux_x != main_x || aux_y != main_y)
+		return false;
+
+	plane_state->aux.offset = aux_offset;
+	plane_state->aux.x = aux_x;
+	plane_state->aux.y = aux_y;
+
+	return true;
+}
+
 static int skl_check_main_surface(struct intel_plane_state *plane_state)
 {
 	const struct drm_framebuffer *fb = plane_state->base.fb;
@@ -2894,7 +3025,7 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
 
 		while ((x + w) * cpp > fb->pitches[0]) {
 			if (offset == 0) {
-				DRM_DEBUG_KMS("Unable to find suitable display surface offset\n");
+				DRM_DEBUG_KMS("Unable to find suitable display surface offset due to X-tiling\n");
 				return -EINVAL;
 			}
 
@@ -2903,6 +3034,26 @@ static int skl_check_main_surface(struct intel_plane_state *plane_state)
 		}
 	}
 
+	/*
+	 * The CCS AUX surface doesn't have its own x/y offsets, so we must
+	 * make sure they match the main surface x/y offsets.
+	 */
+	if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+	    fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
+		while (!skl_check_main_ccs_coordinates(plane_state, x, y, offset)) {
+			if (offset == 0)
+				break;
+
+			offset = intel_adjust_tile_offset(&x, &y, plane_state, 0,
+							  offset, offset - alignment);
+		}
+
+		if (x != plane_state->aux.x || y != plane_state->aux.y) {
+			DRM_DEBUG_KMS("Unable to find suitable display surface offset due to CCS\n");
+			return -EINVAL;
+		}
+	}
+
 	plane_state->main.offset = offset;
 	plane_state->main.x = x;
 	plane_state->main.y = y;
@@ -2939,6 +3090,49 @@ static int skl_check_nv12_aux_surface(struct intel_plane_state *plane_state)
 	return 0;
 }
 
+static int skl_check_ccs_aux_surface(struct intel_plane_state *plane_state)
+{
+	struct intel_plane *plane = to_intel_plane(plane_state->base.plane);
+	struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
+	const struct drm_framebuffer *fb = plane_state->base.fb;
+	int src_x = plane_state->base.src.x1 >> 16;
+	int src_y = plane_state->base.src.y1 >> 16;
+	int hsub = fb->format->hsub;
+	int vsub = fb->format->vsub;
+	int x = src_x / hsub;
+	int y = src_y / vsub;
+	u32 offset;
+
+	switch (plane->id) {
+	case PLANE_PRIMARY:
+	case PLANE_SPRITE0:
+		break;
+	default:
+		DRM_DEBUG_KMS("RC support only on plane 1 and 2\n");
+		return -EINVAL;
+	}
+
+	if (crtc->pipe == PIPE_C) {
+		DRM_DEBUG_KMS("No RC support on pipe C\n");
+		return -EINVAL;
+	}
+
+	if (plane_state->base.rotation & ~(DRM_MODE_ROTATE_0 | DRM_MODE_ROTATE_180)) {
+		DRM_DEBUG_KMS("RC support only with 0/180 degree rotation %x\n",
+			      plane_state->base.rotation);
+		return -EINVAL;
+	}
+
+	intel_add_fb_offsets(&x, &y, plane_state, 1);
+	offset = intel_compute_tile_offset(&x, &y, plane_state, 1);
+
+	plane_state->aux.offset = offset;
+	plane_state->aux.x = x * hsub + src_x % hsub;
+	plane_state->aux.y = y * vsub + src_y % vsub;
+
+	return 0;
+}
+
 int skl_check_plane_surface(struct intel_plane_state *plane_state)
 {
 	const struct drm_framebuffer *fb = plane_state->base.fb;
@@ -2962,6 +3156,11 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
 		ret = skl_check_nv12_aux_surface(plane_state);
 		if (ret)
 			return ret;
+	} else if (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		   fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS) {
+		ret = skl_check_ccs_aux_surface(plane_state);
+		if (ret)
+			return ret;
 	} else {
 		plane_state->aux.offset = ~0xfff;
 		plane_state->aux.x = 0;
@@ -3268,8 +3467,12 @@ static u32 skl_plane_ctl_tiling(uint64_t fb_modifier)
 		return PLANE_CTL_TILED_X;
 	case I915_FORMAT_MOD_Y_TILED:
 		return PLANE_CTL_TILED_Y;
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+		return PLANE_CTL_TILED_Y | PLANE_CTL_DECOMPRESSION_ENABLE;
 	case I915_FORMAT_MOD_Yf_TILED:
 		return PLANE_CTL_TILED_YF;
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		return PLANE_CTL_TILED_YF | PLANE_CTL_DECOMPRESSION_ENABLE;
 	default:
 		MISSING_CASE(fb_modifier);
 	}
@@ -3311,7 +3514,7 @@ u32 skl_plane_ctl(const struct intel_crtc_state *crtc_state,
 
 	plane_ctl = PLANE_CTL_ENABLE;
 
-	if (!IS_GEMINILAKE(dev_priv)) {
+	if (!IS_GEMINILAKE(dev_priv) && !IS_CANNONLAKE(dev_priv)) {
 		plane_ctl |=
 			PLANE_CTL_PIPE_GAMMA_ENABLE |
 			PLANE_CTL_PIPE_CSC_ENABLE |
@@ -3342,6 +3545,7 @@ static void skylake_update_primary_plane(struct intel_plane *plane,
 	u32 plane_ctl = plane_state->ctl;
 	unsigned int rotation = plane_state->base.rotation;
 	u32 stride = skl_plane_stride(fb, 0, rotation);
+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
 	u32 surf_addr = plane_state->main.offset;
 	int scaler_id = plane_state->scaler_id;
 	int src_x = plane_state->main.x;
@@ -3367,7 +3571,7 @@ static void skylake_update_primary_plane(struct intel_plane *plane,
 
 	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
-	if (IS_GEMINILAKE(dev_priv)) {
+	if (IS_GEMINILAKE(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		I915_WRITE_FW(PLANE_COLOR_CTL(pipe, plane_id),
 			      PLANE_COLOR_PIPE_GAMMA_ENABLE |
 			      PLANE_COLOR_PIPE_CSC_ENABLE |
@@ -3378,6 +3582,10 @@ static void skylake_update_primary_plane(struct intel_plane *plane,
 	I915_WRITE_FW(PLANE_OFFSET(pipe, plane_id), (src_y << 16) | src_x);
 	I915_WRITE_FW(PLANE_STRIDE(pipe, plane_id), stride);
 	I915_WRITE_FW(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
+	I915_WRITE_FW(PLANE_AUX_DIST(pipe, plane_id),
+		      (plane_state->aux.offset - surf_addr) | aux_stride);
+	I915_WRITE_FW(PLANE_AUX_OFFSET(pipe, plane_id),
+		      (plane_state->aux.y << 16) | plane_state->aux.x);
 
 	if (scaler_id >= 0) {
 		uint32_t ps_ctrl = 0;
@@ -3419,14 +3627,6 @@ static void skylake_disable_primary_plane(struct intel_plane *primary,
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
-static void intel_complete_page_flips(struct drm_i915_private *dev_priv)
-{
-	struct intel_crtc *crtc;
-
-	for_each_intel_crtc(&dev_priv->drm, crtc)
-		intel_finish_page_flip_cs(dev_priv, crtc->pipe);
-}
-
 static int
 __intel_display_resume(struct drm_device *dev,
 		       struct drm_atomic_state *state,
@@ -3485,12 +3685,14 @@ void intel_prepare_reset(struct drm_i915_private *dev_priv)
 	    !gpu_reset_clobbers_display(dev_priv))
 		return;
 
-	/* We have a modeset vs reset deadlock, defensively unbreak it.
-	 *
-	 * FIXME: We can do a _lot_ better, this is just a first iteration.
-	 */
-	i915_gem_set_wedged(dev_priv);
-	DRM_DEBUG_DRIVER("Wedging GPU to avoid deadlocks with pending modeset updates\n");
+	/* We have a modeset vs reset deadlock, defensively unbreak it. */
+	set_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags);
+	wake_up_all(&dev_priv->gpu_error.wait_queue);
+
+	if (atomic_read(&dev_priv->gpu_error.pending_fb_pin)) {
+		DRM_DEBUG_KMS("Modeset potentially stuck, unbreaking through wedging\n");
+		i915_gem_set_wedged(dev_priv);
+	}
 
 	/*
 	 * Need mode_config.mutex so that we don't
@@ -3542,13 +3744,6 @@ void intel_finish_reset(struct drm_i915_private *dev_priv)
 	if (!state)
 		goto unlock;
 
-	/*
-	 * Flips in the rings will be nuked by the reset,
-	 * so complete all pending flips so that user space
-	 * will get its events and not get stuck.
-	 */
-	intel_complete_page_flips(dev_priv);
-
 	dev_priv->modeset_restore_state = NULL;
 
 	/* reset doesn't touch the display */
@@ -3585,35 +3780,8 @@ void intel_finish_reset(struct drm_i915_private *dev_priv)
 	drm_modeset_drop_locks(ctx);
 	drm_modeset_acquire_fini(ctx);
 	mutex_unlock(&dev->mode_config.mutex);
-}
 
-static bool abort_flip_on_reset(struct intel_crtc *crtc)
-{
-	struct i915_gpu_error *error = &to_i915(crtc->base.dev)->gpu_error;
-
-	if (i915_reset_backoff(error))
-		return true;
-
-	if (crtc->reset_count != i915_reset_count(error))
-		return true;
-
-	return false;
-}
-
-static bool intel_crtc_has_pending_flip(struct drm_crtc *crtc)
-{
-	struct drm_device *dev = crtc->dev;
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	bool pending;
-
-	if (abort_flip_on_reset(intel_crtc))
-		return false;
-
-	spin_lock_irq(&dev->event_lock);
-	pending = to_intel_crtc(crtc)->flip_work != NULL;
-	spin_unlock_irq(&dev->event_lock);
-
-	return pending;
+	clear_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags);
 }
 
 static void intel_update_pipe_config(struct intel_crtc *crtc,
@@ -4170,21 +4338,22 @@ static void ironlake_fdi_disable(struct drm_crtc *crtc)
 
 bool intel_has_pending_fb_unpin(struct drm_i915_private *dev_priv)
 {
-	struct intel_crtc *crtc;
+	struct drm_crtc *crtc;
+	bool cleanup_done;
 
-	/* Note that we don't need to be called with mode_config.lock here
-	 * as our list of CRTC objects is static for the lifetime of the
-	 * device and so cannot disappear as we iterate. Similarly, we can
-	 * happily treat the predicates as racy, atomic checks as userspace
-	 * cannot claim and pin a new fb without at least acquring the
-	 * struct_mutex and so serialising with us.
-	 */
-	for_each_intel_crtc(&dev_priv->drm, crtc) {
-		if (atomic_read(&crtc->unpin_work_count) == 0)
+	drm_for_each_crtc(crtc, &dev_priv->drm) {
+		struct drm_crtc_commit *commit;
+		spin_lock(&crtc->commit_lock);
+		commit = list_first_entry_or_null(&crtc->commit_list,
+						  struct drm_crtc_commit, commit_entry);
+		cleanup_done = commit ?
+			try_wait_for_completion(&commit->cleanup_done) : true;
+		spin_unlock(&crtc->commit_lock);
+
+		if (cleanup_done)
 			continue;
 
-		if (crtc->flip_work)
-			intel_wait_for_vblank(dev_priv, crtc->pipe);
+		drm_crtc_wait_one_vblank(crtc);
 
 		return true;
 	}
@@ -4192,57 +4361,6 @@ bool intel_has_pending_fb_unpin(struct drm_i915_private *dev_priv)
 	return false;
 }
 
-static void page_flip_completed(struct intel_crtc *intel_crtc)
-{
-	struct drm_i915_private *dev_priv = to_i915(intel_crtc->base.dev);
-	struct intel_flip_work *work = intel_crtc->flip_work;
-
-	intel_crtc->flip_work = NULL;
-
-	if (work->event)
-		drm_crtc_send_vblank_event(&intel_crtc->base, work->event);
-
-	drm_crtc_vblank_put(&intel_crtc->base);
-
-	wake_up_all(&dev_priv->pending_flip_queue);
-	trace_i915_flip_complete(intel_crtc->plane,
-				 work->pending_flip_obj);
-
-	queue_work(dev_priv->wq, &work->unpin_work);
-}
-
-static int intel_crtc_wait_for_pending_flips(struct drm_crtc *crtc)
-{
-	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	long ret;
-
-	WARN_ON(waitqueue_active(&dev_priv->pending_flip_queue));
-
-	ret = wait_event_interruptible_timeout(
-					dev_priv->pending_flip_queue,
-					!intel_crtc_has_pending_flip(crtc),
-					60*HZ);
-
-	if (ret < 0)
-		return ret;
-
-	if (ret == 0) {
-		struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-		struct intel_flip_work *work;
-
-		spin_lock_irq(&dev->event_lock);
-		work = intel_crtc->flip_work;
-		if (work && !is_mmio_work(work)) {
-			WARN_ONCE(1, "Removing stuck page flip\n");
-			page_flip_completed(intel_crtc);
-		}
-		spin_unlock_irq(&dev->event_lock);
-	}
-
-	return 0;
-}
-
 void lpt_disable_iclkip(struct drm_i915_private *dev_priv)
 {
 	u32 temp;
@@ -4562,7 +4680,7 @@ static void lpt_pch_enable(const struct intel_crtc_state *crtc_state)
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 	enum transcoder cpu_transcoder = crtc_state->cpu_transcoder;
 
-	assert_pch_transcoder_disabled(dev_priv, TRANSCODER_A);
+	assert_pch_transcoder_disabled(dev_priv, PIPE_A);
 
 	lpt_program_iclkip(crtc);
 
@@ -4595,6 +4713,9 @@ skl_update_scaler(struct intel_crtc_state *crtc_state, bool force_detach,
 		&crtc_state->scaler_state;
 	struct intel_crtc *intel_crtc =
 		to_intel_crtc(crtc_state->base.crtc);
+	struct drm_i915_private *dev_priv = to_i915(intel_crtc->base.dev);
+	const struct drm_display_mode *adjusted_mode =
+		&crtc_state->base.adjusted_mode;
 	int need_scaling;
 
 	/*
@@ -4604,6 +4725,21 @@ skl_update_scaler(struct intel_crtc_state *crtc_state, bool force_detach,
 	 */
 	need_scaling = src_w != dst_w || src_h != dst_h;
 
+	if (crtc_state->ycbcr420 && scaler_user == SKL_CRTC_INDEX)
+		need_scaling = true;
+
+	/*
+	 * Scaling/fitting not supported in IF-ID mode in GEN9+
+	 * TODO: Interlace fetch mode doesn't support YUV420 planar formats.
+	 * Once NV12 is enabled, handle it here while allocating scaler
+	 * for NV12.
+	 */
+	if (INTEL_GEN(dev_priv) >= 9 && crtc_state->base.enable &&
+	    need_scaling && adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE) {
+		DRM_DEBUG_KMS("Pipe/Plane scaling not supported with IF-ID mode\n");
+		return -EINVAL;
+	}
+
 	/*
 	 * if plane is being disabled or scaler is no more required or force detach
 	 *  - free scaler binded to this plane/crtc
@@ -5315,8 +5451,7 @@ static void haswell_crtc_enable(struct intel_crtc_state *pipe_config,
 		return;
 
 	if (intel_crtc->config->has_pch_encoder)
-		intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
-						      false);
+		intel_set_pch_fifo_underrun_reporting(dev_priv, PIPE_A, false);
 
 	intel_encoders_pre_pll_enable(crtc, pipe_config, old_state);
 
@@ -5401,8 +5536,7 @@ static void haswell_crtc_enable(struct intel_crtc_state *pipe_config,
 		intel_wait_for_vblank(dev_priv, pipe);
 		intel_wait_for_vblank(dev_priv, pipe);
 		intel_set_cpu_fifo_underrun_reporting(dev_priv, pipe, true);
-		intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
-						      true);
+		intel_set_pch_fifo_underrun_reporting(dev_priv, PIPE_A, true);
 	}
 
 	/* If we change the relative order between pipe/planes enabling, we need
@@ -5499,8 +5633,7 @@ static void haswell_crtc_disable(struct intel_crtc_state *old_crtc_state,
 	enum transcoder cpu_transcoder = intel_crtc->config->cpu_transcoder;
 
 	if (intel_crtc->config->has_pch_encoder)
-		intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
-						      false);
+		intel_set_pch_fifo_underrun_reporting(dev_priv, PIPE_A, false);
 
 	intel_encoders_disable(crtc, old_crtc_state, old_state);
 
@@ -5528,8 +5661,7 @@ static void haswell_crtc_disable(struct intel_crtc_state *old_crtc_state,
 	intel_encoders_post_disable(crtc, old_crtc_state, old_state);
 
 	if (old_crtc_state->has_pch_encoder)
-		intel_set_pch_fifo_underrun_reporting(dev_priv, TRANSCODER_A,
-						      true);
+		intel_set_pch_fifo_underrun_reporting(dev_priv, PIPE_A, true);
 }
 
 static void i9xx_pfit_enable(struct intel_crtc *crtc)
@@ -5838,8 +5970,6 @@ static void intel_crtc_disable_noatomic(struct drm_crtc *crtc,
 		return;
 
 	if (crtc->primary->state->visible) {
-		WARN_ON(intel_crtc->flip_work);
-
 		intel_pre_disable_primary_noatomic(crtc);
 
 		intel_crtc_disable_planes(crtc, 1 << drm_plane_index(crtc->primary));
@@ -6248,6 +6378,16 @@ static int intel_crtc_compute_config(struct intel_crtc *crtc,
 		return -EINVAL;
 	}
 
+	if (pipe_config->ycbcr420 && pipe_config->base.ctm) {
+		/*
+		 * There is only one pipe CSC unit per pipe, and we need that
+		 * for output conversion from RGB->YCBCR. So if CTM is already
+		 * applied we can't support YCBCR420 output.
+		 */
+		DRM_DEBUG_KMS("YCBCR420 and CTM together are not possible\n");
+		return -EINVAL;
+	}
+
 	/*
 	 * Pipe horizontal size must be even in:
 	 * - DVO ganged mode
@@ -8041,6 +8181,7 @@ static void haswell_set_pipemisc(struct drm_crtc *crtc)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
+	struct intel_crtc_state *config = intel_crtc->config;
 
 	if (IS_BROADWELL(dev_priv) || INTEL_INFO(dev_priv)->gen >= 9) {
 		u32 val = 0;
@@ -8066,6 +8207,12 @@ static void haswell_set_pipemisc(struct drm_crtc *crtc)
 		if (intel_crtc->config->dither)
 			val |= PIPEMISC_DITHER_ENABLE | PIPEMISC_DITHER_TYPE_SP;
 
+		if (config->ycbcr420) {
+			val |= PIPEMISC_OUTPUT_COLORSPACE_YUV |
+				PIPEMISC_YUV420_ENABLE |
+				PIPEMISC_YUV420_MODE_FULL_BLEND;
+		}
+
 		I915_WRITE(PIPEMISC(intel_crtc->pipe), val);
 	}
 }
@@ -8393,10 +8540,16 @@ skylake_get_initial_plane_config(struct intel_crtc *crtc,
 		fb->modifier = I915_FORMAT_MOD_X_TILED;
 		break;
 	case PLANE_CTL_TILED_Y:
-		fb->modifier = I915_FORMAT_MOD_Y_TILED;
+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
+			fb->modifier = I915_FORMAT_MOD_Y_TILED_CCS;
+		else
+			fb->modifier = I915_FORMAT_MOD_Y_TILED;
 		break;
 	case PLANE_CTL_TILED_YF:
-		fb->modifier = I915_FORMAT_MOD_Yf_TILED;
+		if (val & PLANE_CTL_DECOMPRESSION_ENABLE)
+			fb->modifier = I915_FORMAT_MOD_Yf_TILED_CCS;
+		else
+			fb->modifier = I915_FORMAT_MOD_Yf_TILED;
 		break;
 	default:
 		MISSING_CASE(tiling);
@@ -8630,7 +8783,8 @@ static void assert_can_disable_lcpll(struct drm_i915_private *dev_priv)
 		I915_STATE_WARN(crtc->active, "CRTC for pipe %c enabled\n",
 		     pipe_name(crtc->pipe));
 
-	I915_STATE_WARN(I915_READ(HSW_PWR_WELL_DRIVER), "Power well on\n");
+	I915_STATE_WARN(I915_READ(HSW_PWR_WELL_CTL_DRIVER(HSW_DISP_PW_GLOBAL)),
+			"Display power well on\n");
 	I915_STATE_WARN(I915_READ(SPLL_CTL) & SPLL_PLL_ENABLE, "SPLL enabled\n");
 	I915_STATE_WARN(I915_READ(WRPLL_CTL(0)) & WRPLL_PLL_ENABLE, "WRPLL1 enabled\n");
 	I915_STATE_WARN(I915_READ(WRPLL_CTL(1)) & WRPLL_PLL_ENABLE, "WRPLL2 enabled\n");
@@ -9100,12 +9254,7 @@ static bool haswell_get_pipe_config(struct intel_crtc *crtc,
 	u64 power_domain_mask;
 	bool active;
 
-	if (INTEL_GEN(dev_priv) >= 9) {
-		intel_crtc_init_scalers(crtc, pipe_config);
-
-		pipe_config->scaler_state.scaler_id = -1;
-		pipe_config->scaler_state.scaler_users &= ~(1 << SKL_CRTC_INDEX);
-	}
+	intel_crtc_init_scalers(crtc, pipe_config);
 
 	power_domain = POWER_DOMAIN_PIPE(crtc->pipe);
 	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
@@ -9135,6 +9284,23 @@ static bool haswell_get_pipe_config(struct intel_crtc *crtc,
 	pipe_config->gamma_mode =
 		I915_READ(GAMMA_MODE(crtc->pipe)) & GAMMA_MODE_MODE_MASK;
 
+	if (IS_BROADWELL(dev_priv) || dev_priv->info.gen >= 9) {
+		u32 tmp = I915_READ(PIPEMISC(crtc->pipe));
+		bool clrspace_yuv = tmp & PIPEMISC_OUTPUT_COLORSPACE_YUV;
+
+		if (IS_GEMINILAKE(dev_priv) || dev_priv->info.gen >= 10) {
+			bool blend_mode_420 = tmp &
+					      PIPEMISC_YUV420_MODE_FULL_BLEND;
+
+			pipe_config->ycbcr420 = tmp & PIPEMISC_YUV420_ENABLE;
+			if (pipe_config->ycbcr420 != clrspace_yuv ||
+			    pipe_config->ycbcr420 != blend_mode_420)
+				DRM_DEBUG_KMS("Bad 4:2:0 mode (%08x)\n", tmp);
+		} else if (clrspace_yuv) {
+			DRM_DEBUG_KMS("YCbCr 4:2:0 Unsupported\n");
+		}
+	}
+
 	power_domain = POWER_DOMAIN_PIPE_PANEL_FITTER(crtc->pipe);
 	if (intel_display_power_get_if_enabled(dev_priv, power_domain)) {
 		power_domain_mask |= BIT_ULL(power_domain);
@@ -10112,849 +10278,11 @@ struct drm_display_mode *intel_crtc_mode_get(struct drm_device *dev,
 static void intel_crtc_destroy(struct drm_crtc *crtc)
 {
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct drm_device *dev = crtc->dev;
-	struct intel_flip_work *work;
-
-	spin_lock_irq(&dev->event_lock);
-	work = intel_crtc->flip_work;
-	intel_crtc->flip_work = NULL;
-	spin_unlock_irq(&dev->event_lock);
-
-	if (work) {
-		cancel_work_sync(&work->mmio_work);
-		cancel_work_sync(&work->unpin_work);
-		kfree(work);
-	}
 
 	drm_crtc_cleanup(crtc);
-
 	kfree(intel_crtc);
 }
 
-static void intel_unpin_work_fn(struct work_struct *__work)
-{
-	struct intel_flip_work *work =
-		container_of(__work, struct intel_flip_work, unpin_work);
-	struct intel_crtc *crtc = to_intel_crtc(work->crtc);
-	struct drm_device *dev = crtc->base.dev;
-	struct drm_plane *primary = crtc->base.primary;
-
-	if (is_mmio_work(work))
-		flush_work(&work->mmio_work);
-
-	mutex_lock(&dev->struct_mutex);
-	intel_unpin_fb_vma(work->old_vma);
-	i915_gem_object_put(work->pending_flip_obj);
-	mutex_unlock(&dev->struct_mutex);
-
-	i915_gem_request_put(work->flip_queued_req);
-
-	intel_frontbuffer_flip_complete(to_i915(dev),
-					to_intel_plane(primary)->frontbuffer_bit);
-	intel_fbc_post_update(crtc);
-	drm_framebuffer_unreference(work->old_fb);
-
-	BUG_ON(atomic_read(&crtc->unpin_work_count) == 0);
-	atomic_dec(&crtc->unpin_work_count);
-
-	kfree(work);
-}
-
-/* Is 'a' after or equal to 'b'? */
-static bool g4x_flip_count_after_eq(u32 a, u32 b)
-{
-	return !((a - b) & 0x80000000);
-}
-
-static bool __pageflip_finished_cs(struct intel_crtc *crtc,
-				   struct intel_flip_work *work)
-{
-	struct drm_device *dev = crtc->base.dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-
-	if (abort_flip_on_reset(crtc))
-		return true;
-
-	/*
-	 * The relevant registers doen't exist on pre-ctg.
-	 * As the flip done interrupt doesn't trigger for mmio
-	 * flips on gmch platforms, a flip count check isn't
-	 * really needed there. But since ctg has the registers,
-	 * include it in the check anyway.
-	 */
-	if (INTEL_GEN(dev_priv) < 5 && !IS_G4X(dev_priv))
-		return true;
-
-	/*
-	 * BDW signals flip done immediately if the plane
-	 * is disabled, even if the plane enable is already
-	 * armed to occur at the next vblank :(
-	 */
-
-	/*
-	 * A DSPSURFLIVE check isn't enough in case the mmio and CS flips
-	 * used the same base address. In that case the mmio flip might
-	 * have completed, but the CS hasn't even executed the flip yet.
-	 *
-	 * A flip count check isn't enough as the CS might have updated
-	 * the base address just after start of vblank, but before we
-	 * managed to process the interrupt. This means we'd complete the
-	 * CS flip too soon.
-	 *
-	 * Combining both checks should get us a good enough result. It may
-	 * still happen that the CS flip has been executed, but has not
-	 * yet actually completed. But in case the base address is the same
-	 * anyway, we don't really care.
-	 */
-	return (I915_READ(DSPSURFLIVE(crtc->plane)) & ~0xfff) ==
-		crtc->flip_work->gtt_offset &&
-		g4x_flip_count_after_eq(I915_READ(PIPE_FLIPCOUNT_G4X(crtc->pipe)),
-				    crtc->flip_work->flip_count);
-}
-
-static bool
-__pageflip_finished_mmio(struct intel_crtc *crtc,
-			       struct intel_flip_work *work)
-{
-	/*
-	 * MMIO work completes when vblank is different from
-	 * flip_queued_vblank.
-	 *
-	 * Reset counter value doesn't matter, this is handled by
-	 * i915_wait_request finishing early, so no need to handle
-	 * reset here.
-	 */
-	return intel_crtc_get_vblank_counter(crtc) != work->flip_queued_vblank;
-}
-
-
-static bool pageflip_finished(struct intel_crtc *crtc,
-			      struct intel_flip_work *work)
-{
-	if (!atomic_read(&work->pending))
-		return false;
-
-	smp_rmb();
-
-	if (is_mmio_work(work))
-		return __pageflip_finished_mmio(crtc, work);
-	else
-		return __pageflip_finished_cs(crtc, work);
-}
-
-void intel_finish_page_flip_cs(struct drm_i915_private *dev_priv, int pipe)
-{
-	struct drm_device *dev = &dev_priv->drm;
-	struct intel_crtc *crtc = intel_get_crtc_for_pipe(dev_priv, pipe);
-	struct intel_flip_work *work;
-	unsigned long flags;
-
-	/* Ignore early vblank irqs */
-	if (!crtc)
-		return;
-
-	/*
-	 * This is called both by irq handlers and the reset code (to complete
-	 * lost pageflips) so needs the full irqsave spinlocks.
-	 */
-	spin_lock_irqsave(&dev->event_lock, flags);
-	work = crtc->flip_work;
-
-	if (work != NULL &&
-	    !is_mmio_work(work) &&
-	    pageflip_finished(crtc, work))
-		page_flip_completed(crtc);
-
-	spin_unlock_irqrestore(&dev->event_lock, flags);
-}
-
-void intel_finish_page_flip_mmio(struct drm_i915_private *dev_priv, int pipe)
-{
-	struct drm_device *dev = &dev_priv->drm;
-	struct intel_crtc *crtc = intel_get_crtc_for_pipe(dev_priv, pipe);
-	struct intel_flip_work *work;
-	unsigned long flags;
-
-	/* Ignore early vblank irqs */
-	if (!crtc)
-		return;
-
-	/*
-	 * This is called both by irq handlers and the reset code (to complete
-	 * lost pageflips) so needs the full irqsave spinlocks.
-	 */
-	spin_lock_irqsave(&dev->event_lock, flags);
-	work = crtc->flip_work;
-
-	if (work != NULL &&
-	    is_mmio_work(work) &&
-	    pageflip_finished(crtc, work))
-		page_flip_completed(crtc);
-
-	spin_unlock_irqrestore(&dev->event_lock, flags);
-}
-
-static inline void intel_mark_page_flip_active(struct intel_crtc *crtc,
-					       struct intel_flip_work *work)
-{
-	work->flip_queued_vblank = intel_crtc_get_vblank_counter(crtc);
-
-	/* Ensure that the work item is consistent when activating it ... */
-	smp_mb__before_atomic();
-	atomic_set(&work->pending, 1);
-}
-
-static int intel_gen2_queue_flip(struct drm_device *dev,
-				 struct drm_crtc *crtc,
-				 struct drm_framebuffer *fb,
-				 struct drm_i915_gem_object *obj,
-				 struct drm_i915_gem_request *req,
-				 uint32_t flags)
-{
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	u32 flip_mask, *cs;
-
-	cs = intel_ring_begin(req, 6);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	/* Can't queue multiple flips, so wait for the previous
-	 * one to finish before executing the next.
-	 */
-	if (intel_crtc->plane)
-		flip_mask = MI_WAIT_FOR_PLANE_B_FLIP;
-	else
-		flip_mask = MI_WAIT_FOR_PLANE_A_FLIP;
-	*cs++ = MI_WAIT_FOR_EVENT | flip_mask;
-	*cs++ = MI_NOOP;
-	*cs++ = MI_DISPLAY_FLIP | MI_DISPLAY_FLIP_PLANE(intel_crtc->plane);
-	*cs++ = fb->pitches[0];
-	*cs++ = intel_crtc->flip_work->gtt_offset;
-	*cs++ = 0; /* aux display base address, unused */
-
-	return 0;
-}
-
-static int intel_gen3_queue_flip(struct drm_device *dev,
-				 struct drm_crtc *crtc,
-				 struct drm_framebuffer *fb,
-				 struct drm_i915_gem_object *obj,
-				 struct drm_i915_gem_request *req,
-				 uint32_t flags)
-{
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	u32 flip_mask, *cs;
-
-	cs = intel_ring_begin(req, 6);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	if (intel_crtc->plane)
-		flip_mask = MI_WAIT_FOR_PLANE_B_FLIP;
-	else
-		flip_mask = MI_WAIT_FOR_PLANE_A_FLIP;
-	*cs++ = MI_WAIT_FOR_EVENT | flip_mask;
-	*cs++ = MI_NOOP;
-	*cs++ = MI_DISPLAY_FLIP_I915 | MI_DISPLAY_FLIP_PLANE(intel_crtc->plane);
-	*cs++ = fb->pitches[0];
-	*cs++ = intel_crtc->flip_work->gtt_offset;
-	*cs++ = MI_NOOP;
-
-	return 0;
-}
-
-static int intel_gen4_queue_flip(struct drm_device *dev,
-				 struct drm_crtc *crtc,
-				 struct drm_framebuffer *fb,
-				 struct drm_i915_gem_object *obj,
-				 struct drm_i915_gem_request *req,
-				 uint32_t flags)
-{
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	u32 pf, pipesrc, *cs;
-
-	cs = intel_ring_begin(req, 4);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	/* i965+ uses the linear or tiled offsets from the
-	 * Display Registers (which do not change across a page-flip)
-	 * so we need only reprogram the base address.
-	 */
-	*cs++ = MI_DISPLAY_FLIP | MI_DISPLAY_FLIP_PLANE(intel_crtc->plane);
-	*cs++ = fb->pitches[0];
-	*cs++ = intel_crtc->flip_work->gtt_offset |
-		intel_fb_modifier_to_tiling(fb->modifier);
-
-	/* XXX Enabling the panel-fitter across page-flip is so far
-	 * untested on non-native modes, so ignore it for now.
-	 * pf = I915_READ(pipe == 0 ? PFA_CTL_1 : PFB_CTL_1) & PF_ENABLE;
-	 */
-	pf = 0;
-	pipesrc = I915_READ(PIPESRC(intel_crtc->pipe)) & 0x0fff0fff;
-	*cs++ = pf | pipesrc;
-
-	return 0;
-}
-
-static int intel_gen6_queue_flip(struct drm_device *dev,
-				 struct drm_crtc *crtc,
-				 struct drm_framebuffer *fb,
-				 struct drm_i915_gem_object *obj,
-				 struct drm_i915_gem_request *req,
-				 uint32_t flags)
-{
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	u32 pf, pipesrc, *cs;
-
-	cs = intel_ring_begin(req, 4);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	*cs++ = MI_DISPLAY_FLIP | MI_DISPLAY_FLIP_PLANE(intel_crtc->plane);
-	*cs++ = fb->pitches[0] | intel_fb_modifier_to_tiling(fb->modifier);
-	*cs++ = intel_crtc->flip_work->gtt_offset;
-
-	/* Contrary to the suggestions in the documentation,
-	 * "Enable Panel Fitter" does not seem to be required when page
-	 * flipping with a non-native mode, and worse causes a normal
-	 * modeset to fail.
-	 * pf = I915_READ(PF_CTL(intel_crtc->pipe)) & PF_ENABLE;
-	 */
-	pf = 0;
-	pipesrc = I915_READ(PIPESRC(intel_crtc->pipe)) & 0x0fff0fff;
-	*cs++ = pf | pipesrc;
-
-	return 0;
-}
-
-static int intel_gen7_queue_flip(struct drm_device *dev,
-				 struct drm_crtc *crtc,
-				 struct drm_framebuffer *fb,
-				 struct drm_i915_gem_object *obj,
-				 struct drm_i915_gem_request *req,
-				 uint32_t flags)
-{
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	u32 *cs, plane_bit = 0;
-	int len, ret;
-
-	switch (intel_crtc->plane) {
-	case PLANE_A:
-		plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_A;
-		break;
-	case PLANE_B:
-		plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_B;
-		break;
-	case PLANE_C:
-		plane_bit = MI_DISPLAY_FLIP_IVB_PLANE_C;
-		break;
-	default:
-		WARN_ONCE(1, "unknown plane in flip command\n");
-		return -ENODEV;
-	}
-
-	len = 4;
-	if (req->engine->id == RCS) {
-		len += 6;
-		/*
-		 * On Gen 8, SRM is now taking an extra dword to accommodate
-		 * 48bits addresses, and we need a NOOP for the batch size to
-		 * stay even.
-		 */
-		if (IS_GEN8(dev_priv))
-			len += 2;
-	}
-
-	/*
-	 * BSpec MI_DISPLAY_FLIP for IVB:
-	 * "The full packet must be contained within the same cache line."
-	 *
-	 * Currently the LRI+SRM+MI_DISPLAY_FLIP all fit within the same
-	 * cacheline, if we ever start emitting more commands before
-	 * the MI_DISPLAY_FLIP we may need to first emit everything else,
-	 * then do the cacheline alignment, and finally emit the
-	 * MI_DISPLAY_FLIP.
-	 */
-	ret = intel_ring_cacheline_align(req);
-	if (ret)
-		return ret;
-
-	cs = intel_ring_begin(req, len);
-	if (IS_ERR(cs))
-		return PTR_ERR(cs);
-
-	/* Unmask the flip-done completion message. Note that the bspec says that
-	 * we should do this for both the BCS and RCS, and that we must not unmask
-	 * more than one flip event at any time (or ensure that one flip message
-	 * can be sent by waiting for flip-done prior to queueing new flips).
-	 * Experimentation says that BCS works despite DERRMR masking all
-	 * flip-done completion events and that unmasking all planes at once
-	 * for the RCS also doesn't appear to drop events. Setting the DERRMR
-	 * to zero does lead to lockups within MI_DISPLAY_FLIP.
-	 */
-	if (req->engine->id == RCS) {
-		*cs++ = MI_LOAD_REGISTER_IMM(1);
-		*cs++ = i915_mmio_reg_offset(DERRMR);
-		*cs++ = ~(DERRMR_PIPEA_PRI_FLIP_DONE |
-			  DERRMR_PIPEB_PRI_FLIP_DONE |
-			  DERRMR_PIPEC_PRI_FLIP_DONE);
-		if (IS_GEN8(dev_priv))
-			*cs++ = MI_STORE_REGISTER_MEM_GEN8 |
-				MI_SRM_LRM_GLOBAL_GTT;
-		else
-			*cs++ = MI_STORE_REGISTER_MEM | MI_SRM_LRM_GLOBAL_GTT;
-		*cs++ = i915_mmio_reg_offset(DERRMR);
-		*cs++ = i915_ggtt_offset(req->engine->scratch) + 256;
-		if (IS_GEN8(dev_priv)) {
-			*cs++ = 0;
-			*cs++ = MI_NOOP;
-		}
-	}
-
-	*cs++ = MI_DISPLAY_FLIP_I915 | plane_bit;
-	*cs++ = fb->pitches[0] | intel_fb_modifier_to_tiling(fb->modifier);
-	*cs++ = intel_crtc->flip_work->gtt_offset;
-	*cs++ = MI_NOOP;
-
-	return 0;
-}
-
-static bool use_mmio_flip(struct intel_engine_cs *engine,
-			  struct drm_i915_gem_object *obj)
-{
-	/*
-	 * This is not being used for older platforms, because
-	 * non-availability of flip done interrupt forces us to use
-	 * CS flips. Older platforms derive flip done using some clever
-	 * tricks involving the flip_pending status bits and vblank irqs.
-	 * So using MMIO flips there would disrupt this mechanism.
-	 */
-
-	if (engine == NULL)
-		return true;
-
-	if (INTEL_GEN(engine->i915) < 5)
-		return false;
-
-	if (i915.use_mmio_flip < 0)
-		return false;
-	else if (i915.use_mmio_flip > 0)
-		return true;
-	else if (i915.enable_execlists)
-		return true;
-
-	return engine != i915_gem_object_last_write_engine(obj);
-}
-
-static void skl_do_mmio_flip(struct intel_crtc *intel_crtc,
-			     unsigned int rotation,
-			     struct intel_flip_work *work)
-{
-	struct drm_device *dev = intel_crtc->base.dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct drm_framebuffer *fb = intel_crtc->base.primary->fb;
-	const enum pipe pipe = intel_crtc->pipe;
-	u32 ctl, stride = skl_plane_stride(fb, 0, rotation);
-
-	ctl = I915_READ(PLANE_CTL(pipe, 0));
-	ctl &= ~PLANE_CTL_TILED_MASK;
-	switch (fb->modifier) {
-	case DRM_FORMAT_MOD_LINEAR:
-		break;
-	case I915_FORMAT_MOD_X_TILED:
-		ctl |= PLANE_CTL_TILED_X;
-		break;
-	case I915_FORMAT_MOD_Y_TILED:
-		ctl |= PLANE_CTL_TILED_Y;
-		break;
-	case I915_FORMAT_MOD_Yf_TILED:
-		ctl |= PLANE_CTL_TILED_YF;
-		break;
-	default:
-		MISSING_CASE(fb->modifier);
-	}
-
-	/*
-	 * Both PLANE_CTL and PLANE_STRIDE are not updated on vblank but on
-	 * PLANE_SURF updates, the update is then guaranteed to be atomic.
-	 */
-	I915_WRITE(PLANE_CTL(pipe, 0), ctl);
-	I915_WRITE(PLANE_STRIDE(pipe, 0), stride);
-
-	I915_WRITE(PLANE_SURF(pipe, 0), work->gtt_offset);
-	POSTING_READ(PLANE_SURF(pipe, 0));
-}
-
-static void ilk_do_mmio_flip(struct intel_crtc *intel_crtc,
-			     struct intel_flip_work *work)
-{
-	struct drm_device *dev = intel_crtc->base.dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct drm_framebuffer *fb = intel_crtc->base.primary->fb;
-	i915_reg_t reg = DSPCNTR(intel_crtc->plane);
-	u32 dspcntr;
-
-	dspcntr = I915_READ(reg);
-
-	if (fb->modifier == I915_FORMAT_MOD_X_TILED)
-		dspcntr |= DISPPLANE_TILED;
-	else
-		dspcntr &= ~DISPPLANE_TILED;
-
-	I915_WRITE(reg, dspcntr);
-
-	I915_WRITE(DSPSURF(intel_crtc->plane), work->gtt_offset);
-	POSTING_READ(DSPSURF(intel_crtc->plane));
-}
-
-static void intel_mmio_flip_work_func(struct work_struct *w)
-{
-	struct intel_flip_work *work =
-		container_of(w, struct intel_flip_work, mmio_work);
-	struct intel_crtc *crtc = to_intel_crtc(work->crtc);
-	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
-	struct intel_framebuffer *intel_fb =
-		to_intel_framebuffer(crtc->base.primary->fb);
-	struct drm_i915_gem_object *obj = intel_fb->obj;
-
-	WARN_ON(i915_gem_object_wait(obj, 0, MAX_SCHEDULE_TIMEOUT, NULL) < 0);
-
-	intel_pipe_update_start(crtc);
-
-	if (INTEL_GEN(dev_priv) >= 9)
-		skl_do_mmio_flip(crtc, work->rotation, work);
-	else
-		/* use_mmio_flip() retricts MMIO flips to ilk+ */
-		ilk_do_mmio_flip(crtc, work);
-
-	intel_pipe_update_end(crtc, work);
-}
-
-static int intel_default_queue_flip(struct drm_device *dev,
-				    struct drm_crtc *crtc,
-				    struct drm_framebuffer *fb,
-				    struct drm_i915_gem_object *obj,
-				    struct drm_i915_gem_request *req,
-				    uint32_t flags)
-{
-	return -ENODEV;
-}
-
-static bool __pageflip_stall_check_cs(struct drm_i915_private *dev_priv,
-				      struct intel_crtc *intel_crtc,
-				      struct intel_flip_work *work)
-{
-	u32 addr, vblank;
-
-	if (!atomic_read(&work->pending))
-		return false;
-
-	smp_rmb();
-
-	vblank = intel_crtc_get_vblank_counter(intel_crtc);
-	if (work->flip_ready_vblank == 0) {
-		if (work->flip_queued_req &&
-		    !i915_gem_request_completed(work->flip_queued_req))
-			return false;
-
-		work->flip_ready_vblank = vblank;
-	}
-
-	if (vblank - work->flip_ready_vblank < 3)
-		return false;
-
-	/* Potential stall - if we see that the flip has happened,
-	 * assume a missed interrupt. */
-	if (INTEL_GEN(dev_priv) >= 4)
-		addr = I915_HI_DISPBASE(I915_READ(DSPSURF(intel_crtc->plane)));
-	else
-		addr = I915_READ(DSPADDR(intel_crtc->plane));
-
-	/* There is a potential issue here with a false positive after a flip
-	 * to the same address. We could address this by checking for a
-	 * non-incrementing frame counter.
-	 */
-	return addr == work->gtt_offset;
-}
-
-void intel_check_page_flip(struct drm_i915_private *dev_priv, int pipe)
-{
-	struct drm_device *dev = &dev_priv->drm;
-	struct intel_crtc *crtc = intel_get_crtc_for_pipe(dev_priv, pipe);
-	struct intel_flip_work *work;
-
-	WARN_ON(!in_interrupt());
-
-	if (crtc == NULL)
-		return;
-
-	spin_lock(&dev->event_lock);
-	work = crtc->flip_work;
-
-	if (work != NULL && !is_mmio_work(work) &&
-	    __pageflip_stall_check_cs(dev_priv, crtc, work)) {
-		WARN_ONCE(1,
-			  "Kicking stuck page flip: queued at %d, now %d\n",
-			work->flip_queued_vblank, intel_crtc_get_vblank_counter(crtc));
-		page_flip_completed(crtc);
-		work = NULL;
-	}
-
-	if (work != NULL && !is_mmio_work(work) &&
-	    intel_crtc_get_vblank_counter(crtc) - work->flip_queued_vblank > 1)
-		intel_queue_rps_boost_for_request(work->flip_queued_req);
-	spin_unlock(&dev->event_lock);
-}
-
-__maybe_unused
-static int intel_crtc_page_flip(struct drm_crtc *crtc,
-				struct drm_framebuffer *fb,
-				struct drm_pending_vblank_event *event,
-				uint32_t page_flip_flags)
-{
-	struct drm_device *dev = crtc->dev;
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct drm_framebuffer *old_fb = crtc->primary->fb;
-	struct drm_i915_gem_object *obj = intel_fb_obj(fb);
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-	struct drm_plane *primary = crtc->primary;
-	enum pipe pipe = intel_crtc->pipe;
-	struct intel_flip_work *work;
-	struct intel_engine_cs *engine;
-	bool mmio_flip;
-	struct drm_i915_gem_request *request;
-	struct i915_vma *vma;
-	int ret;
-
-	/*
-	 * drm_mode_page_flip_ioctl() should already catch this, but double
-	 * check to be safe.  In the future we may enable pageflipping from
-	 * a disabled primary plane.
-	 */
-	if (WARN_ON(intel_fb_obj(old_fb) == NULL))
-		return -EBUSY;
-
-	/* Can't change pixel format via MI display flips. */
-	if (fb->format != crtc->primary->fb->format)
-		return -EINVAL;
-
-	/*
-	 * TILEOFF/LINOFF registers can't be changed via MI display flips.
-	 * Note that pitch changes could also affect these register.
-	 */
-	if (INTEL_GEN(dev_priv) > 3 &&
-	    (fb->offsets[0] != crtc->primary->fb->offsets[0] ||
-	     fb->pitches[0] != crtc->primary->fb->pitches[0]))
-		return -EINVAL;
-
-	if (i915_terminally_wedged(&dev_priv->gpu_error))
-		goto out_hang;
-
-	work = kzalloc(sizeof(*work), GFP_KERNEL);
-	if (work == NULL)
-		return -ENOMEM;
-
-	work->event = event;
-	work->crtc = crtc;
-	work->old_fb = old_fb;
-	INIT_WORK(&work->unpin_work, intel_unpin_work_fn);
-
-	ret = drm_crtc_vblank_get(crtc);
-	if (ret)
-		goto free_work;
-
-	/* We borrow the event spin lock for protecting flip_work */
-	spin_lock_irq(&dev->event_lock);
-	if (intel_crtc->flip_work) {
-		/* Before declaring the flip queue wedged, check if
-		 * the hardware completed the operation behind our backs.
-		 */
-		if (pageflip_finished(intel_crtc, intel_crtc->flip_work)) {
-			DRM_DEBUG_DRIVER("flip queue: previous flip completed, continuing\n");
-			page_flip_completed(intel_crtc);
-		} else {
-			DRM_DEBUG_DRIVER("flip queue: crtc already busy\n");
-			spin_unlock_irq(&dev->event_lock);
-
-			drm_crtc_vblank_put(crtc);
-			kfree(work);
-			return -EBUSY;
-		}
-	}
-	intel_crtc->flip_work = work;
-	spin_unlock_irq(&dev->event_lock);
-
-	if (atomic_read(&intel_crtc->unpin_work_count) >= 2)
-		flush_workqueue(dev_priv->wq);
-
-	/* Reference the objects for the scheduled work. */
-	drm_framebuffer_reference(work->old_fb);
-
-	crtc->primary->fb = fb;
-	update_state_fb(crtc->primary);
-
-	work->pending_flip_obj = i915_gem_object_get(obj);
-
-	ret = i915_mutex_lock_interruptible(dev);
-	if (ret)
-		goto cleanup;
-
-	intel_crtc->reset_count = i915_reset_count(&dev_priv->gpu_error);
-	if (i915_reset_backoff_or_wedged(&dev_priv->gpu_error)) {
-		ret = -EIO;
-		goto unlock;
-	}
-
-	atomic_inc(&intel_crtc->unpin_work_count);
-
-	if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv))
-		work->flip_count = I915_READ(PIPE_FLIPCOUNT_G4X(pipe)) + 1;
-
-	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
-		engine = dev_priv->engine[BCS];
-		if (fb->modifier != old_fb->modifier)
-			/* vlv: DISPLAY_FLIP fails to change tiling */
-			engine = NULL;
-	} else if (IS_IVYBRIDGE(dev_priv) || IS_HASWELL(dev_priv)) {
-		engine = dev_priv->engine[BCS];
-	} else if (INTEL_GEN(dev_priv) >= 7) {
-		engine = i915_gem_object_last_write_engine(obj);
-		if (engine == NULL || engine->id != RCS)
-			engine = dev_priv->engine[BCS];
-	} else {
-		engine = dev_priv->engine[RCS];
-	}
-
-	mmio_flip = use_mmio_flip(engine, obj);
-
-	vma = intel_pin_and_fence_fb_obj(fb, primary->state->rotation);
-	if (IS_ERR(vma)) {
-		ret = PTR_ERR(vma);
-		goto cleanup_pending;
-	}
-
-	work->old_vma = to_intel_plane_state(primary->state)->vma;
-	to_intel_plane_state(primary->state)->vma = vma;
-
-	work->gtt_offset = i915_ggtt_offset(vma) + intel_crtc->dspaddr_offset;
-	work->rotation = crtc->primary->state->rotation;
-
-	/*
-	 * There's the potential that the next frame will not be compatible with
-	 * FBC, so we want to call pre_update() before the actual page flip.
-	 * The problem is that pre_update() caches some information about the fb
-	 * object, so we want to do this only after the object is pinned. Let's
-	 * be on the safe side and do this immediately before scheduling the
-	 * flip.
-	 */
-	intel_fbc_pre_update(intel_crtc, intel_crtc->config,
-			     to_intel_plane_state(primary->state));
-
-	if (mmio_flip) {
-		INIT_WORK(&work->mmio_work, intel_mmio_flip_work_func);
-		queue_work(system_unbound_wq, &work->mmio_work);
-	} else {
-		request = i915_gem_request_alloc(engine,
-						 dev_priv->kernel_context);
-		if (IS_ERR(request)) {
-			ret = PTR_ERR(request);
-			goto cleanup_unpin;
-		}
-
-		ret = i915_gem_request_await_object(request, obj, false);
-		if (ret)
-			goto cleanup_request;
-
-		ret = dev_priv->display.queue_flip(dev, crtc, fb, obj, request,
-						   page_flip_flags);
-		if (ret)
-			goto cleanup_request;
-
-		intel_mark_page_flip_active(intel_crtc, work);
-
-		work->flip_queued_req = i915_gem_request_get(request);
-		i915_add_request(request);
-	}
-
-	i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
-	i915_gem_track_fb(intel_fb_obj(old_fb), obj,
-			  to_intel_plane(primary)->frontbuffer_bit);
-	mutex_unlock(&dev->struct_mutex);
-
-	intel_frontbuffer_flip_prepare(to_i915(dev),
-				       to_intel_plane(primary)->frontbuffer_bit);
-
-	trace_i915_flip_request(intel_crtc->plane, obj);
-
-	return 0;
-
-cleanup_request:
-	i915_add_request(request);
-cleanup_unpin:
-	to_intel_plane_state(primary->state)->vma = work->old_vma;
-	intel_unpin_fb_vma(vma);
-cleanup_pending:
-	atomic_dec(&intel_crtc->unpin_work_count);
-unlock:
-	mutex_unlock(&dev->struct_mutex);
-cleanup:
-	crtc->primary->fb = old_fb;
-	update_state_fb(crtc->primary);
-
-	i915_gem_object_put(obj);
-	drm_framebuffer_unreference(work->old_fb);
-
-	spin_lock_irq(&dev->event_lock);
-	intel_crtc->flip_work = NULL;
-	spin_unlock_irq(&dev->event_lock);
-
-	drm_crtc_vblank_put(crtc);
-free_work:
-	kfree(work);
-
-	if (ret == -EIO) {
-		struct drm_atomic_state *state;
-		struct drm_plane_state *plane_state;
-
-out_hang:
-		state = drm_atomic_state_alloc(dev);
-		if (!state)
-			return -ENOMEM;
-		state->acquire_ctx = dev->mode_config.acquire_ctx;
-
-retry:
-		plane_state = drm_atomic_get_plane_state(state, primary);
-		ret = PTR_ERR_OR_ZERO(plane_state);
-		if (!ret) {
-			drm_atomic_set_fb_for_plane(plane_state, fb);
-
-			ret = drm_atomic_set_crtc_for_plane(plane_state, crtc);
-			if (!ret)
-				ret = drm_atomic_commit(state);
-		}
-
-		if (ret == -EDEADLK) {
-			drm_modeset_backoff(state->acquire_ctx);
-			drm_atomic_state_clear(state);
-			goto retry;
-		}
-
-		drm_atomic_state_put(state);
-
-		if (ret == 0 && event) {
-			spin_lock_irq(&dev->event_lock);
-			drm_crtc_send_vblank_event(crtc, event);
-			spin_unlock_irq(&dev->event_lock);
-		}
-	}
-	return ret;
-}
-
-
 /**
  * intel_wm_need_update - Check whether watermarks need updating
  * @plane: drm plane
@@ -11352,6 +10680,9 @@ static void intel_dump_pipe_config(struct intel_crtc *crtc,
 				      pipe_config->fdi_lanes,
 				      &pipe_config->fdi_m_n);
 
+	if (pipe_config->ycbcr420)
+		DRM_DEBUG_KMS("YCbCr 4:2:0 output enabled\n");
+
 	if (intel_crtc_has_dp_encoder(pipe_config)) {
 		intel_dump_m_n_config(pipe_config, "dp m_n",
 				pipe_config->lane_count, &pipe_config->dp_m_n);
@@ -11923,6 +11254,7 @@ intel_pipe_config_compare(struct drm_i915_private *dev_priv,
 	PIPE_CONF_CHECK_I(hdmi_scrambling);
 	PIPE_CONF_CHECK_I(hdmi_high_tmds_clock_ratio);
 	PIPE_CONF_CHECK_I(has_infoframe);
+	PIPE_CONF_CHECK_I(ycbcr420);
 
 	PIPE_CONF_CHECK_I(has_audio);
 
@@ -12764,31 +12096,7 @@ static int intel_atomic_check(struct drm_device *dev,
 static int intel_atomic_prepare_commit(struct drm_device *dev,
 				       struct drm_atomic_state *state)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
-	struct drm_crtc_state *crtc_state;
-	struct drm_crtc *crtc;
-	int i, ret;
-
-	for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
-		if (state->legacy_cursor_update)
-			continue;
-
-		ret = intel_crtc_wait_for_pending_flips(crtc);
-		if (ret)
-			return ret;
-
-		if (atomic_read(&to_intel_crtc(crtc)->unpin_work_count) >= 2)
-			flush_workqueue(dev_priv->wq);
-	}
-
-	ret = mutex_lock_interruptible(&dev->struct_mutex);
-	if (ret)
-		return ret;
-
-	ret = drm_atomic_helper_prepare_planes(dev, state);
-	mutex_unlock(&dev->struct_mutex);
-
-	return ret;
+	return drm_atomic_helper_prepare_planes(dev, state);
 }
 
 u32 intel_crtc_get_vblank_counter(struct intel_crtc *crtc)
@@ -12796,7 +12104,7 @@ u32 intel_crtc_get_vblank_counter(struct intel_crtc *crtc)
 	struct drm_device *dev = crtc->base.dev;
 
 	if (!dev->max_vblank_count)
-		return drm_accurate_vblank_count(&crtc->base);
+		return drm_crtc_accurate_vblank_count(&crtc->base);
 
 	return dev->driver->get_vblank_counter(dev, crtc->pipe);
 }
@@ -12999,6 +12307,30 @@ static void intel_atomic_helper_free_state_worker(struct work_struct *work)
 	intel_atomic_helper_free_state(dev_priv);
 }
 
+static void intel_atomic_commit_fence_wait(struct intel_atomic_state *intel_state)
+{
+	struct wait_queue_entry wait_fence, wait_reset;
+	struct drm_i915_private *dev_priv = to_i915(intel_state->base.dev);
+
+	init_wait_entry(&wait_fence, 0);
+	init_wait_entry(&wait_reset, 0);
+	for (;;) {
+		prepare_to_wait(&intel_state->commit_ready.wait,
+				&wait_fence, TASK_UNINTERRUPTIBLE);
+		prepare_to_wait(&dev_priv->gpu_error.wait_queue,
+				&wait_reset, TASK_UNINTERRUPTIBLE);
+
+
+		if (i915_sw_fence_done(&intel_state->commit_ready)
+		    || test_bit(I915_RESET_MODESET, &dev_priv->gpu_error.flags))
+			break;
+
+		schedule();
+	}
+	finish_wait(&intel_state->commit_ready.wait, &wait_fence);
+	finish_wait(&dev_priv->gpu_error.wait_queue, &wait_reset);
+}
+
 static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
@@ -13012,6 +12344,8 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 	unsigned crtc_vblank_mask = 0;
 	int i;
 
+	intel_atomic_commit_fence_wait(intel_state);
+
 	drm_atomic_helper_wait_for_dependencies(state);
 
 	if (intel_state->modeset)
@@ -13151,9 +12485,7 @@ static void intel_atomic_commit_tail(struct drm_atomic_state *state)
 		intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET);
 	}
 
-	mutex_lock(&dev->struct_mutex);
 	drm_atomic_helper_cleanup_planes(dev, state);
-	mutex_unlock(&dev->struct_mutex);
 
 	drm_atomic_helper_commit_cleanup_done(state);
 
@@ -13179,10 +12511,8 @@ intel_atomic_commit_ready(struct i915_sw_fence *fence,
 
 	switch (notify) {
 	case FENCE_COMPLETE:
-		if (state->base.commit_work.func)
-			queue_work(system_unbound_wq, &state->base.commit_work);
+		/* we do blocking waits in the worker, nothing to do here */
 		break;
-
 	case FENCE_FREE:
 		{
 			struct intel_atomic_helper *helper =
@@ -13264,7 +12594,13 @@ static int intel_atomic_commit(struct drm_device *dev,
 	if (INTEL_GEN(dev_priv) < 9)
 		state->legacy_cursor_update = false;
 
-	drm_atomic_helper_swap_state(state, true);
+	ret = drm_atomic_helper_swap_state(state, true);
+	if (ret) {
+		i915_sw_fence_commit(&intel_state->commit_ready);
+
+		drm_atomic_helper_cleanup_planes(dev, state);
+		return ret;
+	}
 	dev_priv->wm.distrust_bios_wm = false;
 	intel_shared_dpll_swap_state(state);
 	intel_atomic_track_fbs(state);
@@ -13278,14 +12614,14 @@ static int intel_atomic_commit(struct drm_device *dev,
 	}
 
 	drm_atomic_state_get(state);
-	INIT_WORK(&state->commit_work,
-		  nonblock ? intel_atomic_commit_work : NULL);
+	INIT_WORK(&state->commit_work, intel_atomic_commit_work);
 
 	i915_sw_fence_commit(&intel_state->commit_ready);
-	if (!nonblock) {
-		i915_sw_fence_wait(&intel_state->commit_ready);
+	if (nonblock)
+		queue_work(system_unbound_wq, &state->commit_work);
+	else
 		intel_atomic_commit_tail(state);
-	}
+
 
 	return 0;
 }
@@ -13293,7 +12629,6 @@ static int intel_atomic_commit(struct drm_device *dev,
 static const struct drm_crtc_funcs intel_crtc_funcs = {
 	.gamma_set = drm_atomic_helper_legacy_gamma_set,
 	.set_config = drm_atomic_helper_set_config,
-	.set_property = drm_atomic_helper_crtc_set_property,
 	.destroy = intel_crtc_destroy,
 	.page_flip = drm_atomic_helper_page_flip,
 	.atomic_duplicate_state = intel_crtc_duplicate_state,
@@ -13327,32 +12662,6 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 	struct drm_i915_gem_object *old_obj = intel_fb_obj(plane->state->fb);
 	int ret;
 
-	if (obj) {
-		if (plane->type == DRM_PLANE_TYPE_CURSOR &&
-		    INTEL_INFO(dev_priv)->cursor_needs_physical) {
-			const int align = intel_cursor_alignment(dev_priv);
-
-			ret = i915_gem_object_attach_phys(obj, align);
-			if (ret) {
-				DRM_DEBUG_KMS("failed to attach phys object\n");
-				return ret;
-			}
-		} else {
-			struct i915_vma *vma;
-
-			vma = intel_pin_and_fence_fb_obj(fb, new_state->rotation);
-			if (IS_ERR(vma)) {
-				DRM_DEBUG_KMS("failed to pin object\n");
-				return PTR_ERR(vma);
-			}
-
-			to_intel_plane_state(new_state)->vma = vma;
-		}
-	}
-
-	if (!obj && !old_obj)
-		return 0;
-
 	if (old_obj) {
 		struct drm_crtc_state *crtc_state =
 			drm_atomic_get_existing_crtc_state(new_state->state,
@@ -13391,6 +12700,38 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 	if (!obj)
 		return 0;
 
+	ret = i915_gem_object_pin_pages(obj);
+	if (ret)
+		return ret;
+
+	ret = mutex_lock_interruptible(&dev_priv->drm.struct_mutex);
+	if (ret) {
+		i915_gem_object_unpin_pages(obj);
+		return ret;
+	}
+
+	if (plane->type == DRM_PLANE_TYPE_CURSOR &&
+	    INTEL_INFO(dev_priv)->cursor_needs_physical) {
+		const int align = intel_cursor_alignment(dev_priv);
+
+		ret = i915_gem_object_attach_phys(obj, align);
+	} else {
+		struct i915_vma *vma;
+
+		vma = intel_pin_and_fence_fb_obj(fb, new_state->rotation);
+		if (!IS_ERR(vma))
+			to_intel_plane_state(new_state)->vma = vma;
+		else
+			ret = PTR_ERR(vma);
+	}
+
+	i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
+
+	mutex_unlock(&dev_priv->drm.struct_mutex);
+	i915_gem_object_unpin_pages(obj);
+	if (ret)
+		return ret;
+
 	if (!new_state->fence) { /* implicit fencing */
 		ret = i915_sw_fence_await_reservation(&intel_state->commit_ready,
 						      obj->resv, NULL,
@@ -13398,8 +12739,6 @@ intel_prepare_plane_fb(struct drm_plane *plane,
 						      GFP_KERNEL);
 		if (ret < 0)
 			return ret;
-
-		i915_gem_object_wait_priority(obj, 0, I915_PRIORITY_DISPLAY);
 	}
 
 	return 0;
@@ -13422,8 +12761,11 @@ intel_cleanup_plane_fb(struct drm_plane *plane,
 
 	/* Should only be called after a successful intel_prepare_plane_fb()! */
 	vma = fetch_and_zero(&to_intel_plane_state(old_state)->vma);
-	if (vma)
+	if (vma) {
+		mutex_lock(&plane->dev->struct_mutex);
 		intel_unpin_fb_vma(vma);
+		mutex_unlock(&plane->dev->struct_mutex);
+	}
 }
 
 int
@@ -13550,7 +12892,7 @@ static void intel_finish_crtc_commit(struct drm_crtc *crtc,
 {
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 
-	intel_pipe_update_end(intel_crtc, NULL);
+	intel_pipe_update_end(intel_crtc);
 }
 
 /**
@@ -13566,15 +12908,110 @@ void intel_plane_destroy(struct drm_plane *plane)
 	kfree(to_intel_plane(plane));
 }
 
-const struct drm_plane_funcs intel_plane_funcs = {
+static bool i8xx_mod_supported(uint32_t format, uint64_t modifier)
+{
+	switch (format) {
+	case DRM_FORMAT_C8:
+	case DRM_FORMAT_RGB565:
+	case DRM_FORMAT_XRGB1555:
+	case DRM_FORMAT_XRGB8888:
+		return modifier == DRM_FORMAT_MOD_LINEAR ||
+			modifier == I915_FORMAT_MOD_X_TILED;
+	default:
+		return false;
+	}
+}
+
+static bool i965_mod_supported(uint32_t format, uint64_t modifier)
+{
+	switch (format) {
+	case DRM_FORMAT_C8:
+	case DRM_FORMAT_RGB565:
+	case DRM_FORMAT_XRGB8888:
+	case DRM_FORMAT_XBGR8888:
+	case DRM_FORMAT_XRGB2101010:
+	case DRM_FORMAT_XBGR2101010:
+		return modifier == DRM_FORMAT_MOD_LINEAR ||
+			modifier == I915_FORMAT_MOD_X_TILED;
+	default:
+		return false;
+	}
+}
+
+static bool skl_mod_supported(uint32_t format, uint64_t modifier)
+{
+	switch (format) {
+	case DRM_FORMAT_XRGB8888:
+	case DRM_FORMAT_XBGR8888:
+	case DRM_FORMAT_ARGB8888:
+	case DRM_FORMAT_ABGR8888:
+		if (modifier == I915_FORMAT_MOD_Yf_TILED_CCS ||
+		    modifier == I915_FORMAT_MOD_Y_TILED_CCS)
+			return true;
+		/* fall through */
+	case DRM_FORMAT_RGB565:
+	case DRM_FORMAT_XRGB2101010:
+	case DRM_FORMAT_XBGR2101010:
+	case DRM_FORMAT_YUYV:
+	case DRM_FORMAT_YVYU:
+	case DRM_FORMAT_UYVY:
+	case DRM_FORMAT_VYUY:
+		if (modifier == I915_FORMAT_MOD_Yf_TILED)
+			return true;
+		/* fall through */
+	case DRM_FORMAT_C8:
+		if (modifier == DRM_FORMAT_MOD_LINEAR ||
+		    modifier == I915_FORMAT_MOD_X_TILED ||
+		    modifier == I915_FORMAT_MOD_Y_TILED)
+			return true;
+		/* fall through */
+	default:
+		return false;
+	}
+}
+
+static bool intel_primary_plane_format_mod_supported(struct drm_plane *plane,
+						     uint32_t format,
+						     uint64_t modifier)
+{
+	struct drm_i915_private *dev_priv = to_i915(plane->dev);
+
+	if (WARN_ON(modifier == DRM_FORMAT_MOD_INVALID))
+		return false;
+
+	if ((modifier >> 56) != DRM_FORMAT_MOD_VENDOR_INTEL &&
+	    modifier != DRM_FORMAT_MOD_LINEAR)
+		return false;
+
+	if (INTEL_GEN(dev_priv) >= 9)
+		return skl_mod_supported(format, modifier);
+	else if (INTEL_GEN(dev_priv) >= 4)
+		return i965_mod_supported(format, modifier);
+	else
+		return i8xx_mod_supported(format, modifier);
+
+	unreachable();
+}
+
+static bool intel_cursor_plane_format_mod_supported(struct drm_plane *plane,
+						    uint32_t format,
+						    uint64_t modifier)
+{
+	if (WARN_ON(modifier == DRM_FORMAT_MOD_INVALID))
+		return false;
+
+	return modifier == DRM_FORMAT_MOD_LINEAR && format == DRM_FORMAT_ARGB8888;
+}
+
+static struct drm_plane_funcs intel_plane_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.destroy = intel_plane_destroy,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.atomic_get_property = intel_plane_atomic_get_property,
 	.atomic_set_property = intel_plane_atomic_set_property,
 	.atomic_duplicate_state = intel_plane_duplicate_state,
 	.atomic_destroy_state = intel_plane_destroy_state,
+	.format_mod_supported = intel_primary_plane_format_mod_supported,
 };
 
 static int
@@ -13593,7 +13030,7 @@ intel_legacy_cursor_update(struct drm_plane *plane,
 	struct intel_plane *intel_plane = to_intel_plane(plane);
 	struct drm_framebuffer *old_fb;
 	struct drm_crtc_state *crtc_state = crtc->state;
-	struct i915_vma *old_vma;
+	struct i915_vma *old_vma, *vma;
 
 	/*
 	 * When crtc is inactive or there is a modeset pending,
@@ -13651,8 +13088,6 @@ intel_legacy_cursor_update(struct drm_plane *plane,
 			goto out_unlock;
 		}
 	} else {
-		struct i915_vma *vma;
-
 		vma = intel_pin_and_fence_fb_obj(fb, new_plane_state->rotation);
 		if (IS_ERR(vma)) {
 			DRM_DEBUG_KMS("failed to pin object\n");
@@ -13675,7 +13110,7 @@ intel_legacy_cursor_update(struct drm_plane *plane,
 	*to_intel_plane_state(old_plane_state) = *to_intel_plane_state(new_plane_state);
 	new_plane_state->fence = NULL;
 	new_plane_state->fb = old_fb;
-	to_intel_plane_state(new_plane_state)->vma = old_vma;
+	to_intel_plane_state(new_plane_state)->vma = NULL;
 
 	if (plane->state->visible) {
 		trace_intel_update_plane(plane, to_intel_crtc(crtc));
@@ -13687,7 +13122,8 @@ intel_legacy_cursor_update(struct drm_plane *plane,
 		intel_plane->disable_plane(intel_plane, to_intel_crtc(crtc));
 	}
 
-	intel_cleanup_plane_fb(plane, new_plane_state);
+	if (old_vma)
+		intel_unpin_fb_vma(old_vma);
 
 out_unlock:
 	mutex_unlock(&dev_priv->drm.struct_mutex);
@@ -13705,11 +13141,11 @@ static const struct drm_plane_funcs intel_cursor_plane_funcs = {
 	.update_plane = intel_legacy_cursor_update,
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.destroy = intel_plane_destroy,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.atomic_get_property = intel_plane_atomic_get_property,
 	.atomic_set_property = intel_plane_atomic_set_property,
 	.atomic_duplicate_state = intel_plane_duplicate_state,
 	.atomic_destroy_state = intel_plane_destroy_state,
+	.format_mod_supported = intel_cursor_plane_format_mod_supported,
 };
 
 static struct intel_plane *
@@ -13720,6 +13156,7 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
 	const uint32_t *intel_primary_formats;
 	unsigned int supported_rotations;
 	unsigned int num_formats;
+	const uint64_t *modifiers;
 	int ret;
 
 	primary = kzalloc(sizeof(*primary), GFP_KERNEL);
@@ -13755,21 +13192,34 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
 	primary->frontbuffer_bit = INTEL_FRONTBUFFER_PRIMARY(pipe);
 	primary->check_plane = intel_check_primary_plane;
 
-	if (INTEL_GEN(dev_priv) >= 9) {
+	if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv)) {
 		intel_primary_formats = skl_primary_formats;
 		num_formats = ARRAY_SIZE(skl_primary_formats);
+		modifiers = skl_format_modifiers_ccs;
+
+		primary->update_plane = skylake_update_primary_plane;
+		primary->disable_plane = skylake_disable_primary_plane;
+	} else if (INTEL_GEN(dev_priv) >= 9) {
+		intel_primary_formats = skl_primary_formats;
+		num_formats = ARRAY_SIZE(skl_primary_formats);
+		if (pipe < PIPE_C)
+			modifiers = skl_format_modifiers_ccs;
+		else
+			modifiers = skl_format_modifiers_noccs;
 
 		primary->update_plane = skylake_update_primary_plane;
 		primary->disable_plane = skylake_disable_primary_plane;
 	} else if (INTEL_GEN(dev_priv) >= 4) {
 		intel_primary_formats = i965_primary_formats;
 		num_formats = ARRAY_SIZE(i965_primary_formats);
+		modifiers = i9xx_format_modifiers;
 
 		primary->update_plane = i9xx_update_primary_plane;
 		primary->disable_plane = i9xx_disable_primary_plane;
 	} else {
 		intel_primary_formats = i8xx_primary_formats;
 		num_formats = ARRAY_SIZE(i8xx_primary_formats);
+		modifiers = i9xx_format_modifiers;
 
 		primary->update_plane = i9xx_update_primary_plane;
 		primary->disable_plane = i9xx_disable_primary_plane;
@@ -13779,18 +13229,21 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
 		ret = drm_universal_plane_init(&dev_priv->drm, &primary->base,
 					       0, &intel_plane_funcs,
 					       intel_primary_formats, num_formats,
+					       modifiers,
 					       DRM_PLANE_TYPE_PRIMARY,
 					       "plane 1%c", pipe_name(pipe));
 	else if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv))
 		ret = drm_universal_plane_init(&dev_priv->drm, &primary->base,
 					       0, &intel_plane_funcs,
 					       intel_primary_formats, num_formats,
+					       modifiers,
 					       DRM_PLANE_TYPE_PRIMARY,
 					       "primary %c", pipe_name(pipe));
 	else
 		ret = drm_universal_plane_init(&dev_priv->drm, &primary->base,
 					       0, &intel_plane_funcs,
 					       intel_primary_formats, num_formats,
+					       modifiers,
 					       DRM_PLANE_TYPE_PRIMARY,
 					       "plane %c", plane_name(primary->plane));
 	if (ret)
@@ -13876,6 +13329,7 @@ intel_cursor_plane_create(struct drm_i915_private *dev_priv,
 				       0, &intel_cursor_plane_funcs,
 				       intel_cursor_formats,
 				       ARRAY_SIZE(intel_cursor_formats),
+				       cursor_format_modifiers,
 				       DRM_PLANE_TYPE_CURSOR,
 				       "cursor %c", pipe_name(pipe));
 	if (ret)
@@ -14398,10 +13852,12 @@ static int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
 				  struct drm_mode_fb_cmd2 *mode_cmd)
 {
 	struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
+	struct drm_framebuffer *fb = &intel_fb->base;
 	struct drm_format_name_buf format_name;
-	u32 pitch_limit, stride_alignment;
+	u32 pitch_limit;
 	unsigned int tiling, stride;
 	int ret = -EINVAL;
+	int i;
 
 	i915_gem_object_lock(obj);
 	obj->framebuffer_references++;
@@ -14430,6 +13886,19 @@ static int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
 
 	/* Passed in modifier sanity checking. */
 	switch (mode_cmd->modifier[0]) {
+	case I915_FORMAT_MOD_Y_TILED_CCS:
+	case I915_FORMAT_MOD_Yf_TILED_CCS:
+		switch (mode_cmd->pixel_format) {
+		case DRM_FORMAT_XBGR8888:
+		case DRM_FORMAT_ABGR8888:
+		case DRM_FORMAT_XRGB8888:
+		case DRM_FORMAT_ARGB8888:
+			break;
+		default:
+			DRM_DEBUG_KMS("RC supported only with RGB8888 formats\n");
+			goto err;
+		}
+		/* fall through */
 	case I915_FORMAT_MOD_Y_TILED:
 	case I915_FORMAT_MOD_Yf_TILED:
 		if (INTEL_GEN(dev_priv) < 9) {
@@ -14534,25 +14003,46 @@ static int intel_framebuffer_init(struct intel_framebuffer *intel_fb,
 	if (mode_cmd->offsets[0] != 0)
 		goto err;
 
-	drm_helper_mode_fill_fb_struct(&dev_priv->drm,
-				       &intel_fb->base, mode_cmd);
+	drm_helper_mode_fill_fb_struct(&dev_priv->drm, fb, mode_cmd);
 
-	stride_alignment = intel_fb_stride_alignment(&intel_fb->base, 0);
-	if (mode_cmd->pitches[0] & (stride_alignment - 1)) {
-		DRM_DEBUG_KMS("pitch (%d) must be at least %u byte aligned\n",
-			      mode_cmd->pitches[0], stride_alignment);
-		goto err;
+	for (i = 0; i < fb->format->num_planes; i++) {
+		u32 stride_alignment;
+
+		if (mode_cmd->handles[i] != mode_cmd->handles[0]) {
+			DRM_DEBUG_KMS("bad plane %d handle\n", i);
+			goto err;
+		}
+
+		stride_alignment = intel_fb_stride_alignment(fb, i);
+
+		/*
+		 * Display WA #0531: skl,bxt,kbl,glk
+		 *
+		 * Render decompression and plane width > 3840
+		 * combined with horizontal panning requires the
+		 * plane stride to be a multiple of 4. We'll just
+		 * require the entire fb to accommodate that to avoid
+		 * potential runtime errors at plane configuration time.
+		 */
+		if (IS_GEN9(dev_priv) && i == 0 && fb->width > 3840 &&
+		    (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
+			stride_alignment *= 4;
+
+		if (fb->pitches[i] & (stride_alignment - 1)) {
+			DRM_DEBUG_KMS("plane %d pitch (%d) must be at least %u byte aligned\n",
+				      i, fb->pitches[i], stride_alignment);
+			goto err;
+		}
 	}
 
 	intel_fb->obj = obj;
 
-	ret = intel_fill_fb_info(dev_priv, &intel_fb->base);
+	ret = intel_fill_fb_info(dev_priv, fb);
 	if (ret)
 		goto err;
 
-	ret = drm_framebuffer_init(obj->base.dev,
-				   &intel_fb->base,
-				   &intel_fb_funcs);
+	ret = drm_framebuffer_init(&dev_priv->drm, fb, &intel_fb_funcs);
 	if (ret) {
 		DRM_ERROR("framebuffer init failed %d\n", ret);
 		goto err;
@@ -14600,6 +14090,7 @@ static void intel_atomic_state_free(struct drm_atomic_state *state)
 
 static const struct drm_mode_config_funcs intel_mode_funcs = {
 	.fb_create = intel_user_framebuffer_create,
+	.get_format_info = intel_get_format_info,
 	.output_poll_changed = intel_fbdev_output_poll_changed,
 	.atomic_check = intel_atomic_check,
 	.atomic_commit = intel_atomic_commit,
@@ -14699,34 +14190,6 @@ void intel_init_display_hooks(struct drm_i915_private *dev_priv)
 		dev_priv->display.update_crtcs = skl_update_crtcs;
 	else
 		dev_priv->display.update_crtcs = intel_update_crtcs;
-
-	switch (INTEL_INFO(dev_priv)->gen) {
-	case 2:
-		dev_priv->display.queue_flip = intel_gen2_queue_flip;
-		break;
-
-	case 3:
-		dev_priv->display.queue_flip = intel_gen3_queue_flip;
-		break;
-
-	case 4:
-	case 5:
-		dev_priv->display.queue_flip = intel_gen4_queue_flip;
-		break;
-
-	case 6:
-		dev_priv->display.queue_flip = intel_gen6_queue_flip;
-		break;
-	case 7:
-	case 8: /* FIXME(BDW): Check that the gen8 RCS flip works. */
-		dev_priv->display.queue_flip = intel_gen7_queue_flip;
-		break;
-	case 9:
-		/* Drop through - unsupported since execlist only. */
-	default:
-		/* Default just returns -ENODEV to indicate unsupported */
-		dev_priv->display.queue_flip = intel_default_queue_flip;
-	}
 }
 
 /*
@@ -14758,6 +14221,17 @@ static void quirk_backlight_present(struct drm_device *dev)
 	DRM_INFO("applying backlight present quirk\n");
 }
 
+/* Toshiba Satellite P50-C-18C requires T12 delay to be min 800ms
+ * which is 300 ms greater than eDP spec T12 min.
+ */
+static void quirk_increase_t12_delay(struct drm_device *dev)
+{
+	struct drm_i915_private *dev_priv = to_i915(dev);
+
+	dev_priv->quirks |= QUIRK_INCREASE_T12_DELAY;
+	DRM_INFO("Applying T12 delay quirk\n");
+}
+
 struct intel_quirk {
 	int device;
 	int subsystem_vendor;
@@ -14841,6 +14315,9 @@ static struct intel_quirk intel_quirks[] = {
 
 	/* Dell Chromebook 11 (2015 version) */
 	{ 0x0a16, 0x1028, 0x0a35, quirk_backlight_present },
+
+	/* Toshiba Satellite P50-C-18C */
+	{ 0x191B, 0x1179, 0xF840, quirk_increase_t12_delay },
 };
 
 static void intel_init_quirks(struct drm_device *dev)
@@ -15643,7 +15120,7 @@ intel_modeset_setup_hw_state(struct drm_device *dev,
 	} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
 		vlv_wm_get_hw_state(dev);
 		vlv_wm_sanitize(dev_priv);
-	} else if (IS_GEN9(dev_priv)) {
+	} else if (INTEL_GEN(dev_priv) >= 9) {
 		skl_wm_get_hw_state(dev);
 	} else if (HAS_PCH_SPLIT(dev_priv)) {
 		ilk_wm_get_hw_state(dev);
@@ -15750,6 +15227,9 @@ void intel_modeset_cleanup(struct drm_device *dev)
 	 */
 	drm_kms_helper_poll_fini(dev);
 
+	/* poll work can call into fbdev, hence clean that up afterwards */
+	intel_fbdev_fini(dev_priv);
+
 	intel_unregister_dsm_handler();
 
 	intel_fbc_global_disable(dev_priv);
@@ -15869,7 +15349,8 @@ intel_display_capture_error_state(struct drm_i915_private *dev_priv)
 		return NULL;
 
 	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
-		error->power_well_driver = I915_READ(HSW_PWR_WELL_DRIVER);
+		error->power_well_driver =
+			I915_READ(HSW_PWR_WELL_CTL_DRIVER(HSW_DISP_PW_GLOBAL));
 
 	for_each_pipe(dev_priv, i) {
 		error->pipe[i].power_domain_on =
diff --git a/drivers/gpu/drm/i915/intel_dp.c b/drivers/gpu/drm/i915/intel_dp.c
index 64fa774..4fd4853 100644
--- a/drivers/gpu/drm/i915/intel_dp.c
+++ b/drivers/gpu/drm/i915/intel_dp.c
@@ -97,6 +97,9 @@ static const int bxt_rates[] = { 162000, 216000, 243000, 270000,
 				  324000, 432000, 540000 };
 static const int skl_rates[] = { 162000, 216000, 270000,
 				  324000, 432000, 540000 };
+static const int cnl_rates[] = { 162000, 216000, 270000,
+				 324000, 432000, 540000,
+				 648000, 810000 };
 static const int default_rates[] = { 162000, 270000, 540000 };
 
 /**
@@ -229,8 +232,10 @@ intel_dp_set_source_rates(struct intel_dp *intel_dp)
 {
 	struct intel_digital_port *dig_port = dp_to_dig_port(intel_dp);
 	struct drm_i915_private *dev_priv = to_i915(dig_port->base.base.dev);
+	enum port port = dig_port->port;
 	const int *source_rates;
 	int size;
+	u32 voltage;
 
 	/* This should only be done once */
 	WARN_ON(intel_dp->source_rates || intel_dp->num_source_rates);
@@ -238,6 +243,13 @@ intel_dp_set_source_rates(struct intel_dp *intel_dp)
 	if (IS_GEN9_LP(dev_priv)) {
 		source_rates = bxt_rates;
 		size = ARRAY_SIZE(bxt_rates);
+	} else if (IS_CANNONLAKE(dev_priv)) {
+		source_rates = cnl_rates;
+		size = ARRAY_SIZE(cnl_rates);
+		voltage = I915_READ(CNL_PORT_COMP_DW3) & VOLTAGE_INFO_MASK;
+		if (port == PORT_A || port == PORT_D ||
+		    voltage == VOLTAGE_INFO_0_85V)
+			size -= 2;
 	} else if (IS_GEN9_BC(dev_priv)) {
 		source_rates = skl_rates;
 		size = ARRAY_SIZE(skl_rates);
@@ -322,19 +334,20 @@ static int intel_dp_common_len_rate_limit(struct intel_dp *intel_dp,
 	return 0;
 }
 
-static bool intel_dp_link_params_valid(struct intel_dp *intel_dp)
+static bool intel_dp_link_params_valid(struct intel_dp *intel_dp, int link_rate,
+				       uint8_t lane_count)
 {
 	/*
 	 * FIXME: we need to synchronize the current link parameters with
 	 * hardware readout. Currently fast link training doesn't work on
 	 * boot-up.
 	 */
-	if (intel_dp->link_rate == 0 ||
-	    intel_dp->link_rate > intel_dp->max_link_rate)
+	if (link_rate == 0 ||
+	    link_rate > intel_dp->max_link_rate)
 		return false;
 
-	if (intel_dp->lane_count == 0 ||
-	    intel_dp->lane_count > intel_dp_max_lane_count(intel_dp))
+	if (lane_count == 0 ||
+	    lane_count > intel_dp_max_lane_count(intel_dp))
 		return false;
 
 	return true;
@@ -1606,6 +1619,23 @@ static int intel_dp_compute_bpp(struct intel_dp *intel_dp,
 	return bpp;
 }
 
+static bool intel_edp_compare_alt_mode(struct drm_display_mode *m1,
+				       struct drm_display_mode *m2)
+{
+	bool bres = false;
+
+	if (m1 && m2)
+		bres = (m1->hdisplay == m2->hdisplay &&
+			m1->hsync_start == m2->hsync_start &&
+			m1->hsync_end == m2->hsync_end &&
+			m1->htotal == m2->htotal &&
+			m1->vdisplay == m2->vdisplay &&
+			m1->vsync_start == m2->vsync_start &&
+			m1->vsync_end == m2->vsync_end &&
+			m1->vtotal == m2->vtotal);
+	return bres;
+}
+
 bool
 intel_dp_compute_config(struct intel_encoder *encoder,
 			struct intel_crtc_state *pipe_config,
@@ -1652,8 +1682,16 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 		pipe_config->has_audio = intel_conn_state->force_audio == HDMI_AUDIO_ON;
 
 	if (is_edp(intel_dp) && intel_connector->panel.fixed_mode) {
-		intel_fixed_panel_mode(intel_connector->panel.fixed_mode,
-				       adjusted_mode);
+		struct drm_display_mode *panel_mode =
+			intel_connector->panel.alt_fixed_mode;
+		struct drm_display_mode *req_mode = &pipe_config->base.mode;
+
+		if (!intel_edp_compare_alt_mode(req_mode, panel_mode))
+			panel_mode = intel_connector->panel.fixed_mode;
+
+		drm_mode_debug_printmodeline(panel_mode);
+
+		intel_fixed_panel_mode(panel_mode, adjusted_mode);
 
 		if (INTEL_GEN(dev_priv) >= 9) {
 			int ret;
@@ -1677,12 +1715,18 @@ intel_dp_compute_config(struct intel_encoder *encoder,
 	if (intel_dp->compliance.test_type == DP_TEST_LINK_TRAINING) {
 		int index;
 
-		index = intel_dp_rate_index(intel_dp->common_rates,
-					    intel_dp->num_common_rates,
-					    intel_dp->compliance.test_link_rate);
-		if (index >= 0)
-			min_clock = max_clock = index;
-		min_lane_count = max_lane_count = intel_dp->compliance.test_lane_count;
+		/* Validate the compliance test data since max values
+		 * might have changed due to link train fallback.
+		 */
+		if (intel_dp_link_params_valid(intel_dp, intel_dp->compliance.test_link_rate,
+					       intel_dp->compliance.test_lane_count)) {
+			index = intel_dp_rate_index(intel_dp->common_rates,
+						    intel_dp->num_common_rates,
+						    intel_dp->compliance.test_link_rate);
+			if (index >= 0)
+				min_clock = max_clock = index;
+			min_lane_count = max_lane_count = intel_dp->compliance.test_lane_count;
+		}
 	}
 	DRM_DEBUG_KMS("DP link computation with max lane count %i "
 		      "max bw %d pixel clock %iKHz\n",
@@ -3963,8 +4007,7 @@ intel_dp_get_sink_irq_esi(struct intel_dp *intel_dp, u8 *sink_irq_vector)
 static uint8_t intel_dp_autotest_link_training(struct intel_dp *intel_dp)
 {
 	int status = 0;
-	int min_lane_count = 1;
-	int link_rate_index, test_link_rate;
+	int test_link_rate;
 	uint8_t test_lane_count, test_link_bw;
 	/* (DP CTS 1.2)
 	 * 4.3.1.11
@@ -3978,10 +4021,6 @@ static uint8_t intel_dp_autotest_link_training(struct intel_dp *intel_dp)
 		return DP_TEST_NAK;
 	}
 	test_lane_count &= DP_MAX_LANE_COUNT_MASK;
-	/* Validate the requested lane count */
-	if (test_lane_count < min_lane_count ||
-	    test_lane_count > intel_dp->max_link_lane_count)
-		return DP_TEST_NAK;
 
 	status = drm_dp_dpcd_readb(&intel_dp->aux, DP_TEST_LINK_RATE,
 				   &test_link_bw);
@@ -3989,12 +4028,11 @@ static uint8_t intel_dp_autotest_link_training(struct intel_dp *intel_dp)
 		DRM_DEBUG_KMS("Link Rate read failed\n");
 		return DP_TEST_NAK;
 	}
-	/* Validate the requested link rate */
 	test_link_rate = drm_dp_bw_code_to_link_rate(test_link_bw);
-	link_rate_index = intel_dp_rate_index(intel_dp->common_rates,
-					      intel_dp->num_common_rates,
-					      test_link_rate);
-	if (link_rate_index < 0)
+
+	/* Validate the requested link rate and lane count */
+	if (!intel_dp_link_params_valid(intel_dp, test_link_rate,
+					test_lane_count))
 		return DP_TEST_NAK;
 
 	intel_dp->compliance.test_lane_count = test_lane_count;
@@ -4263,7 +4301,8 @@ intel_dp_check_link_status(struct intel_dp *intel_dp)
 	 * Validate the cached values of intel_dp->link_rate and
 	 * intel_dp->lane_count before attempting to retrain.
 	 */
-	if (!intel_dp_link_params_valid(intel_dp))
+	if (!intel_dp_link_params_valid(intel_dp, intel_dp->link_rate,
+					intel_dp->lane_count))
 		return;
 
 	/* Retrain if Channel EQ or CR not ok */
@@ -4418,8 +4457,6 @@ static bool ibx_digital_port_connected(struct drm_i915_private *dev_priv,
 	u32 bit;
 
 	switch (port->port) {
-	case PORT_A:
-		return true;
 	case PORT_B:
 		bit = SDE_PORTB_HOTPLUG;
 		break;
@@ -4443,8 +4480,6 @@ static bool cpt_digital_port_connected(struct drm_i915_private *dev_priv,
 	u32 bit;
 
 	switch (port->port) {
-	case PORT_A:
-		return true;
 	case PORT_B:
 		bit = SDE_PORTB_HOTPLUG_CPT;
 		break;
@@ -4454,12 +4489,28 @@ static bool cpt_digital_port_connected(struct drm_i915_private *dev_priv,
 	case PORT_D:
 		bit = SDE_PORTD_HOTPLUG_CPT;
 		break;
+	default:
+		MISSING_CASE(port->port);
+		return false;
+	}
+
+	return I915_READ(SDEISR) & bit;
+}
+
+static bool spt_digital_port_connected(struct drm_i915_private *dev_priv,
+				       struct intel_digital_port *port)
+{
+	u32 bit;
+
+	switch (port->port) {
+	case PORT_A:
+		bit = SDE_PORTA_HOTPLUG_SPT;
+		break;
 	case PORT_E:
 		bit = SDE_PORTE_HOTPLUG_SPT;
 		break;
 	default:
-		MISSING_CASE(port->port);
-		return false;
+		return cpt_digital_port_connected(dev_priv, port);
 	}
 
 	return I915_READ(SDEISR) & bit;
@@ -4511,6 +4562,42 @@ static bool gm45_digital_port_connected(struct drm_i915_private *dev_priv,
 	return I915_READ(PORT_HOTPLUG_STAT) & bit;
 }
 
+static bool ilk_digital_port_connected(struct drm_i915_private *dev_priv,
+				       struct intel_digital_port *port)
+{
+	if (port->port == PORT_A)
+		return I915_READ(DEISR) & DE_DP_A_HOTPLUG;
+	else
+		return ibx_digital_port_connected(dev_priv, port);
+}
+
+static bool snb_digital_port_connected(struct drm_i915_private *dev_priv,
+				       struct intel_digital_port *port)
+{
+	if (port->port == PORT_A)
+		return I915_READ(DEISR) & DE_DP_A_HOTPLUG;
+	else
+		return cpt_digital_port_connected(dev_priv, port);
+}
+
+static bool ivb_digital_port_connected(struct drm_i915_private *dev_priv,
+				       struct intel_digital_port *port)
+{
+	if (port->port == PORT_A)
+		return I915_READ(DEISR) & DE_DP_A_HOTPLUG_IVB;
+	else
+		return cpt_digital_port_connected(dev_priv, port);
+}
+
+static bool bdw_digital_port_connected(struct drm_i915_private *dev_priv,
+				       struct intel_digital_port *port)
+{
+	if (port->port == PORT_A)
+		return I915_READ(GEN8_DE_PORT_ISR) & GEN8_PORT_DP_A_HOTPLUG;
+	else
+		return cpt_digital_port_connected(dev_priv, port);
+}
+
 static bool bxt_digital_port_connected(struct drm_i915_private *dev_priv,
 				       struct intel_digital_port *intel_dig_port)
 {
@@ -4518,7 +4605,7 @@ static bool bxt_digital_port_connected(struct drm_i915_private *dev_priv,
 	enum port port;
 	u32 bit;
 
-	intel_hpd_pin_to_port(intel_encoder->hpd_pin, &port);
+	port = intel_hpd_pin_to_port(intel_encoder->hpd_pin);
 	switch (port) {
 	case PORT_A:
 		bit = BXT_DE_PORT_HP_DDIA;
@@ -4547,16 +4634,25 @@ static bool bxt_digital_port_connected(struct drm_i915_private *dev_priv,
 bool intel_digital_port_connected(struct drm_i915_private *dev_priv,
 				  struct intel_digital_port *port)
 {
-	if (HAS_PCH_IBX(dev_priv))
-		return ibx_digital_port_connected(dev_priv, port);
-	else if (HAS_PCH_SPLIT(dev_priv))
-		return cpt_digital_port_connected(dev_priv, port);
+	if (HAS_GMCH_DISPLAY(dev_priv)) {
+		if (IS_GM45(dev_priv))
+			return gm45_digital_port_connected(dev_priv, port);
+		else
+			return g4x_digital_port_connected(dev_priv, port);
+	}
+
+	if (IS_GEN5(dev_priv))
+		return ilk_digital_port_connected(dev_priv, port);
+	else if (IS_GEN6(dev_priv))
+		return snb_digital_port_connected(dev_priv, port);
+	else if (IS_GEN7(dev_priv))
+		return ivb_digital_port_connected(dev_priv, port);
+	else if (IS_GEN8(dev_priv))
+		return bdw_digital_port_connected(dev_priv, port);
 	else if (IS_GEN9_LP(dev_priv))
 		return bxt_digital_port_connected(dev_priv, port);
-	else if (IS_GM45(dev_priv))
-		return gm45_digital_port_connected(dev_priv, port);
 	else
-		return g4x_digital_port_connected(dev_priv, port);
+		return spt_digital_port_connected(dev_priv, port);
 }
 
 static struct edid *
@@ -4950,10 +5046,8 @@ void intel_dp_encoder_reset(struct drm_encoder *encoder)
 }
 
 static const struct drm_connector_funcs intel_dp_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.force = intel_dp_force,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.atomic_get_property = intel_digital_connector_atomic_get_property,
 	.atomic_set_property = intel_digital_connector_atomic_set_property,
 	.late_register = intel_dp_connector_register,
@@ -5121,12 +5215,8 @@ intel_pps_readout_hw_state(struct drm_i915_private *dev_priv,
 		   PANEL_POWER_DOWN_DELAY_SHIFT;
 
 	if (IS_GEN9_LP(dev_priv) || HAS_PCH_CNP(dev_priv)) {
-		u16 tmp = (pp_ctl & BXT_POWER_CYCLE_DELAY_MASK) >>
-			BXT_POWER_CYCLE_DELAY_SHIFT;
-		if (tmp > 0)
-			seq->t11_t12 = (tmp - 1) * 1000;
-		else
-			seq->t11_t12 = 0;
+		seq->t11_t12 = ((pp_ctl & BXT_POWER_CYCLE_DELAY_MASK) >>
+				BXT_POWER_CYCLE_DELAY_SHIFT) * 1000;
 	} else {
 		seq->t11_t12 = ((pp_div & PANEL_POWER_CYCLE_DELAY_MASK) >>
 		       PANEL_POWER_CYCLE_DELAY_SHIFT) * 1000;
@@ -5177,6 +5267,21 @@ intel_dp_init_panel_power_sequencer(struct drm_device *dev,
 	intel_pps_dump_state("cur", &cur);
 
 	vbt = dev_priv->vbt.edp.pps;
+	/* On Toshiba Satellite P50-C-18C system the VBT T12 delay
+	 * of 500ms appears to be too short. Occasionally the panel
+	 * just fails to power back on. Increasing the delay to 800ms
+	 * seems sufficient to avoid this problem.
+	 */
+	if (dev_priv->quirks & QUIRK_INCREASE_T12_DELAY) {
+		vbt.t11_t12 = max_t(u16, vbt.t11_t12, 800 * 10);
+		DRM_DEBUG_KMS("Increasing T12 panel delay as per the quirk to %d\n",
+			      vbt.t11_t12);
+	}
+	/* T11_T12 delay is special and actually in units of 100ms, but zero
+	 * based in the hw (so we need to add 100 ms). But the sw vbt
+	 * table multiplies it with 1000 to make it in units of 100usec,
+	 * too. */
+	vbt.t11_t12 += 100 * 10;
 
 	/* Upper limits from eDP 1.3 spec. Note that we use the clunky units of
 	 * our hw here, which are all in 100usec. */
@@ -5280,7 +5385,7 @@ intel_dp_init_panel_power_sequencer_registers(struct drm_device *dev,
 	if (IS_GEN9_LP(dev_priv) || HAS_PCH_CNP(dev_priv)) {
 		pp_div = I915_READ(regs.pp_ctrl);
 		pp_div &= ~BXT_POWER_CYCLE_DELAY_MASK;
-		pp_div |= (DIV_ROUND_UP((seq->t11_t12 + 1), 1000)
+		pp_div |= (DIV_ROUND_UP(seq->t11_t12, 1000)
 				<< BXT_POWER_CYCLE_DELAY_SHIFT);
 	} else {
 		pp_div = ((100 * div)/2 - 1) << PP_REFERENCE_DIVIDER_SHIFT;
@@ -5714,6 +5819,7 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
 	struct drm_device *dev = intel_encoder->base.dev;
 	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct drm_display_mode *fixed_mode = NULL;
+	struct drm_display_mode *alt_fixed_mode = NULL;
 	struct drm_display_mode *downclock_mode = NULL;
 	bool has_dpcd;
 	struct drm_display_mode *scan;
@@ -5769,13 +5875,14 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
 	}
 	intel_connector->edid = edid;
 
-	/* prefer fixed mode from EDID if available */
+	/* prefer fixed mode from EDID if available, save an alt mode also */
 	list_for_each_entry(scan, &connector->probed_modes, head) {
 		if ((scan->type & DRM_MODE_TYPE_PREFERRED)) {
 			fixed_mode = drm_mode_duplicate(dev, scan);
 			downclock_mode = intel_dp_drrs_init(
 						intel_connector, fixed_mode);
-			break;
+		} else if (!alt_fixed_mode) {
+			alt_fixed_mode = drm_mode_duplicate(dev, scan);
 		}
 	}
 
@@ -5812,7 +5919,8 @@ static bool intel_edp_init_connector(struct intel_dp *intel_dp,
 			      pipe_name(pipe));
 	}
 
-	intel_panel_init(&intel_connector->panel, fixed_mode, downclock_mode);
+	intel_panel_init(&intel_connector->panel, fixed_mode, alt_fixed_mode,
+			 downclock_mode);
 	intel_connector->panel.backlight.power = intel_edp_backlight_power;
 	intel_panel_setup_backlight(connector, pipe);
 
@@ -5838,26 +5946,22 @@ intel_dp_init_connector_port_info(struct intel_digital_port *intel_dig_port)
 	struct intel_encoder *encoder = &intel_dig_port->base;
 	struct intel_dp *intel_dp = &intel_dig_port->dp;
 
+	encoder->hpd_pin = intel_hpd_pin(intel_dig_port->port);
+
 	switch (intel_dig_port->port) {
 	case PORT_A:
-		encoder->hpd_pin = HPD_PORT_A;
 		intel_dp->aux_power_domain = POWER_DOMAIN_AUX_A;
 		break;
 	case PORT_B:
-		encoder->hpd_pin = HPD_PORT_B;
 		intel_dp->aux_power_domain = POWER_DOMAIN_AUX_B;
 		break;
 	case PORT_C:
-		encoder->hpd_pin = HPD_PORT_C;
 		intel_dp->aux_power_domain = POWER_DOMAIN_AUX_C;
 		break;
 	case PORT_D:
-		encoder->hpd_pin = HPD_PORT_D;
 		intel_dp->aux_power_domain = POWER_DOMAIN_AUX_D;
 		break;
 	case PORT_E:
-		encoder->hpd_pin = HPD_PORT_E;
-
 		/* FIXME: Check VBT for actual wiring of PORT E */
 		intel_dp->aux_power_domain = POWER_DOMAIN_AUX_D;
 		break;
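
Note on the T11_T12 hunks above: the unit handling can be restated as a small standalone sketch. It is not driver code. The helper names (t12_from_vbt, t12_to_hw_field, hw_field_to_t12) and the two macros are hypothetical and chosen only for illustration. The numbers (a 100 usec software unit, a 100 ms hardware step, the 800 ms quirk floor, and the extra 100 ms) are taken from the comments and expressions in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Software keeps T11_T12 in units of 100 usec, so 1 ms is 10 units. */
#define T12_UNITS_PER_MS      10u
/* The BXT/CNP power cycle delay field counts in steps of 100 ms. */
#define T12_UNITS_PER_HW_STEP 1000u

/* Hypothetical helper: apply the 800 ms quirk floor, then add 100 ms. */
static uint16_t t12_from_vbt(uint16_t vbt_t12, int increase_t12_quirk)
{
	if (increase_t12_quirk && vbt_t12 < 800u * T12_UNITS_PER_MS)
		vbt_t12 = 800u * T12_UNITS_PER_MS;

	/* Guard against wrapping the 16-bit value in this sketch. */
	assert(vbt_t12 <= UINT16_MAX - 100u * T12_UNITS_PER_MS);
	return (uint16_t)(vbt_t12 + 100u * T12_UNITS_PER_MS);
}

/* Hypothetical helper: round up to whole 100 ms steps for the register. */
static uint32_t t12_to_hw_field(uint16_t t12)
{
	return (t12 + T12_UNITS_PER_HW_STEP - 1) / T12_UNITS_PER_HW_STEP;
}

/* Hypothetical helper: readout direction, field back to 100 usec units. */
static uint16_t hw_field_to_t12(uint32_t field)
{
	return (uint16_t)(field * T12_UNITS_PER_HW_STEP);
}
```

With a VBT value of 5000 (500 ms), the sketch yields 6000 without the quirk and 9000 with it, which round to register fields of 6 and 9.
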
diff --git a/drivers/gpu/drm/i915/intel_dp_aux_backlight.c b/drivers/gpu/drm/i915/intel_dp_aux_backlight.c
index 228ca06..d2830ba 100644
--- a/drivers/gpu/drm/i915/intel_dp_aux_backlight.c
+++ b/drivers/gpu/drm/i915/intel_dp_aux_backlight.c
@@ -98,13 +98,87 @@ intel_dp_aux_set_backlight(const struct drm_connector_state *conn_state, u32 lev
 	}
 }
 
+/*
+ * Set PWM Frequency divider to match desired frequency in vbt.
+ * The PWM Frequency is calculated as 27Mhz / (F x P).
+ * - Where F = PWM Frequency Pre-Divider value programmed by field 7:0 of the
+ *             EDP_BACKLIGHT_FREQ_SET register (DPCD Address 00728h)
+ * - Where P = 2^Pn, where Pn is the value programmed by field 4:0 of the
+ *             EDP_PWMGEN_BIT_COUNT register (DPCD Address 00724h)
+ */
+static bool intel_dp_aux_set_pwm_freq(struct intel_connector *connector)
+{
+	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
+	struct intel_dp *intel_dp = enc_to_intel_dp(&connector->encoder->base);
+	int freq, fxp, fxp_min, fxp_max, fxp_actual, f = 1;
+	u8 pn, pn_min, pn_max;
+
+	/* Find desired value of (F x P)
+	 * Note that if F x P is out of the supported range, the maximum or
+	 * minimum value will be applied automatically, so there is no need to check that.
+	 */
+	freq = dev_priv->vbt.backlight.pwm_freq_hz;
+	DRM_DEBUG_KMS("VBT defined backlight frequency %u Hz\n", freq);
+	if (!freq) {
+		DRM_DEBUG_KMS("Use panel default backlight frequency\n");
+		return false;
+	}
+
+	fxp = DIV_ROUND_CLOSEST(KHz(DP_EDP_BACKLIGHT_FREQ_BASE_KHZ), freq);
+
+	/* Use highest possible value of Pn for more granularity of brightness
+	 * adjustment while satisfying the conditions below.
+	 * - Pn is in the range of Pn_min and Pn_max
+	 * - F is in the range of 1 and 255
+	 * - FxP is within 25% of desired value.
+	 *   Note: 25% is arbitrary value and may need some tweak.
+	 */
+	if (drm_dp_dpcd_readb(&intel_dp->aux,
+			       DP_EDP_PWMGEN_BIT_COUNT_CAP_MIN, &pn_min) != 1) {
+		DRM_DEBUG_KMS("Failed to read pwmgen bit count cap min\n");
+		return false;
+	}
+	if (drm_dp_dpcd_readb(&intel_dp->aux,
+			       DP_EDP_PWMGEN_BIT_COUNT_CAP_MAX, &pn_max) != 1) {
+		DRM_DEBUG_KMS("Failed to read pwmgen bit count cap max\n");
+		return false;
+	}
+	pn_min &= DP_EDP_PWMGEN_BIT_COUNT_MASK;
+	pn_max &= DP_EDP_PWMGEN_BIT_COUNT_MASK;
+
+	fxp_min = DIV_ROUND_CLOSEST(fxp * 3, 4);
+	fxp_max = DIV_ROUND_CLOSEST(fxp * 5, 4);
+	if (fxp_min < (1 << pn_min) || (255 << pn_max) < fxp_max) {
+		DRM_DEBUG_KMS("VBT defined backlight frequency out of range\n");
+		return false;
+	}
+
+	for (pn = pn_max; pn >= pn_min; pn--) {
+		f = clamp(DIV_ROUND_CLOSEST(fxp, 1 << pn), 1, 255);
+		fxp_actual = f << pn;
+		if (fxp_min <= fxp_actual && fxp_actual <= fxp_max)
+			break;
+	}
+
+	if (drm_dp_dpcd_writeb(&intel_dp->aux,
+			       DP_EDP_PWMGEN_BIT_COUNT, pn) < 0) {
+		DRM_DEBUG_KMS("Failed to write aux pwmgen bit count\n");
+		return false;
+	}
+	if (drm_dp_dpcd_writeb(&intel_dp->aux,
+			       DP_EDP_BACKLIGHT_FREQ_SET, (u8) f) < 0) {
+		DRM_DEBUG_KMS("Failed to write aux backlight freq\n");
+		return false;
+	}
+	return true;
+}
+
 static void intel_dp_aux_enable_backlight(const struct intel_crtc_state *crtc_state,
 					  const struct drm_connector_state *conn_state)
 {
 	struct intel_connector *connector = to_intel_connector(conn_state->connector);
 	struct intel_dp *intel_dp = enc_to_intel_dp(&connector->encoder->base);
-	uint8_t dpcd_buf = 0;
-	uint8_t edp_backlight_mode = 0;
+	uint8_t dpcd_buf, new_dpcd_buf, edp_backlight_mode;
 
 	if (drm_dp_dpcd_readb(&intel_dp->aux,
 			DP_EDP_BACKLIGHT_MODE_SET_REGISTER, &dpcd_buf) != 1) {
@@ -113,18 +187,15 @@ static void intel_dp_aux_enable_backlight(const struct intel_crtc_state *crtc_st
 		return;
 	}
 
+	new_dpcd_buf = dpcd_buf;
 	edp_backlight_mode = dpcd_buf & DP_EDP_BACKLIGHT_CONTROL_MODE_MASK;
 
 	switch (edp_backlight_mode) {
 	case DP_EDP_BACKLIGHT_CONTROL_MODE_PWM:
 	case DP_EDP_BACKLIGHT_CONTROL_MODE_PRESET:
 	case DP_EDP_BACKLIGHT_CONTROL_MODE_PRODUCT:
-		dpcd_buf &= ~DP_EDP_BACKLIGHT_CONTROL_MODE_MASK;
-		dpcd_buf |= DP_EDP_BACKLIGHT_CONTROL_MODE_DPCD;
-		if (drm_dp_dpcd_writeb(&intel_dp->aux,
-			DP_EDP_BACKLIGHT_MODE_SET_REGISTER, dpcd_buf) < 0) {
-			DRM_DEBUG_KMS("Failed to write aux backlight mode\n");
-		}
+		new_dpcd_buf &= ~DP_EDP_BACKLIGHT_CONTROL_MODE_MASK;
+		new_dpcd_buf |= DP_EDP_BACKLIGHT_CONTROL_MODE_DPCD;
 		break;
 
 	/* Do nothing when it is already DPCD mode */
@@ -133,6 +204,17 @@ static void intel_dp_aux_enable_backlight(const struct intel_crtc_state *crtc_st
 		break;
 	}
 
+	if (intel_dp->edp_dpcd[2] & DP_EDP_BACKLIGHT_FREQ_AUX_SET_CAP)
+		if (intel_dp_aux_set_pwm_freq(connector))
+			new_dpcd_buf |= DP_EDP_BACKLIGHT_FREQ_AUX_SET_ENABLE;
+
+	if (new_dpcd_buf != dpcd_buf) {
+		if (drm_dp_dpcd_writeb(&intel_dp->aux,
+			DP_EDP_BACKLIGHT_MODE_SET_REGISTER, new_dpcd_buf) < 0) {
+			DRM_DEBUG_KMS("Failed to write aux backlight mode\n");
+		}
+	}
+
 	set_aux_backlight_enable(intel_dp, true);
 	intel_dp_aux_set_backlight(conn_state, connector->panel.backlight.level);
 }
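
Note on intel_dp_aux_set_pwm_freq() above: the divider search can be tried outside the kernel with the sketch below. It follows the same arithmetic as the patch (27 MHz base, F in 1..255, P = 2^Pn, a 25% tolerance window, highest Pn preferred), but everything else is an assumption made for illustration: the names pick_pwm_divider, struct pwm_divider and PWM_BASE_HZ are hypothetical, the Pn limits are passed in instead of being read over DPCD, and nothing is written to the panel. Unlike the patch, the sketch uses a signed loop counter and returns false when no Pn in the range fits the window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PWM_BASE_HZ 27000000 /* 27 MHz base clock named in the patch comment */

struct pwm_divider {
	int f;  /* pre-divider F, 1..255 */
	int pn; /* bit count Pn, so that P = 2^Pn */
};

static int div_round_closest(int n, int d)
{
	return (n + d / 2) / d; /* valid for the positive values used here */
}

static int clamp_int(int v, int lo, int hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical helper: choose F and Pn so that F * 2^Pn ~= 27 MHz / freq. */
static bool pick_pwm_divider(int freq_hz, int pn_min, int pn_max,
			     struct pwm_divider *out)
{
	int fxp, fxp_min, fxp_max, pn;

	assert(out != NULL);
	/* pn_max <= 23 keeps 255 << pn_max inside a 32-bit int. */
	if (freq_hz <= 0 || pn_min < 0 || pn_max > 23 || pn_min > pn_max)
		return false;

	fxp = div_round_closest(PWM_BASE_HZ, freq_hz);
	fxp_min = div_round_closest(fxp * 3, 4); /* desired value - 25% */
	fxp_max = div_round_closest(fxp * 5, 4); /* desired value + 25% */
	if (fxp_min < (1 << pn_min) || (255 << pn_max) < fxp_max)
		return false;

	/* Highest Pn first: more bits means finer brightness steps. */
	for (pn = pn_max; pn >= pn_min; pn--) {
		int f = clamp_int(div_round_closest(fxp, 1 << pn), 1, 255);
		int fxp_actual = f << pn;

		if (fxp_min <= fxp_actual && fxp_actual <= fxp_max) {
			out->f = f;
			out->pn = pn;
			return true;
		}
	}
	return false;
}
```

For a requested 200 Hz with Pn limited to 8..16, the sketch selects F = 2 and Pn = 16, that is F x P = 131072, or roughly 206 Hz.
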
diff --git a/drivers/gpu/drm/i915/intel_dp_link_training.c b/drivers/gpu/drm/i915/intel_dp_link_training.c
index b79c1c0..05907fa 100644
--- a/drivers/gpu/drm/i915/intel_dp_link_training.c
+++ b/drivers/gpu/drm/i915/intel_dp_link_training.c
@@ -321,12 +321,16 @@ intel_dp_start_link_train(struct intel_dp *intel_dp)
 	if (!intel_dp_link_training_channel_equalization(intel_dp))
 		goto failure_handling;
 
-	DRM_DEBUG_KMS("Link Training Passed at Link Rate = %d, Lane count = %d",
+	DRM_DEBUG_KMS("[CONNECTOR:%d:%s] Link Training Passed at Link Rate = %d, Lane count = %d",
+		      intel_connector->base.base.id,
+		      intel_connector->base.name,
 		      intel_dp->link_rate, intel_dp->lane_count);
 	return;
 
  failure_handling:
-	DRM_DEBUG_KMS("Link Training failed at link rate = %d, lane count = %d",
+	DRM_DEBUG_KMS("[CONNECTOR:%d:%s] Link Training failed at link rate = %d, lane count = %d",
+		      intel_connector->base.base.id,
+		      intel_connector->base.name,
 		      intel_dp->link_rate, intel_dp->lane_count);
 	if (!intel_dp_get_link_train_fallback_values(intel_dp,
 						     intel_dp->link_rate,
diff --git a/drivers/gpu/drm/i915/intel_dp_mst.c b/drivers/gpu/drm/i915/intel_dp_mst.c
index 2cf046b..93fc8ab 100644
--- a/drivers/gpu/drm/i915/intel_dp_mst.c
+++ b/drivers/gpu/drm/i915/intel_dp_mst.c
@@ -346,10 +346,8 @@ intel_dp_mst_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs intel_dp_mst_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = intel_dp_mst_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.late_register = intel_connector_register,
 	.early_unregister = intel_connector_unregister,
 	.destroy = intel_dp_mst_connector_destroy,
@@ -372,6 +370,9 @@ intel_dp_mst_mode_valid(struct drm_connector *connector,
 	int bpp = 24; /* MST uses fixed bpp */
 	int max_rate, mode_rate, max_lanes, max_link_clock;
 
+	if (!intel_dp)
+		return MODE_ERROR;
+
 	max_link_clock = intel_dp_max_link_rate(intel_dp);
 	max_lanes = intel_dp_max_lane_count(intel_dp);
 
@@ -443,28 +444,6 @@ static bool intel_dp_mst_get_hw_state(struct intel_connector *connector)
 	return false;
 }
 
-static void intel_connector_add_to_fbdev(struct intel_connector *connector)
-{
-#ifdef CONFIG_DRM_FBDEV_EMULATION
-	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
-
-	if (dev_priv->fbdev)
-		drm_fb_helper_add_one_connector(&dev_priv->fbdev->helper,
-						&connector->base);
-#endif
-}
-
-static void intel_connector_remove_from_fbdev(struct intel_connector *connector)
-{
-#ifdef CONFIG_DRM_FBDEV_EMULATION
-	struct drm_i915_private *dev_priv = to_i915(connector->base.dev);
-
-	if (dev_priv->fbdev)
-		drm_fb_helper_remove_one_connector(&dev_priv->fbdev->helper,
-						   &connector->base);
-#endif
-}
-
 static struct drm_connector *intel_dp_add_mst_connector(struct drm_dp_mst_topology_mgr *mgr, struct drm_dp_mst_port *port, const char *pathprop)
 {
 	struct intel_dp *intel_dp = container_of(mgr, struct intel_dp, mst_mgr);
@@ -500,31 +479,32 @@ static struct drm_connector *intel_dp_add_mst_connector(struct drm_dp_mst_topolo
 
 static void intel_dp_register_mst_connector(struct drm_connector *connector)
 {
-	struct intel_connector *intel_connector = to_intel_connector(connector);
-	struct drm_device *dev = connector->dev;
+	struct drm_i915_private *dev_priv = to_i915(connector->dev);
 
-	drm_modeset_lock_all(dev);
-	intel_connector_add_to_fbdev(intel_connector);
-	drm_modeset_unlock_all(dev);
+	if (dev_priv->fbdev)
+		drm_fb_helper_add_one_connector(&dev_priv->fbdev->helper,
+						connector);
 
-	drm_connector_register(&intel_connector->base);
+	drm_connector_register(connector);
 }
 
 static void intel_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
 					   struct drm_connector *connector)
 {
 	struct intel_connector *intel_connector = to_intel_connector(connector);
-	struct drm_device *dev = connector->dev;
+	struct drm_i915_private *dev_priv = to_i915(connector->dev);
 
 	drm_connector_unregister(connector);
 
-	/* need to nuke the connector */
-	drm_modeset_lock_all(dev);
-	intel_connector_remove_from_fbdev(intel_connector);
+	if (dev_priv->fbdev)
+		drm_fb_helper_remove_one_connector(&dev_priv->fbdev->helper,
+						   connector);
+	/* prevent race with the check in ->detect */
+	drm_modeset_lock(&connector->dev->mode_config.connection_mutex, NULL);
 	intel_connector->mst_port = NULL;
-	drm_modeset_unlock_all(dev);
+	drm_modeset_unlock(&connector->dev->mode_config.connection_mutex);
 
-	drm_connector_unreference(&intel_connector->base);
+	drm_connector_unreference(connector);
 	DRM_DEBUG_KMS("\n");
 }
 
diff --git a/drivers/gpu/drm/i915/intel_dpll_mgr.c b/drivers/gpu/drm/i915/intel_dpll_mgr.c
index 2f7b0e6..a2a3d93 100644
--- a/drivers/gpu/drm/i915/intel_dpll_mgr.c
+++ b/drivers/gpu/drm/i915/intel_dpll_mgr.c
@@ -2379,6 +2379,15 @@ cnl_get_dpll(struct intel_crtc *crtc, struct intel_crtc_state *crtc_state,
 	return pll;
 }
 
+static void cnl_dump_hw_state(struct drm_i915_private *dev_priv,
+			      struct intel_dpll_hw_state *hw_state)
+{
+	DRM_DEBUG_KMS("dpll_hw_state: "
+		      "cfgcr0: 0x%x, cfgcr1: 0x%x\n",
+		      hw_state->cfgcr0,
+		      hw_state->cfgcr1);
+}
+
 static const struct intel_shared_dpll_funcs cnl_ddi_pll_funcs = {
 	.enable = cnl_ddi_pll_enable,
 	.disable = cnl_ddi_pll_disable,
@@ -2395,7 +2404,7 @@ static const struct dpll_info cnl_plls[] = {
 static const struct intel_dpll_mgr cnl_pll_mgr = {
 	.dpll_info = cnl_plls,
 	.get_dpll = cnl_get_dpll,
-	.dump_hw_state = skl_dump_hw_state,
+	.dump_hw_state = cnl_dump_hw_state,
 };
 
 /**
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index d93efb4..fa47285 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -265,6 +265,7 @@ struct intel_encoder {
 
 struct intel_panel {
 	struct drm_display_mode *fixed_mode;
+	struct drm_display_mode *alt_fixed_mode;
 	struct drm_display_mode *downclock_mode;
 
 	/* backlight */
@@ -780,13 +781,15 @@ struct intel_crtc_state {
 
 	/* HDMI High TMDS char rate ratio */
 	bool hdmi_high_tmds_clock_ratio;
+
+	/* output format is YCBCR 4:2:0 */
+	bool ycbcr420;
 };
 
 struct intel_crtc {
 	struct drm_crtc base;
 	enum pipe pipe;
 	enum plane plane;
-	u8 lut_r[256], lut_g[256], lut_b[256];
 	/*
 	 * Whether the crtc and the connected output pipeline is active. Implies
 	 * that crtc->enabled is set, i.e. the current mode configuration has
@@ -797,9 +800,6 @@ struct intel_crtc {
 	u8 plane_ids_mask;
 	unsigned long long enabled_power_domains;
 	struct intel_overlay *overlay;
-	struct intel_flip_work *flip_work;
-
-	atomic_t unpin_work_count;
 
 	/* Display surface base address adjustement for pageflips. Note that on
 	 * gen4+ this only adjusts up to a tile, offsets within a tile are
@@ -1132,24 +1132,6 @@ intel_get_crtc_for_plane(struct drm_i915_private *dev_priv, enum plane plane)
 	return dev_priv->plane_to_crtc_mapping[plane];
 }
 
-struct intel_flip_work {
-	struct work_struct unpin_work;
-	struct work_struct mmio_work;
-
-	struct drm_crtc *crtc;
-	struct i915_vma *old_vma;
-	struct drm_framebuffer *old_fb;
-	struct drm_i915_gem_object *pending_flip_obj;
-	struct drm_pending_vblank_event *event;
-	atomic_t pending;
-	u32 flip_count;
-	u32 gtt_offset;
-	struct drm_i915_gem_request *flip_queued_req;
-	u32 flip_queued_vblank;
-	u32 flip_ready_vblank;
-	unsigned int rotation;
-};
-
 struct intel_load_detect_pipe {
 	struct drm_atomic_state *restore_state;
 };
@@ -1211,12 +1193,12 @@ hdmi_to_dig_port(struct intel_hdmi *intel_hdmi)
 bool intel_set_cpu_fifo_underrun_reporting(struct drm_i915_private *dev_priv,
 					   enum pipe pipe, bool enable);
 bool intel_set_pch_fifo_underrun_reporting(struct drm_i915_private *dev_priv,
-					   enum transcoder pch_transcoder,
+					   enum pipe pch_transcoder,
 					   bool enable);
 void intel_cpu_fifo_underrun_irq_handler(struct drm_i915_private *dev_priv,
 					 enum pipe pipe);
 void intel_pch_fifo_underrun_irq_handler(struct drm_i915_private *dev_priv,
-					 enum transcoder pch_transcoder);
+					 enum pipe pch_transcoder);
 void intel_check_cpu_fifo_underruns(struct drm_i915_private *dev_priv);
 void intel_check_pch_fifo_underruns(struct drm_i915_private *dev_priv);
 
@@ -1251,9 +1233,9 @@ static inline bool intel_irqs_enabled(struct drm_i915_private *dev_priv)
 
 int intel_get_crtc_scanline(struct intel_crtc *crtc);
 void gen8_irq_power_well_post_enable(struct drm_i915_private *dev_priv,
-				     unsigned int pipe_mask);
+				     u8 pipe_mask);
 void gen8_irq_power_well_pre_disable(struct drm_i915_private *dev_priv,
-				     unsigned int pipe_mask);
+				     u8 pipe_mask);
 void gen9_reset_guc_interrupts(struct drm_i915_private *dev_priv);
 void gen9_enable_guc_interrupts(struct drm_i915_private *dev_priv);
 void gen9_disable_guc_interrupts(struct drm_i915_private *dev_priv);
@@ -1326,7 +1308,7 @@ void intel_set_cdclk(struct drm_i915_private *dev_priv,
 /* intel_display.c */
 void i830_enable_pipe(struct drm_i915_private *dev_priv, enum pipe pipe);
 void i830_disable_pipe(struct drm_i915_private *dev_priv, enum pipe pipe);
-enum transcoder intel_crtc_pch_transcoder(struct intel_crtc *crtc);
+enum pipe intel_crtc_pch_transcoder(struct intel_crtc *crtc);
 void intel_update_rawclk(struct drm_i915_private *dev_priv);
 int vlv_get_hpll_vco(struct drm_i915_private *dev_priv);
 int vlv_get_cck_clock(struct drm_i915_private *dev_priv,
@@ -1335,7 +1317,6 @@ int vlv_get_cck_clock_hpll(struct drm_i915_private *dev_priv,
 			   const char *name, u32 reg);
 void lpt_disable_pch_transcoder(struct drm_i915_private *dev_priv);
 void lpt_disable_iclkip(struct drm_i915_private *dev_priv);
-extern const struct drm_plane_funcs intel_plane_funcs;
 void intel_init_display_hooks(struct drm_i915_private *dev_priv);
 unsigned int intel_fb_xy_to_linear(int x, int y,
 				   const struct intel_plane_state *state,
@@ -1408,9 +1389,6 @@ void intel_unpin_fb_vma(struct i915_vma *vma);
 struct drm_framebuffer *
 intel_framebuffer_create(struct drm_i915_gem_object *obj,
 			 struct drm_mode_fb_cmd2 *mode_cmd);
-void intel_finish_page_flip_cs(struct drm_i915_private *dev_priv, int pipe);
-void intel_finish_page_flip_mmio(struct drm_i915_private *dev_priv, int pipe);
-void intel_check_page_flip(struct drm_i915_private *dev_priv, int pipe);
 int intel_prepare_plane_fb(struct drm_plane *plane,
 			   struct drm_plane_state *new_state);
 void intel_cleanup_plane_fb(struct drm_plane *plane,
@@ -1597,7 +1575,8 @@ void intel_hpd_poll_init(struct drm_i915_private *dev_priv);
 #ifdef CONFIG_DRM_FBDEV_EMULATION
 extern int intel_fbdev_init(struct drm_device *dev);
 extern void intel_fbdev_initial_config_async(struct drm_device *dev);
-extern void intel_fbdev_fini(struct drm_device *dev);
+extern void intel_fbdev_unregister(struct drm_i915_private *dev_priv);
+extern void intel_fbdev_fini(struct drm_i915_private *dev_priv);
 extern void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous);
 extern void intel_fbdev_output_poll_changed(struct drm_device *dev);
 extern void intel_fbdev_restore_mode(struct drm_device *dev);
@@ -1611,7 +1590,11 @@ static inline void intel_fbdev_initial_config_async(struct drm_device *dev)
 {
 }
 
-static inline void intel_fbdev_fini(struct drm_device *dev)
+static inline void intel_fbdev_unregister(struct drm_i915_private *dev_priv)
+{
+}
+
+static inline void intel_fbdev_fini(struct drm_i915_private *dev_priv)
 {
 }
 
@@ -1696,6 +1679,7 @@ void intel_overlay_reset(struct drm_i915_private *dev_priv);
 /* intel_panel.c */
 int intel_panel_init(struct intel_panel *panel,
 		     struct drm_display_mode *fixed_mode,
+		     struct drm_display_mode *alt_fixed_mode,
 		     struct drm_display_mode *downclock_mode);
 void intel_panel_fini(struct intel_panel *panel);
 void intel_fixed_panel_mode(const struct drm_display_mode *fixed_mode,
@@ -1858,9 +1842,8 @@ void intel_suspend_gt_powersave(struct drm_i915_private *dev_priv);
 void gen6_rps_busy(struct drm_i915_private *dev_priv);
 void gen6_rps_reset_ei(struct drm_i915_private *dev_priv);
 void gen6_rps_idle(struct drm_i915_private *dev_priv);
-void gen6_rps_boost(struct drm_i915_private *dev_priv,
-		    struct intel_rps_client *rps,
-		    unsigned long submitted);
+void gen6_rps_boost(struct drm_i915_gem_request *rq,
+		    struct intel_rps_client *rps);
 void intel_queue_rps_boost_for_request(struct drm_i915_gem_request *req);
 void g4x_wm_get_hw_state(struct drm_device *dev);
 void vlv_wm_get_hw_state(struct drm_device *dev);
@@ -1902,7 +1885,7 @@ struct intel_plane *intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 int intel_sprite_set_colorkey(struct drm_device *dev, void *data,
 			      struct drm_file *file_priv);
 void intel_pipe_update_start(struct intel_crtc *crtc);
-void intel_pipe_update_end(struct intel_crtc *crtc, struct intel_flip_work *work);
+void intel_pipe_update_end(struct intel_crtc *crtc);
 
 /* intel_tv.c */
 void intel_tv_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_dsi.c b/drivers/gpu/drm/i915/intel_dsi.c
index 50ec836..f0c11ae 100644
--- a/drivers/gpu/drm/i915/intel_dsi.c
+++ b/drivers/gpu/drm/i915/intel_dsi.c
@@ -1653,12 +1653,10 @@ static const struct drm_connector_helper_funcs intel_dsi_connector_helper_funcs
 };
 
 static const struct drm_connector_funcs intel_dsi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.late_register = intel_connector_register,
 	.early_unregister = intel_connector_unregister,
 	.destroy = intel_dsi_connector_destroy,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.atomic_get_property = intel_digital_connector_atomic_get_property,
 	.atomic_set_property = intel_digital_connector_atomic_set_property,
 	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
@@ -1851,7 +1849,7 @@ void intel_dsi_init(struct drm_i915_private *dev_priv)
 	connector->display_info.width_mm = fixed_mode->width_mm;
 	connector->display_info.height_mm = fixed_mode->height_mm;
 
-	intel_panel_init(&intel_connector->panel, fixed_mode, NULL);
+	intel_panel_init(&intel_connector->panel, fixed_mode, NULL, NULL);
 	intel_panel_setup_backlight(connector, INVALID_PIPE);
 
 	intel_dsi_add_properties(intel_connector);
diff --git a/drivers/gpu/drm/i915/intel_dvo.c b/drivers/gpu/drm/i915/intel_dvo.c
index c1544a5..c0a0272 100644
--- a/drivers/gpu/drm/i915/intel_dvo.c
+++ b/drivers/gpu/drm/i915/intel_dvo.c
@@ -344,13 +344,11 @@ static void intel_dvo_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs intel_dvo_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = intel_dvo_detect,
 	.late_register = intel_connector_register,
 	.early_unregister = intel_connector_unregister,
 	.destroy = intel_dvo_destroy,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
 };
@@ -554,7 +552,7 @@ void intel_dvo_init(struct drm_i915_private *dev_priv)
 			 */
 			intel_panel_init(&intel_connector->panel,
 					 intel_dvo_get_current_mode(connector),
-					 NULL);
+					 NULL, NULL);
 			intel_dvo->panel_wants_dither = true;
 		}
 
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index 5b4de71..9ab5969 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -149,6 +149,7 @@ __intel_engine_context_size(struct drm_i915_private *dev_priv, u8 class)
 		switch (INTEL_GEN(dev_priv)) {
 		default:
 			MISSING_CASE(INTEL_GEN(dev_priv));
+		case 10:
 		case 9:
 			return GEN9_LR_CONTEXT_RENDER_SIZE;
 		case 8:
@@ -291,11 +292,9 @@ int intel_engines_init_mmio(struct drm_i915_private *dev_priv)
  */
 int intel_engines_init(struct drm_i915_private *dev_priv)
 {
-	struct intel_device_info *device_info = mkwrite_device_info(dev_priv);
 	struct intel_engine_cs *engine;
 	enum intel_engine_id id, err_id;
-	unsigned int mask = 0;
-	int err = 0;
+	int err;
 
 	for_each_engine(engine, dev_priv, id) {
 		const struct engine_class_info *class_info =
@@ -306,40 +305,30 @@ int intel_engines_init(struct drm_i915_private *dev_priv)
 			init = class_info->init_execlists;
 		else
 			init = class_info->init_legacy;
-		if (!init) {
-			kfree(engine);
-			dev_priv->engine[id] = NULL;
-			continue;
-		}
+
+		err = -EINVAL;
+		err_id = id;
+
+		if (GEM_WARN_ON(!init))
+			goto cleanup;
 
 		err = init(engine);
-		if (err) {
-			err_id = id;
+		if (err)
 			goto cleanup;
-		}
 
 		GEM_BUG_ON(!engine->submit_request);
-		mask |= ENGINE_MASK(id);
 	}
 
-	/*
-	 * Catch failures to update intel_engines table when the new engines
-	 * are added to the driver by a warning and disabling the forgotten
-	 * engines.
-	 */
-	if (WARN_ON(mask != INTEL_INFO(dev_priv)->ring_mask))
-		device_info->ring_mask = mask;
-
-	device_info->num_rings = hweight32(mask);
-
 	return 0;
 
 cleanup:
 	for_each_engine(engine, dev_priv, id) {
-		if (id >= err_id)
+		if (id >= err_id) {
 			kfree(engine);
-		else
+			dev_priv->engine[id] = NULL;
+		} else {
 			dev_priv->gt.cleanup_engine(engine);
+		}
 	}
 	return err;
 }
@@ -348,9 +337,6 @@ void intel_engine_init_global_seqno(struct intel_engine_cs *engine, u32 seqno)
 {
 	struct drm_i915_private *dev_priv = engine->i915;
 
-	GEM_BUG_ON(!intel_engine_is_idle(engine));
-	GEM_BUG_ON(i915_gem_active_isset(&engine->timeline->last_request));
-
 	/* Our semaphore implementation is strictly monotonic (i.e. we proceed
 	 * so long as the semaphore value in the register/page is greater
 	 * than the sync value), so whenever we reset the seqno,
@@ -1294,6 +1280,10 @@ bool intel_engine_is_idle(struct intel_engine_cs *engine)
 	if (port_request(&engine->execlist_port[0]))
 		return false;
 
+	/* ELSP is empty, but there are ready requests? */
+	if (READ_ONCE(engine->execlist_first))
+		return false;
+
 	/* Ring stopped? */
 	if (!ring_is_idle(engine))
 		return false;
@@ -1340,6 +1330,7 @@ void intel_engines_mark_idle(struct drm_i915_private *i915)
 	for_each_engine(engine, i915, id) {
 		intel_engine_disarm_breadcrumbs(engine);
 		i915_gem_batch_pool_fini(&engine->batch_pool);
+		tasklet_kill(&engine->irq_tasklet);
 		engine->no_priolist = false;
 	}
 }
diff --git a/drivers/gpu/drm/i915/intel_fbc.c b/drivers/gpu/drm/i915/intel_fbc.c
index 860b8c2..3fca9fa 100644
--- a/drivers/gpu/drm/i915/intel_fbc.c
+++ b/drivers/gpu/drm/i915/intel_fbc.c
@@ -461,6 +461,8 @@ static void intel_fbc_schedule_activation(struct intel_crtc *crtc)
 	struct intel_fbc_work *work = &fbc->work;
 
 	WARN_ON(!mutex_is_locked(&fbc->lock));
+	if (WARN_ON(!fbc->enabled))
+		return;
 
 	if (drm_crtc_vblank_get(&crtc->base)) {
 		DRM_ERROR("vblank not available for FBC on pipe %c\n",
@@ -1216,7 +1218,7 @@ static void intel_fbc_underrun_work_fn(struct work_struct *work)
 	mutex_lock(&fbc->lock);
 
 	/* Maybe we were scheduled twice. */
-	if (fbc->underrun_detected)
+	if (fbc->underrun_detected || !fbc->enabled)
 		goto out;
 
 	DRM_DEBUG_KMS("Disabling FBC due to FIFO underrun.\n");
diff --git a/drivers/gpu/drm/i915/intel_fbdev.c b/drivers/gpu/drm/i915/intel_fbdev.c
index 0c4cde6..262e75c 100644
--- a/drivers/gpu/drm/i915/intel_fbdev.c
+++ b/drivers/gpu/drm/i915/intel_fbdev.c
@@ -232,7 +232,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
 
 	strcpy(info->fix.id, "inteldrmfb");
 
-	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &intelfb_ops;
 
 	/* setup aperture base/size for vesafb takeover */
@@ -281,27 +280,6 @@ static int intelfb_create(struct drm_fb_helper *helper,
 	return ret;
 }
 
-/** Sets the color ramps on behalf of RandR */
-static void intel_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-				    u16 blue, int regno)
-{
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-
-	intel_crtc->lut_r[regno] = red >> 8;
-	intel_crtc->lut_g[regno] = green >> 8;
-	intel_crtc->lut_b[regno] = blue >> 8;
-}
-
-static void intel_crtc_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-				    u16 *blue, int regno)
-{
-	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
-
-	*red = intel_crtc->lut_r[regno] << 8;
-	*green = intel_crtc->lut_g[regno] << 8;
-	*blue = intel_crtc->lut_b[regno] << 8;
-}
-
 static struct drm_fb_helper_crtc *
 intel_fb_helper_crtc(struct drm_fb_helper *fb_helper, struct drm_crtc *crtc)
 {
@@ -352,14 +330,20 @@ static bool intel_fb_initial_config(struct drm_fb_helper *fb_helper,
 	unsigned int count = min(fb_helper->connector_count, BITS_PER_LONG);
 	int i, j;
 	bool *save_enabled;
-	bool fallback = true;
+	bool fallback = true, ret = true;
 	int num_connectors_enabled = 0;
 	int num_connectors_detected = 0;
+	struct drm_modeset_acquire_ctx ctx;
 
 	save_enabled = kcalloc(count, sizeof(bool), GFP_KERNEL);
 	if (!save_enabled)
 		return false;
 
+	drm_modeset_acquire_init(&ctx, 0);
+
+	while (drm_modeset_lock_all_ctx(fb_helper->dev, &ctx) != 0)
+		drm_modeset_backoff(&ctx);
+
 	memcpy(save_enabled, enabled, count);
 	mask = GENMASK(count - 1, 0);
 	conn_configured = 0;
@@ -370,7 +354,6 @@ static bool intel_fb_initial_config(struct drm_fb_helper *fb_helper,
 		struct drm_connector *connector;
 		struct drm_encoder *encoder;
 		struct drm_fb_helper_crtc *new_crtc;
-		struct intel_crtc *intel_crtc;
 
 		fb_conn = fb_helper->connector_info[i];
 		connector = fb_conn->connector;
@@ -412,13 +395,6 @@ static bool intel_fb_initial_config(struct drm_fb_helper *fb_helper,
 
 		num_connectors_enabled++;
 
-		intel_crtc = to_intel_crtc(connector->state->crtc);
-		for (j = 0; j < 256; j++) {
-			intel_crtc->lut_r[j] = j;
-			intel_crtc->lut_g[j] = j;
-			intel_crtc->lut_b[j] = j;
-		}
-
 		new_crtc = intel_fb_helper_crtc(fb_helper,
 						connector->state->crtc);
 
@@ -509,18 +485,18 @@ static bool intel_fb_initial_config(struct drm_fb_helper *fb_helper,
 bail:
 		DRM_DEBUG_KMS("Not using firmware configuration\n");
 		memcpy(enabled, save_enabled, count);
-		kfree(save_enabled);
-		return false;
+		ret = false;
 	}
 
+	drm_modeset_drop_locks(&ctx);
+	drm_modeset_acquire_fini(&ctx);
+
 	kfree(save_enabled);
-	return true;
+	return ret;
 }
 
 static const struct drm_fb_helper_funcs intel_fb_helper_funcs = {
 	.initial_config = intel_fb_initial_config,
-	.gamma_set = intel_crtc_fb_gamma_set,
-	.gamma_get = intel_crtc_fb_gamma_get,
 	.fb_probe = intelfb_create,
 };
 
@@ -531,8 +507,6 @@ static void intel_fbdev_destroy(struct intel_fbdev *ifbdev)
 	 * trying to rectify all the possible error paths leading here.
 	 */
 
-	drm_fb_helper_unregister_fbi(&ifbdev->helper);
-
 	drm_fb_helper_fini(&ifbdev->helper);
 
 	if (ifbdev->vma) {
@@ -720,8 +694,10 @@ static void intel_fbdev_initial_config(void *data, async_cookie_t cookie)
 
 	/* Due to peculiar init order wrt to hpd handling this is separate. */
 	if (drm_fb_helper_initial_config(&ifbdev->helper,
-					 ifbdev->preferred_bpp))
-		intel_fbdev_fini(ifbdev->helper.dev);
+					 ifbdev->preferred_bpp)) {
+		intel_fbdev_unregister(to_i915(ifbdev->helper.dev));
+		intel_fbdev_fini(to_i915(ifbdev->helper.dev));
+	}
 }
 
 void intel_fbdev_initial_config_async(struct drm_device *dev)
@@ -744,9 +720,8 @@ static void intel_fbdev_sync(struct intel_fbdev *ifbdev)
 	ifbdev->cookie = 0;
 }
 
-void intel_fbdev_fini(struct drm_device *dev)
+void intel_fbdev_unregister(struct drm_i915_private *dev_priv)
 {
-	struct drm_i915_private *dev_priv = to_i915(dev);
 	struct intel_fbdev *ifbdev = dev_priv->fbdev;
 
 	if (!ifbdev)
@@ -756,8 +731,17 @@ void intel_fbdev_fini(struct drm_device *dev)
 	if (!current_is_async())
 		intel_fbdev_sync(ifbdev);
 
+	drm_fb_helper_unregister_fbi(&ifbdev->helper);
+}
+
+void intel_fbdev_fini(struct drm_i915_private *dev_priv)
+{
+	struct intel_fbdev *ifbdev = fetch_and_zero(&dev_priv->fbdev);
+
+	if (!ifbdev)
+		return;
+
 	intel_fbdev_destroy(ifbdev);
-	dev_priv->fbdev = NULL;
 }
 
 void intel_fbdev_set_suspend(struct drm_device *dev, int state, bool synchronous)
@@ -813,7 +797,7 @@ void intel_fbdev_output_poll_changed(struct drm_device *dev)
 {
 	struct intel_fbdev *ifbdev = to_i915(dev)->fbdev;
 
-	if (ifbdev && ifbdev->vma)
+	if (ifbdev)
 		drm_fb_helper_hotplug_event(&ifbdev->helper);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_fifo_underrun.c b/drivers/gpu/drm/i915/intel_fifo_underrun.c
index d484862..5a7cca3 100644
--- a/drivers/gpu/drm/i915/intel_fifo_underrun.c
+++ b/drivers/gpu/drm/i915/intel_fifo_underrun.c
@@ -313,11 +313,11 @@ bool intel_set_cpu_fifo_underrun_reporting(struct drm_i915_private *dev_priv,
  * Returns the previous state of underrun reporting.
  */
 bool intel_set_pch_fifo_underrun_reporting(struct drm_i915_private *dev_priv,
-					   enum transcoder pch_transcoder,
+					   enum pipe pch_transcoder,
 					   bool enable)
 {
 	struct intel_crtc *crtc =
-		intel_get_crtc_for_pipe(dev_priv, (enum pipe) pch_transcoder);
+		intel_get_crtc_for_pipe(dev_priv, pch_transcoder);
 	unsigned long flags;
 	bool old;
 
@@ -390,7 +390,7 @@ void intel_cpu_fifo_underrun_irq_handler(struct drm_i915_private *dev_priv,
  * interrupt to avoid an irq storm.
  */
 void intel_pch_fifo_underrun_irq_handler(struct drm_i915_private *dev_priv,
-					 enum transcoder pch_transcoder)
+					 enum pipe pch_transcoder)
 {
 	if (intel_set_pch_fifo_underrun_reporting(dev_priv, pch_transcoder,
 						  false)) {
diff --git a/drivers/gpu/drm/i915/intel_hangcheck.c b/drivers/gpu/drm/i915/intel_hangcheck.c
index 9b0ece4..d9d87d9 100644
--- a/drivers/gpu/drm/i915/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/intel_hangcheck.c
@@ -324,7 +324,7 @@ hangcheck_get_action(struct intel_engine_cs *engine,
 	if (engine->hangcheck.seqno != hc->seqno)
 		return ENGINE_ACTIVE_SEQNO;
 
-	if (i915_seqno_passed(hc->seqno, intel_engine_last_submit(engine)))
+	if (intel_engine_is_idle(engine))
 		return ENGINE_IDLE;
 
 	return engine_stuck(engine, hc->acthd);
diff --git a/drivers/gpu/drm/i915/intel_hdmi.c b/drivers/gpu/drm/i915/intel_hdmi.c
index ec0779a..e8abea7 100644
--- a/drivers/gpu/drm/i915/intel_hdmi.c
+++ b/drivers/gpu/drm/i915/intel_hdmi.c
@@ -459,22 +459,31 @@ static void intel_hdmi_set_avi_infoframe(struct drm_encoder *encoder,
 	struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(encoder);
 	const struct drm_display_mode *adjusted_mode =
 		&crtc_state->base.adjusted_mode;
+	struct drm_connector *connector = &intel_hdmi->attached_connector->base;
+	bool is_hdmi2_sink = connector->display_info.hdmi.scdc.supported;
 	union hdmi_infoframe frame;
 	int ret;
 
 	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
-						       adjusted_mode);
+						       adjusted_mode,
+						       is_hdmi2_sink);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return;
 	}
 
+	if (crtc_state->ycbcr420)
+		frame.avi.colorspace = HDMI_COLORSPACE_YUV420;
+	else
+		frame.avi.colorspace = HDMI_COLORSPACE_RGB;
+
 	drm_hdmi_avi_infoframe_quant_range(&frame.avi, adjusted_mode,
 					   crtc_state->limited_color_range ?
 					   HDMI_QUANTIZATION_RANGE_LIMITED :
 					   HDMI_QUANTIZATION_RANGE_FULL,
 					   intel_hdmi->rgb_quant_range_selectable);
 
+	/* TODO: handle pixel repetition for YCBCR420 outputs */
 	intel_write_infoframe(encoder, crtc_state, &frame);
 }
 
@@ -1292,6 +1301,9 @@ intel_hdmi_mode_valid(struct drm_connector *connector,
 	if (mode->flags & DRM_MODE_FLAG_DBLCLK)
 		clock *= 2;
 
+	if (drm_mode_is_420_only(&connector->display_info, mode))
+		clock /= 2;
+
 	/* check if we can do 8bpc */
 	status = hdmi_port_clock_valid(hdmi, clock, true, force_dvi);
 
@@ -1321,14 +1333,21 @@ static bool hdmi_12bpc_possible(struct intel_crtc_state *crtc_state)
 	if (crtc_state->output_types != 1 << INTEL_OUTPUT_HDMI)
 		return false;
 
-	for_each_connector_in_state(state, connector, connector_state, i) {
+	for_each_new_connector_in_state(state, connector, connector_state, i) {
 		const struct drm_display_info *info = &connector->display_info;
 
 		if (connector_state->crtc != crtc_state->base.crtc)
 			continue;
 
-		if ((info->edid_hdmi_dc_modes & DRM_EDID_HDMI_DC_36) == 0)
-			return false;
+		if (crtc_state->ycbcr420) {
+			const struct drm_hdmi_info *hdmi = &info->hdmi;
+
+			if (!(hdmi->y420_dc_modes & DRM_EDID_YCBCR420_DC_36))
+				return false;
+		} else {
+			if (!(info->edid_hdmi_dc_modes & DRM_EDID_HDMI_DC_36))
+				return false;
+		}
 	}
 
 	/* Display Wa #1139 */
@@ -1339,6 +1358,36 @@ static bool hdmi_12bpc_possible(struct intel_crtc_state *crtc_state)
 	return true;
 }
 
+static bool
+intel_hdmi_ycbcr420_config(struct drm_connector *connector,
+			   struct intel_crtc_state *config,
+			   int *clock_12bpc, int *clock_8bpc)
+{
+	struct intel_crtc *intel_crtc = to_intel_crtc(config->base.crtc);
+
+	if (!connector->ycbcr_420_allowed) {
+		DRM_ERROR("Platform doesn't support YCBCR420 output\n");
+		return false;
+	}
+
+	/* YCBCR420 TMDS rate requirement is half the pixel clock */
+	config->port_clock /= 2;
+	*clock_12bpc /= 2;
+	*clock_8bpc /= 2;
+	config->ycbcr420 = true;
+
+	/* YCBCR 420 output conversion needs a scaler */
+	if (skl_update_scaler_crtc(config)) {
+		DRM_DEBUG_KMS("Scaler allocation for output failed\n");
+		return false;
+	}
+
+	intel_pch_panel_fitting(intel_crtc, config,
+				DRM_MODE_SCALE_FULLSCREEN);
+
+	return true;
+}
+
 bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 			       struct intel_crtc_state *pipe_config,
 			       struct drm_connector_state *conn_state)
@@ -1346,7 +1395,8 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 	struct intel_hdmi *intel_hdmi = enc_to_intel_hdmi(&encoder->base);
 	struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
 	struct drm_display_mode *adjusted_mode = &pipe_config->base.adjusted_mode;
-	struct drm_scdc *scdc = &conn_state->connector->display_info.hdmi.scdc;
+	struct drm_connector *connector = conn_state->connector;
+	struct drm_scdc *scdc = &connector->display_info.hdmi.scdc;
 	struct intel_digital_connector_state *intel_conn_state =
 		to_intel_digital_connector_state(conn_state);
 	int clock_8bpc = pipe_config->base.adjusted_mode.crtc_clock;
@@ -1376,6 +1426,14 @@ bool intel_hdmi_compute_config(struct intel_encoder *encoder,
 		clock_12bpc *= 2;
 	}
 
+	if (drm_mode_is_420_only(&connector->display_info, adjusted_mode)) {
+		if (!intel_hdmi_ycbcr420_config(connector, pipe_config,
+						&clock_12bpc, &clock_8bpc)) {
+			DRM_ERROR("Can't support YCBCR420 output\n");
+			return false;
+		}
+	}
+
 	if (HAS_PCH_SPLIT(dev_priv) && !HAS_DDI(dev_priv))
 		pipe_config->has_pch_encoder = true;
 
@@ -1703,11 +1761,9 @@ static void intel_hdmi_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs intel_hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = intel_hdmi_detect,
 	.force = intel_hdmi_force,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.atomic_get_property = intel_digital_connector_atomic_get_property,
 	.atomic_set_property = intel_digital_connector_atomic_set_property,
 	.late_register = intel_connector_register,
@@ -1787,6 +1843,93 @@ void intel_hdmi_handle_sink_scrambling(struct intel_encoder *encoder,
 	DRM_DEBUG_KMS("sink scrambling handled\n");
 }
 
+static u8 chv_port_to_ddc_pin(struct drm_i915_private *dev_priv, enum port port)
+{
+	u8 ddc_pin;
+
+	switch (port) {
+	case PORT_B:
+		ddc_pin = GMBUS_PIN_DPB;
+		break;
+	case PORT_C:
+		ddc_pin = GMBUS_PIN_DPC;
+		break;
+	case PORT_D:
+		ddc_pin = GMBUS_PIN_DPD_CHV;
+		break;
+	default:
+		MISSING_CASE(port);
+		ddc_pin = GMBUS_PIN_DPB;
+		break;
+	}
+	return ddc_pin;
+}
+
+static u8 bxt_port_to_ddc_pin(struct drm_i915_private *dev_priv, enum port port)
+{
+	u8 ddc_pin;
+
+	switch (port) {
+	case PORT_B:
+		ddc_pin = GMBUS_PIN_1_BXT;
+		break;
+	case PORT_C:
+		ddc_pin = GMBUS_PIN_2_BXT;
+		break;
+	default:
+		MISSING_CASE(port);
+		ddc_pin = GMBUS_PIN_1_BXT;
+		break;
+	}
+	return ddc_pin;
+}
+
+static u8 cnp_port_to_ddc_pin(struct drm_i915_private *dev_priv,
+			      enum port port)
+{
+	u8 ddc_pin;
+
+	switch (port) {
+	case PORT_B:
+		ddc_pin = GMBUS_PIN_1_BXT;
+		break;
+	case PORT_C:
+		ddc_pin = GMBUS_PIN_2_BXT;
+		break;
+	case PORT_D:
+		ddc_pin = GMBUS_PIN_4_CNP;
+		break;
+	default:
+		MISSING_CASE(port);
+		ddc_pin = GMBUS_PIN_1_BXT;
+		break;
+	}
+	return ddc_pin;
+}
+
+static u8 g4x_port_to_ddc_pin(struct drm_i915_private *dev_priv,
+			      enum port port)
+{
+	u8 ddc_pin;
+
+	switch (port) {
+	case PORT_B:
+		ddc_pin = GMBUS_PIN_DPB;
+		break;
+	case PORT_C:
+		ddc_pin = GMBUS_PIN_DPC;
+		break;
+	case PORT_D:
+		ddc_pin = GMBUS_PIN_DPD;
+		break;
+	default:
+		MISSING_CASE(port);
+		ddc_pin = GMBUS_PIN_DPB;
+		break;
+	}
+	return ddc_pin;
+}
+
 static u8 intel_hdmi_ddc_pin(struct drm_i915_private *dev_priv,
 			     enum port port)
 {
@@ -1800,32 +1943,14 @@ static u8 intel_hdmi_ddc_pin(struct drm_i915_private *dev_priv,
 		return info->alternate_ddc_pin;
 	}
 
-	switch (port) {
-	case PORT_B:
-		if (IS_GEN9_LP(dev_priv) || HAS_PCH_CNP(dev_priv))
-			ddc_pin = GMBUS_PIN_1_BXT;
-		else
-			ddc_pin = GMBUS_PIN_DPB;
-		break;
-	case PORT_C:
-		if (IS_GEN9_LP(dev_priv) || HAS_PCH_CNP(dev_priv))
-			ddc_pin = GMBUS_PIN_2_BXT;
-		else
-			ddc_pin = GMBUS_PIN_DPC;
-		break;
-	case PORT_D:
-		if (HAS_PCH_CNP(dev_priv))
-			ddc_pin = GMBUS_PIN_4_CNP;
-		else if (IS_CHERRYVIEW(dev_priv))
-			ddc_pin = GMBUS_PIN_DPD_CHV;
-		else
-			ddc_pin = GMBUS_PIN_DPD;
-		break;
-	default:
-		MISSING_CASE(port);
-		ddc_pin = GMBUS_PIN_DPB;
-		break;
-	}
+	if (IS_CHERRYVIEW(dev_priv))
+		ddc_pin = chv_port_to_ddc_pin(dev_priv, port);
+	else if (IS_GEN9_LP(dev_priv))
+		ddc_pin = bxt_port_to_ddc_pin(dev_priv, port);
+	else if (HAS_PCH_CNP(dev_priv))
+		ddc_pin = cnp_port_to_ddc_pin(dev_priv, port);
+	else
+		ddc_pin = g4x_port_to_ddc_pin(dev_priv, port);
 
 	DRM_DEBUG_KMS("Using DDC pin 0x%x for port %c (platform default)\n",
 		      ddc_pin, port_name(port));
@@ -1859,25 +1984,14 @@ void intel_hdmi_init_connector(struct intel_digital_port *intel_dig_port,
 	connector->doublescan_allowed = 0;
 	connector->stereo_allowed = 1;
 
+	if (IS_GEMINILAKE(dev_priv))
+		connector->ycbcr_420_allowed = true;
+
 	intel_hdmi->ddc_bus = intel_hdmi_ddc_pin(dev_priv, port);
 
-	switch (port) {
-	case PORT_B:
-		intel_encoder->hpd_pin = HPD_PORT_B;
-		break;
-	case PORT_C:
-		intel_encoder->hpd_pin = HPD_PORT_C;
-		break;
-	case PORT_D:
-		intel_encoder->hpd_pin = HPD_PORT_D;
-		break;
-	case PORT_E:
-		intel_encoder->hpd_pin = HPD_PORT_E;
-		break;
-	default:
-		MISSING_CASE(port);
+	if (WARN_ON(port == PORT_A))
 		return;
-	}
+	intel_encoder->hpd_pin = intel_hpd_pin(port);
 
 	if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
 		intel_hdmi->write_infoframe = vlv_write_infoframe;
diff --git a/drivers/gpu/drm/i915/intel_hotplug.c b/drivers/gpu/drm/i915/intel_hotplug.c
index f120027..875d5d2 100644
--- a/drivers/gpu/drm/i915/intel_hotplug.c
+++ b/drivers/gpu/drm/i915/intel_hotplug.c
@@ -76,26 +76,54 @@
  * it will use i915_hotplug_work_func where this logic is handled.
  */
 
-bool intel_hpd_pin_to_port(enum hpd_pin pin, enum port *port)
+/**
+ * intel_hpd_pin_to_port - return port hard associated with certain pin.
+ * @pin: the hpd pin to get associated port
+ *
+ * Return port that is associated with @pin and PORT_NONE if no port is
+ * hard associated with that @pin.
+ */
+enum port intel_hpd_pin_to_port(enum hpd_pin pin)
 {
 	switch (pin) {
 	case HPD_PORT_A:
-		*port = PORT_A;
-		return true;
+		return PORT_A;
 	case HPD_PORT_B:
-		*port = PORT_B;
-		return true;
+		return PORT_B;
 	case HPD_PORT_C:
-		*port = PORT_C;
-		return true;
+		return PORT_C;
 	case HPD_PORT_D:
-		*port = PORT_D;
-		return true;
+		return PORT_D;
 	case HPD_PORT_E:
-		*port = PORT_E;
-		return true;
+		return PORT_E;
 	default:
-		return false;	/* no hpd */
+		return PORT_NONE; /* no port for this pin */
+	}
+}
+
+/**
+ * intel_hpd_pin - return pin hard associated with certain port.
+ * @port: the hpd port to get associated pin
+ *
+ * Return pin that is associated with @port and HPD_NONE if no pin is
+ * hard associated with that @port.
+ */
+enum hpd_pin intel_hpd_pin(enum port port)
+{
+	switch (port) {
+	case PORT_A:
+		return HPD_PORT_A;
+	case PORT_B:
+		return HPD_PORT_B;
+	case PORT_C:
+		return HPD_PORT_C;
+	case PORT_D:
+		return HPD_PORT_D;
+	case PORT_E:
+		return HPD_PORT_E;
+	default:
+		MISSING_CASE(port);
+		return HPD_NONE;
 	}
 }
 
@@ -389,8 +417,9 @@ void intel_hpd_irq_handler(struct drm_i915_private *dev_priv,
 		if (!(BIT(i) & pin_mask))
 			continue;
 
-		is_dig_port = intel_hpd_pin_to_port(i, &port) &&
-			      dev_priv->hotplug.irq_port[port];
+		port = intel_hpd_pin_to_port(i);
+		is_dig_port = port != PORT_NONE &&
+			dev_priv->hotplug.irq_port[port];
 
 		if (is_dig_port) {
 			bool long_hpd = long_mask & BIT(i);
diff --git a/drivers/gpu/drm/i915/intel_i2c.c b/drivers/gpu/drm/i915/intel_i2c.c
index 3c9e00d..6698826 100644
--- a/drivers/gpu/drm/i915/intel_i2c.c
+++ b/drivers/gpu/drm/i915/intel_i2c.c
@@ -592,7 +592,6 @@ gmbus_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs, int num)
 	int ret;
 
 	intel_display_power_get(dev_priv, POWER_DOMAIN_GMBUS);
-	mutex_lock(&dev_priv->gmbus_mutex);
 
 	if (bus->force_bit) {
 		ret = i2c_bit_algo.master_xfer(adapter, msgs, num);
@@ -604,7 +603,6 @@ gmbus_xfer(struct i2c_adapter *adapter, struct i2c_msg *msgs, int num)
 			bus->force_bit |= GMBUS_FORCE_BIT_RETRY;
 	}
 
-	mutex_unlock(&dev_priv->gmbus_mutex);
 	intel_display_power_put(dev_priv, POWER_DOMAIN_GMBUS);
 
 	return ret;
@@ -624,6 +622,39 @@ static const struct i2c_algorithm gmbus_algorithm = {
 	.functionality	= gmbus_func
 };
 
+static void gmbus_lock_bus(struct i2c_adapter *adapter,
+			   unsigned int flags)
+{
+	struct intel_gmbus *bus = to_intel_gmbus(adapter);
+	struct drm_i915_private *dev_priv = bus->dev_priv;
+
+	mutex_lock(&dev_priv->gmbus_mutex);
+}
+
+static int gmbus_trylock_bus(struct i2c_adapter *adapter,
+			     unsigned int flags)
+{
+	struct intel_gmbus *bus = to_intel_gmbus(adapter);
+	struct drm_i915_private *dev_priv = bus->dev_priv;
+
+	return mutex_trylock(&dev_priv->gmbus_mutex);
+}
+
+static void gmbus_unlock_bus(struct i2c_adapter *adapter,
+			     unsigned int flags)
+{
+	struct intel_gmbus *bus = to_intel_gmbus(adapter);
+	struct drm_i915_private *dev_priv = bus->dev_priv;
+
+	mutex_unlock(&dev_priv->gmbus_mutex);
+}
+
+const struct i2c_lock_operations gmbus_lock_ops = {
+	.lock_bus =    gmbus_lock_bus,
+	.trylock_bus = gmbus_trylock_bus,
+	.unlock_bus =  gmbus_unlock_bus,
+};
+
 /**
  * intel_gmbus_setup - instantiate all Intel i2c GMBuses
  * @dev_priv: i915 device private
@@ -665,6 +696,7 @@ int intel_setup_gmbus(struct drm_i915_private *dev_priv)
 		bus->dev_priv = dev_priv;
 
 		bus->adapter.algo = &gmbus_algorithm;
+		bus->adapter.lock_ops = &gmbus_lock_ops;
 
 		/*
 		 * We wish to retry with bit banging
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index 2afa4da..6f972e6 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -1327,6 +1327,31 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 {
 	struct execlist_port *port = engine->execlist_port;
 	struct intel_context *ce;
+	unsigned int n;
+
+	/*
+	 * Catch up with any missed context-switch interrupts.
+	 *
+	 * Ideally we would just read the remaining CSB entries now that we
+	 * know the gpu is idle. However, the CSB registers are sometimes^W
+	 * often trashed across a GPU reset! Instead we have to rely on
+	 * guessing the missed context-switch events by looking at what
+	 * requests were completed.
+	 */
+	if (!request) {
+		for (n = 0; n < ARRAY_SIZE(engine->execlist_port); n++)
+			i915_gem_request_put(port_request(&port[n]));
+		memset(engine->execlist_port, 0, sizeof(engine->execlist_port));
+		return;
+	}
+
+	if (request->ctx != port_request(port)->ctx) {
+		i915_gem_request_put(port_request(port));
+		port[0] = port[1];
+		memset(&port[1], 0, sizeof(port[1]));
+	}
+
+	GEM_BUG_ON(request->ctx != port_request(port)->ctx);
 
 	/* If the request was innocent, we leave the request in the ELSP
 	 * and will try to replay it on restarting. The context image may
@@ -1338,7 +1363,7 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	 * and have to at least restore the RING register in the context
 	 * image back to the expected values to skip over the guilty request.
 	 */
-	if (!request || request->fence.error != -EIO)
+	if (request->fence.error != -EIO)
 		return;
 
 	/* We want a simple context + ring to execute the breadcrumb update.
@@ -1360,15 +1385,6 @@ static void reset_common_ring(struct intel_engine_cs *engine,
 	request->ring->head = request->postfix;
 	intel_ring_update_space(request->ring);
 
-	/* Catch up with any missed context-switch interrupts */
-	if (request->ctx != port_request(port)->ctx) {
-		i915_gem_request_put(port_request(port));
-		port[0] = port[1];
-		memset(&port[1], 0, sizeof(port[1]));
-	}
-
-	GEM_BUG_ON(request->ctx != port_request(port)->ctx);
-
 	/* Reset WaIdleLiteRestore:bdw,skl as well */
 	request->tail =
 		intel_ring_wrap(request->ring,
@@ -2092,7 +2108,7 @@ void intel_lr_context_resume(struct drm_i915_private *dev_priv)
 	 * So to avoid that we reset the context images upon resume. For
 	 * simplicity, we just zero everything out.
 	 */
-	list_for_each_entry(ctx, &dev_priv->context_list, link) {
+	list_for_each_entry(ctx, &dev_priv->contexts.list, link) {
 		for_each_engine(engine, dev_priv, id) {
 			struct intel_context *ce = &ctx->engine[engine->id];
 			u32 *reg;
diff --git a/drivers/gpu/drm/i915/intel_lvds.c b/drivers/gpu/drm/i915/intel_lvds.c
index 6fe5d7c..8e21577 100644
--- a/drivers/gpu/drm/i915/intel_lvds.c
+++ b/drivers/gpu/drm/i915/intel_lvds.c
@@ -595,10 +595,8 @@ static const struct drm_connector_helper_funcs intel_lvds_connector_helper_funcs
 };
 
 static const struct drm_connector_funcs intel_lvds_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = intel_lvds_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.atomic_get_property = intel_digital_connector_atomic_get_property,
 	.atomic_set_property = intel_digital_connector_atomic_set_property,
 	.late_register = intel_connector_register,
@@ -1140,7 +1138,8 @@ void intel_lvds_init(struct drm_i915_private *dev_priv)
 out:
 	mutex_unlock(&dev->mode_config.mutex);
 
-	intel_panel_init(&intel_connector->panel, fixed_mode, downclock_mode);
+	intel_panel_init(&intel_connector->panel, fixed_mode, NULL,
+			 downclock_mode);
 	intel_panel_setup_backlight(connector, INVALID_PIPE);
 
 	lvds_encoder->is_dual_link = compute_is_dual_link_lvds(lvds_encoder);
diff --git a/drivers/gpu/drm/i915/intel_opregion.c b/drivers/gpu/drm/i915/intel_opregion.c
index 2bd0300..98154ef 100644
--- a/drivers/gpu/drm/i915/intel_opregion.c
+++ b/drivers/gpu/drm/i915/intel_opregion.c
@@ -27,6 +27,7 @@
 
 #include <linux/acpi.h>
 #include <linux/dmi.h>
+#include <linux/firmware.h>
 #include <acpi/video.h>
 
 #include <drm/drmP.h>
@@ -829,6 +830,10 @@ void intel_opregion_unregister(struct drm_i915_private *dev_priv)
 		memunmap(opregion->rvda);
 		opregion->rvda = NULL;
 	}
+	if (opregion->vbt_firmware) {
+		kfree(opregion->vbt_firmware);
+		opregion->vbt_firmware = NULL;
+	}
 	opregion->header = NULL;
 	opregion->acpi = NULL;
 	opregion->swsci = NULL;
@@ -912,6 +917,43 @@ static const struct dmi_system_id intel_no_opregion_vbt[] = {
 	{ }
 };
 
+static int intel_load_vbt_firmware(struct drm_i915_private *dev_priv)
+{
+	struct intel_opregion *opregion = &dev_priv->opregion;
+	const struct firmware *fw = NULL;
+	const char *name = i915.vbt_firmware;
+	int ret;
+
+	if (!name || !*name)
+		return -ENOENT;
+
+	ret = request_firmware(&fw, name, &dev_priv->drm.pdev->dev);
+	if (ret) {
+		DRM_ERROR("Requesting VBT firmware \"%s\" failed (%d)\n",
+			  name, ret);
+		return ret;
+	}
+
+	if (intel_bios_is_valid_vbt(fw->data, fw->size)) {
+		opregion->vbt_firmware = kmemdup(fw->data, fw->size, GFP_KERNEL);
+		if (opregion->vbt_firmware) {
+			DRM_DEBUG_KMS("Found valid VBT firmware \"%s\"\n", name);
+			opregion->vbt = opregion->vbt_firmware;
+			opregion->vbt_size = fw->size;
+			ret = 0;
+		} else {
+			ret = -ENOMEM;
+		}
+	} else {
+		DRM_DEBUG_KMS("Invalid VBT firmware \"%s\"\n", name);
+		ret = -EINVAL;
+	}
+
+	release_firmware(fw);
+
+	return ret;
+}
+
 int intel_opregion_setup(struct drm_i915_private *dev_priv)
 {
 	struct intel_opregion *opregion = &dev_priv->opregion;
@@ -974,6 +1016,9 @@ int intel_opregion_setup(struct drm_i915_private *dev_priv)
 	if (mboxes & MBOX_ASLE_EXT)
 		DRM_DEBUG_DRIVER("ASLE extension supported\n");
 
+	if (intel_load_vbt_firmware(dev_priv) == 0)
+		goto out;
+
 	if (dmi_check_system(intel_no_opregion_vbt))
 		goto out;
 
diff --git a/drivers/gpu/drm/i915/intel_overlay.c b/drivers/gpu/drm/i915/intel_overlay.c
index b96aed9..aace22e7 100644
--- a/drivers/gpu/drm/i915/intel_overlay.c
+++ b/drivers/gpu/drm/i915/intel_overlay.c
@@ -799,9 +799,13 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 	if (ret != 0)
 		return ret;
 
+	atomic_inc(&dev_priv->gpu_error.pending_fb_pin);
+
 	vma = i915_gem_object_pin_to_display_plane(new_bo, 0, NULL);
-	if (IS_ERR(vma))
-		return PTR_ERR(vma);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto out_pin_section;
+	}
 
 	ret = i915_vma_put_fence(vma);
 	if (ret)
@@ -886,6 +890,9 @@ static int intel_overlay_do_put_image(struct intel_overlay *overlay,
 
 out_unpin:
 	i915_gem_object_unpin_from_display_plane(vma);
+out_pin_section:
+	atomic_dec(&dev_priv->gpu_error.pending_fb_pin);
+
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/i915/intel_panel.c b/drivers/gpu/drm/i915/intel_panel.c
index 593349b..a17b1de 100644
--- a/drivers/gpu/drm/i915/intel_panel.c
+++ b/drivers/gpu/drm/i915/intel_panel.c
@@ -110,7 +110,8 @@ intel_pch_panel_fitting(struct intel_crtc *intel_crtc,
 
 	/* Native modes don't need fitting */
 	if (adjusted_mode->crtc_hdisplay == pipe_config->pipe_src_w &&
-	    adjusted_mode->crtc_vdisplay == pipe_config->pipe_src_h)
+	    adjusted_mode->crtc_vdisplay == pipe_config->pipe_src_h &&
+	    !pipe_config->ycbcr420)
 		goto done;
 
 	switch (fitting_mode) {
@@ -1919,11 +1920,13 @@ intel_panel_init_backlight_funcs(struct intel_panel *panel)
 
 int intel_panel_init(struct intel_panel *panel,
 		     struct drm_display_mode *fixed_mode,
+		     struct drm_display_mode *alt_fixed_mode,
 		     struct drm_display_mode *downclock_mode)
 {
 	intel_panel_init_backlight_funcs(panel);
 
 	panel->fixed_mode = fixed_mode;
+	panel->alt_fixed_mode = alt_fixed_mode;
 	panel->downclock_mode = downclock_mode;
 
 	return 0;
@@ -1937,6 +1940,10 @@ void intel_panel_fini(struct intel_panel *panel)
 	if (panel->fixed_mode)
 		drm_mode_destroy(intel_connector->base.dev, panel->fixed_mode);
 
+	if (panel->alt_fixed_mode)
+		drm_mode_destroy(intel_connector->base.dev,
+				panel->alt_fixed_mode);
+
 	if (panel->downclock_mode)
 		drm_mode_destroy(intel_connector->base.dev,
 				panel->downclock_mode);
diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
index 40b224b..ed66293 100644
--- a/drivers/gpu/drm/i915/intel_pm.c
+++ b/drivers/gpu/drm/i915/intel_pm.c
@@ -62,6 +62,20 @@ static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
 	I915_WRITE(CHICKEN_PAR1_1,
 		   I915_READ(CHICKEN_PAR1_1) | SKL_EDP_PSR_FIX_RDWRAP);
 
+	/*
+	 * Display WA#0390: skl,bxt,kbl,glk
+	 *
+	 * Must match Sampler, Pixel Back End, and Media
+	 * (0xE194 bit 8, 0x7014 bit 13, 0x4DDC bits 27 and 31).
+	 *
+	 * Including bits outside the page in the hash would
+	 * require 2 (or 4?) MiB alignment of resources. Just
+	 * assume the default hashing mode which only uses bits
+	 * within the page.
+	 */
+	I915_WRITE(CHICKEN_PAR1_1,
+		   I915_READ(CHICKEN_PAR1_1) & ~SKL_RC_HASH_OUTSIDE);
+
 	I915_WRITE(GEN8_CONFIG0,
 		   I915_READ(GEN8_CONFIG0) | GEN9_DEFAULT_FIXES);
 
@@ -78,6 +92,12 @@ static void gen9_init_clock_gating(struct drm_i915_private *dev_priv)
 	/* WaFbcHighMemBwCorruptionAvoidance:skl,bxt,kbl,cfl */
 	I915_WRITE(ILK_DPFC_CHICKEN, I915_READ(ILK_DPFC_CHICKEN) |
 		   ILK_DPFC_DISABLE_DUMMY0);
+
+	if (IS_SKYLAKE(dev_priv)) {
+		/* WaDisableDopClockGating */
+		I915_WRITE(GEN7_MISCCPCTL, I915_READ(GEN7_MISCCPCTL)
+			   & ~GEN7_DOP_CLOCK_GATE_ENABLE);
+	}
 }
 
 static void bxt_init_clock_gating(struct drm_i915_private *dev_priv)
@@ -2758,7 +2778,7 @@ hsw_compute_linetime_wm(const struct intel_crtc_state *cstate)
 static void intel_read_wm_latency(struct drm_i915_private *dev_priv,
 				  uint16_t wm[8])
 {
-	if (IS_GEN9(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		uint32_t val;
 		int ret, i;
 		int level, max_level = ilk_wm_max_level(dev_priv);
@@ -2818,7 +2838,7 @@ static void intel_read_wm_latency(struct drm_i915_private *dev_priv,
 		}
 
 		/*
-		 * WaWmMemoryReadLatency:skl,glk
+		 * WaWmMemoryReadLatency:skl+,glk
 		 *
 		 * punit doesn't take into account the read latency so we need
 		 * to add 2us to the various latency levels we retrieve from the
@@ -2857,6 +2877,8 @@ static void intel_read_wm_latency(struct drm_i915_private *dev_priv,
 		wm[0] = 7;
 		wm[1] = (mltr >> MLTR_WM1_SHIFT) & ILK_SRLT_MASK;
 		wm[2] = (mltr >> MLTR_WM2_SHIFT) & ILK_SRLT_MASK;
+	} else {
+		MISSING_CASE(INTEL_DEVID(dev_priv));
 	}
 }
 
@@ -2912,7 +2934,7 @@ static void intel_print_wm_latency(struct drm_i915_private *dev_priv,
 		 * - latencies are in us on gen9.
 		 * - before then, WM1+ latency values are in 0.5us units
 		 */
-		if (IS_GEN9(dev_priv))
+		if (INTEL_GEN(dev_priv) >= 9)
 			latency *= 10;
 		else if (level > 0)
 			latency *= 5;
@@ -3530,8 +3552,6 @@ bool ilk_disable_lp_wm(struct drm_device *dev)
 	return _ilk_disable_lp_wm(dev_priv, WM_DIRTY_LP_ALL);
 }
 
-#define SKL_SAGV_BLOCK_TIME	30 /* µs */
-
 /*
  * FIXME: We still don't have the proper code detect if we need to apply the WA,
  * so assume we'll always need it in order to avoid underruns.
@@ -3549,7 +3569,8 @@ static bool skl_needs_memory_bw_wa(struct intel_atomic_state *state)
 static bool
 intel_has_sagv(struct drm_i915_private *dev_priv)
 {
-	if (IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv))
+	if (IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv) ||
+	    IS_CANNONLAKE(dev_priv))
 		return true;
 
 	if (IS_SKYLAKE(dev_priv) &&
@@ -3655,12 +3676,13 @@ bool intel_can_enable_sagv(struct drm_atomic_state *state)
 	struct intel_crtc_state *cstate;
 	enum pipe pipe;
 	int level, latency;
+	int sagv_block_time_us = IS_GEN9(dev_priv) ? 30 : 20;
 
 	if (!intel_has_sagv(dev_priv))
 		return false;
 
 	/*
-	 * SKL workaround: bspec recommends we disable the SAGV when we have
+	 * SKL+ workaround: bspec recommends we disable the SAGV when we have
 	 * more then one pipe enabled
 	 *
 	 * If there are no active CRTCs, no additional checks need be performed
@@ -3699,11 +3721,11 @@ bool intel_can_enable_sagv(struct drm_atomic_state *state)
 			latency += 15;
 
 		/*
-		 * If any of the planes on this pipe don't enable wm levels
-		 * that incur memory latencies higher then 30µs we can't enable
-		 * the SAGV
+		 * If any of the planes on this pipe don't enable wm levels that
+		 * incur memory latencies higher than sagv_block_time_us we
+		 * can't enable the SAGV.
 		 */
-		if (latency < SKL_SAGV_BLOCK_TIME)
+		if (latency < sagv_block_time_us)
 			return false;
 	}
 
@@ -3837,7 +3859,7 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 	uint_fixed_16_16_t downscale_h, downscale_w;
 
 	if (WARN_ON(!intel_wm_plane_visible(cstate, pstate)))
-		return u32_to_fixed_16_16(0);
+		return u32_to_fixed16(0);
 
 	/* n.b., src is 16.16 fixed point, dst is whole integer */
 	if (plane->id == PLANE_CURSOR) {
@@ -3861,10 +3883,10 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 		dst_h = drm_rect_height(&pstate->base.dst);
 	}
 
-	fp_w_ratio = fixed_16_16_div(src_w, dst_w);
-	fp_h_ratio = fixed_16_16_div(src_h, dst_h);
-	downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
-	downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+	fp_w_ratio = div_fixed16(src_w, dst_w);
+	fp_h_ratio = div_fixed16(src_h, dst_h);
+	downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
+	downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
 
 	return mul_fixed16(downscale_w, downscale_h);
 }
@@ -3872,7 +3894,7 @@ skl_plane_downscale_amount(const struct intel_crtc_state *cstate,
 static uint_fixed_16_16_t
 skl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
 {
-	uint_fixed_16_16_t pipe_downscale = u32_to_fixed_16_16(1);
+	uint_fixed_16_16_t pipe_downscale = u32_to_fixed16(1);
 
 	if (!crtc_state->base.enable)
 		return pipe_downscale;
@@ -3891,10 +3913,10 @@ skl_pipe_downscale_amount(const struct intel_crtc_state *crtc_state)
 		if (!dst_w || !dst_h)
 			return pipe_downscale;
 
-		fp_w_ratio = fixed_16_16_div(src_w, dst_w);
-		fp_h_ratio = fixed_16_16_div(src_h, dst_h);
-		downscale_w = max_fixed_16_16(fp_w_ratio, u32_to_fixed_16_16(1));
-		downscale_h = max_fixed_16_16(fp_h_ratio, u32_to_fixed_16_16(1));
+		fp_w_ratio = div_fixed16(src_w, dst_w);
+		fp_h_ratio = div_fixed16(src_h, dst_h);
+		downscale_w = max_fixed16(fp_w_ratio, u32_to_fixed16(1));
+		downscale_h = max_fixed16(fp_h_ratio, u32_to_fixed16(1));
 
 		pipe_downscale = mul_fixed16(downscale_w, downscale_h);
 	}
@@ -3913,14 +3935,14 @@ int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
 	int crtc_clock, dotclk;
 	uint32_t pipe_max_pixel_rate;
 	uint_fixed_16_16_t pipe_downscale;
-	uint_fixed_16_16_t max_downscale = u32_to_fixed_16_16(1);
+	uint_fixed_16_16_t max_downscale = u32_to_fixed16(1);
 
 	if (!cstate->base.enable)
 		return 0;
 
 	drm_atomic_crtc_state_for_each_plane_state(plane, pstate, crtc_state) {
 		uint_fixed_16_16_t plane_downscale;
-		uint_fixed_16_16_t fp_9_div_8 = fixed_16_16_div(9, 8);
+		uint_fixed_16_16_t fp_9_div_8 = div_fixed16(9, 8);
 		int bpp;
 
 		if (!intel_wm_plane_visible(cstate,
@@ -3938,7 +3960,7 @@ int skl_check_pipe_max_pixel_rate(struct intel_crtc *intel_crtc,
 			plane_downscale = mul_fixed16(plane_downscale,
 						      fp_9_div_8);
 
-		max_downscale = max_fixed_16_16(plane_downscale, max_downscale);
+		max_downscale = max_fixed16(plane_downscale, max_downscale);
 	}
 	pipe_downscale = skl_pipe_downscale_amount(cstate);
 
@@ -4071,7 +4093,9 @@ skl_ddb_min_alloc(const struct drm_plane_state *pstate,
 
 	/* For Non Y-tile return 8-blocks */
 	if (fb->modifier != I915_FORMAT_MOD_Y_TILED &&
-	    fb->modifier != I915_FORMAT_MOD_Yf_TILED)
+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED &&
+	    fb->modifier != I915_FORMAT_MOD_Y_TILED_CCS &&
+	    fb->modifier != I915_FORMAT_MOD_Yf_TILED_CCS)
 		return 8;
 
 	/*
@@ -4266,8 +4290,9 @@ skl_allocate_pipe_ddb(struct intel_crtc_state *cstate,
  * should allow pixel_rate up to ~2 GHz which seems sufficient since max
  * 2xcdclk is 1350 MHz and the pixel rate should never exceed that.
 */
-static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
-					 uint32_t latency)
+static uint_fixed_16_16_t
+skl_wm_method1(const struct drm_i915_private *dev_priv, uint32_t pixel_rate,
+	       uint8_t cpp, uint32_t latency)
 {
 	uint32_t wm_intermediate_val;
 	uint_fixed_16_16_t ret;
@@ -4276,7 +4301,11 @@ static uint_fixed_16_16_t skl_wm_method1(uint32_t pixel_rate, uint8_t cpp,
 		return FP_16_16_MAX;
 
 	wm_intermediate_val = latency * pixel_rate * cpp;
-	ret = fixed_16_16_div_u64(wm_intermediate_val, 1000 * 512);
+	ret = div_fixed16(wm_intermediate_val, 1000 * 512);
+
+	if (INTEL_GEN(dev_priv) >= 10)
+		ret = add_fixed16_u32(ret, 1);
+
 	return ret;
 }
 
@@ -4294,7 +4323,7 @@ static uint_fixed_16_16_t skl_wm_method2(uint32_t pixel_rate,
 	wm_intermediate_val = latency * pixel_rate;
 	wm_intermediate_val = DIV_ROUND_UP(wm_intermediate_val,
 					   pipe_htotal * 1000);
-	ret = mul_u32_fixed_16_16(wm_intermediate_val, plane_blocks_per_line);
+	ret = mul_u32_fixed16(wm_intermediate_val, plane_blocks_per_line);
 	return ret;
 }
 
@@ -4306,15 +4335,15 @@ intel_get_linetime_us(struct intel_crtc_state *cstate)
 	uint_fixed_16_16_t linetime_us;
 
 	if (!cstate->base.active)
-		return u32_to_fixed_16_16(0);
+		return u32_to_fixed16(0);
 
 	pixel_rate = cstate->pixel_rate;
 
 	if (WARN_ON(pixel_rate == 0))
-		return u32_to_fixed_16_16(0);
+		return u32_to_fixed16(0);
 
 	crtc_htotal = cstate->base.adjusted_mode.crtc_htotal;
-	linetime_us = fixed_16_16_div_u64(crtc_htotal * 1000, pixel_rate);
+	linetime_us = div_fixed16(crtc_htotal * 1000, pixel_rate);
 
 	return linetime_us;
 }
@@ -4361,7 +4390,7 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	uint32_t plane_bytes_per_line;
 	uint32_t res_blocks, res_lines;
 	uint8_t cpp;
-	uint32_t width = 0, height = 0;
+	uint32_t width = 0;
 	uint32_t plane_pixel_rate;
 	uint_fixed_16_16_t y_tile_minimum;
 	uint32_t y_min_scanlines;
@@ -4377,7 +4406,9 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	}
 
 	y_tiled = fb->modifier == I915_FORMAT_MOD_Y_TILED ||
-		  fb->modifier == I915_FORMAT_MOD_Yf_TILED;
+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED ||
+		  fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+		  fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS;
 	x_tiled = fb->modifier == I915_FORMAT_MOD_X_TILED;
 
 	/* Display WA #1141: kbl,cfl */
@@ -4390,7 +4421,6 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 
 	if (plane->id == PLANE_CURSOR) {
 		width = intel_pstate->base.crtc_w;
-		height = intel_pstate->base.crtc_h;
 	} else {
 		/*
 		 * Src coordinates are already rotated by 270 degrees for
@@ -4398,16 +4428,13 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 		 * GTT mapping), hence no need to account for rotation here.
 		 */
 		width = drm_rect_width(&intel_pstate->base.src) >> 16;
-		height = drm_rect_height(&intel_pstate->base.src) >> 16;
 	}
 
-	cpp = fb->format->cpp[0];
+	cpp = (fb->format->format == DRM_FORMAT_NV12) ? fb->format->cpp[1] :
+							fb->format->cpp[0];
 	plane_pixel_rate = skl_adjusted_plane_pixel_rate(cstate, intel_pstate);
 
 	if (drm_rotation_90_or_270(pstate->rotation)) {
-		int cpp = (fb->format->format == DRM_FORMAT_NV12) ?
-			fb->format->cpp[1] :
-			fb->format->cpp[0];
 
 		switch (cpp) {
 		case 1:
@@ -4434,51 +4461,62 @@ static int skl_compute_plane_wm(const struct drm_i915_private *dev_priv,
 	if (y_tiled) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line *
 					   y_min_scanlines, 512);
-		plane_blocks_per_line = fixed_16_16_div(interm_pbpl,
+
+		if (INTEL_GEN(dev_priv) >= 10)
+			interm_pbpl++;
+
+		plane_blocks_per_line = div_fixed16(interm_pbpl,
 							y_min_scanlines);
-	} else if (x_tiled) {
+	} else if (x_tiled && INTEL_GEN(dev_priv) == 9) {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512);
-		plane_blocks_per_line = u32_to_fixed_16_16(interm_pbpl);
+		plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
 	} else {
 		interm_pbpl = DIV_ROUND_UP(plane_bytes_per_line, 512) + 1;
-		plane_blocks_per_line = u32_to_fixed_16_16(interm_pbpl);
+		plane_blocks_per_line = u32_to_fixed16(interm_pbpl);
 	}
 
-	method1 = skl_wm_method1(plane_pixel_rate, cpp, latency);
+	method1 = skl_wm_method1(dev_priv, plane_pixel_rate, cpp, latency);
 	method2 = skl_wm_method2(plane_pixel_rate,
 				 cstate->base.adjusted_mode.crtc_htotal,
 				 latency,
 				 plane_blocks_per_line);
 
-	y_tile_minimum = mul_u32_fixed_16_16(y_min_scanlines,
-					     plane_blocks_per_line);
+	y_tile_minimum = mul_u32_fixed16(y_min_scanlines,
+					 plane_blocks_per_line);
 
 	if (y_tiled) {
-		selected_result = max_fixed_16_16(method2, y_tile_minimum);
+		selected_result = max_fixed16(method2, y_tile_minimum);
 	} else {
 		uint32_t linetime_us;
 
-		linetime_us = fixed_16_16_to_u32_round_up(
+		linetime_us = fixed16_to_u32_round_up(
 				intel_get_linetime_us(cstate));
 		if ((cpp * cstate->base.adjusted_mode.crtc_htotal / 512 < 1) &&
 		    (plane_bytes_per_line / 512 < 1))
 			selected_result = method2;
 		else if (ddb_allocation >=
-			 fixed_16_16_to_u32_round_up(plane_blocks_per_line))
-			selected_result = min_fixed_16_16(method1, method2);
+			 fixed16_to_u32_round_up(plane_blocks_per_line))
+			selected_result = min_fixed16(method1, method2);
 		else if (latency >= linetime_us)
-			selected_result = min_fixed_16_16(method1, method2);
+			selected_result = min_fixed16(method1, method2);
 		else
 			selected_result = method1;
 	}
 
-	res_blocks = fixed_16_16_to_u32_round_up(selected_result) + 1;
+	res_blocks = fixed16_to_u32_round_up(selected_result) + 1;
 	res_lines = div_round_up_fixed16(selected_result,
 					 plane_blocks_per_line);
 
+	/* Display WA #1125: skl,bxt,kbl,glk */
+	if (level == 0 &&
+	    (fb->modifier == I915_FORMAT_MOD_Y_TILED_CCS ||
+	     fb->modifier == I915_FORMAT_MOD_Yf_TILED_CCS))
+		res_blocks += fixed16_to_u32_round_up(y_tile_minimum);
+
+	/* Display WA #1126: skl,bxt,kbl,glk */
 	if (level >= 1 && level <= 7) {
 		if (y_tiled) {
-			res_blocks += fixed_16_16_to_u32_round_up(y_tile_minimum);
+			res_blocks += fixed16_to_u32_round_up(y_tile_minimum);
 			res_lines += y_min_scanlines;
 		} else {
 			res_blocks++;
@@ -4563,8 +4601,7 @@ skl_compute_linetime_wm(struct intel_crtc_state *cstate)
 	if (is_fixed16_zero(linetime_us))
 		return 0;
 
-	linetime_wm = fixed_16_16_to_u32_round_up(mul_u32_fixed_16_16(8,
-				linetime_us));
+	linetime_wm = fixed16_to_u32_round_up(mul_u32_fixed16(8, linetime_us));
 
 	/* Display WA #1135: bxt. */
 	if (IS_BROXTON(dev_priv) && dev_priv->ipc_enabled)
@@ -5852,7 +5889,7 @@ static u32 intel_rps_limits(struct drm_i915_private *dev_priv, u8 val)
 	 * the hw runs at the minimal clock before selecting the desired
 	 * frequency, if the down threshold expires in that window we will not
 	 * receive a down interrupt. */
-	if (IS_GEN9(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		limits = (dev_priv->rps.max_freq_softlimit) << 23;
 		if (val <= dev_priv->rps.min_freq_softlimit)
 			limits |= (dev_priv->rps.min_freq_softlimit) << 14;
@@ -5994,7 +6031,7 @@ static int gen6_set_rps(struct drm_i915_private *dev_priv, u8 val)
 	if (val != dev_priv->rps.cur_freq) {
 		gen6_set_rps_thresholds(dev_priv, val);
 
-		if (IS_GEN9(dev_priv))
+		if (INTEL_GEN(dev_priv) >= 9)
 			I915_WRITE(GEN6_RPNSWREQ,
 				   GEN9_FREQUENCY(val));
 		else if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv))
@@ -6126,47 +6163,35 @@ void gen6_rps_idle(struct drm_i915_private *dev_priv)
 			   gen6_sanitize_rps_pm_mask(dev_priv, ~0));
 	}
 	mutex_unlock(&dev_priv->rps.hw_lock);
-
-	spin_lock(&dev_priv->rps.client_lock);
-	while (!list_empty(&dev_priv->rps.clients))
-		list_del_init(dev_priv->rps.clients.next);
-	spin_unlock(&dev_priv->rps.client_lock);
 }
 
-void gen6_rps_boost(struct drm_i915_private *dev_priv,
-		    struct intel_rps_client *rps,
-		    unsigned long submitted)
+void gen6_rps_boost(struct drm_i915_gem_request *rq,
+		    struct intel_rps_client *rps)
 {
+	struct drm_i915_private *i915 = rq->i915;
+	bool boost;
+
 	/* This is intentionally racy! We peek at the state here, then
 	 * validate inside the RPS worker.
 	 */
-	if (!(dev_priv->gt.awake &&
-	      dev_priv->rps.enabled &&
-	      dev_priv->rps.cur_freq < dev_priv->rps.boost_freq))
+	if (!i915->rps.enabled)
 		return;
 
-	/* Force a RPS boost (and don't count it against the client) if
-	 * the GPU is severely congested.
-	 */
-	if (rps && time_after(jiffies, submitted + DRM_I915_THROTTLE_JIFFIES))
-		rps = NULL;
-
-	spin_lock(&dev_priv->rps.client_lock);
-	if (rps == NULL || list_empty(&rps->link)) {
-		spin_lock_irq(&dev_priv->irq_lock);
-		if (dev_priv->rps.interrupts_enabled) {
-			dev_priv->rps.client_boost = true;
-			schedule_work(&dev_priv->rps.work);
-		}
-		spin_unlock_irq(&dev_priv->irq_lock);
-
-		if (rps != NULL) {
-			list_add(&rps->link, &dev_priv->rps.clients);
-			rps->boosts++;
-		} else
-			dev_priv->rps.boosts++;
+	boost = false;
+	spin_lock_irq(&rq->lock);
+	if (!rq->waitboost && !i915_gem_request_completed(rq)) {
+		atomic_inc(&i915->rps.num_waiters);
+		rq->waitboost = true;
+		boost = true;
 	}
-	spin_unlock(&dev_priv->rps.client_lock);
+	spin_unlock_irq(&rq->lock);
+	if (!boost)
+		return;
+
+	if (READ_ONCE(i915->rps.cur_freq) < i915->rps.boost_freq)
+		schedule_work(&i915->rps.work);
+
+	atomic_inc(rps ? &rps->boosts : &i915->rps.boosts);
 }
 
 int intel_set_rps(struct drm_i915_private *dev_priv, u8 val)
@@ -6365,7 +6390,7 @@ static void gen6_init_rps_frequencies(struct drm_i915_private *dev_priv)
 
 	dev_priv->rps.efficient_freq = dev_priv->rps.rp1_freq;
 	if (IS_HASWELL(dev_priv) || IS_BROADWELL(dev_priv) ||
-	    IS_GEN9_BC(dev_priv)) {
+	    IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		u32 ddcc_status = 0;
 
 		if (sandybridge_pcode_read(dev_priv,
@@ -6378,7 +6403,7 @@ static void gen6_init_rps_frequencies(struct drm_i915_private *dev_priv)
 					dev_priv->rps.max_freq);
 	}
 
-	if (IS_GEN9_BC(dev_priv)) {
+	if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		/* Store the frequency values in 16.66 MHZ units, which is
 		 * the natural hardware unit for SKL
 		 */
@@ -6684,7 +6709,7 @@ static void gen6_update_ring_freq(struct drm_i915_private *dev_priv)
 	/* convert DDR frequency from units of 266.6MHz to bandwidth */
 	min_ring_freq = mult_frac(min_ring_freq, 8, 3);
 
-	if (IS_GEN9_BC(dev_priv)) {
+	if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		/* Convert GT frequency to 50 HZ units */
 		min_gpu_freq = dev_priv->rps.min_freq / GEN9_FREQ_SCALER;
 		max_gpu_freq = dev_priv->rps.max_freq / GEN9_FREQ_SCALER;
@@ -6702,7 +6727,7 @@ static void gen6_update_ring_freq(struct drm_i915_private *dev_priv)
 		int diff = max_gpu_freq - gpu_freq;
 		unsigned int ia_freq = 0, ring_freq = 0;
 
-		if (IS_GEN9_BC(dev_priv)) {
+		if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 			/*
 			 * ring_freq = 2 * GT. ring_freq is in 100MHz units
 			 * No floor required for ring frequency on SKL.
@@ -7833,7 +7858,7 @@ void intel_enable_gt_powersave(struct drm_i915_private *dev_priv)
 	} else if (INTEL_GEN(dev_priv) >= 9) {
 		gen9_enable_rc6(dev_priv);
 		gen9_enable_rps(dev_priv);
-		if (IS_GEN9_BC(dev_priv))
+		if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv))
 			gen6_update_ring_freq(dev_priv);
 	} else if (IS_BROADWELL(dev_priv)) {
 		gen8_enable_rps(dev_priv);
@@ -8848,6 +8873,7 @@ static inline int gen6_check_mailbox_status(struct drm_i915_private *dev_priv)
 	case GEN6_PCODE_SUCCESS:
 		return 0;
 	case GEN6_PCODE_UNIMPLEMENTED_CMD:
+		return -ENODEV;
 	case GEN6_PCODE_ILLEGAL_CMD:
 		return -ENXIO;
 	case GEN6_PCODE_MIN_FREQ_TABLE_GT_RATIO_OUT_OF_RANGE:
@@ -8895,7 +8921,8 @@ int sandybridge_pcode_read(struct drm_i915_private *dev_priv, u32 mbox, u32 *val
 	 */
 
 	if (I915_READ_FW(GEN6_PCODE_MAILBOX) & GEN6_PCODE_READY) {
-		DRM_DEBUG_DRIVER("warning: pcode (read) mailbox access failed\n");
+		DRM_DEBUG_DRIVER("warning: pcode (read from mbox %x) mailbox access failed for %ps\n",
+				 mbox, __builtin_return_address(0));
 		return -EAGAIN;
 	}
 
@@ -8906,7 +8933,8 @@ int sandybridge_pcode_read(struct drm_i915_private *dev_priv, u32 mbox, u32 *val
 	if (__intel_wait_for_register_fw(dev_priv,
 					 GEN6_PCODE_MAILBOX, GEN6_PCODE_READY, 0,
 					 500, 0, NULL)) {
-		DRM_ERROR("timeout waiting for pcode read (%d) to finish\n", mbox);
+		DRM_ERROR("timeout waiting for pcode read (from mbox %x) to finish for %ps\n",
+			  mbox, __builtin_return_address(0));
 		return -ETIMEDOUT;
 	}
 
@@ -8919,8 +8947,8 @@ int sandybridge_pcode_read(struct drm_i915_private *dev_priv, u32 mbox, u32 *val
 		status = gen6_check_mailbox_status(dev_priv);
 
 	if (status) {
-		DRM_DEBUG_DRIVER("warning: pcode (read) mailbox access failed: %d\n",
-				 status);
+		DRM_DEBUG_DRIVER("warning: pcode (read from mbox %x) mailbox access failed for %ps: %d\n",
+				 mbox, __builtin_return_address(0), status);
 		return status;
 	}
 
@@ -8940,7 +8968,8 @@ int sandybridge_pcode_write(struct drm_i915_private *dev_priv,
 	 */
 
 	if (I915_READ_FW(GEN6_PCODE_MAILBOX) & GEN6_PCODE_READY) {
-		DRM_DEBUG_DRIVER("warning: pcode (write) mailbox access failed\n");
+		DRM_DEBUG_DRIVER("warning: pcode (write of 0x%08x to mbox %x) mailbox access failed for %ps\n",
+				 val, mbox, __builtin_return_address(0));
 		return -EAGAIN;
 	}
 
@@ -8951,7 +8980,8 @@ int sandybridge_pcode_write(struct drm_i915_private *dev_priv,
 	if (__intel_wait_for_register_fw(dev_priv,
 					 GEN6_PCODE_MAILBOX, GEN6_PCODE_READY, 0,
 					 500, 0, NULL)) {
-		DRM_ERROR("timeout waiting for pcode write (%d) to finish\n", mbox);
+		DRM_ERROR("timeout waiting for pcode write of 0x%08x to mbox %x to finish for %ps\n",
+			  val, mbox, __builtin_return_address(0));
 		return -ETIMEDOUT;
 	}
 
@@ -8963,8 +8993,8 @@ int sandybridge_pcode_write(struct drm_i915_private *dev_priv,
 		status = gen6_check_mailbox_status(dev_priv);
 
 	if (status) {
-		DRM_DEBUG_DRIVER("warning: pcode (write) mailbox access failed: %d\n",
-				 status);
+		DRM_DEBUG_DRIVER("warning: pcode (write of 0x%08x to mbox %x) mailbox access failed for %ps: %d\n",
+				 val, mbox, __builtin_return_address(0), status);
 		return status;
 	}
 
@@ -9078,7 +9108,7 @@ static int chv_freq_opcode(struct drm_i915_private *dev_priv, int val)
 
 int intel_gpu_freq(struct drm_i915_private *dev_priv, int val)
 {
-	if (IS_GEN9(dev_priv))
+	if (INTEL_GEN(dev_priv) >= 9)
 		return DIV_ROUND_CLOSEST(val * GT_FREQUENCY_MULTIPLIER,
 					 GEN9_FREQ_SCALER);
 	else if (IS_CHERRYVIEW(dev_priv))
@@ -9091,7 +9121,7 @@ int intel_gpu_freq(struct drm_i915_private *dev_priv, int val)
 
 int intel_freq_opcode(struct drm_i915_private *dev_priv, int val)
 {
-	if (IS_GEN9(dev_priv))
+	if (INTEL_GEN(dev_priv) >= 9)
 		return DIV_ROUND_CLOSEST(val * GEN9_FREQ_SCALER,
 					 GT_FREQUENCY_MULTIPLIER);
 	else if (IS_CHERRYVIEW(dev_priv))
@@ -9113,7 +9143,7 @@ static void __intel_rps_boost_work(struct work_struct *work)
 	struct drm_i915_gem_request *req = boost->req;
 
 	if (!i915_gem_request_completed(req))
-		gen6_rps_boost(req->i915, NULL, req->emitted_jiffies);
+		gen6_rps_boost(req, NULL);
 
 	i915_gem_request_put(req);
 	kfree(boost);
@@ -9142,11 +9172,10 @@ void intel_queue_rps_boost_for_request(struct drm_i915_gem_request *req)
 void intel_pm_setup(struct drm_i915_private *dev_priv)
 {
 	mutex_init(&dev_priv->rps.hw_lock);
-	spin_lock_init(&dev_priv->rps.client_lock);
 
 	INIT_DELAYED_WORK(&dev_priv->rps.autoenable_work,
 			  __intel_autoenable_gt_powersave);
-	INIT_LIST_HEAD(&dev_priv->rps.clients);
+	atomic_set(&dev_priv->rps.num_waiters, 0);
 
 	dev_priv->pm.suspended = false;
 	atomic_set(&dev_priv->pm.wakeref_count, 0);
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 559f1ab..1b31ab0 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -315,6 +315,7 @@ static void intel_enable_source_psr1(struct intel_dp *intel_dp)
 	else
 		val |= EDP_PSR_TP1_TP2_SEL;
 
+	val |= I915_READ(EDP_PSR_CTL) & EDP_PSR_RESTORE_PSR_ACTIVE_CTX_MASK;
 	I915_WRITE(EDP_PSR_CTL, val);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_renderstate_gen9.c b/drivers/gpu/drm/i915/intel_renderstate_gen9.c
index 16a7ec2..7d3ac02 100644
--- a/drivers/gpu/drm/i915/intel_renderstate_gen9.c
+++ b/drivers/gpu/drm/i915/intel_renderstate_gen9.c
@@ -20,7 +20,7 @@
  * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
  * DEALINGS IN THE SOFTWARE.
  *
- * Generated by: intel-gpu-tools-1.8-220-g01153e7
+ * Generated by: intel-gpu-tools-1.19-177-g68e2eab2
  */
 
 #include "intel_renderstate.h"
@@ -873,7 +873,7 @@ static const u32 gen9_null_state_batch[] = {
 	0x00000000,
 	0x00000000,
 	0x78550003,
-	0x00000000,
+	0x0000000f,
 	0x00000000,
 	0x00000000,
 	0x00000000,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index acd1da9..cdf084e 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -1712,6 +1712,9 @@ u32 *intel_ring_begin(struct drm_i915_gem_request *req,
 	unsigned int total_bytes;
 	u32 *cs;
 
+	/* Packets must be qword aligned. */
+	GEM_BUG_ON(num_dwords & 1);
+
 	total_bytes = bytes + req->reserved_space;
 	GEM_BUG_ON(total_bytes > ring->effective_size);
 
@@ -2140,7 +2143,7 @@ static void intel_ring_default_vfuncs(struct drm_i915_private *dev_priv,
 
 		engine->emit_breadcrumb = gen6_sema_emit_breadcrumb;
 
-		num_rings = hweight32(INTEL_INFO(dev_priv)->ring_mask) - 1;
+		num_rings = INTEL_INFO(dev_priv)->num_rings - 1;
 		if (INTEL_GEN(dev_priv) >= 8) {
 			engine->emit_breadcrumb_sz += num_rings * 6;
 		} else {
@@ -2184,8 +2187,7 @@ int intel_init_render_ring_buffer(struct intel_engine_cs *engine)
 
 			engine->semaphore.signal = gen8_rcs_signal;
 
-			num_rings =
-				hweight32(INTEL_INFO(dev_priv)->ring_mask) - 1;
+			num_rings = INTEL_INFO(dev_priv)->num_rings - 1;
 			engine->emit_breadcrumb_sz += num_rings * 8;
 		}
 	} else if (INTEL_GEN(dev_priv) >= 6) {
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index 6aa20ac..02d8974 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -121,6 +121,7 @@ struct intel_engine_hangcheck {
 	unsigned long action_timestamp;
 	int deadlock;
 	struct intel_instdone instdone;
+	struct drm_i915_gem_request *active_request;
 	bool stalled;
 };
 
@@ -734,4 +735,16 @@ bool intel_engines_are_idle(struct drm_i915_private *dev_priv);
 void intel_engines_mark_idle(struct drm_i915_private *i915);
 void intel_engines_reset_default_submission(struct drm_i915_private *i915);
 
+static inline bool
+__intel_engine_can_store_dword(unsigned int gen, unsigned int class)
+{
+	if (gen <= 2)
+		return false; /* uses physical not virtual addresses */
+
+	if (gen == 6 && class == VIDEO_DECODE_CLASS)
+		return false; /* b0rked */
+
+	return true;
+}
+
 #endif /* _INTEL_RINGBUFFER_H_ */
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index efe80ed..b66d8e1 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -50,10 +50,11 @@
  */
 
 bool intel_display_power_well_is_enabled(struct drm_i915_private *dev_priv,
-				    int power_well_id);
+					 enum i915_power_well_id power_well_id);
 
 static struct i915_power_well *
-lookup_power_well(struct drm_i915_private *dev_priv, int power_well_id);
+lookup_power_well(struct drm_i915_private *dev_priv,
+		  enum i915_power_well_id power_well_id);
 
 const char *
 intel_display_power_domain_str(enum intel_display_power_domain domain)
@@ -168,18 +169,6 @@ static void intel_power_well_put(struct drm_i915_private *dev_priv,
 		intel_power_well_disable(dev_priv, power_well);
 }
 
-/*
- * We should only use the power well if we explicitly asked the hardware to
- * enable it, so check if it's enabled and also check if we've requested it to
- * be enabled.
- */
-static bool hsw_power_well_enabled(struct drm_i915_private *dev_priv,
-				   struct i915_power_well *power_well)
-{
-	return I915_READ(HSW_PWR_WELL_DRIVER) ==
-		     (HSW_PWR_WELL_ENABLE_REQUEST | HSW_PWR_WELL_STATE_ENABLED);
-}
-
 /**
  * __intel_display_power_is_enabled - unlocked check for a power domain
  * @dev_priv: i915 device instance
@@ -278,7 +267,8 @@ void intel_display_set_init_power(struct drm_i915_private *dev_priv,
  * to be enabled, and it will only be disabled if none of the registers is
  * requesting it to be enabled.
  */
-static void hsw_power_well_post_enable(struct drm_i915_private *dev_priv)
+static void hsw_power_well_post_enable(struct drm_i915_private *dev_priv,
+				       u8 irq_pipe_mask, bool has_vga)
 {
 	struct pci_dev *pdev = dev_priv->drm.pdev;
 
@@ -292,264 +282,158 @@ static void hsw_power_well_post_enable(struct drm_i915_private *dev_priv)
 	 * sure vgacon can keep working normally without triggering interrupts
 	 * and error messages.
 	 */
-	vga_get_uninterruptible(pdev, VGA_RSRC_LEGACY_IO);
-	outb(inb(VGA_MSR_READ), VGA_MSR_WRITE);
-	vga_put(pdev, VGA_RSRC_LEGACY_IO);
-
-	if (IS_BROADWELL(dev_priv))
-		gen8_irq_power_well_post_enable(dev_priv,
-						1 << PIPE_C | 1 << PIPE_B);
-}
-
-static void hsw_power_well_pre_disable(struct drm_i915_private *dev_priv)
-{
-	if (IS_BROADWELL(dev_priv))
-		gen8_irq_power_well_pre_disable(dev_priv,
-						1 << PIPE_C | 1 << PIPE_B);
-}
-
-static void skl_power_well_post_enable(struct drm_i915_private *dev_priv,
-				       struct i915_power_well *power_well)
-{
-	struct pci_dev *pdev = dev_priv->drm.pdev;
-
-	/*
-	 * After we re-enable the power well, if we touch VGA register 0x3d5
-	 * we'll get unclaimed register interrupts. This stops after we write
-	 * anything to the VGA MSR register. The vgacon module uses this
-	 * register all the time, so if we unbind our driver and, as a
-	 * consequence, bind vgacon, we'll get stuck in an infinite loop at
-	 * console_unlock(). So make here we touch the VGA MSR register, making
-	 * sure vgacon can keep working normally without triggering interrupts
-	 * and error messages.
-	 */
-	if (power_well->id == SKL_DISP_PW_2) {
+	if (has_vga) {
 		vga_get_uninterruptible(pdev, VGA_RSRC_LEGACY_IO);
 		outb(inb(VGA_MSR_READ), VGA_MSR_WRITE);
 		vga_put(pdev, VGA_RSRC_LEGACY_IO);
-
-		gen8_irq_power_well_post_enable(dev_priv,
-						1 << PIPE_C | 1 << PIPE_B);
 	}
+
+	if (irq_pipe_mask)
+		gen8_irq_power_well_post_enable(dev_priv, irq_pipe_mask);
 }
 
-static void skl_power_well_pre_disable(struct drm_i915_private *dev_priv,
-				       struct i915_power_well *power_well)
+static void hsw_power_well_pre_disable(struct drm_i915_private *dev_priv,
+				       u8 irq_pipe_mask)
 {
-	if (power_well->id == SKL_DISP_PW_2)
-		gen8_irq_power_well_pre_disable(dev_priv,
-						1 << PIPE_C | 1 << PIPE_B);
+	if (irq_pipe_mask)
+		gen8_irq_power_well_pre_disable(dev_priv, irq_pipe_mask);
 }
 
-static void hsw_set_power_well(struct drm_i915_private *dev_priv,
-			       struct i915_power_well *power_well, bool enable)
+
+static void hsw_wait_for_power_well_enable(struct drm_i915_private *dev_priv,
+					   struct i915_power_well *power_well)
 {
-	bool is_enabled, enable_requested;
-	uint32_t tmp;
+	enum i915_power_well_id id = power_well->id;
 
-	tmp = I915_READ(HSW_PWR_WELL_DRIVER);
-	is_enabled = tmp & HSW_PWR_WELL_STATE_ENABLED;
-	enable_requested = tmp & HSW_PWR_WELL_ENABLE_REQUEST;
+	/* Timeout for PW1:10 us, AUX:not specified, other PWs:20 us. */
+	WARN_ON(intel_wait_for_register(dev_priv,
+					HSW_PWR_WELL_CTL_DRIVER(id),
+					HSW_PWR_WELL_CTL_STATE(id),
+					HSW_PWR_WELL_CTL_STATE(id),
+					1));
+}
 
-	if (enable) {
-		if (!enable_requested)
-			I915_WRITE(HSW_PWR_WELL_DRIVER,
-				   HSW_PWR_WELL_ENABLE_REQUEST);
+static u32 hsw_power_well_requesters(struct drm_i915_private *dev_priv,
+				     enum i915_power_well_id id)
+{
+	u32 req_mask = HSW_PWR_WELL_CTL_REQ(id);
+	u32 ret;
 
-		if (!is_enabled) {
-			DRM_DEBUG_KMS("Enabling power well\n");
-			if (intel_wait_for_register(dev_priv,
-						    HSW_PWR_WELL_DRIVER,
-						    HSW_PWR_WELL_STATE_ENABLED,
-						    HSW_PWR_WELL_STATE_ENABLED,
-						    20))
-				DRM_ERROR("Timeout enabling power well\n");
-			hsw_power_well_post_enable(dev_priv);
-		}
+	ret = I915_READ(HSW_PWR_WELL_CTL_BIOS(id)) & req_mask ? 1 : 0;
+	ret |= I915_READ(HSW_PWR_WELL_CTL_DRIVER(id)) & req_mask ? 2 : 0;
+	ret |= I915_READ(HSW_PWR_WELL_CTL_KVMR) & req_mask ? 4 : 0;
+	ret |= I915_READ(HSW_PWR_WELL_CTL_DEBUG(id)) & req_mask ? 8 : 0;
 
-	} else {
-		if (enable_requested) {
-			hsw_power_well_pre_disable(dev_priv);
-			I915_WRITE(HSW_PWR_WELL_DRIVER, 0);
-			POSTING_READ(HSW_PWR_WELL_DRIVER);
-			DRM_DEBUG_KMS("Requesting to disable the power well\n");
-		}
+	return ret;
+}
+
+static void hsw_wait_for_power_well_disable(struct drm_i915_private *dev_priv,
+					    struct i915_power_well *power_well)
+{
+	enum i915_power_well_id id = power_well->id;
+	bool disabled;
+	u32 reqs;
+
+	/*
+	 * Bspec doesn't require waiting for PWs to get disabled, but still do
+	 * this for paranoia. The known cases where a PW will be forced on:
+	 * - a KVMR request on any power well via the KVMR request register
+	 * - a DMC request on PW1 and MISC_IO power wells via the BIOS and
+	 *   DEBUG request registers
+	 * Skip the wait in case any of the request bits are set and print a
+	 * diagnostic message.
+	 */
+	wait_for((disabled = !(I915_READ(HSW_PWR_WELL_CTL_DRIVER(id)) &
+			       HSW_PWR_WELL_CTL_STATE(id))) ||
+		 (reqs = hsw_power_well_requesters(dev_priv, id)), 1);
+	if (disabled)
+		return;
+
+	DRM_DEBUG_KMS("%s forced on (bios:%d driver:%d kvmr:%d debug:%d)\n",
+		      power_well->name,
+		      !!(reqs & 1), !!(reqs & 2), !!(reqs & 4), !!(reqs & 8));
+}
+
+static void gen9_wait_for_power_well_fuses(struct drm_i915_private *dev_priv,
+					   enum skl_power_gate pg)
+{
+	/* Timeout 5us for PG#0, for other PGs 1us */
+	WARN_ON(intel_wait_for_register(dev_priv, SKL_FUSE_STATUS,
+					SKL_FUSE_PG_DIST_STATUS(pg),
+					SKL_FUSE_PG_DIST_STATUS(pg), 1));
+}
+
+static void hsw_power_well_enable(struct drm_i915_private *dev_priv,
+				  struct i915_power_well *power_well)
+{
+	enum i915_power_well_id id = power_well->id;
+	bool wait_fuses = power_well->hsw.has_fuses;
+	enum skl_power_gate pg;
+	u32 val;
+
+	if (wait_fuses) {
+		pg = SKL_PW_TO_PG(id);
+		/*
+		 * For PW1 we have to wait both for the PW0/PG0 fuse state
+		 * before enabling the power well and PW1/PG1's own fuse
+		 * state after the enabling. For all other power wells with
+		 * fuses we only have to wait for that PW/PG's fuse state
+		 * after the enabling.
+		 */
+		if (pg == SKL_PG1)
+			gen9_wait_for_power_well_fuses(dev_priv, SKL_PG0);
 	}
+
+	val = I915_READ(HSW_PWR_WELL_CTL_DRIVER(id));
+	I915_WRITE(HSW_PWR_WELL_CTL_DRIVER(id), val | HSW_PWR_WELL_CTL_REQ(id));
+	hsw_wait_for_power_well_enable(dev_priv, power_well);
+
+	if (wait_fuses)
+		gen9_wait_for_power_well_fuses(dev_priv, pg);
+
+	hsw_power_well_post_enable(dev_priv, power_well->hsw.irq_pipe_mask,
+				   power_well->hsw.has_vga);
 }
 
-#define SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_E_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_B) |                       \
-	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
-	BIT_ULL(POWER_DOMAIN_AUX_D) |			\
-	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
-	BIT_ULL(POWER_DOMAIN_VGA) |				\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define SKL_DISPLAY_DDI_IO_A_E_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_IO) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_E_IO) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define SKL_DISPLAY_DDI_IO_B_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_IO) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define SKL_DISPLAY_DDI_IO_C_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_IO) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define SKL_DISPLAY_DDI_IO_D_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_IO) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define SKL_DISPLAY_DC_OFF_POWER_DOMAINS (		\
-	SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
-	BIT_ULL(POWER_DOMAIN_MODESET) |			\
-	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
+static void hsw_power_well_disable(struct drm_i915_private *dev_priv,
+				   struct i915_power_well *power_well)
+{
+	enum i915_power_well_id id = power_well->id;
+	u32 val;
 
-#define BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_B) |			\
-	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
-	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
-	BIT_ULL(POWER_DOMAIN_VGA) |				\
-	BIT_ULL(POWER_DOMAIN_GMBUS) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define BXT_DISPLAY_DC_OFF_POWER_DOMAINS (		\
-	BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
-	BIT_ULL(POWER_DOMAIN_MODESET) |			\
-	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define BXT_DPIO_CMN_A_POWER_DOMAINS (			\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define BXT_DPIO_CMN_BC_POWER_DOMAINS (			\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_B) |			\
-	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
+	hsw_power_well_pre_disable(dev_priv, power_well->hsw.irq_pipe_mask);
 
-#define GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_B) |                       \
-	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
-	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
-	BIT_ULL(POWER_DOMAIN_VGA) |				\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define GLK_DISPLAY_DDI_IO_A_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_IO))
-#define GLK_DISPLAY_DDI_IO_B_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_IO))
-#define GLK_DISPLAY_DDI_IO_C_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_IO))
-#define GLK_DPIO_CMN_A_POWER_DOMAINS (			\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define GLK_DPIO_CMN_B_POWER_DOMAINS (			\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_B) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define GLK_DPIO_CMN_C_POWER_DOMAINS (			\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define GLK_DISPLAY_AUX_A_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_AUX_A) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define GLK_DISPLAY_AUX_B_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_AUX_B) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define GLK_DISPLAY_AUX_C_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_AUX_C) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define GLK_DISPLAY_DC_OFF_POWER_DOMAINS (		\
-	GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
-	BIT_ULL(POWER_DOMAIN_MODESET) |			\
-	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
+	val = I915_READ(HSW_PWR_WELL_CTL_DRIVER(id));
+	I915_WRITE(HSW_PWR_WELL_CTL_DRIVER(id),
+		   val & ~HSW_PWR_WELL_CTL_REQ(id));
+	hsw_wait_for_power_well_disable(dev_priv, power_well);
+}
 
-#define CNL_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_E_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_AUX_B) |                       \
-	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
-	BIT_ULL(POWER_DOMAIN_AUX_D) |			\
-	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
-	BIT_ULL(POWER_DOMAIN_VGA) |				\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_DDI_A_IO_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_IO) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_E_IO) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_DDI_B_IO_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_IO) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_DDI_C_IO_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_IO) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_DDI_D_IO_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_IO) |		\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_AUX_A_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_AUX_B_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_AUX_B) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_AUX_C_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_AUX_D_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_AUX_D) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-#define CNL_DISPLAY_DC_OFF_POWER_DOMAINS (		\
-	CNL_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
-	BIT_ULL(POWER_DOMAIN_MODESET) |			\
-	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
+/*
+ * We should only use the power well if we explicitly asked the hardware to
+ * enable it, so check if it's enabled and also check if we've requested it to
+ * be enabled.
+ */
+static bool hsw_power_well_enabled(struct drm_i915_private *dev_priv,
+				   struct i915_power_well *power_well)
+{
+	enum i915_power_well_id id = power_well->id;
+	u32 mask = HSW_PWR_WELL_CTL_REQ(id) | HSW_PWR_WELL_CTL_STATE(id);
+
+	return (I915_READ(HSW_PWR_WELL_CTL_DRIVER(id)) & mask) == mask;
+}
 
 static void assert_can_enable_dc9(struct drm_i915_private *dev_priv)
 {
+	enum i915_power_well_id id = SKL_DISP_PW_2;
+
 	WARN_ONCE((I915_READ(DC_STATE_EN) & DC_STATE_EN_DC9),
 		  "DC9 already programmed to be enabled.\n");
 	WARN_ONCE(I915_READ(DC_STATE_EN) & DC_STATE_EN_UPTO_DC5,
 		  "DC5 still not disabled to enable DC9.\n");
-	WARN_ONCE(I915_READ(HSW_PWR_WELL_DRIVER), "Power well on.\n");
+	WARN_ONCE(I915_READ(HSW_PWR_WELL_CTL_DRIVER(id)) &
+		  HSW_PWR_WELL_CTL_REQ(id),
+		  "Power well 2 on.\n");
 	WARN_ONCE(intel_irqs_enabled(dev_priv),
 		  "Interrupts not disabled yet.\n");
 
@@ -744,223 +628,39 @@ void skl_disable_dc6(struct drm_i915_private *dev_priv)
 	gen9_set_dc_state(dev_priv, DC_STATE_DISABLE);
 }
 
-static void
-gen9_sanitize_power_well_requests(struct drm_i915_private *dev_priv,
-				  struct i915_power_well *power_well)
-{
-	enum skl_disp_power_wells power_well_id = power_well->id;
-	u32 val;
-	u32 mask;
-
-	mask = SKL_POWER_WELL_REQ(power_well_id);
-
-	val = I915_READ(HSW_PWR_WELL_KVMR);
-	if (WARN_ONCE(val & mask, "Clearing unexpected KVMR request for %s\n",
-		      power_well->name))
-		I915_WRITE(HSW_PWR_WELL_KVMR, val & ~mask);
-
-	val = I915_READ(HSW_PWR_WELL_BIOS);
-	val |= I915_READ(HSW_PWR_WELL_DEBUG);
-
-	if (!(val & mask))
-		return;
-
-	/*
-	 * DMC is known to force on the request bits for power well 1 on SKL
-	 * and BXT and the misc IO power well on SKL but we don't expect any
-	 * other request bits to be set, so WARN for those.
-	 */
-	if (power_well_id == SKL_DISP_PW_1 ||
-	    (IS_GEN9_BC(dev_priv) &&
-	     power_well_id == SKL_DISP_PW_MISC_IO))
-		DRM_DEBUG_DRIVER("Clearing auxiliary requests for %s forced on "
-				 "by DMC\n", power_well->name);
-	else
-		WARN_ONCE(1, "Clearing unexpected auxiliary requests for %s\n",
-			  power_well->name);
-
-	I915_WRITE(HSW_PWR_WELL_BIOS, val & ~mask);
-	I915_WRITE(HSW_PWR_WELL_DEBUG, val & ~mask);
-}
-
-static void skl_set_power_well(struct drm_i915_private *dev_priv,
-			       struct i915_power_well *power_well, bool enable)
-{
-	uint32_t tmp, fuse_status;
-	uint32_t req_mask, state_mask;
-	bool is_enabled, enable_requested, check_fuse_status = false;
-
-	tmp = I915_READ(HSW_PWR_WELL_DRIVER);
-	fuse_status = I915_READ(SKL_FUSE_STATUS);
-
-	switch (power_well->id) {
-	case SKL_DISP_PW_1:
-		if (intel_wait_for_register(dev_priv,
-					    SKL_FUSE_STATUS,
-					    SKL_FUSE_PG0_DIST_STATUS,
-					    SKL_FUSE_PG0_DIST_STATUS,
-					    1)) {
-			DRM_ERROR("PG0 not enabled\n");
-			return;
-		}
-		break;
-	case SKL_DISP_PW_2:
-		if (!(fuse_status & SKL_FUSE_PG1_DIST_STATUS)) {
-			DRM_ERROR("PG1 in disabled state\n");
-			return;
-		}
-		break;
-	case SKL_DISP_PW_MISC_IO:
-	case SKL_DISP_PW_DDI_A_E: /* GLK_DISP_PW_DDI_A, CNL_DISP_PW_DDI_A */
-	case SKL_DISP_PW_DDI_B:
-	case SKL_DISP_PW_DDI_C:
-	case SKL_DISP_PW_DDI_D:
-	case GLK_DISP_PW_AUX_A: /* CNL_DISP_PW_AUX_A */
-	case GLK_DISP_PW_AUX_B: /* CNL_DISP_PW_AUX_B */
-	case GLK_DISP_PW_AUX_C: /* CNL_DISP_PW_AUX_C */
-	case CNL_DISP_PW_AUX_D:
-		break;
-	default:
-		WARN(1, "Unknown power well %lu\n", power_well->id);
-		return;
-	}
-
-	req_mask = SKL_POWER_WELL_REQ(power_well->id);
-	enable_requested = tmp & req_mask;
-	state_mask = SKL_POWER_WELL_STATE(power_well->id);
-	is_enabled = tmp & state_mask;
-
-	if (!enable && enable_requested)
-		skl_power_well_pre_disable(dev_priv, power_well);
-
-	if (enable) {
-		if (!enable_requested) {
-			WARN((tmp & state_mask) &&
-				!I915_READ(HSW_PWR_WELL_BIOS),
-				"Invalid for power well status to be enabled, unless done by the BIOS, \
-				when request is to disable!\n");
-			I915_WRITE(HSW_PWR_WELL_DRIVER, tmp | req_mask);
-		}
-
-		if (!is_enabled) {
-			DRM_DEBUG_KMS("Enabling %s\n", power_well->name);
-			check_fuse_status = true;
-		}
-	} else {
-		if (enable_requested) {
-			I915_WRITE(HSW_PWR_WELL_DRIVER,	tmp & ~req_mask);
-			POSTING_READ(HSW_PWR_WELL_DRIVER);
-			DRM_DEBUG_KMS("Disabling %s\n", power_well->name);
-		}
-
-		gen9_sanitize_power_well_requests(dev_priv, power_well);
-	}
-
-	if (wait_for(!!(I915_READ(HSW_PWR_WELL_DRIVER) & state_mask) == enable,
-		     1))
-		DRM_ERROR("%s %s timeout\n",
-			  power_well->name, enable ? "enable" : "disable");
-
-	if (check_fuse_status) {
-		if (power_well->id == SKL_DISP_PW_1) {
-			if (intel_wait_for_register(dev_priv,
-						    SKL_FUSE_STATUS,
-						    SKL_FUSE_PG1_DIST_STATUS,
-						    SKL_FUSE_PG1_DIST_STATUS,
-						    1))
-				DRM_ERROR("PG1 distributing status timeout\n");
-		} else if (power_well->id == SKL_DISP_PW_2) {
-			if (intel_wait_for_register(dev_priv,
-						    SKL_FUSE_STATUS,
-						    SKL_FUSE_PG2_DIST_STATUS,
-						    SKL_FUSE_PG2_DIST_STATUS,
-						    1))
-				DRM_ERROR("PG2 distributing status timeout\n");
-		}
-	}
-
-	if (enable && !is_enabled)
-		skl_power_well_post_enable(dev_priv, power_well);
-}
-
 static void hsw_power_well_sync_hw(struct drm_i915_private *dev_priv,
 				   struct i915_power_well *power_well)
 {
-	/* Take over the request bit if set by BIOS. */
-	if (I915_READ(HSW_PWR_WELL_BIOS) & HSW_PWR_WELL_ENABLE_REQUEST) {
-		if (!(I915_READ(HSW_PWR_WELL_DRIVER) &
-		      HSW_PWR_WELL_ENABLE_REQUEST))
-			I915_WRITE(HSW_PWR_WELL_DRIVER,
-				   HSW_PWR_WELL_ENABLE_REQUEST);
-		I915_WRITE(HSW_PWR_WELL_BIOS, 0);
-	}
-}
-
-static void hsw_power_well_enable(struct drm_i915_private *dev_priv,
-				  struct i915_power_well *power_well)
-{
-	hsw_set_power_well(dev_priv, power_well, true);
-}
-
-static void hsw_power_well_disable(struct drm_i915_private *dev_priv,
-				   struct i915_power_well *power_well)
-{
-	hsw_set_power_well(dev_priv, power_well, false);
-}
-
-static bool skl_power_well_enabled(struct drm_i915_private *dev_priv,
-					struct i915_power_well *power_well)
-{
-	uint32_t mask = SKL_POWER_WELL_REQ(power_well->id) |
-		SKL_POWER_WELL_STATE(power_well->id);
-
-	return (I915_READ(HSW_PWR_WELL_DRIVER) & mask) == mask;
-}
-
-static void skl_power_well_sync_hw(struct drm_i915_private *dev_priv,
-				struct i915_power_well *power_well)
-{
-	uint32_t mask = SKL_POWER_WELL_REQ(power_well->id);
-	uint32_t bios_req = I915_READ(HSW_PWR_WELL_BIOS);
+	enum i915_power_well_id id = power_well->id;
+	u32 mask = HSW_PWR_WELL_CTL_REQ(id);
+	u32 bios_req = I915_READ(HSW_PWR_WELL_CTL_BIOS(id));
 
 	/* Take over the request bit if set by BIOS. */
 	if (bios_req & mask) {
-		uint32_t drv_req = I915_READ(HSW_PWR_WELL_DRIVER);
+		u32 drv_req = I915_READ(HSW_PWR_WELL_CTL_DRIVER(id));
 
 		if (!(drv_req & mask))
-			I915_WRITE(HSW_PWR_WELL_DRIVER, drv_req | mask);
-		I915_WRITE(HSW_PWR_WELL_BIOS, bios_req & ~mask);
+			I915_WRITE(HSW_PWR_WELL_CTL_DRIVER(id), drv_req | mask);
+		I915_WRITE(HSW_PWR_WELL_CTL_BIOS(id), bios_req & ~mask);
 	}
 }
 
-static void skl_power_well_enable(struct drm_i915_private *dev_priv,
-				struct i915_power_well *power_well)
-{
-	skl_set_power_well(dev_priv, power_well, true);
-}
-
-static void skl_power_well_disable(struct drm_i915_private *dev_priv,
-				struct i915_power_well *power_well)
-{
-	skl_set_power_well(dev_priv, power_well, false);
-}
-
 static void bxt_dpio_cmn_power_well_enable(struct drm_i915_private *dev_priv,
 					   struct i915_power_well *power_well)
 {
-	bxt_ddi_phy_init(dev_priv, power_well->data);
+	bxt_ddi_phy_init(dev_priv, power_well->bxt.phy);
 }
 
 static void bxt_dpio_cmn_power_well_disable(struct drm_i915_private *dev_priv,
 					    struct i915_power_well *power_well)
 {
-	bxt_ddi_phy_uninit(dev_priv, power_well->data);
+	bxt_ddi_phy_uninit(dev_priv, power_well->bxt.phy);
 }
 
 static bool bxt_dpio_cmn_power_well_enabled(struct drm_i915_private *dev_priv,
 					    struct i915_power_well *power_well)
 {
-	return bxt_ddi_phy_is_enabled(dev_priv, power_well->data);
+	return bxt_ddi_phy_is_enabled(dev_priv, power_well->bxt.phy);
 }
 
 static void bxt_verify_ddi_phy_power_wells(struct drm_i915_private *dev_priv)
@@ -969,16 +669,16 @@ static void bxt_verify_ddi_phy_power_wells(struct drm_i915_private *dev_priv)
 
 	power_well = lookup_power_well(dev_priv, BXT_DPIO_CMN_A);
 	if (power_well->count > 0)
-		bxt_ddi_phy_verify_state(dev_priv, power_well->data);
+		bxt_ddi_phy_verify_state(dev_priv, power_well->bxt.phy);
 
 	power_well = lookup_power_well(dev_priv, BXT_DPIO_CMN_BC);
 	if (power_well->count > 0)
-		bxt_ddi_phy_verify_state(dev_priv, power_well->data);
+		bxt_ddi_phy_verify_state(dev_priv, power_well->bxt.phy);
 
 	if (IS_GEMINILAKE(dev_priv)) {
 		power_well = lookup_power_well(dev_priv, GLK_DPIO_CMN_C);
 		if (power_well->count > 0)
-			bxt_ddi_phy_verify_state(dev_priv, power_well->data);
+			bxt_ddi_phy_verify_state(dev_priv, power_well->bxt.phy);
 	}
 }
 
@@ -1076,7 +776,7 @@ static void i830_pipes_power_well_sync_hw(struct drm_i915_private *dev_priv,
 static void vlv_set_power_well(struct drm_i915_private *dev_priv,
 			       struct i915_power_well *power_well, bool enable)
 {
-	enum punit_power_well power_well_id = power_well->id;
+	enum i915_power_well_id power_well_id = power_well->id;
 	u32 mask;
 	u32 state;
 	u32 ctrl;
@@ -1124,7 +824,7 @@ static void vlv_power_well_disable(struct drm_i915_private *dev_priv,
 static bool vlv_power_well_enabled(struct drm_i915_private *dev_priv,
 				   struct i915_power_well *power_well)
 {
-	int power_well_id = power_well->id;
+	enum i915_power_well_id power_well_id = power_well->id;
 	bool enabled = false;
 	u32 mask;
 	u32 state;
@@ -1311,8 +1011,9 @@ static void vlv_dpio_cmn_power_well_disable(struct drm_i915_private *dev_priv,
 
 #define POWER_DOMAIN_MASK (GENMASK_ULL(POWER_DOMAIN_NUM - 1, 0))
 
-static struct i915_power_well *lookup_power_well(struct drm_i915_private *dev_priv,
-						 int power_well_id)
+static struct i915_power_well *
+lookup_power_well(struct drm_i915_private *dev_priv,
+		  enum i915_power_well_id power_well_id)
 {
 	struct i915_power_domains *power_domains = &dev_priv->power_domains;
 	int i;
@@ -1659,7 +1360,7 @@ void chv_phy_powergate_lanes(struct intel_encoder *encoder,
 static bool chv_pipe_power_well_enabled(struct drm_i915_private *dev_priv,
 					struct i915_power_well *power_well)
 {
-	enum pipe pipe = power_well->id;
+	enum pipe pipe = PIPE_A;
 	bool enabled;
 	u32 state, ctrl;
 
@@ -1689,7 +1390,7 @@ static void chv_set_pipe_power_well(struct drm_i915_private *dev_priv,
 				    struct i915_power_well *power_well,
 				    bool enable)
 {
-	enum pipe pipe = power_well->id;
+	enum pipe pipe = PIPE_A;
 	u32 state;
 	u32 ctrl;
 
@@ -1722,7 +1423,7 @@ static void chv_set_pipe_power_well(struct drm_i915_private *dev_priv,
 static void chv_pipe_power_well_enable(struct drm_i915_private *dev_priv,
 				       struct i915_power_well *power_well)
 {
-	WARN_ON_ONCE(power_well->id != PIPE_A);
+	WARN_ON_ONCE(power_well->id != CHV_DISP_PW_PIPE_A);
 
 	chv_set_pipe_power_well(dev_priv, power_well, true);
 
@@ -1732,7 +1433,7 @@ static void chv_pipe_power_well_enable(struct drm_i915_private *dev_priv,
 static void chv_pipe_power_well_disable(struct drm_i915_private *dev_priv,
 					struct i915_power_well *power_well)
 {
-	WARN_ON_ONCE(power_well->id != PIPE_A);
+	WARN_ON_ONCE(power_well->id != CHV_DISP_PW_PIPE_A);
 
 	vlv_display_power_well_deinit(dev_priv);
 
@@ -1848,37 +1549,13 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
 	intel_runtime_pm_put(dev_priv);
 }
 
-#define HSW_DISPLAY_POWER_DOMAINS (			\
-	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
-	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
-	BIT_ULL(POWER_DOMAIN_PIPE_A_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_CRT) | /* DDI E */	\
-	BIT_ULL(POWER_DOMAIN_VGA) |				\
-	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
-	BIT_ULL(POWER_DOMAIN_INIT))
-
-#define BDW_DISPLAY_POWER_DOMAINS (			\
-	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
-	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
-	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_LANES) |		\
-	BIT_ULL(POWER_DOMAIN_PORT_CRT) | /* DDI E */	\
-	BIT_ULL(POWER_DOMAIN_VGA) |				\
-	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
+#define I830_PIPES_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PIPE_A) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_A_PANEL_FITTER) |	\
+	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |	\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |	\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |	\
 	BIT_ULL(POWER_DOMAIN_INIT))
 
 #define VLV_DISPLAY_POWER_DOMAINS (		\
@@ -1961,13 +1638,201 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
 	BIT_ULL(POWER_DOMAIN_AUX_D) |		\
 	BIT_ULL(POWER_DOMAIN_INIT))
 
-#define I830_PIPES_POWER_DOMAINS (		\
-	BIT_ULL(POWER_DOMAIN_PIPE_A) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_B) |		\
-	BIT_ULL(POWER_DOMAIN_PIPE_A_PANEL_FITTER) |	\
-	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |	\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |	\
-	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |	\
+#define HSW_DISPLAY_POWER_DOMAINS (			\
+	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
+	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
+	BIT_ULL(POWER_DOMAIN_PIPE_A_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_CRT) | /* DDI E */	\
+	BIT_ULL(POWER_DOMAIN_VGA) |				\
+	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+
+#define BDW_DISPLAY_POWER_DOMAINS (			\
+	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
+	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
+	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_CRT) | /* DDI E */	\
+	BIT_ULL(POWER_DOMAIN_VGA) |				\
+	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+
+#define SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_E_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_B) |                       \
+	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
+	BIT_ULL(POWER_DOMAIN_AUX_D) |			\
+	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
+	BIT_ULL(POWER_DOMAIN_VGA) |				\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define SKL_DISPLAY_DDI_IO_A_E_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_IO) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_E_IO) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define SKL_DISPLAY_DDI_IO_B_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_IO) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define SKL_DISPLAY_DDI_IO_C_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_IO) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define SKL_DISPLAY_DDI_IO_D_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_IO) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define SKL_DISPLAY_DC_OFF_POWER_DOMAINS (		\
+	SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
+	BIT_ULL(POWER_DOMAIN_MODESET) |			\
+	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+
+#define BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_B) |			\
+	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
+	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
+	BIT_ULL(POWER_DOMAIN_VGA) |				\
+	BIT_ULL(POWER_DOMAIN_GMBUS) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define BXT_DISPLAY_DC_OFF_POWER_DOMAINS (		\
+	BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
+	BIT_ULL(POWER_DOMAIN_MODESET) |			\
+	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define BXT_DPIO_CMN_A_POWER_DOMAINS (			\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define BXT_DPIO_CMN_BC_POWER_DOMAINS (			\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_B) |			\
+	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+
+#define GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_B) |                       \
+	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
+	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
+	BIT_ULL(POWER_DOMAIN_VGA) |				\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define GLK_DISPLAY_DDI_IO_A_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_IO))
+#define GLK_DISPLAY_DDI_IO_B_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_IO))
+#define GLK_DISPLAY_DDI_IO_C_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_IO))
+#define GLK_DPIO_CMN_A_POWER_DOMAINS (			\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define GLK_DPIO_CMN_B_POWER_DOMAINS (			\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_B) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define GLK_DPIO_CMN_C_POWER_DOMAINS (			\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define GLK_DISPLAY_AUX_A_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_AUX_A) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define GLK_DISPLAY_AUX_B_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_AUX_B) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define GLK_DISPLAY_AUX_C_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_AUX_C) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define GLK_DISPLAY_DC_OFF_POWER_DOMAINS (		\
+	GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
+	BIT_ULL(POWER_DOMAIN_MODESET) |			\
+	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+
+#define CNL_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_A) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B) |			\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_B) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C) |			\
+	BIT_ULL(POWER_DOMAIN_TRANSCODER_C) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_B_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PIPE_C_PANEL_FITTER) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_LANES) |		\
+	BIT_ULL(POWER_DOMAIN_AUX_B) |                       \
+	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
+	BIT_ULL(POWER_DOMAIN_AUX_D) |			\
+	BIT_ULL(POWER_DOMAIN_AUDIO) |			\
+	BIT_ULL(POWER_DOMAIN_VGA) |				\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_DDI_A_IO_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_A_IO) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_DDI_B_IO_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_B_IO) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_DDI_C_IO_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_C_IO) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_DDI_D_IO_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_PORT_DDI_D_IO) |		\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_AUX_A_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_AUX_B_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_AUX_B) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_AUX_C_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_AUX_C) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_AUX_D_POWER_DOMAINS (		\
+	BIT_ULL(POWER_DOMAIN_AUX_D) |			\
+	BIT_ULL(POWER_DOMAIN_INIT))
+#define CNL_DISPLAY_DC_OFF_POWER_DOMAINS (		\
+	CNL_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
+	BIT_ULL(POWER_DOMAIN_MODESET) |			\
+	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
 	BIT_ULL(POWER_DOMAIN_INIT))
 
 static const struct i915_power_well_ops i9xx_always_on_power_well_ops = {
@@ -1997,6 +1862,7 @@ static struct i915_power_well i9xx_always_on_power_well[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 };
 
@@ -2013,11 +1879,13 @@ static struct i915_power_well i830_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "pipes",
 		.domains = I830_PIPES_POWER_DOMAINS,
 		.ops = &i830_pipes_power_well_ops,
+		.id = I830_DISP_PW_PIPES,
 	},
 };
 
@@ -2028,13 +1896,6 @@ static const struct i915_power_well_ops hsw_power_well_ops = {
 	.is_enabled = hsw_power_well_enabled,
 };
 
-static const struct i915_power_well_ops skl_power_well_ops = {
-	.sync_hw = skl_power_well_sync_hw,
-	.enable = skl_power_well_enable,
-	.disable = skl_power_well_disable,
-	.is_enabled = skl_power_well_enabled,
-};
-
 static const struct i915_power_well_ops gen9_dc_off_power_well_ops = {
 	.sync_hw = i9xx_power_well_sync_hw_noop,
 	.enable = gen9_dc_off_power_well_enable,
@@ -2055,11 +1916,16 @@ static struct i915_power_well hsw_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "display",
 		.domains = HSW_DISPLAY_POWER_DOMAINS,
 		.ops = &hsw_power_well_ops,
+		.id = HSW_DISP_PW_GLOBAL,
+		{
+			.hsw.has_vga = true,
+		},
 	},
 };
 
@@ -2069,11 +1935,17 @@ static struct i915_power_well bdw_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "display",
 		.domains = BDW_DISPLAY_POWER_DOMAINS,
 		.ops = &hsw_power_well_ops,
+		.id = HSW_DISP_PW_GLOBAL,
+		{
+			.hsw.irq_pipe_mask = BIT(PIPE_B) | BIT(PIPE_C),
+			.hsw.has_vga = true,
+		},
 	},
 };
 
@@ -2104,7 +1976,7 @@ static struct i915_power_well vlv_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
-		.id = PUNIT_POWER_WELL_ALWAYS_ON,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "display",
@@ -2162,6 +2034,7 @@ static struct i915_power_well chv_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "display",
@@ -2171,7 +2044,7 @@ static struct i915_power_well chv_power_wells[] = {
 		 * required for any pipe to work.
 		 */
 		.domains = CHV_DISPLAY_POWER_DOMAINS,
-		.id = PIPE_A,
+		.id = CHV_DISP_PW_PIPE_A,
 		.ops = &chv_pipe_power_well_ops,
 	},
 	{
@@ -2189,7 +2062,7 @@ static struct i915_power_well chv_power_wells[] = {
 };
 
 bool intel_display_power_well_is_enabled(struct drm_i915_private *dev_priv,
-				    int power_well_id)
+					 enum i915_power_well_id power_well_id)
 {
 	struct i915_power_well *power_well;
 	bool ret;
@@ -2206,20 +2079,23 @@ static struct i915_power_well skl_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
-		.id = SKL_DISP_PW_ALWAYS_ON,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "power well 1",
 		/* Handled by the DMC firmware */
 		.domains = 0,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_1,
+		{
+			.hsw.has_fuses = true,
+		},
 	},
 	{
 		.name = "MISC IO power well",
 		/* Handled by the DMC firmware */
 		.domains = 0,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_MISC_IO,
 	},
 	{
@@ -2231,31 +2107,36 @@ static struct i915_power_well skl_power_wells[] = {
 	{
 		.name = "power well 2",
 		.domains = SKL_DISPLAY_POWERWELL_2_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_2,
+		{
+			.hsw.irq_pipe_mask = BIT(PIPE_B) | BIT(PIPE_C),
+			.hsw.has_vga = true,
+			.hsw.has_fuses = true,
+		},
 	},
 	{
 		.name = "DDI A/E IO power well",
 		.domains = SKL_DISPLAY_DDI_IO_A_E_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_A_E,
 	},
 	{
 		.name = "DDI B IO power well",
 		.domains = SKL_DISPLAY_DDI_IO_B_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_B,
 	},
 	{
 		.name = "DDI C IO power well",
 		.domains = SKL_DISPLAY_DDI_IO_C_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_C,
 	},
 	{
 		.name = "DDI D IO power well",
 		.domains = SKL_DISPLAY_DDI_IO_D_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_D,
 	},
 };
@@ -2266,12 +2147,16 @@ static struct i915_power_well bxt_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "power well 1",
 		.domains = 0,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_1,
+		{
+			.hsw.has_fuses = true,
+		},
 	},
 	{
 		.name = "DC off",
@@ -2282,22 +2167,31 @@ static struct i915_power_well bxt_power_wells[] = {
 	{
 		.name = "power well 2",
 		.domains = BXT_DISPLAY_POWERWELL_2_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_2,
+		{
+			.hsw.irq_pipe_mask = BIT(PIPE_B) | BIT(PIPE_C),
+			.hsw.has_vga = true,
+			.hsw.has_fuses = true,
+		},
 	},
 	{
 		.name = "dpio-common-a",
 		.domains = BXT_DPIO_CMN_A_POWER_DOMAINS,
 		.ops = &bxt_dpio_cmn_power_well_ops,
 		.id = BXT_DPIO_CMN_A,
-		.data = DPIO_PHY1,
+		{
+			.bxt.phy = DPIO_PHY1,
+		},
 	},
 	{
 		.name = "dpio-common-bc",
 		.domains = BXT_DPIO_CMN_BC_POWER_DOMAINS,
 		.ops = &bxt_dpio_cmn_power_well_ops,
 		.id = BXT_DPIO_CMN_BC,
-		.data = DPIO_PHY0,
+		{
+			.bxt.phy = DPIO_PHY0,
+		},
 	},
 };
 
@@ -2307,13 +2201,17 @@ static struct i915_power_well glk_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "power well 1",
 		/* Handled by the DMC firmware */
 		.domains = 0,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_1,
+		{
+			.hsw.has_fuses = true,
+		},
 	},
 	{
 		.name = "DC off",
@@ -2324,64 +2222,75 @@ static struct i915_power_well glk_power_wells[] = {
 	{
 		.name = "power well 2",
 		.domains = GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_2,
+		{
+			.hsw.irq_pipe_mask = BIT(PIPE_B) | BIT(PIPE_C),
+			.hsw.has_vga = true,
+			.hsw.has_fuses = true,
+		},
 	},
 	{
 		.name = "dpio-common-a",
 		.domains = GLK_DPIO_CMN_A_POWER_DOMAINS,
 		.ops = &bxt_dpio_cmn_power_well_ops,
 		.id = BXT_DPIO_CMN_A,
-		.data = DPIO_PHY1,
+		{
+			.bxt.phy = DPIO_PHY1,
+		},
 	},
 	{
 		.name = "dpio-common-b",
 		.domains = GLK_DPIO_CMN_B_POWER_DOMAINS,
 		.ops = &bxt_dpio_cmn_power_well_ops,
 		.id = BXT_DPIO_CMN_BC,
-		.data = DPIO_PHY0,
+		{
+			.bxt.phy = DPIO_PHY0,
+		},
 	},
 	{
 		.name = "dpio-common-c",
 		.domains = GLK_DPIO_CMN_C_POWER_DOMAINS,
 		.ops = &bxt_dpio_cmn_power_well_ops,
 		.id = GLK_DPIO_CMN_C,
-		.data = DPIO_PHY2,
+		{
+			.bxt.phy = DPIO_PHY2,
+		},
 	},
 	{
 		.name = "AUX A",
 		.domains = GLK_DISPLAY_AUX_A_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = GLK_DISP_PW_AUX_A,
 	},
 	{
 		.name = "AUX B",
 		.domains = GLK_DISPLAY_AUX_B_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = GLK_DISP_PW_AUX_B,
 	},
 	{
 		.name = "AUX C",
 		.domains = GLK_DISPLAY_AUX_C_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = GLK_DISP_PW_AUX_C,
 	},
 	{
 		.name = "DDI A IO power well",
 		.domains = GLK_DISPLAY_DDI_IO_A_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = GLK_DISP_PW_DDI_A,
 	},
 	{
 		.name = "DDI B IO power well",
 		.domains = GLK_DISPLAY_DDI_IO_B_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_B,
 	},
 	{
 		.name = "DDI C IO power well",
 		.domains = GLK_DISPLAY_DDI_IO_C_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_C,
 	},
 };
@@ -2392,36 +2301,40 @@ static struct i915_power_well cnl_power_wells[] = {
 		.always_on = 1,
 		.domains = POWER_DOMAIN_MASK,
 		.ops = &i9xx_always_on_power_well_ops,
+		.id = I915_DISP_PW_ALWAYS_ON,
 	},
 	{
 		.name = "power well 1",
 		/* Handled by the DMC firmware */
 		.domains = 0,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_1,
+		{
+			.hsw.has_fuses = true,
+		},
 	},
 	{
 		.name = "AUX A",
 		.domains = CNL_DISPLAY_AUX_A_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = CNL_DISP_PW_AUX_A,
 	},
 	{
 		.name = "AUX B",
 		.domains = CNL_DISPLAY_AUX_B_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = CNL_DISP_PW_AUX_B,
 	},
 	{
 		.name = "AUX C",
 		.domains = CNL_DISPLAY_AUX_C_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = CNL_DISP_PW_AUX_C,
 	},
 	{
 		.name = "AUX D",
 		.domains = CNL_DISPLAY_AUX_D_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = CNL_DISP_PW_AUX_D,
 	},
 	{
@@ -2433,31 +2346,36 @@ static struct i915_power_well cnl_power_wells[] = {
 	{
 		.name = "power well 2",
 		.domains = CNL_DISPLAY_POWERWELL_2_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_2,
+		{
+			.hsw.irq_pipe_mask = BIT(PIPE_B) | BIT(PIPE_C),
+			.hsw.has_vga = true,
+			.hsw.has_fuses = true,
+		},
 	},
 	{
 		.name = "DDI A IO power well",
 		.domains = CNL_DISPLAY_DDI_A_IO_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = CNL_DISP_PW_DDI_A,
 	},
 	{
 		.name = "DDI B IO power well",
 		.domains = CNL_DISPLAY_DDI_B_IO_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_B,
 	},
 	{
 		.name = "DDI C IO power well",
 		.domains = CNL_DISPLAY_DDI_C_IO_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_C,
 	},
 	{
 		.name = "DDI D IO power well",
 		.domains = CNL_DISPLAY_DDI_D_IO_POWER_DOMAINS,
-		.ops = &skl_power_well_ops,
+		.ops = &hsw_power_well_ops,
 		.id = SKL_DISP_PW_DDI_D,
 	},
 };
@@ -2479,7 +2397,7 @@ static uint32_t get_allowed_dc_mask(const struct drm_i915_private *dev_priv,
 	int requested_dc;
 	int max_dc;
 
-	if (IS_GEN9_BC(dev_priv)) {
+	if (IS_GEN9_BC(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		max_dc = 2;
 		mask = 0;
 	} else if (IS_GEN9_LP(dev_priv)) {
@@ -2521,6 +2439,22 @@ static uint32_t get_allowed_dc_mask(const struct drm_i915_private *dev_priv,
 	return mask;
 }
 
+static void assert_power_well_ids_unique(struct drm_i915_private *dev_priv)
+{
+	struct i915_power_domains *power_domains = &dev_priv->power_domains;
+	u64 power_well_ids;
+	int i;
+
+	power_well_ids = 0;
+	for (i = 0; i < power_domains->power_well_count; i++) {
+		enum i915_power_well_id id = power_domains->power_wells[i].id;
+
+		WARN_ON(id >= sizeof(power_well_ids) * 8);
+		WARN_ON(power_well_ids & BIT_ULL(id));
+		power_well_ids |= BIT_ULL(id);
+	}
+}
+
 #define set_power_wells(power_domains, __power_wells) ({		\
 	(power_domains)->power_wells = (__power_wells);			\
 	(power_domains)->power_well_count = ARRAY_SIZE(__power_wells);	\
@@ -2572,6 +2506,8 @@ int intel_power_domains_init(struct drm_i915_private *dev_priv)
 		set_power_wells(power_domains, i9xx_always_on_power_well);
 	}
 
+	assert_power_well_ids_unique(dev_priv);
+
 	return 0;
 }
 
@@ -2694,13 +2630,18 @@ static void skl_display_core_uninit(struct drm_i915_private *dev_priv)
 
 	mutex_lock(&power_domains->lock);
 
-	well = lookup_power_well(dev_priv, SKL_DISP_PW_MISC_IO);
-	intel_power_well_disable(dev_priv, well);
-
+	/*
+	 * BSpec says to keep the MISC IO power well enabled here, only
+	 * remove our request for power well 1.
+	 * Note that even though the driver's request is removed, power well 1
+	 * may stay enabled after this due to DMC's own request on it.
+	 */
 	well = lookup_power_well(dev_priv, SKL_DISP_PW_1);
 	intel_power_well_disable(dev_priv, well);
 
 	mutex_unlock(&power_domains->lock);
+
+	usleep_range(10, 30);		/* 10 us delay per Bspec */
 }
 
 void bxt_display_core_init(struct drm_i915_private *dev_priv,
@@ -2751,13 +2692,19 @@ void bxt_display_core_uninit(struct drm_i915_private *dev_priv)
 
 	/* The spec doesn't call for removing the reset handshake flag */
 
-	/* Disable PG1 */
+	/*
+	 * Disable PW1 (PG1).
+	 * Note that even though the driver's request is removed, power well 1
+	 * may stay enabled after this due to DMC's own request on it.
+	 */
 	mutex_lock(&power_domains->lock);
 
 	well = lookup_power_well(dev_priv, SKL_DISP_PW_1);
 	intel_power_well_disable(dev_priv, well);
 
 	mutex_unlock(&power_domains->lock);
+
+	usleep_range(10, 30);		/* 10 us delay per Bspec */
 }
 
 #define CNL_PROCMON_IDX(val) \
@@ -2796,7 +2743,7 @@ static void cnl_display_core_init(struct drm_i915_private *dev_priv, bool resume
 
 	/* 2. Enable Comp */
 	val = I915_READ(CHICKEN_MISC_2);
-	val &= ~COMP_PWR_DOWN;
+	val &= ~CNL_COMP_PWR_DOWN;
 	I915_WRITE(CHICKEN_MISC_2, val);
 
 	val = I915_READ(CNL_PORT_COMP_DW3);
@@ -2821,7 +2768,10 @@ static void cnl_display_core_init(struct drm_i915_private *dev_priv, bool resume
 	val |= CL_POWER_DOWN_ENABLE;
 	I915_WRITE(CNL_PORT_CL1CM_DW5, val);
 
-	/* 4. Enable Power Well 1 (PG1) and Aux IO Power */
+	/*
+	 * 4. Enable Power Well 1 (PG1).
+	 *    The AUX IO power wells will be enabled on demand.
+	 */
 	mutex_lock(&power_domains->lock);
 	well = lookup_power_well(dev_priv, SKL_DISP_PW_1);
 	intel_power_well_enable(dev_priv, well);
@@ -2853,15 +2803,21 @@ static void cnl_display_core_uninit(struct drm_i915_private *dev_priv)
 	/* 3. Disable CD clock */
 	cnl_uninit_cdclk(dev_priv);
 
-	/* 4. Disable Power Well 1 (PG1) and Aux IO Power */
+	/*
+	 * 4. Disable Power Well 1 (PG1).
+	 *    The AUX IO power wells are toggled on demand, so they are already
+	 *    disabled at this point.
+	 */
 	mutex_lock(&power_domains->lock);
 	well = lookup_power_well(dev_priv, SKL_DISP_PW_1);
 	intel_power_well_disable(dev_priv, well);
 	mutex_unlock(&power_domains->lock);
 
+	usleep_range(10, 30);		/* 10 us delay per Bspec */
+
 	/* 5. Disable Comp */
 	val = I915_READ(CHICKEN_MISC_2);
-	val |= COMP_PWR_DOWN;
+	val |= CNL_COMP_PWR_DOWN;
 	I915_WRITE(CHICKEN_MISC_2, val);
 }
 
diff --git a/drivers/gpu/drm/i915/intel_sdvo.c b/drivers/gpu/drm/i915/intel_sdvo.c
index 3f8f30b..3dc38c2 100644
--- a/drivers/gpu/drm/i915/intel_sdvo.c
+++ b/drivers/gpu/drm/i915/intel_sdvo.c
@@ -451,23 +451,24 @@ static const char * const cmd_status_names[] = {
 	"Scaling not supported"
 };
 
-static bool intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
-				 const void *args, int args_len)
+static bool __intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
+				   const void *args, int args_len,
+				   bool unlocked)
 {
 	u8 *buf, status;
 	struct i2c_msg *msgs;
 	int i, ret = true;
 
-        /* Would be simpler to allocate both in one go ? */        
+	/* Would be simpler to allocate both in one go ? */
 	buf = kzalloc(args_len * 2 + 2, GFP_KERNEL);
 	if (!buf)
 		return false;
 
 	msgs = kcalloc(args_len + 3, sizeof(*msgs), GFP_KERNEL);
 	if (!msgs) {
-	        kfree(buf);
+		kfree(buf);
 		return false;
-        }
+	}
 
 	intel_sdvo_debug_write(intel_sdvo, cmd, args, args_len);
 
@@ -498,7 +499,10 @@ static bool intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
 	msgs[i+2].len = 1;
 	msgs[i+2].buf = &status;
 
-	ret = i2c_transfer(intel_sdvo->i2c, msgs, i+3);
+	if (unlocked)
+		ret = i2c_transfer(intel_sdvo->i2c, msgs, i+3);
+	else
+		ret = __i2c_transfer(intel_sdvo->i2c, msgs, i+3);
 	if (ret < 0) {
 		DRM_DEBUG_KMS("I2c transfer returned %d\n", ret);
 		ret = false;
@@ -516,6 +520,12 @@ static bool intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
 	return ret;
 }
 
+static bool intel_sdvo_write_cmd(struct intel_sdvo *intel_sdvo, u8 cmd,
+				 const void *args, int args_len)
+{
+	return __intel_sdvo_write_cmd(intel_sdvo, cmd, args, args_len, true);
+}
+
 static bool intel_sdvo_read_response(struct intel_sdvo *intel_sdvo,
 				     void *response, int response_len)
 {
@@ -602,13 +612,13 @@ static int intel_sdvo_get_pixel_multiplier(const struct drm_display_mode *adjust
 		return 4;
 }
 
-static bool intel_sdvo_set_control_bus_switch(struct intel_sdvo *intel_sdvo,
-					      u8 ddc_bus)
+static bool __intel_sdvo_set_control_bus_switch(struct intel_sdvo *intel_sdvo,
+						u8 ddc_bus)
 {
 	/* This must be the immediately preceding write before the i2c xfer */
-	return intel_sdvo_write_cmd(intel_sdvo,
-				    SDVO_CMD_SET_CONTROL_BUS_SWITCH,
-				    &ddc_bus, 1);
+	return __intel_sdvo_write_cmd(intel_sdvo,
+				      SDVO_CMD_SET_CONTROL_BUS_SWITCH,
+				      &ddc_bus, 1, false);
 }
 
 static bool intel_sdvo_set_value(struct intel_sdvo *intel_sdvo, u8 cmd, const void *data, int len)
@@ -996,7 +1006,8 @@ static bool intel_sdvo_set_avi_infoframe(struct intel_sdvo *intel_sdvo,
 	ssize_t len;
 
 	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi,
-						       &pipe_config->base.adjusted_mode);
+						       &pipe_config->base.adjusted_mode,
+						       false);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return false;
@@ -1343,13 +1354,15 @@ static void intel_sdvo_pre_enable(struct intel_encoder *intel_encoder,
 		sdvox |= (9 << 19) | SDVO_BORDER_ENABLE;
 	}
 
-	if (INTEL_PCH_TYPE(dev_priv) >= PCH_CPT)
+	if (HAS_PCH_CPT(dev_priv))
 		sdvox |= SDVO_PIPE_SEL_CPT(crtc->pipe);
 	else
 		sdvox |= SDVO_PIPE_SEL(crtc->pipe);
 
-	if (crtc_state->has_audio)
+	if (crtc_state->has_audio) {
+		WARN_ON_ONCE(INTEL_GEN(dev_priv) < 4);
 		sdvox |= SDVO_AUDIO_ENABLE;
+	}
 
 	if (INTEL_GEN(dev_priv) >= 4) {
 		/* done in crtc_mode_set as the dpll_md reg must be written early */
@@ -1479,6 +1492,9 @@ static void intel_sdvo_get_config(struct intel_encoder *encoder,
 	if (sdvox & HDMI_COLOR_RANGE_16_235)
 		pipe_config->limited_color_range = true;
 
+	if (sdvox & SDVO_AUDIO_ENABLE)
+		pipe_config->has_audio = true;
+
 	if (intel_sdvo_get_value(intel_sdvo, SDVO_CMD_GET_ENCODE,
 				 &val, 1)) {
 		if (val == SDVO_ENCODE_HDMI)
@@ -2192,10 +2208,8 @@ intel_sdvo_connector_duplicate_state(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs intel_sdvo_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = intel_sdvo_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.atomic_get_property = intel_sdvo_connector_atomic_get_property,
 	.atomic_set_property = intel_sdvo_connector_atomic_set_property,
 	.late_register = intel_sdvo_connector_register,
@@ -2454,6 +2468,7 @@ static bool
 intel_sdvo_dvi_init(struct intel_sdvo *intel_sdvo, int device)
 {
 	struct drm_encoder *encoder = &intel_sdvo->base.base;
+	struct drm_i915_private *dev_priv = to_i915(encoder->dev);
 	struct drm_connector *connector;
 	struct intel_encoder *intel_encoder = to_intel_encoder(encoder);
 	struct intel_connector *intel_connector;
@@ -2489,7 +2504,9 @@ intel_sdvo_dvi_init(struct intel_sdvo *intel_sdvo, int device)
 	encoder->encoder_type = DRM_MODE_ENCODER_TMDS;
 	connector->connector_type = DRM_MODE_CONNECTOR_DVID;
 
-	if (intel_sdvo_is_hdmi_connector(intel_sdvo, device)) {
+	/* gen3 doesn't do the hdmi bits in the SDVO register */
+	if (INTEL_GEN(dev_priv) >= 4 &&
+	    intel_sdvo_is_hdmi_connector(intel_sdvo, device)) {
 		connector->connector_type = DRM_MODE_CONNECTOR_HDMIA;
 		intel_sdvo->is_hdmi = true;
 	}
@@ -2925,7 +2942,7 @@ static int intel_sdvo_ddc_proxy_xfer(struct i2c_adapter *adapter,
 {
 	struct intel_sdvo *sdvo = adapter->algo_data;
 
-	if (!intel_sdvo_set_control_bus_switch(sdvo, sdvo->ddc_bus))
+	if (!__intel_sdvo_set_control_bus_switch(sdvo, sdvo->ddc_bus))
 		return -EIO;
 
 	return sdvo->i2c->algo->master_xfer(sdvo->i2c, msgs, num);
@@ -2942,6 +2959,33 @@ static const struct i2c_algorithm intel_sdvo_ddc_proxy = {
 	.functionality	= intel_sdvo_ddc_proxy_func
 };
 
+static void proxy_lock_bus(struct i2c_adapter *adapter,
+			   unsigned int flags)
+{
+	struct intel_sdvo *sdvo = adapter->algo_data;
+	sdvo->i2c->lock_ops->lock_bus(sdvo->i2c, flags);
+}
+
+static int proxy_trylock_bus(struct i2c_adapter *adapter,
+			     unsigned int flags)
+{
+	struct intel_sdvo *sdvo = adapter->algo_data;
+	return sdvo->i2c->lock_ops->trylock_bus(sdvo->i2c, flags);
+}
+
+static void proxy_unlock_bus(struct i2c_adapter *adapter,
+			     unsigned int flags)
+{
+	struct intel_sdvo *sdvo = adapter->algo_data;
+	sdvo->i2c->lock_ops->unlock_bus(sdvo->i2c, flags);
+}
+
+static const struct i2c_lock_operations proxy_lock_ops = {
+	.lock_bus =    proxy_lock_bus,
+	.trylock_bus = proxy_trylock_bus,
+	.unlock_bus =  proxy_unlock_bus,
+};
+
 static bool
 intel_sdvo_init_ddc_proxy(struct intel_sdvo *sdvo,
 			  struct drm_i915_private *dev_priv)
@@ -2954,6 +2998,7 @@ intel_sdvo_init_ddc_proxy(struct intel_sdvo *sdvo,
 	sdvo->ddc.dev.parent = &pdev->dev;
 	sdvo->ddc.algo_data = sdvo;
 	sdvo->ddc.algo = &intel_sdvo_ddc_proxy;
+	sdvo->ddc.lock_ops = &proxy_lock_ops;
 
 	return i2c_add_adapter(&sdvo->ddc) == 0;
 }
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 0c650c2..524933b 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -30,6 +30,7 @@
  * support.
  */
 #include <drm/drmP.h>
+#include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc.h>
 #include <drm/drm_fourcc.h>
 #include <drm/drm_rect.h>
@@ -176,7 +177,7 @@ void intel_pipe_update_start(struct intel_crtc *crtc)
  * re-enables interrupts and verifies the update was actually completed
  * before a vblank using the value of @start_vbl_count.
  */
-void intel_pipe_update_end(struct intel_crtc *crtc, struct intel_flip_work *work)
+void intel_pipe_update_end(struct intel_crtc *crtc)
 {
 	enum pipe pipe = crtc->pipe;
 	int scanline_end = intel_get_crtc_scanline(crtc);
@@ -184,12 +185,6 @@ void intel_pipe_update_end(struct intel_crtc *crtc, struct intel_flip_work *work
 	ktime_t end_vbl_time = ktime_get();
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
 
-	if (work) {
-		work->flip_queued_vblank = end_vbl_count;
-		smp_mb__before_atomic();
-		atomic_set(&work->pending, 1);
-	}
-
 	trace_i915_pipe_update_end(crtc, end_vbl_count, scanline_end);
 
 	/* We're still in the vblank-evade critical section, this can't race.
@@ -244,6 +239,7 @@ skl_update_plane(struct intel_plane *plane,
 	u32 surf_addr = plane_state->main.offset;
 	unsigned int rotation = plane_state->base.rotation;
 	u32 stride = skl_plane_stride(fb, 0, rotation);
+	u32 aux_stride = skl_plane_stride(fb, 1, rotation);
 	int crtc_x = plane_state->base.dst.x1;
 	int crtc_y = plane_state->base.dst.y1;
 	uint32_t crtc_w = drm_rect_width(&plane_state->base.dst);
@@ -262,7 +258,7 @@ skl_update_plane(struct intel_plane *plane,
 
 	spin_lock_irqsave(&dev_priv->uncore.lock, irqflags);
 
-	if (IS_GEMINILAKE(dev_priv)) {
+	if (IS_GEMINILAKE(dev_priv) || IS_CANNONLAKE(dev_priv)) {
 		I915_WRITE_FW(PLANE_COLOR_CTL(pipe, plane_id),
 			      PLANE_COLOR_PIPE_GAMMA_ENABLE |
 			      PLANE_COLOR_PIPE_CSC_ENABLE |
@@ -278,6 +274,10 @@ skl_update_plane(struct intel_plane *plane,
 	I915_WRITE_FW(PLANE_OFFSET(pipe, plane_id), (y << 16) | x);
 	I915_WRITE_FW(PLANE_STRIDE(pipe, plane_id), stride);
 	I915_WRITE_FW(PLANE_SIZE(pipe, plane_id), (src_h << 16) | src_w);
+	I915_WRITE_FW(PLANE_AUX_DIST(pipe, plane_id),
+		      (plane_state->aux.offset - surf_addr) | aux_stride);
+	I915_WRITE_FW(PLANE_AUX_OFFSET(pipe, plane_id),
+		      (plane_state->aux.y << 16) | plane_state->aux.x);
 
 	/* program plane scaler */
 	if (plane_state->scaler_id >= 0) {
@@ -1038,6 +1038,12 @@ static const uint32_t g4x_plane_formats[] = {
 	DRM_FORMAT_VYUY,
 };
 
+static const uint64_t i9xx_plane_format_modifiers[] = {
+	I915_FORMAT_MOD_X_TILED,
+	DRM_FORMAT_MOD_LINEAR,
+	DRM_FORMAT_MOD_INVALID
+};
+
 static const uint32_t snb_plane_formats[] = {
 	DRM_FORMAT_XBGR8888,
 	DRM_FORMAT_XRGB8888,
@@ -1073,6 +1079,122 @@ static uint32_t skl_plane_formats[] = {
 	DRM_FORMAT_VYUY,
 };
 
+static const uint64_t skl_plane_format_modifiers[] = {
+	I915_FORMAT_MOD_X_TILED,
+	DRM_FORMAT_MOD_LINEAR,
+	DRM_FORMAT_MOD_INVALID
+};
+
+static bool g4x_sprite_plane_format_mod_supported(struct drm_plane *plane,
+						  uint32_t format,
+						  uint64_t modifier)
+{
+	switch (format) {
+	case DRM_FORMAT_XBGR8888:
+	case DRM_FORMAT_XRGB8888:
+	case DRM_FORMAT_YUYV:
+	case DRM_FORMAT_YVYU:
+	case DRM_FORMAT_UYVY:
+	case DRM_FORMAT_VYUY:
+		if (modifier == DRM_FORMAT_MOD_LINEAR ||
+		    modifier == I915_FORMAT_MOD_X_TILED)
+			return true;
+		/* fall through */
+	default:
+		return false;
+	}
+}
+
+static bool vlv_sprite_plane_format_mod_supported(struct drm_plane *plane,
+						  uint32_t format,
+						  uint64_t modifier)
+{
+	switch (format) {
+	case DRM_FORMAT_YUYV:
+	case DRM_FORMAT_YVYU:
+	case DRM_FORMAT_UYVY:
+	case DRM_FORMAT_VYUY:
+	case DRM_FORMAT_RGB565:
+	case DRM_FORMAT_XRGB8888:
+	case DRM_FORMAT_ARGB8888:
+	case DRM_FORMAT_XBGR2101010:
+	case DRM_FORMAT_ABGR2101010:
+	case DRM_FORMAT_XBGR8888:
+	case DRM_FORMAT_ABGR8888:
+		if (modifier == DRM_FORMAT_MOD_LINEAR ||
+		    modifier == I915_FORMAT_MOD_X_TILED)
+			return true;
+		/* fall through */
+	default:
+		return false;
+	}
+}
+
+static bool skl_sprite_plane_format_mod_supported(struct drm_plane *plane,
+						  uint32_t format,
+						  uint64_t modifier)
+{
+	/* This is the same as primary plane since SKL has universal planes */
+	switch (format) {
+	case DRM_FORMAT_XRGB8888:
+	case DRM_FORMAT_XBGR8888:
+	case DRM_FORMAT_ARGB8888:
+	case DRM_FORMAT_ABGR8888:
+	case DRM_FORMAT_RGB565:
+	case DRM_FORMAT_XRGB2101010:
+	case DRM_FORMAT_XBGR2101010:
+	case DRM_FORMAT_YUYV:
+	case DRM_FORMAT_YVYU:
+	case DRM_FORMAT_UYVY:
+	case DRM_FORMAT_VYUY:
+		if (modifier == I915_FORMAT_MOD_Yf_TILED)
+			return true;
+		/* fall through */
+	case DRM_FORMAT_C8:
+		if (modifier == DRM_FORMAT_MOD_LINEAR ||
+		    modifier == I915_FORMAT_MOD_X_TILED ||
+		    modifier == I915_FORMAT_MOD_Y_TILED)
+			return true;
+		/* fall through */
+	default:
+		return false;
+	}
+}
+
+static bool intel_sprite_plane_format_mod_supported(struct drm_plane *plane,
+						    uint32_t format,
+						    uint64_t modifier)
+{
+	struct drm_i915_private *dev_priv = to_i915(plane->dev);
+
+	if (WARN_ON(modifier == DRM_FORMAT_MOD_INVALID))
+		return false;
+
+	if ((modifier >> 56) != DRM_FORMAT_MOD_VENDOR_INTEL &&
+	    modifier != DRM_FORMAT_MOD_LINEAR)
+		return false;
+
+	if (INTEL_GEN(dev_priv) >= 9)
+		return skl_sprite_plane_format_mod_supported(plane, format, modifier);
+	else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv))
+		return vlv_sprite_plane_format_mod_supported(plane, format, modifier);
+	else
+		return g4x_sprite_plane_format_mod_supported(plane, format, modifier);
+
+	unreachable();
+}
+
+static const struct drm_plane_funcs intel_sprite_plane_funcs = {
+	.update_plane = drm_atomic_helper_update_plane,
+	.disable_plane = drm_atomic_helper_disable_plane,
+	.destroy = intel_plane_destroy,
+	.atomic_get_property = intel_plane_atomic_get_property,
+	.atomic_set_property = intel_plane_atomic_set_property,
+	.atomic_duplicate_state = intel_plane_duplicate_state,
+	.atomic_destroy_state = intel_plane_destroy_state,
+	.format_mod_supported = intel_sprite_plane_format_mod_supported,
+};
+
 struct intel_plane *
 intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 			  enum pipe pipe, int plane)
@@ -1081,6 +1203,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 	struct intel_plane_state *state = NULL;
 	unsigned long possible_crtcs;
 	const uint32_t *plane_formats;
+	const uint64_t *modifiers;
 	unsigned int supported_rotations;
 	int num_plane_formats;
 	int ret;
@@ -1098,7 +1221,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 	}
 	intel_plane->base.state = &state->base;
 
-	if (INTEL_GEN(dev_priv) >= 9) {
+	if (INTEL_GEN(dev_priv) >= 10) {
 		intel_plane->can_scale = true;
 		state->scaler_id = -1;
 
@@ -1107,6 +1230,17 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 		plane_formats = skl_plane_formats;
 		num_plane_formats = ARRAY_SIZE(skl_plane_formats);
+		modifiers = skl_plane_format_modifiers;
+	} else if (INTEL_GEN(dev_priv) >= 9) {
+		intel_plane->can_scale = true;
+		state->scaler_id = -1;
+
+		intel_plane->update_plane = skl_update_plane;
+		intel_plane->disable_plane = skl_disable_plane;
+
+		plane_formats = skl_plane_formats;
+		num_plane_formats = ARRAY_SIZE(skl_plane_formats);
+		modifiers = skl_plane_format_modifiers;
 	} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
 		intel_plane->can_scale = false;
 		intel_plane->max_downscale = 1;
@@ -1116,6 +1250,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 		plane_formats = vlv_plane_formats;
 		num_plane_formats = ARRAY_SIZE(vlv_plane_formats);
+		modifiers = i9xx_plane_format_modifiers;
 	} else if (INTEL_GEN(dev_priv) >= 7) {
 		if (IS_IVYBRIDGE(dev_priv)) {
 			intel_plane->can_scale = true;
@@ -1130,6 +1265,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 		plane_formats = snb_plane_formats;
 		num_plane_formats = ARRAY_SIZE(snb_plane_formats);
+		modifiers = i9xx_plane_format_modifiers;
 	} else {
 		intel_plane->can_scale = true;
 		intel_plane->max_downscale = 16;
@@ -1137,6 +1273,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 		intel_plane->update_plane = g4x_update_plane;
 		intel_plane->disable_plane = g4x_disable_plane;
 
+		modifiers = i9xx_plane_format_modifiers;
 		if (IS_GEN6(dev_priv)) {
 			plane_formats = snb_plane_formats;
 			num_plane_formats = ARRAY_SIZE(snb_plane_formats);
@@ -1169,14 +1306,16 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 	if (INTEL_GEN(dev_priv) >= 9)
 		ret = drm_universal_plane_init(&dev_priv->drm, &intel_plane->base,
-					       possible_crtcs, &intel_plane_funcs,
+					       possible_crtcs, &intel_sprite_plane_funcs,
 					       plane_formats, num_plane_formats,
+					       modifiers,
 					       DRM_PLANE_TYPE_OVERLAY,
 					       "plane %d%c", plane + 2, pipe_name(pipe));
 	else
 		ret = drm_universal_plane_init(&dev_priv->drm, &intel_plane->base,
-					       possible_crtcs, &intel_plane_funcs,
+					       possible_crtcs, &intel_sprite_plane_funcs,
 					       plane_formats, num_plane_formats,
+					       modifiers,
 					       DRM_PLANE_TYPE_OVERLAY,
 					       "sprite %c", sprite_name(pipe, plane));
 	if (ret)
diff --git a/drivers/gpu/drm/i915/intel_tv.c b/drivers/gpu/drm/i915/intel_tv.c
index 784df02..906893c 100644
--- a/drivers/gpu/drm/i915/intel_tv.c
+++ b/drivers/gpu/drm/i915/intel_tv.c
@@ -1407,11 +1407,9 @@ intel_tv_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs intel_tv_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.late_register = intel_connector_register,
 	.early_unregister = intel_connector_unregister,
 	.destroy = intel_tv_destroy,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
 	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
diff --git a/drivers/gpu/drm/i915/intel_uc.c b/drivers/gpu/drm/i915/intel_uc.c
index 27e072c..0178ba4 100644
--- a/drivers/gpu/drm/i915/intel_uc.c
+++ b/drivers/gpu/drm/i915/intel_uc.c
@@ -94,7 +94,7 @@ void intel_uc_sanitize_options(struct drm_i915_private *dev_priv)
 		i915.enable_guc_submission = HAS_GUC_SCHED(dev_priv);
 }
 
-static void guc_write_irq_trigger(struct intel_guc *guc)
+static void gen8_guc_raise_irq(struct intel_guc *guc)
 {
 	struct drm_i915_private *dev_priv = guc_to_i915(guc);
 
@@ -109,7 +109,7 @@ void intel_uc_init_early(struct drm_i915_private *dev_priv)
 
 	mutex_init(&guc->send_mutex);
 	guc->send = intel_guc_send_nop;
-	guc->notify = guc_write_irq_trigger;
+	guc->notify = gen8_guc_raise_irq;
 }
 
 static void fetch_uc_fw(struct drm_i915_private *dev_priv,
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c
index 9882724b..1d7b879 100644
--- a/drivers/gpu/drm/i915/intel_uncore.c
+++ b/drivers/gpu/drm/i915/intel_uncore.c
@@ -643,7 +643,7 @@ find_fw_domain(struct drm_i915_private *dev_priv, u32 offset)
 	{ .start = (s), .end = (e), .domains = (d) }
 
 #define HAS_FWTABLE(dev_priv) \
-	(IS_GEN9(dev_priv) || \
+	(INTEL_GEN(dev_priv) >= 9 || \
 	 IS_CHERRYVIEW(dev_priv) || \
 	 IS_VALLEYVIEW(dev_priv))
 
@@ -1072,7 +1072,7 @@ static void intel_uncore_fw_domains_init(struct drm_i915_private *dev_priv)
 		dev_priv->uncore.fw_clear = _MASKED_BIT_DISABLE(FORCEWAKE_KERNEL);
 	}
 
-	if (IS_GEN9(dev_priv)) {
+	if (INTEL_GEN(dev_priv) >= 9) {
 		dev_priv->uncore.funcs.force_wake_get = fw_domains_get;
 		dev_priv->uncore.funcs.force_wake_put = fw_domains_put;
 		fw_domain_init(dev_priv, FW_DOMAIN_ID_RENDER,
@@ -1497,7 +1497,6 @@ static int gen6_reset_engines(struct drm_i915_private *dev_priv,
 		[VECS] = GEN6_GRDOM_VECS,
 	};
 	u32 hw_mask;
-	int ret;
 
 	if (engine_mask == ALL_ENGINES) {
 		hw_mask = GEN6_GRDOM_FULL;
@@ -1509,11 +1508,7 @@ static int gen6_reset_engines(struct drm_i915_private *dev_priv,
 			hw_mask |= hw_engine_mask[engine->id];
 	}
 
-	ret = gen6_hw_domain_reset(dev_priv, hw_mask);
-
-	intel_uncore_forcewake_reset(dev_priv, true);
-
-	return ret;
+	return gen6_hw_domain_reset(dev_priv, hw_mask);
 }
 
 /**
@@ -1719,6 +1714,17 @@ bool intel_has_gpu_reset(struct drm_i915_private *dev_priv)
 	return intel_get_gpu_reset(dev_priv) != NULL;
 }
 
+/*
+ * When GuC submission is enabled, GuC manages ELSP and can initiate the
+ * engine reset too. For now, fall back to full GPU reset if it is enabled.
+ */
+bool intel_has_reset_engine(struct drm_i915_private *dev_priv)
+{
+	return (dev_priv->info.has_reset_engine &&
+		!dev_priv->guc.execbuf_client &&
+		i915.reset >= 2);
+}
+
 int intel_guc_reset(struct drm_i915_private *dev_priv)
 {
 	int ret;
diff --git a/drivers/gpu/drm/i915/selftests/huge_gem_object.c b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
index caf76af..c5c7e8e 100644
--- a/drivers/gpu/drm/i915/selftests/huge_gem_object.c
+++ b/drivers/gpu/drm/i915/selftests/huge_gem_object.c
@@ -111,6 +111,7 @@ huge_gem_object(struct drm_i915_private *i915,
 		dma_addr_t dma_size)
 {
 	struct drm_i915_gem_object *obj;
+	unsigned int cache_level;
 
 	GEM_BUG_ON(!phys_size || phys_size > dma_size);
 	GEM_BUG_ON(!IS_ALIGNED(phys_size, PAGE_SIZE));
@@ -128,9 +129,8 @@ huge_gem_object(struct drm_i915_private *i915,
 
 	obj->base.read_domains = I915_GEM_DOMAIN_CPU;
 	obj->base.write_domain = I915_GEM_DOMAIN_CPU;
-	obj->cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
-	obj->cache_coherent = i915_gem_object_is_coherent(obj);
-	obj->cache_dirty = !obj->cache_coherent;
+	cache_level = HAS_LLC(i915) ? I915_CACHE_LLC : I915_CACHE_NONE;
+	i915_gem_object_set_cache_coherency(obj, cache_level);
 	obj->scratch = phys_size;
 
 	return obj;
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
index 95d4aeb..35d778d 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_coherency.c
@@ -241,7 +241,7 @@ static bool always_valid(struct drm_i915_private *i915)
 
 static bool needs_mi_store_dword(struct drm_i915_private *i915)
 {
-	return igt_can_mi_store_dword_imm(i915);
+	return intel_engine_can_store_dword(i915->engine[RCS]);
 }
 
 static const struct igt_coherency_mode {
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_context.c b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
index 12b85b3..fb0a58f 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_context.c
@@ -38,8 +38,6 @@ gpu_fill_dw(struct i915_vma *vma, u64 offset, unsigned long count, u32 value)
 	u32 *cmd;
 	int err;
 
-	GEM_BUG_ON(!igt_can_mi_store_dword_imm(vma->vm->i915));
-
 	size = (4 * count + 1) * sizeof(u32);
 	size = round_up(size, PAGE_SIZE);
 	obj = i915_gem_object_create_internal(vma->vm->i915, size);
@@ -123,6 +121,7 @@ static int gpu_fill(struct drm_i915_gem_object *obj,
 	int err;
 
 	GEM_BUG_ON(obj->base.size > vm->total);
+	GEM_BUG_ON(!intel_engine_can_store_dword(engine));
 
 	vma = i915_vma_instance(obj, vm, NULL);
 	if (IS_ERR(vma))
@@ -359,6 +358,9 @@ static int igt_ctx_exec(void *arg)
 		}
 
 		for_each_engine(engine, i915, id) {
+			if (!intel_engine_can_store_dword(engine))
+				continue;
+
 			if (!obj) {
 				obj = create_test_object(ctx, file, &objects);
 				if (IS_ERR(obj)) {
diff --git a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
index 50710e3..6b132ca 100644
--- a/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/i915_gem_gtt.c
@@ -197,6 +197,9 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 {
 	I915_RND_STATE(seed_prng);
 	unsigned int size;
+	struct i915_vma mock_vma;
+
+	memset(&mock_vma, 0, sizeof(struct i915_vma));
 
 	/* Keep creating larger objects until one cannot fit into the hole */
 	for (size = 12; (hole_end - hole_start) >> size; size++) {
@@ -255,8 +258,11 @@ static int lowlevel_hole(struct drm_i915_private *i915,
 			    vm->allocate_va_range(vm, addr, BIT_ULL(size)))
 				break;
 
-			vm->insert_entries(vm, obj->mm.pages, addr,
-					   I915_CACHE_NONE, 0);
+			mock_vma.pages = obj->mm.pages;
+			mock_vma.node.size = BIT_ULL(size);
+			mock_vma.node.start = addr;
+
+			vm->insert_entries(vm, &mock_vma, I915_CACHE_NONE, 0);
 		}
 		count = n;
 
diff --git a/drivers/gpu/drm/i915/selftests/i915_vma.c b/drivers/gpu/drm/i915/selftests/i915_vma.c
index fb9072d..2e86ec1 100644
--- a/drivers/gpu/drm/i915/selftests/i915_vma.c
+++ b/drivers/gpu/drm/i915/selftests/i915_vma.c
@@ -186,16 +186,20 @@ static int igt_vma_create(void *arg)
 				goto end;
 		}
 
-		list_for_each_entry_safe(ctx, cn, &contexts, link)
+		list_for_each_entry_safe(ctx, cn, &contexts, link) {
+			list_del_init(&ctx->link);
 			mock_context_close(ctx);
+		}
 	}
 
 end:
 	/* Final pass to lookup all created contexts */
 	err = create_vmas(i915, &objects, &contexts);
 out:
-	list_for_each_entry_safe(ctx, cn, &contexts, link)
+	list_for_each_entry_safe(ctx, cn, &contexts, link) {
+		list_del_init(&ctx->link);
 		mock_context_close(ctx);
+	}
 
 	list_for_each_entry_safe(obj, on, &objects, st_link)
 		i915_gem_object_put(obj);
diff --git a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
index aa31d6c..02e52a1 100644
--- a/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
+++ b/drivers/gpu/drm/i915/selftests/intel_hangcheck.c
@@ -22,8 +22,13 @@
  *
  */
 
+#include <linux/kthread.h>
+
 #include "../i915_selftest.h"
 
+#include "mock_context.h"
+#include "mock_drm.h"
+
 struct hang {
 	struct drm_i915_private *i915;
 	struct drm_i915_gem_object *hws;
@@ -248,9 +253,6 @@ static int igt_hang_sanitycheck(void *arg)
 
 	/* Basic check that we can execute our hanging batch */
 
-	if (!igt_can_mi_store_dword_imm(i915))
-		return 0;
-
 	mutex_lock(&i915->drm.struct_mutex);
 	err = hang_init(&h, i915);
 	if (err)
@@ -259,6 +261,9 @@ static int igt_hang_sanitycheck(void *arg)
 	for_each_engine(engine, i915, id) {
 		long timeout;
 
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
 		rq = hang_create_request(&h, engine, i915->kernel_context);
 		if (IS_ERR(rq)) {
 			err = PTR_ERR(rq);
@@ -292,6 +297,37 @@ static int igt_hang_sanitycheck(void *arg)
 	return err;
 }
 
+static void global_reset_lock(struct drm_i915_private *i915)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	while (test_and_set_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags))
+		wait_event(i915->gpu_error.reset_queue,
+			   !test_bit(I915_RESET_BACKOFF,
+				     &i915->gpu_error.flags));
+
+	for_each_engine(engine, i915, id) {
+		while (test_and_set_bit(I915_RESET_ENGINE + id,
+					&i915->gpu_error.flags))
+			wait_on_bit(&i915->gpu_error.flags,
+				    I915_RESET_ENGINE + id,
+				    TASK_UNINTERRUPTIBLE);
+	}
+}
+
+static void global_reset_unlock(struct drm_i915_private *i915)
+{
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+
+	for_each_engine(engine, i915, id)
+		clear_bit(I915_RESET_ENGINE + id, &i915->gpu_error.flags);
+
+	clear_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
+	wake_up_all(&i915->gpu_error.reset_queue);
+}
+
 static int igt_global_reset(void *arg)
 {
 	struct drm_i915_private *i915 = arg;
@@ -300,13 +336,13 @@ static int igt_global_reset(void *arg)
 
 	/* Check that we can issue a global GPU reset */
 
-	set_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
+	global_reset_lock(i915);
 	set_bit(I915_RESET_HANDOFF, &i915->gpu_error.flags);
 
 	mutex_lock(&i915->drm.struct_mutex);
 	reset_count = i915_reset_count(&i915->gpu_error);
 
-	i915_reset(i915);
+	i915_reset(i915, I915_RESET_QUIET);
 
 	if (i915_reset_count(&i915->gpu_error) == reset_count) {
 		pr_err("No GPU reset recorded!\n");
@@ -315,7 +351,214 @@ static int igt_global_reset(void *arg)
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	GEM_BUG_ON(test_bit(I915_RESET_HANDOFF, &i915->gpu_error.flags));
-	clear_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
+	global_reset_unlock(i915);
+
+	if (i915_terminally_wedged(&i915->gpu_error))
+		err = -EIO;
+
+	return err;
+}
+
+static int igt_reset_engine(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine;
+	enum intel_engine_id id;
+	unsigned int reset_count, reset_engine_count;
+	int err = 0;
+
+	/* Check that we can issue a global GPU and engine reset */
+
+	if (!intel_has_reset_engine(i915))
+		return 0;
+
+	for_each_engine(engine, i915, id) {
+		set_bit(I915_RESET_ENGINE + engine->id, &i915->gpu_error.flags);
+		reset_count = i915_reset_count(&i915->gpu_error);
+		reset_engine_count = i915_reset_engine_count(&i915->gpu_error,
+							     engine);
+
+		err = i915_reset_engine(engine, I915_RESET_QUIET);
+		if (err) {
+			pr_err("i915_reset_engine failed\n");
+			break;
+		}
+
+		if (i915_reset_count(&i915->gpu_error) != reset_count) {
+			pr_err("Full GPU reset recorded! (engine reset expected)\n");
+			err = -EINVAL;
+			break;
+		}
+
+		if (i915_reset_engine_count(&i915->gpu_error, engine) ==
+		    reset_engine_count) {
+			pr_err("No %s engine reset recorded!\n", engine->name);
+			err = -EINVAL;
+			break;
+		}
+
+		clear_bit(I915_RESET_ENGINE + engine->id,
+			  &i915->gpu_error.flags);
+	}
+
+	if (i915_terminally_wedged(&i915->gpu_error))
+		err = -EIO;
+
+	return err;
+}
+
+static int active_engine(void *data)
+{
+	struct intel_engine_cs *engine = data;
+	struct drm_i915_gem_request *rq[2] = {};
+	struct i915_gem_context *ctx[2];
+	struct drm_file *file;
+	unsigned long count = 0;
+	int err = 0;
+
+	file = mock_file(engine->i915);
+	if (IS_ERR(file))
+		return PTR_ERR(file);
+
+	mutex_lock(&engine->i915->drm.struct_mutex);
+	ctx[0] = live_context(engine->i915, file);
+	mutex_unlock(&engine->i915->drm.struct_mutex);
+	if (IS_ERR(ctx[0])) {
+		err = PTR_ERR(ctx[0]);
+		goto err_file;
+	}
+
+	mutex_lock(&engine->i915->drm.struct_mutex);
+	ctx[1] = live_context(engine->i915, file);
+	mutex_unlock(&engine->i915->drm.struct_mutex);
+	if (IS_ERR(ctx[1])) {
+		err = PTR_ERR(ctx[1]);
+		i915_gem_context_put(ctx[0]);
+		goto err_file;
+	}
+
+	while (!kthread_should_stop()) {
+		unsigned int idx = count++ & 1;
+		struct drm_i915_gem_request *old = rq[idx];
+		struct drm_i915_gem_request *new;
+
+		mutex_lock(&engine->i915->drm.struct_mutex);
+		new = i915_gem_request_alloc(engine, ctx[idx]);
+		if (IS_ERR(new)) {
+			mutex_unlock(&engine->i915->drm.struct_mutex);
+			err = PTR_ERR(new);
+			break;
+		}
+
+		rq[idx] = i915_gem_request_get(new);
+		i915_add_request(new);
+		mutex_unlock(&engine->i915->drm.struct_mutex);
+
+		if (old) {
+			i915_wait_request(old, 0, MAX_SCHEDULE_TIMEOUT);
+			i915_gem_request_put(old);
+		}
+	}
+
+	for (count = 0; count < ARRAY_SIZE(rq); count++)
+		i915_gem_request_put(rq[count]);
+
+err_file:
+	mock_file_free(engine->i915, file);
+	return err;
+}
+
+static int igt_reset_active_engines(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine, *active;
+	enum intel_engine_id id, tmp;
+	int err = 0;
+
+	/* Check that issuing a reset on one engine does not interfere
+	 * with any other engine.
+	 */
+
+	if (!intel_has_reset_engine(i915))
+		return 0;
+
+	for_each_engine(engine, i915, id) {
+		struct task_struct *threads[I915_NUM_ENGINES];
+		unsigned long resets[I915_NUM_ENGINES];
+		unsigned long global = i915_reset_count(&i915->gpu_error);
+		IGT_TIMEOUT(end_time);
+
+		memset(threads, 0, sizeof(threads));
+		for_each_engine(active, i915, tmp) {
+			struct task_struct *tsk;
+
+			if (active == engine)
+				continue;
+
+			resets[tmp] = i915_reset_engine_count(&i915->gpu_error,
+							      active);
+
+			tsk = kthread_run(active_engine, active,
+					  "igt/%s", active->name);
+			if (IS_ERR(tsk)) {
+				err = PTR_ERR(tsk);
+				goto unwind;
+			}
+
+			threads[tmp] = tsk;
+			get_task_struct(tsk);
+		}
+
+		set_bit(I915_RESET_ENGINE + engine->id, &i915->gpu_error.flags);
+		do {
+			err = i915_reset_engine(engine, I915_RESET_QUIET);
+			if (err) {
+				pr_err("i915_reset_engine(%s) failed, err=%d\n",
+				       engine->name, err);
+				break;
+			}
+		} while (time_before(jiffies, end_time));
+		clear_bit(I915_RESET_ENGINE + engine->id,
+			  &i915->gpu_error.flags);
+
+unwind:
+		for_each_engine(active, i915, tmp) {
+			int ret;
+
+			if (!threads[tmp])
+				continue;
+
+			ret = kthread_stop(threads[tmp]);
+			if (ret) {
+				pr_err("kthread for active engine %s failed, err=%d\n",
+				       active->name, ret);
+				if (!err)
+					err = ret;
+			}
+			put_task_struct(threads[tmp]);
+
+			if (resets[tmp] != i915_reset_engine_count(&i915->gpu_error,
+								   active)) {
+				pr_err("Innocent engine %s was reset (count=%ld)\n",
+				       active->name,
+				       i915_reset_engine_count(&i915->gpu_error,
+							       active) - resets[tmp]);
+				err = -EIO;
+			}
+		}
+
+		if (global != i915_reset_count(&i915->gpu_error)) {
+			pr_err("Global reset (count=%ld)!\n",
+			       i915_reset_count(&i915->gpu_error) - global);
+			err = -EIO;
+		}
+
+		if (err)
+			break;
+
+		cond_resched();
+	}
+
 	if (i915_terminally_wedged(&i915->gpu_error))
 		err = -EIO;
 
@@ -356,9 +599,12 @@ static int igt_wait_reset(void *arg)
 	long timeout;
 	int err;
 
+	if (!intel_engine_can_store_dword(i915->engine[RCS]))
+		return 0;
+
 	/* Check that we detect a stuck waiter and issue a reset */
 
-	set_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
+	global_reset_lock(i915);
 
 	mutex_lock(&i915->drm.struct_mutex);
 	err = hang_init(&h, i915);
@@ -403,7 +649,7 @@ static int igt_wait_reset(void *arg)
 	hang_fini(&h);
 unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
-	clear_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
+	global_reset_unlock(i915);
 
 	if (i915_terminally_wedged(&i915->gpu_error))
 		return -EIO;
@@ -421,10 +667,8 @@ static int igt_reset_queue(void *arg)
 
 	/* Check that we replay pending requests following a hang */
 
-	if (!igt_can_mi_store_dword_imm(i915))
-		return 0;
+	global_reset_lock(i915);
 
-	set_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
 	mutex_lock(&i915->drm.struct_mutex);
 	err = hang_init(&h, i915);
 	if (err)
@@ -435,6 +679,9 @@ static int igt_reset_queue(void *arg)
 		IGT_TIMEOUT(end_time);
 		unsigned int count;
 
+		if (!intel_engine_can_store_dword(engine))
+			continue;
+
 		prev = hang_create_request(&h, engine, i915->kernel_context);
 		if (IS_ERR(prev)) {
 			err = PTR_ERR(prev);
@@ -471,7 +718,7 @@ static int igt_reset_queue(void *arg)
 
 			reset_count = fake_hangcheck(prev);
 
-			i915_reset(i915);
+			i915_reset(i915, I915_RESET_QUIET);
 
 			GEM_BUG_ON(test_bit(I915_RESET_HANDOFF,
 					    &i915->gpu_error.flags));
@@ -518,7 +765,7 @@ static int igt_reset_queue(void *arg)
 	hang_fini(&h);
 unlock:
 	mutex_unlock(&i915->drm.struct_mutex);
-	clear_bit(I915_RESET_BACKOFF, &i915->gpu_error.flags);
+	global_reset_unlock(i915);
 
 	if (i915_terminally_wedged(&i915->gpu_error))
 		return -EIO;
@@ -526,13 +773,83 @@ static int igt_reset_queue(void *arg)
 	return err;
 }
 
+static int igt_handle_error(void *arg)
+{
+	struct drm_i915_private *i915 = arg;
+	struct intel_engine_cs *engine = i915->engine[RCS];
+	struct hang h;
+	struct drm_i915_gem_request *rq;
+	struct i915_gpu_state *error;
+	int err;
+
+	/* Check that we can issue a global GPU and engine reset */
+
+	if (!intel_has_reset_engine(i915))
+		return 0;
+
+	if (!intel_engine_can_store_dword(i915->engine[RCS]))
+		return 0;
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	err = hang_init(&h, i915);
+	if (err)
+		goto err_unlock;
+
+	rq = hang_create_request(&h, engine, i915->kernel_context);
+	if (IS_ERR(rq)) {
+		err = PTR_ERR(rq);
+		goto err_fini;
+	}
+
+	i915_gem_request_get(rq);
+	__i915_add_request(rq, true);
+
+	if (!wait_for_hang(&h, rq)) {
+		pr_err("Failed to start request %x\n", rq->fence.seqno);
+		err = -EIO;
+		goto err_request;
+	}
+
+	mutex_unlock(&i915->drm.struct_mutex);
+
+	/* Temporarily disable error capture */
+	error = xchg(&i915->gpu_error.first_error, (void *)-1);
+
+	engine->hangcheck.stalled = true;
+	engine->hangcheck.seqno = intel_engine_get_seqno(engine);
+
+	i915_handle_error(i915, intel_engine_flag(engine), "%s", __func__);
+
+	xchg(&i915->gpu_error.first_error, error);
+
+	mutex_lock(&i915->drm.struct_mutex);
+
+	if (rq->fence.error != -EIO) {
+		pr_err("Guilty request not identified!\n");
+		err = -EINVAL;
+		goto err_request;
+	}
+
+err_request:
+	i915_gem_request_put(rq);
+err_fini:
+	hang_fini(&h);
+err_unlock:
+	mutex_unlock(&i915->drm.struct_mutex);
+	return err;
+}
+
 int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
 {
 	static const struct i915_subtest tests[] = {
 		SUBTEST(igt_hang_sanitycheck),
 		SUBTEST(igt_global_reset),
+		SUBTEST(igt_reset_engine),
+		SUBTEST(igt_reset_active_engines),
 		SUBTEST(igt_wait_reset),
 		SUBTEST(igt_reset_queue),
+		SUBTEST(igt_handle_error),
 	};
 
 	if (!intel_has_gpu_reset(i915))
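
Note on global_reset_lock()/global_reset_unlock() above: the helpers treat the reset flag bits as a set of locks, taking the global backoff bit first and then one bit per engine, and releasing them in the opposite grouping. The following is a rough userspace sketch of that idea using C11 atomics. The bit numbers `RESET_BACKOFF` and `RESET_ENGINE` and the function names are assumptions for illustration only, and the busy-wait loops stand in for the kernel's `wait_event()` and `wait_on_bit()` sleeps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit numbers; the real values live in the i915 headers. */
#define RESET_BACKOFF	0u
#define RESET_ENGINE	2u

/* Atomically set a bit and report whether it was already set. */
static bool test_and_set_flag(atomic_ulong *flags, unsigned int bit)
{
	unsigned long mask = 1ul << bit;

	return (atomic_fetch_or(flags, mask) & mask) != 0;
}

static void clear_flag(atomic_ulong *flags, unsigned int bit)
{
	atomic_fetch_and(flags, ~(1ul << bit));
}

/* Take the global bit, then every per-engine bit. */
static void reset_lock(atomic_ulong *flags, unsigned int num_engines)
{
	unsigned int i;

	while (test_and_set_flag(flags, RESET_BACKOFF))
		;	/* spin; the kernel helper sleeps on a waitqueue */

	for (i = 0; i < num_engines; i++)
		while (test_and_set_flag(flags, RESET_ENGINE + i))
			;	/* spin; the kernel helper uses wait_on_bit() */
}

/* Drop the per-engine bits, then the global bit. */
static void reset_unlock(atomic_ulong *flags, unsigned int num_engines)
{
	unsigned int i;

	for (i = 0; i < num_engines; i++)
		clear_flag(flags, RESET_ENGINE + i);

	clear_flag(flags, RESET_BACKOFF);
}
```

Because every bit is taken with an atomic test-and-set, a second caller cannot pass the first loop until the holder has cleared the backoff bit, which is the exclusion the selftests appear to rely on when they replace the bare `set_bit()`/`clear_bit()` pair.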
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.c b/drivers/gpu/drm/i915/selftests/mock_context.c
index f8b9cc2..098ce643 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.c
+++ b/drivers/gpu/drm/i915/selftests/mock_context.c
@@ -40,18 +40,13 @@ mock_context(struct drm_i915_private *i915,
 	INIT_LIST_HEAD(&ctx->link);
 	ctx->i915 = i915;
 
-	ctx->vma_lut.ht_bits = VMA_HT_BITS;
-	ctx->vma_lut.ht_size = BIT(VMA_HT_BITS);
-	ctx->vma_lut.ht = kcalloc(ctx->vma_lut.ht_size,
-				  sizeof(*ctx->vma_lut.ht),
-				  GFP_KERNEL);
-	if (!ctx->vma_lut.ht)
-		goto err_free;
+	INIT_RADIX_TREE(&ctx->handles_vma, GFP_KERNEL);
+	INIT_LIST_HEAD(&ctx->handles_list);
 
-	ret = ida_simple_get(&i915->context_hw_ida,
+	ret = ida_simple_get(&i915->contexts.hw_ida,
 			     0, MAX_CONTEXT_HW_ID, GFP_KERNEL);
 	if (ret < 0)
-		goto err_vma_ht;
+		goto err_handles;
 	ctx->hw_id = ret;
 
 	if (name) {
@@ -66,9 +61,7 @@ mock_context(struct drm_i915_private *i915,
 
 	return ctx;
 
-err_vma_ht:
-	kvfree(ctx->vma_lut.ht);
-err_free:
+err_handles:
 	kfree(ctx);
 	return NULL;
 
@@ -86,3 +79,20 @@ void mock_context_close(struct i915_gem_context *ctx)
 
 	i915_gem_context_put(ctx);
 }
+
+void mock_init_contexts(struct drm_i915_private *i915)
+{
+	INIT_LIST_HEAD(&i915->contexts.list);
+	ida_init(&i915->contexts.hw_ida);
+
+	INIT_WORK(&i915->contexts.free_work, contexts_free_worker);
+	init_llist_head(&i915->contexts.free_list);
+}
+
+struct i915_gem_context *
+live_context(struct drm_i915_private *i915, struct drm_file *file)
+{
+	lockdep_assert_held(&i915->drm.struct_mutex);
+
+	return i915_gem_create_context(i915, file->driver_priv);
+}
diff --git a/drivers/gpu/drm/i915/selftests/mock_context.h b/drivers/gpu/drm/i915/selftests/mock_context.h
index 2427e5c..2f432c0 100644
--- a/drivers/gpu/drm/i915/selftests/mock_context.h
+++ b/drivers/gpu/drm/i915/selftests/mock_context.h
@@ -25,10 +25,15 @@
 #ifndef __MOCK_CONTEXT_H
 #define __MOCK_CONTEXT_H
 
+void mock_init_contexts(struct drm_i915_private *i915);
+
 struct i915_gem_context *
 mock_context(struct drm_i915_private *i915,
 	     const char *name);
 
 void mock_context_close(struct i915_gem_context *ctx);
 
+struct i915_gem_context *
+live_context(struct drm_i915_private *i915, struct drm_file *file);
+
 #endif /* !__MOCK_CONTEXT_H */
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.c b/drivers/gpu/drm/i915/selftests/mock_engine.c
index 5b18a2d..fc0fd74 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.c
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.c
@@ -123,10 +123,12 @@ static struct intel_ring *mock_ring(struct intel_engine_cs *engine)
 }
 
 struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
-				    const char *name)
+				    const char *name,
+				    int id)
 {
 	struct mock_engine *engine;
-	static int id;
+
+	GEM_BUG_ON(id >= I915_NUM_ENGINES);
 
 	engine = kzalloc(sizeof(*engine) + PAGE_SIZE, GFP_KERNEL);
 	if (!engine)
@@ -141,7 +143,7 @@ struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
 	/* minimal engine setup for requests */
 	engine->base.i915 = i915;
 	snprintf(engine->base.name, sizeof(engine->base.name), "%s", name);
-	engine->base.id = id++;
+	engine->base.id = id;
 	engine->base.status_page.page_addr = (void *)(engine + 1);
 
 	engine->base.context_pin = mock_context_pin;
diff --git a/drivers/gpu/drm/i915/selftests/mock_engine.h b/drivers/gpu/drm/i915/selftests/mock_engine.h
index e5e2402..133d0c2 100644
--- a/drivers/gpu/drm/i915/selftests/mock_engine.h
+++ b/drivers/gpu/drm/i915/selftests/mock_engine.h
@@ -40,7 +40,8 @@ struct mock_engine {
 };
 
 struct intel_engine_cs *mock_engine(struct drm_i915_private *i915,
-				    const char *name);
+				    const char *name,
+				    int id);
 void mock_engine_flush(struct intel_engine_cs *engine);
 void mock_engine_reset(struct intel_engine_cs *engine);
 void mock_engine_free(struct intel_engine_cs *engine);
diff --git a/drivers/gpu/drm/i915/selftests/mock_gem_device.c b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
index 8cdec45..6787234 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gem_device.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gem_device.c
@@ -22,6 +22,7 @@
  *
  */
 
+#include <linux/pm_domain.h>
 #include <linux/pm_runtime.h>
 
 #include "mock_engine.h"
@@ -53,15 +54,17 @@ static void mock_device_release(struct drm_device *dev)
 
 	mutex_lock(&i915->drm.struct_mutex);
 	mock_device_flush(i915);
+	i915_gem_contexts_lost(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	cancel_delayed_work_sync(&i915->gt.retire_work);
 	cancel_delayed_work_sync(&i915->gt.idle_work);
+	i915_gem_drain_workqueue(i915);
 
 	mutex_lock(&i915->drm.struct_mutex);
 	for_each_engine(engine, i915, id)
 		mock_engine_free(engine);
-	i915_gem_context_fini(i915);
+	i915_gem_contexts_fini(i915);
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	drain_workqueue(i915->wq);
@@ -108,6 +111,23 @@ static void mock_idle_work_handler(struct work_struct *work)
 {
 }
 
+static int pm_domain_resume(struct device *dev)
+{
+	return pm_generic_runtime_resume(dev);
+}
+
+static int pm_domain_suspend(struct device *dev)
+{
+	return pm_generic_runtime_suspend(dev);
+}
+
+static struct dev_pm_domain pm_domain = {
+	.ops = {
+		.runtime_suspend = pm_domain_suspend,
+		.runtime_resume = pm_domain_resume,
+	},
+};
+
 struct drm_i915_private *mock_gem_device(void)
 {
 	struct drm_i915_private *i915;
@@ -126,8 +146,10 @@ struct drm_i915_private *mock_gem_device(void)
 	dev_set_name(&pdev->dev, "mock");
 	dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
 
+	dev_pm_domain_set(&pdev->dev, &pm_domain);
+	pm_runtime_enable(&pdev->dev);
 	pm_runtime_dont_use_autosuspend(&pdev->dev);
-	pm_runtime_get_sync(&pdev->dev);
+	WARN_ON(pm_runtime_get_sync(&pdev->dev));
 
 	i915 = (struct drm_i915_private *)(pdev + 1);
 	pci_set_drvdata(pdev, i915);
@@ -160,7 +182,7 @@ struct drm_i915_private *mock_gem_device(void)
 	INIT_LIST_HEAD(&i915->mm.unbound_list);
 	INIT_LIST_HEAD(&i915->mm.bound_list);
 
-	ida_init(&i915->context_hw_ida);
+	mock_init_contexts(i915);
 
 	INIT_DELAYED_WORK(&i915->gt.retire_work, mock_retire_work_handler);
 	INIT_DELAYED_WORK(&i915->gt.idle_work, mock_idle_work_handler);
@@ -204,7 +226,7 @@ struct drm_i915_private *mock_gem_device(void)
 	mutex_unlock(&i915->drm.struct_mutex);
 
 	mkwrite_device_info(i915)->ring_mask = BIT(0);
-	i915->engine[RCS] = mock_engine(i915, "mock");
+	i915->engine[RCS] = mock_engine(i915, "mock", RCS);
 	if (!i915->engine[RCS])
 		goto err_priorities;
 
diff --git a/drivers/gpu/drm/i915/selftests/mock_gtt.c b/drivers/gpu/drm/i915/selftests/mock_gtt.c
index a61309c..f2118cf 100644
--- a/drivers/gpu/drm/i915/selftests/mock_gtt.c
+++ b/drivers/gpu/drm/i915/selftests/mock_gtt.c
@@ -33,8 +33,7 @@ static void mock_insert_page(struct i915_address_space *vm,
 }
 
 static void mock_insert_entries(struct i915_address_space *vm,
-				struct sg_table *st,
-				u64 start,
+				struct i915_vma *vma,
 				enum i915_cache_level level, u32 flags)
 {
 }
diff --git a/drivers/gpu/drm/imx/imx-drm-core.c b/drivers/gpu/drm/imx/imx-drm-core.c
index 95e2181..f91cb72 100644
--- a/drivers/gpu/drm/imx/imx-drm-core.c
+++ b/drivers/gpu/drm/imx/imx-drm-core.c
@@ -115,7 +115,7 @@ static void imx_drm_atomic_commit_tail(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
 	struct drm_plane *plane;
-	struct drm_plane_state *old_plane_state;
+	struct drm_plane_state *old_plane_state, *new_plane_state;
 	bool plane_disabling = false;
 	int i;
 
@@ -127,15 +127,15 @@ static void imx_drm_atomic_commit_tail(struct drm_atomic_state *state)
 
 	drm_atomic_helper_commit_modeset_enables(dev, state);
 
-	for_each_plane_in_state(state, plane, old_plane_state, i) {
-		if (drm_atomic_plane_disabling(old_plane_state, plane->state))
+	for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
+		if (drm_atomic_plane_disabling(old_plane_state, new_plane_state))
 			plane_disabling = true;
 	}
 
 	if (plane_disabling) {
 		drm_atomic_helper_wait_for_vblanks(dev, state);
 
-		for_each_plane_in_state(state, plane, old_plane_state, i)
+		for_each_old_plane_in_state(state, plane, old_plane_state, i)
 			ipu_plane_disable_deferred(plane);
 
 	}
@@ -182,8 +182,6 @@ static struct drm_driver imx_drm_driver = {
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops		= &drm_gem_cma_vm_ops,
 	.dumb_create		= drm_gem_cma_dumb_create,
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset,
-	.dumb_destroy		= drm_gem_dumb_destroy,
 
 	.prime_handle_to_fd	= drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle	= drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/imx/imx-ldb.c b/drivers/gpu/drm/imx/imx-ldb.c
index 8b05ecb..56dd7a9 100644
--- a/drivers/gpu/drm/imx/imx-ldb.c
+++ b/drivers/gpu/drm/imx/imx-ldb.c
@@ -389,7 +389,6 @@ static int imx_ldb_encoder_atomic_check(struct drm_encoder *encoder,
 
 
 static const struct drm_connector_funcs imx_ldb_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = imx_drm_connector_destroy,
 	.reset = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/imx/imx-tve.c b/drivers/gpu/drm/imx/imx-tve.c
index 4826bb7..bc27c26 100644
--- a/drivers/gpu/drm/imx/imx-tve.c
+++ b/drivers/gpu/drm/imx/imx-tve.c
@@ -341,7 +341,6 @@ static int imx_tve_atomic_check(struct drm_encoder *encoder,
 }
 
 static const struct drm_connector_funcs imx_tve_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = imx_drm_connector_destroy,
 	.reset = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/imx/ipuv3-crtc.c b/drivers/gpu/drm/imx/ipuv3-crtc.c
index 5456c15..53e0b24 100644
--- a/drivers/gpu/drm/imx/ipuv3-crtc.c
+++ b/drivers/gpu/drm/imx/ipuv3-crtc.c
@@ -50,7 +50,8 @@ static inline struct ipu_crtc *to_ipu_crtc(struct drm_crtc *crtc)
 	return container_of(crtc, struct ipu_crtc, base);
 }
 
-static void ipu_crtc_enable(struct drm_crtc *crtc)
+static void ipu_crtc_atomic_enable(struct drm_crtc *crtc,
+				   struct drm_crtc_state *old_state)
 {
 	struct ipu_crtc *ipu_crtc = to_ipu_crtc(crtc);
 	struct ipu_soc *ipu = dev_get_drvdata(ipu_crtc->dev->parent);
@@ -293,7 +294,7 @@ static const struct drm_crtc_helper_funcs ipu_helper_funcs = {
 	.atomic_check = ipu_crtc_atomic_check,
 	.atomic_begin = ipu_crtc_atomic_begin,
 	.atomic_disable = ipu_crtc_atomic_disable,
-	.enable = ipu_crtc_enable,
+	.atomic_enable = ipu_crtc_atomic_enable,
 };
 
 static void ipu_put_resources(struct ipu_crtc *ipu_crtc)
diff --git a/drivers/gpu/drm/imx/ipuv3-plane.c b/drivers/gpu/drm/imx/ipuv3-plane.c
index d384598..cf98596 100644
--- a/drivers/gpu/drm/imx/ipuv3-plane.c
+++ b/drivers/gpu/drm/imx/ipuv3-plane.c
@@ -496,6 +496,27 @@ static int ipu_chan_assign_axi_id(int ipu_chan)
 	}
 }
 
+static void ipu_calculate_bursts(u32 width, u32 cpp, u32 stride,
+				 u8 *burstsize, u8 *num_bursts)
+{
+	const unsigned int width_bytes = width * cpp;
+	unsigned int npb, bursts;
+
+	/* Maximum number of pixels per burst without overshooting stride */
+	for (npb = 64 / cpp; npb > 0; --npb) {
+		if (round_up(width_bytes, npb * cpp) <= stride)
+			break;
+	}
+	*burstsize = npb;
+
+	/* Maximum number of consecutive bursts without overshooting stride */
+	for (bursts = 8; bursts > 1; bursts /= 2) {
+		if (round_up(width_bytes, npb * cpp * bursts) <= stride)
+			break;
+	}
+	*num_bursts = bursts;
+}
+
 static void ipu_plane_atomic_update(struct drm_plane *plane,
 				    struct drm_plane_state *old_state)
 {
@@ -509,6 +530,9 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 	unsigned long alpha_eba = 0;
 	enum ipu_color_space ics;
 	unsigned int axi_id = 0;
+	const struct drm_format_info *info;
+	u8 burstsize, num_bursts;
+	u32 width, height;
 	int active;
 
 	if (ipu_plane->dp_flow == IPU_DP_FLOW_SYNC_FG)
@@ -525,8 +549,8 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 		ipu_prg_channel_configure(ipu_plane->ipu_ch, axi_id,
 					  drm_rect_width(&state->src) >> 16,
 					  drm_rect_height(&state->src) >> 16,
-					  state->fb->pitches[0],
-					  state->fb->format->format, &eba);
+					  fb->pitches[0],
+					  fb->format->format, &eba);
 	}
 
 	if (old_state->fb && !drm_atomic_crtc_needs_modeset(crtc_state)) {
@@ -555,7 +579,7 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 		ipu_dp_setup_channel(ipu_plane->dp, ics,
 					IPUV3_COLORSPACE_UNKNOWN);
 		/* Enable local alpha on partial plane */
-		switch (state->fb->format->format) {
+		switch (fb->format->format) {
 		case DRM_FORMAT_ARGB1555:
 		case DRM_FORMAT_ABGR1555:
 		case DRM_FORMAT_RGBA5551:
@@ -581,15 +605,21 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 
 	ipu_dmfc_config_wait4eot(ipu_plane->dmfc, drm_rect_width(dst));
 
+	width = drm_rect_width(&state->src) >> 16;
+	height = drm_rect_height(&state->src) >> 16;
+	info = drm_format_info(fb->format->format);
+	ipu_calculate_bursts(width, info->cpp[0], fb->pitches[0],
+			     &burstsize, &num_bursts);
+
 	ipu_cpmem_zero(ipu_plane->ipu_ch);
-	ipu_cpmem_set_resolution(ipu_plane->ipu_ch,
-				 drm_rect_width(&state->src) >> 16,
-				 drm_rect_height(&state->src) >> 16);
-	ipu_cpmem_set_fmt(ipu_plane->ipu_ch, state->fb->format->format);
+	ipu_cpmem_set_resolution(ipu_plane->ipu_ch, width, height);
+	ipu_cpmem_set_fmt(ipu_plane->ipu_ch, fb->format->format);
+	ipu_cpmem_set_burstsize(ipu_plane->ipu_ch, burstsize);
 	ipu_cpmem_set_high_priority(ipu_plane->ipu_ch);
 	ipu_idmac_set_double_buffer(ipu_plane->ipu_ch, 1);
-	ipu_cpmem_set_stride(ipu_plane->ipu_ch, state->fb->pitches[0]);
+	ipu_cpmem_set_stride(ipu_plane->ipu_ch, fb->pitches[0]);
 	ipu_cpmem_set_axi_id(ipu_plane->ipu_ch, axi_id);
+
 	switch (fb->format->format) {
 	case DRM_FORMAT_YUV420:
 	case DRM_FORMAT_YVU420:
@@ -629,6 +659,7 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 	case DRM_FORMAT_RGBX8888_A8:
 	case DRM_FORMAT_BGRX8888_A8:
 		alpha_eba = drm_plane_state_to_eba(state, 1);
+		num_bursts = 0;
 
 		dev_dbg(ipu_plane->base.dev->dev, "phys = %lu %lu, x = %d, y = %d",
 			eba, alpha_eba, state->src.x1 >> 16, state->src.y1 >> 16);
@@ -642,8 +673,7 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 		ipu_cpmem_set_format_passthrough(ipu_plane->alpha_ch, 8);
 		ipu_cpmem_set_high_priority(ipu_plane->alpha_ch);
 		ipu_idmac_set_double_buffer(ipu_plane->alpha_ch, 1);
-		ipu_cpmem_set_stride(ipu_plane->alpha_ch,
-				     state->fb->pitches[1]);
+		ipu_cpmem_set_stride(ipu_plane->alpha_ch, fb->pitches[1]);
 		ipu_cpmem_set_burstsize(ipu_plane->alpha_ch, 16);
 		ipu_cpmem_set_buffer(ipu_plane->alpha_ch, 0, alpha_eba);
 		ipu_cpmem_set_buffer(ipu_plane->alpha_ch, 1, alpha_eba);
@@ -655,6 +685,7 @@ static void ipu_plane_atomic_update(struct drm_plane *plane,
 	}
 	ipu_cpmem_set_buffer(ipu_plane->ipu_ch, 0, eba);
 	ipu_cpmem_set_buffer(ipu_plane->ipu_ch, 1, eba);
+	ipu_idmac_lock_enable(ipu_plane->ipu_ch, num_bursts);
 	ipu_plane_enable(ipu_plane);
 }
 
@@ -673,7 +704,7 @@ int ipu_planes_assign_pre(struct drm_device *dev,
 	int available_pres = ipu_prg_max_active_channels();
 	int i;
 
-	for_each_plane_in_state(state, plane, plane_state, i) {
+	for_each_new_plane_in_state(state, plane, plane_state, i) {
 		struct ipu_plane_state *ipu_state =
 				to_ipu_plane_state(plane_state);
 		struct ipu_plane *ipu_plane = to_ipu_plane(plane);
@@ -716,8 +747,8 @@ struct ipu_plane *ipu_plane_init(struct drm_device *dev, struct ipu_soc *ipu,
 
 	ret = drm_universal_plane_init(dev, &ipu_plane->base, possible_crtcs,
 				       &ipu_plane_funcs, ipu_plane_formats,
-				       ARRAY_SIZE(ipu_plane_formats), type,
-				       NULL);
+				       ARRAY_SIZE(ipu_plane_formats),
+				       NULL, type, NULL);
 	if (ret) {
 		DRM_ERROR("failed to initialize plane\n");
 		kfree(ipu_plane);
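
Note on ipu_calculate_bursts() above: the helper picks the largest burst size (in pixels) and the largest number of back-to-back bursts such that a line, rounded up to whole bursts, still fits inside the stride. The following is a userspace sketch of the same search under two stated assumptions: `round_up_multiple` is a hypothetical helper that rounds with plain integer arithmetic (the kernel's `round_up()` macro is meant for power-of-two steps, so results could differ for other step sizes), and the asserts encode preconditions the kernel function does not check itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round x up to the next multiple of step (step > 0). */
static unsigned int round_up_multiple(unsigned int x, unsigned int step)
{
	return ((x + step - 1) / step) * step;
}

/*
 * Sketch of the burst search: width in pixels, cpp in bytes per pixel,
 * stride in bytes per line.
 */
static void calculate_bursts(uint32_t width, uint32_t cpp, uint32_t stride,
			     uint8_t *burstsize, uint8_t *num_bursts)
{
	const unsigned int width_bytes = width * cpp;
	unsigned int npb, bursts;

	/* Assumed preconditions, so that npb never reaches zero. */
	assert(cpp >= 1 && cpp <= 64);
	assert(stride >= width_bytes);

	/* Largest number of pixels per burst that does not overshoot. */
	for (npb = 64 / cpp; npb > 0; --npb) {
		if (round_up_multiple(width_bytes, npb * cpp) <= stride)
			break;
	}
	*burstsize = (uint8_t)npb;

	/* Largest count of consecutive bursts (8, 4, 2) that fits, else 1. */
	for (bursts = 8; bursts > 1; bursts /= 2) {
		if (round_up_multiple(width_bytes, npb * cpp * bursts) <= stride)
			break;
	}
	*num_bursts = (uint8_t)bursts;
}
```

Worked by hand with this sketch: a 1920-pixel line at 4 bytes per pixel with a 7680-byte stride gives 16 pixels per burst and 8 consecutive bursts, since 7680 is a multiple of both 64 and 512. A 1366-pixel line with a tightly packed 5464-byte stride drops to 2 pixels per burst and a single burst, because 5464 is a multiple of 8 but not of any larger candidate step.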
diff --git a/drivers/gpu/drm/imx/parallel-display.c b/drivers/gpu/drm/imx/parallel-display.c
index 8aca202..8def97d 100644
--- a/drivers/gpu/drm/imx/parallel-display.c
+++ b/drivers/gpu/drm/imx/parallel-display.c
@@ -135,7 +135,6 @@ static int imx_pd_encoder_atomic_check(struct drm_encoder *encoder,
 }
 
 static const struct drm_connector_funcs imx_pd_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = imx_drm_connector_destroy,
 	.reset = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_color.c b/drivers/gpu/drm/mediatek/mtk_disp_color.c
index ef79a6d..f609b62 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_color.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_color.c
@@ -84,8 +84,8 @@ static int mtk_disp_color_bind(struct device *dev, struct device *master,
 
 	ret = mtk_ddp_comp_register(drm_dev, &priv->ddp_comp);
 	if (ret < 0) {
-		dev_err(dev, "Failed to register component %s: %d\n",
-			dev->of_node->full_name, ret);
+		dev_err(dev, "Failed to register component %pOF: %d\n",
+			dev->of_node, ret);
 		return ret;
 	}
 
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index 35bc5ba..978782a 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -235,8 +235,8 @@ static int mtk_disp_ovl_bind(struct device *dev, struct device *master,
 
 	ret = mtk_ddp_comp_register(drm_dev, &priv->ddp_comp);
 	if (ret < 0) {
-		dev_err(dev, "Failed to register component %s: %d\n",
-			dev->of_node->full_name, ret);
+		dev_err(dev, "Failed to register component %pOF: %d\n",
+			dev->of_node, ret);
 		return ret;
 	}
 
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
index b68a513..585943c 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
@@ -155,8 +155,8 @@ static int mtk_disp_rdma_bind(struct device *dev, struct device *master,
 
 	ret = mtk_ddp_comp_register(drm_dev, &priv->ddp_comp);
 	if (ret < 0) {
-		dev_err(dev, "Failed to register component %s: %d\n",
-			dev->of_node->full_name, ret);
+		dev_err(dev, "Failed to register component %pOF: %d\n",
+			dev->of_node, ret);
 		return ret;
 	}
 
diff --git a/drivers/gpu/drm/mediatek/mtk_dpi.c b/drivers/gpu/drm/mediatek/mtk_dpi.c
index 32ca351..e80a603 100644
--- a/drivers/gpu/drm/mediatek/mtk_dpi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dpi.c
@@ -605,8 +605,8 @@ static int mtk_dpi_bind(struct device *dev, struct device *master, void *data)
 
 	ret = mtk_ddp_comp_register(drm_dev, &dpi->ddp_comp);
 	if (ret < 0) {
-		dev_err(dev, "Failed to register component %s: %d\n",
-			dev->of_node->full_name, ret);
+		dev_err(dev, "Failed to register component %pOF: %d\n",
+			dev->of_node, ret);
 		return ret;
 	}
 
@@ -710,7 +710,7 @@ static int mtk_dpi_probe(struct platform_device *pdev)
 	if (!bridge_node)
 		return -ENODEV;
 
-	dev_info(dev, "Found bridge node: %s\n", bridge_node->full_name);
+	dev_info(dev, "Found bridge node: %pOF\n", bridge_node);
 
 	dpi->bridge = of_drm_find_bridge(bridge_node);
 	of_node_put(bridge_node);
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index cb32c93..658b8dd 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -366,7 +366,8 @@ static void mtk_crtc_ddp_config(struct drm_crtc *crtc)
 	}
 }
 
-static void mtk_drm_crtc_enable(struct drm_crtc *crtc)
+static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc,
+				       struct drm_crtc_state *old_state)
 {
 	struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
 	struct mtk_ddp_comp *ovl = mtk_crtc->ddp_comp[0];
@@ -390,7 +391,8 @@ static void mtk_drm_crtc_enable(struct drm_crtc *crtc)
 	mtk_crtc->enabled = true;
 }
 
-static void mtk_drm_crtc_disable(struct drm_crtc *crtc)
+static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,
+					struct drm_crtc_state *old_state)
 {
 	struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
 	struct mtk_ddp_comp *ovl = mtk_crtc->ddp_comp[0];
@@ -487,10 +489,10 @@ static const struct drm_crtc_funcs mtk_crtc_funcs = {
 static const struct drm_crtc_helper_funcs mtk_crtc_helper_funcs = {
 	.mode_fixup	= mtk_drm_crtc_mode_fixup,
 	.mode_set_nofb	= mtk_drm_crtc_mode_set_nofb,
-	.enable		= mtk_drm_crtc_enable,
-	.disable	= mtk_drm_crtc_disable,
 	.atomic_begin	= mtk_drm_crtc_atomic_begin,
 	.atomic_flush	= mtk_drm_crtc_atomic_flush,
+	.atomic_enable	= mtk_drm_crtc_atomic_enable,
+	.atomic_disable	= mtk_drm_crtc_atomic_disable,
 };
 
 static int mtk_drm_crtc_init(struct drm_device *drm,
@@ -577,8 +579,7 @@ int mtk_drm_crtc_create(struct drm_device *drm_dev,
 		node = priv->comp_node[comp_id];
 		comp = priv->ddp_comp[comp_id];
 		if (!comp) {
-			dev_err(dev, "Component %s not initialized\n",
-				node->full_name);
+			dev_err(dev, "Component %pOF not initialized\n", node);
 			ret = -ENODEV;
 			goto unprepare;
 		}
@@ -586,8 +587,8 @@ int mtk_drm_crtc_create(struct drm_device *drm_dev,
 		ret = clk_prepare(comp->clk);
 		if (ret) {
 			dev_err(dev,
-				"Failed to prepare clock for component %s: %d\n",
-				node->full_name, ret);
+				"Failed to prepare clock for component %pOF: %d\n",
+				node, ret);
 			goto unprepare;
 		}
 
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index 07d7ea2..4672317 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -295,15 +295,13 @@ int mtk_ddp_comp_init(struct device *dev, struct device_node *node,
 	larb_node = of_parse_phandle(node, "mediatek,larb", 0);
 	if (!larb_node) {
 		dev_err(dev,
-			"Missing mediadek,larb phandle in %s node\n",
-			node->full_name);
+			"Missing mediadek,larb phandle in %pOF node\n", node);
 		return -EINVAL;
 	}
 
 	larb_pdev = of_find_device_by_node(larb_node);
 	if (!larb_pdev) {
-		dev_warn(dev, "Waiting for larb device %s\n",
-			 larb_node->full_name);
+		dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
 		of_node_put(larb_node);
 		return -EPROBE_DEFER;
 	}
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index 41d2cff..a2ca90fc 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -48,11 +48,11 @@ static void mtk_atomic_schedule(struct mtk_drm_private *private,
 static void mtk_atomic_wait_for_fences(struct drm_atomic_state *state)
 {
 	struct drm_plane *plane;
-	struct drm_plane_state *plane_state;
+	struct drm_plane_state *new_plane_state;
 	int i;
 
-	for_each_plane_in_state(state, plane, plane_state, i)
-		mtk_fb_wait(plane->state->fb);
+	for_each_new_plane_in_state(state, plane, new_plane_state, i)
+		mtk_fb_wait(new_plane_state->fb);
 }
 
 static void mtk_atomic_complete(struct mtk_drm_private *private,
@@ -109,7 +109,12 @@ static int mtk_atomic_commit(struct drm_device *drm,
 	mutex_lock(&private->commit.lock);
 	flush_work(&private->commit.work);
 
-	drm_atomic_helper_swap_state(state, true);
+	ret = drm_atomic_helper_swap_state(state, true);
+	if (ret) {
+		mutex_unlock(&private->commit.lock);
+		drm_atomic_helper_cleanup_planes(drm, state);
+		return ret;
+	}
 
 	drm_atomic_state_get(state);
 	if (async)
@@ -187,8 +192,8 @@ static int mtk_drm_kms_init(struct drm_device *drm)
 
 	pdev = of_find_device_by_node(private->mutex_node);
 	if (!pdev) {
-		dev_err(drm->dev, "Waiting for disp-mutex device %s\n",
-			private->mutex_node->full_name);
+		dev_err(drm->dev, "Waiting for disp-mutex device %pOF\n",
+			private->mutex_node);
 		of_node_put(private->mutex_node);
 		return -EPROBE_DEFER;
 	}
@@ -266,7 +271,6 @@ static void mtk_drm_kms_deinit(struct drm_device *drm)
 {
 	drm_kms_helper_poll_fini(drm);
 
-	drm_vblank_cleanup(drm);
 	component_unbind_all(drm->dev, drm);
 	drm_mode_config_cleanup(drm);
 }
@@ -289,8 +293,6 @@ static struct drm_driver mtk_drm_driver = {
 	.gem_free_object_unlocked = mtk_drm_gem_free_object,
 	.gem_vm_ops = &drm_gem_cma_vm_ops,
 	.dumb_create = mtk_drm_gem_dumb_create,
-	.dumb_map_offset = mtk_drm_gem_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
@@ -417,8 +419,8 @@ static int mtk_drm_probe(struct platform_device *pdev)
 			continue;
 
 		if (!of_device_is_available(node)) {
-			dev_dbg(dev, "Skipping disabled component %s\n",
-				node->full_name);
+			dev_dbg(dev, "Skipping disabled component %pOF\n",
+				node);
 			continue;
 		}
 
@@ -431,8 +433,8 @@ static int mtk_drm_probe(struct platform_device *pdev)
 
 		comp_id = mtk_ddp_comp_get_id(node, comp_type);
 		if (comp_id < 0) {
-			dev_warn(dev, "Skipping unknown component %s\n",
-				 node->full_name);
+			dev_warn(dev, "Skipping unknown component %pOF\n",
+				 node);
 			continue;
 		}
 
@@ -448,8 +450,8 @@ static int mtk_drm_probe(struct platform_device *pdev)
 		    comp_type == MTK_DISP_RDMA ||
 		    comp_type == MTK_DSI ||
 		    comp_type == MTK_DPI) {
-			dev_info(dev, "Adding component match for %s\n",
-				 node->full_name);
+			dev_info(dev, "Adding component match for %pOF\n",
+				 node);
 			drm_of_component_match_add(dev, &match, compare_of,
 						   node);
 		} else {
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_fb.c b/drivers/gpu/drm/mediatek/mtk_drm_fb.c
index d4246c9..0d8d506 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_fb.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_fb.c
@@ -58,7 +58,7 @@ static void mtk_drm_fb_destroy(struct drm_framebuffer *fb)
 
 	drm_framebuffer_cleanup(fb);
 
-	drm_gem_object_unreference_unlocked(mtk_fb->gem_obj);
+	drm_gem_object_put_unlocked(mtk_fb->gem_obj);
 
 	kfree(mtk_fb);
 }
@@ -160,6 +160,6 @@ struct drm_framebuffer *mtk_drm_mode_fb_create(struct drm_device *dev,
 	return &mtk_fb->base;
 
 unreference:
-	drm_gem_object_unreference_unlocked(gem);
+	drm_gem_object_put_unlocked(gem);
 	return ERR_PTR(ret);
 }
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.c b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
index 7abc550..f595ac8 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.c
@@ -122,7 +122,7 @@ int mtk_drm_gem_dumb_create(struct drm_file *file_priv, struct drm_device *dev,
 		goto err_handle_create;
 
 	/* drop reference from allocate - handle holds it now. */
-	drm_gem_object_unreference_unlocked(&mtk_gem->base);
+	drm_gem_object_put_unlocked(&mtk_gem->base);
 
 	return 0;
 
@@ -131,31 +131,6 @@ int mtk_drm_gem_dumb_create(struct drm_file *file_priv, struct drm_device *dev,
 	return ret;
 }
 
-int mtk_drm_gem_dumb_map_offset(struct drm_file *file_priv,
-				struct drm_device *dev, uint32_t handle,
-				uint64_t *offset)
-{
-	struct drm_gem_object *obj;
-	int ret;
-
-	obj = drm_gem_object_lookup(file_priv, handle);
-	if (!obj) {
-		DRM_ERROR("failed to lookup gem object.\n");
-		return -EINVAL;
-	}
-
-	ret = drm_gem_create_mmap_offset(obj);
-	if (ret)
-		goto out;
-
-	*offset = drm_vma_node_offset_addr(&obj->vma_node);
-	DRM_DEBUG_KMS("offset = 0x%llx\n", *offset);
-
-out:
-	drm_gem_object_unreference_unlocked(obj);
-	return ret;
-}
-
 static int mtk_drm_gem_object_mmap(struct drm_gem_object *obj,
 				   struct vm_area_struct *vma)
 
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_gem.h b/drivers/gpu/drm/mediatek/mtk_drm_gem.h
index 2752718..534639b 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_gem.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_gem.h
@@ -46,9 +46,6 @@ struct mtk_drm_gem_obj *mtk_drm_gem_create(struct drm_device *dev, size_t size,
 					   bool alloc_kmap);
 int mtk_drm_gem_dumb_create(struct drm_file *file_priv, struct drm_device *dev,
 			    struct drm_mode_create_dumb *args);
-int mtk_drm_gem_dumb_map_offset(struct drm_file *file_priv,
-				struct drm_device *dev, uint32_t handle,
-				uint64_t *offset);
 int mtk_drm_gem_mmap(struct file *filp, struct vm_area_struct *vma);
 int mtk_drm_gem_mmap_buf(struct drm_gem_object *obj,
 			 struct vm_area_struct *vma);
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_plane.c b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
index 1a59b9a..6f12189 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_plane.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_plane.c
@@ -175,7 +175,7 @@ int mtk_plane_init(struct drm_device *dev, struct drm_plane *plane,
 
 	err = drm_universal_plane_init(dev, plane, possible_crtcs,
 				       &mtk_plane_funcs, formats,
-				       ARRAY_SIZE(formats), type, NULL);
+				       ARRAY_SIZE(formats), NULL, type, NULL);
 	if (err) {
 		DRM_ERROR("failed to initialize plane\n");
 		return err;
diff --git a/drivers/gpu/drm/mediatek/mtk_dsi.c b/drivers/gpu/drm/mediatek/mtk_dsi.c
index 97253c8..7e5e24c 100644
--- a/drivers/gpu/drm/mediatek/mtk_dsi.c
+++ b/drivers/gpu/drm/mediatek/mtk_dsi.c
@@ -766,7 +766,6 @@ static const struct drm_encoder_helper_funcs mtk_dsi_encoder_helper_funcs = {
 };
 
 static const struct drm_connector_funcs mtk_dsi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
@@ -1048,8 +1047,8 @@ static int mtk_dsi_bind(struct device *dev, struct device *master, void *data)
 
 	ret = mtk_ddp_comp_register(drm, &dsi->ddp_comp);
 	if (ret < 0) {
-		dev_err(dev, "Failed to register component %s: %d\n",
-			dev->of_node->full_name, ret);
+		dev_err(dev, "Failed to register component %pOF: %d\n",
+			dev->of_node, ret);
 		return ret;
 	}
 
diff --git a/drivers/gpu/drm/mediatek/mtk_hdmi.c b/drivers/gpu/drm/mediatek/mtk_hdmi.c
index 71eb4fb..690c675 100644
--- a/drivers/gpu/drm/mediatek/mtk_hdmi.c
+++ b/drivers/gpu/drm/mediatek/mtk_hdmi.c
@@ -975,7 +975,7 @@ static int mtk_hdmi_setup_avi_infoframe(struct mtk_hdmi *hdmi,
 	u8 buffer[17];
 	ssize_t err;
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (err < 0) {
 		dev_err(hdmi->dev,
 			"Failed to get AVI infoframe from mode: %zd\n", err);
@@ -1261,7 +1261,6 @@ static struct drm_encoder *mtk_hdmi_conn_best_enc(struct drm_connector *conn)
 }
 
 static const struct drm_connector_funcs mtk_hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = hdmi_conn_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = hdmi_conn_destroy,
@@ -1456,8 +1455,8 @@ static int mtk_hdmi_dt_parse_pdata(struct mtk_hdmi *hdmi,
 
 	cec_pdev = of_find_device_by_node(cec_np);
 	if (!cec_pdev) {
-		dev_err(hdmi->dev, "Waiting for CEC device %s\n",
-			cec_np->full_name);
+		dev_err(hdmi->dev, "Waiting for CEC device %pOF\n",
+			cec_np);
 		return -EPROBE_DEFER;
 	}
 	hdmi->cec_dev = &cec_pdev->dev;
@@ -1501,8 +1500,8 @@ static int mtk_hdmi_dt_parse_pdata(struct mtk_hdmi *hdmi,
 
 	i2c_np = of_parse_phandle(remote, "ddc-i2c-bus", 0);
 	if (!i2c_np) {
-		dev_err(dev, "Failed to find ddc-i2c-bus node in %s\n",
-			remote->full_name);
+		dev_err(dev, "Failed to find ddc-i2c-bus node in %pOF\n",
+			remote);
 		of_node_put(remote);
 		return -EINVAL;
 	}
diff --git a/drivers/gpu/drm/meson/meson_crtc.c b/drivers/gpu/drm/meson/meson_crtc.c
index c986eb0..5155f01 100644
--- a/drivers/gpu/drm/meson/meson_crtc.c
+++ b/drivers/gpu/drm/meson/meson_crtc.c
@@ -79,7 +79,8 @@ static const struct drm_crtc_funcs meson_crtc_funcs = {
 
 };
 
-static void meson_crtc_enable(struct drm_crtc *crtc)
+static void meson_crtc_atomic_enable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	struct meson_crtc *meson_crtc = to_meson_crtc(crtc);
 	struct drm_crtc_state *crtc_state = crtc->state;
@@ -102,7 +103,8 @@ static void meson_crtc_enable(struct drm_crtc *crtc)
 	priv->viu.osd1_enabled = true;
 }
 
-static void meson_crtc_disable(struct drm_crtc *crtc)
+static void meson_crtc_atomic_disable(struct drm_crtc *crtc,
+				      struct drm_crtc_state *old_state)
 {
 	struct meson_crtc *meson_crtc = to_meson_crtc(crtc);
 	struct meson_drm *priv = meson_crtc->priv;
@@ -149,10 +151,10 @@ static void meson_crtc_atomic_flush(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs meson_crtc_helper_funcs = {
-	.enable		= meson_crtc_enable,
-	.disable	= meson_crtc_disable,
 	.atomic_begin	= meson_crtc_atomic_begin,
 	.atomic_flush	= meson_crtc_atomic_flush,
+	.atomic_enable	= meson_crtc_atomic_enable,
+	.atomic_disable	= meson_crtc_atomic_disable,
 };
 
 void meson_crtc_irq(struct meson_drm *priv)
diff --git a/drivers/gpu/drm/meson/meson_drv.c b/drivers/gpu/drm/meson/meson_drv.c
index 4d98fac..7742c7d 100644
--- a/drivers/gpu/drm/meson/meson_drv.c
+++ b/drivers/gpu/drm/meson/meson_drv.c
@@ -116,8 +116,6 @@ static struct drm_driver meson_driver = {
 
 	/* GEM Ops */
 	.dumb_create		= drm_gem_cma_dumb_create,
-	.dumb_destroy		= drm_gem_dumb_destroy,
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset,
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops		= &drm_gem_cma_vm_ops,
 
@@ -303,9 +301,8 @@ static const struct component_master_ops meson_drv_master_ops = {
 
 static int compare_of(struct device *dev, void *data)
 {
-	DRM_DEBUG_DRIVER("Comparing of node %s with %s\n",
-			 of_node_full_name(dev->of_node),
-			 of_node_full_name(data));
+	DRM_DEBUG_DRIVER("Comparing of node %pOF with %pOF\n",
+			 dev->of_node, data);
 
 	return dev->of_node == data;
 }
diff --git a/drivers/gpu/drm/meson/meson_plane.c b/drivers/gpu/drm/meson/meson_plane.c
index a32d3b6..17e96fa 100644
--- a/drivers/gpu/drm/meson/meson_plane.c
+++ b/drivers/gpu/drm/meson/meson_plane.c
@@ -223,6 +223,7 @@ int meson_plane_create(struct meson_drm *priv)
 				 &meson_plane_funcs,
 				 supported_drm_formats,
 				 ARRAY_SIZE(supported_drm_formats),
+				 NULL,
 				 DRM_PLANE_TYPE_PRIMARY, "meson_primary_plane");
 
 	drm_plane_helper_add(plane, &meson_plane_helper_funcs);
diff --git a/drivers/gpu/drm/meson/meson_venc_cvbs.c b/drivers/gpu/drm/meson/meson_venc_cvbs.c
index 00775b3..79d95ca 100644
--- a/drivers/gpu/drm/meson/meson_venc_cvbs.c
+++ b/drivers/gpu/drm/meson/meson_venc_cvbs.c
@@ -118,7 +118,6 @@ static int meson_cvbs_connector_mode_valid(struct drm_connector *connector,
 }
 
 static const struct drm_connector_funcs meson_cvbs_connector_funcs = {
-	.dpms			= drm_atomic_helper_connector_dpms,
 	.detect			= meson_cvbs_connector_detect,
 	.fill_modes		= drm_helper_probe_single_connector_modes,
 	.destroy		= meson_cvbs_connector_destroy,
diff --git a/drivers/gpu/drm/mga/mga_drv.c b/drivers/gpu/drm/mga/mga_drv.c
index 63ba0699..1aad278 100644
--- a/drivers/gpu/drm/mga/mga_drv.c
+++ b/drivers/gpu/drm/mga/mga_drv.c
@@ -62,7 +62,6 @@ static struct drm_driver driver = {
 	.load = mga_driver_load,
 	.unload = mga_driver_unload,
 	.lastclose = mga_driver_lastclose,
-	.set_busid = drm_pci_set_busid,
 	.dma_quiescent = mga_driver_dma_quiescent,
 	.get_vblank_counter = mga_get_vblank_counter,
 	.enable_vblank = mga_enable_vblank,
@@ -90,12 +89,12 @@ static struct pci_driver mga_pci_driver = {
 static int __init mga_init(void)
 {
 	driver.num_ioctls = mga_max_ioctl;
-	return drm_pci_init(&driver, &mga_pci_driver);
+	return drm_legacy_pci_init(&driver, &mga_pci_driver);
 }
 
 static void __exit mga_exit(void)
 {
-	drm_pci_exit(&driver, &mga_pci_driver);
+	drm_legacy_pci_exit(&driver, &mga_pci_driver);
 }
 
 module_init(mga_init);
diff --git a/drivers/gpu/drm/mgag200/mgag200_cursor.c b/drivers/gpu/drm/mgag200/mgag200_cursor.c
index 2ac3fcb..968e203 100644
--- a/drivers/gpu/drm/mgag200/mgag200_cursor.c
+++ b/drivers/gpu/drm/mgag200/mgag200_cursor.c
@@ -248,7 +248,7 @@ int mga_crtc_cursor_set(struct drm_crtc *crtc,
 out_unreserve1:
 	mgag200_bo_unreserve(pixels_2);
 out_unref:
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return ret;
 }
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.c b/drivers/gpu/drm/mgag200/mgag200_drv.c
index 9ac0078..74cdde2 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.c
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.c
@@ -91,7 +91,6 @@ static struct drm_driver driver = {
 	.driver_features = DRIVER_GEM | DRIVER_MODESET,
 	.load = mgag200_driver_load,
 	.unload = mgag200_driver_unload,
-	.set_busid = drm_pci_set_busid,
 	.fops = &mgag200_driver_fops,
 	.name = DRIVER_NAME,
 	.desc = DRIVER_DESC,
@@ -103,7 +102,6 @@ static struct drm_driver driver = {
 	.gem_free_object_unlocked = mgag200_gem_free_object,
 	.dumb_create = mgag200_dumb_create,
 	.dumb_map_offset = mgag200_dumb_mmap_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 };
 
 static struct pci_driver mgag200_pci_driver = {
@@ -120,12 +118,13 @@ static int __init mgag200_init(void)
 
 	if (mgag200_modeset == 0)
 		return -EINVAL;
-	return drm_pci_init(&driver, &mgag200_pci_driver);
+
+	return pci_register_driver(&mgag200_pci_driver);
 }
 
 static void __exit mgag200_exit(void)
 {
-	drm_pci_exit(&driver, &mgag200_pci_driver);
+	pci_unregister_driver(&mgag200_pci_driver);
 }
 
 module_init(mgag200_init);
diff --git a/drivers/gpu/drm/mgag200/mgag200_drv.h b/drivers/gpu/drm/mgag200/mgag200_drv.h
index c88b6ec..04f1dfb 100644
--- a/drivers/gpu/drm/mgag200/mgag200_drv.h
+++ b/drivers/gpu/drm/mgag200/mgag200_drv.h
@@ -237,11 +237,6 @@ mgag200_bo(struct ttm_buffer_object *bo)
 {
 	return container_of(bo, struct mgag200_bo, bo);
 }
-				/* mgag200_crtc.c */
-void mga_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-			     u16 blue, int regno);
-void mga_crtc_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-			     u16 *blue, int regno);
 
 				/* mgag200_mode.c */
 int mgag200_modeset_init(struct mga_device *mdev);
diff --git a/drivers/gpu/drm/mgag200/mgag200_fb.c b/drivers/gpu/drm/mgag200/mgag200_fb.c
index 5d3b1fa..30726c9 100644
--- a/drivers/gpu/drm/mgag200/mgag200_fb.c
+++ b/drivers/gpu/drm/mgag200/mgag200_fb.c
@@ -210,7 +210,6 @@ static int mgag200fb_create(struct drm_fb_helper *helper,
 
 	strcpy(info->fix.id, "mgadrmfb");
 
-	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &mgag200fb_ops;
 
 	/* setup aperture base/size for vesafb takeover */
@@ -233,7 +232,7 @@ static int mgag200fb_create(struct drm_fb_helper *helper,
 err_alloc_fbi:
 	vfree(sysram);
 err_sysram:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 
 	return ret;
 }
@@ -246,7 +245,7 @@ static int mga_fbdev_destroy(struct drm_device *dev,
 	drm_fb_helper_unregister_fbi(&mfbdev->helper);
 
 	if (mfb->obj) {
-		drm_gem_object_unreference_unlocked(mfb->obj);
+		drm_gem_object_put_unlocked(mfb->obj);
 		mfb->obj = NULL;
 	}
 	drm_fb_helper_fini(&mfbdev->helper);
@@ -258,8 +257,6 @@ static int mga_fbdev_destroy(struct drm_device *dev,
 }
 
 static const struct drm_fb_helper_funcs mga_fb_helper_funcs = {
-	.gamma_set = mga_crtc_fb_gamma_set,
-	.gamma_get = mga_crtc_fb_gamma_get,
 	.fb_probe = mgag200fb_create,
 };
 
diff --git a/drivers/gpu/drm/mgag200/mgag200_main.c b/drivers/gpu/drm/mgag200/mgag200_main.c
index dce8a3e..780f983 100644
--- a/drivers/gpu/drm/mgag200/mgag200_main.c
+++ b/drivers/gpu/drm/mgag200/mgag200_main.c
@@ -18,7 +18,7 @@ static void mga_user_framebuffer_destroy(struct drm_framebuffer *fb)
 {
 	struct mga_framebuffer *mga_fb = to_mga_framebuffer(fb);
 
-	drm_gem_object_unreference_unlocked(mga_fb->obj);
+	drm_gem_object_put_unlocked(mga_fb->obj);
 	drm_framebuffer_cleanup(fb);
 	kfree(fb);
 }
@@ -59,13 +59,13 @@ mgag200_user_framebuffer_create(struct drm_device *dev,
 
 	mga_fb = kzalloc(sizeof(*mga_fb), GFP_KERNEL);
 	if (!mga_fb) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR(-ENOMEM);
 	}
 
 	ret = mgag200_framebuffer_init(dev, mga_fb, mode_cmd, obj);
 	if (ret) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		kfree(mga_fb);
 		return ERR_PTR(ret);
 	}
@@ -317,7 +317,7 @@ int mgag200_dumb_create(struct drm_file *file,
 		return ret;
 
 	ret = drm_gem_handle_create(file, gobj, &handle);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (ret)
 		return ret;
 
@@ -366,6 +366,6 @@ mgag200_dumb_mmap_offset(struct drm_file *file,
 	bo = gem_to_mga_bo(obj);
 	*offset = mgag200_bo_mmap_offset(bo);
 
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 	return 0;
 }
diff --git a/drivers/gpu/drm/mgag200/mgag200_mode.c b/drivers/gpu/drm/mgag200/mgag200_mode.c
index f4b5358..5e9cd4c 100644
--- a/drivers/gpu/drm/mgag200/mgag200_mode.c
+++ b/drivers/gpu/drm/mgag200/mgag200_mode.c
@@ -27,15 +27,19 @@
 
 static void mga_crtc_load_lut(struct drm_crtc *crtc)
 {
-	struct mga_crtc *mga_crtc = to_mga_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct mga_device *mdev = dev->dev_private;
 	struct drm_framebuffer *fb = crtc->primary->fb;
+	u16 *r_ptr, *g_ptr, *b_ptr;
 	int i;
 
 	if (!crtc->enabled)
 		return;
 
+	r_ptr = crtc->gamma_store;
+	g_ptr = r_ptr + crtc->gamma_size;
+	b_ptr = g_ptr + crtc->gamma_size;
+
 	WREG8(DAC_INDEX + MGA1064_INDEX, 0);
 
 	if (fb && fb->format->cpp[0] * 8 == 16) {
@@ -46,25 +50,27 @@ static void mga_crtc_load_lut(struct drm_crtc *crtc)
 				if (i > (MGAG200_LUT_SIZE >> 1)) {
 					r = b = 0;
 				} else {
-					r = mga_crtc->lut_r[i << 1];
-					b = mga_crtc->lut_b[i << 1];
+					r = *r_ptr++ >> 8;
+					b = *b_ptr++ >> 8;
+					r_ptr++;
+					b_ptr++;
 				}
 			} else {
-				r = mga_crtc->lut_r[i];
-				b = mga_crtc->lut_b[i];
+				r = *r_ptr++ >> 8;
+				b = *b_ptr++ >> 8;
 			}
 			/* VGA registers */
 			WREG8(DAC_INDEX + MGA1064_COL_PAL, r);
-			WREG8(DAC_INDEX + MGA1064_COL_PAL, mga_crtc->lut_g[i]);
+			WREG8(DAC_INDEX + MGA1064_COL_PAL, *g_ptr++ >> 8);
 			WREG8(DAC_INDEX + MGA1064_COL_PAL, b);
 		}
 		return;
 	}
 	for (i = 0; i < MGAG200_LUT_SIZE; i++) {
 		/* VGA registers */
-		WREG8(DAC_INDEX + MGA1064_COL_PAL, mga_crtc->lut_r[i]);
-		WREG8(DAC_INDEX + MGA1064_COL_PAL, mga_crtc->lut_g[i]);
-		WREG8(DAC_INDEX + MGA1064_COL_PAL, mga_crtc->lut_b[i]);
+		WREG8(DAC_INDEX + MGA1064_COL_PAL, *r_ptr++ >> 8);
+		WREG8(DAC_INDEX + MGA1064_COL_PAL, *g_ptr++ >> 8);
+		WREG8(DAC_INDEX + MGA1064_COL_PAL, *b_ptr++ >> 8);
 	}
 }
 
@@ -1399,14 +1405,6 @@ static int mga_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
 			      u16 *blue, uint32_t size,
 			      struct drm_modeset_acquire_ctx *ctx)
 {
-	struct mga_crtc *mga_crtc = to_mga_crtc(crtc);
-	int i;
-
-	for (i = 0; i < size; i++) {
-		mga_crtc->lut_r[i] = red[i] >> 8;
-		mga_crtc->lut_g[i] = green[i] >> 8;
-		mga_crtc->lut_b[i] = blue[i] >> 8;
-	}
 	mga_crtc_load_lut(crtc);
 
 	return 0;
@@ -1455,14 +1453,12 @@ static const struct drm_crtc_helper_funcs mga_helper_funcs = {
 	.mode_set_base = mga_crtc_mode_set_base,
 	.prepare = mga_crtc_prepare,
 	.commit = mga_crtc_commit,
-	.load_lut = mga_crtc_load_lut,
 };
 
 /* CRTC setup */
 static void mga_crtc_init(struct mga_device *mdev)
 {
 	struct mga_crtc *mga_crtc;
-	int i;
 
 	mga_crtc = kzalloc(sizeof(struct mga_crtc) +
 			      (MGAG200FB_CONN_LIMIT * sizeof(struct drm_connector *)),
@@ -1476,37 +1472,9 @@ static void mga_crtc_init(struct mga_device *mdev)
 	drm_mode_crtc_set_gamma_size(&mga_crtc->base, MGAG200_LUT_SIZE);
 	mdev->mode_info.crtc = mga_crtc;
 
-	for (i = 0; i < MGAG200_LUT_SIZE; i++) {
-		mga_crtc->lut_r[i] = i;
-		mga_crtc->lut_g[i] = i;
-		mga_crtc->lut_b[i] = i;
-	}
-
 	drm_crtc_helper_add(&mga_crtc->base, &mga_helper_funcs);
 }
 
-/** Sets the color ramps on behalf of fbcon */
-void mga_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-			      u16 blue, int regno)
-{
-	struct mga_crtc *mga_crtc = to_mga_crtc(crtc);
-
-	mga_crtc->lut_r[regno] = red >> 8;
-	mga_crtc->lut_g[regno] = green >> 8;
-	mga_crtc->lut_b[regno] = blue >> 8;
-}
-
-/** Gets the color ramps on behalf of fbcon */
-void mga_crtc_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-			      u16 *blue, int regno)
-{
-	struct mga_crtc *mga_crtc = to_mga_crtc(crtc);
-
-	*red = (u16)mga_crtc->lut_r[regno] << 8;
-	*green = (u16)mga_crtc->lut_g[regno] << 8;
-	*blue = (u16)mga_crtc->lut_b[regno] << 8;
-}
-
 /*
  * The encoder comes after the CRTC in the output pipeline, but before
  * the connector. It's responsible for ensuring that the digital
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
index 0e3828ed..7791313 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.c
@@ -486,8 +486,6 @@ struct msm_gpu *a3xx_gpu_init(struct drm_device *dev)
 	adreno_gpu = &a3xx_gpu->base;
 	gpu = &adreno_gpu->base;
 
-	a3xx_gpu->pdev = pdev;
-
 	gpu->perfcntrs = perfcntrs;
 	gpu->num_perfcntrs = ARRAY_SIZE(perfcntrs);
 
diff --git a/drivers/gpu/drm/msm/adreno/a3xx_gpu.h b/drivers/gpu/drm/msm/adreno/a3xx_gpu.h
index 85ff66c..ab60dc9 100644
--- a/drivers/gpu/drm/msm/adreno/a3xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a3xx_gpu.h
@@ -28,7 +28,6 @@
 
 struct a3xx_gpu {
 	struct adreno_gpu base;
-	struct platform_device *pdev;
 
 	/* if OCMEM is used for GMEM: */
 	uint32_t ocmem_base;
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
index 19abf22..58341ef 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.c
@@ -568,8 +568,6 @@ struct msm_gpu *a4xx_gpu_init(struct drm_device *dev)
 	adreno_gpu = &a4xx_gpu->base;
 	gpu = &adreno_gpu->base;
 
-	a4xx_gpu->pdev = pdev;
-
 	gpu->perfcntrs = NULL;
 	gpu->num_perfcntrs = 0;
 
diff --git a/drivers/gpu/drm/msm/adreno/a4xx_gpu.h b/drivers/gpu/drm/msm/adreno/a4xx_gpu.h
index 0124720..f757184 100644
--- a/drivers/gpu/drm/msm/adreno/a4xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a4xx_gpu.h
@@ -23,7 +23,6 @@
 
 struct a4xx_gpu {
 	struct adreno_gpu base;
-	struct platform_device *pdev;
 
 	/* if OCMEM is used for GMEM: */
 	uint32_t ocmem_base;
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index f9eae03..17c59d8 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -284,28 +284,14 @@ static int a5xx_me_init(struct msm_gpu *gpu)
 static struct drm_gem_object *a5xx_ucode_load_bo(struct msm_gpu *gpu,
 		const struct firmware *fw, u64 *iova)
 {
-	struct drm_device *drm = gpu->dev;
 	struct drm_gem_object *bo;
 	void *ptr;
 
-	bo = msm_gem_new_locked(drm, fw->size - 4, MSM_BO_UNCACHED);
-	if (IS_ERR(bo))
-		return bo;
+	ptr = msm_gem_kernel_new_locked(gpu->dev, fw->size - 4,
+		MSM_BO_UNCACHED | MSM_BO_GPU_READONLY, gpu->aspace, &bo, iova);
 
-	ptr = msm_gem_get_vaddr(bo);
-	if (!ptr) {
-		drm_gem_object_unreference(bo);
-		return ERR_PTR(-ENOMEM);
-	}
-
-	if (iova) {
-		int ret = msm_gem_get_iova(bo, gpu->aspace, iova);
-
-		if (ret) {
-			drm_gem_object_unreference(bo);
-			return ERR_PTR(ret);
-		}
-	}
+	if (IS_ERR(ptr))
+		return ERR_CAST(ptr);
 
 	memcpy(ptr, &fw->data[4], fw->size - 4);
 
@@ -372,8 +358,7 @@ static int a5xx_zap_shader_init(struct msm_gpu *gpu)
 {
 	static bool loaded;
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
-	struct a5xx_gpu *a5xx_gpu = to_a5xx_gpu(adreno_gpu);
-	struct platform_device *pdev = a5xx_gpu->pdev;
+	struct platform_device *pdev = gpu->pdev;
 	int ret;
 
 	/*
@@ -410,6 +395,7 @@ static int a5xx_zap_shader_init(struct msm_gpu *gpu)
 	  A5XX_RBBM_INT_0_MASK_RBBM_ETS_MS_TIMEOUT | \
 	  A5XX_RBBM_INT_0_MASK_RBBM_ATB_ASYNC_OVERFLOW | \
 	  A5XX_RBBM_INT_0_MASK_CP_HW_ERROR | \
+	  A5XX_RBBM_INT_0_MASK_MISC_HANG_DETECT | \
 	  A5XX_RBBM_INT_0_MASK_CP_CACHE_FLUSH_TS | \
 	  A5XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS | \
 	  A5XX_RBBM_INT_0_MASK_GPMU_VOLTAGE_DROOP)
@@ -812,6 +798,27 @@ static void a5xx_gpmu_err_irq(struct msm_gpu *gpu)
 	dev_err_ratelimited(gpu->dev->dev, "GPMU | voltage droop\n");
 }
 
+static void a5xx_fault_detect_irq(struct msm_gpu *gpu)
+{
+	struct drm_device *dev = gpu->dev;
+	struct msm_drm_private *priv = dev->dev_private;
+
+	dev_err(dev->dev, "gpu fault fence %x status %8.8X rb %4.4x/%4.4x ib1 %16.16llX/%4.4x ib2 %16.16llX/%4.4x\n",
+		gpu->funcs->last_fence(gpu),
+		gpu_read(gpu, REG_A5XX_RBBM_STATUS),
+		gpu_read(gpu, REG_A5XX_CP_RB_RPTR),
+		gpu_read(gpu, REG_A5XX_CP_RB_WPTR),
+		gpu_read64(gpu, REG_A5XX_CP_IB1_BASE, REG_A5XX_CP_IB1_BASE_HI),
+		gpu_read(gpu, REG_A5XX_CP_IB1_BUFSZ),
+		gpu_read64(gpu, REG_A5XX_CP_IB2_BASE, REG_A5XX_CP_IB2_BASE_HI),
+		gpu_read(gpu, REG_A5XX_CP_IB2_BUFSZ));
+
+	/* Turn off the hangcheck timer to keep it from bothering us */
+	del_timer(&gpu->hangcheck_timer);
+
+	queue_work(priv->wq, &gpu->recover_work);
+}
+
 #define RBBM_ERROR_MASK \
 	(A5XX_RBBM_INT_0_MASK_RBBM_AHB_ERROR | \
 	A5XX_RBBM_INT_0_MASK_RBBM_TRANSFER_TIMEOUT | \
@@ -838,6 +845,9 @@ static irqreturn_t a5xx_irq(struct msm_gpu *gpu)
 	if (status & A5XX_RBBM_INT_0_MASK_CP_HW_ERROR)
 		a5xx_cp_err_irq(gpu);
 
+	if (status & A5XX_RBBM_INT_0_MASK_MISC_HANG_DETECT)
+		a5xx_fault_detect_irq(gpu);
+
 	if (status & A5XX_RBBM_INT_0_MASK_UCHE_OOB_ACCESS)
 		a5xx_uche_err_irq(gpu);
 
@@ -1015,7 +1025,6 @@ struct msm_gpu *a5xx_gpu_init(struct drm_device *dev)
 	adreno_gpu = &a5xx_gpu->base;
 	gpu = &adreno_gpu->base;
 
-	a5xx_gpu->pdev = pdev;
 	adreno_gpu->registers = a5xx_registers;
 	adreno_gpu->reg_offsets = a5xx_register_offsets;
 
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
index 11370922..e944516 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.h
@@ -23,7 +23,6 @@
 
 struct a5xx_gpu {
 	struct adreno_gpu base;
-	struct platform_device *pdev;
 
 	struct drm_gem_object *pm4_bo;
 	uint64_t pm4_iova;
diff --git a/drivers/gpu/drm/msm/adreno/a5xx_power.c b/drivers/gpu/drm/msm/adreno/a5xx_power.c
index 87af6ee..04aab1d 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_power.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_power.c
@@ -294,16 +294,10 @@ void a5xx_gpmu_ucode_init(struct msm_gpu *gpu)
 	 */
 	bosize = (cmds_size + (cmds_size / TYPE4_MAX_PAYLOAD) + 1) << 2;
 
-	a5xx_gpu->gpmu_bo = msm_gem_new_locked(drm, bosize, MSM_BO_UNCACHED);
-	if (IS_ERR(a5xx_gpu->gpmu_bo))
-		goto err;
-
-	if (msm_gem_get_iova(a5xx_gpu->gpmu_bo, gpu->aspace,
-			&a5xx_gpu->gpmu_iova))
-		goto err;
-
-	ptr = msm_gem_get_vaddr(a5xx_gpu->gpmu_bo);
-	if (!ptr)
+	ptr = msm_gem_kernel_new_locked(drm, bosize,
+		MSM_BO_UNCACHED | MSM_BO_GPU_READONLY, gpu->aspace,
+		&a5xx_gpu->gpmu_bo, &a5xx_gpu->gpmu_iova);
+	if (IS_ERR(ptr))
 		goto err;
 
 	while (cmds_size > 0) {
diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.c b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
index 7414c6b..c8b4ac2 100644
--- a/drivers/gpu/drm/msm/adreno/adreno_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.c
@@ -337,11 +337,6 @@ void adreno_wait_ring(struct msm_gpu *gpu, uint32_t ndwords)
 		DRM_ERROR("%s: timeout waiting for ringbuffer space\n", gpu->name);
 }
 
-static const char *iommu_ports[] = {
-		"gfx3d_user", "gfx3d_priv",
-		"gfx3d1_user", "gfx3d1_priv",
-};
-
 int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 		struct adreno_gpu *adreno_gpu, const struct adreno_gpu_funcs *funcs)
 {
@@ -373,15 +368,15 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 
 	adreno_gpu_config.ringsz = RB_SIZE;
 
+	pm_runtime_set_autosuspend_delay(&pdev->dev, DRM_MSM_INACTIVE_PERIOD);
+	pm_runtime_use_autosuspend(&pdev->dev);
+	pm_runtime_enable(&pdev->dev);
+
 	ret = msm_gpu_init(drm, pdev, &adreno_gpu->base, &funcs->base,
 			adreno_gpu->info->name, &adreno_gpu_config);
 	if (ret)
 		return ret;
 
-	pm_runtime_set_autosuspend_delay(&pdev->dev, DRM_MSM_INACTIVE_PERIOD);
-	pm_runtime_use_autosuspend(&pdev->dev);
-	pm_runtime_enable(&pdev->dev);
-
 	ret = request_firmware(&adreno_gpu->pm4, adreno_gpu->info->pm4fw, drm->dev);
 	if (ret) {
 		dev_err(drm->dev, "failed to load %s PM4 firmware: %d\n",
@@ -396,37 +391,17 @@ int adreno_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 		return ret;
 	}
 
-	if (gpu->aspace && gpu->aspace->mmu) {
-		struct msm_mmu *mmu = gpu->aspace->mmu;
-		ret = mmu->funcs->attach(mmu, iommu_ports,
-				ARRAY_SIZE(iommu_ports));
-		if (ret)
-			return ret;
-	}
+	adreno_gpu->memptrs = msm_gem_kernel_new(drm,
+		sizeof(*adreno_gpu->memptrs), MSM_BO_UNCACHED, gpu->aspace,
+		&adreno_gpu->memptrs_bo, &adreno_gpu->memptrs_iova);
 
-	adreno_gpu->memptrs_bo = msm_gem_new(drm, sizeof(*adreno_gpu->memptrs),
-			MSM_BO_UNCACHED);
-	if (IS_ERR(adreno_gpu->memptrs_bo)) {
-		ret = PTR_ERR(adreno_gpu->memptrs_bo);
-		adreno_gpu->memptrs_bo = NULL;
-		dev_err(drm->dev, "could not allocate memptrs: %d\n", ret);
-		return ret;
-	}
-
-	adreno_gpu->memptrs = msm_gem_get_vaddr(adreno_gpu->memptrs_bo);
 	if (IS_ERR(adreno_gpu->memptrs)) {
-		dev_err(drm->dev, "could not vmap memptrs\n");
-		return -ENOMEM;
+		ret = PTR_ERR(adreno_gpu->memptrs);
+		adreno_gpu->memptrs = NULL;
+		dev_err(drm->dev, "could not allocate memptrs: %d\n", ret);
 	}
 
-	ret = msm_gem_get_iova(adreno_gpu->memptrs_bo, gpu->aspace,
-			&adreno_gpu->memptrs_iova);
-	if (ret) {
-		dev_err(drm->dev, "could not map memptrs: %d\n", ret);
-		return ret;
-	}
-
-	return 0;
+	return ret;
 }
 
 void adreno_gpu_cleanup(struct adreno_gpu *adreno_gpu)
@@ -446,10 +421,4 @@ void adreno_gpu_cleanup(struct adreno_gpu *adreno_gpu)
 	release_firmware(adreno_gpu->pfp);
 
 	msm_gpu_cleanup(gpu);
-
-	if (gpu->aspace) {
-		gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu,
-			iommu_ports, ARRAY_SIZE(iommu_ports));
-		msm_gem_address_space_put(gpu->aspace);
-	}
 }
diff --git a/drivers/gpu/drm/msm/dsi/dsi.c b/drivers/gpu/drm/msm/dsi/dsi.c
index 311c1c1..98742d7 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.c
+++ b/drivers/gpu/drm/msm/dsi/dsi.c
@@ -161,12 +161,17 @@ static const struct of_device_id dt_match[] = {
 	{}
 };
 
+static const struct dev_pm_ops dsi_pm_ops = {
+	SET_RUNTIME_PM_OPS(msm_dsi_runtime_suspend, msm_dsi_runtime_resume, NULL)
+};
+
 static struct platform_driver dsi_driver = {
 	.probe = dsi_dev_probe,
 	.remove = dsi_dev_remove,
 	.driver = {
 		.name = "msm_dsi",
 		.of_match_table = dt_match,
+		.pm = &dsi_pm_ops,
 	},
 };
 
diff --git a/drivers/gpu/drm/msm/dsi/dsi.h b/drivers/gpu/drm/msm/dsi/dsi.h
index 9e60173..2302046 100644
--- a/drivers/gpu/drm/msm/dsi/dsi.h
+++ b/drivers/gpu/drm/msm/dsi/dsi.h
@@ -179,6 +179,8 @@ void msm_dsi_host_destroy(struct mipi_dsi_host *host);
 int msm_dsi_host_modeset_init(struct mipi_dsi_host *host,
 					struct drm_device *dev);
 int msm_dsi_host_init(struct msm_dsi *msm_dsi);
+int msm_dsi_runtime_suspend(struct device *dev);
+int msm_dsi_runtime_resume(struct device *dev);
 
 /* dsi phy */
 struct msm_dsi_phy;
diff --git a/drivers/gpu/drm/msm/dsi/dsi_host.c b/drivers/gpu/drm/msm/dsi/dsi_host.c
index c7b612c..dbb31a0 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_host.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_host.c
@@ -135,7 +135,6 @@ struct msm_dsi_host {
 	struct completion video_comp;
 	struct mutex dev_mutex;
 	struct mutex cmd_mutex;
-	struct mutex clk_mutex;
 	spinlock_t intr_lock; /* Protect interrupt ctrl register */
 
 	u32 err_work_state;
@@ -221,6 +220,8 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
 		goto put_gdsc;
 	}
 
+	pm_runtime_get_sync(dev);
+
 	ret = regulator_enable(gdsc_reg);
 	if (ret) {
 		pr_err("%s: unable to enable gdsc\n", __func__);
@@ -247,6 +248,7 @@ static const struct msm_dsi_cfg_handler *dsi_get_config(
 	clk_disable_unprepare(ahb_clk);
 disable_gdsc:
 	regulator_disable(gdsc_reg);
+	pm_runtime_put_autosuspend(dev);
 put_clk:
 	clk_put(ahb_clk);
 put_gdsc:
@@ -455,6 +457,34 @@ static void dsi_bus_clk_disable(struct msm_dsi_host *msm_host)
 		clk_disable_unprepare(msm_host->bus_clks[i]);
 }
 
+int msm_dsi_runtime_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct msm_dsi *msm_dsi = platform_get_drvdata(pdev);
+	struct mipi_dsi_host *host = msm_dsi->host;
+	struct msm_dsi_host *msm_host = to_msm_dsi_host(host);
+
+	if (!msm_host->cfg_hnd)
+		return 0;
+
+	dsi_bus_clk_disable(msm_host);
+
+	return 0;
+}
+
+int msm_dsi_runtime_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct msm_dsi *msm_dsi = platform_get_drvdata(pdev);
+	struct mipi_dsi_host *host = msm_dsi->host;
+	struct msm_dsi_host *msm_host = to_msm_dsi_host(host);
+
+	if (!msm_host->cfg_hnd)
+		return 0;
+
+	return dsi_bus_clk_enable(msm_host);
+}
+
 static int dsi_link_clk_enable_6g(struct msm_dsi_host *msm_host)
 {
 	int ret;
@@ -596,35 +626,6 @@ static void dsi_link_clk_disable(struct msm_dsi_host *msm_host)
 	}
 }
 
-static int dsi_clk_ctrl(struct msm_dsi_host *msm_host, bool enable)
-{
-	int ret = 0;
-
-	mutex_lock(&msm_host->clk_mutex);
-	if (enable) {
-		ret = dsi_bus_clk_enable(msm_host);
-		if (ret) {
-			pr_err("%s: Can not enable bus clk, %d\n",
-				__func__, ret);
-			goto unlock_ret;
-		}
-		ret = dsi_link_clk_enable(msm_host);
-		if (ret) {
-			pr_err("%s: Can not enable link clk, %d\n",
-				__func__, ret);
-			dsi_bus_clk_disable(msm_host);
-			goto unlock_ret;
-		}
-	} else {
-		dsi_link_clk_disable(msm_host);
-		dsi_bus_clk_disable(msm_host);
-	}
-
-unlock_ret:
-	mutex_unlock(&msm_host->clk_mutex);
-	return ret;
-}
-
 static int dsi_calc_clk_rate(struct msm_dsi_host *msm_host)
 {
 	struct drm_display_mode *mode = msm_host->mode;
@@ -1699,6 +1700,7 @@ int msm_dsi_host_init(struct msm_dsi *msm_dsi)
 	}
 
 	msm_host->pdev = pdev;
+	msm_dsi->host = &msm_host->base;
 
 	ret = dsi_host_parse_dt(msm_host);
 	if (ret) {
@@ -1713,6 +1715,8 @@ int msm_dsi_host_init(struct msm_dsi *msm_dsi)
 		goto fail;
 	}
 
+	pm_runtime_enable(&pdev->dev);
+
 	msm_host->cfg_hnd = dsi_get_config(msm_host);
 	if (!msm_host->cfg_hnd) {
 		ret = -EINVAL;
@@ -1753,7 +1757,6 @@ int msm_dsi_host_init(struct msm_dsi *msm_dsi)
 	init_completion(&msm_host->video_comp);
 	mutex_init(&msm_host->dev_mutex);
 	mutex_init(&msm_host->cmd_mutex);
-	mutex_init(&msm_host->clk_mutex);
 	spin_lock_init(&msm_host->intr_lock);
 
 	/* setup workqueue */
@@ -1761,7 +1764,6 @@ int msm_dsi_host_init(struct msm_dsi *msm_dsi)
 	INIT_WORK(&msm_host->err_work, dsi_err_worker);
 	INIT_WORK(&msm_host->hpd_work, dsi_hpd_worker);
 
-	msm_dsi->host = &msm_host->base;
 	msm_dsi->id = msm_host->id;
 
 	DBG("Dsi Host %d initialized", msm_host->id);
@@ -1783,9 +1785,10 @@ void msm_dsi_host_destroy(struct mipi_dsi_host *host)
 		msm_host->workqueue = NULL;
 	}
 
-	mutex_destroy(&msm_host->clk_mutex);
 	mutex_destroy(&msm_host->cmd_mutex);
 	mutex_destroy(&msm_host->dev_mutex);
+
+	pm_runtime_disable(&msm_host->pdev->dev);
 }
 
 int msm_dsi_host_modeset_init(struct mipi_dsi_host *host,
@@ -1881,7 +1884,8 @@ int msm_dsi_host_xfer_prepare(struct mipi_dsi_host *host,
 	 * mdss interrupt is generated in mdp core clock domain
 	 * mdp clock need to be enabled to receive dsi interrupt
 	 */
-	dsi_clk_ctrl(msm_host, 1);
+	pm_runtime_get_sync(&msm_host->pdev->dev);
+	dsi_link_clk_enable(msm_host);
 
 	/* TODO: vote for bus bandwidth */
 
@@ -1911,7 +1915,8 @@ void msm_dsi_host_xfer_restore(struct mipi_dsi_host *host,
 
 	/* TODO: unvote for bus bandwidth */
 
-	dsi_clk_ctrl(msm_host, 0);
+	dsi_link_clk_disable(msm_host);
+	pm_runtime_put_autosuspend(&msm_host->pdev->dev);
 }
 
 int msm_dsi_host_cmd_tx(struct mipi_dsi_host *host,
@@ -2160,8 +2165,11 @@ int msm_dsi_host_enable(struct mipi_dsi_host *host)
 	 * and only turned on before MDP START.
 	 * This part of code should be enabled once mdp driver support it.
 	 */
-	/* if (msm_panel->mode == MSM_DSI_CMD_MODE)
-		dsi_clk_ctrl(msm_host, 0); */
+	/* if (msm_panel->mode == MSM_DSI_CMD_MODE) {
+	 *	dsi_link_clk_disable(msm_host);
+	 *	pm_runtime_put_autosuspend(&msm_host->pdev->dev);
+	 * }
+	 */
 
 	return 0;
 }
@@ -2217,9 +2225,11 @@ int msm_dsi_host_power_on(struct mipi_dsi_host *host,
 		goto unlock_ret;
 	}
 
-	ret = dsi_clk_ctrl(msm_host, 1);
+	pm_runtime_get_sync(&msm_host->pdev->dev);
+	ret = dsi_link_clk_enable(msm_host);
 	if (ret) {
-		pr_err("%s: failed to enable clocks. ret=%d\n", __func__, ret);
+		pr_err("%s: failed to enable link clocks. ret=%d\n",
+		       __func__, ret);
 		goto fail_disable_reg;
 	}
 
@@ -2243,7 +2253,8 @@ int msm_dsi_host_power_on(struct mipi_dsi_host *host,
 	return 0;
 
 fail_disable_clk:
-	dsi_clk_ctrl(msm_host, 0);
+	dsi_link_clk_disable(msm_host);
+	pm_runtime_put_autosuspend(&msm_host->pdev->dev);
 fail_disable_reg:
 	dsi_host_regulator_disable(msm_host);
 unlock_ret:
@@ -2268,7 +2279,8 @@ int msm_dsi_host_power_off(struct mipi_dsi_host *host)
 
 	pinctrl_pm_select_sleep_state(&msm_host->pdev->dev);
 
-	dsi_clk_ctrl(msm_host, 0);
+	dsi_link_clk_disable(msm_host);
+	pm_runtime_put_autosuspend(&msm_host->pdev->dev);
 
 	dsi_host_regulator_disable(msm_host);
 
diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c b/drivers/gpu/drm/msm/dsi/dsi_manager.c
index a879ffa..8552481 100644
--- a/drivers/gpu/drm/msm/dsi/dsi_manager.c
+++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c
@@ -626,7 +626,6 @@ static void dsi_mgr_bridge_mode_set(struct drm_bridge *bridge,
 }
 
 static const struct drm_connector_funcs dsi_mgr_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = dsi_mgr_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = dsi_mgr_connector_destroy,
diff --git a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
index 0c2eb9c9..7c9bf91 100644
--- a/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
+++ b/drivers/gpu/drm/msm/dsi/phy/dsi_phy.c
@@ -373,7 +373,7 @@ static int dsi_phy_enable_resource(struct msm_dsi_phy *phy)
 static void dsi_phy_disable_resource(struct msm_dsi_phy *phy)
 {
 	clk_disable_unprepare(phy->ahb_clk);
-	pm_runtime_put_sync(&phy->pdev->dev);
+	pm_runtime_put_autosuspend(&phy->pdev->dev);
 }
 
 static const struct of_device_id dsi_phy_dt_match[] = {
diff --git a/drivers/gpu/drm/msm/edp/edp_connector.c b/drivers/gpu/drm/msm/edp/edp_connector.c
index 5960628..6f3fc6b 100644
--- a/drivers/gpu/drm/msm/edp/edp_connector.c
+++ b/drivers/gpu/drm/msm/edp/edp_connector.c
@@ -92,7 +92,6 @@ static int edp_connector_mode_valid(struct drm_connector *connector,
 }
 
 static const struct drm_connector_funcs edp_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = edp_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = edp_connector_destroy,
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi.c b/drivers/gpu/drm/msm/hdmi/hdmi.c
index a968cad..17e069a 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi.c
@@ -239,6 +239,8 @@ static struct hdmi *msm_hdmi_init(struct platform_device *pdev)
 		hdmi->pwr_clks[i] = clk;
 	}
 
+	pm_runtime_enable(&pdev->dev);
+
 	hdmi->workq = alloc_ordered_workqueue("msm_hdmi", 0);
 
 	hdmi->i2c = msm_hdmi_i2c_init(hdmi);
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
index ae40e71..7e357077 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_bridge.c
@@ -35,6 +35,8 @@ static void msm_hdmi_power_on(struct drm_bridge *bridge)
 	const struct hdmi_platform_config *config = hdmi->config;
 	int i, ret;
 
+	pm_runtime_get_sync(&hdmi->pdev->dev);
+
 	for (i = 0; i < config->pwr_reg_cnt; i++) {
 		ret = regulator_enable(hdmi->pwr_regs[i]);
 		if (ret) {
@@ -84,6 +86,8 @@ static void power_off(struct drm_bridge *bridge)
 					config->pwr_reg_names[i], ret);
 		}
 	}
+
+	pm_runtime_put_autosuspend(&hdmi->pdev->dev);
 }
 
 #define AVI_IFRAME_LINE_NUMBER 1
@@ -97,7 +101,7 @@ static void msm_hdmi_config_avi_infoframe(struct hdmi *hdmi)
 	u32 val;
 	int len;
 
-	drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode);
+	drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
 
 	len = hdmi_infoframe_pack(&frame, buffer, sizeof(buffer));
 	if (len < 0) {
diff --git a/drivers/gpu/drm/msm/hdmi/hdmi_connector.c b/drivers/gpu/drm/msm/hdmi/hdmi_connector.c
index a2515b4..c0848df 100644
--- a/drivers/gpu/drm/msm/hdmi/hdmi_connector.c
+++ b/drivers/gpu/drm/msm/hdmi/hdmi_connector.c
@@ -137,6 +137,36 @@ static int gpio_config(struct hdmi *hdmi, bool on)
 	return ret;
 }
 
+static void enable_hpd_clocks(struct hdmi *hdmi, bool enable)
+{
+	const struct hdmi_platform_config *config = hdmi->config;
+	struct device *dev = &hdmi->pdev->dev;
+	int i, ret;
+
+	if (enable) {
+		for (i = 0; i < config->hpd_clk_cnt; i++) {
+			if (config->hpd_freq && config->hpd_freq[i]) {
+				ret = clk_set_rate(hdmi->hpd_clks[i],
+						   config->hpd_freq[i]);
+				if (ret)
+					dev_warn(dev,
+						 "failed to set clk %s (%d)\n",
+						 config->hpd_clk_names[i], ret);
+			}
+
+			ret = clk_prepare_enable(hdmi->hpd_clks[i]);
+			if (ret) {
+				dev_err(dev,
+					"failed to enable hpd clk: %s (%d)\n",
+					config->hpd_clk_names[i], ret);
+			}
+		}
+	} else {
+		for (i = config->hpd_clk_cnt - 1; i >= 0; i--)
+			clk_disable_unprepare(hdmi->hpd_clks[i]);
+	}
+}
+
 static int hpd_enable(struct hdmi_connector *hdmi_connector)
 {
 	struct hdmi *hdmi = hdmi_connector->hdmi;
@@ -167,22 +197,8 @@ static int hpd_enable(struct hdmi_connector *hdmi_connector)
 		goto fail;
 	}
 
-	for (i = 0; i < config->hpd_clk_cnt; i++) {
-		if (config->hpd_freq && config->hpd_freq[i]) {
-			ret = clk_set_rate(hdmi->hpd_clks[i],
-					config->hpd_freq[i]);
-			if (ret)
-				dev_warn(dev, "failed to set clk %s (%d)\n",
-						config->hpd_clk_names[i], ret);
-		}
-
-		ret = clk_prepare_enable(hdmi->hpd_clks[i]);
-		if (ret) {
-			dev_err(dev, "failed to enable hpd clk: %s (%d)\n",
-					config->hpd_clk_names[i], ret);
-			goto fail;
-		}
-	}
+	pm_runtime_get_sync(dev);
+	enable_hpd_clocks(hdmi, true);
 
 	msm_hdmi_set_mode(hdmi, false);
 	msm_hdmi_phy_reset(hdmi);
@@ -225,8 +241,8 @@ static void hdp_disable(struct hdmi_connector *hdmi_connector)
 
 	msm_hdmi_set_mode(hdmi, false);
 
-	for (i = 0; i < config->hpd_clk_cnt; i++)
-		clk_disable_unprepare(hdmi->hpd_clks[i]);
+	enable_hpd_clocks(hdmi, false);
+	pm_runtime_put_autosuspend(dev);
 
 	ret = gpio_config(hdmi, false);
 	if (ret)
@@ -285,7 +301,16 @@ void msm_hdmi_connector_irq(struct drm_connector *connector)
 
 static enum drm_connector_status detect_reg(struct hdmi *hdmi)
 {
-	uint32_t hpd_int_status = hdmi_read(hdmi, REG_HDMI_HPD_INT_STATUS);
+	uint32_t hpd_int_status;
+
+	pm_runtime_get_sync(&hdmi->pdev->dev);
+	enable_hpd_clocks(hdmi, true);
+
+	hpd_int_status = hdmi_read(hdmi, REG_HDMI_HPD_INT_STATUS);
+
+	enable_hpd_clocks(hdmi, false);
+	pm_runtime_put_autosuspend(&hdmi->pdev->dev);
+
 	return (hpd_int_status & HDMI_HPD_INT_STATUS_CABLE_DETECTED) ?
 			connector_status_connected : connector_status_disconnected;
 }
@@ -407,7 +432,6 @@ static int msm_hdmi_connector_mode_valid(struct drm_connector *connector,
 }
 
 static const struct drm_connector_funcs hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = hdmi_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = hdmi_connector_destroy,
diff --git a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
index 615e1de..47fa2ab 100644
--- a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
+++ b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_crtc.c
@@ -279,7 +279,8 @@ static void mdp4_crtc_mode_set_nofb(struct drm_crtc *crtc)
 	}
 }
 
-static void mdp4_crtc_disable(struct drm_crtc *crtc)
+static void mdp4_crtc_atomic_disable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	struct mdp4_crtc *mdp4_crtc = to_mdp4_crtc(crtc);
 	struct mdp4_kms *mdp4_kms = get_kms(crtc);
@@ -295,7 +296,8 @@ static void mdp4_crtc_disable(struct drm_crtc *crtc)
 	mdp4_crtc->enabled = false;
 }
 
-static void mdp4_crtc_enable(struct drm_crtc *crtc)
+static void mdp4_crtc_atomic_enable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct mdp4_crtc *mdp4_crtc = to_mdp4_crtc(crtc);
 	struct mdp4_kms *mdp4_kms = get_kms(crtc);
@@ -482,7 +484,6 @@ static const struct drm_crtc_funcs mdp4_crtc_funcs = {
 	.set_config = drm_atomic_helper_set_config,
 	.destroy = mdp4_crtc_destroy,
 	.page_flip = drm_atomic_helper_page_flip,
-	.set_property = drm_atomic_helper_crtc_set_property,
 	.cursor_set = mdp4_crtc_cursor_set,
 	.cursor_move = mdp4_crtc_cursor_move,
 	.reset = drm_atomic_helper_crtc_reset,
@@ -492,11 +493,11 @@ static const struct drm_crtc_funcs mdp4_crtc_funcs = {
 
 static const struct drm_crtc_helper_funcs mdp4_crtc_helper_funcs = {
 	.mode_set_nofb = mdp4_crtc_mode_set_nofb,
-	.disable = mdp4_crtc_disable,
-	.enable = mdp4_crtc_enable,
 	.atomic_check = mdp4_crtc_atomic_check,
 	.atomic_begin = mdp4_crtc_atomic_begin,
 	.atomic_flush = mdp4_crtc_atomic_flush,
+	.atomic_enable = mdp4_crtc_atomic_enable,
+	.atomic_disable = mdp4_crtc_atomic_disable,
 };
 
 static void mdp4_crtc_vblank_irq(struct mdp_irq *irq, uint32_t irqstatus)
diff --git a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_kms.c b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_kms.c
index bcd1f5c..f7f0874 100644
--- a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_kms.c
+++ b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_kms.c
@@ -114,7 +114,7 @@ static void mdp4_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
 	mdp4_enable(mdp4_kms);
 
 	/* see 119ecb7fd */
-	for_each_crtc_in_state(state, crtc, crtc_state, i)
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i)
 		drm_crtc_vblank_get(crtc);
 }
 
@@ -126,7 +126,7 @@ static void mdp4_complete_commit(struct msm_kms *kms, struct drm_atomic_state *s
 	struct drm_crtc_state *crtc_state;
 
 	/* see 119ecb7fd */
-	for_each_crtc_in_state(state, crtc, crtc_state, i)
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i)
 		drm_crtc_vblank_put(crtc);
 
 	mdp4_disable(mdp4_kms);
diff --git a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_connector.c b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_connector.c
index 353429b..e3b1c86 100644
--- a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_connector.c
+++ b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_lvds_connector.c
@@ -91,7 +91,6 @@ static int mdp4_lvds_connector_mode_valid(struct drm_connector *connector,
 }
 
 static const struct drm_connector_funcs mdp4_lvds_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = mdp4_lvds_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = mdp4_lvds_connector_destroy,
diff --git a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_plane.c b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_plane.c
index a20e3d6..7a1ad3a 100644
--- a/drivers/gpu/drm/msm/mdp/mdp4/mdp4_plane.c
+++ b/drivers/gpu/drm/msm/mdp/mdp4/mdp4_plane.c
@@ -401,7 +401,7 @@ struct drm_plane *mdp4_plane_init(struct drm_device *dev,
 	type = private_plane ? DRM_PLANE_TYPE_PRIMARY : DRM_PLANE_TYPE_OVERLAY;
 	ret = drm_universal_plane_init(dev, plane, 0xff, &mdp4_plane_funcs,
 				 mdp4_plane->formats, mdp4_plane->nformats,
-				 type, NULL);
+				 NULL, type, NULL);
 	if (ret)
 		goto fail;
 
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_cmd_encoder.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_cmd_encoder.c
index aa7402e..60790df9 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_cmd_encoder.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_cmd_encoder.c
@@ -192,6 +192,7 @@ int mdp5_cmd_encoder_set_split_display(struct drm_encoder *encoder,
 {
 	struct mdp5_encoder *mdp5_cmd_enc = to_mdp5_encoder(encoder);
 	struct mdp5_kms *mdp5_kms;
+	struct device *dev;
 	int intf_num;
 	u32 data = 0;
 
@@ -214,14 +215,16 @@ int mdp5_cmd_encoder_set_split_display(struct drm_encoder *encoder,
 	/* Smart Panel, Sync mode */
 	data |= MDP5_SPLIT_DPL_UPPER_SMART_PANEL;
 
+	dev = &mdp5_kms->pdev->dev;
+
 	/* Make sure clocks are on when connectors calling this function. */
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
 	mdp5_write(mdp5_kms, REG_MDP5_SPLIT_DPL_UPPER, data);
 
 	mdp5_write(mdp5_kms, REG_MDP5_SPLIT_DPL_LOWER,
 		   MDP5_SPLIT_DPL_LOWER_SMART_PANEL);
 	mdp5_write(mdp5_kms, REG_MDP5_SPLIT_DPL_EN, 1);
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
index 735a87a..6fcb58a 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_crtc.c
@@ -409,11 +409,13 @@ static void mdp5_crtc_mode_set_nofb(struct drm_crtc *crtc)
 	spin_unlock_irqrestore(&mdp5_crtc->lm_lock, flags);
 }
 
-static void mdp5_crtc_disable(struct drm_crtc *crtc)
+static void mdp5_crtc_atomic_disable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	struct mdp5_crtc *mdp5_crtc = to_mdp5_crtc(crtc);
 	struct mdp5_crtc_state *mdp5_cstate = to_mdp5_crtc_state(crtc->state);
 	struct mdp5_kms *mdp5_kms = get_kms(crtc);
+	struct device *dev = &mdp5_kms->pdev->dev;
 
 	DBG("%s", crtc->name);
 
@@ -424,23 +426,28 @@ static void mdp5_crtc_disable(struct drm_crtc *crtc)
 		mdp_irq_unregister(&mdp5_kms->base, &mdp5_crtc->pp_done);
 
 	mdp_irq_unregister(&mdp5_kms->base, &mdp5_crtc->err);
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 
 	mdp5_crtc->enabled = false;
 }
 
-static void mdp5_crtc_enable(struct drm_crtc *crtc)
+static void mdp5_crtc_atomic_enable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct mdp5_crtc *mdp5_crtc = to_mdp5_crtc(crtc);
 	struct mdp5_crtc_state *mdp5_cstate = to_mdp5_crtc_state(crtc->state);
 	struct mdp5_kms *mdp5_kms = get_kms(crtc);
+	struct device *dev = &mdp5_kms->pdev->dev;
 
 	DBG("%s", crtc->name);
 
 	if (WARN_ON(mdp5_crtc->enabled))
 		return;
 
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
+
+	mdp5_crtc_mode_set_nofb(crtc);
+
 	mdp_irq_register(&mdp5_kms->base, &mdp5_crtc->err);
 
 	if (mdp5_cstate->cmd_mode)
@@ -531,7 +538,7 @@ static bool is_fullscreen(struct drm_crtc_state *cstate,
 		((pstate->crtc_y + pstate->crtc_h) >= cstate->mode.vdisplay);
 }
 
-enum mdp_mixer_stage_id get_start_stage(struct drm_crtc *crtc,
+static enum mdp_mixer_stage_id get_start_stage(struct drm_crtc *crtc,
 					struct drm_crtc_state *new_crtc_state,
 					struct drm_plane_state *bpstate)
 {
@@ -725,6 +732,7 @@ static int mdp5_crtc_cursor_set(struct drm_crtc *crtc,
 	struct mdp5_pipeline *pipeline = &mdp5_cstate->pipeline;
 	struct drm_device *dev = crtc->dev;
 	struct mdp5_kms *mdp5_kms = get_kms(crtc);
+	struct platform_device *pdev = mdp5_kms->pdev;
 	struct msm_kms *kms = &mdp5_kms->base.base;
 	struct drm_gem_object *cursor_bo, *old_bo = NULL;
 	uint32_t blendcfg, stride;
@@ -753,7 +761,7 @@ static int mdp5_crtc_cursor_set(struct drm_crtc *crtc,
 	if (!handle) {
 		DBG("Cursor off");
 		cursor_enable = false;
-		mdp5_enable(mdp5_kms);
+		pm_runtime_get_sync(&pdev->dev);
 		goto set_cursor;
 	}
 
@@ -768,6 +776,8 @@ static int mdp5_crtc_cursor_set(struct drm_crtc *crtc,
 	lm = mdp5_cstate->pipeline.mixer->lm;
 	stride = width * drm_format_plane_cpp(DRM_FORMAT_ARGB8888, 0);
 
+	pm_runtime_get_sync(&pdev->dev);
+
 	spin_lock_irqsave(&mdp5_crtc->cursor.lock, flags);
 	old_bo = mdp5_crtc->cursor.scanout_bo;
 
@@ -777,8 +787,6 @@ static int mdp5_crtc_cursor_set(struct drm_crtc *crtc,
 
 	get_roi(crtc, &roi_w, &roi_h);
 
-	mdp5_enable(mdp5_kms);
-
 	mdp5_write(mdp5_kms, REG_MDP5_LM_CURSOR_STRIDE(lm), stride);
 	mdp5_write(mdp5_kms, REG_MDP5_LM_CURSOR_FORMAT(lm),
 			MDP5_LM_CURSOR_FORMAT_FORMAT(CURSOR_FMT_ARGB8888));
@@ -796,6 +804,8 @@ static int mdp5_crtc_cursor_set(struct drm_crtc *crtc,
 
 	spin_unlock_irqrestore(&mdp5_crtc->cursor.lock, flags);
 
+	pm_runtime_put_autosuspend(&pdev->dev);
+
 set_cursor:
 	ret = mdp5_ctl_set_cursor(ctl, pipeline, 0, cursor_enable);
 	if (ret) {
@@ -807,7 +817,7 @@ static int mdp5_crtc_cursor_set(struct drm_crtc *crtc,
 	crtc_flush(crtc, flush_mask);
 
 end:
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(&pdev->dev);
 	if (old_bo) {
 		drm_flip_work_queue(&mdp5_crtc->unref_cursor_work, old_bo);
 		/* enable vblank to complete cursor work: */
@@ -840,7 +850,7 @@ static int mdp5_crtc_cursor_move(struct drm_crtc *crtc, int x, int y)
 
 	get_roi(crtc, &roi_w, &roi_h);
 
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(&mdp5_kms->pdev->dev);
 
 	spin_lock_irqsave(&mdp5_crtc->cursor.lock, flags);
 	mdp5_write(mdp5_kms, REG_MDP5_LM_CURSOR_SIZE(lm),
@@ -853,7 +863,7 @@ static int mdp5_crtc_cursor_move(struct drm_crtc *crtc, int x, int y)
 
 	crtc_flush(crtc, flush_mask);
 
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(&mdp5_kms->pdev->dev);
 
 	return 0;
 }
@@ -925,7 +935,6 @@ static const struct drm_crtc_funcs mdp5_crtc_funcs = {
 	.set_config = drm_atomic_helper_set_config,
 	.destroy = mdp5_crtc_destroy,
 	.page_flip = drm_atomic_helper_page_flip,
-	.set_property = drm_atomic_helper_crtc_set_property,
 	.reset = mdp5_crtc_reset,
 	.atomic_duplicate_state = mdp5_crtc_duplicate_state,
 	.atomic_destroy_state = mdp5_crtc_destroy_state,
@@ -938,7 +947,6 @@ static const struct drm_crtc_funcs mdp5_crtc_no_lm_cursor_funcs = {
 	.set_config = drm_atomic_helper_set_config,
 	.destroy = mdp5_crtc_destroy,
 	.page_flip = drm_atomic_helper_page_flip,
-	.set_property = drm_atomic_helper_crtc_set_property,
 	.reset = mdp5_crtc_reset,
 	.atomic_duplicate_state = mdp5_crtc_duplicate_state,
 	.atomic_destroy_state = mdp5_crtc_destroy_state,
@@ -947,11 +955,11 @@ static const struct drm_crtc_funcs mdp5_crtc_no_lm_cursor_funcs = {
 
 static const struct drm_crtc_helper_funcs mdp5_crtc_helper_funcs = {
 	.mode_set_nofb = mdp5_crtc_mode_set_nofb,
-	.disable = mdp5_crtc_disable,
-	.enable = mdp5_crtc_enable,
 	.atomic_check = mdp5_crtc_atomic_check,
 	.atomic_begin = mdp5_crtc_atomic_begin,
 	.atomic_flush = mdp5_crtc_atomic_flush,
+	.atomic_enable = mdp5_crtc_atomic_enable,
+	.atomic_disable = mdp5_crtc_atomic_disable,
 };
 
 static void mdp5_crtc_vblank_irq(struct mdp_irq *irq, uint32_t irqstatus)
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_encoder.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_encoder.c
index 70bef51..5b85138 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_encoder.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_encoder.c
@@ -297,6 +297,10 @@ static void mdp5_encoder_enable(struct drm_encoder *encoder)
 {
 	struct mdp5_encoder *mdp5_encoder = to_mdp5_encoder(encoder);
 	struct mdp5_interface *intf = mdp5_encoder->intf;
+	/* this isn't right I think */
+	struct drm_crtc_state *cstate = encoder->crtc->state;
+
+	mdp5_encoder_mode_set(encoder, &cstate->mode, &cstate->adjusted_mode);
 
 	if (intf->mode == MDP5_INTF_DSI_MODE_COMMAND)
 		mdp5_cmd_encoder_enable(encoder);
@@ -320,7 +324,6 @@ static int mdp5_encoder_atomic_check(struct drm_encoder *encoder,
 }
 
 static const struct drm_encoder_helper_funcs mdp5_encoder_helper_funcs = {
-	.mode_set = mdp5_encoder_mode_set,
 	.disable = mdp5_encoder_disable,
 	.enable = mdp5_encoder_enable,
 	.atomic_check = mdp5_encoder_atomic_check,
@@ -350,6 +353,7 @@ int mdp5_vid_encoder_set_split_display(struct drm_encoder *encoder,
 	struct mdp5_encoder *mdp5_encoder = to_mdp5_encoder(encoder);
 	struct mdp5_encoder *mdp5_slave_enc = to_mdp5_encoder(slave_encoder);
 	struct mdp5_kms *mdp5_kms;
+	struct device *dev;
 	int intf_num;
 	u32 data = 0;
 
@@ -369,8 +373,10 @@ int mdp5_vid_encoder_set_split_display(struct drm_encoder *encoder,
 	else
 		return -EINVAL;
 
+	dev = &mdp5_kms->pdev->dev;
 	/* Make sure clocks are on when connectors calling this function. */
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
+
 	/* Dumb Panel, Sync mode */
 	mdp5_write(mdp5_kms, REG_MDP5_SPLIT_DPL_UPPER, 0);
 	mdp5_write(mdp5_kms, REG_MDP5_SPLIT_DPL_LOWER, data);
@@ -378,7 +384,7 @@ int mdp5_vid_encoder_set_split_display(struct drm_encoder *encoder,
 
 	mdp5_ctl_pair(mdp5_encoder->ctl, mdp5_slave_enc->ctl, true);
 
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_irq.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_irq.c
index 3ce8b9d..bb5deb00 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_irq.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_irq.c
@@ -49,16 +49,19 @@ static void mdp5_irq_error_handler(struct mdp_irq *irq, uint32_t irqstatus)
 void mdp5_irq_preinstall(struct msm_kms *kms)
 {
 	struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
-	mdp5_enable(mdp5_kms);
+	struct device *dev = &mdp5_kms->pdev->dev;
+
+	pm_runtime_get_sync(dev);
 	mdp5_write(mdp5_kms, REG_MDP5_INTR_CLEAR, 0xffffffff);
 	mdp5_write(mdp5_kms, REG_MDP5_INTR_EN, 0x00000000);
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 }
 
 int mdp5_irq_postinstall(struct msm_kms *kms)
 {
 	struct mdp_kms *mdp_kms = to_mdp_kms(kms);
 	struct mdp5_kms *mdp5_kms = to_mdp5_kms(mdp_kms);
+	struct device *dev = &mdp5_kms->pdev->dev;
 	struct mdp_irq *error_handler = &mdp5_kms->error_handler;
 
 	error_handler->irq = mdp5_irq_error_handler;
@@ -67,9 +70,9 @@ int mdp5_irq_postinstall(struct msm_kms *kms)
 			MDP5_IRQ_INTF2_UNDER_RUN |
 			MDP5_IRQ_INTF3_UNDER_RUN;
 
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
 	mdp_irq_register(mdp_kms, error_handler);
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 
 	return 0;
 }
@@ -77,9 +80,11 @@ int mdp5_irq_postinstall(struct msm_kms *kms)
 void mdp5_irq_uninstall(struct msm_kms *kms)
 {
 	struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
-	mdp5_enable(mdp5_kms);
+	struct device *dev = &mdp5_kms->pdev->dev;
+
+	pm_runtime_get_sync(dev);
 	mdp5_write(mdp5_kms, REG_MDP5_INTR_EN, 0x00000000);
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 }
 
 irqreturn_t mdp5_irq(struct msm_kms *kms)
@@ -109,11 +114,12 @@ irqreturn_t mdp5_irq(struct msm_kms *kms)
 int mdp5_enable_vblank(struct msm_kms *kms, struct drm_crtc *crtc)
 {
 	struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
+	struct device *dev = &mdp5_kms->pdev->dev;
 
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
 	mdp_update_vblank_mask(to_mdp_kms(kms),
 			mdp5_crtc_vblank(crtc), true);
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 
 	return 0;
 }
@@ -121,9 +127,10 @@ int mdp5_enable_vblank(struct msm_kms *kms, struct drm_crtc *crtc)
 void mdp5_disable_vblank(struct msm_kms *kms, struct drm_crtc *crtc)
 {
 	struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
+	struct device *dev = &mdp5_kms->pdev->dev;
 
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
 	mdp_update_vblank_mask(to_mdp_kms(kms),
 			mdp5_crtc_vblank(crtc), false);
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 }
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
index 1c603ae..f7c0698 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.c
@@ -30,11 +30,10 @@ static const char *iommu_ports[] = {
 static int mdp5_hw_init(struct msm_kms *kms)
 {
 	struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
-	struct platform_device *pdev = mdp5_kms->pdev;
+	struct device *dev = &mdp5_kms->pdev->dev;
 	unsigned long flags;
 
-	pm_runtime_get_sync(&pdev->dev);
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
 
 	/* Magic unknown register writes:
 	 *
@@ -66,8 +65,7 @@ static int mdp5_hw_init(struct msm_kms *kms)
 
 	mdp5_ctlm_hw_reset(mdp5_kms->ctlm);
 
-	mdp5_disable(mdp5_kms);
-	pm_runtime_put_sync(&pdev->dev);
+	pm_runtime_put_sync(dev);
 
 	return 0;
 }
@@ -111,8 +109,9 @@ static void mdp5_swap_state(struct msm_kms *kms, struct drm_atomic_state *state)
 static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *state)
 {
 	struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
+	struct device *dev = &mdp5_kms->pdev->dev;
 
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
 
 	if (mdp5_kms->smp)
 		mdp5_smp_prepare_commit(mdp5_kms->smp, &mdp5_kms->state->smp);
@@ -121,11 +120,12 @@ static void mdp5_prepare_commit(struct msm_kms *kms, struct drm_atomic_state *st
 static void mdp5_complete_commit(struct msm_kms *kms, struct drm_atomic_state *state)
 {
 	struct mdp5_kms *mdp5_kms = to_mdp5_kms(to_mdp_kms(kms));
+	struct device *dev = &mdp5_kms->pdev->dev;
 
 	if (mdp5_kms->smp)
 		mdp5_smp_complete_commit(mdp5_kms->smp, &mdp5_kms->state->smp);
 
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 }
 
 static void mdp5_wait_for_crtc_commit_done(struct msm_kms *kms,
@@ -249,6 +249,9 @@ int mdp5_disable(struct mdp5_kms *mdp5_kms)
 {
 	DBG("");
 
+	mdp5_kms->enable_count--;
+	WARN_ON(mdp5_kms->enable_count < 0);
+
 	clk_disable_unprepare(mdp5_kms->ahb_clk);
 	clk_disable_unprepare(mdp5_kms->axi_clk);
 	clk_disable_unprepare(mdp5_kms->core_clk);
@@ -262,6 +265,8 @@ int mdp5_enable(struct mdp5_kms *mdp5_kms)
 {
 	DBG("");
 
+	mdp5_kms->enable_count++;
+
 	clk_prepare_enable(mdp5_kms->ahb_clk);
 	clk_prepare_enable(mdp5_kms->axi_clk);
 	clk_prepare_enable(mdp5_kms->core_clk);
@@ -486,11 +491,12 @@ static int modeset_init(struct mdp5_kms *mdp5_kms)
 static void read_mdp_hw_revision(struct mdp5_kms *mdp5_kms,
 				 u32 *major, u32 *minor)
 {
+	struct device *dev = &mdp5_kms->pdev->dev;
 	u32 version;
 
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(dev);
 	version = mdp5_read(mdp5_kms, REG_MDP5_HW_VERSION);
-	mdp5_disable(mdp5_kms);
+	pm_runtime_put_autosuspend(dev);
 
 	*major = FIELD(version, MDP5_HW_VERSION_MAJOR);
 	*minor = FIELD(version, MDP5_HW_VERSION_MINOR);
@@ -643,7 +649,7 @@ struct msm_kms *mdp5_kms_init(struct drm_device *dev)
 	 * have left things on, in which case we'll start getting faults if
 	 * we don't disable):
 	 */
-	mdp5_enable(mdp5_kms);
+	pm_runtime_get_sync(&pdev->dev);
 	for (i = 0; i < MDP5_INTF_NUM_MAX; i++) {
 		if (mdp5_cfg_intf_is_virtual(config->hw->intf.connect[i]) ||
 		    !config->hw->intf.base[i])
@@ -652,7 +658,6 @@ struct msm_kms *mdp5_kms_init(struct drm_device *dev)
 
 		mdp5_write(mdp5_kms, REG_MDP5_INTF_FRAME_LINE_COUNT_EN(i), 0x3);
 	}
-	mdp5_disable(mdp5_kms);
 	mdelay(16);
 
 	if (config->platform.iommu) {
@@ -678,6 +683,8 @@ struct msm_kms *mdp5_kms_init(struct drm_device *dev)
 		aspace = NULL;;
 	}
 
+	pm_runtime_put_autosuspend(&pdev->dev);
+
 	ret = modeset_init(mdp5_kms);
 	if (ret) {
 		dev_err(&pdev->dev, "modeset_init failed: %d\n", ret);
@@ -1005,6 +1012,30 @@ static int mdp5_dev_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static __maybe_unused int mdp5_runtime_suspend(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct mdp5_kms *mdp5_kms = platform_get_drvdata(pdev);
+
+	DBG("");
+
+	return mdp5_disable(mdp5_kms);
+}
+
+static __maybe_unused int mdp5_runtime_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct mdp5_kms *mdp5_kms = platform_get_drvdata(pdev);
+
+	DBG("");
+
+	return mdp5_enable(mdp5_kms);
+}
+
+static const struct dev_pm_ops mdp5_pm_ops = {
+	SET_RUNTIME_PM_OPS(mdp5_runtime_suspend, mdp5_runtime_resume, NULL)
+};
+
 static const struct of_device_id mdp5_dt_match[] = {
 	{ .compatible = "qcom,mdp5", },
 	/* to support downstream DT files */
@@ -1019,6 +1050,7 @@ static struct platform_driver mdp5_driver = {
 	.driver = {
 		.name = "msm_mdp",
 		.of_match_table = mdp5_dt_match,
+		.pm = &mdp5_pm_ops,
 	},
 };
 
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h
index 17caa0e..9b3fe01 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_kms.h
@@ -76,6 +76,8 @@ struct mdp5_kms {
 	bool rpm_enabled;
 
 	struct mdp_irq error_handler;
+
+	int enable_count;
 };
 #define to_mdp5_kms(x) container_of(x, struct mdp5_kms, base)
 
@@ -167,11 +169,13 @@ struct mdp5_encoder {
 
 static inline void mdp5_write(struct mdp5_kms *mdp5_kms, u32 reg, u32 data)
 {
+	WARN_ON(mdp5_kms->enable_count <= 0);
 	msm_writel(data, mdp5_kms->mmio + reg);
 }
 
 static inline u32 mdp5_read(struct mdp5_kms *mdp5_kms, u32 reg)
 {
+	WARN_ON(mdp5_kms->enable_count <= 0);
 	return msm_readl(mdp5_kms->mmio + reg);
 }
 
@@ -255,9 +259,6 @@ static inline uint32_t lm2ppdone(struct mdp5_hw_mixer *mixer)
 	return MDP5_IRQ_PING_PONG_0_DONE << mixer->pp;
 }
 
-int mdp5_disable(struct mdp5_kms *mdp5_kms);
-int mdp5_enable(struct mdp5_kms *mdp5_kms);
-
 void mdp5_set_irqmask(struct mdp_kms *mdp_kms, uint32_t irqmask,
 		uint32_t old_irqmask);
 void mdp5_irq_preinstall(struct msm_kms *kms);
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_mdss.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_mdss.c
index 9c34d78..f2a0db7 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_mdss.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_mdss.c
@@ -31,6 +31,10 @@ struct msm_mdss {
 
 	struct regulator *vdd;
 
+	struct clk *ahb_clk;
+	struct clk *axi_clk;
+	struct clk *vsync_clk;
+
 	struct {
 		volatile unsigned long enabled_mask;
 		struct irq_domain *domain;
@@ -140,6 +144,51 @@ static int mdss_irq_domain_init(struct msm_mdss *mdss)
 	return 0;
 }
 
+int msm_mdss_enable(struct msm_mdss *mdss)
+{
+	DBG("");
+
+	clk_prepare_enable(mdss->ahb_clk);
+	if (mdss->axi_clk)
+		clk_prepare_enable(mdss->axi_clk);
+	if (mdss->vsync_clk)
+		clk_prepare_enable(mdss->vsync_clk);
+
+	return 0;
+}
+
+int msm_mdss_disable(struct msm_mdss *mdss)
+{
+	DBG("");
+
+	if (mdss->vsync_clk)
+		clk_disable_unprepare(mdss->vsync_clk);
+	if (mdss->axi_clk)
+		clk_disable_unprepare(mdss->axi_clk);
+	clk_disable_unprepare(mdss->ahb_clk);
+
+	return 0;
+}
+
+static int msm_mdss_get_clocks(struct msm_mdss *mdss)
+{
+	struct platform_device *pdev = to_platform_device(mdss->dev->dev);
+
+	mdss->ahb_clk = msm_clk_get(pdev, "iface");
+	if (IS_ERR(mdss->ahb_clk))
+		mdss->ahb_clk = NULL;
+
+	mdss->axi_clk = msm_clk_get(pdev, "bus");
+	if (IS_ERR(mdss->axi_clk))
+		mdss->axi_clk = NULL;
+
+	mdss->vsync_clk = msm_clk_get(pdev, "vsync");
+	if (IS_ERR(mdss->vsync_clk))
+		mdss->vsync_clk = NULL;
+
+	return 0;
+}
+
 void msm_mdss_destroy(struct drm_device *dev)
 {
 	struct msm_drm_private *priv = dev->dev_private;
@@ -153,8 +202,6 @@ void msm_mdss_destroy(struct drm_device *dev)
 
 	regulator_disable(mdss->vdd);
 
-	pm_runtime_put_sync(dev->dev);
-
 	pm_runtime_disable(dev->dev);
 }
 
@@ -190,6 +237,12 @@ int msm_mdss_init(struct drm_device *dev)
 		goto fail;
 	}
 
+	ret = msm_mdss_get_clocks(mdss);
+	if (ret) {
+		dev_err(dev->dev, "failed to get clocks: %d\n", ret);
+		goto fail;
+	}
+
 	/* Regulator to enable GDSCs in downstream kernels */
 	mdss->vdd = devm_regulator_get(dev->dev, "vdd");
 	if (IS_ERR(mdss->vdd)) {
@@ -221,12 +274,6 @@ int msm_mdss_init(struct drm_device *dev)
 
 	pm_runtime_enable(dev->dev);
 
-	/*
-	 * TODO: This is needed as the MDSS GDSC is only tied to MDSS's power
-	 * domain. Remove this once runtime PM is adapted for all the devices.
-	 */
-	pm_runtime_get_sync(dev->dev);
-
 	return 0;
 fail_irq:
 	regulator_disable(mdss->vdd);
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
index 61f39c8..4b22ac3 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_plane.c
@@ -246,7 +246,6 @@ static const struct drm_plane_funcs mdp5_plane_funcs = {
 		.update_plane = drm_atomic_helper_update_plane,
 		.disable_plane = drm_atomic_helper_disable_plane,
 		.destroy = mdp5_plane_destroy,
-		.set_property = drm_atomic_helper_plane_set_property,
 		.atomic_set_property = mdp5_plane_atomic_set_property,
 		.atomic_get_property = mdp5_plane_atomic_get_property,
 		.reset = mdp5_plane_reset,
@@ -259,7 +258,6 @@ static const struct drm_plane_funcs mdp5_cursor_plane_funcs = {
 		.update_plane = mdp5_update_cursor_plane_legacy,
 		.disable_plane = drm_atomic_helper_disable_plane,
 		.destroy = mdp5_plane_destroy,
-		.set_property = drm_atomic_helper_plane_set_property,
 		.atomic_set_property = mdp5_plane_atomic_set_property,
 		.atomic_get_property = mdp5_plane_atomic_get_property,
 		.reset = mdp5_plane_reset,
@@ -1139,12 +1137,12 @@ struct drm_plane *mdp5_plane_init(struct drm_device *dev,
 		ret = drm_universal_plane_init(dev, plane, 0xff,
 				&mdp5_cursor_plane_funcs,
 				mdp5_plane->formats, mdp5_plane->nformats,
-				type, NULL);
+				NULL, type, NULL);
 	else
 		ret = drm_universal_plane_init(dev, plane, 0xff,
 				&mdp5_plane_funcs,
 				mdp5_plane->formats, mdp5_plane->nformats,
-				type, NULL);
+				NULL, type, NULL);
 	if (ret)
 		goto fail;
 
diff --git a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_smp.c b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_smp.c
index 58f712d..ae4983d 100644
--- a/drivers/gpu/drm/msm/mdp/mdp5/mdp5_smp.c
+++ b/drivers/gpu/drm/msm/mdp/mdp5/mdp5_smp.c
@@ -28,6 +28,13 @@ struct mdp5_smp {
 
 	int blk_cnt;
 	int blk_size;
+
+	/* register cache */
+	u32 alloc_w[22];
+	u32 alloc_r[22];
+	u32 pipe_reqprio_fifo_wm0[SSPP_MAX];
+	u32 pipe_reqprio_fifo_wm1[SSPP_MAX];
+	u32 pipe_reqprio_fifo_wm2[SSPP_MAX];
 };
 
 static inline
@@ -98,16 +105,15 @@ static int smp_request_block(struct mdp5_smp *smp,
 static void set_fifo_thresholds(struct mdp5_smp *smp,
 		enum mdp5_pipe pipe, int nblks)
 {
-	struct mdp5_kms *mdp5_kms = get_kms(smp);
 	u32 smp_entries_per_blk = smp->blk_size / (128 / BITS_PER_BYTE);
 	u32 val;
 
 	/* 1/4 of SMP pool that is being fetched */
 	val = (nblks * smp_entries_per_blk) / 4;
 
-	mdp5_write(mdp5_kms, REG_MDP5_PIPE_REQPRIO_FIFO_WM_0(pipe), val * 1);
-	mdp5_write(mdp5_kms, REG_MDP5_PIPE_REQPRIO_FIFO_WM_1(pipe), val * 2);
-	mdp5_write(mdp5_kms, REG_MDP5_PIPE_REQPRIO_FIFO_WM_2(pipe), val * 3);
+	smp->pipe_reqprio_fifo_wm0[pipe] = val * 1;
+	smp->pipe_reqprio_fifo_wm1[pipe] = val * 2;
+	smp->pipe_reqprio_fifo_wm2[pipe] = val * 3;
 }
 
 /*
@@ -222,7 +228,6 @@ void mdp5_smp_release(struct mdp5_smp *smp, struct mdp5_smp_state *state,
 static unsigned update_smp_state(struct mdp5_smp *smp,
 		u32 cid, mdp5_smp_state_t *assigned)
 {
-	struct mdp5_kms *mdp5_kms = get_kms(smp);
 	int cnt = smp->blk_cnt;
 	unsigned nblks = 0;
 	u32 blk, val;
@@ -231,7 +236,7 @@ static unsigned update_smp_state(struct mdp5_smp *smp,
 		int idx = blk / 3;
 		int fld = blk % 3;
 
-		val = mdp5_read(mdp5_kms, REG_MDP5_SMP_ALLOC_W_REG(idx));
+		val = smp->alloc_w[idx];
 
 		switch (fld) {
 		case 0:
@@ -248,8 +253,8 @@ static unsigned update_smp_state(struct mdp5_smp *smp,
 			break;
 		}
 
-		mdp5_write(mdp5_kms, REG_MDP5_SMP_ALLOC_W_REG(idx), val);
-		mdp5_write(mdp5_kms, REG_MDP5_SMP_ALLOC_R_REG(idx), val);
+		smp->alloc_w[idx] = val;
+		smp->alloc_r[idx] = val;
 
 		nblks++;
 	}
@@ -257,6 +262,39 @@ static unsigned update_smp_state(struct mdp5_smp *smp,
 	return nblks;
 }
 
+static void write_smp_alloc_regs(struct mdp5_smp *smp)
+{
+	struct mdp5_kms *mdp5_kms = get_kms(smp);
+	int i, num_regs;
+
+	num_regs = smp->blk_cnt / 3 + 1;
+
+	for (i = 0; i < num_regs; i++) {
+		mdp5_write(mdp5_kms, REG_MDP5_SMP_ALLOC_W_REG(i),
+			   smp->alloc_w[i]);
+		mdp5_write(mdp5_kms, REG_MDP5_SMP_ALLOC_R_REG(i),
+			   smp->alloc_r[i]);
+	}
+}
+
+static void write_smp_fifo_regs(struct mdp5_smp *smp)
+{
+	struct mdp5_kms *mdp5_kms = get_kms(smp);
+	int i;
+
+	for (i = 0; i < mdp5_kms->num_hwpipes; i++) {
+		struct mdp5_hw_pipe *hwpipe = mdp5_kms->hwpipes[i];
+		enum mdp5_pipe pipe = hwpipe->pipe;
+
+		mdp5_write(mdp5_kms, REG_MDP5_PIPE_REQPRIO_FIFO_WM_0(pipe),
+			   smp->pipe_reqprio_fifo_wm0[pipe]);
+		mdp5_write(mdp5_kms, REG_MDP5_PIPE_REQPRIO_FIFO_WM_1(pipe),
+			   smp->pipe_reqprio_fifo_wm1[pipe]);
+		mdp5_write(mdp5_kms, REG_MDP5_PIPE_REQPRIO_FIFO_WM_2(pipe),
+			   smp->pipe_reqprio_fifo_wm2[pipe]);
+	}
+}
+
 void mdp5_smp_prepare_commit(struct mdp5_smp *smp, struct mdp5_smp_state *state)
 {
 	enum mdp5_pipe pipe;
@@ -277,6 +315,9 @@ void mdp5_smp_prepare_commit(struct mdp5_smp *smp, struct mdp5_smp_state *state)
 		set_fifo_thresholds(smp, pipe, nblks);
 	}
 
+	write_smp_alloc_regs(smp);
+	write_smp_fifo_regs(smp);
+
 	state->assigned = 0;
 }
 
@@ -289,6 +330,8 @@ void mdp5_smp_complete_commit(struct mdp5_smp *smp, struct mdp5_smp_state *state
 		set_fifo_thresholds(smp, pipe, 0);
 	}
 
+	write_smp_fifo_regs(smp);
+
 	state->released = 0;
 }
 
diff --git a/drivers/gpu/drm/msm/msm_atomic.c b/drivers/gpu/drm/msm/msm_atomic.c
index 9633a68b..025d454 100644
--- a/drivers/gpu/drm/msm/msm_atomic.c
+++ b/drivers/gpu/drm/msm/msm_atomic.c
@@ -84,13 +84,13 @@ static void msm_atomic_wait_for_commit_done(struct drm_device *dev,
 		struct drm_atomic_state *old_state)
 {
 	struct drm_crtc *crtc;
-	struct drm_crtc_state *crtc_state;
+	struct drm_crtc_state *new_crtc_state;
 	struct msm_drm_private *priv = old_state->dev->dev_private;
 	struct msm_kms *kms = priv->kms;
 	int i;
 
-	for_each_crtc_in_state(old_state, crtc, crtc_state, i) {
-		if (!crtc->state->enable)
+	for_each_new_crtc_in_state(old_state, crtc, new_crtc_state, i) {
+		if (!new_crtc_state->active)
 			continue;
 
 		kms->funcs->wait_for_crtc_commit_done(kms, crtc);
@@ -195,7 +195,7 @@ int msm_atomic_commit(struct drm_device *dev,
 	struct drm_crtc *crtc;
 	struct drm_crtc_state *crtc_state;
 	struct drm_plane *plane;
-	struct drm_plane_state *plane_state;
+	struct drm_plane_state *old_plane_state, *new_plane_state;
 	int i, ret;
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);
@@ -211,19 +211,19 @@ int msm_atomic_commit(struct drm_device *dev,
 	/*
 	 * Figure out what crtcs we have:
 	 */
-	for_each_crtc_in_state(state, crtc, crtc_state, i)
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i)
 		c->crtc_mask |= drm_crtc_mask(crtc);
 
 	/*
 	 * Figure out what fence to wait for:
 	 */
-	for_each_plane_in_state(state, plane, plane_state, i) {
-		if ((plane->state->fb != plane_state->fb) && plane_state->fb) {
-			struct drm_gem_object *obj = msm_framebuffer_bo(plane_state->fb, 0);
+	for_each_oldnew_plane_in_state(state, plane, old_plane_state, new_plane_state, i) {
+		if ((new_plane_state->fb != old_plane_state->fb) && new_plane_state->fb) {
+			struct drm_gem_object *obj = msm_framebuffer_bo(new_plane_state->fb, 0);
 			struct msm_gem_object *msm_obj = to_msm_bo(obj);
 			struct dma_fence *fence = reservation_object_get_excl_rcu(msm_obj->resv);
 
-			drm_atomic_set_fence_for_plane(plane_state, fence);
+			drm_atomic_set_fence_for_plane(new_plane_state, fence);
 		}
 	}
 
@@ -232,20 +232,18 @@ int msm_atomic_commit(struct drm_device *dev,
 	 * mark our set of crtc's as busy:
 	 */
 	ret = start_atomic(dev->dev_private, c->crtc_mask);
-	if (ret) {
-		kfree(c);
-		goto error;
-	}
+	if (ret)
+		goto err_free;
+
+	BUG_ON(drm_atomic_helper_swap_state(state, false) < 0);
 
 	/*
 	 * This is the point of no return - everything below never fails except
 	 * when the hw goes bonghits. Which means we can commit the new state on
 	 * the software side now.
+	 *
+	 * swap driver private state while still holding state_lock
 	 */
-
-	drm_atomic_helper_swap_state(state, true);
-
-	/* swap driver private state while still holding state_lock */
 	if (to_kms_state(state)->state)
 		priv->kms->funcs->swap_state(priv->kms, state);
 
@@ -275,6 +273,8 @@ int msm_atomic_commit(struct drm_device *dev,
 
 	return 0;
 
+err_free:
+	kfree(c);
 error:
 	drm_atomic_helper_cleanup_planes(dev, state);
 	return ret;
diff --git a/drivers/gpu/drm/msm/msm_drv.c b/drivers/gpu/drm/msm/msm_drv.c
index f49f6ac..606df7b 100644
--- a/drivers/gpu/drm/msm/msm_drv.c
+++ b/drivers/gpu/drm/msm/msm_drv.c
@@ -73,6 +73,10 @@ bool dumpstate = false;
 MODULE_PARM_DESC(dumpstate, "Dump KMS state on errors");
 module_param(dumpstate, bool, 0600);
 
+static bool modeset = true;
+MODULE_PARM_DESC(modeset, "Use kernel modesetting [KMS] (1=on (default), 0=disable)");
+module_param(modeset, bool, 0600);
+
 /*
  * Util/helpers:
  */
@@ -832,7 +836,6 @@ static struct drm_driver msm_driver = {
 	.gem_vm_ops         = &vm_ops,
 	.dumb_create        = msm_gem_dumb_create,
 	.dumb_map_offset    = msm_gem_dumb_map_offset,
-	.dumb_destroy       = drm_gem_dumb_destroy,
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_export   = drm_gem_prime_export,
@@ -879,8 +882,37 @@ static int msm_pm_resume(struct device *dev)
 }
 #endif
 
+#ifdef CONFIG_PM
+static int msm_runtime_suspend(struct device *dev)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct msm_drm_private *priv = ddev->dev_private;
+
+	DBG("");
+
+	if (priv->mdss)
+		return msm_mdss_disable(priv->mdss);
+
+	return 0;
+}
+
+static int msm_runtime_resume(struct device *dev)
+{
+	struct drm_device *ddev = dev_get_drvdata(dev);
+	struct msm_drm_private *priv = ddev->dev_private;
+
+	DBG("");
+
+	if (priv->mdss)
+		return msm_mdss_enable(priv->mdss);
+
+	return 0;
+}
+#endif
+
 static const struct dev_pm_ops msm_pm_ops = {
 	SET_SYSTEM_SLEEP_PM_OPS(msm_pm_suspend, msm_pm_resume)
+	SET_RUNTIME_PM_OPS(msm_runtime_suspend, msm_runtime_resume, NULL)
 };
 
 /*
@@ -1104,6 +1136,9 @@ static struct platform_driver msm_platform_driver = {
 
 static int __init msm_drm_register(void)
 {
+	if (!modeset)
+		return -EINVAL;
+
 	DBG("init");
 	msm_mdp_register();
 	msm_dsi_register();
diff --git a/drivers/gpu/drm/msm/msm_drv.h b/drivers/gpu/drm/msm/msm_drv.h
index fc8d24f..5e8109c 100644
--- a/drivers/gpu/drm/msm/msm_drv.h
+++ b/drivers/gpu/drm/msm/msm_drv.h
@@ -55,8 +55,6 @@ struct msm_fence_cb;
 struct msm_gem_address_space;
 struct msm_gem_vma;
 
-#define NUM_DOMAINS 2    /* one for KMS, then one per gpu core (?) */
-
 struct msm_file_private {
 	/* currently we don't do anything useful with this.. but when
 	 * per-context address spaces are supported we'd keep track of
@@ -237,6 +235,12 @@ struct drm_gem_object *msm_gem_new(struct drm_device *dev,
 		uint32_t size, uint32_t flags);
 struct drm_gem_object *msm_gem_new_locked(struct drm_device *dev,
 		uint32_t size, uint32_t flags);
+void *msm_gem_kernel_new(struct drm_device *dev, uint32_t size,
+		uint32_t flags, struct msm_gem_address_space *aspace,
+		struct drm_gem_object **bo, uint64_t *iova);
+void *msm_gem_kernel_new_locked(struct drm_device *dev, uint32_t size,
+		uint32_t flags, struct msm_gem_address_space *aspace,
+		struct drm_gem_object **bo, uint64_t *iova);
 struct drm_gem_object *msm_gem_import(struct drm_device *dev,
 		struct dma_buf *dmabuf, struct sg_table *sgt);
 
@@ -248,10 +252,10 @@ uint32_t msm_framebuffer_iova(struct drm_framebuffer *fb,
 		struct msm_gem_address_space *aspace, int plane);
 struct drm_gem_object *msm_framebuffer_bo(struct drm_framebuffer *fb, int plane);
 const struct msm_format *msm_framebuffer_format(struct drm_framebuffer *fb);
-struct drm_framebuffer *msm_framebuffer_init(struct drm_device *dev,
-		const struct drm_mode_fb_cmd2 *mode_cmd, struct drm_gem_object **bos);
 struct drm_framebuffer *msm_framebuffer_create(struct drm_device *dev,
 		struct drm_file *file, const struct drm_mode_fb_cmd2 *mode_cmd);
+struct drm_framebuffer * msm_alloc_stolen_fb(struct drm_device *dev,
+		int w, int h, int p, uint32_t format);
 
 struct drm_fb_helper *msm_fbdev_init(struct drm_device *dev);
 void msm_fbdev_free(struct drm_device *dev);
diff --git a/drivers/gpu/drm/msm/msm_fb.c b/drivers/gpu/drm/msm/msm_fb.c
index 6ecb7b1..fc175e7 100644
--- a/drivers/gpu/drm/msm/msm_fb.c
+++ b/drivers/gpu/drm/msm/msm_fb.c
@@ -20,6 +20,7 @@
 
 #include "msm_drv.h"
 #include "msm_kms.h"
+#include "msm_gem.h"
 
 struct msm_framebuffer {
 	struct drm_framebuffer base;
@@ -28,6 +29,8 @@ struct msm_framebuffer {
 };
 #define to_msm_framebuffer(x) container_of(x, struct msm_framebuffer, base)
 
+static struct drm_framebuffer *msm_framebuffer_init(struct drm_device *dev,
+		const struct drm_mode_fb_cmd2 *mode_cmd, struct drm_gem_object **bos);
 
 static int msm_framebuffer_create_handle(struct drm_framebuffer *fb,
 		struct drm_file *file_priv,
@@ -161,7 +164,7 @@ struct drm_framebuffer *msm_framebuffer_create(struct drm_device *dev,
 	return ERR_PTR(ret);
 }
 
-struct drm_framebuffer *msm_framebuffer_init(struct drm_device *dev,
+static struct drm_framebuffer *msm_framebuffer_init(struct drm_device *dev,
 		const struct drm_mode_fb_cmd2 *mode_cmd, struct drm_gem_object **bos)
 {
 	struct msm_drm_private *priv = dev->dev_private;
@@ -237,3 +240,43 @@ struct drm_framebuffer *msm_framebuffer_init(struct drm_device *dev,
 
 	return ERR_PTR(ret);
 }
+
+struct drm_framebuffer *
+msm_alloc_stolen_fb(struct drm_device *dev, int w, int h, int p, uint32_t format)
+{
+	struct drm_mode_fb_cmd2 mode_cmd = {
+		.pixel_format = format,
+		.width = w,
+		.height = h,
+		.pitches = { p },
+	};
+	struct drm_gem_object *bo;
+	struct drm_framebuffer *fb;
+	int size;
+
+	/* allocate backing bo */
+	size = mode_cmd.pitches[0] * mode_cmd.height;
+	DBG("allocating %d bytes for fb %d", size, dev->primary->index);
+	bo = msm_gem_new(dev, size, MSM_BO_SCANOUT | MSM_BO_WC | MSM_BO_STOLEN);
+	if (IS_ERR(bo)) {
+		dev_warn(dev->dev, "could not allocate stolen bo\n");
+		/* try regular bo: */
+		bo = msm_gem_new(dev, size, MSM_BO_SCANOUT | MSM_BO_WC);
+	}
+	if (IS_ERR(bo)) {
+		dev_err(dev->dev, "failed to allocate buffer object\n");
+		return ERR_CAST(bo);
+	}
+
+	fb = msm_framebuffer_init(dev, &mode_cmd, &bo);
+	if (IS_ERR(fb)) {
+		dev_err(dev->dev, "failed to allocate fb\n");
+		/* note: if fb creation failed, we can't rely on fb destroy
+		 * to unref the bo:
+		 */
+		drm_gem_object_unreference_unlocked(bo);
+		return ERR_CAST(fb);
+	}
+
+	return fb;
+}
diff --git a/drivers/gpu/drm/msm/msm_fbdev.c b/drivers/gpu/drm/msm/msm_fbdev.c
index 5ecf4ff..c178563 100644
--- a/drivers/gpu/drm/msm/msm_fbdev.c
+++ b/drivers/gpu/drm/msm/msm_fbdev.c
@@ -19,7 +19,6 @@
 #include <drm/drm_fb_helper.h>
 
 #include "msm_drv.h"
-#include "msm_gem.h"
 #include "msm_kms.h"
 
 extern int msm_gem_mmap_obj(struct drm_gem_object *obj,
@@ -35,7 +34,6 @@ static int msm_fbdev_mmap(struct fb_info *info, struct vm_area_struct *vma);
 struct msm_fbdev {
 	struct drm_fb_helper base;
 	struct drm_framebuffer *fb;
-	struct drm_gem_object *bo;
 };
 
 static struct fb_ops msm_fb_ops = {
@@ -57,16 +55,16 @@ static int msm_fbdev_mmap(struct fb_info *info, struct vm_area_struct *vma)
 {
 	struct drm_fb_helper *helper = (struct drm_fb_helper *)info->par;
 	struct msm_fbdev *fbdev = to_msm_fbdev(helper);
-	struct drm_gem_object *drm_obj = fbdev->bo;
+	struct drm_gem_object *bo = msm_framebuffer_bo(fbdev->fb, 0);
 	int ret = 0;
 
-	ret = drm_gem_mmap_obj(drm_obj, drm_obj->size, vma);
+	ret = drm_gem_mmap_obj(bo, bo->size, vma);
 	if (ret) {
 		pr_err("%s:drm_gem_mmap_obj fail\n", __func__);
 		return ret;
 	}
 
-	return msm_gem_mmap_obj(drm_obj, vma);
+	return msm_gem_mmap_obj(bo, vma);
 }
 
 static int msm_fbdev_create(struct drm_fb_helper *helper,
@@ -76,47 +74,30 @@ static int msm_fbdev_create(struct drm_fb_helper *helper,
 	struct drm_device *dev = helper->dev;
 	struct msm_drm_private *priv = dev->dev_private;
 	struct drm_framebuffer *fb = NULL;
+	struct drm_gem_object *bo;
 	struct fb_info *fbi = NULL;
-	struct drm_mode_fb_cmd2 mode_cmd = {0};
 	uint64_t paddr;
-	int ret, size;
+	uint32_t format;
+	int ret, pitch;
+
+	format = drm_mode_legacy_fb_format(sizes->surface_bpp, sizes->surface_depth);
 
 	DBG("create fbdev: %dx%d@%d (%dx%d)", sizes->surface_width,
 			sizes->surface_height, sizes->surface_bpp,
 			sizes->fb_width, sizes->fb_height);
 
-	mode_cmd.pixel_format = drm_mode_legacy_fb_format(sizes->surface_bpp,
-			sizes->surface_depth);
+	pitch = align_pitch(sizes->surface_width, sizes->surface_bpp);
+	fb = msm_alloc_stolen_fb(dev, sizes->surface_width,
+			sizes->surface_height, pitch, format);
 
-	mode_cmd.width = sizes->surface_width;
-	mode_cmd.height = sizes->surface_height;
-
-	mode_cmd.pitches[0] = align_pitch(
-			mode_cmd.width, sizes->surface_bpp);
-
-	/* allocate backing bo */
-	size = mode_cmd.pitches[0] * mode_cmd.height;
-	DBG("allocating %d bytes for fb %d", size, dev->primary->index);
-	fbdev->bo = msm_gem_new(dev, size, MSM_BO_SCANOUT |
-			MSM_BO_WC | MSM_BO_STOLEN);
-	if (IS_ERR(fbdev->bo)) {
-		ret = PTR_ERR(fbdev->bo);
-		fbdev->bo = NULL;
-		dev_err(dev->dev, "failed to allocate buffer object: %d\n", ret);
-		goto fail;
-	}
-
-	fb = msm_framebuffer_init(dev, &mode_cmd, &fbdev->bo);
 	if (IS_ERR(fb)) {
 		dev_err(dev->dev, "failed to allocate fb\n");
-		/* note: if fb creation failed, we can't rely on fb destroy
-		 * to unref the bo:
-		 */
-		drm_gem_object_unreference_unlocked(fbdev->bo);
 		ret = PTR_ERR(fb);
 		goto fail;
 	}
 
+	bo = msm_framebuffer_bo(fb, 0);
+
 	mutex_lock(&dev->struct_mutex);
 
 	/*
@@ -124,7 +105,7 @@ static int msm_fbdev_create(struct drm_fb_helper *helper,
 	 * in panic (ie. lock-safe, etc) we could avoid pinning the
 	 * buffer now:
 	 */
-	ret = msm_gem_get_iova(fbdev->bo, priv->kms->aspace, &paddr);
+	ret = msm_gem_get_iova(bo, priv->kms->aspace, &paddr);
 	if (ret) {
 		dev_err(dev->dev, "failed to get buffer obj iova: %d\n", ret);
 		goto fail_unlock;
@@ -143,7 +124,6 @@ static int msm_fbdev_create(struct drm_fb_helper *helper,
 	helper->fb = fb;
 
 	fbi->par = helper;
-	fbi->flags = FBINFO_DEFAULT;
 	fbi->fbops = &msm_fb_ops;
 
 	strcpy(fbi->fix.id, "msm");
@@ -153,14 +133,14 @@ static int msm_fbdev_create(struct drm_fb_helper *helper,
 
 	dev->mode_config.fb_base = paddr;
 
-	fbi->screen_base = msm_gem_get_vaddr(fbdev->bo);
+	fbi->screen_base = msm_gem_get_vaddr(bo);
 	if (IS_ERR(fbi->screen_base)) {
 		ret = PTR_ERR(fbi->screen_base);
 		goto fail_unlock;
 	}
-	fbi->screen_size = fbdev->bo->size;
+	fbi->screen_size = bo->size;
 	fbi->fix.smem_start = paddr;
-	fbi->fix.smem_len = fbdev->bo->size;
+	fbi->fix.smem_len = bo->size;
 
 	DBG("par=%p, %dx%d", fbi->par, fbi->var.xres, fbi->var.yres);
 	DBG("allocated %dx%d fb", fbdev->fb->width, fbdev->fb->height);
@@ -242,7 +222,9 @@ void msm_fbdev_free(struct drm_device *dev)
 
 	/* this will free the backing object */
 	if (fbdev->fb) {
-		msm_gem_put_vaddr(fbdev->bo);
+		struct drm_gem_object *bo =
+			msm_framebuffer_bo(fbdev->fb, 0);
+		msm_gem_put_vaddr(bo);
 		drm_framebuffer_remove(fbdev->fb);
 	}
 
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c
index a0c60e7..f15821a0 100644
--- a/drivers/gpu/drm/msm/msm_gem.c
+++ b/drivers/gpu/drm/msm/msm_gem.c
@@ -1024,3 +1024,49 @@ struct drm_gem_object *msm_gem_import(struct drm_device *dev,
 	drm_gem_object_unreference_unlocked(obj);
 	return ERR_PTR(ret);
 }
+
+static void *_msm_gem_kernel_new(struct drm_device *dev, uint32_t size,
+		uint32_t flags, struct msm_gem_address_space *aspace,
+		struct drm_gem_object **bo, uint64_t *iova, bool locked)
+{
+	void *vaddr;
+	struct drm_gem_object *obj = _msm_gem_new(dev, size, flags, locked);
+	int ret;
+
+	if (IS_ERR(obj))
+		return ERR_CAST(obj);
+
+	if (iova) {
+		ret = msm_gem_get_iova(obj, aspace, iova);
+		if (ret) {
+			drm_gem_object_unreference(obj);
+			return ERR_PTR(ret);
+		}
+	}
+
+	vaddr = msm_gem_get_vaddr(obj);
+	if (!vaddr) {
+		msm_gem_put_iova(obj, aspace);
+		drm_gem_object_unreference(obj);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	if (bo)
+		*bo = obj;
+
+	return vaddr;
+}
+
+void *msm_gem_kernel_new(struct drm_device *dev, uint32_t size,
+		uint32_t flags, struct msm_gem_address_space *aspace,
+		struct drm_gem_object **bo, uint64_t *iova)
+{
+	return _msm_gem_kernel_new(dev, size, flags, aspace, bo, iova, false);
+}
+
+void *msm_gem_kernel_new_locked(struct drm_device *dev, uint32_t size,
+		uint32_t flags, struct msm_gem_address_space *aspace,
+		struct drm_gem_object **bo, uint64_t *iova)
+{
+	return _msm_gem_kernel_new(dev, size, flags, aspace, bo, iova, true);
+}
diff --git a/drivers/gpu/drm/msm/msm_gpu.c b/drivers/gpu/drm/msm/msm_gpu.c
index 9f3dbc2..ffbff27 100644
--- a/drivers/gpu/drm/msm/msm_gpu.c
+++ b/drivers/gpu/drm/msm/msm_gpu.c
@@ -562,11 +562,49 @@ static int get_clocks(struct platform_device *pdev, struct msm_gpu *gpu)
 	return 0;
 }
 
+static struct msm_gem_address_space *
+msm_gpu_create_address_space(struct msm_gpu *gpu, struct platform_device *pdev,
+		uint64_t va_start, uint64_t va_end)
+{
+	struct iommu_domain *iommu;
+	struct msm_gem_address_space *aspace;
+	int ret;
+
+	/*
+	 * Setup IOMMU.. eventually we will (I think) do this once per context
+	 * and have separate page tables per context.  For now, to keep things
+	 * simple and to get something working, just use a single address space:
+	 */
+	iommu = iommu_domain_alloc(&platform_bus_type);
+	if (!iommu)
+		return NULL;
+
+	iommu->geometry.aperture_start = va_start;
+	iommu->geometry.aperture_end = va_end;
+
+	dev_info(gpu->dev->dev, "%s: using IOMMU\n", gpu->name);
+
+	aspace = msm_gem_address_space_create(&pdev->dev, iommu, "gpu");
+	if (IS_ERR(aspace)) {
+		dev_err(gpu->dev->dev, "failed to init iommu: %ld\n",
+			PTR_ERR(aspace));
+		iommu_domain_free(iommu);
+		return ERR_CAST(aspace);
+	}
+
+	ret = aspace->mmu->funcs->attach(aspace->mmu, NULL, 0);
+	if (ret) {
+		msm_gem_address_space_put(aspace);
+		return ERR_PTR(ret);
+	}
+
+	return aspace;
+}
+
 int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 		struct msm_gpu *gpu, const struct msm_gpu_funcs *funcs,
 		const char *name, struct msm_gpu_config *config)
 {
-	struct iommu_domain *iommu;
 	int ret;
 
 	if (WARN_ON(gpu->num_perfcntrs > ARRAY_SIZE(gpu->last_cntrs)))
@@ -636,28 +674,19 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 	if (IS_ERR(gpu->gpu_cx))
 		gpu->gpu_cx = NULL;
 
-	/* Setup IOMMU.. eventually we will (I think) do this once per context
-	 * and have separate page tables per context.  For now, to keep things
-	 * simple and to get something working, just use a single address space:
-	 */
-	iommu = iommu_domain_alloc(&platform_bus_type);
-	if (iommu) {
-		iommu->geometry.aperture_start = config->va_start;
-		iommu->geometry.aperture_end = config->va_end;
+	gpu->pdev = pdev;
+	platform_set_drvdata(pdev, gpu);
 
-		dev_info(drm->dev, "%s: using IOMMU\n", name);
-		gpu->aspace = msm_gem_address_space_create(&pdev->dev,
-				iommu, "gpu");
-		if (IS_ERR(gpu->aspace)) {
-			ret = PTR_ERR(gpu->aspace);
-			dev_err(drm->dev, "failed to init iommu: %d\n", ret);
-			gpu->aspace = NULL;
-			iommu_domain_free(iommu);
-			goto fail;
-		}
+	bs_init(gpu);
 
-	} else {
+	gpu->aspace = msm_gpu_create_address_space(gpu, pdev,
+		config->va_start, config->va_end);
+
+	if (gpu->aspace == NULL)
 		dev_info(drm->dev, "%s: no IOMMU, fallback to VRAM carveout!\n", name);
+	else if (IS_ERR(gpu->aspace)) {
+		ret = PTR_ERR(gpu->aspace);
+		goto fail;
 	}
 
 	/* Create ringbuffer: */
@@ -669,14 +698,10 @@ int msm_gpu_init(struct drm_device *drm, struct platform_device *pdev,
 		goto fail;
 	}
 
-	gpu->pdev = pdev;
-	platform_set_drvdata(pdev, gpu);
-
-	bs_init(gpu);
-
 	return 0;
 
 fail:
+	platform_set_drvdata(pdev, NULL);
 	return ret;
 }
 
@@ -693,7 +718,9 @@ void msm_gpu_cleanup(struct msm_gpu *gpu)
 			msm_gem_put_iova(gpu->rb->bo, gpu->aspace);
 		msm_ringbuffer_destroy(gpu->rb);
 	}
-
-	if (gpu->fctx)
-		msm_fence_context_free(gpu->fctx);
+	if (gpu->aspace) {
+		gpu->aspace->mmu->funcs->detach(gpu->aspace->mmu,
+			NULL, 0);
+		msm_gem_address_space_put(gpu->aspace);
+	}
 }
diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
index a8f2ba5..17d5824 100644
--- a/drivers/gpu/drm/msm/msm_kms.h
+++ b/drivers/gpu/drm/msm/msm_kms.h
@@ -99,5 +99,7 @@ struct msm_kms *mdp4_kms_init(struct drm_device *dev);
 struct msm_kms *mdp5_kms_init(struct drm_device *dev);
 int msm_mdss_init(struct drm_device *dev);
 void msm_mdss_destroy(struct drm_device *dev);
+int msm_mdss_enable(struct msm_mdss *mdss);
+int msm_mdss_disable(struct msm_mdss *mdss);
 
 #endif /* __MSM_KMS_H__ */
diff --git a/drivers/gpu/drm/msm/msm_ringbuffer.c b/drivers/gpu/drm/msm/msm_ringbuffer.c
index 791bca3..bf065a5 100644
--- a/drivers/gpu/drm/msm/msm_ringbuffer.c
+++ b/drivers/gpu/drm/msm/msm_ringbuffer.c
@@ -33,16 +33,14 @@ struct msm_ringbuffer *msm_ringbuffer_new(struct msm_gpu *gpu, int size)
 	}
 
 	ring->gpu = gpu;
-	ring->bo = msm_gem_new(gpu->dev, size, MSM_BO_WC);
-	if (IS_ERR(ring->bo)) {
-		ret = PTR_ERR(ring->bo);
-		ring->bo = NULL;
-		goto fail;
-	}
 
-	ring->start = msm_gem_get_vaddr(ring->bo);
+	/* Pass NULL for the iova pointer - we will map it later */
+	ring->start = msm_gem_kernel_new(gpu->dev, size, MSM_BO_WC,
+		gpu->aspace, &ring->bo, NULL);
+
 	if (IS_ERR(ring->start)) {
 		ret = PTR_ERR(ring->start);
+		ring->start = 0;
 		goto fail;
 	}
 	ring->end   = ring->start + (size / 4);
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_drv.c b/drivers/gpu/drm/mxsfb/mxsfb_drv.c
index d1b9c34..7fbad9c 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_drv.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_drv.c
@@ -190,7 +190,7 @@ static int mxsfb_load(struct drm_device *drm, unsigned long flags)
 	}
 
 	ret = drm_simple_display_pipe_init(drm, &mxsfb->pipe, &mxsfb_funcs,
-			mxsfb_formats, ARRAY_SIZE(mxsfb_formats),
+			mxsfb_formats, ARRAY_SIZE(mxsfb_formats), NULL,
 			&mxsfb->connector);
 	if (ret < 0) {
 		dev_err(drm->dev, "Cannot setup simple display pipe\n");
@@ -256,7 +256,6 @@ static void mxsfb_unload(struct drm_device *drm)
 
 	drm_kms_helper_poll_fini(drm);
 	drm_mode_config_cleanup(drm);
-	drm_vblank_cleanup(drm);
 
 	pm_runtime_get_sync(drm->dev);
 	drm_irq_uninstall(drm);
@@ -335,11 +334,9 @@ static struct drm_driver mxsfb_driver = {
 	.irq_uninstall		= mxsfb_irq_preinstall,
 	.enable_vblank		= mxsfb_enable_vblank,
 	.disable_vblank		= mxsfb_disable_vblank,
-	.gem_free_object	= drm_gem_cma_free_object,
+	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops		= &drm_gem_cma_vm_ops,
 	.dumb_create		= drm_gem_cma_dumb_create,
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset,
-	.dumb_destroy		= drm_gem_dumb_destroy,
 	.prime_handle_to_fd	= drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle	= drm_gem_prime_fd_to_handle,
 	.gem_prime_export	= drm_gem_prime_export,
diff --git a/drivers/gpu/drm/mxsfb/mxsfb_out.c b/drivers/gpu/drm/mxsfb/mxsfb_out.c
index f7d729a..e5edf01 100644
--- a/drivers/gpu/drm/mxsfb/mxsfb_out.c
+++ b/drivers/gpu/drm/mxsfb/mxsfb_out.c
@@ -74,7 +74,6 @@ static void mxsfb_panel_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs mxsfb_panel_connector_funcs = {
-	.dpms			= drm_atomic_helper_connector_dpms,
 	.detect			= mxsfb_panel_connector_detect,
 	.fill_modes		= drm_helper_probe_single_connector_modes,
 	.destroy		= mxsfb_panel_connector_destroy,
diff --git a/drivers/gpu/drm/nouveau/dispnv04/crtc.c b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
index 4b4b0b4..6aa6ee1 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/crtc.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/crtc.c
@@ -764,13 +764,18 @@ nv_crtc_gamma_load(struct drm_crtc *crtc)
 	struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
 	struct drm_device *dev = nv_crtc->base.dev;
 	struct rgb { uint8_t r, g, b; } __attribute__((packed)) *rgbs;
+	u16 *r, *g, *b;
 	int i;
 
 	rgbs = (struct rgb *)nv04_display(dev)->mode_reg.crtc_reg[nv_crtc->index].DAC;
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
+
 	for (i = 0; i < 256; i++) {
-		rgbs[i].r = nv_crtc->lut.r[i] >> 8;
-		rgbs[i].g = nv_crtc->lut.g[i] >> 8;
-		rgbs[i].b = nv_crtc->lut.b[i] >> 8;
+		rgbs[i].r = *r++ >> 8;
+		rgbs[i].g = *g++ >> 8;
+		rgbs[i].b = *b++ >> 8;
 	}
 
 	nouveau_hw_load_state_palette(dev, nv_crtc->index, &nv04_display(dev)->mode_reg);
@@ -792,13 +797,6 @@ nv_crtc_gamma_set(struct drm_crtc *crtc, u16 *r, u16 *g, u16 *b,
 		  struct drm_modeset_acquire_ctx *ctx)
 {
 	struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-	int i;
-
-	for (i = 0; i < size; i++) {
-		nv_crtc->lut.r[i] = r[i];
-		nv_crtc->lut.g[i] = g[i];
-		nv_crtc->lut.b[i] = b[i];
-	}
 
 	/* We need to know the depth before we upload, but it's possible to
 	 * get called before a framebuffer is bound.  If this is the case,
@@ -1095,25 +1093,51 @@ static const struct drm_crtc_helper_funcs nv04_crtc_helper_funcs = {
 	.mode_set = nv_crtc_mode_set,
 	.mode_set_base = nv04_crtc_mode_set_base,
 	.mode_set_base_atomic = nv04_crtc_mode_set_base_atomic,
-	.load_lut = nv_crtc_gamma_load,
 	.disable = nv_crtc_disable,
 };
 
+static const uint32_t modeset_formats[] = {
+        DRM_FORMAT_XRGB8888,
+        DRM_FORMAT_RGB565,
+        DRM_FORMAT_XRGB1555,
+};
+
+static struct drm_plane *
+create_primary_plane(struct drm_device *dev)
+{
+        struct drm_plane *primary;
+        int ret;
+
+        primary = kzalloc(sizeof(*primary), GFP_KERNEL);
+        if (primary == NULL) {
+                DRM_DEBUG_KMS("Failed to allocate primary plane\n");
+                return NULL;
+        }
+
+        /* possible_crtc's will be filled in later by crtc_init */
+        ret = drm_universal_plane_init(dev, primary, 0,
+                                       &drm_primary_helper_funcs,
+                                       modeset_formats,
+                                       ARRAY_SIZE(modeset_formats), NULL,
+                                       DRM_PLANE_TYPE_PRIMARY, NULL);
+        if (ret) {
+                kfree(primary);
+                primary = NULL;
+        }
+
+        return primary;
+}
+
 int
 nv04_crtc_create(struct drm_device *dev, int crtc_num)
 {
 	struct nouveau_crtc *nv_crtc;
-	int ret, i;
+	int ret;
 
 	nv_crtc = kzalloc(sizeof(*nv_crtc), GFP_KERNEL);
 	if (!nv_crtc)
 		return -ENOMEM;
 
-	for (i = 0; i < 256; i++) {
-		nv_crtc->lut.r[i] = i << 8;
-		nv_crtc->lut.g[i] = i << 8;
-		nv_crtc->lut.b[i] = i << 8;
-	}
 	nv_crtc->lut.depth = 0;
 
 	nv_crtc->index = crtc_num;
@@ -1122,7 +1146,9 @@ nv04_crtc_create(struct drm_device *dev, int crtc_num)
 	nv_crtc->save = nv_crtc_save;
 	nv_crtc->restore = nv_crtc_restore;
 
-	drm_crtc_init(dev, &nv_crtc->base, &nv04_crtc_funcs);
+	drm_crtc_init_with_planes(dev, &nv_crtc->base,
+                                  create_primary_plane(dev), NULL,
+                                  &nv04_crtc_funcs, NULL);
 	drm_crtc_helper_add(&nv_crtc->base, &nv04_crtc_helper_funcs);
 	drm_mode_crtc_set_gamma_size(&nv_crtc->base, 256);
 
diff --git a/drivers/gpu/drm/nouveau/dispnv04/overlay.c b/drivers/gpu/drm/nouveau/dispnv04/overlay.c
index e54944d..c8c2333 100644
--- a/drivers/gpu/drm/nouveau/dispnv04/overlay.c
+++ b/drivers/gpu/drm/nouveau/dispnv04/overlay.c
@@ -63,6 +63,7 @@ static uint32_t formats[] = {
 	DRM_FORMAT_YUYV,
 	DRM_FORMAT_UYVY,
 	DRM_FORMAT_NV12,
+	DRM_FORMAT_NV21,
 };
 
 /* Sine can be approximated with
@@ -90,6 +91,26 @@ cos_mul(int degrees, int factor)
 }
 
 static int
+verify_scaling(const struct drm_framebuffer *fb, uint8_t shift,
+               uint32_t src_x, uint32_t src_y, uint32_t src_w, uint32_t src_h,
+               uint32_t crtc_w, uint32_t crtc_h)
+{
+	if (crtc_w < (src_w >> shift) || crtc_h < (src_h >> shift)) {
+		DRM_DEBUG_KMS("Unsuitable framebuffer scaling: %dx%d -> %dx%d\n",
+			      src_w, src_h, crtc_w, crtc_h);
+		return -ERANGE;
+	}
+
+	if (src_x != 0 || src_y != 0) {
+		DRM_DEBUG_KMS("Unsuitable framebuffer offset: %d,%d\n",
+                              src_x, src_y);
+		return -ERANGE;
+	}
+
+	return 0;
+}
+
+static int
 nv10_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
 		  struct drm_framebuffer *fb, int crtc_x, int crtc_y,
 		  unsigned int crtc_w, unsigned int crtc_h,
@@ -107,7 +128,9 @@ nv10_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
 	bool flip = nv_plane->flip;
 	int soff = NV_PCRTC0_SIZE * nv_crtc->index;
 	int soff2 = NV_PCRTC0_SIZE * !nv_crtc->index;
-	int format, ret;
+	unsigned shift = drm->client.device.info.chipset >= 0x30 ? 1 : 3;
+	unsigned format = 0;
+	int ret;
 
 	/* Source parameters given in 16.16 fixed point, ignore fractional. */
 	src_x >>= 16;
@@ -115,18 +138,9 @@ nv10_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
 	src_w >>= 16;
 	src_h >>= 16;
 
-	format = ALIGN(src_w * 4, 0x100);
-
-	if (format > 0xffff)
-		return -ERANGE;
-
-	if (drm->client.device.info.chipset >= 0x30) {
-		if (crtc_w < (src_w >> 1) || crtc_h < (src_h >> 1))
-			return -ERANGE;
-	} else {
-		if (crtc_w < (src_w >> 3) || crtc_h < (src_h >> 3))
-			return -ERANGE;
-	}
+	ret = verify_scaling(fb, shift, 0, 0, src_w, src_h, crtc_w, crtc_h);
+	if (ret)
+		return ret;
 
 	ret = nouveau_bo_pin(nv_fb->nvbo, TTM_PL_FLAG_VRAM, false);
 	if (ret)
@@ -146,21 +160,23 @@ nv10_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
 	nvif_wr32(dev, NV_PVIDEO_POINT_OUT(flip), crtc_y << 16 | crtc_x);
 	nvif_wr32(dev, NV_PVIDEO_SIZE_OUT(flip), crtc_h << 16 | crtc_w);
 
-	if (fb->format->format != DRM_FORMAT_UYVY)
+	if (fb->format->format == DRM_FORMAT_YUYV ||
+	    fb->format->format == DRM_FORMAT_NV12)
 		format |= NV_PVIDEO_FORMAT_COLOR_LE_CR8YB8CB8YA8;
-	if (fb->format->format == DRM_FORMAT_NV12)
+	if (fb->format->format == DRM_FORMAT_NV12 ||
+	    fb->format->format == DRM_FORMAT_NV21)
 		format |= NV_PVIDEO_FORMAT_PLANAR;
 	if (nv_plane->iturbt_709)
 		format |= NV_PVIDEO_FORMAT_MATRIX_ITURBT709;
 	if (nv_plane->colorkey & (1 << 24))
 		format |= NV_PVIDEO_FORMAT_DISPLAY_COLOR_KEY;
 
-	if (fb->format->format == DRM_FORMAT_NV12) {
+	if (format & NV_PVIDEO_FORMAT_PLANAR) {
 		nvif_wr32(dev, NV_PVIDEO_UVPLANE_BASE(flip), 0);
 		nvif_wr32(dev, NV_PVIDEO_UVPLANE_OFFSET_BUFF(flip),
 			nv_fb->nvbo->bo.offset + fb->offsets[1]);
 	}
-	nvif_wr32(dev, NV_PVIDEO_FORMAT(flip), format);
+	nvif_wr32(dev, NV_PVIDEO_FORMAT(flip), format | fb->pitches[0]);
 	nvif_wr32(dev, NV_PVIDEO_STOP, 0);
 	/* TODO: wait for vblank? */
 	nvif_wr32(dev, NV_PVIDEO_BUFFER, flip ? 0x10 : 0x1);
@@ -357,7 +373,7 @@ nv04_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
 	struct nouveau_bo *cur = nv_plane->cur;
 	uint32_t overlay = 1;
 	int brightness = (nv_plane->brightness - 512) * 62 / 512;
-	int pitch, ret, i;
+	int ret, i;
 
 	/* Source parameters given in 16.16 fixed point, ignore fractional. */
 	src_x >>= 16;
@@ -365,17 +381,9 @@ nv04_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
 	src_w >>= 16;
 	src_h >>= 16;
 
-	pitch = ALIGN(src_w * 4, 0x100);
-
-	if (pitch > 0xffff)
-		return -ERANGE;
-
-	/* TODO: Compute an offset? Not sure how to do this for YUYV. */
-	if (src_x != 0 || src_y != 0)
-		return -ERANGE;
-
-	if (crtc_w < src_w || crtc_h < src_h)
-		return -ERANGE;
+	ret = verify_scaling(fb, 0, src_x, src_y, src_w, src_h, crtc_w, crtc_h);
+	if (ret)
+		return ret;
 
 	ret = nouveau_bo_pin(nv_fb->nvbo, TTM_PL_FLAG_VRAM, false);
 	if (ret)
@@ -389,8 +397,9 @@ nv04_update_plane(struct drm_plane *plane, struct drm_crtc *crtc,
 
 	for (i = 0; i < 2; i++) {
 		nvif_wr32(dev, NV_PVIDEO_BUFF0_START_ADDRESS + 4 * i,
-			nv_fb->nvbo->bo.offset);
-		nvif_wr32(dev, NV_PVIDEO_BUFF0_PITCH_LENGTH + 4 * i, pitch);
+			  nv_fb->nvbo->bo.offset);
+		nvif_wr32(dev, NV_PVIDEO_BUFF0_PITCH_LENGTH + 4 * i,
+			  fb->pitches[0]);
 		nvif_wr32(dev, NV_PVIDEO_BUFF0_OFFSET + 4 * i, 0);
 	}
 	nvif_wr32(dev, NV_PVIDEO_WINDOW_START, crtc_y << 16 | crtc_x);
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/conn.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/conn.h
index e8e77ee..deb4772 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/conn.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/conn.h
@@ -18,6 +18,7 @@ enum dcb_connector_type {
 	DCB_CONNECTOR_HDMI_C = 0x63,
 	DCB_CONNECTOR_DMS59_DP0 = 0x64,
 	DCB_CONNECTOR_DMS59_DP1 = 0x65,
+	DCB_CONNECTOR_WFD	= 0x70,
 	DCB_CONNECTOR_NONE = 0xff
 };
 
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h
index 4892a65..903d117 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/bios/dcb.h
@@ -6,6 +6,7 @@ enum dcb_output_type {
 	DCB_OUTPUT_TMDS		= 0x2,
 	DCB_OUTPUT_LVDS		= 0x3,
 	DCB_OUTPUT_DP		= 0x6,
+	DCB_OUTPUT_WFD		= 0x8,
 	DCB_OUTPUT_EOL		= 0xe,
 	DCB_OUTPUT_UNUSED	= 0xf,
 	DCB_OUTPUT_ANY = -1,
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
index b268b96..1bfd93b 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/therm.h
@@ -96,4 +96,5 @@ int g84_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
 int gt215_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
 int gf119_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
 int gm107_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
+int gm200_therm_new(struct nvkm_device *, int, struct nvkm_therm **);
 #endif
diff --git a/drivers/gpu/drm/nouveau/nouveau_bios.c b/drivers/gpu/drm/nouveau/nouveau_bios.c
index b998c33..dd6fba5 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bios.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bios.c
@@ -351,11 +351,8 @@ static int parse_fp_mode_table(struct drm_device *dev, struct nvbios *bios)
 	struct lvdstableheader lth;
 
 	if (bios->fp.fptablepointer == 0x0) {
-		/* Apple cards don't have the fp table; the laptops use DDC */
-		/* The table is also missing on some x86 IGPs */
-#ifndef __powerpc__
-		NV_ERROR(drm, "Pointer to flat panel table invalid\n");
-#endif
+		/* Most laptop cards lack an fp table. They use DDC. */
+		NV_DEBUG(drm, "Pointer to flat panel table invalid\n");
 		bios->digital_min_front_porch = 0x4b;
 		return 0;
 	}
diff --git a/drivers/gpu/drm/nouveau/nouveau_connector.c b/drivers/gpu/drm/nouveau/nouveau_connector.c
index dab78c6..70d8e0d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_connector.c
+++ b/drivers/gpu/drm/nouveau/nouveau_connector.c
@@ -770,9 +770,6 @@ nouveau_connector_set_property(struct drm_connector *connector,
 	struct drm_encoder *encoder = to_drm_encoder(nv_encoder);
 	int ret;
 
-	if (drm_drv_uses_atomic_modeset(connector->dev))
-		return drm_atomic_helper_connector_set_property(connector, property, value);
-
 	ret = connector->funcs->atomic_set_property(&nv_connector->base,
 						    &asyc->state,
 						    property, value);
@@ -1075,17 +1072,9 @@ nouveau_connector_helper_funcs = {
 	.best_encoder = nouveau_connector_best_encoder,
 };
 
-static int
-nouveau_connector_dpms(struct drm_connector *connector, int mode)
-{
-	if (drm_drv_uses_atomic_modeset(connector->dev))
-		return drm_atomic_helper_connector_dpms(connector, mode);
-	return drm_helper_connector_dpms(connector, mode);
-}
-
 static const struct drm_connector_funcs
 nouveau_connector_funcs = {
-	.dpms = nouveau_connector_dpms,
+	.dpms = drm_helper_connector_dpms,
 	.reset = nouveau_conn_reset,
 	.detect = nouveau_connector_detect,
 	.force = nouveau_connector_force,
@@ -1100,7 +1089,7 @@ nouveau_connector_funcs = {
 
 static const struct drm_connector_funcs
 nouveau_connector_funcs_lvds = {
-	.dpms = nouveau_connector_dpms,
+	.dpms = drm_helper_connector_dpms,
 	.reset = nouveau_conn_reset,
 	.detect = nouveau_connector_detect_lvds,
 	.force = nouveau_connector_force,
@@ -1195,6 +1184,7 @@ drm_conntype_from_dcb(enum dcb_connector_type dcb)
 	case DCB_CONNECTOR_HDMI_0   :
 	case DCB_CONNECTOR_HDMI_1   :
 	case DCB_CONNECTOR_HDMI_C   : return DRM_MODE_CONNECTOR_HDMIA;
+	case DCB_CONNECTOR_WFD	    : return DRM_MODE_CONNECTOR_VIRTUAL;
 	default:
 		break;
 	}
diff --git a/drivers/gpu/drm/nouveau/nouveau_crtc.h b/drivers/gpu/drm/nouveau/nouveau_crtc.h
index 050fcf3..b7a18fb 100644
--- a/drivers/gpu/drm/nouveau/nouveau_crtc.h
+++ b/drivers/gpu/drm/nouveau/nouveau_crtc.h
@@ -61,9 +61,6 @@ struct nouveau_crtc {
 
 	struct {
 		struct nouveau_bo *nvbo;
-		uint16_t r[256];
-		uint16_t g[256];
-		uint16_t b[256];
 		int depth;
 	} lut;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_display.c b/drivers/gpu/drm/nouveau/nouveau_display.c
index f362c9f..2e7785f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_display.c
+++ b/drivers/gpu/drm/nouveau/nouveau_display.c
@@ -159,8 +159,6 @@ nouveau_display_vblank_fini(struct drm_device *dev)
 {
 	struct drm_crtc *crtc;
 
-	drm_vblank_cleanup(dev);
-
 	list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
 		struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
 		nvif_notify_fini(&nv_crtc->vblank);
@@ -233,9 +231,30 @@ nouveau_framebuffer_new(struct drm_device *dev,
 			struct nouveau_bo *nvbo,
 			struct nouveau_framebuffer **pfb)
 {
+	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct nouveau_framebuffer *fb;
 	int ret;
 
+	/* YUV overlays have special requirements pre-NV50 */
+	if (drm->client.device.info.family < NV_DEVICE_INFO_V0_TESLA &&
+
+	    (mode_cmd->pixel_format == DRM_FORMAT_YUYV ||
+	     mode_cmd->pixel_format == DRM_FORMAT_UYVY ||
+	     mode_cmd->pixel_format == DRM_FORMAT_NV12 ||
+	     mode_cmd->pixel_format == DRM_FORMAT_NV21) &&
+	    (mode_cmd->pitches[0] & 0x3f || /* align 64 */
+	     mode_cmd->pitches[0] >= 0x10000 || /* at most 64k pitch */
+	     (mode_cmd->pitches[1] && /* pitches for planes must match */
+	      mode_cmd->pitches[0] != mode_cmd->pitches[1]))) {
+		struct drm_format_name_buf format_name;
+		DRM_DEBUG_KMS("Unsuitable framebuffer: format: %s; pitches: 0x%x 0x%x\n",
+			      drm_get_format_name(mode_cmd->pixel_format,
+						  &format_name),
+			      mode_cmd->pitches[0],
+			      mode_cmd->pitches[1]);
+		return -EINVAL;
+	}
+
 	if (!(fb = *pfb = kzalloc(sizeof(*fb), GFP_KERNEL)))
 		return -ENOMEM;
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 90757af..595630d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -585,18 +585,18 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
 	nouveau_led_suspend(dev);
 
 	if (dev->mode_config.num_crtc) {
-		NV_INFO(drm, "suspending console...\n");
+		NV_DEBUG(drm, "suspending console...\n");
 		nouveau_fbcon_set_suspend(dev, 1);
-		NV_INFO(drm, "suspending display...\n");
+		NV_DEBUG(drm, "suspending display...\n");
 		ret = nouveau_display_suspend(dev, runtime);
 		if (ret)
 			return ret;
 	}
 
-	NV_INFO(drm, "evicting buffers...\n");
+	NV_DEBUG(drm, "evicting buffers...\n");
 	ttm_bo_evict_mm(&drm->ttm.bdev, TTM_PL_VRAM);
 
-	NV_INFO(drm, "waiting for kernel channels to go idle...\n");
+	NV_DEBUG(drm, "waiting for kernel channels to go idle...\n");
 	if (drm->cechan) {
 		ret = nouveau_channel_idle(drm->cechan);
 		if (ret)
@@ -609,7 +609,7 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
 			goto fail_display;
 	}
 
-	NV_INFO(drm, "suspending fence...\n");
+	NV_DEBUG(drm, "suspending fence...\n");
 	if (drm->fence && nouveau_fence(drm)->suspend) {
 		if (!nouveau_fence(drm)->suspend(drm)) {
 			ret = -ENOMEM;
@@ -617,7 +617,7 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
 		}
 	}
 
-	NV_INFO(drm, "suspending object tree...\n");
+	NV_DEBUG(drm, "suspending object tree...\n");
 	ret = nvif_client_suspend(&drm->client.base);
 	if (ret)
 		goto fail_client;
@@ -630,7 +630,7 @@ nouveau_do_suspend(struct drm_device *dev, bool runtime)
 
 fail_display:
 	if (dev->mode_config.num_crtc) {
-		NV_INFO(drm, "resuming display...\n");
+		NV_DEBUG(drm, "resuming display...\n");
 		nouveau_display_resume(dev, runtime);
 	}
 	return ret;
@@ -641,19 +641,19 @@ nouveau_do_resume(struct drm_device *dev, bool runtime)
 {
 	struct nouveau_drm *drm = nouveau_drm(dev);
 
-	NV_INFO(drm, "resuming object tree...\n");
+	NV_DEBUG(drm, "resuming object tree...\n");
 	nvif_client_resume(&drm->client.base);
 
-	NV_INFO(drm, "resuming fence...\n");
+	NV_DEBUG(drm, "resuming fence...\n");
 	if (drm->fence && nouveau_fence(drm)->resume)
 		nouveau_fence(drm)->resume(drm);
 
 	nouveau_run_vbios_init(dev);
 
 	if (dev->mode_config.num_crtc) {
-		NV_INFO(drm, "resuming display...\n");
+		NV_DEBUG(drm, "resuming display...\n");
 		nouveau_display_resume(dev, runtime);
-		NV_INFO(drm, "resuming console...\n");
+		NV_DEBUG(drm, "resuming console...\n");
 		nouveau_fbcon_set_suspend(dev, 0);
 	}
 
@@ -998,7 +998,6 @@ driver_stub = {
 
 	.dumb_create = nouveau_display_dumb_create,
 	.dumb_map_offset = nouveau_display_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 
 	.name = DRIVER_NAME,
 	.desc = DRIVER_DESC,
@@ -1098,7 +1097,6 @@ static int __init
 nouveau_drm_init(void)
 {
 	driver_pci = driver_stub;
-	driver_pci.set_busid = drm_pci_set_busid;
 	driver_platform = driver_stub;
 
 	nouveau_display_options();
@@ -1117,7 +1115,12 @@ nouveau_drm_init(void)
 
 	nouveau_register_dsm_handler();
 	nouveau_backlight_ctor();
-	return drm_pci_init(&driver_pci, &nouveau_drm_pci_driver);
+
+#ifdef CONFIG_PCI
+	return pci_register_driver(&nouveau_drm_pci_driver);
+#else
+	return 0;
+#endif
 }
 
 static void __exit
@@ -1126,7 +1129,9 @@ nouveau_drm_exit(void)
 	if (!nouveau_modeset)
 		return;
 
-	drm_pci_exit(&driver_pci, &nouveau_drm_pci_driver);
+#ifdef CONFIG_PCI
+	pci_unregister_driver(&nouveau_drm_pci_driver);
+#endif
 	nouveau_backlight_dtor();
 	nouveau_unregister_dsm_handler();
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_fbcon.c b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
index 2665a07..f770784 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fbcon.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fbcon.c
@@ -278,26 +278,6 @@ nouveau_fbcon_accel_init(struct drm_device *dev)
 		info->fbops = &nouveau_fbcon_ops;
 }
 
-static void nouveau_fbcon_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-				    u16 blue, int regno)
-{
-	struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-
-	nv_crtc->lut.r[regno] = red;
-	nv_crtc->lut.g[regno] = green;
-	nv_crtc->lut.b[regno] = blue;
-}
-
-static void nouveau_fbcon_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-				    u16 *blue, int regno)
-{
-	struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-
-	*red = nv_crtc->lut.r[regno];
-	*green = nv_crtc->lut.g[regno];
-	*blue = nv_crtc->lut.b[regno];
-}
-
 static void
 nouveau_fbcon_zfill(struct drm_device *dev, struct nouveau_fbdev *fbcon)
 {
@@ -467,8 +447,6 @@ void nouveau_fbcon_gpu_lockup(struct fb_info *info)
 }
 
 static const struct drm_fb_helper_funcs nouveau_fbcon_helper_funcs = {
-	.gamma_set = nouveau_fbcon_gamma_set,
-	.gamma_get = nouveau_fbcon_gamma_get,
 	.fb_probe = nouveau_fbcon_create,
 };
 
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index 999c35a..b0ad7fc 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -179,7 +179,8 @@ nouveau_gart_manager_new(struct ttm_mem_type_manager *man,
 }
 
 static void
-nouveau_gart_manager_debug(struct ttm_mem_type_manager *man, const char *prefix)
+nouveau_gart_manager_debug(struct ttm_mem_type_manager *man,
+			   struct drm_printer *printer)
 {
 }
 
@@ -252,7 +253,8 @@ nv04_gart_manager_new(struct ttm_mem_type_manager *man,
 }
 
 static void
-nv04_gart_manager_debug(struct ttm_mem_type_manager *man, const char *prefix)
+nv04_gart_manager_debug(struct ttm_mem_type_manager *man,
+			struct drm_printer *printer)
 {
 }
 
diff --git a/drivers/gpu/drm/nouveau/nv50_display.c b/drivers/gpu/drm/nouveau/nv50_display.c
index 2bc0dc9..2dbf62a 100644
--- a/drivers/gpu/drm/nouveau/nv50_display.c
+++ b/drivers/gpu/drm/nouveau/nv50_display.c
@@ -1055,7 +1055,6 @@ nv50_wndw = {
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.destroy = nv50_wndw_destroy,
 	.reset = nv50_wndw_reset,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.atomic_duplicate_state = nv50_wndw_atomic_duplicate_state,
 	.atomic_destroy_state = nv50_wndw_atomic_destroy_state,
 };
@@ -1083,8 +1082,9 @@ nv50_wndw_ctor(const struct nv50_wndw_func *func, struct drm_device *dev,
 	wndw->func = func;
 	wndw->dmac = dmac;
 
-	ret = drm_universal_plane_init(dev, &wndw->plane, 0, &nv50_wndw, format,
-				       nformat, type, "%s-%d", name, index);
+	ret = drm_universal_plane_init(dev, &wndw->plane, 0, &nv50_wndw,
+				       format, nformat, NULL,
+				       type, "%s-%d", name, index);
 	if (ret)
 		return ret;
 
@@ -2103,7 +2103,7 @@ nv50_head_atomic_check(struct drm_crtc *crtc, struct drm_crtc_state *state)
 
 	NV_ATOMIC(drm, "%s atomic_check %d\n", crtc->name, asyh->state.active);
 	if (asyh->state.active) {
-		for_each_connector_in_state(asyh->state.state, conn, conns, i) {
+		for_each_new_connector_in_state(asyh->state.state, conn, conns, i) {
 			if (conns->crtc == crtc) {
 				asyc = nouveau_conn_atom(conns);
 				break;
@@ -2204,28 +2204,29 @@ nv50_head_lut_load(struct drm_crtc *crtc)
 	struct nv50_disp *disp = nv50_disp(crtc->dev);
 	struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
 	void __iomem *lut = nvbo_kmap_obj_iovirtual(nv_crtc->lut.nvbo);
+	u16 *r, *g, *b;
 	int i;
 
-	for (i = 0; i < 256; i++) {
-		u16 r = nv_crtc->lut.r[i] >> 2;
-		u16 g = nv_crtc->lut.g[i] >> 2;
-		u16 b = nv_crtc->lut.b[i] >> 2;
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 
+	for (i = 0; i < 256; i++) {
 		if (disp->disp->oclass < GF110_DISP) {
-			writew(r + 0x0000, lut + (i * 0x08) + 0);
-			writew(g + 0x0000, lut + (i * 0x08) + 2);
-			writew(b + 0x0000, lut + (i * 0x08) + 4);
+			writew((*r++ >> 2) + 0x0000, lut + (i * 0x08) + 0);
+			writew((*g++ >> 2) + 0x0000, lut + (i * 0x08) + 2);
+			writew((*b++ >> 2) + 0x0000, lut + (i * 0x08) + 4);
 		} else {
-			writew(r + 0x6000, lut + (i * 0x20) + 0);
-			writew(g + 0x6000, lut + (i * 0x20) + 2);
-			writew(b + 0x6000, lut + (i * 0x20) + 4);
+			/* 0x6000 interferes with the 14-bit color??? */
+			writew((*r++ >> 2) + 0x6000, lut + (i * 0x20) + 0);
+			writew((*g++ >> 2) + 0x6000, lut + (i * 0x20) + 2);
+			writew((*b++ >> 2) + 0x6000, lut + (i * 0x20) + 4);
 		}
 	}
 }
 
 static const struct drm_crtc_helper_funcs
 nv50_head_help = {
-	.load_lut = nv50_head_lut_load,
 	.atomic_check = nv50_head_atomic_check,
 };
 
@@ -2234,15 +2235,6 @@ nv50_head_gamma_set(struct drm_crtc *crtc, u16 *r, u16 *g, u16 *b,
 		    uint32_t size,
 		    struct drm_modeset_acquire_ctx *ctx)
 {
-	struct nouveau_crtc *nv_crtc = nouveau_crtc(crtc);
-	u32 i;
-
-	for (i = 0; i < size; i++) {
-		nv_crtc->lut.r[i] = r[i];
-		nv_crtc->lut.g[i] = g[i];
-		nv_crtc->lut.b[i] = b[i];
-	}
-
 	nv50_head_lut_load(crtc);
 	return 0;
 }
@@ -2325,7 +2317,6 @@ nv50_head_func = {
 	.destroy = nv50_head_destroy,
 	.set_config = drm_atomic_helper_set_config,
 	.page_flip = drm_atomic_helper_page_flip,
-	.set_property = drm_atomic_helper_crtc_set_property,
 	.atomic_duplicate_state = nv50_head_atomic_duplicate_state,
 	.atomic_destroy_state = nv50_head_atomic_destroy_state,
 };
@@ -2340,19 +2331,13 @@ nv50_head_create(struct drm_device *dev, int index)
 	struct nv50_base *base;
 	struct nv50_curs *curs;
 	struct drm_crtc *crtc;
-	int ret, i;
+	int ret;
 
 	head = kzalloc(sizeof(*head), GFP_KERNEL);
 	if (!head)
 		return -ENOMEM;
 
 	head->base.index = index;
-	for (i = 0; i < 256; i++) {
-		head->base.lut.r[i] = i << 8;
-		head->base.lut.g[i] = i << 8;
-		head->base.lut.b[i] = i << 8;
-	}
-
 	ret = nv50_base_new(drm, head, &base);
 	if (ret == 0)
 		ret = nv50_curs_new(drm, head, &curs);
@@ -2762,7 +2747,8 @@ nv50_hdmi_enable(struct drm_encoder *encoder, struct drm_display_mode *mode)
 	if (!drm_detect_hdmi_monitor(nv_connector->edid))
 		return;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&avi_frame.avi, mode);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&avi_frame.avi, mode,
+						       false);
 	if (!ret) {
 		/* We have an AVI InfoFrame, populate it to the display */
 		args.pwr.avi_infoframe_length
@@ -3119,11 +3105,9 @@ nv50_mstc_destroy(struct drm_connector *connector)
 
 static const struct drm_connector_funcs
 nv50_mstc = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = nouveau_conn_reset,
 	.detect = nv50_mstc_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.destroy = nv50_mstc_destroy,
 	.atomic_duplicate_state = nouveau_conn_atomic_duplicate_state,
 	.atomic_destroy_state = nouveau_conn_atomic_destroy_state,
@@ -3157,7 +3141,7 @@ nv50_mstc_new(struct nv50_mstm *mstm, struct drm_dp_mst_port *port,
 	mstc->connector.funcs->reset(&mstc->connector);
 	nouveau_conn_attach_properties(&mstc->connector);
 
-	for (i = 0; i < ARRAY_SIZE(mstm->msto) && mstm->msto; i++)
+	for (i = 0; i < ARRAY_SIZE(mstm->msto) && mstm->msto[i]; i++)
 		drm_mode_connector_attach_encoder(&mstc->connector, &mstm->msto[i]->encoder);
 
 	drm_object_attach_property(&mstc->connector.base, dev->mode_config.path_property, 0);
@@ -3913,9 +3897,9 @@ static void
 nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 {
 	struct drm_device *dev = state->dev;
-	struct drm_crtc_state *crtc_state;
+	struct drm_crtc_state *new_crtc_state, *old_crtc_state;
 	struct drm_crtc *crtc;
-	struct drm_plane_state *plane_state;
+	struct drm_plane_state *new_plane_state;
 	struct drm_plane *plane;
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct nv50_disp *disp = nv50_disp(dev);
@@ -3934,13 +3918,13 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 		mutex_lock(&disp->mutex);
 
 	/* Disable head(s). */
-	for_each_crtc_in_state(state, crtc, crtc_state, i) {
-		struct nv50_head_atom *asyh = nv50_head_atom(crtc->state);
+	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
 		struct nv50_head *head = nv50_head(crtc);
 
 		NV_ATOMIC(drm, "%s: clr %04x (set %04x)\n", crtc->name,
 			  asyh->clr.mask, asyh->set.mask);
-		if (crtc_state->active && !asyh->state.active)
+		if (old_crtc_state->active && !new_crtc_state->active)
 			drm_crtc_vblank_off(crtc);
 
 		if (asyh->clr.mask) {
@@ -3950,8 +3934,8 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 	}
 
 	/* Disable plane(s). */
-	for_each_plane_in_state(state, plane, plane_state, i) {
-		struct nv50_wndw_atom *asyw = nv50_wndw_atom(plane->state);
+	for_each_new_plane_in_state(state, plane, new_plane_state, i) {
+		struct nv50_wndw_atom *asyw = nv50_wndw_atom(new_plane_state);
 		struct nv50_wndw *wndw = nv50_wndw(plane);
 
 		NV_ATOMIC(drm, "%s: clr %02x (set %02x)\n", plane->name,
@@ -4016,8 +4000,8 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 	}
 
 	/* Update head(s). */
-	for_each_crtc_in_state(state, crtc, crtc_state, i) {
-		struct nv50_head_atom *asyh = nv50_head_atom(crtc->state);
+	for_each_oldnew_crtc_in_state(state, crtc, old_crtc_state, new_crtc_state, i) {
+		struct nv50_head_atom *asyh = nv50_head_atom(new_crtc_state);
 		struct nv50_head *head = nv50_head(crtc);
 
 		NV_ATOMIC(drm, "%s: set %04x (clr %04x)\n", crtc->name,
@@ -4028,17 +4012,17 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 			interlock_core = 1;
 		}
 
-		if (asyh->state.active) {
-			if (!crtc_state->active)
+		if (new_crtc_state->active) {
+			if (!old_crtc_state->active)
 				drm_crtc_vblank_on(crtc);
-			if (asyh->state.event)
+			if (new_crtc_state->event)
 				drm_crtc_vblank_get(crtc);
 		}
 	}
 
 	/* Update plane(s). */
-	for_each_plane_in_state(state, plane, plane_state, i) {
-		struct nv50_wndw_atom *asyw = nv50_wndw_atom(plane->state);
+	for_each_new_plane_in_state(state, plane, new_plane_state, i) {
+		struct nv50_wndw_atom *asyw = nv50_wndw_atom(new_plane_state);
 		struct nv50_wndw *wndw = nv50_wndw(plane);
 
 		NV_ATOMIC(drm, "%s: set %02x (clr %02x)\n", plane->name,
@@ -4068,25 +4052,26 @@ nv50_disp_atomic_commit_tail(struct drm_atomic_state *state)
 		mutex_unlock(&disp->mutex);
 
 	/* Wait for HW to signal completion. */
-	for_each_plane_in_state(state, plane, plane_state, i) {
-		struct nv50_wndw_atom *asyw = nv50_wndw_atom(plane->state);
+	for_each_new_plane_in_state(state, plane, new_plane_state, i) {
+		struct nv50_wndw_atom *asyw = nv50_wndw_atom(new_plane_state);
 		struct nv50_wndw *wndw = nv50_wndw(plane);
 		int ret = nv50_wndw_wait_armed(wndw, asyw);
 		if (ret)
 			NV_ERROR(drm, "%s: timeout\n", plane->name);
 	}
 
-	for_each_crtc_in_state(state, crtc, crtc_state, i) {
-		if (crtc->state->event) {
+	for_each_new_crtc_in_state(state, crtc, new_crtc_state, i) {
+		if (new_crtc_state->event) {
 			unsigned long flags;
 			/* Get correct count/ts if racing with vblank irq */
-			if (crtc->state->active)
-				drm_accurate_vblank_count(crtc);
+			if (new_crtc_state->active)
+				drm_crtc_accurate_vblank_count(crtc);
 			spin_lock_irqsave(&crtc->dev->event_lock, flags);
-			drm_crtc_send_vblank_event(crtc, crtc->state->event);
+			drm_crtc_send_vblank_event(crtc, new_crtc_state->event);
 			spin_unlock_irqrestore(&crtc->dev->event_lock, flags);
-			crtc->state->event = NULL;
-			if (crtc->state->active)
+
+			new_crtc_state->event = NULL;
+			if (new_crtc_state->active)
 				drm_crtc_vblank_put(crtc);
 		}
 	}
@@ -4111,7 +4096,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
 {
 	struct nouveau_drm *drm = nouveau_drm(dev);
 	struct nv50_disp *disp = nv50_disp(dev);
-	struct drm_plane_state *plane_state;
+	struct drm_plane_state *old_plane_state;
 	struct drm_plane *plane;
 	struct drm_crtc *crtc;
 	bool active = false;
@@ -4134,12 +4119,17 @@ nv50_disp_atomic_commit(struct drm_device *dev,
 	if (!nonblock) {
 		ret = drm_atomic_helper_wait_for_fences(dev, state, true);
 		if (ret)
-			goto done;
+			goto err_cleanup;
 	}
 
-	for_each_plane_in_state(state, plane, plane_state, i) {
-		struct nv50_wndw_atom *asyw = nv50_wndw_atom(plane_state);
+	ret = drm_atomic_helper_swap_state(state, true);
+	if (ret)
+		goto err_cleanup;
+
+	for_each_old_plane_in_state(state, plane, old_plane_state, i) {
+		struct nv50_wndw_atom *asyw = nv50_wndw_atom(old_plane_state);
 		struct nv50_wndw *wndw = nv50_wndw(plane);
+
 		if (asyw->set.image) {
 			asyw->ntfy.handle = wndw->dmac->sync.handle;
 			asyw->ntfy.offset = wndw->ntfy;
@@ -4150,7 +4140,6 @@ nv50_disp_atomic_commit(struct drm_device *dev,
 		}
 	}
 
-	drm_atomic_helper_swap_state(state, true);
 	drm_atomic_state_get(state);
 
 	if (nonblock)
@@ -4162,7 +4151,7 @@ nv50_disp_atomic_commit(struct drm_device *dev,
 		if (crtc->state->enable) {
 			if (!drm->have_disp_power_ref) {
 				drm->have_disp_power_ref = true;
-				return ret;
+				return 0;
 			}
 			active = true;
 			break;
@@ -4174,6 +4163,9 @@ nv50_disp_atomic_commit(struct drm_device *dev,
 		drm->have_disp_power_ref = false;
 	}
 
+err_cleanup:
+	if (ret)
+		drm_atomic_helper_cleanup_planes(dev, state);
 done:
 	pm_runtime_put_autosuspend(dev->dev);
 	return ret;
@@ -4200,18 +4192,19 @@ nv50_disp_outp_atomic_add(struct nv50_atom *atom, struct drm_encoder *encoder)
 
 static int
 nv50_disp_outp_atomic_check_clr(struct nv50_atom *atom,
-				struct drm_connector *connector)
+				struct drm_connector_state *old_connector_state)
 {
-	struct drm_encoder *encoder = connector->state->best_encoder;
-	struct drm_crtc_state *crtc_state;
+	struct drm_encoder *encoder = old_connector_state->best_encoder;
+	struct drm_crtc_state *old_crtc_state, *new_crtc_state;
 	struct drm_crtc *crtc;
 	struct nv50_outp_atom *outp;
 
-	if (!(crtc = connector->state->crtc))
+	if (!(crtc = old_connector_state->crtc))
 		return 0;
 
-	crtc_state = drm_atomic_get_existing_crtc_state(&atom->state, crtc);
-	if (crtc->state->active && drm_atomic_crtc_needs_modeset(crtc_state)) {
+	old_crtc_state = drm_atomic_get_old_crtc_state(&atom->state, crtc);
+	new_crtc_state = drm_atomic_get_new_crtc_state(&atom->state, crtc);
+	if (old_crtc_state->active && drm_atomic_crtc_needs_modeset(new_crtc_state)) {
 		outp = nv50_disp_outp_atomic_add(atom, encoder);
 		if (IS_ERR(outp))
 			return PTR_ERR(outp);
@@ -4232,15 +4225,15 @@ nv50_disp_outp_atomic_check_set(struct nv50_atom *atom,
 				struct drm_connector_state *connector_state)
 {
 	struct drm_encoder *encoder = connector_state->best_encoder;
-	struct drm_crtc_state *crtc_state;
+	struct drm_crtc_state *new_crtc_state;
 	struct drm_crtc *crtc;
 	struct nv50_outp_atom *outp;
 
 	if (!(crtc = connector_state->crtc))
 		return 0;
 
-	crtc_state = drm_atomic_get_existing_crtc_state(&atom->state, crtc);
-	if (crtc_state->active && drm_atomic_crtc_needs_modeset(crtc_state)) {
+	new_crtc_state = drm_atomic_get_new_crtc_state(&atom->state, crtc);
+	if (new_crtc_state->active && drm_atomic_crtc_needs_modeset(new_crtc_state)) {
 		outp = nv50_disp_outp_atomic_add(atom, encoder);
 		if (IS_ERR(outp))
 			return PTR_ERR(outp);
@@ -4256,7 +4249,7 @@ static int
 nv50_disp_atomic_check(struct drm_device *dev, struct drm_atomic_state *state)
 {
 	struct nv50_atom *atom = nv50_atom(state);
-	struct drm_connector_state *connector_state;
+	struct drm_connector_state *old_connector_state, *new_connector_state;
 	struct drm_connector *connector;
 	int ret, i;
 
@@ -4264,12 +4257,12 @@ nv50_disp_atomic_check(struct drm_device *dev, struct drm_atomic_state *state)
 	if (ret)
 		return ret;
 
-	for_each_connector_in_state(state, connector, connector_state, i) {
-		ret = nv50_disp_outp_atomic_check_clr(atom, connector);
+	for_each_oldnew_connector_in_state(state, connector, old_connector_state, new_connector_state, i) {
+		ret = nv50_disp_outp_atomic_check_clr(atom, old_connector_state);
 		if (ret)
 			return ret;
 
-		ret = nv50_disp_outp_atomic_check_set(atom, connector_state);
+		ret = nv50_disp_outp_atomic_check_set(atom, new_connector_state);
 		if (ret)
 			return ret;
 	}
@@ -4458,11 +4451,13 @@ nv50_display_create(struct drm_device *dev)
 
 	/* create crtc objects to represent the hw heads */
 	if (disp->disp->oclass >= GF110_DISP)
-		crtcs = nvif_rd32(&device->object, 0x022448);
+		crtcs = nvif_rd32(&device->object, 0x612004) & 0xf;
 	else
-		crtcs = 2;
+		crtcs = 0x3;
 
-	for (i = 0; i < crtcs; i++) {
+	for (i = 0; i < fls(crtcs); i++) {
+		if (!(crtcs & (1 << i)))
+			continue;
 		ret = nv50_head_create(dev, i);
 		if (ret)
 			goto out;
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index 7bdc7a5..e096a5d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -2043,6 +2043,7 @@ nv120_chipset = {
 	.mxm = nv50_mxm_new,
 	.pci = gk104_pci_new,
 	.pmu = gm107_pmu_new,
+	.therm = gm200_therm_new,
 	.secboot = gm200_secboot_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
@@ -2077,6 +2078,7 @@ nv124_chipset = {
 	.mxm = nv50_mxm_new,
 	.pci = gk104_pci_new,
 	.pmu = gm107_pmu_new,
+	.therm = gm200_therm_new,
 	.secboot = gm200_secboot_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
@@ -2111,6 +2113,7 @@ nv126_chipset = {
 	.mxm = nv50_mxm_new,
 	.pci = gk104_pci_new,
 	.pmu = gm107_pmu_new,
+	.therm = gm200_therm_new,
 	.secboot = gm200_secboot_new,
 	.timer = gk20a_timer_new,
 	.top = gk104_top_new,
@@ -2321,6 +2324,35 @@ nv137_chipset = {
 };
 
 static const struct nvkm_device_chip
+nv138_chipset = {
+	.name = "GP108",
+	.bar = gf100_bar_new,
+	.bios = nvkm_bios_new,
+	.bus = gf100_bus_new,
+	.devinit = gm200_devinit_new,
+	.fb = gp102_fb_new,
+	.fuse = gm107_fuse_new,
+	.gpio = gk104_gpio_new,
+	.i2c = gm200_i2c_new,
+	.ibus = gm200_ibus_new,
+	.imem = nv50_instmem_new,
+	.ltc = gp100_ltc_new,
+	.mc = gp100_mc_new,
+	.mmu = gf100_mmu_new,
+	.pci = gp100_pci_new,
+	.pmu = gp102_pmu_new,
+	.timer = gk20a_timer_new,
+	.top = gk104_top_new,
+	.ce[0] = gp102_ce_new,
+	.ce[1] = gp102_ce_new,
+	.ce[2] = gp102_ce_new,
+	.ce[3] = gp102_ce_new,
+	.disp = gp102_disp_new,
+	.dma = gf119_dma_new,
+	.fifo = gp100_fifo_new,
+};
+
+static const struct nvkm_device_chip
 nv13b_chipset = {
 	.name = "GP10B",
 	.bar = gk20a_bar_new,
@@ -2782,6 +2814,7 @@ nvkm_device_ctor(const struct nvkm_device_func *func,
 		case 0x134: device->chip = &nv134_chipset; break;
 		case 0x136: device->chip = &nv136_chipset; break;
 		case 0x137: device->chip = &nv137_chipset; break;
+		case 0x138: device->chip = &nv138_chipset; break;
 		case 0x13b: device->chip = &nv13b_chipset; break;
 		default:
 			nvdev_error(device, "unknown chipset (%08x)\n", boot0);
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/base.c
index 88582af..93a75e5 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/base.c
@@ -285,6 +285,10 @@ nvkm_disp_oneinit(struct nvkm_engine *engine)
 		case DCB_OUTPUT_DP:
 			ret = nvkm_dp_new(disp, i, &dcbE, &outp);
 			break;
+		case DCB_OUTPUT_WFD:
+			/* No support for WFD yet. */
+			ret = -ENODEV;
+			continue;
 		default:
 			nvkm_warn(subdev, "dcb %d type %d unknown\n",
 				  i, dcbE.type);
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/headgf119.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/headgf119.c
index b335527..9fd7ae3 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/headgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/headgf119.c
@@ -92,5 +92,8 @@ gf119_head = {
 int
 gf119_head_new(struct nvkm_disp *disp, int id)
 {
+	struct nvkm_device *device = disp->engine.subdev.device;
+	if (!(nvkm_rd32(device, 0x612004) & (0x00000001 << id)))
+		return 0;
 	return nvkm_head_new_(&gf119_head, disp, id);
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv31.c b/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv31.c
index 8a88952..7fea7d4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv31.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv31.c
@@ -124,6 +124,8 @@ nv31_mpeg_tile(struct nvkm_engine *engine, int i, struct nvkm_fb_tile *tile)
 static bool
 nv31_mpeg_mthd_dma(struct nvkm_device *device, u32 mthd, u32 data)
 {
+	struct nv31_mpeg *mpeg = nv31_mpeg(device->mpeg);
+	struct nvkm_subdev *subdev = &mpeg->engine.subdev;
 	u32 inst = data << 4;
 	u32 dma0 = nvkm_rd32(device, 0x700000 + inst);
 	u32 dma1 = nvkm_rd32(device, 0x700004 + inst);
@@ -132,8 +134,11 @@ nv31_mpeg_mthd_dma(struct nvkm_device *device, u32 mthd, u32 data)
 	u32 size = dma1 + 1;
 
 	/* only allow linear DMA objects */
-	if (!(dma0 & 0x00002000))
+	if (!(dma0 & 0x00002000)) {
+		nvkm_error(subdev, "inst %08x dma0 %08x dma1 %08x dma2 %08x\n",
+			   inst, dma0, dma1, dma2);
 		return false;
+	}
 
 	if (mthd == 0x0190) {
 		/* DMA_CMD */
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv40.c b/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv40.c
index 16de5bd..b5ec7c5 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv40.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/mpeg/nv40.c
@@ -31,6 +31,8 @@ bool
 nv40_mpeg_mthd_dma(struct nvkm_device *device, u32 mthd, u32 data)
 {
 	struct nvkm_instmem *imem = device->imem;
+	struct nv31_mpeg *mpeg = nv31_mpeg(device->mpeg);
+	struct nvkm_subdev *subdev = &mpeg->engine.subdev;
 	u32 inst = data << 4;
 	u32 dma0 = nvkm_instmem_rd32(imem, inst + 0);
 	u32 dma1 = nvkm_instmem_rd32(imem, inst + 4);
@@ -39,8 +41,11 @@ nv40_mpeg_mthd_dma(struct nvkm_device *device, u32 mthd, u32 data)
 	u32 size = dma1 + 1;
 
 	/* only allow linear DMA objects */
-	if (!(dma0 & 0x00002000))
+	if (!(dma0 & 0x00002000)) {
+		nvkm_error(subdev, "inst %08x dma0 %08x dma1 %08x dma2 %08x\n",
+			   inst, dma0, dma1, dma2);
 		return false;
+	}
 
 	if (mthd == 0x0190) {
 		/* DMA_CMD */
diff --git a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
index d45d794..77273b5 100644
--- a/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
+++ b/drivers/gpu/drm/nouveau/nvkm/falcon/msgqueue.c
@@ -251,7 +251,7 @@ cmd_write(struct nvkm_msgqueue *priv, struct nvkm_msgqueue_hdr *cmd,
 	  struct nvkm_msgqueue_queue *queue)
 {
 	const struct nvkm_subdev *subdev = priv->falcon->owner;
-	static unsigned long timeout = ~0;
+	static unsigned timeout = 2000;
 	unsigned long end_jiffies = jiffies + msecs_to_jiffies(timeout);
 	int ret = -EAGAIN;
 	bool commit = true;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c
index 6d8f212..676c167 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.c
@@ -24,6 +24,7 @@
 #include "gf100.h"
 
 #include <core/gpuobj.h>
+#include <core/option.h>
 #include <subdev/fb.h>
 #include <subdev/mmu.h>
 
@@ -59,6 +60,8 @@ gf100_bar_ctor_vm(struct gf100_bar *bar, struct gf100_bar_vm *bar_vm,
 		return ret;
 
 	bar_len = device->func->resource_size(device, bar_nr);
+	if (bar_nr == 3 && bar->bar2_halve)
+		bar_len >>= 1;
 
 	ret = nvkm_vm_new(device, 0, bar_len, 0, key, &vm);
 	if (ret)
@@ -129,6 +132,8 @@ gf100_bar_init(struct nvkm_bar *base)
 
 	if (bar->bar[0].mem) {
 		addr = nvkm_memory_addr(bar->bar[0].mem) >> 12;
+		if (bar->bar2_halve)
+			addr |= 0x40000000;
 		nvkm_wr32(device, 0x001714, 0x80000000 | addr);
 	}
 
@@ -161,6 +166,7 @@ gf100_bar_new_(const struct nvkm_bar_func *func, struct nvkm_device *device,
 	if (!(bar = kzalloc(sizeof(*bar), GFP_KERNEL)))
 		return -ENOMEM;
 	nvkm_bar_ctor(func, device, index, &bar->base);
+	bar->bar2_halve = nvkm_boolopt(device->cfgopt, "NvBar2Halve", false);
 	*pbar = &bar->base;
 	return 0;
 }
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.h b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.h
index f7dea69..20a5255 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gf100.h
@@ -11,6 +11,7 @@ struct gf100_bar_vm {
 
 struct gf100_bar {
 	struct nvkm_bar base;
+	bool bar2_halve;
 	struct gf100_bar_vm bar[2];
 };
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c
index 3841ad6..a239e73 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/fb/gf100.c
@@ -60,12 +60,12 @@ gf100_fb_oneinit(struct nvkm_fb *base)
 	size = min(size, 0x1000);
 
 	ret = nvkm_memory_new(device, NVKM_MEM_TARGET_INST, size, 0x1000,
-			      false, &fb->base.mmu_rd);
+			      true, &fb->base.mmu_rd);
 	if (ret)
 		return ret;
 
 	ret = nvkm_memory_new(device, NVKM_MEM_TARGET_INST, size, 0x1000,
-			      false, &fb->base.mmu_wr);
+			      true, &fb->base.mmu_wr);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gf100.c
index d2c4d60..f937664 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mc/gf100.c
@@ -27,6 +27,7 @@ static const struct nvkm_mc_map
 gf100_mc_reset[] = {
 	{ 0x00020000, NVKM_ENGINE_MSPDEC },
 	{ 0x00008000, NVKM_ENGINE_MSVLD },
+	{ 0x00002000, NVKM_SUBDEV_PMU, true },
 	{ 0x00001000, NVKM_ENGINE_GR },
 	{ 0x00000100, NVKM_ENGINE_FIFO },
 	{ 0x00000080, NVKM_ENGINE_CE1 },
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
index eb9b278..a4cb824 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
@@ -192,6 +192,10 @@ nvkm_pci_new_(const struct nvkm_pci_func *func, struct nvkm_device *device,
 		}
 	}
 
+#ifdef __BIG_ENDIAN
+	pci->msi = false;
+#endif
+
 	pci->msi = nvkm_boolopt(device->cfgopt, "NvMSI", pci->msi);
 	if (pci->msi && func->msi_rearm) {
 		pci->msi = pci_enable_msi(pci->pdev) == 0;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c
index 3306f9f..ce70a19 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c
@@ -75,7 +75,7 @@ nvkm_pmu_reset(struct nvkm_pmu *pmu)
 {
 	struct nvkm_device *device = pmu->subdev.device;
 
-	if (!(nvkm_rd32(device, 0x000200) & 0x00002000))
+	if (!pmu->func->enabled(pmu))
 		return 0;
 
 	/* Inhibit interrupts, and wait for idle. */
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf100.c
index 0e36d4c..0b45865 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf100.c
@@ -24,13 +24,30 @@
 #include "priv.h"
 #include "fuc/gf100.fuc3.h"
 
+#include <subdev/mc.h>
+
+void
+gf100_pmu_reset(struct nvkm_pmu *pmu)
+{
+	struct nvkm_device *device = pmu->subdev.device;
+	nvkm_mc_disable(device, NVKM_SUBDEV_PMU);
+	nvkm_mc_enable(device, NVKM_SUBDEV_PMU);
+}
+
+bool
+gf100_pmu_enabled(struct nvkm_pmu *pmu)
+{
+	return nvkm_mc_enabled(pmu->subdev.device, NVKM_SUBDEV_PMU);
+}
+
 static const struct nvkm_pmu_func
 gf100_pmu = {
 	.code.data = gf100_pmu_code,
 	.code.size = sizeof(gf100_pmu_code),
 	.data.data = gf100_pmu_data,
 	.data.size = sizeof(gf100_pmu_data),
-	.reset = gt215_pmu_reset,
+	.enabled = gf100_pmu_enabled,
+	.reset = gf100_pmu_reset,
 	.init = gt215_pmu_init,
 	.fini = gt215_pmu_fini,
 	.intr = gt215_pmu_intr,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf119.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf119.c
index 0e4ba42..3dfa79d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gf119.c
@@ -30,7 +30,8 @@ gf119_pmu = {
 	.code.size = sizeof(gf119_pmu_code),
 	.data.data = gf119_pmu_data,
 	.data.size = sizeof(gf119_pmu_data),
-	.reset = gt215_pmu_reset,
+	.enabled = gf100_pmu_enabled,
+	.reset = gf100_pmu_reset,
 	.init = gt215_pmu_init,
 	.fini = gt215_pmu_fini,
 	.intr = gt215_pmu_intr,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk104.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk104.c
index 2ad858d..8f7ec10 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk104.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk104.c
@@ -109,7 +109,8 @@ gk104_pmu = {
 	.code.size = sizeof(gk104_pmu_code),
 	.data.data = gk104_pmu_data,
 	.data.size = sizeof(gk104_pmu_data),
-	.reset = gt215_pmu_reset,
+	.enabled = gf100_pmu_enabled,
+	.reset = gf100_pmu_reset,
 	.init = gt215_pmu_init,
 	.fini = gt215_pmu_fini,
 	.intr = gt215_pmu_intr,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk110.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk110.c
index fc4b8ec..345741d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk110.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk110.c
@@ -88,7 +88,8 @@ gk110_pmu = {
 	.code.size = sizeof(gk110_pmu_code),
 	.data.data = gk110_pmu_data,
 	.data.size = sizeof(gk110_pmu_data),
-	.reset = gt215_pmu_reset,
+	.enabled = gf100_pmu_enabled,
+	.reset = gf100_pmu_reset,
 	.init = gt215_pmu_init,
 	.fini = gt215_pmu_fini,
 	.intr = gt215_pmu_intr,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk208.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk208.c
index e9a9127..e4acf78 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk208.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk208.c
@@ -30,7 +30,8 @@ gk208_pmu = {
 	.code.size = sizeof(gk208_pmu_code),
 	.data.data = gk208_pmu_data,
 	.data.size = sizeof(gk208_pmu_data),
-	.reset = gt215_pmu_reset,
+	.enabled = gf100_pmu_enabled,
+	.reset = gf100_pmu_reset,
 	.init = gt215_pmu_init,
 	.fini = gt215_pmu_fini,
 	.intr = gt215_pmu_intr,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk20a.c
index 978aae3..05e8185 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gk20a.c
@@ -196,9 +196,10 @@ gk20a_dvfs_data= {
 
 static const struct nvkm_pmu_func
 gk20a_pmu = {
+	.enabled = gf100_pmu_enabled,
 	.init = gk20a_pmu_init,
 	.fini = gk20a_pmu_fini,
-	.reset = gt215_pmu_reset,
+	.reset = gf100_pmu_reset,
 };
 
 int
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm107.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm107.c
index 9a248ed..459df1e 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm107.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm107.c
@@ -32,7 +32,8 @@ gm107_pmu = {
 	.code.size = sizeof(gm107_pmu_code),
 	.data.data = gm107_pmu_data,
 	.data.size = sizeof(gm107_pmu_data),
-	.reset = gt215_pmu_reset,
+	.enabled = gf100_pmu_enabled,
+	.reset = gf100_pmu_reset,
 	.init = gt215_pmu_init,
 	.fini = gt215_pmu_fini,
 	.intr = gt215_pmu_intr,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c
index 44bef22..31c8431 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gm20b.c
@@ -38,6 +38,7 @@ gm20b_pmu_recv(struct nvkm_pmu *pmu)
 
 static const struct nvkm_pmu_func
 gm20b_pmu = {
+	.enabled = gf100_pmu_enabled,
 	.intr = gt215_pmu_intr,
 	.recv = gm20b_pmu_recv,
 };
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp100.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp100.c
index 6c41c20c..e210cd6 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp100.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp100.c
@@ -25,7 +25,8 @@
 
 static const struct nvkm_pmu_func
 gp100_pmu = {
-	.reset = gt215_pmu_reset,
+	.enabled = gf100_pmu_enabled,
+	.reset = gf100_pmu_reset,
 };
 
 int
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.c
index f017352..98c7a2a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gp102.c
@@ -31,8 +31,15 @@ gp102_pmu_reset(struct nvkm_pmu *pmu)
 	nvkm_mask(device, 0x10a3c0, 0x00000001, 0x00000000);
 }
 
+static bool
+gp102_pmu_enabled(struct nvkm_pmu *pmu)
+{
+	return !(nvkm_rd32(pmu->subdev.device, 0x10a3c0) & 0x00000001);
+}
+
 static const struct nvkm_pmu_func
 gp102_pmu = {
+	.enabled = gp102_pmu_enabled,
 	.reset = gp102_pmu_reset,
 };
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gt215.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gt215.c
index 90d428b..e04216d 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gt215.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/gt215.c
@@ -180,13 +180,19 @@ gt215_pmu_fini(struct nvkm_pmu *pmu)
 	nvkm_wr32(pmu->subdev.device, 0x10a014, 0x00000060);
 }
 
-void
+static void
 gt215_pmu_reset(struct nvkm_pmu *pmu)
 {
 	struct nvkm_device *device = pmu->subdev.device;
-	nvkm_mask(device, 0x000200, 0x00002000, 0x00000000);
-	nvkm_mask(device, 0x000200, 0x00002000, 0x00002000);
-	nvkm_rd32(device, 0x000200);
+	nvkm_mask(device, 0x022210, 0x00000001, 0x00000000);
+	nvkm_mask(device, 0x022210, 0x00000001, 0x00000001);
+	nvkm_rd32(device, 0x022210);
+}
+
+static bool
+gt215_pmu_enabled(struct nvkm_pmu *pmu)
+{
+	return nvkm_rd32(pmu->subdev.device, 0x022210) & 0x00000001;
 }
 
 int
@@ -241,6 +247,7 @@ gt215_pmu = {
 	.code.size = sizeof(gt215_pmu_code),
 	.data.data = gt215_pmu_data,
 	.data.size = sizeof(gt215_pmu_data),
+	.enabled = gt215_pmu_enabled,
 	.reset = gt215_pmu_reset,
 	.init = gt215_pmu_init,
 	.fini = gt215_pmu_fini,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/priv.h
index 096cba0..a4c48a1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pmu/priv.h
@@ -20,6 +20,7 @@ struct nvkm_pmu_func {
 		u32  size;
 	} data;
 
+	bool (*enabled)(struct nvkm_pmu *);
 	void (*reset)(struct nvkm_pmu *);
 	int (*init)(struct nvkm_pmu *);
 	void (*fini)(struct nvkm_pmu *);
@@ -30,12 +31,14 @@ struct nvkm_pmu_func {
 	void (*pgob)(struct nvkm_pmu *, bool);
 };
 
-void gt215_pmu_reset(struct nvkm_pmu *);
 int gt215_pmu_init(struct nvkm_pmu *);
 void gt215_pmu_fini(struct nvkm_pmu *);
 void gt215_pmu_intr(struct nvkm_pmu *);
 void gt215_pmu_recv(struct nvkm_pmu *);
 int gt215_pmu_send(struct nvkm_pmu *, u32[2], u32, u32, u32, u32);
 
+bool gf100_pmu_enabled(struct nvkm_pmu *);
+void gf100_pmu_reset(struct nvkm_pmu *);
+
 void gk110_pmu_pgob(struct nvkm_pmu *, bool);
 #endif
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild
index 135758b..2bafcc1 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/Kbuild
@@ -11,3 +11,4 @@
 nvkm-y += nvkm/subdev/therm/gt215.o
 nvkm-y += nvkm/subdev/therm/gf119.o
 nvkm-y += nvkm/subdev/therm/gm107.o
+nvkm-y += nvkm/subdev/therm/gm200.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/g84.c b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/g84.c
index 86e8193..96f8da4 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/g84.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/g84.c
@@ -203,7 +203,7 @@ g84_therm_fini(struct nvkm_therm *therm)
 	nvkm_wr32(device, 0x1100, 0x10000); /* PBUS */
 }
 
-static void
+void
 g84_therm_init(struct nvkm_therm *therm)
 {
 	g84_sensor_setup(therm);
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm200.c b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm200.c
new file mode 100644
index 0000000..73dc780
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/gm200.c
@@ -0,0 +1,39 @@
+/*
+ * Copyright 2017 Karol Herbst
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * Authors: Karol Herbst
+ */
+#include "priv.h"
+
+static const struct nvkm_therm_func
+gm200_therm = {
+	.init = g84_therm_init,
+	.fini = g84_therm_fini,
+	.temp_get = g84_temp_get,
+	.program_alarms = nvkm_therm_program_alarms_polling,
+};
+
+int
+gm200_therm_new(struct nvkm_device *device, int index,
+		struct nvkm_therm **ptherm)
+{
+	return nvkm_therm_new_(&gm200_therm, device, index, ptherm);
+}
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h
index 235a5d8..1f46e37 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/priv.h
@@ -111,6 +111,7 @@ void g84_therm_fini(struct nvkm_therm *);
 
 int gt215_therm_fan_sense(struct nvkm_therm *);
 
+void g84_therm_init(struct nvkm_therm *);
 void gf119_therm_init(struct nvkm_therm *);
 
 int nvkm_fanpwm_create(struct nvkm_therm *, struct dcb_gpio_func *);
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
index e93b241..ddb2b2c 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/therm/temp.c
@@ -83,7 +83,7 @@ nvkm_therm_sensor_event(struct nvkm_therm *therm, enum nvkm_therm_thrs thrs,
 {
 	struct nvkm_subdev *subdev = &therm->subdev;
 	bool active;
-	const char *thresolds[] = {
+	static const char * const thresholds[] = {
 		"fanboost", "downclock", "critical", "shutdown"
 	};
 	int temperature = therm->func->temp_get(therm);
@@ -94,10 +94,10 @@ nvkm_therm_sensor_event(struct nvkm_therm *therm, enum nvkm_therm_thrs thrs,
 	if (dir == NVKM_THERM_THRS_FALLING)
 		nvkm_info(subdev,
 			  "temperature (%i C) went below the '%s' threshold\n",
-			  temperature, thresolds[thrs]);
+			  temperature, thresholds[thrs]);
 	else
 		nvkm_info(subdev, "temperature (%i C) hit the '%s' threshold\n",
-			  temperature, thresolds[thrs]);
+			  temperature, thresholds[thrs]);
 
 	active = (dir == NVKM_THERM_THRS_RISING);
 	switch (thrs) {
diff --git a/drivers/gpu/drm/omapdrm/displays/connector-analog-tv.c b/drivers/gpu/drm/omapdrm/displays/connector-analog-tv.c
index e1fa143..542a765 100644
--- a/drivers/gpu/drm/omapdrm/displays/connector-analog-tv.c
+++ b/drivers/gpu/drm/omapdrm/displays/connector-analog-tv.c
@@ -198,6 +198,9 @@ static int tvc_probe(struct platform_device *pdev)
 	struct omap_dss_device *dssdev;
 	int r;
 
+	if (!pdev->dev.of_node)
+		return -ENODEV;
+
 	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
 	if (!ddata)
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/omapdrm/displays/connector-hdmi.c b/drivers/gpu/drm/omapdrm/displays/connector-hdmi.c
index 79cb69f..d9d25df 100644
--- a/drivers/gpu/drm/omapdrm/displays/connector-hdmi.c
+++ b/drivers/gpu/drm/omapdrm/displays/connector-hdmi.c
@@ -15,6 +15,7 @@
 #include <linux/platform_device.h>
 #include <linux/of.h>
 #include <linux/of_gpio.h>
+#include <linux/mutex.h>
 
 #include <drm/drm_edid.h>
 
@@ -37,6 +38,10 @@ static const struct videomode hdmic_default_vm = {
 struct panel_drv_data {
 	struct omap_dss_device dssdev;
 	struct omap_dss_device *in;
+	void (*hpd_cb)(void *cb_data, enum drm_connector_status status);
+	void *hpd_cb_data;
+	bool hpd_enabled;
+	struct mutex hpd_lock;
 
 	struct device *dev;
 
@@ -167,6 +172,70 @@ static bool hdmic_detect(struct omap_dss_device *dssdev)
 		return in->ops.hdmi->detect(in);
 }
 
+static int hdmic_register_hpd_cb(struct omap_dss_device *dssdev,
+				 void (*cb)(void *cb_data,
+					    enum drm_connector_status status),
+				 void *cb_data)
+{
+	struct panel_drv_data *ddata = to_panel_data(dssdev);
+	struct omap_dss_device *in = ddata->in;
+
+	if (gpio_is_valid(ddata->hpd_gpio)) {
+		mutex_lock(&ddata->hpd_lock);
+		ddata->hpd_cb = cb;
+		ddata->hpd_cb_data = cb_data;
+		mutex_unlock(&ddata->hpd_lock);
+		return 0;
+	} else if (in->ops.hdmi->register_hpd_cb) {
+		return in->ops.hdmi->register_hpd_cb(in, cb, cb_data);
+	}
+
+	return -ENOTSUPP;
+}
+
+static void hdmic_unregister_hpd_cb(struct omap_dss_device *dssdev)
+{
+	struct panel_drv_data *ddata = to_panel_data(dssdev);
+	struct omap_dss_device *in = ddata->in;
+
+	if (gpio_is_valid(ddata->hpd_gpio)) {
+		mutex_lock(&ddata->hpd_lock);
+		ddata->hpd_cb = NULL;
+		ddata->hpd_cb_data = NULL;
+		mutex_unlock(&ddata->hpd_lock);
+	} else if (in->ops.hdmi->unregister_hpd_cb) {
+		in->ops.hdmi->unregister_hpd_cb(in);
+	}
+}
+
+static void hdmic_enable_hpd(struct omap_dss_device *dssdev)
+{
+	struct panel_drv_data *ddata = to_panel_data(dssdev);
+	struct omap_dss_device *in = ddata->in;
+
+	if (gpio_is_valid(ddata->hpd_gpio)) {
+		mutex_lock(&ddata->hpd_lock);
+		ddata->hpd_enabled = true;
+		mutex_unlock(&ddata->hpd_lock);
+	} else if (in->ops.hdmi->enable_hpd) {
+		in->ops.hdmi->enable_hpd(in);
+	}
+}
+
+static void hdmic_disable_hpd(struct omap_dss_device *dssdev)
+{
+	struct panel_drv_data *ddata = to_panel_data(dssdev);
+	struct omap_dss_device *in = ddata->in;
+
+	if (gpio_is_valid(ddata->hpd_gpio)) {
+		mutex_lock(&ddata->hpd_lock);
+		ddata->hpd_enabled = false;
+		mutex_unlock(&ddata->hpd_lock);
+	} else if (in->ops.hdmi->disable_hpd) {
+		in->ops.hdmi->disable_hpd(in);
+	}
+}
+
 static int hdmic_set_hdmi_mode(struct omap_dss_device *dssdev, bool hdmi_mode)
 {
 	struct panel_drv_data *ddata = to_panel_data(dssdev);
@@ -197,10 +266,34 @@ static struct omap_dss_driver hdmic_driver = {
 
 	.read_edid		= hdmic_read_edid,
 	.detect			= hdmic_detect,
+	.register_hpd_cb	= hdmic_register_hpd_cb,
+	.unregister_hpd_cb	= hdmic_unregister_hpd_cb,
+	.enable_hpd		= hdmic_enable_hpd,
+	.disable_hpd		= hdmic_disable_hpd,
 	.set_hdmi_mode		= hdmic_set_hdmi_mode,
 	.set_hdmi_infoframe	= hdmic_set_infoframe,
 };
 
+static irqreturn_t hdmic_hpd_isr(int irq, void *data)
+{
+	struct panel_drv_data *ddata = data;
+
+	mutex_lock(&ddata->hpd_lock);
+	if (ddata->hpd_enabled && ddata->hpd_cb) {
+		enum drm_connector_status status;
+
+		if (hdmic_detect(&ddata->dssdev))
+			status = connector_status_connected;
+		else
+			status = connector_status_disconnected;
+
+		ddata->hpd_cb(ddata->hpd_cb_data, status);
+	}
+	mutex_unlock(&ddata->hpd_lock);
+
+	return IRQ_HANDLED;
+}
+
 static int hdmic_probe_of(struct platform_device *pdev)
 {
 	struct panel_drv_data *ddata = platform_get_drvdata(pdev);
@@ -246,11 +339,22 @@ static int hdmic_probe(struct platform_device *pdev)
 	if (r)
 		return r;
 
+	mutex_init(&ddata->hpd_lock);
+
 	if (gpio_is_valid(ddata->hpd_gpio)) {
 		r = devm_gpio_request_one(&pdev->dev, ddata->hpd_gpio,
 				GPIOF_DIR_IN, "hdmi_hpd");
 		if (r)
 			goto err_reg;
+
+		r = devm_request_threaded_irq(&pdev->dev,
+				gpio_to_irq(ddata->hpd_gpio),
+				NULL, hdmic_hpd_isr,
+				IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING |
+				IRQF_ONESHOT,
+				"hdmic hpd", ddata);
+		if (r)
+			goto err_reg;
 	}
 
 	ddata->vm = hdmic_default_vm;
diff --git a/drivers/gpu/drm/omapdrm/displays/encoder-tpd12s015.c b/drivers/gpu/drm/omapdrm/displays/encoder-tpd12s015.c
index 58276a4..a9e9d66 100644
--- a/drivers/gpu/drm/omapdrm/displays/encoder-tpd12s015.c
+++ b/drivers/gpu/drm/omapdrm/displays/encoder-tpd12s015.c
@@ -15,12 +15,17 @@
 #include <linux/slab.h>
 #include <linux/platform_device.h>
 #include <linux/gpio/consumer.h>
+#include <linux/mutex.h>
 
 #include "../dss/omapdss.h"
 
 struct panel_drv_data {
 	struct omap_dss_device dssdev;
 	struct omap_dss_device *in;
+	void (*hpd_cb)(void *cb_data, enum drm_connector_status status);
+	void *hpd_cb_data;
+	bool hpd_enabled;
+	struct mutex hpd_lock;
 
 	struct gpio_desc *ct_cp_hpd_gpio;
 	struct gpio_desc *ls_oe_gpio;
@@ -162,6 +167,49 @@ static bool tpd_detect(struct omap_dss_device *dssdev)
 	return gpiod_get_value_cansleep(ddata->hpd_gpio);
 }
 
+static int tpd_register_hpd_cb(struct omap_dss_device *dssdev,
+			       void (*cb)(void *cb_data,
+					  enum drm_connector_status status),
+			       void *cb_data)
+{
+	struct panel_drv_data *ddata = to_panel_data(dssdev);
+
+	mutex_lock(&ddata->hpd_lock);
+	ddata->hpd_cb = cb;
+	ddata->hpd_cb_data = cb_data;
+	mutex_unlock(&ddata->hpd_lock);
+
+	return 0;
+}
+
+static void tpd_unregister_hpd_cb(struct omap_dss_device *dssdev)
+{
+	struct panel_drv_data *ddata = to_panel_data(dssdev);
+
+	mutex_lock(&ddata->hpd_lock);
+	ddata->hpd_cb = NULL;
+	ddata->hpd_cb_data = NULL;
+	mutex_unlock(&ddata->hpd_lock);
+}
+
+static void tpd_enable_hpd(struct omap_dss_device *dssdev)
+{
+	struct panel_drv_data *ddata = to_panel_data(dssdev);
+
+	mutex_lock(&ddata->hpd_lock);
+	ddata->hpd_enabled = true;
+	mutex_unlock(&ddata->hpd_lock);
+}
+
+static void tpd_disable_hpd(struct omap_dss_device *dssdev)
+{
+	struct panel_drv_data *ddata = to_panel_data(dssdev);
+
+	mutex_lock(&ddata->hpd_lock);
+	ddata->hpd_enabled = false;
+	mutex_unlock(&ddata->hpd_lock);
+}
+
 static int tpd_set_infoframe(struct omap_dss_device *dssdev,
 		const struct hdmi_avi_infoframe *avi)
 {
@@ -193,10 +241,34 @@ static const struct omapdss_hdmi_ops tpd_hdmi_ops = {
 
 	.read_edid		= tpd_read_edid,
 	.detect			= tpd_detect,
+	.register_hpd_cb	= tpd_register_hpd_cb,
+	.unregister_hpd_cb	= tpd_unregister_hpd_cb,
+	.enable_hpd		= tpd_enable_hpd,
+	.disable_hpd		= tpd_disable_hpd,
 	.set_infoframe		= tpd_set_infoframe,
 	.set_hdmi_mode		= tpd_set_hdmi_mode,
 };
 
+static irqreturn_t tpd_hpd_isr(int irq, void *data)
+{
+	struct panel_drv_data *ddata = data;
+
+	mutex_lock(&ddata->hpd_lock);
+	if (ddata->hpd_enabled && ddata->hpd_cb) {
+		enum drm_connector_status status;
+
+		if (tpd_detect(&ddata->dssdev))
+			status = connector_status_connected;
+		else
+			status = connector_status_disconnected;
+
+		ddata->hpd_cb(ddata->hpd_cb_data, status);
+	}
+	mutex_unlock(&ddata->hpd_lock);
+
+	return IRQ_HANDLED;
+}
+
 static int tpd_probe_of(struct platform_device *pdev)
 {
 	struct panel_drv_data *ddata = platform_get_drvdata(pdev);
@@ -261,6 +333,15 @@ static int tpd_probe(struct platform_device *pdev)
 
 	ddata->hpd_gpio = gpio;
 
+	mutex_init(&ddata->hpd_lock);
+
+	r = devm_request_threaded_irq(&pdev->dev, gpiod_to_irq(ddata->hpd_gpio),
+		NULL, tpd_hpd_isr,
+		IRQF_TRIGGER_RISING | IRQF_TRIGGER_FALLING | IRQF_ONESHOT,
+		"tpd12s015 hpd", ddata);
+	if (r)
+		goto err_gpio;
+
 	dssdev = &ddata->dssdev;
 	dssdev->ops.hdmi = &tpd_hdmi_ops;
 	dssdev->dev = &pdev->dev;
diff --git a/drivers/gpu/drm/omapdrm/displays/panel-dpi.c b/drivers/gpu/drm/omapdrm/displays/panel-dpi.c
index 6468a76..e065f7e 100644
--- a/drivers/gpu/drm/omapdrm/displays/panel-dpi.c
+++ b/drivers/gpu/drm/omapdrm/displays/panel-dpi.c
@@ -231,6 +231,9 @@ static int panel_dpi_probe(struct platform_device *pdev)
 	struct omap_dss_device *dssdev;
 	int r;
 
+	if (!pdev->dev.of_node)
+		return -ENODEV;
+
 	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
 	if (ddata == NULL)
 		return -ENOMEM;
diff --git a/drivers/gpu/drm/omapdrm/displays/panel-dsi-cm.c b/drivers/gpu/drm/omapdrm/displays/panel-dsi-cm.c
index 76787a7..92c556a 100644
--- a/drivers/gpu/drm/omapdrm/displays/panel-dsi-cm.c
+++ b/drivers/gpu/drm/omapdrm/displays/panel-dsi-cm.c
@@ -554,7 +554,7 @@ static struct attribute *dsicm_attrs[] = {
 	NULL,
 };
 
-static struct attribute_group dsicm_attr_group = {
+static const struct attribute_group dsicm_attr_group = {
 	.attrs = dsicm_attrs,
 };
 
diff --git a/drivers/gpu/drm/omapdrm/displays/panel-lgphilips-lb035q02.c b/drivers/gpu/drm/omapdrm/displays/panel-lgphilips-lb035q02.c
index c90474a..74d1396 100644
--- a/drivers/gpu/drm/omapdrm/displays/panel-lgphilips-lb035q02.c
+++ b/drivers/gpu/drm/omapdrm/displays/panel-lgphilips-lb035q02.c
@@ -19,7 +19,7 @@
 
 #include "../dss/omapdss.h"
 
-static struct videomode lb035q02_vm = {
+static const struct videomode lb035q02_vm = {
 	.hactive = 320,
 	.vactive = 240,
 
diff --git a/drivers/gpu/drm/omapdrm/displays/panel-sony-acx565akm.c b/drivers/gpu/drm/omapdrm/displays/panel-sony-acx565akm.c
index 346aefd..8e5bff4 100644
--- a/drivers/gpu/drm/omapdrm/displays/panel-sony-acx565akm.c
+++ b/drivers/gpu/drm/omapdrm/displays/panel-sony-acx565akm.c
@@ -503,7 +503,7 @@ static struct attribute *bldev_attrs[] = {
 	NULL,
 };
 
-static struct attribute_group bldev_attr_group = {
+static const struct attribute_group bldev_attr_group = {
 	.attrs = bldev_attrs,
 };
 
@@ -720,6 +720,9 @@ static int acx565akm_probe(struct spi_device *spi)
 
 	dev_dbg(&spi->dev, "%s\n", __func__);
 
+	if (!spi->dev.of_node)
+		return -ENODEV;
+
 	spi->mode = SPI_MODE_3;
 
 	ddata = devm_kzalloc(&spi->dev, sizeof(*ddata), GFP_KERNEL);
diff --git a/drivers/gpu/drm/omapdrm/displays/panel-tpo-td028ttec1.c b/drivers/gpu/drm/omapdrm/displays/panel-tpo-td028ttec1.c
index cbf4c67..0a38a0e 100644
--- a/drivers/gpu/drm/omapdrm/displays/panel-tpo-td028ttec1.c
+++ b/drivers/gpu/drm/omapdrm/displays/panel-tpo-td028ttec1.c
@@ -40,7 +40,7 @@ struct panel_drv_data {
 	struct spi_device *spi_dev;
 };
 
-static struct videomode td028ttec1_panel_vm = {
+static const struct videomode td028ttec1_panel_vm = {
 	.hactive	= 480,
 	.vactive	= 640,
 	.pixelclock	= 22153000,
diff --git a/drivers/gpu/drm/omapdrm/displays/panel-tpo-td043mtea1.c b/drivers/gpu/drm/omapdrm/displays/panel-tpo-td043mtea1.c
index 20c6d8f..ac4a6d4 100644
--- a/drivers/gpu/drm/omapdrm/displays/panel-tpo-td043mtea1.c
+++ b/drivers/gpu/drm/omapdrm/displays/panel-tpo-td043mtea1.c
@@ -282,7 +282,7 @@ static struct attribute *tpo_td043_attrs[] = {
 	NULL,
 };
 
-static struct attribute_group tpo_td043_attr_group = {
+static const struct attribute_group tpo_td043_attr_group = {
 	.attrs = tpo_td043_attrs,
 };
 
diff --git a/drivers/gpu/drm/omapdrm/dss/Makefile b/drivers/gpu/drm/omapdrm/dss/Makefile
index 688195e..142ce5a 100644
--- a/drivers/gpu/drm/omapdrm/dss/Makefile
+++ b/drivers/gpu/drm/omapdrm/dss/Makefile
@@ -5,7 +5,7 @@
 
 obj-$(CONFIG_OMAP2_DSS) += omapdss.o
 # Core DSS files
-omapdss-y := core.o dss.o dss_features.o dispc.o dispc_coefs.o \
+omapdss-y := core.o dss.o dispc.o dispc_coefs.o \
 	pll.o video-pll.o
 omapdss-$(CONFIG_OMAP2_DSS_DPI) += dpi.o
 omapdss-$(CONFIG_OMAP2_DSS_VENC) += venc.o
diff --git a/drivers/gpu/drm/omapdrm/dss/core.c b/drivers/gpu/drm/omapdrm/dss/core.c
index bdce4bf..197ddbc 100644
--- a/drivers/gpu/drm/omapdrm/dss/core.c
+++ b/drivers/gpu/drm/omapdrm/dss/core.c
@@ -24,182 +24,10 @@
 
 #include <linux/kernel.h>
 #include <linux/module.h>
-#include <linux/clk.h>
-#include <linux/err.h>
 #include <linux/platform_device.h>
-#include <linux/seq_file.h>
-#include <linux/debugfs.h>
-#include <linux/io.h>
-#include <linux/device.h>
-#include <linux/regulator/consumer.h>
-#include <linux/suspend.h>
-#include <linux/slab.h>
 
 #include "omapdss.h"
 #include "dss.h"
-#include "dss_features.h"
-
-static struct {
-	struct platform_device *pdev;
-} core;
-
-enum omapdss_version omapdss_get_version(void)
-{
-	struct omap_dss_board_info *pdata = core.pdev->dev.platform_data;
-	return pdata->version;
-}
-EXPORT_SYMBOL(omapdss_get_version);
-
-int dss_dsi_enable_pads(int dsi_id, unsigned lane_mask)
-{
-	struct omap_dss_board_info *board_data = core.pdev->dev.platform_data;
-
-	if (!board_data->dsi_enable_pads)
-		return -ENOENT;
-
-	return board_data->dsi_enable_pads(dsi_id, lane_mask);
-}
-
-void dss_dsi_disable_pads(int dsi_id, unsigned lane_mask)
-{
-	struct omap_dss_board_info *board_data = core.pdev->dev.platform_data;
-
-	if (!board_data->dsi_disable_pads)
-		return;
-
-	return board_data->dsi_disable_pads(dsi_id, lane_mask);
-}
-
-int dss_set_min_bus_tput(struct device *dev, unsigned long tput)
-{
-	struct omap_dss_board_info *pdata = core.pdev->dev.platform_data;
-
-	if (pdata->set_min_bus_tput)
-		return pdata->set_min_bus_tput(dev, tput);
-	else
-		return 0;
-}
-
-#if defined(CONFIG_OMAP2_DSS_DEBUGFS)
-static int dss_debug_show(struct seq_file *s, void *unused)
-{
-	void (*func)(struct seq_file *) = s->private;
-	func(s);
-	return 0;
-}
-
-static int dss_debug_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, dss_debug_show, inode->i_private);
-}
-
-static const struct file_operations dss_debug_fops = {
-	.open           = dss_debug_open,
-	.read           = seq_read,
-	.llseek         = seq_lseek,
-	.release        = single_release,
-};
-
-static struct dentry *dss_debugfs_dir;
-
-static int dss_initialize_debugfs(void)
-{
-	dss_debugfs_dir = debugfs_create_dir("omapdss", NULL);
-	if (IS_ERR(dss_debugfs_dir)) {
-		int err = PTR_ERR(dss_debugfs_dir);
-		dss_debugfs_dir = NULL;
-		return err;
-	}
-
-	debugfs_create_file("clk", S_IRUGO, dss_debugfs_dir,
-			&dss_debug_dump_clocks, &dss_debug_fops);
-
-	return 0;
-}
-
-static void dss_uninitialize_debugfs(void)
-{
-	if (dss_debugfs_dir)
-		debugfs_remove_recursive(dss_debugfs_dir);
-}
-
-int dss_debugfs_create_file(const char *name, void (*write)(struct seq_file *))
-{
-	struct dentry *d;
-
-	d = debugfs_create_file(name, S_IRUGO, dss_debugfs_dir,
-			write, &dss_debug_fops);
-
-	return PTR_ERR_OR_ZERO(d);
-}
-#else /* CONFIG_OMAP2_DSS_DEBUGFS */
-static inline int dss_initialize_debugfs(void)
-{
-	return 0;
-}
-static inline void dss_uninitialize_debugfs(void)
-{
-}
-int dss_debugfs_create_file(const char *name, void (*write)(struct seq_file *))
-{
-	return 0;
-}
-#endif /* CONFIG_OMAP2_DSS_DEBUGFS */
-
-/* PLATFORM DEVICE */
-
-static void dss_disable_all_devices(void)
-{
-	struct omap_dss_device *dssdev = NULL;
-
-	for_each_dss_dev(dssdev) {
-		if (!dssdev->driver)
-			continue;
-
-		if (dssdev->state == OMAP_DSS_DISPLAY_ACTIVE)
-			dssdev->driver->disable(dssdev);
-	}
-}
-
-static int __init omap_dss_probe(struct platform_device *pdev)
-{
-	int r;
-
-	core.pdev = pdev;
-
-	dss_features_init(omapdss_get_version());
-
-	r = dss_initialize_debugfs();
-	if (r)
-		goto err_debugfs;
-
-	return 0;
-
-err_debugfs:
-
-	return r;
-}
-
-static int omap_dss_remove(struct platform_device *pdev)
-{
-	dss_uninitialize_debugfs();
-
-	return 0;
-}
-
-static void omap_dss_shutdown(struct platform_device *pdev)
-{
-	DSSDBG("shutdown\n");
-	dss_disable_all_devices();
-}
-
-static struct platform_driver omap_dss_driver = {
-	.remove         = omap_dss_remove,
-	.shutdown	= omap_dss_shutdown,
-	.driver         = {
-		.name   = "omapdss",
-	},
-};
 
 /* INIT */
 static int (*dss_output_drv_reg_funcs[])(void) __initdata = {
@@ -236,21 +64,25 @@ static void (*dss_output_drv_unreg_funcs[])(void) = {
 	dss_uninit_platform_driver,
 };
 
+static struct platform_device *omap_drm_device;
+
 static int __init omap_dss_init(void)
 {
 	int r;
 	int i;
 
-	r = platform_driver_probe(&omap_dss_driver, omap_dss_probe);
-	if (r)
-		return r;
-
 	for (i = 0; i < ARRAY_SIZE(dss_output_drv_reg_funcs); ++i) {
 		r = dss_output_drv_reg_funcs[i]();
 		if (r)
 			goto err_reg;
 	}
 
+	omap_drm_device = platform_device_register_simple("omapdrm", 0, NULL, 0);
+	if (IS_ERR(omap_drm_device)) {
+		r = PTR_ERR(omap_drm_device);
+		goto err_reg;
+	}
+
 	return 0;
 
 err_reg:
@@ -259,8 +91,6 @@ static int __init omap_dss_init(void)
 			++i)
 		dss_output_drv_unreg_funcs[i]();
 
-	platform_driver_unregister(&omap_dss_driver);
-
 	return r;
 }
 
@@ -268,10 +98,10 @@ static void __exit omap_dss_exit(void)
 {
 	int i;
 
+	platform_device_unregister(omap_drm_device);
+
 	for (i = 0; i < ARRAY_SIZE(dss_output_drv_unreg_funcs); ++i)
 		dss_output_drv_unreg_funcs[i]();
-
-	platform_driver_unregister(&omap_dss_driver);
 }
 
 module_init(omap_dss_init);
diff --git a/drivers/gpu/drm/omapdrm/dss/dispc.c b/drivers/gpu/drm/omapdrm/dss/dispc.c
index fd7504b..0f4fdb2 100644
--- a/drivers/gpu/drm/omapdrm/dss/dispc.c
+++ b/drivers/gpu/drm/omapdrm/dss/dispc.c
@@ -39,13 +39,14 @@
 #include <linux/mfd/syscon.h>
 #include <linux/regmap.h>
 #include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/component.h>
+#include <linux/sys_soc.h>
 #include <drm/drm_fourcc.h>
 #include <drm/drm_blend.h>
 
 #include "omapdss.h"
 #include "dss.h"
-#include "dss_features.h"
 #include "dispc.h"
 
 /* DISPC */
@@ -63,6 +64,33 @@ enum omap_burst_size {
 #define REG_FLD_MOD(idx, val, start, end)				\
 	dispc_write_reg(idx, FLD_MOD(dispc_read_reg(idx), val, start, end))
 
+/* DISPC has feature id */
+enum dispc_feature_id {
+	FEAT_LCDENABLEPOL,
+	FEAT_LCDENABLESIGNAL,
+	FEAT_PCKFREEENABLE,
+	FEAT_FUNCGATED,
+	FEAT_MGR_LCD2,
+	FEAT_MGR_LCD3,
+	FEAT_LINEBUFFERSPLIT,
+	FEAT_ROWREPEATENABLE,
+	FEAT_RESIZECONF,
+	/* Independent core clk divider */
+	FEAT_CORE_CLK_DIV,
+	FEAT_HANDLE_UV_SEPARATE,
+	FEAT_ATTR2,
+	FEAT_CPR,
+	FEAT_PRELOAD,
+	FEAT_FIR_COEF_V,
+	FEAT_ALPHA_FIXED_ZORDER,
+	FEAT_ALPHA_FREE_ZORDER,
+	FEAT_FIFO_MERGE,
+	/* An unknown HW bug causing the normal FIFO thresholds not to work */
+	FEAT_OMAP3_DSI_FIFO_BUG,
+	FEAT_BURST_2D,
+	FEAT_MFLAG,
+};
+
 struct dispc_features {
 	u8 sw_start;
 	u8 fp_start;
@@ -76,6 +104,9 @@ struct dispc_features {
 	u16 mgr_height_max;
 	unsigned long max_lcd_pclk;
 	unsigned long max_tv_pclk;
+	unsigned int max_downscale;
+	unsigned int max_line_width;
+	unsigned int min_pcd;
 	int (*calc_scaling) (unsigned long pclk, unsigned long lclk,
 		const struct videomode *vm,
 		u16 width, u16 height, u16 out_width, u16 out_height,
@@ -86,6 +117,16 @@ struct dispc_features {
 		u16 width, u16 height, u16 out_width, u16 out_height,
 		bool mem_to_mem);
 	u8 num_fifos;
+	const enum dispc_feature_id *features;
+	unsigned int num_features;
+	const struct dss_reg_field *reg_fields;
+	const unsigned int num_reg_fields;
+	const enum omap_overlay_caps *overlay_caps;
+	const u32 **supported_color_modes;
+	unsigned int num_mgrs;
+	unsigned int num_ovls;
+	unsigned int buffer_size_unit;
+	unsigned int burst_size_unit;
 
 	/* swap GFX & WB fifos */
 	bool gfx_fifo_workaround:1;
@@ -180,6 +221,17 @@ enum mgr_reg_fields {
 	DISPC_MGR_FLD_NUM,
 };
 
+/* DISPC register field id */
+enum dispc_feat_reg_field {
+	FEAT_REG_FIRHINC,
+	FEAT_REG_FIRVINC,
+	FEAT_REG_FIFOHIGHTHRESHOLD,
+	FEAT_REG_FIFOLOWTHRESHOLD,
+	FEAT_REG_FIFOSIZE,
+	FEAT_REG_HORIZONTALACCU,
+	FEAT_REG_VERTICALACCU,
+};
+
 struct dispc_reg_field {
 	u16 reg;
 	u8 high;
@@ -343,6 +395,38 @@ static void mgr_fld_write(enum omap_channel channel,
 		spin_unlock_irqrestore(&dispc.control_lock, flags);
 }
 
+static int dispc_get_num_ovls(void)
+{
+	return dispc.feat->num_ovls;
+}
+
+static int dispc_get_num_mgrs(void)
+{
+	return dispc.feat->num_mgrs;
+}
+
+static void dispc_get_reg_field(enum dispc_feat_reg_field id,
+				u8 *start, u8 *end)
+{
+	if (id >= dispc.feat->num_reg_fields)
+		BUG();
+
+	*start = dispc.feat->reg_fields[id].start;
+	*end = dispc.feat->reg_fields[id].end;
+}
+
+static bool dispc_has_feature(enum dispc_feature_id id)
+{
+	unsigned int i;
+
+	for (i = 0; i < dispc.feat->num_features; i++) {
+		if (dispc.feat->features[i] == id)
+			return true;
+	}
+
+	return false;
+}
+
 #define SR(reg) \
 	dispc.ctx[DISPC_##reg / sizeof(u32)] = dispc_read_reg(DISPC_##reg)
 #define RR(reg) \
@@ -358,19 +442,19 @@ static void dispc_save_context(void)
 	SR(CONTROL);
 	SR(CONFIG);
 	SR(LINE_NUMBER);
-	if (dss_has_feature(FEAT_ALPHA_FIXED_ZORDER) ||
-			dss_has_feature(FEAT_ALPHA_FREE_ZORDER))
+	if (dispc_has_feature(FEAT_ALPHA_FIXED_ZORDER) ||
+			dispc_has_feature(FEAT_ALPHA_FREE_ZORDER))
 		SR(GLOBAL_ALPHA);
-	if (dss_has_feature(FEAT_MGR_LCD2)) {
+	if (dispc_has_feature(FEAT_MGR_LCD2)) {
 		SR(CONTROL2);
 		SR(CONFIG2);
 	}
-	if (dss_has_feature(FEAT_MGR_LCD3)) {
+	if (dispc_has_feature(FEAT_MGR_LCD3)) {
 		SR(CONTROL3);
 		SR(CONFIG3);
 	}
 
-	for (i = 0; i < dss_feat_get_num_mgrs(); i++) {
+	for (i = 0; i < dispc_get_num_mgrs(); i++) {
 		SR(DEFAULT_COLOR(i));
 		SR(TRANS_COLOR(i));
 		SR(SIZE_MGR(i));
@@ -385,14 +469,14 @@ static void dispc_save_context(void)
 		SR(DATA_CYCLE2(i));
 		SR(DATA_CYCLE3(i));
 
-		if (dss_has_feature(FEAT_CPR)) {
+		if (dispc_has_feature(FEAT_CPR)) {
 			SR(CPR_COEF_R(i));
 			SR(CPR_COEF_G(i));
 			SR(CPR_COEF_B(i));
 		}
 	}
 
-	for (i = 0; i < dss_feat_get_num_ovls(); i++) {
+	for (i = 0; i < dispc_get_num_ovls(); i++) {
 		SR(OVL_BA0(i));
 		SR(OVL_BA1(i));
 		SR(OVL_POSITION(i));
@@ -401,7 +485,7 @@ static void dispc_save_context(void)
 		SR(OVL_FIFO_THRESHOLD(i));
 		SR(OVL_ROW_INC(i));
 		SR(OVL_PIXEL_INC(i));
-		if (dss_has_feature(FEAT_PRELOAD))
+		if (dispc_has_feature(FEAT_PRELOAD))
 			SR(OVL_PRELOAD(i));
 		if (i == OMAP_DSS_GFX) {
 			SR(OVL_WINDOW_SKIP(i));
@@ -422,12 +506,12 @@ static void dispc_save_context(void)
 		for (j = 0; j < 5; j++)
 			SR(OVL_CONV_COEF(i, j));
 
-		if (dss_has_feature(FEAT_FIR_COEF_V)) {
+		if (dispc_has_feature(FEAT_FIR_COEF_V)) {
 			for (j = 0; j < 8; j++)
 				SR(OVL_FIR_COEF_V(i, j));
 		}
 
-		if (dss_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
+		if (dispc_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
 			SR(OVL_BA0_UV(i));
 			SR(OVL_BA1_UV(i));
 			SR(OVL_FIR2(i));
@@ -443,11 +527,11 @@ static void dispc_save_context(void)
 			for (j = 0; j < 8; j++)
 				SR(OVL_FIR_COEF_V2(i, j));
 		}
-		if (dss_has_feature(FEAT_ATTR2))
+		if (dispc_has_feature(FEAT_ATTR2))
 			SR(OVL_ATTRIBUTES2(i));
 	}
 
-	if (dss_has_feature(FEAT_CORE_CLK_DIV))
+	if (dispc_has_feature(FEAT_CORE_CLK_DIV))
 		SR(DIVISOR);
 
 	dispc.ctx_valid = true;
@@ -468,15 +552,15 @@ static void dispc_restore_context(void)
 	/*RR(CONTROL);*/
 	RR(CONFIG);
 	RR(LINE_NUMBER);
-	if (dss_has_feature(FEAT_ALPHA_FIXED_ZORDER) ||
-			dss_has_feature(FEAT_ALPHA_FREE_ZORDER))
+	if (dispc_has_feature(FEAT_ALPHA_FIXED_ZORDER) ||
+			dispc_has_feature(FEAT_ALPHA_FREE_ZORDER))
 		RR(GLOBAL_ALPHA);
-	if (dss_has_feature(FEAT_MGR_LCD2))
+	if (dispc_has_feature(FEAT_MGR_LCD2))
 		RR(CONFIG2);
-	if (dss_has_feature(FEAT_MGR_LCD3))
+	if (dispc_has_feature(FEAT_MGR_LCD3))
 		RR(CONFIG3);
 
-	for (i = 0; i < dss_feat_get_num_mgrs(); i++) {
+	for (i = 0; i < dispc_get_num_mgrs(); i++) {
 		RR(DEFAULT_COLOR(i));
 		RR(TRANS_COLOR(i));
 		RR(SIZE_MGR(i));
@@ -491,14 +575,14 @@ static void dispc_restore_context(void)
 		RR(DATA_CYCLE2(i));
 		RR(DATA_CYCLE3(i));
 
-		if (dss_has_feature(FEAT_CPR)) {
+		if (dispc_has_feature(FEAT_CPR)) {
 			RR(CPR_COEF_R(i));
 			RR(CPR_COEF_G(i));
 			RR(CPR_COEF_B(i));
 		}
 	}
 
-	for (i = 0; i < dss_feat_get_num_ovls(); i++) {
+	for (i = 0; i < dispc_get_num_ovls(); i++) {
 		RR(OVL_BA0(i));
 		RR(OVL_BA1(i));
 		RR(OVL_POSITION(i));
@@ -507,7 +591,7 @@ static void dispc_restore_context(void)
 		RR(OVL_FIFO_THRESHOLD(i));
 		RR(OVL_ROW_INC(i));
 		RR(OVL_PIXEL_INC(i));
-		if (dss_has_feature(FEAT_PRELOAD))
+		if (dispc_has_feature(FEAT_PRELOAD))
 			RR(OVL_PRELOAD(i));
 		if (i == OMAP_DSS_GFX) {
 			RR(OVL_WINDOW_SKIP(i));
@@ -528,12 +612,12 @@ static void dispc_restore_context(void)
 		for (j = 0; j < 5; j++)
 			RR(OVL_CONV_COEF(i, j));
 
-		if (dss_has_feature(FEAT_FIR_COEF_V)) {
+		if (dispc_has_feature(FEAT_FIR_COEF_V)) {
 			for (j = 0; j < 8; j++)
 				RR(OVL_FIR_COEF_V(i, j));
 		}
 
-		if (dss_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
+		if (dispc_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
 			RR(OVL_BA0_UV(i));
 			RR(OVL_BA1_UV(i));
 			RR(OVL_FIR2(i));
@@ -549,18 +633,18 @@ static void dispc_restore_context(void)
 			for (j = 0; j < 8; j++)
 				RR(OVL_FIR_COEF_V2(i, j));
 		}
-		if (dss_has_feature(FEAT_ATTR2))
+		if (dispc_has_feature(FEAT_ATTR2))
 			RR(OVL_ATTRIBUTES2(i));
 	}
 
-	if (dss_has_feature(FEAT_CORE_CLK_DIV))
+	if (dispc_has_feature(FEAT_CORE_CLK_DIV))
 		RR(DIVISOR);
 
 	/* enable last, because LCD & DIGIT enable are here */
 	RR(CONTROL);
-	if (dss_has_feature(FEAT_MGR_LCD2))
+	if (dispc_has_feature(FEAT_MGR_LCD2))
 		RR(CONTROL2);
-	if (dss_has_feature(FEAT_MGR_LCD3))
+	if (dispc_has_feature(FEAT_MGR_LCD3))
 		RR(CONTROL3);
 	/* clear spurious SYNC_LOST_DIGIT interrupts */
 	dispc_clear_irqstatus(DISPC_IRQ_SYNC_LOST_DIGIT);
@@ -779,7 +863,7 @@ static void dispc_ovl_write_color_conv_coef(enum omap_plane_id plane,
 static void dispc_setup_color_conv_coef(void)
 {
 	int i;
-	int num_ovl = dss_feat_get_num_ovls();
+	int num_ovl = dispc_get_num_ovls();
 	const struct color_conv_coef ctbl_bt601_5_ovl = {
 		/* YUV -> RGB */
 		298, 409, 0, 298, -208, -100, 298, 0, 517, 0,
@@ -868,10 +952,10 @@ static void dispc_ovl_enable_zorder_planes(void)
 {
 	int i;
 
-	if (!dss_has_feature(FEAT_ALPHA_FREE_ZORDER))
+	if (!dispc_has_feature(FEAT_ALPHA_FREE_ZORDER))
 		return;
 
-	for (i = 0; i < dss_feat_get_num_ovls(); i++)
+	for (i = 0; i < dispc_get_num_ovls(); i++)
 		REG_FLD_MOD(DISPC_OVL_ATTRIBUTES(i), 1, 25, 25);
 }
 
@@ -994,7 +1078,7 @@ static bool format_is_yuv(u32 fourcc)
 static void dispc_ovl_configure_burst_type(enum omap_plane_id plane,
 		enum omap_dss_rotation_type rotation_type)
 {
-	if (dss_has_feature(FEAT_BURST_2D) == 0)
+	if (dispc_has_feature(FEAT_BURST_2D) == 0)
 		return;
 
 	if (rotation_type == OMAP_DSS_ROT_TILER)
@@ -1025,7 +1109,7 @@ static void dispc_ovl_set_channel_out(enum omap_plane_id plane,
 	}
 
 	val = dispc_read_reg(DISPC_OVL_ATTRIBUTES(plane));
-	if (dss_has_feature(FEAT_MGR_LCD2)) {
+	if (dispc_has_feature(FEAT_MGR_LCD2)) {
 		switch (channel) {
 		case OMAP_DSS_CHANNEL_LCD:
 			chan = 0;
@@ -1040,7 +1124,7 @@ static void dispc_ovl_set_channel_out(enum omap_plane_id plane,
 			chan2 = 1;
 			break;
 		case OMAP_DSS_CHANNEL_LCD3:
-			if (dss_has_feature(FEAT_MGR_LCD3)) {
+			if (dispc_has_feature(FEAT_MGR_LCD3)) {
 				chan = 0;
 				chan2 = 2;
 			} else {
@@ -1089,7 +1173,7 @@ static enum omap_channel dispc_ovl_get_channel_out(enum omap_plane_id plane)
 	if (FLD_GET(val, shift, shift) == 1)
 		return OMAP_DSS_CHANNEL_DIGIT;
 
-	if (!dss_has_feature(FEAT_MGR_LCD2))
+	if (!dispc_has_feature(FEAT_MGR_LCD2))
 		return OMAP_DSS_CHANNEL_LCD;
 
 	switch (FLD_GET(val, 31, 30)) {
@@ -1128,7 +1212,7 @@ static void dispc_configure_burst_sizes(void)
 	const int burst_size = BURST_SIZE_X8;
 
 	/* Configure burst size always to maximum size */
-	for (i = 0; i < dss_feat_get_num_ovls(); ++i)
+	for (i = 0; i < dispc_get_num_ovls(); ++i)
 		dispc_ovl_set_burst_size(i, burst_size);
 	if (dispc.feat->has_writeback)
 		dispc_ovl_set_burst_size(OMAP_DSS_WB, burst_size);
@@ -1136,19 +1220,28 @@ static void dispc_configure_burst_sizes(void)
 
 static u32 dispc_ovl_get_burst_size(enum omap_plane_id plane)
 {
-	unsigned unit = dss_feat_get_burst_size_unit();
 	/* burst multiplier is always x8 (see dispc_configure_burst_sizes()) */
-	return unit * 8;
+	return dispc.feat->burst_size_unit * 8;
+}
+
+static bool dispc_ovl_color_mode_supported(enum omap_plane_id plane, u32 fourcc)
+{
+	const u32 *modes;
+	unsigned int i;
+
+	modes = dispc.feat->supported_color_modes[plane];
+
+	for (i = 0; modes[i]; ++i) {
+		if (modes[i] == fourcc)
+			return true;
+	}
+
+	return false;
 }
 
 static const u32 *dispc_ovl_get_color_modes(enum omap_plane_id plane)
 {
-	return dss_feat_get_supported_color_modes(plane);
-}
-
-static int dispc_get_num_ovls(void)
-{
-	return dss_feat_get_num_ovls();
+	return dispc.feat->supported_color_modes[plane];
 }
 
 static void dispc_mgr_enable_cpr(enum omap_channel channel, bool enable)
@@ -1223,9 +1316,9 @@ static void dispc_init_fifos(void)
 	u32 unit;
 	int i;
 
-	unit = dss_feat_get_buffer_size_unit();
+	unit = dispc.feat->buffer_size_unit;
 
-	dss_feat_get_reg_field(FEAT_REG_FIFOSIZE, &start, &end);
+	dispc_get_reg_field(FEAT_REG_FIFOSIZE, &start, &end);
 
 	for (fifo = 0; fifo < dispc.feat->num_fifos; ++fifo) {
 		size = REG_GET(DISPC_OVL_FIFO_SIZE_STATUS(fifo), start, end);
@@ -1265,7 +1358,7 @@ static void dispc_init_fifos(void)
 	/*
 	 * Setup default fifo thresholds.
 	 */
-	for (i = 0; i < dss_feat_get_num_ovls(); ++i) {
+	for (i = 0; i < dispc_get_num_ovls(); ++i) {
 		u32 low, high;
 		const bool use_fifomerge = false;
 		const bool manual_update = false;
@@ -1307,7 +1400,7 @@ void dispc_ovl_set_fifo_threshold(enum omap_plane_id plane, u32 low,
 	u8 hi_start, hi_end, lo_start, lo_end;
 	u32 unit;
 
-	unit = dss_feat_get_buffer_size_unit();
+	unit = dispc.feat->buffer_size_unit;
 
 	WARN_ON(low % unit != 0);
 	WARN_ON(high % unit != 0);
@@ -1315,8 +1408,8 @@ void dispc_ovl_set_fifo_threshold(enum omap_plane_id plane, u32 low,
 	low /= unit;
 	high /= unit;
 
-	dss_feat_get_reg_field(FEAT_REG_FIFOHIGHTHRESHOLD, &hi_start, &hi_end);
-	dss_feat_get_reg_field(FEAT_REG_FIFOLOWTHRESHOLD, &lo_start, &lo_end);
+	dispc_get_reg_field(FEAT_REG_FIFOHIGHTHRESHOLD, &hi_start, &hi_end);
+	dispc_get_reg_field(FEAT_REG_FIFOLOWTHRESHOLD, &lo_start, &lo_end);
 
 	DSSDBG("fifo(%d) threshold (bytes), old %u/%u, new %u/%u\n",
 			plane,
@@ -1335,14 +1428,14 @@ void dispc_ovl_set_fifo_threshold(enum omap_plane_id plane, u32 low,
 	 * large for the preload field, set the threshold to the maximum value
 	 * that can be held by the preload register
 	 */
-	if (dss_has_feature(FEAT_PRELOAD) && dispc.feat->set_max_preload &&
+	if (dispc_has_feature(FEAT_PRELOAD) && dispc.feat->set_max_preload &&
 			plane != OMAP_DSS_WB)
 		dispc_write_reg(DISPC_OVL_PRELOAD(plane), min(high, 0xfffu));
 }
 
 void dispc_enable_fifomerge(bool enable)
 {
-	if (!dss_has_feature(FEAT_FIFO_MERGE)) {
+	if (!dispc_has_feature(FEAT_FIFO_MERGE)) {
 		WARN_ON(enable);
 		return;
 	}
@@ -1360,7 +1453,7 @@ void dispc_ovl_compute_fifo_thresholds(enum omap_plane_id plane,
 	 * buffer_units, and the fifo thresholds must be buffer_unit aligned.
 	 */
 
-	unsigned buf_unit = dss_feat_get_buffer_size_unit();
+	unsigned buf_unit = dispc.feat->buffer_size_unit;
 	unsigned ovl_fifo_size, total_fifo_size, burst_size;
 	int i;
 
@@ -1369,7 +1462,7 @@ void dispc_ovl_compute_fifo_thresholds(enum omap_plane_id plane,
 
 	if (use_fifomerge) {
 		total_fifo_size = 0;
-		for (i = 0; i < dss_feat_get_num_ovls(); ++i)
+		for (i = 0; i < dispc_get_num_ovls(); ++i)
 			total_fifo_size += dispc_ovl_get_fifo_size(i);
 	} else {
 		total_fifo_size = ovl_fifo_size;
@@ -1381,7 +1474,7 @@ void dispc_ovl_compute_fifo_thresholds(enum omap_plane_id plane,
 	 * combined fifo size
 	 */
 
-	if (manual_update && dss_has_feature(FEAT_OMAP3_DSI_FIFO_BUG)) {
+	if (manual_update && dispc_has_feature(FEAT_OMAP3_DSI_FIFO_BUG)) {
 		*fifo_low = ovl_fifo_size - burst_size * 2;
 		*fifo_high = total_fifo_size - burst_size;
 	} else if (plane == OMAP_DSS_WB) {
@@ -1435,9 +1528,9 @@ static void dispc_init_mflag(void)
 		(1 << 0) |	/* MFLAG_CTRL = force always on */
 		(0 << 2));	/* MFLAG_START = disable */
 
-	for (i = 0; i < dss_feat_get_num_ovls(); ++i) {
+	for (i = 0; i < dispc_get_num_ovls(); ++i) {
 		u32 size = dispc_ovl_get_fifo_size(i);
-		u32 unit = dss_feat_get_buffer_size_unit();
+		u32 unit = dispc.feat->buffer_size_unit;
 		u32 low, high;
 
 		dispc_ovl_set_mflag(i, true);
@@ -1456,7 +1549,7 @@ static void dispc_init_mflag(void)
 
 	if (dispc.feat->has_writeback) {
 		u32 size = dispc_ovl_get_fifo_size(OMAP_DSS_WB);
-		u32 unit = dss_feat_get_buffer_size_unit();
+		u32 unit = dispc.feat->buffer_size_unit;
 		u32 low, high;
 
 		dispc_ovl_set_mflag(OMAP_DSS_WB, true);
@@ -1483,10 +1576,8 @@ static void dispc_ovl_set_fir(enum omap_plane_id plane,
 	if (color_comp == DISPC_COLOR_COMPONENT_RGB_Y) {
 		u8 hinc_start, hinc_end, vinc_start, vinc_end;
 
-		dss_feat_get_reg_field(FEAT_REG_FIRHINC,
-					&hinc_start, &hinc_end);
-		dss_feat_get_reg_field(FEAT_REG_FIRVINC,
-					&vinc_start, &vinc_end);
+		dispc_get_reg_field(FEAT_REG_FIRHINC, &hinc_start, &hinc_end);
+		dispc_get_reg_field(FEAT_REG_FIRVINC, &vinc_start, &vinc_end);
 		val = FLD_VAL(vinc, vinc_start, vinc_end) |
 				FLD_VAL(hinc, hinc_start, hinc_end);
 
@@ -1503,8 +1594,8 @@ static void dispc_ovl_set_vid_accu0(enum omap_plane_id plane, int haccu,
 	u32 val;
 	u8 hor_start, hor_end, vert_start, vert_end;
 
-	dss_feat_get_reg_field(FEAT_REG_HORIZONTALACCU, &hor_start, &hor_end);
-	dss_feat_get_reg_field(FEAT_REG_VERTICALACCU, &vert_start, &vert_end);
+	dispc_get_reg_field(FEAT_REG_HORIZONTALACCU, &hor_start, &hor_end);
+	dispc_get_reg_field(FEAT_REG_VERTICALACCU, &vert_start, &vert_end);
 
 	val = FLD_VAL(vaccu, vert_start, vert_end) |
 			FLD_VAL(haccu, hor_start, hor_end);
@@ -1518,8 +1609,8 @@ static void dispc_ovl_set_vid_accu1(enum omap_plane_id plane, int haccu,
 	u32 val;
 	u8 hor_start, hor_end, vert_start, vert_end;
 
-	dss_feat_get_reg_field(FEAT_REG_HORIZONTALACCU, &hor_start, &hor_end);
-	dss_feat_get_reg_field(FEAT_REG_VERTICALACCU, &vert_start, &vert_end);
+	dispc_get_reg_field(FEAT_REG_HORIZONTALACCU, &hor_start, &hor_end);
+	dispc_get_reg_field(FEAT_REG_VERTICALACCU, &vert_start, &vert_end);
 
 	val = FLD_VAL(vaccu, vert_start, vert_end) |
 			FLD_VAL(haccu, hor_start, hor_end);
@@ -1671,14 +1762,14 @@ static void dispc_ovl_set_scaling_common(enum omap_plane_id plane,
 	l |= five_taps ? (1 << 21) : 0;
 
 	/* VRESIZECONF and HRESIZECONF */
-	if (dss_has_feature(FEAT_RESIZECONF)) {
+	if (dispc_has_feature(FEAT_RESIZECONF)) {
 		l &= ~(0x3 << 7);
 		l |= (orig_width <= out_width) ? 0 : (1 << 7);
 		l |= (orig_height <= out_height) ? 0 : (1 << 8);
 	}
 
 	/* LINEBUFFERSPLIT */
-	if (dss_has_feature(FEAT_LINEBUFFERSPLIT)) {
+	if (dispc_has_feature(FEAT_LINEBUFFERSPLIT)) {
 		l &= ~(0x1 << 22);
 		l |= five_taps ? (1 << 22) : 0;
 	}
@@ -1713,7 +1804,7 @@ static void dispc_ovl_set_scaling_uv(enum omap_plane_id plane,
 	int scale_y = out_height != orig_height;
 	bool chroma_upscale = plane != OMAP_DSS_WB;
 
-	if (!dss_has_feature(FEAT_HANDLE_UV_SEPARATE))
+	if (!dispc_has_feature(FEAT_HANDLE_UV_SEPARATE))
 		return;
 
 	if (!format_is_yuv(fourcc)) {
@@ -1860,11 +1951,11 @@ static void dispc_ovl_set_rotation_attrs(enum omap_plane_id plane, u8 rotation,
 		vidrot = 1;
 
 	REG_FLD_MOD(DISPC_OVL_ATTRIBUTES(plane), vidrot, 13, 12);
-	if (dss_has_feature(FEAT_ROWREPEATENABLE))
+	if (dispc_has_feature(FEAT_ROWREPEATENABLE))
 		REG_FLD_MOD(DISPC_OVL_ATTRIBUTES(plane),
 			row_repeat ? 1 : 0, 18, 18);
 
-	if (dss_feat_color_mode_supported(plane, DRM_FORMAT_NV12)) {
+	if (dispc_ovl_color_mode_supported(plane, DRM_FORMAT_NV12)) {
 		bool doublestride =
 			fourcc == DRM_FORMAT_NV12 &&
 			rotation_type == OMAP_DSS_ROT_TILER &&
@@ -2118,8 +2209,7 @@ static int dispc_ovl_calc_scaling_24xx(unsigned long pclk, unsigned long lclk,
 	int error;
 	u16 in_width, in_height;
 	int min_factor = min(*decim_x, *decim_y);
-	const int maxsinglelinewidth =
-			dss_feat_get_param_max(FEAT_PARAM_LINEWIDTH);
+	const int maxsinglelinewidth = dispc.feat->max_line_width;
 
 	*five_taps = false;
 
@@ -2163,8 +2253,7 @@ static int dispc_ovl_calc_scaling_34xx(unsigned long pclk, unsigned long lclk,
 {
 	int error;
 	u16 in_width, in_height;
-	const int maxsinglelinewidth =
-			dss_feat_get_param_max(FEAT_PARAM_LINEWIDTH);
+	const int maxsinglelinewidth = dispc.feat->max_line_width;
 
 	do {
 		in_height = height / *decim_y;
@@ -2249,9 +2338,8 @@ static int dispc_ovl_calc_scaling_44xx(unsigned long pclk, unsigned long lclk,
 	u16 in_width, in_width_max;
 	int decim_x_min = *decim_x;
 	u16 in_height = height / *decim_y;
-	const int maxsinglelinewidth =
-				dss_feat_get_param_max(FEAT_PARAM_LINEWIDTH);
-	const int maxdownscale = dss_feat_get_param_max(FEAT_PARAM_DOWNSCALE);
+	const int maxsinglelinewidth = dispc.feat->max_line_width;
+	const int maxdownscale = dispc.feat->max_downscale;
 
 	if (mem_to_mem) {
 		in_width_max = out_width * maxdownscale;
@@ -2311,7 +2399,7 @@ static int dispc_ovl_calc_scaling(unsigned long pclk, unsigned long lclk,
 		int *x_predecim, int *y_predecim, u16 pos_x,
 		enum omap_dss_rotation_type rotation_type, bool mem_to_mem)
 {
-	const int maxdownscale = dss_feat_get_param_max(FEAT_PARAM_DOWNSCALE);
+	const int maxdownscale = dispc.feat->max_downscale;
 	const int max_decim_limit = 16;
 	unsigned long core_clk = 0;
 	int decim_x, decim_y, ret;
@@ -2332,7 +2420,7 @@ static int dispc_ovl_calc_scaling(unsigned long pclk, unsigned long lclk,
 	} else {
 		*x_predecim = max_decim_limit;
 		*y_predecim = (rotation_type == OMAP_DSS_ROT_TILER &&
-				dss_has_feature(FEAT_BURST_2D)) ?
+				dispc_has_feature(FEAT_BURST_2D)) ?
 				2 : max_decim_limit;
 	}
 
@@ -2428,7 +2516,7 @@ static int dispc_ovl_setup_common(enum omap_plane_id plane,
 			out_height);
 	}
 
-	if (!dss_feat_color_mode_supported(plane, fourcc))
+	if (!dispc_ovl_color_mode_supported(plane, fourcc))
 		return -EINVAL;
 
 	r = dispc_ovl_calc_scaling(pclk, lclk, caps, vm, in_width,
@@ -2549,7 +2637,7 @@ static int dispc_ovl_setup(enum omap_plane_id plane,
 		enum omap_channel channel)
 {
 	int r;
-	enum omap_overlay_caps caps = dss_feat_get_overlay_caps(plane);
+	enum omap_overlay_caps caps = dispc.feat->overlay_caps[plane];
 	const bool replication = true;
 
 	DSSDBG("dispc_ovl_setup %d, pa %pad, pa_uv %pad, sw %d, %d,%d, %dx%d ->"
@@ -2647,12 +2735,12 @@ static int dispc_ovl_enable(enum omap_plane_id plane, bool enable)
 
 static enum omap_dss_output_id dispc_mgr_get_supported_outputs(enum omap_channel channel)
 {
-	return dss_feat_get_supported_outputs(channel);
+	return dss_get_supported_outputs(channel);
 }
 
 static void dispc_lcd_enable_signal_polarity(bool act_high)
 {
-	if (!dss_has_feature(FEAT_LCDENABLEPOL))
+	if (!dispc_has_feature(FEAT_LCDENABLEPOL))
 		return;
 
 	REG_FLD_MOD(DISPC_CONTROL, act_high ? 1 : 0, 29, 29);
@@ -2660,7 +2748,7 @@ static void dispc_lcd_enable_signal_polarity(bool act_high)
 
 void dispc_lcd_enable_signal(bool enable)
 {
-	if (!dss_has_feature(FEAT_LCDENABLESIGNAL))
+	if (!dispc_has_feature(FEAT_LCDENABLESIGNAL))
 		return;
 
 	REG_FLD_MOD(DISPC_CONTROL, enable ? 1 : 0, 28, 28);
@@ -2668,17 +2756,12 @@ void dispc_lcd_enable_signal(bool enable)
 
 void dispc_pck_free_enable(bool enable)
 {
-	if (!dss_has_feature(FEAT_PCKFREEENABLE))
+	if (!dispc_has_feature(FEAT_PCKFREEENABLE))
 		return;
 
 	REG_FLD_MOD(DISPC_CONTROL, enable ? 1 : 0, 27, 27);
 }
 
-static int dispc_get_num_mgrs(void)
-{
-	return dss_feat_get_num_mgrs();
-}
-
 static void dispc_mgr_enable_fifohandcheck(enum omap_channel channel, bool enable)
 {
 	mgr_fld_write(channel, DISPC_MGR_FLD_FIFOHANDCHECK, enable);
@@ -2718,7 +2801,7 @@ static void dispc_mgr_enable_trans_key(enum omap_channel ch, bool enable)
 static void dispc_mgr_enable_alpha_fixed_zorder(enum omap_channel ch,
 		bool enable)
 {
-	if (!dss_has_feature(FEAT_ALPHA_FIXED_ZORDER))
+	if (!dispc_has_feature(FEAT_ALPHA_FIXED_ZORDER))
 		return;
 
 	if (ch == OMAP_DSS_CHANNEL_LCD)
@@ -2735,7 +2818,7 @@ static void dispc_mgr_setup(enum omap_channel channel,
 	dispc_mgr_enable_trans_key(channel, info->trans_enabled);
 	dispc_mgr_enable_alpha_fixed_zorder(channel,
 			info->partial_alpha_enabled);
-	if (dss_has_feature(FEAT_CPR)) {
+	if (dispc_has_feature(FEAT_CPR)) {
 		dispc_mgr_enable_cpr(channel, info->cpr_enable);
 		dispc_mgr_set_cpr_coef(channel, &info->cpr_coefs);
 	}
@@ -3013,7 +3096,7 @@ static void dispc_mgr_set_lcd_divisor(enum omap_channel channel, u16 lck_div,
 	dispc_write_reg(DISPC_DIVISORo(channel),
 			FLD_VAL(lck_div, 23, 16) | FLD_VAL(pck_div, 7, 0));
 
-	if (!dss_has_feature(FEAT_CORE_CLK_DIV) &&
+	if (!dispc_has_feature(FEAT_CORE_CLK_DIV) &&
 			channel == OMAP_DSS_CHANNEL_LCD)
 		dispc.core_clk_rate = dispc_fclk_rate() / lck_div;
 }
@@ -3168,7 +3251,7 @@ void dispc_dump_clocks(struct seq_file *s)
 
 	seq_printf(s, "fck\t\t%-16lu\n", dispc_fclk_rate());
 
-	if (dss_has_feature(FEAT_CORE_CLK_DIV)) {
+	if (dispc_has_feature(FEAT_CORE_CLK_DIV)) {
 		seq_printf(s, "- DISPC-CORE-CLK -\n");
 		l = dispc_read_reg(DISPC_DIVISOR);
 		lcd = FLD_GET(l, 23, 16);
@@ -3179,9 +3262,9 @@ void dispc_dump_clocks(struct seq_file *s)
 
 	dispc_dump_clocks_channel(s, OMAP_DSS_CHANNEL_LCD);
 
-	if (dss_has_feature(FEAT_MGR_LCD2))
+	if (dispc_has_feature(FEAT_MGR_LCD2))
 		dispc_dump_clocks_channel(s, OMAP_DSS_CHANNEL_LCD2);
-	if (dss_has_feature(FEAT_MGR_LCD3))
+	if (dispc_has_feature(FEAT_MGR_LCD3))
 		dispc_dump_clocks_channel(s, OMAP_DSS_CHANNEL_LCD3);
 
 	dispc_runtime_put();
@@ -3221,18 +3304,18 @@ static void dispc_dump_regs(struct seq_file *s)
 	DUMPREG(DISPC_CAPABLE);
 	DUMPREG(DISPC_LINE_STATUS);
 	DUMPREG(DISPC_LINE_NUMBER);
-	if (dss_has_feature(FEAT_ALPHA_FIXED_ZORDER) ||
-			dss_has_feature(FEAT_ALPHA_FREE_ZORDER))
+	if (dispc_has_feature(FEAT_ALPHA_FIXED_ZORDER) ||
+			dispc_has_feature(FEAT_ALPHA_FREE_ZORDER))
 		DUMPREG(DISPC_GLOBAL_ALPHA);
-	if (dss_has_feature(FEAT_MGR_LCD2)) {
+	if (dispc_has_feature(FEAT_MGR_LCD2)) {
 		DUMPREG(DISPC_CONTROL2);
 		DUMPREG(DISPC_CONFIG2);
 	}
-	if (dss_has_feature(FEAT_MGR_LCD3)) {
+	if (dispc_has_feature(FEAT_MGR_LCD3)) {
 		DUMPREG(DISPC_CONTROL3);
 		DUMPREG(DISPC_CONFIG3);
 	}
-	if (dss_has_feature(FEAT_MFLAG))
+	if (dispc_has_feature(FEAT_MFLAG))
 		DUMPREG(DISPC_GLOBAL_MFLAG_ATTRIBUTE);
 
 #undef DUMPREG
@@ -3245,7 +3328,7 @@ static void dispc_dump_regs(struct seq_file *s)
 	p_names = mgr_names;
 
 	/* DISPC channel specific registers */
-	for (i = 0; i < dss_feat_get_num_mgrs(); i++) {
+	for (i = 0; i < dispc_get_num_mgrs(); i++) {
 		DUMPREG(i, DISPC_DEFAULT_COLOR);
 		DUMPREG(i, DISPC_TRANS_COLOR);
 		DUMPREG(i, DISPC_SIZE_MGR);
@@ -3262,7 +3345,7 @@ static void dispc_dump_regs(struct seq_file *s)
 		DUMPREG(i, DISPC_DATA_CYCLE2);
 		DUMPREG(i, DISPC_DATA_CYCLE3);
 
-		if (dss_has_feature(FEAT_CPR)) {
+		if (dispc_has_feature(FEAT_CPR)) {
 			DUMPREG(i, DISPC_CPR_COEF_R);
 			DUMPREG(i, DISPC_CPR_COEF_G);
 			DUMPREG(i, DISPC_CPR_COEF_B);
@@ -3271,7 +3354,7 @@ static void dispc_dump_regs(struct seq_file *s)
 
 	p_names = ovl_names;
 
-	for (i = 0; i < dss_feat_get_num_ovls(); i++) {
+	for (i = 0; i < dispc_get_num_ovls(); i++) {
 		DUMPREG(i, DISPC_OVL_BA0);
 		DUMPREG(i, DISPC_OVL_BA1);
 		DUMPREG(i, DISPC_OVL_POSITION);
@@ -3282,9 +3365,9 @@ static void dispc_dump_regs(struct seq_file *s)
 		DUMPREG(i, DISPC_OVL_ROW_INC);
 		DUMPREG(i, DISPC_OVL_PIXEL_INC);
 
-		if (dss_has_feature(FEAT_PRELOAD))
+		if (dispc_has_feature(FEAT_PRELOAD))
 			DUMPREG(i, DISPC_OVL_PRELOAD);
-		if (dss_has_feature(FEAT_MFLAG))
+		if (dispc_has_feature(FEAT_MFLAG))
 			DUMPREG(i, DISPC_OVL_MFLAG_THRESHOLD);
 
 		if (i == OMAP_DSS_GFX) {
@@ -3297,14 +3380,14 @@ static void dispc_dump_regs(struct seq_file *s)
 		DUMPREG(i, DISPC_OVL_PICTURE_SIZE);
 		DUMPREG(i, DISPC_OVL_ACCU0);
 		DUMPREG(i, DISPC_OVL_ACCU1);
-		if (dss_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
+		if (dispc_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
 			DUMPREG(i, DISPC_OVL_BA0_UV);
 			DUMPREG(i, DISPC_OVL_BA1_UV);
 			DUMPREG(i, DISPC_OVL_FIR2);
 			DUMPREG(i, DISPC_OVL_ACCU2_0);
 			DUMPREG(i, DISPC_OVL_ACCU2_1);
 		}
-		if (dss_has_feature(FEAT_ATTR2))
+		if (dispc_has_feature(FEAT_ATTR2))
 			DUMPREG(i, DISPC_OVL_ATTRIBUTES2);
 	}
 
@@ -3319,21 +3402,21 @@ static void dispc_dump_regs(struct seq_file *s)
 		DUMPREG(i, DISPC_OVL_ROW_INC);
 		DUMPREG(i, DISPC_OVL_PIXEL_INC);
 
-		if (dss_has_feature(FEAT_MFLAG))
+		if (dispc_has_feature(FEAT_MFLAG))
 			DUMPREG(i, DISPC_OVL_MFLAG_THRESHOLD);
 
 		DUMPREG(i, DISPC_OVL_FIR);
 		DUMPREG(i, DISPC_OVL_PICTURE_SIZE);
 		DUMPREG(i, DISPC_OVL_ACCU0);
 		DUMPREG(i, DISPC_OVL_ACCU1);
-		if (dss_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
+		if (dispc_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
 			DUMPREG(i, DISPC_OVL_BA0_UV);
 			DUMPREG(i, DISPC_OVL_BA1_UV);
 			DUMPREG(i, DISPC_OVL_FIR2);
 			DUMPREG(i, DISPC_OVL_ACCU2_0);
 			DUMPREG(i, DISPC_OVL_ACCU2_1);
 		}
-		if (dss_has_feature(FEAT_ATTR2))
+		if (dispc_has_feature(FEAT_ATTR2))
 			DUMPREG(i, DISPC_OVL_ATTRIBUTES2);
 	}
 
@@ -3349,7 +3432,7 @@ static void dispc_dump_regs(struct seq_file *s)
 	/* Video pipeline coefficient registers */
 
 	/* start from OMAP_DSS_VIDEO1 */
-	for (i = 1; i < dss_feat_get_num_ovls(); i++) {
+	for (i = 1; i < dispc_get_num_ovls(); i++) {
 		for (j = 0; j < 8; j++)
 			DUMPREG(i, DISPC_OVL_FIR_COEF_H, j);
 
@@ -3359,12 +3442,12 @@ static void dispc_dump_regs(struct seq_file *s)
 		for (j = 0; j < 5; j++)
 			DUMPREG(i, DISPC_OVL_CONV_COEF, j);
 
-		if (dss_has_feature(FEAT_FIR_COEF_V)) {
+		if (dispc_has_feature(FEAT_FIR_COEF_V)) {
 			for (j = 0; j < 8; j++)
 				DUMPREG(i, DISPC_OVL_FIR_COEF_V, j);
 		}
 
-		if (dss_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
+		if (dispc_has_feature(FEAT_HANDLE_UV_SEPARATE)) {
 			for (j = 0; j < 8; j++)
 				DUMPREG(i, DISPC_OVL_FIR_COEF_H2, j);
 
@@ -3397,7 +3480,7 @@ int dispc_calc_clock_rates(unsigned long dispc_fclk_rate,
 	return 0;
 }
 
-bool dispc_div_calc(unsigned long dispc,
+bool dispc_div_calc(unsigned long dispc_freq,
 		unsigned long pck_min, unsigned long pck_max,
 		dispc_div_calc_func func, void *data)
 {
@@ -3415,19 +3498,19 @@ bool dispc_div_calc(unsigned long dispc,
 	min_fck_per_pck = 0;
 #endif
 
-	pckd_hw_min = dss_feat_get_param_min(FEAT_PARAM_DSS_PCD);
-	pckd_hw_max = dss_feat_get_param_max(FEAT_PARAM_DSS_PCD);
+	pckd_hw_min = dispc.feat->min_pcd;
+	pckd_hw_max = 255;
 
-	lck_max = dss_feat_get_param_max(FEAT_PARAM_DSS_FCK);
+	lck_max = dss_get_max_fck_rate();
 
 	pck_min = pck_min ? pck_min : 1;
 	pck_max = pck_max ? pck_max : ULONG_MAX;
 
-	lckd_start = max(DIV_ROUND_UP(dispc, lck_max), 1ul);
-	lckd_stop = min(dispc / pck_min, 255ul);
+	lckd_start = max(DIV_ROUND_UP(dispc_freq, lck_max), 1ul);
+	lckd_stop = min(dispc_freq / pck_min, 255ul);
 
 	for (lckd = lckd_start; lckd <= lckd_stop; ++lckd) {
-		lck = dispc / lckd;
+		lck = dispc_freq / lckd;
 
 		pckd_start = max(DIV_ROUND_UP(lck, pck_max), pckd_hw_min);
 		pckd_stop = min(lck / pck_min, pckd_hw_max);
@@ -3441,7 +3524,7 @@ bool dispc_div_calc(unsigned long dispc,
 			 * also. Thus we need to use the calculated lck. For
 			 * OMAP4+ the DISPC fclk is a separate clock.
 			 */
-			if (dss_has_feature(FEAT_CORE_CLK_DIV))
+			if (dispc_has_feature(FEAT_CORE_CLK_DIV))
 				fck = dispc_core_clk_rate();
 			else
 				fck = lck;
@@ -3556,10 +3639,10 @@ static void dispc_restore_gamma_tables(void)
 
 	dispc_mgr_write_gamma_table(OMAP_DSS_CHANNEL_DIGIT);
 
-	if (dss_has_feature(FEAT_MGR_LCD2))
+	if (dispc_has_feature(FEAT_MGR_LCD2))
 		dispc_mgr_write_gamma_table(OMAP_DSS_CHANNEL_LCD2);
 
-	if (dss_has_feature(FEAT_MGR_LCD3))
+	if (dispc_has_feature(FEAT_MGR_LCD3))
 		dispc_mgr_write_gamma_table(OMAP_DSS_CHANNEL_LCD3);
 }
 
@@ -3627,11 +3710,11 @@ static int dispc_init_gamma_tables(void)
 		u32 *gt;
 
 		if (channel == OMAP_DSS_CHANNEL_LCD2 &&
-		    !dss_has_feature(FEAT_MGR_LCD2))
+		    !dispc_has_feature(FEAT_MGR_LCD2))
 			continue;
 
 		if (channel == OMAP_DSS_CHANNEL_LCD3 &&
-		    !dss_has_feature(FEAT_MGR_LCD3))
+		    !dispc_has_feature(FEAT_MGR_LCD3))
 			continue;
 
 		gt = devm_kmalloc_array(&dispc.pdev->dev, gdesc->len,
@@ -3651,7 +3734,7 @@ static void _omap_dispc_initial_config(void)
 	u32 l;
 
 	/* Exclusively enable DISPC_CORE_CLK and set divider to 1 */
-	if (dss_has_feature(FEAT_CORE_CLK_DIV)) {
+	if (dispc_has_feature(FEAT_CORE_CLK_DIV)) {
 		l = dispc_read_reg(DISPC_DIVISOR);
 		/* Use DISPC_DIVISOR.LCD, instead of DISPC_DIVISOR1.LCD */
 		l = FLD_MOD(l, 1, 0, 0);
@@ -3669,7 +3752,7 @@ static void _omap_dispc_initial_config(void)
 	 * func-clock auto-gating. For newer versions
 	 * (dispc.feat->has_gamma_table) this enables tv-out gamma tables.
 	 */
-	if (dss_has_feature(FEAT_FUNCGATED) || dispc.feat->has_gamma_table)
+	if (dispc_has_feature(FEAT_FUNCGATED) || dispc.feat->has_gamma_table)
 		REG_FLD_MOD(DISPC_CONFIG, 1, 9, 9);
 
 	dispc_setup_color_conv_coef();
@@ -3685,10 +3768,272 @@ static void _omap_dispc_initial_config(void)
 	if (dispc.feat->mstandby_workaround)
 		REG_FLD_MOD(DISPC_MSTANDBY_CTRL, 1, 0, 0);
 
-	if (dss_has_feature(FEAT_MFLAG))
+	if (dispc_has_feature(FEAT_MFLAG))
 		dispc_init_mflag();
 }
 
+static const enum dispc_feature_id omap2_dispc_features_list[] = {
+	FEAT_LCDENABLEPOL,
+	FEAT_LCDENABLESIGNAL,
+	FEAT_PCKFREEENABLE,
+	FEAT_FUNCGATED,
+	FEAT_ROWREPEATENABLE,
+	FEAT_RESIZECONF,
+};
+
+static const enum dispc_feature_id omap3_dispc_features_list[] = {
+	FEAT_LCDENABLEPOL,
+	FEAT_LCDENABLESIGNAL,
+	FEAT_PCKFREEENABLE,
+	FEAT_FUNCGATED,
+	FEAT_LINEBUFFERSPLIT,
+	FEAT_ROWREPEATENABLE,
+	FEAT_RESIZECONF,
+	FEAT_CPR,
+	FEAT_PRELOAD,
+	FEAT_FIR_COEF_V,
+	FEAT_ALPHA_FIXED_ZORDER,
+	FEAT_FIFO_MERGE,
+	FEAT_OMAP3_DSI_FIFO_BUG,
+};
+
+static const enum dispc_feature_id am43xx_dispc_features_list[] = {
+	FEAT_LCDENABLEPOL,
+	FEAT_LCDENABLESIGNAL,
+	FEAT_PCKFREEENABLE,
+	FEAT_FUNCGATED,
+	FEAT_LINEBUFFERSPLIT,
+	FEAT_ROWREPEATENABLE,
+	FEAT_RESIZECONF,
+	FEAT_CPR,
+	FEAT_PRELOAD,
+	FEAT_FIR_COEF_V,
+	FEAT_ALPHA_FIXED_ZORDER,
+	FEAT_FIFO_MERGE,
+};
+
+static const enum dispc_feature_id omap4_dispc_features_list[] = {
+	FEAT_MGR_LCD2,
+	FEAT_CORE_CLK_DIV,
+	FEAT_HANDLE_UV_SEPARATE,
+	FEAT_ATTR2,
+	FEAT_CPR,
+	FEAT_PRELOAD,
+	FEAT_FIR_COEF_V,
+	FEAT_ALPHA_FREE_ZORDER,
+	FEAT_FIFO_MERGE,
+	FEAT_BURST_2D,
+};
+
+static const enum dispc_feature_id omap5_dispc_features_list[] = {
+	FEAT_MGR_LCD2,
+	FEAT_MGR_LCD3,
+	FEAT_CORE_CLK_DIV,
+	FEAT_HANDLE_UV_SEPARATE,
+	FEAT_ATTR2,
+	FEAT_CPR,
+	FEAT_PRELOAD,
+	FEAT_FIR_COEF_V,
+	FEAT_ALPHA_FREE_ZORDER,
+	FEAT_FIFO_MERGE,
+	FEAT_BURST_2D,
+	FEAT_MFLAG,
+};
+
+static const struct dss_reg_field omap2_dispc_reg_fields[] = {
+	[FEAT_REG_FIRHINC]			= { 11, 0 },
+	[FEAT_REG_FIRVINC]			= { 27, 16 },
+	[FEAT_REG_FIFOLOWTHRESHOLD]		= { 8, 0 },
+	[FEAT_REG_FIFOHIGHTHRESHOLD]		= { 24, 16 },
+	[FEAT_REG_FIFOSIZE]			= { 8, 0 },
+	[FEAT_REG_HORIZONTALACCU]		= { 9, 0 },
+	[FEAT_REG_VERTICALACCU]			= { 25, 16 },
+};
+
+static const struct dss_reg_field omap3_dispc_reg_fields[] = {
+	[FEAT_REG_FIRHINC]			= { 12, 0 },
+	[FEAT_REG_FIRVINC]			= { 28, 16 },
+	[FEAT_REG_FIFOLOWTHRESHOLD]		= { 11, 0 },
+	[FEAT_REG_FIFOHIGHTHRESHOLD]		= { 27, 16 },
+	[FEAT_REG_FIFOSIZE]			= { 10, 0 },
+	[FEAT_REG_HORIZONTALACCU]		= { 9, 0 },
+	[FEAT_REG_VERTICALACCU]			= { 25, 16 },
+};
+
+static const struct dss_reg_field omap4_dispc_reg_fields[] = {
+	[FEAT_REG_FIRHINC]			= { 12, 0 },
+	[FEAT_REG_FIRVINC]			= { 28, 16 },
+	[FEAT_REG_FIFOLOWTHRESHOLD]		= { 15, 0 },
+	[FEAT_REG_FIFOHIGHTHRESHOLD]		= { 31, 16 },
+	[FEAT_REG_FIFOSIZE]			= { 15, 0 },
+	[FEAT_REG_HORIZONTALACCU]		= { 10, 0 },
+	[FEAT_REG_VERTICALACCU]			= { 26, 16 },
+};
+
+static const enum omap_overlay_caps omap2_dispc_overlay_caps[] = {
+	/* OMAP_DSS_GFX */
+	OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO1 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_POS |
+		OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO2 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_POS |
+		OMAP_DSS_OVL_CAP_REPLICATION,
+};
+
+static const enum omap_overlay_caps omap3430_dispc_overlay_caps[] = {
+	/* OMAP_DSS_GFX */
+	OMAP_DSS_OVL_CAP_GLOBAL_ALPHA | OMAP_DSS_OVL_CAP_POS |
+		OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO1 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_POS |
+		OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO2 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
+		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
+};
+
+static const enum omap_overlay_caps omap3630_dispc_overlay_caps[] = {
+	/* OMAP_DSS_GFX */
+	OMAP_DSS_OVL_CAP_GLOBAL_ALPHA | OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA |
+		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO1 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_POS |
+		OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO2 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
+		OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA | OMAP_DSS_OVL_CAP_POS |
+		OMAP_DSS_OVL_CAP_REPLICATION,
+};
+
+static const enum omap_overlay_caps omap4_dispc_overlay_caps[] = {
+	/* OMAP_DSS_GFX */
+	OMAP_DSS_OVL_CAP_GLOBAL_ALPHA | OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA |
+		OMAP_DSS_OVL_CAP_ZORDER | OMAP_DSS_OVL_CAP_POS |
+		OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO1 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
+		OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA | OMAP_DSS_OVL_CAP_ZORDER |
+		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO2 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
+		OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA | OMAP_DSS_OVL_CAP_ZORDER |
+		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
+
+	/* OMAP_DSS_VIDEO3 */
+	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
+		OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA | OMAP_DSS_OVL_CAP_ZORDER |
+		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
+};
+
+#define COLOR_ARRAY(arr...) (const u32[]) { arr, 0 }
+
+static const u32 *omap2_dispc_supported_color_modes[] = {
+
+	/* OMAP_DSS_GFX */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGBX4444, DRM_FORMAT_RGB565,
+	DRM_FORMAT_XRGB8888, DRM_FORMAT_RGB888),
+
+	/* OMAP_DSS_VIDEO1 */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_YUYV,
+	DRM_FORMAT_UYVY),
+
+	/* OMAP_DSS_VIDEO2 */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_YUYV,
+	DRM_FORMAT_UYVY),
+};
+
+static const u32 *omap3_dispc_supported_color_modes[] = {
+	/* OMAP_DSS_GFX */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGBX4444, DRM_FORMAT_ARGB4444,
+	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_ARGB8888,
+	DRM_FORMAT_RGBA8888, DRM_FORMAT_RGBX8888),
+
+	/* OMAP_DSS_VIDEO1 */
+	COLOR_ARRAY(
+	DRM_FORMAT_XRGB8888, DRM_FORMAT_RGB888,
+	DRM_FORMAT_RGBX4444, DRM_FORMAT_RGB565,
+	DRM_FORMAT_YUYV, DRM_FORMAT_UYVY),
+
+	/* OMAP_DSS_VIDEO2 */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGBX4444, DRM_FORMAT_ARGB4444,
+	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_YUYV,
+	DRM_FORMAT_UYVY, DRM_FORMAT_ARGB8888,
+	DRM_FORMAT_RGBA8888, DRM_FORMAT_RGBX8888),
+};
+
+static const u32 *omap4_dispc_supported_color_modes[] = {
+	/* OMAP_DSS_GFX */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGBX4444, DRM_FORMAT_ARGB4444,
+	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_ARGB8888,
+	DRM_FORMAT_RGBA8888, DRM_FORMAT_RGBX8888,
+	DRM_FORMAT_ARGB1555, DRM_FORMAT_XRGB4444,
+	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB1555),
+
+	/* OMAP_DSS_VIDEO1 */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGB565, DRM_FORMAT_RGBX4444,
+	DRM_FORMAT_YUYV, DRM_FORMAT_ARGB1555,
+	DRM_FORMAT_RGBA8888, DRM_FORMAT_NV12,
+	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_UYVY,
+	DRM_FORMAT_ARGB4444, DRM_FORMAT_XRGB1555,
+	DRM_FORMAT_ARGB8888, DRM_FORMAT_XRGB4444,
+	DRM_FORMAT_RGBX8888),
+
+       /* OMAP_DSS_VIDEO2 */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGB565, DRM_FORMAT_RGBX4444,
+	DRM_FORMAT_YUYV, DRM_FORMAT_ARGB1555,
+	DRM_FORMAT_RGBA8888, DRM_FORMAT_NV12,
+	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_UYVY,
+	DRM_FORMAT_ARGB4444, DRM_FORMAT_XRGB1555,
+	DRM_FORMAT_ARGB8888, DRM_FORMAT_XRGB4444,
+	DRM_FORMAT_RGBX8888),
+
+	/* OMAP_DSS_VIDEO3 */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGB565, DRM_FORMAT_RGBX4444,
+	DRM_FORMAT_YUYV, DRM_FORMAT_ARGB1555,
+	DRM_FORMAT_RGBA8888, DRM_FORMAT_NV12,
+	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_UYVY,
+	DRM_FORMAT_ARGB4444, DRM_FORMAT_XRGB1555,
+	DRM_FORMAT_ARGB8888, DRM_FORMAT_XRGB4444,
+	DRM_FORMAT_RGBX8888),
+
+	/* OMAP_DSS_WB */
+	COLOR_ARRAY(
+	DRM_FORMAT_RGB565, DRM_FORMAT_RGBX4444,
+	DRM_FORMAT_YUYV, DRM_FORMAT_ARGB1555,
+	DRM_FORMAT_RGBA8888, DRM_FORMAT_NV12,
+	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB8888,
+	DRM_FORMAT_RGB888, DRM_FORMAT_UYVY,
+	DRM_FORMAT_ARGB4444, DRM_FORMAT_XRGB1555,
+	DRM_FORMAT_ARGB8888, DRM_FORMAT_XRGB4444,
+	DRM_FORMAT_RGBX8888),
+};
+
 static const struct dispc_features omap24xx_dispc_feats = {
 	.sw_start		=	5,
 	.fp_start		=	15,
@@ -3701,9 +4046,26 @@ static const struct dispc_features omap24xx_dispc_feats = {
 	.mgr_width_max		=	2048,
 	.mgr_height_max		=	2048,
 	.max_lcd_pclk		=	66500000,
+	.max_downscale		=	2,
+	/*
+	 * Assume the line width buffer to be 768 pixels as OMAP2 DISPC scaler
+	 * cannot scale an image width larger than 768.
+	 */
+	.max_line_width		=	768,
+	.min_pcd		=	2,
 	.calc_scaling		=	dispc_ovl_calc_scaling_24xx,
 	.calc_core_clk		=	calc_core_clk_24xx,
 	.num_fifos		=	3,
+	.features		=	omap2_dispc_features_list,
+	.num_features		=	ARRAY_SIZE(omap2_dispc_features_list),
+	.reg_fields		=	omap2_dispc_reg_fields,
+	.num_reg_fields		=	ARRAY_SIZE(omap2_dispc_reg_fields),
+	.overlay_caps		=	omap2_dispc_overlay_caps,
+	.supported_color_modes	=	omap2_dispc_supported_color_modes,
+	.num_mgrs		=	2,
+	.num_ovls		=	3,
+	.buffer_size_unit	=	1,
+	.burst_size_unit	=	8,
 	.no_framedone_tv	=	true,
 	.set_max_preload	=	false,
 	.last_pixel_inc_missing	=	true,
@@ -3722,9 +4084,22 @@ static const struct dispc_features omap34xx_rev1_0_dispc_feats = {
 	.mgr_height_max		=	2048,
 	.max_lcd_pclk		=	173000000,
 	.max_tv_pclk		=	59000000,
+	.max_downscale		=	4,
+	.max_line_width		=	1024,
+	.min_pcd		=	1,
 	.calc_scaling		=	dispc_ovl_calc_scaling_34xx,
 	.calc_core_clk		=	calc_core_clk_34xx,
 	.num_fifos		=	3,
+	.features		=	omap3_dispc_features_list,
+	.num_features		=	ARRAY_SIZE(omap3_dispc_features_list),
+	.reg_fields		=	omap3_dispc_reg_fields,
+	.num_reg_fields		=	ARRAY_SIZE(omap3_dispc_reg_fields),
+	.overlay_caps		=	omap3430_dispc_overlay_caps,
+	.supported_color_modes	=	omap3_dispc_supported_color_modes,
+	.num_mgrs		=	2,
+	.num_ovls		=	3,
+	.buffer_size_unit	=	1,
+	.burst_size_unit	=	8,
 	.no_framedone_tv	=	true,
 	.set_max_preload	=	false,
 	.last_pixel_inc_missing	=	true,
@@ -3743,9 +4118,90 @@ static const struct dispc_features omap34xx_rev3_0_dispc_feats = {
 	.mgr_height_max		=	2048,
 	.max_lcd_pclk		=	173000000,
 	.max_tv_pclk		=	59000000,
+	.max_downscale		=	4,
+	.max_line_width		=	1024,
+	.min_pcd		=	1,
 	.calc_scaling		=	dispc_ovl_calc_scaling_34xx,
 	.calc_core_clk		=	calc_core_clk_34xx,
 	.num_fifos		=	3,
+	.features		=	omap3_dispc_features_list,
+	.num_features		=	ARRAY_SIZE(omap3_dispc_features_list),
+	.reg_fields		=	omap3_dispc_reg_fields,
+	.num_reg_fields		=	ARRAY_SIZE(omap3_dispc_reg_fields),
+	.overlay_caps		=	omap3430_dispc_overlay_caps,
+	.supported_color_modes	=	omap3_dispc_supported_color_modes,
+	.num_mgrs		=	2,
+	.num_ovls		=	3,
+	.buffer_size_unit	=	1,
+	.burst_size_unit	=	8,
+	.no_framedone_tv	=	true,
+	.set_max_preload	=	false,
+	.last_pixel_inc_missing	=	true,
+};
+
+static const struct dispc_features omap36xx_dispc_feats = {
+	.sw_start		=	7,
+	.fp_start		=	19,
+	.bp_start		=	31,
+	.sw_max			=	256,
+	.vp_max			=	4095,
+	.hp_max			=	4096,
+	.mgr_width_start	=	10,
+	.mgr_height_start	=	26,
+	.mgr_width_max		=	2048,
+	.mgr_height_max		=	2048,
+	.max_lcd_pclk		=	173000000,
+	.max_tv_pclk		=	59000000,
+	.max_downscale		=	4,
+	.max_line_width		=	1024,
+	.min_pcd		=	1,
+	.calc_scaling		=	dispc_ovl_calc_scaling_34xx,
+	.calc_core_clk		=	calc_core_clk_34xx,
+	.num_fifos		=	3,
+	.features		=	omap3_dispc_features_list,
+	.num_features		=	ARRAY_SIZE(omap3_dispc_features_list),
+	.reg_fields		=	omap3_dispc_reg_fields,
+	.num_reg_fields		=	ARRAY_SIZE(omap3_dispc_reg_fields),
+	.overlay_caps		=	omap3630_dispc_overlay_caps,
+	.supported_color_modes	=	omap3_dispc_supported_color_modes,
+	.num_mgrs		=	2,
+	.num_ovls		=	3,
+	.buffer_size_unit	=	1,
+	.burst_size_unit	=	8,
+	.no_framedone_tv	=	true,
+	.set_max_preload	=	false,
+	.last_pixel_inc_missing	=	true,
+};
+
+static const struct dispc_features am43xx_dispc_feats = {
+	.sw_start		=	7,
+	.fp_start		=	19,
+	.bp_start		=	31,
+	.sw_max			=	256,
+	.vp_max			=	4095,
+	.hp_max			=	4096,
+	.mgr_width_start	=	10,
+	.mgr_height_start	=	26,
+	.mgr_width_max		=	2048,
+	.mgr_height_max		=	2048,
+	.max_lcd_pclk		=	173000000,
+	.max_tv_pclk		=	59000000,
+	.max_downscale		=	4,
+	.max_line_width		=	1024,
+	.min_pcd		=	1,
+	.calc_scaling		=	dispc_ovl_calc_scaling_34xx,
+	.calc_core_clk		=	calc_core_clk_34xx,
+	.num_fifos		=	3,
+	.features		=	am43xx_dispc_features_list,
+	.num_features		=	ARRAY_SIZE(am43xx_dispc_features_list),
+	.reg_fields		=	omap3_dispc_reg_fields,
+	.num_reg_fields		=	ARRAY_SIZE(omap3_dispc_reg_fields),
+	.overlay_caps		=	omap3430_dispc_overlay_caps,
+	.supported_color_modes	=	omap3_dispc_supported_color_modes,
+	.num_mgrs		=	1,
+	.num_ovls		=	3,
+	.buffer_size_unit	=	1,
+	.burst_size_unit	=	8,
 	.no_framedone_tv	=	true,
 	.set_max_preload	=	false,
 	.last_pixel_inc_missing	=	true,
@@ -3764,9 +4220,22 @@ static const struct dispc_features omap44xx_dispc_feats = {
 	.mgr_height_max		=	2048,
 	.max_lcd_pclk		=	170000000,
 	.max_tv_pclk		=	185625000,
+	.max_downscale		=	4,
+	.max_line_width		=	2048,
+	.min_pcd		=	1,
 	.calc_scaling		=	dispc_ovl_calc_scaling_44xx,
 	.calc_core_clk		=	calc_core_clk_44xx,
 	.num_fifos		=	5,
+	.features		=	omap4_dispc_features_list,
+	.num_features		=	ARRAY_SIZE(omap4_dispc_features_list),
+	.reg_fields		=	omap4_dispc_reg_fields,
+	.num_reg_fields		=	ARRAY_SIZE(omap4_dispc_reg_fields),
+	.overlay_caps		=	omap4_dispc_overlay_caps,
+	.supported_color_modes	=	omap4_dispc_supported_color_modes,
+	.num_mgrs		=	3,
+	.num_ovls		=	4,
+	.buffer_size_unit	=	16,
+	.burst_size_unit	=	16,
 	.gfx_fifo_workaround	=	true,
 	.set_max_preload	=	true,
 	.supports_sync_align	=	true,
@@ -3790,9 +4259,22 @@ static const struct dispc_features omap54xx_dispc_feats = {
 	.mgr_height_max		=	4096,
 	.max_lcd_pclk		=	170000000,
 	.max_tv_pclk		=	186000000,
+	.max_downscale		=	4,
+	.max_line_width		=	2048,
+	.min_pcd		=	1,
 	.calc_scaling		=	dispc_ovl_calc_scaling_44xx,
 	.calc_core_clk		=	calc_core_clk_44xx,
 	.num_fifos		=	5,
+	.features		=	omap5_dispc_features_list,
+	.num_features		=	ARRAY_SIZE(omap5_dispc_features_list),
+	.reg_fields		=	omap4_dispc_reg_fields,
+	.num_reg_fields		=	ARRAY_SIZE(omap4_dispc_reg_fields),
+	.overlay_caps		=	omap4_dispc_overlay_caps,
+	.supported_color_modes	=	omap4_dispc_supported_color_modes,
+	.num_mgrs		=	4,
+	.num_ovls		=	4,
+	.buffer_size_unit	=	16,
+	.burst_size_unit	=	16,
 	.gfx_fifo_workaround	=	true,
 	.mstandby_workaround	=	true,
 	.set_max_preload	=	true,
@@ -3804,54 +4286,6 @@ static const struct dispc_features omap54xx_dispc_feats = {
 	.has_gamma_i734_bug	=	true,
 };
 
-static int dispc_init_features(struct platform_device *pdev)
-{
-	const struct dispc_features *src;
-	struct dispc_features *dst;
-
-	dst = devm_kzalloc(&pdev->dev, sizeof(*dst), GFP_KERNEL);
-	if (!dst) {
-		dev_err(&pdev->dev, "Failed to allocate DISPC Features\n");
-		return -ENOMEM;
-	}
-
-	switch (omapdss_get_version()) {
-	case OMAPDSS_VER_OMAP24xx:
-		src = &omap24xx_dispc_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP34xx_ES1:
-		src = &omap34xx_rev1_0_dispc_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP34xx_ES3:
-	case OMAPDSS_VER_OMAP3630:
-	case OMAPDSS_VER_AM35xx:
-	case OMAPDSS_VER_AM43xx:
-		src = &omap34xx_rev3_0_dispc_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
-		src = &omap44xx_dispc_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP5:
-	case OMAPDSS_VER_DRA7xx:
-		src = &omap54xx_dispc_feats;
-		break;
-
-	default:
-		return -ENODEV;
-	}
-
-	memcpy(dst, src, sizeof(*dst));
-	dispc.feat = dst;
-
-	return 0;
-}
-
 static irqreturn_t dispc_irq_handler(int irq, void *arg)
 {
 	if (!dispc.is_enabled)
@@ -4083,9 +4517,28 @@ static const struct dispc_ops dispc_ops = {
 };
 
 /* DISPC HW IP initialisation */
+static const struct of_device_id dispc_of_match[] = {
+	{ .compatible = "ti,omap2-dispc", .data = &omap24xx_dispc_feats },
+	{ .compatible = "ti,omap3-dispc", .data = &omap36xx_dispc_feats },
+	{ .compatible = "ti,omap4-dispc", .data = &omap44xx_dispc_feats },
+	{ .compatible = "ti,omap5-dispc", .data = &omap54xx_dispc_feats },
+	{ .compatible = "ti,dra7-dispc",  .data = &omap54xx_dispc_feats },
+	{},
+};
+
+static const struct soc_device_attribute dispc_soc_devices[] = {
+	{ .machine = "OMAP3[45]*",
+	  .revision = "ES[12].?",	.data = &omap34xx_rev1_0_dispc_feats },
+	{ .machine = "OMAP3[45]*",	.data = &omap34xx_rev3_0_dispc_feats },
+	{ .machine = "AM35*",		.data = &omap34xx_rev3_0_dispc_feats },
+	{ .machine = "AM43*",		.data = &am43xx_dispc_feats },
+	{ /* sentinel */ }
+};
+
 static int dispc_bind(struct device *dev, struct device *master, void *data)
 {
 	struct platform_device *pdev = to_platform_device(dev);
+	const struct soc_device_attribute *soc;
 	u32 rev;
 	int r = 0;
 	struct resource *dispc_mem;
@@ -4095,9 +4548,15 @@ static int dispc_bind(struct device *dev, struct device *master, void *data)
 
 	spin_lock_init(&dispc.control_lock);
 
-	r = dispc_init_features(dispc.pdev);
-	if (r)
-		return r;
+	/*
+	 * The OMAP3-based models can't be told apart using the compatible
+	 * string, use SoC device matching.
+	 */
+	soc = soc_device_match(dispc_soc_devices);
+	if (soc)
+		dispc.feat = soc->data;
+	else
+		dispc.feat = of_match_device(dispc_of_match, &pdev->dev)->data;
 
 	r = dispc_errata_i734_wa_init();
 	if (r)
@@ -4226,15 +4685,6 @@ static const struct dev_pm_ops dispc_pm_ops = {
 	.runtime_resume = dispc_runtime_resume,
 };
 
-static const struct of_device_id dispc_of_match[] = {
-	{ .compatible = "ti,omap2-dispc", },
-	{ .compatible = "ti,omap3-dispc", },
-	{ .compatible = "ti,omap4-dispc", },
-	{ .compatible = "ti,omap5-dispc", },
-	{ .compatible = "ti,dra7-dispc", },
-	{},
-};
-
 static struct platform_driver omap_dispchw_driver = {
 	.probe		= dispc_probe,
 	.remove         = dispc_remove,
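
Note on the dispc.c hunks above: dispc_bind() now picks its feature table in two steps, first by SoC glob match and then by compatible string. The following is a minimal userspace sketch of that lookup order, not the kernel implementation. It assumes POSIX fnmatch() as a stand-in for the kernel's glob matching, and the names struct feats, soc_table, compat_table and pick_feats are hypothetical. The revision-specific entry and most table fields are left out for brevity.

```c
#include <fnmatch.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for struct dispc_features. */
struct feats {
	const char *name;
	int num_mgrs;
	int num_ovls;
};

static const struct feats omap34xx_rev3 = { "omap34xx_rev3_0", 2, 3 };
static const struct feats omap36xx = { "omap36xx", 2, 3 };
static const struct feats am43xx = { "am43xx", 1, 3 };
static const struct feats omap44xx = { "omap44xx", 3, 4 };

/* Glob-keyed entries, modelled on the .machine/.data pairs in the patch. */
struct soc_entry {
	const char *machine;
	const struct feats *data;
};

static const struct soc_entry soc_table[] = {
	{ "OMAP3[45]*", &omap34xx_rev3 },
	{ "AM35*", &omap34xx_rev3 },
	{ "AM43*", &am43xx },
	{ NULL, NULL } /* sentinel */
};

/* Exact-match entries, modelled on the .compatible/.data pairs. */
struct compat_entry {
	const char *compatible;
	const struct feats *data;
};

static const struct compat_entry compat_table[] = {
	{ "ti,omap3-dispc", &omap36xx },
	{ "ti,omap4-dispc", &omap44xx },
	{ NULL, NULL } /* sentinel */
};

/*
 * Try the SoC table first, because several OMAP3-class parts share one
 * compatible string. Fall back to the compatible string otherwise.
 * Returns NULL when neither table matches; the caller must check.
 */
static const struct feats *pick_feats(const char *machine,
				      const char *compatible)
{
	const struct soc_entry *s;
	const struct compat_entry *c;

	for (s = soc_table; s->machine; s++)
		if (fnmatch(s->machine, machine, 0) == 0)
			return s->data;

	for (c = compat_table; c->compatible; c++)
		if (strcmp(c->compatible, compatible) == 0)
			return c->data;

	return NULL;
}
```

With these tables, a machine string such as "OMAP3430" resolves through the SoC table, while "OMAP3630" matches no glob and falls through to the "ti,omap3-dispc" entry. Putting the more specific source of truth first is what lets one compatible string serve several hardware variants.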
diff --git a/drivers/gpu/drm/omapdrm/dss/dpi.c b/drivers/gpu/drm/omapdrm/dss/dpi.c
index 86dbb65..daf286f 100644
--- a/drivers/gpu/drm/omapdrm/dss/dpi.c
+++ b/drivers/gpu/drm/omapdrm/dss/dpi.c
@@ -32,13 +32,14 @@
 #include <linux/string.h>
 #include <linux/of.h>
 #include <linux/clk.h>
+#include <linux/sys_soc.h>
 
 #include "omapdss.h"
 #include "dss.h"
-#include "dss_features.h"
 
 struct dpi_data {
 	struct platform_device *pdev;
+	enum dss_model dss_model;
 
 	struct regulator *vdds_dsi_reg;
 	enum dss_clk_source clk_src;
@@ -99,25 +100,21 @@ static enum dss_clk_source dpi_get_clk_src_dra7xx(enum omap_channel channel)
 	return DSS_CLK_SRC_FCK;
 }
 
-static enum dss_clk_source dpi_get_clk_src(enum omap_channel channel)
+static enum dss_clk_source dpi_get_clk_src(struct dpi_data *dpi)
 {
+	enum omap_channel channel = dpi->output.dispc_channel;
+
 	/*
 	 * XXX we can't currently use DSI PLL for DPI with OMAP3, as the DSI PLL
 	 * would also be used for DISPC fclk. Meaning, when the DPI output is
 	 * disabled, DISPC clock will be disabled, and TV out will stop.
 	 */
-	switch (omapdss_get_version()) {
-	case OMAPDSS_VER_OMAP24xx:
-	case OMAPDSS_VER_OMAP34xx_ES1:
-	case OMAPDSS_VER_OMAP34xx_ES3:
-	case OMAPDSS_VER_OMAP3630:
-	case OMAPDSS_VER_AM35xx:
-	case OMAPDSS_VER_AM43xx:
+	switch (dpi->dss_model) {
+	case DSS_MODEL_OMAP2:
+	case DSS_MODEL_OMAP3:
 		return DSS_CLK_SRC_FCK;
 
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
+	case DSS_MODEL_OMAP4:
 		switch (channel) {
 		case OMAP_DSS_CHANNEL_LCD:
 			return DSS_CLK_SRC_PLL1_1;
@@ -127,7 +124,7 @@ static enum dss_clk_source dpi_get_clk_src(enum omap_channel channel)
 			return DSS_CLK_SRC_FCK;
 		}
 
-	case OMAPDSS_VER_OMAP5:
+	case DSS_MODEL_OMAP5:
 		switch (channel) {
 		case OMAP_DSS_CHANNEL_LCD:
 			return DSS_CLK_SRC_PLL1_1;
@@ -138,7 +135,7 @@ static enum dss_clk_source dpi_get_clk_src(enum omap_channel channel)
 			return DSS_CLK_SRC_FCK;
 		}
 
-	case OMAPDSS_VER_DRA7xx:
+	case DSS_MODEL_DRA7:
 		return dpi_get_clk_src_dra7xx(channel);
 
 	default:
@@ -213,7 +210,7 @@ static bool dpi_calc_pll_cb(int n, int m, unsigned long fint,
 	ctx->pll_cinfo.clkdco = clkdco;
 
 	return dss_pll_hsdiv_calc_a(ctx->pll, clkdco,
-		ctx->pck_min, dss_feat_get_param_max(FEAT_PARAM_DSS_FCK),
+		ctx->pck_min, dss_get_max_fck_rate(),
 		dpi_calc_hsdiv_cb, ctx);
 }
 
@@ -403,19 +400,13 @@ static int dpi_display_enable(struct omap_dss_device *dssdev)
 
 	mutex_lock(&dpi->lock);
 
-	if (dss_has_feature(FEAT_DPI_USES_VDDS_DSI) && !dpi->vdds_dsi_reg) {
-		DSSERR("no VDSS_DSI regulator\n");
-		r = -ENODEV;
-		goto err_no_reg;
-	}
-
 	if (!out->dispc_channel_connected) {
 		DSSERR("failed to enable display: no output/manager\n");
 		r = -ENODEV;
 		goto err_no_out_mgr;
 	}
 
-	if (dss_has_feature(FEAT_DPI_USES_VDDS_DSI)) {
+	if (dpi->vdds_dsi_reg) {
 		r = regulator_enable(dpi->vdds_dsi_reg);
 		if (r)
 			goto err_reg_enable;
@@ -459,11 +450,10 @@ static int dpi_display_enable(struct omap_dss_device *dssdev)
 err_src_sel:
 	dispc_runtime_put();
 err_get_dispc:
-	if (dss_has_feature(FEAT_DPI_USES_VDDS_DSI))
+	if (dpi->vdds_dsi_reg)
 		regulator_disable(dpi->vdds_dsi_reg);
 err_reg_enable:
 err_no_out_mgr:
-err_no_reg:
 	mutex_unlock(&dpi->lock);
 	return r;
 }
@@ -484,7 +474,7 @@ static void dpi_display_disable(struct omap_dss_device *dssdev)
 
 	dispc_runtime_put();
 
-	if (dss_has_feature(FEAT_DPI_USES_VDDS_DSI))
+	if (dpi->vdds_dsi_reg)
 		regulator_disable(dpi->vdds_dsi_reg);
 
 	mutex_unlock(&dpi->lock);
@@ -575,11 +565,21 @@ static int dpi_verify_pll(struct dss_pll *pll)
 	return 0;
 }
 
+static const struct soc_device_attribute dpi_soc_devices[] = {
+	{ .machine = "OMAP3[456]*" },
+	{ .machine = "[AD]M37*" },
+	{ /* sentinel */ }
+};
+
 static int dpi_init_regulator(struct dpi_data *dpi)
 {
 	struct regulator *vdds_dsi;
 
-	if (!dss_has_feature(FEAT_DPI_USES_VDDS_DSI))
+	/*
+	 * The DPI uses the DSI VDDS on OMAP34xx, OMAP35xx, OMAP36xx, AM37xx and
+	 * DM37xx only.
+	 */
+	if (!soc_device_match(dpi_soc_devices))
 		return 0;
 
 	if (dpi->vdds_dsi_reg)
@@ -604,7 +604,7 @@ static void dpi_init_pll(struct dpi_data *dpi)
 	if (dpi->pll)
 		return;
 
-	dpi->clk_src = dpi_get_clk_src(dpi->output.dispc_channel);
+	dpi->clk_src = dpi_get_clk_src(dpi);
 
 	pll = dss_pll_find_by_src(dpi->clk_src);
 	if (!pll)
@@ -624,18 +624,14 @@ static void dpi_init_pll(struct dpi_data *dpi)
  * the channel in some more dynamic manner, or get the channel as a user
  * parameter.
  */
-static enum omap_channel dpi_get_channel(int port_num)
+static enum omap_channel dpi_get_channel(struct dpi_data *dpi, int port_num)
 {
-	switch (omapdss_get_version()) {
-	case OMAPDSS_VER_OMAP24xx:
-	case OMAPDSS_VER_OMAP34xx_ES1:
-	case OMAPDSS_VER_OMAP34xx_ES3:
-	case OMAPDSS_VER_OMAP3630:
-	case OMAPDSS_VER_AM35xx:
-	case OMAPDSS_VER_AM43xx:
+	switch (dpi->dss_model) {
+	case DSS_MODEL_OMAP2:
+	case DSS_MODEL_OMAP3:
 		return OMAP_DSS_CHANNEL_LCD;
 
-	case OMAPDSS_VER_DRA7xx:
+	case DSS_MODEL_DRA7:
 		switch (port_num) {
 		case 2:
 			return OMAP_DSS_CHANNEL_LCD3;
@@ -646,12 +642,10 @@ static enum omap_channel dpi_get_channel(int port_num)
 			return OMAP_DSS_CHANNEL_LCD;
 		}
 
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
+	case DSS_MODEL_OMAP4:
 		return OMAP_DSS_CHANNEL_LCD2;
 
-	case OMAPDSS_VER_OMAP5:
+	case DSS_MODEL_OMAP5:
 		return OMAP_DSS_CHANNEL_LCD3;
 
 	default:
@@ -716,10 +710,8 @@ static const struct omapdss_dpi_ops dpi_ops = {
 	.get_timings = dpi_get_timings,
 };
 
-static void dpi_init_output_port(struct platform_device *pdev,
-	struct device_node *port)
+static void dpi_init_output_port(struct dpi_data *dpi, struct device_node *port)
 {
-	struct dpi_data *dpi = port->data;
 	struct omap_dss_device *out = &dpi->output;
 	int r;
 	u32 port_num;
@@ -741,10 +733,10 @@ static void dpi_init_output_port(struct platform_device *pdev,
 		break;
 	}
 
-	out->dev = &pdev->dev;
+	out->dev = &dpi->pdev->dev;
 	out->id = OMAP_DSS_OUTPUT_DPI;
 	out->output_type = OMAP_DISPLAY_TYPE_DPI;
-	out->dispc_channel = dpi_get_channel(port_num);
+	out->dispc_channel = dpi_get_channel(dpi, port_num);
 	out->port_num = port_num;
 	out->ops.dpi = &dpi_ops;
 	out->owner = THIS_MODULE;
@@ -760,7 +752,8 @@ static void dpi_uninit_output_port(struct device_node *port)
 	omapdss_unregister_output(out);
 }
 
-int dpi_init_port(struct platform_device *pdev, struct device_node *port)
+int dpi_init_port(struct platform_device *pdev, struct device_node *port,
+		  enum dss_model dss_model)
 {
 	struct dpi_data *dpi;
 	struct device_node *ep;
@@ -786,11 +779,12 @@ int dpi_init_port(struct platform_device *pdev, struct device_node *port)
 	of_node_put(ep);
 
 	dpi->pdev = pdev;
+	dpi->dss_model = dss_model;
 	port->data = dpi;
 
 	mutex_init(&dpi->lock);
 
-	dpi_init_output_port(pdev, port);
+	dpi_init_output_port(dpi, port);
 
 	dpi->port_initialized = true;
 
diff --git a/drivers/gpu/drm/omapdrm/dss/dsi.c b/drivers/gpu/drm/omapdrm/dss/dsi.c
index 835f490..b56a057 100644
--- a/drivers/gpu/drm/omapdrm/dss/dsi.c
+++ b/drivers/gpu/drm/omapdrm/dss/dsi.c
@@ -20,6 +20,8 @@
 #define DSS_SUBSYS_NAME "DSI"
 
 #include <linux/kernel.h>
+#include <linux/mfd/syscon.h>
+#include <linux/regmap.h>
 #include <linux/io.h>
 #include <linux/clk.h>
 #include <linux/device.h>
@@ -42,12 +44,12 @@
 #include <linux/of_graph.h>
 #include <linux/of_platform.h>
 #include <linux/component.h>
+#include <linux/sys_soc.h>
 
 #include <video/mipi_display.h>
 
 #include "omapdss.h"
 #include "dss.h"
-#include "dss_features.h"
 
 #define DSI_CATCH_MISSING_TE
 
@@ -228,6 +230,12 @@ static int dsi_vc_send_null(struct omap_dss_device *dssdev, int channel);
 #define DSI_MAX_NR_ISRS                2
 #define DSI_MAX_NR_LANES	5
 
+enum dsi_model {
+	DSI_MODEL_OMAP3,
+	DSI_MODEL_OMAP4,
+	DSI_MODEL_OMAP5,
+};
+
 enum dsi_lane_function {
 	DSI_LANE_UNUSED	= 0,
 	DSI_LANE_CLK,
@@ -299,12 +307,36 @@ struct dsi_lp_clock_info {
 	u16 lp_clk_div;
 };
 
+struct dsi_module_id_data {
+	u32 address;
+	int id;
+};
+
+enum dsi_quirks {
+	DSI_QUIRK_PLL_PWR_BUG = (1 << 0),	/* DSI-PLL power command 0x3 is not working */
+	DSI_QUIRK_DCS_CMD_CONFIG_VC = (1 << 1),
+	DSI_QUIRK_VC_OCP_WIDTH = (1 << 2),
+	DSI_QUIRK_REVERSE_TXCLKESC = (1 << 3),
+	DSI_QUIRK_GNQ = (1 << 4),
+	DSI_QUIRK_PHY_DCC = (1 << 5),
+};
+
+struct dsi_of_data {
+	enum dsi_model model;
+	const struct dss_pll_hw *pll_hw;
+	const struct dsi_module_id_data *modules;
+	unsigned int max_fck_freq;
+	unsigned int max_pll_lpdiv;
+	enum dsi_quirks quirks;
+};
+
 struct dsi_data {
 	struct platform_device *pdev;
 	void __iomem *proto_base;
 	void __iomem *phy_base;
 	void __iomem *pll_base;
 
+	const struct dsi_of_data *data;
 	int module_id;
 
 	int irq;
@@ -312,6 +344,7 @@ struct dsi_data {
 	bool is_enabled;
 
 	struct clk *dss_clk;
+	struct regmap *syscon;
 
 	struct dispc_clock_info user_dispc_cinfo;
 	struct dss_pll_clock_info user_dsi_cinfo;
@@ -397,13 +430,6 @@ struct dsi_packet_sent_handler_data {
 	struct completion *completion;
 };
 
-struct dsi_module_id_data {
-	u32 address;
-	int id;
-};
-
-static const struct of_device_id dsi_of_match[];
-
 #ifdef DSI_PERF_MEASURE
 static bool dsi_perf;
 module_param(dsi_perf, bool, 0644);
@@ -1186,6 +1212,7 @@ static int dsi_regulator_init(struct platform_device *dsidev)
 
 static void _dsi_print_reset_status(struct platform_device *dsidev)
 {
+	struct dsi_data *dsi = dsi_get_dsidrv_data(dsidev);
 	u32 l;
 	int b0, b1, b2;
 
@@ -1194,7 +1221,7 @@ static void _dsi_print_reset_status(struct platform_device *dsidev)
 	 * I/O. */
 	l = dsi_read_reg(dsidev, DSI_DSIPHY_CFG5);
 
-	if (dss_has_feature(FEAT_DSI_REVERSE_TXCLKESC)) {
+	if (dsi->data->quirks & DSI_QUIRK_REVERSE_TXCLKESC) {
 		b0 = 28;
 		b1 = 27;
 		b2 = 26;
@@ -1297,7 +1324,7 @@ static int dsi_set_lp_clk_divisor(struct platform_device *dsidev)
 	unsigned long dsi_fclk;
 	unsigned lp_clk_div;
 	unsigned long lp_clk;
-	unsigned lpdiv_max = dss_feat_get_param_max(FEAT_PARAM_DSIPLL_LPDIV);
+	unsigned lpdiv_max = dsi->data->max_pll_lpdiv;
 
 
 	lp_clk_div = dsi->user_lp_cinfo.lp_clk_div;
@@ -1349,11 +1376,12 @@ enum dsi_pll_power_state {
 static int dsi_pll_power(struct platform_device *dsidev,
 		enum dsi_pll_power_state state)
 {
+	struct dsi_data *dsi = dsi_get_dsidrv_data(dsidev);
 	int t = 0;
 
 	/* DSI-PLL power command 0x3 is not working */
-	if (dss_has_feature(FEAT_DSI_PLL_PWR_BUG) &&
-			state == DSI_PLL_POWER_ON_DIV)
+	if ((dsi->data->quirks & DSI_QUIRK_PLL_PWR_BUG) &&
+	    state == DSI_PLL_POWER_ON_DIV)
 		state = DSI_PLL_POWER_ON_ALL;
 
 	/* PLL_PWR_CMD */
@@ -1373,11 +1401,12 @@ static int dsi_pll_power(struct platform_device *dsidev,
 }
 
 
-static void dsi_pll_calc_dsi_fck(struct dss_pll_clock_info *cinfo)
+static void dsi_pll_calc_dsi_fck(struct dsi_data *dsi,
+				 struct dss_pll_clock_info *cinfo)
 {
 	unsigned long max_dsi_fck;
 
-	max_dsi_fck = dss_feat_get_param_max(FEAT_PARAM_DSI_FCK);
+	max_dsi_fck = dsi->data->max_fck_freq;
 
 	cinfo->mX[HSDIV_DSI] = DIV_ROUND_UP(cinfo->clkdco, max_dsi_fck);
 	cinfo->clkout[HSDIV_DSI] = cinfo->clkdco / cinfo->mX[HSDIV_DSI];
@@ -1773,13 +1802,14 @@ static int dsi_cio_power(struct platform_device *dsidev,
 
 static unsigned dsi_get_line_buf_size(struct platform_device *dsidev)
 {
+	struct dsi_data *dsi = dsi_get_dsidrv_data(dsidev);
 	int val;
 
 	/* line buffer on OMAP3 is 1024 x 24bits */
 	/* XXX: for some reason using full buffer size causes
 	 * considerable TX slowdown with update sizes that fill the
 	 * whole buffer */
-	if (!dss_has_feature(FEAT_DSI_GNQ))
+	if (!(dsi->data->quirks & DSI_QUIRK_GNQ))
 		return 1023 * 3;
 
 	val = REG_GET(dsidev, DSI_GNQ, 14, 12); /* VP1_LINE_BUFFER_SIZE */
@@ -1872,6 +1902,7 @@ static inline unsigned ddr2ns(struct platform_device *dsidev, unsigned ddr)
 
 static void dsi_cio_timings(struct platform_device *dsidev)
 {
+	struct dsi_data *dsi = dsi_get_dsidrv_data(dsidev);
 	u32 r;
 	u32 ths_prepare, ths_prepare_ths_zero, ths_trail, ths_exit;
 	u32 tlpx_half, tclk_trail, tclk_zero;
@@ -1934,7 +1965,7 @@ static void dsi_cio_timings(struct platform_device *dsidev)
 	r = FLD_MOD(r, tclk_trail, 15, 8);
 	r = FLD_MOD(r, tclk_zero, 7, 0);
 
-	if (dss_has_feature(FEAT_DSI_PHY_DCC)) {
+	if (dsi->data->quirks & DSI_QUIRK_PHY_DCC) {
 		r = FLD_MOD(r, 0, 21, 21);	/* DCCEN = disable */
 		r = FLD_MOD(r, 1, 22, 22);	/* CLKINP_DIVBY2EN = enable */
 		r = FLD_MOD(r, 1, 23, 23);	/* CLKINP_SEL = enable */
@@ -2006,7 +2037,7 @@ static int dsi_cio_wait_tx_clk_esc_reset(struct platform_device *dsidev)
 	static const u8 offsets_new[] = { 24, 25, 26, 27, 28 };
 	const u8 *offsets;
 
-	if (dss_has_feature(FEAT_DSI_REVERSE_TXCLKESC))
+	if (dsi->data->quirks & DSI_QUIRK_REVERSE_TXCLKESC)
 		offsets = offsets_old;
 	else
 		offsets = offsets_new;
@@ -2060,6 +2091,83 @@ static unsigned dsi_get_lane_mask(struct platform_device *dsidev)
 	return mask;
 }
 
+/* OMAP4 CONTROL_DSIPHY */
+#define OMAP4_DSIPHY_SYSCON_OFFSET			0x78
+
+#define OMAP4_DSI2_LANEENABLE_SHIFT			29
+#define OMAP4_DSI2_LANEENABLE_MASK			(0x7 << 29)
+#define OMAP4_DSI1_LANEENABLE_SHIFT			24
+#define OMAP4_DSI1_LANEENABLE_MASK			(0x1f << 24)
+#define OMAP4_DSI1_PIPD_SHIFT				19
+#define OMAP4_DSI1_PIPD_MASK				(0x1f << 19)
+#define OMAP4_DSI2_PIPD_SHIFT				14
+#define OMAP4_DSI2_PIPD_MASK				(0x1f << 14)
+
+static int dsi_omap4_mux_pads(struct dsi_data *dsi, unsigned int lanes)
+{
+	u32 enable_mask, enable_shift;
+	u32 pipd_mask, pipd_shift;
+
+	if (dsi->module_id == 0) {
+		enable_mask = OMAP4_DSI1_LANEENABLE_MASK;
+		enable_shift = OMAP4_DSI1_LANEENABLE_SHIFT;
+		pipd_mask = OMAP4_DSI1_PIPD_MASK;
+		pipd_shift = OMAP4_DSI1_PIPD_SHIFT;
+	} else if (dsi->module_id == 1) {
+		enable_mask = OMAP4_DSI2_LANEENABLE_MASK;
+		enable_shift = OMAP4_DSI2_LANEENABLE_SHIFT;
+		pipd_mask = OMAP4_DSI2_PIPD_MASK;
+		pipd_shift = OMAP4_DSI2_PIPD_SHIFT;
+	} else {
+		return -ENODEV;
+	}
+
+	return regmap_update_bits(dsi->syscon, OMAP4_DSIPHY_SYSCON_OFFSET,
+		enable_mask | pipd_mask,
+		(lanes << enable_shift) | (lanes << pipd_shift));
+}
+
+/* OMAP5 CONTROL_DSIPHY */
+
+#define OMAP5_DSIPHY_SYSCON_OFFSET	0x74
+
+#define OMAP5_DSI1_LANEENABLE_SHIFT	24
+#define OMAP5_DSI2_LANEENABLE_SHIFT	19
+#define OMAP5_DSI_LANEENABLE_MASK	0x1f
+
+static int dsi_omap5_mux_pads(struct dsi_data *dsi, unsigned int lanes)
+{
+	u32 enable_shift;
+
+	if (dsi->module_id == 0)
+		enable_shift = OMAP5_DSI1_LANEENABLE_SHIFT;
+	else if (dsi->module_id == 1)
+		enable_shift = OMAP5_DSI2_LANEENABLE_SHIFT;
+	else
+		return -ENODEV;
+
+	return regmap_update_bits(dsi->syscon, OMAP5_DSIPHY_SYSCON_OFFSET,
+		OMAP5_DSI_LANEENABLE_MASK << enable_shift,
+		lanes << enable_shift);
+}
+
+static int dsi_enable_pads(struct dsi_data *dsi, unsigned int lane_mask)
+{
+	if (dsi->data->model == DSI_MODEL_OMAP4)
+		return dsi_omap4_mux_pads(dsi, lane_mask);
+	if (dsi->data->model == DSI_MODEL_OMAP5)
+		return dsi_omap5_mux_pads(dsi, lane_mask);
+	return 0;
+}
+
+static void dsi_disable_pads(struct dsi_data *dsi)
+{
+	if (dsi->data->model == DSI_MODEL_OMAP4)
+		dsi_omap4_mux_pads(dsi, 0);
+	else if (dsi->data->model == DSI_MODEL_OMAP5)
+		dsi_omap5_mux_pads(dsi, 0);
+}
+
 static int dsi_cio_init(struct platform_device *dsidev)
 {
 	struct dsi_data *dsi = dsi_get_dsidrv_data(dsidev);
@@ -2068,7 +2176,7 @@ static int dsi_cio_init(struct platform_device *dsidev)
 
 	DSSDBG("DSI CIO init starts");
 
-	r = dss_dsi_enable_pads(dsi->module_id, dsi_get_lane_mask(dsidev));
+	r = dsi_enable_pads(dsi, dsi_get_lane_mask(dsidev));
 	if (r)
 		return r;
 
@@ -2178,7 +2286,7 @@ static int dsi_cio_init(struct platform_device *dsidev)
 		dsi_cio_disable_lane_override(dsidev);
 err_scp_clk_dom:
 	dsi_disable_scp_clk(dsidev);
-	dss_dsi_disable_pads(dsi->module_id, dsi_get_lane_mask(dsidev));
+	dsi_disable_pads(dsi);
 	return r;
 }
 
@@ -2191,7 +2299,7 @@ static void dsi_cio_uninit(struct platform_device *dsidev)
 
 	dsi_cio_power(dsidev, DSI_COMPLEXIO_POWER_OFF);
 	dsi_disable_scp_clk(dsidev);
-	dss_dsi_disable_pads(dsi->module_id, dsi_get_lane_mask(dsidev));
+	dsi_disable_pads(dsi);
 }
 
 static void dsi_config_tx_fifo(struct platform_device *dsidev,
@@ -2439,7 +2547,7 @@ static void dsi_vc_initial_config(struct platform_device *dsidev, int channel)
 	r = FLD_MOD(r, 1, 7, 7); /* CS_TX_EN */
 	r = FLD_MOD(r, 1, 8, 8); /* ECC_TX_EN */
 	r = FLD_MOD(r, 0, 9, 9); /* MODE_SPEED, high speed on/off */
-	if (dss_has_feature(FEAT_DSI_VC_OCP_WIDTH))
+	if (dsi->data->quirks & DSI_QUIRK_VC_OCP_WIDTH)
 		r = FLD_MOD(r, 3, 11, 10);	/* OCP_WIDTH = 32 bit */
 
 	r = FLD_MOD(r, 4, 29, 27); /* DMA_RX_REQ_NB = no dma */
@@ -2474,7 +2582,7 @@ static int dsi_vc_config_source(struct platform_device *dsidev, int channel,
 	REG_FLD_MOD(dsidev, DSI_VC_CTRL(channel), source, 1, 1);
 
 	/* DCS_CMD_ENABLE */
-	if (dss_has_feature(FEAT_DSI_DCS_CMD_CONFIG_VC)) {
+	if (dsi->data->quirks & DSI_QUIRK_DCS_CMD_CONFIG_VC) {
 		bool enable = source == DSI_VC_SOURCE_VP;
 		REG_FLD_MOD(dsidev, DSI_VC_CTRL(channel), enable, 30, 30);
 	}
@@ -3607,7 +3715,7 @@ static int dsi_proto_config(struct platform_device *dsidev)
 	r = FLD_MOD(r, 0, 8, 8);	/* VP_CLK_POL */
 	r = FLD_MOD(r, 1, 14, 14);	/* TRIGGER_RESET_MODE */
 	r = FLD_MOD(r, 1, 19, 19);	/* EOT_ENABLE */
-	if (!dss_has_feature(FEAT_DSI_DCS_CMD_CONFIG_VC)) {
+	if (!(dsi->data->quirks & DSI_QUIRK_DCS_CMD_CONFIG_VC)) {
 		r = FLD_MOD(r, 1, 24, 24);	/* DCS_CMD_ENABLE */
 		/* DCS_CMD_CODE, 1=start, 0=continue */
 		r = FLD_MOD(r, 0, 25, 25);
@@ -4450,6 +4558,7 @@ static bool dsi_cm_calc_pll_cb(int n, int m, unsigned long fint,
 		unsigned long clkdco, void *data)
 {
 	struct dsi_clk_calc_ctx *ctx = data;
+	struct dsi_data *dsi = dsi_get_dsidrv_data(ctx->dsidev);
 
 	ctx->dsi_cinfo.n = n;
 	ctx->dsi_cinfo.m = m;
@@ -4457,7 +4566,7 @@ static bool dsi_cm_calc_pll_cb(int n, int m, unsigned long fint,
 	ctx->dsi_cinfo.clkdco = clkdco;
 
 	return dss_pll_hsdiv_calc_a(ctx->pll, clkdco, ctx->req_pck_min,
-			dss_feat_get_param_max(FEAT_PARAM_DSS_FCK),
+			dsi->data->max_fck_freq,
 			dsi_cm_calc_hsdiv_cb, ctx);
 }
 
@@ -4749,6 +4858,7 @@ static bool dsi_vm_calc_pll_cb(int n, int m, unsigned long fint,
 		unsigned long clkdco, void *data)
 {
 	struct dsi_clk_calc_ctx *ctx = data;
+	struct dsi_data *dsi = dsi_get_dsidrv_data(ctx->dsidev);
 
 	ctx->dsi_cinfo.n = n;
 	ctx->dsi_cinfo.m = m;
@@ -4756,7 +4866,7 @@ static bool dsi_vm_calc_pll_cb(int n, int m, unsigned long fint,
 	ctx->dsi_cinfo.clkdco = clkdco;
 
 	return dss_pll_hsdiv_calc_a(ctx->pll, clkdco, ctx->req_pck_min,
-			dss_feat_get_param_max(FEAT_PARAM_DSS_FCK),
+			dsi->data->max_fck_freq,
 			dsi_vm_calc_hsdiv_cb, ctx);
 }
 
@@ -4827,7 +4937,7 @@ static int dsi_set_config(struct omap_dss_device *dssdev,
 		goto err;
 	}
 
-	dsi_pll_calc_dsi_fck(&ctx.dsi_cinfo);
+	dsi_pll_calc_dsi_fck(dsi, &ctx.dsi_cinfo);
 
 	r = dsi_lp_clock_calc(ctx.dsi_cinfo.clkout[HSDIV_DSI],
 		config->lp_clk_min, config->lp_clk_max, &dsi->user_lp_cinfo);
@@ -4857,24 +4967,14 @@ static int dsi_set_config(struct omap_dss_device *dssdev,
  * the channel in some more dynamic manner, or get the channel as a user
  * parameter.
  */
-static enum omap_channel dsi_get_channel(int module_id)
+static enum omap_channel dsi_get_channel(struct dsi_data *dsi)
 {
-	switch (omapdss_get_version()) {
-	case OMAPDSS_VER_OMAP24xx:
-	case OMAPDSS_VER_AM43xx:
-		DSSWARN("DSI not supported\n");
+	switch (dsi->data->model) {
+	case DSI_MODEL_OMAP3:
 		return OMAP_DSS_CHANNEL_LCD;
 
-	case OMAPDSS_VER_OMAP34xx_ES1:
-	case OMAPDSS_VER_OMAP34xx_ES3:
-	case OMAPDSS_VER_OMAP3630:
-	case OMAPDSS_VER_AM35xx:
-		return OMAP_DSS_CHANNEL_LCD;
-
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
-		switch (module_id) {
+	case DSI_MODEL_OMAP4:
+		switch (dsi->module_id) {
 		case 0:
 			return OMAP_DSS_CHANNEL_LCD;
 		case 1:
@@ -4884,8 +4984,8 @@ static enum omap_channel dsi_get_channel(int module_id)
 			return OMAP_DSS_CHANNEL_LCD;
 		}
 
-	case OMAPDSS_VER_OMAP5:
-		switch (module_id) {
+	case DSI_MODEL_OMAP5:
+		switch (dsi->module_id) {
 		case 0:
 			return OMAP_DSS_CHANNEL_LCD;
 		case 1:
@@ -5065,7 +5165,7 @@ static void dsi_init_output(struct platform_device *dsidev)
 
 	out->output_type = OMAP_DISPLAY_TYPE_DSI;
 	out->name = dsi->module_id == 0 ? "dsi.0" : "dsi.1";
-	out->dispc_channel = dsi_get_channel(dsi->module_id);
+	out->dispc_channel = dsi_get_channel(dsi);
 	out->ops.dsi = &dsi_ops;
 	out->owner = THIS_MODULE;
 
@@ -5240,29 +5340,7 @@ static int dsi_init_pll_data(struct platform_device *dsidev)
 	pll->id = dsi->module_id == 0 ? DSS_PLL_DSI1 : DSS_PLL_DSI2;
 	pll->clkin = clk;
 	pll->base = dsi->pll_base;
-
-	switch (omapdss_get_version()) {
-	case OMAPDSS_VER_OMAP34xx_ES1:
-	case OMAPDSS_VER_OMAP34xx_ES3:
-	case OMAPDSS_VER_OMAP3630:
-	case OMAPDSS_VER_AM35xx:
-		pll->hw = &dss_omap3_dsi_pll_hw;
-		break;
-
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
-		pll->hw = &dss_omap4_dsi_pll_hw;
-		break;
-
-	case OMAPDSS_VER_OMAP5:
-		pll->hw = &dss_omap5_dsi_pll_hw;
-		break;
-
-	default:
-		return -ENODEV;
-	}
-
+	pll->hw = dsi->data->pll_hw;
 	pll->ops = &dsi_pll_ops;
 
 	r = dss_pll_register(pll);
@@ -5273,9 +5351,74 @@ static int dsi_init_pll_data(struct platform_device *dsidev)
 }
 
 /* DSI1 HW IP initialisation */
+static const struct dsi_of_data dsi_of_data_omap34xx = {
+	.model = DSI_MODEL_OMAP3,
+	.pll_hw = &dss_omap3_dsi_pll_hw,
+	.modules = (const struct dsi_module_id_data[]) {
+		{ .address = 0x4804fc00, .id = 0, },
+		{ },
+	},
+	.max_fck_freq = 173000000,
+	.max_pll_lpdiv = (1 << 13) - 1,
+	.quirks = DSI_QUIRK_REVERSE_TXCLKESC,
+};
+
+static const struct dsi_of_data dsi_of_data_omap36xx = {
+	.model = DSI_MODEL_OMAP3,
+	.pll_hw = &dss_omap3_dsi_pll_hw,
+	.modules = (const struct dsi_module_id_data[]) {
+		{ .address = 0x4804fc00, .id = 0, },
+		{ },
+	},
+	.max_fck_freq = 173000000,
+	.max_pll_lpdiv = (1 << 13) - 1,
+	.quirks = DSI_QUIRK_PLL_PWR_BUG,
+};
+
+static const struct dsi_of_data dsi_of_data_omap4 = {
+	.model = DSI_MODEL_OMAP4,
+	.pll_hw = &dss_omap4_dsi_pll_hw,
+	.modules = (const struct dsi_module_id_data[]) {
+		{ .address = 0x58004000, .id = 0, },
+		{ .address = 0x58005000, .id = 1, },
+		{ },
+	},
+	.max_fck_freq = 170000000,
+	.max_pll_lpdiv = (1 << 13) - 1,
+	.quirks = DSI_QUIRK_DCS_CMD_CONFIG_VC | DSI_QUIRK_VC_OCP_WIDTH
+		| DSI_QUIRK_GNQ,
+};
+
+static const struct dsi_of_data dsi_of_data_omap5 = {
+	.model = DSI_MODEL_OMAP5,
+	.pll_hw = &dss_omap5_dsi_pll_hw,
+	.modules = (const struct dsi_module_id_data[]) {
+		{ .address = 0x58004000, .id = 0, },
+		{ .address = 0x58009000, .id = 1, },
+		{ },
+	},
+	.max_fck_freq = 209250000,
+	.max_pll_lpdiv = (1 << 13) - 1,
+	.quirks = DSI_QUIRK_DCS_CMD_CONFIG_VC | DSI_QUIRK_VC_OCP_WIDTH
+		| DSI_QUIRK_GNQ | DSI_QUIRK_PHY_DCC,
+};
+
+static const struct of_device_id dsi_of_match[] = {
+	{ .compatible = "ti,omap3-dsi", .data = &dsi_of_data_omap36xx, },
+	{ .compatible = "ti,omap4-dsi", .data = &dsi_of_data_omap4, },
+	{ .compatible = "ti,omap5-dsi", .data = &dsi_of_data_omap5, },
+	{},
+};
+
+static const struct soc_device_attribute dsi_soc_devices[] = {
+	{ .machine = "OMAP3[45]*",	.data = &dsi_of_data_omap34xx },
+	{ .machine = "AM35*",		.data = &dsi_of_data_omap34xx },
+	{ /* sentinel */ }
+};
 static int dsi_bind(struct device *dev, struct device *master, void *data)
 {
 	struct platform_device *dsidev = to_platform_device(dev);
+	const struct soc_device_attribute *soc;
 	const struct dsi_module_id_data *d;
 	u32 rev;
 	int r, i;
@@ -5339,7 +5482,13 @@ static int dsi_bind(struct device *dev, struct device *master, void *data)
 		return r;
 	}
 
-	d = of_match_node(dsi_of_match, dsidev->dev.of_node)->data;
+	soc = soc_device_match(dsi_soc_devices);
+	if (soc)
+		dsi->data = soc->data;
+	else
+		dsi->data = of_match_node(dsi_of_match, dev->of_node)->data;
+
+	d = dsi->data->modules;
 	while (d->address != 0 && d->address != dsi_mem->start)
 		d++;
 
@@ -5350,6 +5499,24 @@ static int dsi_bind(struct device *dev, struct device *master, void *data)
 
 	dsi->module_id = d->id;
 
+	if (dsi->data->model == DSI_MODEL_OMAP4 ||
+	    dsi->data->model == DSI_MODEL_OMAP5) {
+		struct device_node *np;
+
+		/*
+		 * The OMAP4/5 display DT bindings don't reference the padconf
+		 * syscon. Our only option to retrieve it is to find it by name.
+		 */
+		np = of_find_node_by_name(NULL,
+			dsi->data->model == DSI_MODEL_OMAP4 ?
+			"omap4_padconf_global" : "omap5_padconf_global");
+		if (!np)
+			return -ENODEV;
+
+		dsi->syscon = syscon_node_to_regmap(np);
+		of_node_put(np);
+	}
+
 	/* DSI VCs initialization */
 	for (i = 0; i < ARRAY_SIZE(dsi->vc); i++) {
 		dsi->vc[i].source = DSI_VC_SOURCE_L4;
@@ -5375,7 +5542,7 @@ static int dsi_bind(struct device *dev, struct device *master, void *data)
 
 	/* DSI on OMAP3 doesn't have register DSI_GNQ, set number
 	 * of data to 3 by default */
-	if (dss_has_feature(FEAT_DSI_GNQ))
+	if (dsi->data->quirks & DSI_QUIRK_GNQ)
 		/* NB_DATA_LANES */
 		dsi->num_lanes_supported = 1 + REG_GET(dsidev, DSI_GNQ, 11, 9);
 	else
@@ -5495,30 +5662,6 @@ static const struct dev_pm_ops dsi_pm_ops = {
 	.runtime_resume = dsi_runtime_resume,
 };
 
-static const struct dsi_module_id_data dsi_of_data_omap3[] = {
-	{ .address = 0x4804fc00, .id = 0, },
-	{ },
-};
-
-static const struct dsi_module_id_data dsi_of_data_omap4[] = {
-	{ .address = 0x58004000, .id = 0, },
-	{ .address = 0x58005000, .id = 1, },
-	{ },
-};
-
-static const struct dsi_module_id_data dsi_of_data_omap5[] = {
-	{ .address = 0x58004000, .id = 0, },
-	{ .address = 0x58009000, .id = 1, },
-	{ },
-};
-
-static const struct of_device_id dsi_of_match[] = {
-	{ .compatible = "ti,omap3-dsi", .data = dsi_of_data_omap3, },
-	{ .compatible = "ti,omap4-dsi", .data = dsi_of_data_omap4, },
-	{ .compatible = "ti,omap5-dsi", .data = dsi_of_data_omap5, },
-	{},
-};
-
 static struct platform_driver omap_dsihw_driver = {
 	.probe		= dsi_probe,
 	.remove		= dsi_remove,
diff --git a/drivers/gpu/drm/omapdrm/dss/dss.c b/drivers/gpu/drm/omapdrm/dss/dss.c
index 99e22ca..d1755f1 100644
--- a/drivers/gpu/drm/omapdrm/dss/dss.c
+++ b/drivers/gpu/drm/omapdrm/dss/dss.c
@@ -22,6 +22,7 @@
 
 #define DSS_SUBSYS_NAME "DSS"
 
+#include <linux/debugfs.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/io.h>
@@ -38,14 +39,15 @@
 #include <linux/mfd/syscon.h>
 #include <linux/regmap.h>
 #include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/of_graph.h>
 #include <linux/regulator/consumer.h>
 #include <linux/suspend.h>
 #include <linux/component.h>
+#include <linux/sys_soc.h>
 
 #include "omapdss.h"
 #include "dss.h"
-#include "dss_features.h"
 
 #define DSS_SZ_REGS			SZ_512
 
@@ -69,15 +71,24 @@ struct dss_reg {
 #define REG_FLD_MOD(idx, val, start, end) \
 	dss_write_reg(idx, FLD_MOD(dss_read_reg(idx), val, start, end))
 
+struct dss_ops {
+	int (*dpi_select_source)(int port, enum omap_channel channel);
+	int (*select_lcd_source)(enum omap_channel channel,
+		enum dss_clk_source clk_src);
+};
+
 struct dss_features {
+	enum dss_model model;
 	u8 fck_div_max;
+	unsigned int fck_freq_max;
 	u8 dss_fck_multiplier;
 	const char *parent_clk_name;
 	const enum omap_display_type *ports;
 	int num_ports;
-	int (*dpi_select_source)(int port, enum omap_channel channel);
-	int (*select_lcd_source)(enum omap_channel channel,
-		enum dss_clk_source clk_src);
+	const enum omap_dss_output_id *outputs;
+	const struct dss_ops *ops;
+	struct dss_reg_field dispc_clk_switch;
+	bool has_lcd_clk_src;
 };
 
 static struct {
@@ -139,8 +150,7 @@ static void dss_save_context(void)
 
 	SR(CONTROL);
 
-	if (dss_feat_get_supported_displays(OMAP_DSS_CHANNEL_LCD) &
-			OMAP_DISPLAY_TYPE_SDI) {
+	if (dss.feat->outputs[OMAP_DSS_CHANNEL_LCD] & OMAP_DSS_OUTPUT_SDI) {
 		SR(SDI_CONTROL);
 		SR(PLL_CONTROL);
 	}
@@ -159,8 +169,7 @@ static void dss_restore_context(void)
 
 	RR(CONTROL);
 
-	if (dss_feat_get_supported_displays(OMAP_DSS_CHANNEL_LCD) &
-			OMAP_DISPLAY_TYPE_SDI) {
+	if (dss.feat->outputs[OMAP_DSS_CHANNEL_LCD] & OMAP_DSS_OUTPUT_SDI) {
 		RR(SDI_CONTROL);
 		RR(PLL_CONTROL);
 	}
@@ -390,8 +399,7 @@ static void dss_dump_regs(struct seq_file *s)
 	DUMPREG(DSS_SYSSTATUS);
 	DUMPREG(DSS_CONTROL);
 
-	if (dss_feat_get_supported_displays(OMAP_DSS_CHANNEL_LCD) &
-			OMAP_DISPLAY_TYPE_SDI) {
+	if (dss.feat->outputs[OMAP_DSS_CHANNEL_LCD] & OMAP_DSS_OUTPUT_SDI) {
 		DUMPREG(DSS_SDI_CONTROL);
 		DUMPREG(DSS_PLL_CONTROL);
 		DUMPREG(DSS_SDI_STATUS);
@@ -419,14 +427,12 @@ static int dss_get_channel_index(enum omap_channel channel)
 static void dss_select_dispc_clk_source(enum dss_clk_source clk_src)
 {
 	int b;
-	u8 start, end;
 
 	/*
 	 * We always use PRCM clock as the DISPC func clock, except on DSS3,
 	 * where we don't have separate DISPC and LCD clock sources.
 	 */
-	if (WARN_ON(dss_has_feature(FEAT_LCD_CLK_SRC) &&
-		clk_src != DSS_CLK_SRC_FCK))
+	if (WARN_ON(dss.feat->has_lcd_clk_src && clk_src != DSS_CLK_SRC_FCK))
 		return;
 
 	switch (clk_src) {
@@ -444,9 +450,9 @@ static void dss_select_dispc_clk_source(enum dss_clk_source clk_src)
 		return;
 	}
 
-	dss_feat_get_reg_field(FEAT_REG_DISPC_CLK_SWITCH, &start, &end);
-
-	REG_FLD_MOD(DSS_CONTROL, b, start, end);	/* DISPC_CLK_SWITCH */
+	REG_FLD_MOD(DSS_CONTROL, b,			/* DISPC_CLK_SWITCH */
+		    dss.feat->dispc_clk_switch.start,
+		    dss.feat->dispc_clk_switch.end);
 
 	dss.dispc_clk_source = clk_src;
 }
@@ -570,13 +576,13 @@ void dss_select_lcd_clk_source(enum omap_channel channel,
 	int idx = dss_get_channel_index(channel);
 	int r;
 
-	if (!dss_has_feature(FEAT_LCD_CLK_SRC)) {
+	if (!dss.feat->has_lcd_clk_src) {
 		dss_select_dispc_clk_source(clk_src);
 		dss.lcd_clk_source[idx] = clk_src;
 		return;
 	}
 
-	r = dss.feat->select_lcd_source(channel, clk_src);
+	r = dss.feat->ops->select_lcd_source(channel, clk_src);
 	if (r)
 		return;
 
@@ -595,7 +601,7 @@ enum dss_clk_source dss_get_dsi_clk_source(int dsi_module)
 
 enum dss_clk_source dss_get_lcd_clk_source(enum omap_channel channel)
 {
-	if (dss_has_feature(FEAT_LCD_CLK_SRC)) {
+	if (dss.feat->has_lcd_clk_src) {
 		int idx = dss_get_channel_index(channel);
 		return dss.lcd_clk_source[idx];
 	} else {
@@ -615,7 +621,7 @@ bool dss_div_calc(unsigned long pck, unsigned long fck_min,
 	unsigned long prate;
 	unsigned m;
 
-	fck_hw_max = dss_feat_get_param_max(FEAT_PARAM_DSS_FCK);
+	fck_hw_max = dss.feat->fck_freq_max;
 
 	if (dss.parent_clk == NULL) {
 		unsigned pckd;
@@ -673,6 +679,16 @@ unsigned long dss_get_dispc_clk_rate(void)
 	return dss.dss_clk_rate;
 }
 
+unsigned long dss_get_max_fck_rate(void)
+{
+	return dss.feat->fck_freq_max;
+}
+
+enum omap_dss_output_id dss_get_supported_outputs(enum omap_channel channel)
+{
+	return dss.feat->outputs[channel];
+}
+
 static int dss_setup_default_clock(void)
 {
 	unsigned long max_dss_fck, prate;
@@ -680,7 +696,7 @@ static int dss_setup_default_clock(void)
 	unsigned fck_div;
 	int r;
 
-	max_dss_fck = dss_feat_get_param_max(FEAT_PARAM_DSS_FCK);
+	max_dss_fck = dss.feat->fck_freq_max;
 
 	if (dss.parent_clk == NULL) {
 		fck = clk_round_rate(dss.dss_clk, max_dss_fck);
@@ -721,27 +737,29 @@ void dss_set_dac_pwrdn_bgz(bool enable)
 
 void dss_select_hdmi_venc_clk_source(enum dss_hdmi_venc_clk_source_select src)
 {
-	enum omap_display_type dp;
-	dp = dss_feat_get_supported_displays(OMAP_DSS_CHANNEL_DIGIT);
+	enum omap_dss_output_id outputs;
+
+	outputs = dss.feat->outputs[OMAP_DSS_CHANNEL_DIGIT];
 
 	/* Complain about invalid selections */
-	WARN_ON((src == DSS_VENC_TV_CLK) && !(dp & OMAP_DISPLAY_TYPE_VENC));
-	WARN_ON((src == DSS_HDMI_M_PCLK) && !(dp & OMAP_DISPLAY_TYPE_HDMI));
+	WARN_ON((src == DSS_VENC_TV_CLK) && !(outputs & OMAP_DSS_OUTPUT_VENC));
+	WARN_ON((src == DSS_HDMI_M_PCLK) && !(outputs & OMAP_DSS_OUTPUT_HDMI));
 
 	/* Select only if we have options */
-	if ((dp & OMAP_DISPLAY_TYPE_VENC) && (dp & OMAP_DISPLAY_TYPE_HDMI))
+	if ((outputs & OMAP_DSS_OUTPUT_VENC) &&
+	    (outputs & OMAP_DSS_OUTPUT_HDMI))
 		REG_FLD_MOD(DSS_CONTROL, src, 15, 15);	/* VENC_HDMI_SWITCH */
 }
 
 enum dss_hdmi_venc_clk_source_select dss_get_hdmi_venc_clk_source(void)
 {
-	enum omap_display_type displays;
+	enum omap_dss_output_id outputs;
 
-	displays = dss_feat_get_supported_displays(OMAP_DSS_CHANNEL_DIGIT);
-	if ((displays & OMAP_DISPLAY_TYPE_HDMI) == 0)
+	outputs = dss.feat->outputs[OMAP_DSS_CHANNEL_DIGIT];
+	if ((outputs & OMAP_DSS_OUTPUT_HDMI) == 0)
 		return DSS_VENC_TV_CLK;
 
-	if ((displays & OMAP_DISPLAY_TYPE_VENC) == 0)
+	if ((outputs & OMAP_DSS_OUTPUT_VENC) == 0)
 		return DSS_HDMI_M_PCLK;
 
 	return REG_GET(DSS_CONTROL, 15, 15);
@@ -823,7 +841,7 @@ static int dss_dpi_select_source_dra7xx(int port, enum omap_channel channel)
 
 int dss_dpi_select_source(int port, enum omap_channel channel)
 {
-	return dss.feat->dpi_select_source(port, channel);
+	return dss.feat->ops->dpi_select_source(port, channel);
 }
 
 static int dss_get_clocks(void)
@@ -882,7 +900,7 @@ void dss_runtime_put(void)
 
 /* DEBUGFS */
 #if defined(CONFIG_OMAP2_DSS_DEBUGFS)
-void dss_debug_dump_clocks(struct seq_file *s)
+static void dss_debug_dump_clocks(struct seq_file *s)
 {
 	dss_dump_clocks(s);
 	dispc_dump_clocks(s);
@@ -890,8 +908,88 @@ void dss_debug_dump_clocks(struct seq_file *s)
 	dsi_dump_clocks(s);
 #endif
 }
-#endif
 
+static int dss_debug_show(struct seq_file *s, void *unused)
+{
+	void (*func)(struct seq_file *) = s->private;
+
+	func(s);
+	return 0;
+}
+
+static int dss_debug_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, dss_debug_show, inode->i_private);
+}
+
+static const struct file_operations dss_debug_fops = {
+	.open           = dss_debug_open,
+	.read           = seq_read,
+	.llseek         = seq_lseek,
+	.release        = single_release,
+};
+
+static struct dentry *dss_debugfs_dir;
+
+static int dss_initialize_debugfs(void)
+{
+	dss_debugfs_dir = debugfs_create_dir("omapdss", NULL);
+	if (IS_ERR(dss_debugfs_dir)) {
+		int err = PTR_ERR(dss_debugfs_dir);
+
+		dss_debugfs_dir = NULL;
+		return err;
+	}
+
+	debugfs_create_file("clk", S_IRUGO, dss_debugfs_dir,
+			&dss_debug_dump_clocks, &dss_debug_fops);
+
+	return 0;
+}
+
+static void dss_uninitialize_debugfs(void)
+{
+	if (dss_debugfs_dir)
+		debugfs_remove_recursive(dss_debugfs_dir);
+}
+
+int dss_debugfs_create_file(const char *name, void (*write)(struct seq_file *))
+{
+	struct dentry *d;
+
+	d = debugfs_create_file(name, S_IRUGO, dss_debugfs_dir,
+			write, &dss_debug_fops);
+
+	return PTR_ERR_OR_ZERO(d);
+}
+#else /* CONFIG_OMAP2_DSS_DEBUGFS */
+static inline int dss_initialize_debugfs(void)
+{
+	return 0;
+}
+static inline void dss_uninitialize_debugfs(void)
+{
+}
+#endif /* CONFIG_OMAP2_DSS_DEBUGFS */
+
+static const struct dss_ops dss_ops_omap2_omap3 = {
+	.dpi_select_source = &dss_dpi_select_source_omap2_omap3,
+};
+
+static const struct dss_ops dss_ops_omap4 = {
+	.dpi_select_source = &dss_dpi_select_source_omap4,
+	.select_lcd_source = &dss_lcd_clk_mux_omap4,
+};
+
+static const struct dss_ops dss_ops_omap5 = {
+	.dpi_select_source = &dss_dpi_select_source_omap5,
+	.select_lcd_source = &dss_lcd_clk_mux_omap5,
+};
+
+static const struct dss_ops dss_ops_dra7 = {
+	.dpi_select_source = &dss_dpi_select_source_dra7xx,
+	.select_lcd_source = &dss_lcd_clk_mux_dra7,
+};
 
 static const enum omap_display_type omap2plus_ports[] = {
 	OMAP_DISPLAY_TYPE_DPI,
@@ -908,130 +1006,168 @@ static const enum omap_display_type dra7xx_ports[] = {
 	OMAP_DISPLAY_TYPE_DPI,
 };
 
+static const enum omap_dss_output_id omap2_dss_supported_outputs[] = {
+	/* OMAP_DSS_CHANNEL_LCD */
+	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI,
+
+	/* OMAP_DSS_CHANNEL_DIGIT */
+	OMAP_DSS_OUTPUT_VENC,
+};
+
+static const enum omap_dss_output_id omap3430_dss_supported_outputs[] = {
+	/* OMAP_DSS_CHANNEL_LCD */
+	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
+	OMAP_DSS_OUTPUT_SDI | OMAP_DSS_OUTPUT_DSI1,
+
+	/* OMAP_DSS_CHANNEL_DIGIT */
+	OMAP_DSS_OUTPUT_VENC,
+};
+
+static const enum omap_dss_output_id omap3630_dss_supported_outputs[] = {
+	/* OMAP_DSS_CHANNEL_LCD */
+	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
+	OMAP_DSS_OUTPUT_DSI1,
+
+	/* OMAP_DSS_CHANNEL_DIGIT */
+	OMAP_DSS_OUTPUT_VENC,
+};
+
+static const enum omap_dss_output_id am43xx_dss_supported_outputs[] = {
+	/* OMAP_DSS_CHANNEL_LCD */
+	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI,
+};
+
+static const enum omap_dss_output_id omap4_dss_supported_outputs[] = {
+	/* OMAP_DSS_CHANNEL_LCD */
+	OMAP_DSS_OUTPUT_DBI | OMAP_DSS_OUTPUT_DSI1,
+
+	/* OMAP_DSS_CHANNEL_DIGIT */
+	OMAP_DSS_OUTPUT_VENC | OMAP_DSS_OUTPUT_HDMI,
+
+	/* OMAP_DSS_CHANNEL_LCD2 */
+	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
+	OMAP_DSS_OUTPUT_DSI2,
+};
+
+static const enum omap_dss_output_id omap5_dss_supported_outputs[] = {
+	/* OMAP_DSS_CHANNEL_LCD */
+	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
+	OMAP_DSS_OUTPUT_DSI1 | OMAP_DSS_OUTPUT_DSI2,
+
+	/* OMAP_DSS_CHANNEL_DIGIT */
+	OMAP_DSS_OUTPUT_HDMI,
+
+	/* OMAP_DSS_CHANNEL_LCD2 */
+	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
+	OMAP_DSS_OUTPUT_DSI1,
+
+	/* OMAP_DSS_CHANNEL_LCD3 */
+	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
+	OMAP_DSS_OUTPUT_DSI2,
+};
+
 static const struct dss_features omap24xx_dss_feats = {
+	.model			=	DSS_MODEL_OMAP2,
 	/*
 	 * fck div max is really 16, but the divider range has gaps. The range
 	 * from 1 to 6 has no gaps, so let's use that as a max.
 	 */
 	.fck_div_max		=	6,
+	.fck_freq_max		=	133000000,
 	.dss_fck_multiplier	=	2,
 	.parent_clk_name	=	"core_ck",
-	.dpi_select_source	=	&dss_dpi_select_source_omap2_omap3,
 	.ports			=	omap2plus_ports,
 	.num_ports		=	ARRAY_SIZE(omap2plus_ports),
+	.outputs		=	omap2_dss_supported_outputs,
+	.ops			=	&dss_ops_omap2_omap3,
+	.dispc_clk_switch	=	{ 0, 0 },
+	.has_lcd_clk_src	=	false,
 };
 
 static const struct dss_features omap34xx_dss_feats = {
+	.model			=	DSS_MODEL_OMAP3,
 	.fck_div_max		=	16,
+	.fck_freq_max		=	173000000,
 	.dss_fck_multiplier	=	2,
 	.parent_clk_name	=	"dpll4_ck",
-	.dpi_select_source	=	&dss_dpi_select_source_omap2_omap3,
 	.ports			=	omap34xx_ports,
+	.outputs		=	omap3430_dss_supported_outputs,
 	.num_ports		=	ARRAY_SIZE(omap34xx_ports),
+	.ops			=	&dss_ops_omap2_omap3,
+	.dispc_clk_switch	=	{ 0, 0 },
+	.has_lcd_clk_src	=	false,
 };
 
 static const struct dss_features omap3630_dss_feats = {
+	.model			=	DSS_MODEL_OMAP3,
 	.fck_div_max		=	32,
+	.fck_freq_max		=	173000000,
 	.dss_fck_multiplier	=	1,
 	.parent_clk_name	=	"dpll4_ck",
-	.dpi_select_source	=	&dss_dpi_select_source_omap2_omap3,
 	.ports			=	omap2plus_ports,
 	.num_ports		=	ARRAY_SIZE(omap2plus_ports),
+	.outputs		=	omap3630_dss_supported_outputs,
+	.ops			=	&dss_ops_omap2_omap3,
+	.dispc_clk_switch	=	{ 0, 0 },
+	.has_lcd_clk_src	=	false,
 };
 
 static const struct dss_features omap44xx_dss_feats = {
+	.model			=	DSS_MODEL_OMAP4,
 	.fck_div_max		=	32,
+	.fck_freq_max		=	186000000,
 	.dss_fck_multiplier	=	1,
 	.parent_clk_name	=	"dpll_per_x2_ck",
-	.dpi_select_source	=	&dss_dpi_select_source_omap4,
 	.ports			=	omap2plus_ports,
 	.num_ports		=	ARRAY_SIZE(omap2plus_ports),
-	.select_lcd_source	=	&dss_lcd_clk_mux_omap4,
+	.outputs		=	omap4_dss_supported_outputs,
+	.ops			=	&dss_ops_omap4,
+	.dispc_clk_switch	=	{ 9, 8 },
+	.has_lcd_clk_src	=	true,
 };
 
 static const struct dss_features omap54xx_dss_feats = {
+	.model			=	DSS_MODEL_OMAP5,
 	.fck_div_max		=	64,
+	.fck_freq_max		=	209250000,
 	.dss_fck_multiplier	=	1,
 	.parent_clk_name	=	"dpll_per_x2_ck",
-	.dpi_select_source	=	&dss_dpi_select_source_omap5,
 	.ports			=	omap2plus_ports,
 	.num_ports		=	ARRAY_SIZE(omap2plus_ports),
-	.select_lcd_source	=	&dss_lcd_clk_mux_omap5,
+	.outputs		=	omap5_dss_supported_outputs,
+	.ops			=	&dss_ops_omap5,
+	.dispc_clk_switch	=	{ 9, 7 },
+	.has_lcd_clk_src	=	true,
 };
 
 static const struct dss_features am43xx_dss_feats = {
+	.model			=	DSS_MODEL_OMAP3,
 	.fck_div_max		=	0,
+	.fck_freq_max		=	200000000,
 	.dss_fck_multiplier	=	0,
 	.parent_clk_name	=	NULL,
-	.dpi_select_source	=	&dss_dpi_select_source_omap2_omap3,
 	.ports			=	omap2plus_ports,
 	.num_ports		=	ARRAY_SIZE(omap2plus_ports),
+	.outputs		=	am43xx_dss_supported_outputs,
+	.ops			=	&dss_ops_omap2_omap3,
+	.dispc_clk_switch	=	{ 0, 0 },
+	.has_lcd_clk_src	=	true,
 };
 
 static const struct dss_features dra7xx_dss_feats = {
+	.model			=	DSS_MODEL_DRA7,
 	.fck_div_max		=	64,
+	.fck_freq_max		=	209250000,
 	.dss_fck_multiplier	=	1,
 	.parent_clk_name	=	"dpll_per_x2_ck",
-	.dpi_select_source	=	&dss_dpi_select_source_dra7xx,
 	.ports			=	dra7xx_ports,
 	.num_ports		=	ARRAY_SIZE(dra7xx_ports),
-	.select_lcd_source	=	&dss_lcd_clk_mux_dra7,
+	.outputs		=	omap5_dss_supported_outputs,
+	.ops			=	&dss_ops_dra7,
+	.dispc_clk_switch	=	{ 9, 7 },
+	.has_lcd_clk_src	=	true,
 };
 
-static int dss_init_features(struct platform_device *pdev)
-{
-	const struct dss_features *src;
-	struct dss_features *dst;
-
-	dst = devm_kzalloc(&pdev->dev, sizeof(*dst), GFP_KERNEL);
-	if (!dst) {
-		dev_err(&pdev->dev, "Failed to allocate local DSS Features\n");
-		return -ENOMEM;
-	}
-
-	switch (omapdss_get_version()) {
-	case OMAPDSS_VER_OMAP24xx:
-		src = &omap24xx_dss_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP34xx_ES1:
-	case OMAPDSS_VER_OMAP34xx_ES3:
-	case OMAPDSS_VER_AM35xx:
-		src = &omap34xx_dss_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP3630:
-		src = &omap3630_dss_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
-		src = &omap44xx_dss_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP5:
-		src = &omap54xx_dss_feats;
-		break;
-
-	case OMAPDSS_VER_AM43xx:
-		src = &am43xx_dss_feats;
-		break;
-
-	case OMAPDSS_VER_DRA7xx:
-		src = &dra7xx_dss_feats;
-		break;
-
-	default:
-		return -ENODEV;
-	}
-
-	memcpy(dst, src, sizeof(*dst));
-	dss.feat = dst;
-
-	return 0;
-}
-
 static int dss_init_ports(struct platform_device *pdev)
 {
 	struct device_node *parent = pdev->dev.of_node;
@@ -1045,7 +1181,7 @@ static int dss_init_ports(struct platform_device *pdev)
 
 		switch (dss.feat->ports[i]) {
 		case OMAP_DISPLAY_TYPE_DPI:
-			dpi_init_port(pdev, port);
+			dpi_init_port(pdev, port, dss.feat->model);
 			break;
 		case OMAP_DISPLAY_TYPE_SDI:
 			sdi_init_port(pdev, port);
@@ -1144,6 +1280,23 @@ static int dss_video_pll_probe(struct platform_device *pdev)
 }
 
 /* DSS HW IP initialisation */
+static const struct of_device_id dss_of_match[] = {
+	{ .compatible = "ti,omap2-dss", .data = &omap24xx_dss_feats },
+	{ .compatible = "ti,omap3-dss", .data = &omap3630_dss_feats },
+	{ .compatible = "ti,omap4-dss", .data = &omap44xx_dss_feats },
+	{ .compatible = "ti,omap5-dss", .data = &omap54xx_dss_feats },
+	{ .compatible = "ti,dra7-dss",  .data = &dra7xx_dss_feats },
+	{},
+};
+MODULE_DEVICE_TABLE(of, dss_of_match);
+
+static const struct soc_device_attribute dss_soc_devices[] = {
+	{ .machine = "OMAP3430/3530", .data = &omap34xx_dss_feats },
+	{ .machine = "AM35??",        .data = &omap34xx_dss_feats },
+	{ .family  = "AM43xx",        .data = &am43xx_dss_feats },
+	{ /* sentinel */ }
+};
+
 static int dss_bind(struct device *dev)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -1151,12 +1304,6 @@ static int dss_bind(struct device *dev)
 	u32 rev;
 	int r;
 
-	dss.pdev = pdev;
-
-	r = dss_init_features(dss.pdev);
-	if (r)
-		return r;
-
 	dss_mem = platform_get_resource(dss.pdev, IORESOURCE_MEM, 0);
 	dss.base = devm_ioremap_resource(&pdev->dev, dss_mem);
 	if (IS_ERR(dss.base))
@@ -1288,15 +1435,34 @@ static int dss_add_child_component(struct device *dev, void *data)
 
 static int dss_probe(struct platform_device *pdev)
 {
+	const struct soc_device_attribute *soc;
 	struct component_match *match = NULL;
 	int r;
 
+	dss.pdev = pdev;
+
+	/*
+	 * The various OMAP3-based SoCs can't be told apart using the compatible
+	 * string, use SoC device matching.
+	 */
+	soc = soc_device_match(dss_soc_devices);
+	if (soc)
+		dss.feat = soc->data;
+	else
+		dss.feat = of_match_device(dss_of_match, &pdev->dev)->data;
+
+	r = dss_initialize_debugfs();
+	if (r)
+		return r;
+
 	/* add all the child devices as components */
 	device_for_each_child(&pdev->dev, &match, dss_add_child_component);
 
 	r = component_master_add_with_match(&pdev->dev, &dss_component_ops, match);
-	if (r)
+	if (r) {
+		dss_uninitialize_debugfs();
 		return r;
+	}
 
 	return 0;
 }
@@ -1304,9 +1470,27 @@ static int dss_probe(struct platform_device *pdev)
 static int dss_remove(struct platform_device *pdev)
 {
 	component_master_del(&pdev->dev, &dss_component_ops);
+
+	dss_uninitialize_debugfs();
+
 	return 0;
 }
 
+static void dss_shutdown(struct platform_device *pdev)
+{
+	struct omap_dss_device *dssdev = NULL;
+
+	DSSDBG("shutdown\n");
+
+	for_each_dss_dev(dssdev) {
+		if (!dssdev->driver)
+			continue;
+
+		if (dssdev->state == OMAP_DSS_DISPLAY_ACTIVE)
+			dssdev->driver->disable(dssdev);
+	}
+}
+
 static int dss_runtime_suspend(struct device *dev)
 {
 	dss_save_context();
@@ -1343,20 +1527,10 @@ static const struct dev_pm_ops dss_pm_ops = {
 	.runtime_resume = dss_runtime_resume,
 };
 
-static const struct of_device_id dss_of_match[] = {
-	{ .compatible = "ti,omap2-dss", },
-	{ .compatible = "ti,omap3-dss", },
-	{ .compatible = "ti,omap4-dss", },
-	{ .compatible = "ti,omap5-dss", },
-	{ .compatible = "ti,dra7-dss", },
-	{},
-};
-
-MODULE_DEVICE_TABLE(of, dss_of_match);
-
 static struct platform_driver omap_dsshw_driver = {
 	.probe		= dss_probe,
 	.remove		= dss_remove,
+	.shutdown	= dss_shutdown,
 	.driver         = {
 		.name   = "omapdss_dss",
 		.pm	= &dss_pm_ops,
diff --git a/drivers/gpu/drm/omapdrm/dss/dss.h b/drivers/gpu/drm/omapdrm/dss/dss.h
index 8dbf35f..ed46557 100644
--- a/drivers/gpu/drm/omapdrm/dss/dss.h
+++ b/drivers/gpu/drm/omapdrm/dss/dss.h
@@ -27,6 +27,9 @@
 
 #include "omapdss.h"
 
+#define MAX_DSS_LCD_MANAGERS	3
+#define MAX_NUM_DSI		2
+
 #ifdef pr_fmt
 #undef pr_fmt
 #endif
@@ -72,6 +75,14 @@
 #define FLD_MOD(orig, val, start, end) \
 	(((orig) & ~FLD_MASK(start, end)) | FLD_VAL(val, start, end))
 
+enum dss_model {
+	DSS_MODEL_OMAP2,
+	DSS_MODEL_OMAP3,
+	DSS_MODEL_OMAP4,
+	DSS_MODEL_OMAP5,
+	DSS_MODEL_DRA7,
+};
+
 enum dss_io_pad_mode {
 	DSS_IO_PAD_MODE_RESET,
 	DSS_IO_PAD_MODE_RFBI,
@@ -174,6 +185,9 @@ struct dss_pll_hw {
 	bool has_freqsel;
 	bool has_selfreqdco;
 	bool has_refsel;
+
+	/* DRA7 errata i886: use high N & M to avoid jitter */
+	bool errata_i886;
 };
 
 struct dss_pll {
@@ -192,6 +206,11 @@ struct dss_pll {
 	struct dss_pll_clock_info cinfo;
 };
 
+/* Defines a generic omap register field */
+struct dss_reg_field {
+	u8 start, end;
+};
+
 struct dispc_clock_info {
 	/* rates that we get with dividers below */
 	unsigned long lck;
@@ -219,10 +238,11 @@ struct seq_file;
 struct platform_device;
 
 /* core */
-int dss_dsi_enable_pads(int dsi_id, unsigned lane_mask);
-void dss_dsi_disable_pads(int dsi_id, unsigned lane_mask);
-int dss_set_min_bus_tput(struct device *dev, unsigned long tput);
-int dss_debugfs_create_file(const char *name, void (*write)(struct seq_file *));
+static inline int dss_set_min_bus_tput(struct device *dev, unsigned long tput)
+{
+	/* To be implemented when the OMAP platform will provide this feature */
+	return 0;
+}
 
 static inline bool dss_mgr_is_lcd(enum omap_channel id)
 {
@@ -234,6 +254,16 @@ static inline bool dss_mgr_is_lcd(enum omap_channel id)
 }
 
 /* DSS */
+#if defined(CONFIG_OMAP2_DSS_DEBUGFS)
+int dss_debugfs_create_file(const char *name, void (*write)(struct seq_file *));
+#else
+static inline int dss_debugfs_create_file(const char *name,
+					  void (*write)(struct seq_file *))
+{
+	return 0;
+}
+#endif /* CONFIG_OMAP2_DSS_DEBUGFS */
+
 int dss_init_platform_driver(void) __init;
 void dss_uninit_platform_driver(void);
 
@@ -241,6 +271,8 @@ int dss_runtime_get(void);
 void dss_runtime_put(void);
 
 unsigned long dss_get_dispc_clk_rate(void);
+unsigned long dss_get_max_fck_rate(void);
+enum omap_dss_output_id dss_get_supported_outputs(enum omap_channel channel);
 int dss_dpi_select_source(int port, enum omap_channel channel);
 void dss_select_hdmi_venc_clk_source(enum dss_hdmi_venc_clk_source_select);
 enum dss_hdmi_venc_clk_source_select dss_get_hdmi_venc_clk_source(void);
@@ -252,10 +284,6 @@ struct dss_pll *dss_video_pll_init(struct platform_device *pdev, int id,
 	struct regulator *regulator);
 void dss_video_pll_uninit(struct dss_pll *pll);
 
-#if defined(CONFIG_OMAP2_DSS_DEBUGFS)
-void dss_debug_dump_clocks(struct seq_file *s);
-#endif
-
 void dss_ctrl_pll_enable(enum dss_pll_id pll_id, bool enable);
 
 void dss_sdi_init(int datapairs);
@@ -312,11 +340,12 @@ void dsi_irq_handler(void);
 
 /* DPI */
 #ifdef CONFIG_OMAP2_DSS_DPI
-int dpi_init_port(struct platform_device *pdev, struct device_node *port);
+int dpi_init_port(struct platform_device *pdev, struct device_node *port,
+		  enum dss_model dss_model);
 void dpi_uninit_port(struct device_node *port);
 #else
 static inline int dpi_init_port(struct platform_device *pdev,
-		struct device_node *port)
+		struct device_node *port, enum dss_model dss_model)
 {
 	return 0;
 }
diff --git a/drivers/gpu/drm/omapdrm/dss/dss_features.c b/drivers/gpu/drm/omapdrm/dss/dss_features.c
deleted file mode 100644
index 0e59971..0000000
--- a/drivers/gpu/drm/omapdrm/dss/dss_features.c
+++ /dev/null
@@ -1,905 +0,0 @@
-/*
- * linux/drivers/video/omap2/dss/dss_features.c
- *
- * Copyright (C) 2010 Texas Instruments
- * Author: Archit Taneja <archit@ti.com>
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License version 2 as published by
- * the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/types.h>
-#include <linux/err.h>
-#include <linux/slab.h>
-#include <drm/drm_fourcc.h>
-
-#include "omapdss.h"
-#include "dss.h"
-#include "dss_features.h"
-
-/* Defines a generic omap register field */
-struct dss_reg_field {
-	u8 start, end;
-};
-
-struct dss_param_range {
-	int min, max;
-};
-
-struct omap_dss_features {
-	const struct dss_reg_field *reg_fields;
-	const int num_reg_fields;
-
-	const enum dss_feat_id *features;
-	const int num_features;
-
-	const int num_mgrs;
-	const int num_ovls;
-	const enum omap_display_type *supported_displays;
-	const enum omap_dss_output_id *supported_outputs;
-	const u32 **supported_color_modes;
-	const enum omap_overlay_caps *overlay_caps;
-	const struct dss_param_range *dss_params;
-
-	const u32 buffer_size_unit;
-	const u32 burst_size_unit;
-};
-
-/* This struct is assigned to one of the below during initialization */
-static const struct omap_dss_features *omap_current_dss_features;
-
-static const struct dss_reg_field omap2_dss_reg_fields[] = {
-	[FEAT_REG_FIRHINC]			= { 11, 0 },
-	[FEAT_REG_FIRVINC]			= { 27, 16 },
-	[FEAT_REG_FIFOLOWTHRESHOLD]		= { 8, 0 },
-	[FEAT_REG_FIFOHIGHTHRESHOLD]		= { 24, 16 },
-	[FEAT_REG_FIFOSIZE]			= { 8, 0 },
-	[FEAT_REG_HORIZONTALACCU]		= { 9, 0 },
-	[FEAT_REG_VERTICALACCU]			= { 25, 16 },
-	[FEAT_REG_DISPC_CLK_SWITCH]		= { 0, 0 },
-};
-
-static const struct dss_reg_field omap3_dss_reg_fields[] = {
-	[FEAT_REG_FIRHINC]			= { 12, 0 },
-	[FEAT_REG_FIRVINC]			= { 28, 16 },
-	[FEAT_REG_FIFOLOWTHRESHOLD]		= { 11, 0 },
-	[FEAT_REG_FIFOHIGHTHRESHOLD]		= { 27, 16 },
-	[FEAT_REG_FIFOSIZE]			= { 10, 0 },
-	[FEAT_REG_HORIZONTALACCU]		= { 9, 0 },
-	[FEAT_REG_VERTICALACCU]			= { 25, 16 },
-	[FEAT_REG_DISPC_CLK_SWITCH]		= { 0, 0 },
-};
-
-static const struct dss_reg_field am43xx_dss_reg_fields[] = {
-	[FEAT_REG_FIRHINC]			= { 12, 0 },
-	[FEAT_REG_FIRVINC]			= { 28, 16 },
-	[FEAT_REG_FIFOLOWTHRESHOLD]	= { 11, 0 },
-	[FEAT_REG_FIFOHIGHTHRESHOLD]		= { 27, 16 },
-	[FEAT_REG_FIFOSIZE]		= { 10, 0 },
-	[FEAT_REG_HORIZONTALACCU]		= { 9, 0 },
-	[FEAT_REG_VERTICALACCU]			= { 25, 16 },
-	[FEAT_REG_DISPC_CLK_SWITCH]		= { 0, 0 },
-};
-
-static const struct dss_reg_field omap4_dss_reg_fields[] = {
-	[FEAT_REG_FIRHINC]			= { 12, 0 },
-	[FEAT_REG_FIRVINC]			= { 28, 16 },
-	[FEAT_REG_FIFOLOWTHRESHOLD]		= { 15, 0 },
-	[FEAT_REG_FIFOHIGHTHRESHOLD]		= { 31, 16 },
-	[FEAT_REG_FIFOSIZE]			= { 15, 0 },
-	[FEAT_REG_HORIZONTALACCU]		= { 10, 0 },
-	[FEAT_REG_VERTICALACCU]			= { 26, 16 },
-	[FEAT_REG_DISPC_CLK_SWITCH]		= { 9, 8 },
-};
-
-static const struct dss_reg_field omap5_dss_reg_fields[] = {
-	[FEAT_REG_FIRHINC]			= { 12, 0 },
-	[FEAT_REG_FIRVINC]			= { 28, 16 },
-	[FEAT_REG_FIFOLOWTHRESHOLD]		= { 15, 0 },
-	[FEAT_REG_FIFOHIGHTHRESHOLD]		= { 31, 16 },
-	[FEAT_REG_FIFOSIZE]			= { 15, 0 },
-	[FEAT_REG_HORIZONTALACCU]		= { 10, 0 },
-	[FEAT_REG_VERTICALACCU]			= { 26, 16 },
-	[FEAT_REG_DISPC_CLK_SWITCH]		= { 9, 7 },
-};
-
-static const enum omap_display_type omap2_dss_supported_displays[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DISPLAY_TYPE_DPI | OMAP_DISPLAY_TYPE_DBI,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DISPLAY_TYPE_VENC,
-};
-
-static const enum omap_display_type omap3430_dss_supported_displays[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DISPLAY_TYPE_DPI | OMAP_DISPLAY_TYPE_DBI |
-	OMAP_DISPLAY_TYPE_SDI | OMAP_DISPLAY_TYPE_DSI,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DISPLAY_TYPE_VENC,
-};
-
-static const enum omap_display_type omap3630_dss_supported_displays[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DISPLAY_TYPE_DPI | OMAP_DISPLAY_TYPE_DBI |
-	OMAP_DISPLAY_TYPE_DSI,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DISPLAY_TYPE_VENC,
-};
-
-static const enum omap_display_type am43xx_dss_supported_displays[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DISPLAY_TYPE_DPI | OMAP_DISPLAY_TYPE_DBI,
-};
-
-static const enum omap_display_type omap4_dss_supported_displays[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DISPLAY_TYPE_DBI | OMAP_DISPLAY_TYPE_DSI,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DISPLAY_TYPE_VENC | OMAP_DISPLAY_TYPE_HDMI,
-
-	/* OMAP_DSS_CHANNEL_LCD2 */
-	OMAP_DISPLAY_TYPE_DPI | OMAP_DISPLAY_TYPE_DBI |
-	OMAP_DISPLAY_TYPE_DSI,
-};
-
-static const enum omap_display_type omap5_dss_supported_displays[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DISPLAY_TYPE_DPI | OMAP_DISPLAY_TYPE_DBI |
-	OMAP_DISPLAY_TYPE_DSI,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DISPLAY_TYPE_HDMI | OMAP_DISPLAY_TYPE_DPI,
-
-	/* OMAP_DSS_CHANNEL_LCD2 */
-	OMAP_DISPLAY_TYPE_DPI | OMAP_DISPLAY_TYPE_DBI |
-	OMAP_DISPLAY_TYPE_DSI,
-};
-
-static const enum omap_dss_output_id omap2_dss_supported_outputs[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DSS_OUTPUT_VENC,
-};
-
-static const enum omap_dss_output_id omap3430_dss_supported_outputs[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
-	OMAP_DSS_OUTPUT_SDI | OMAP_DSS_OUTPUT_DSI1,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DSS_OUTPUT_VENC,
-};
-
-static const enum omap_dss_output_id omap3630_dss_supported_outputs[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
-	OMAP_DSS_OUTPUT_DSI1,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DSS_OUTPUT_VENC,
-};
-
-static const enum omap_dss_output_id am43xx_dss_supported_outputs[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI,
-};
-
-static const enum omap_dss_output_id omap4_dss_supported_outputs[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DSS_OUTPUT_DBI | OMAP_DSS_OUTPUT_DSI1,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DSS_OUTPUT_VENC | OMAP_DSS_OUTPUT_HDMI,
-
-	/* OMAP_DSS_CHANNEL_LCD2 */
-	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
-	OMAP_DSS_OUTPUT_DSI2,
-};
-
-static const enum omap_dss_output_id omap5_dss_supported_outputs[] = {
-	/* OMAP_DSS_CHANNEL_LCD */
-	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
-	OMAP_DSS_OUTPUT_DSI1 | OMAP_DSS_OUTPUT_DSI2,
-
-	/* OMAP_DSS_CHANNEL_DIGIT */
-	OMAP_DSS_OUTPUT_HDMI,
-
-	/* OMAP_DSS_CHANNEL_LCD2 */
-	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
-	OMAP_DSS_OUTPUT_DSI1,
-
-	/* OMAP_DSS_CHANNEL_LCD3 */
-	OMAP_DSS_OUTPUT_DPI | OMAP_DSS_OUTPUT_DBI |
-	OMAP_DSS_OUTPUT_DSI2,
-};
-
-#define COLOR_ARRAY(arr...) (const u32[]) { arr, 0 }
-
-static const u32 *omap2_dss_supported_color_modes[] = {
-
-	/* OMAP_DSS_GFX */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGBX4444, DRM_FORMAT_RGB565,
-	DRM_FORMAT_XRGB8888, DRM_FORMAT_RGB888),
-
-	/* OMAP_DSS_VIDEO1 */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_YUYV,
-	DRM_FORMAT_UYVY),
-
-	/* OMAP_DSS_VIDEO2 */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_YUYV,
-	DRM_FORMAT_UYVY),
-};
-
-static const u32 *omap3_dss_supported_color_modes[] = {
-	/* OMAP_DSS_GFX */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGBX4444, DRM_FORMAT_ARGB4444,
-	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_ARGB8888,
-	DRM_FORMAT_RGBA8888, DRM_FORMAT_RGBX8888),
-
-	/* OMAP_DSS_VIDEO1 */
-	COLOR_ARRAY(
-	DRM_FORMAT_XRGB8888, DRM_FORMAT_RGB888,
-	DRM_FORMAT_RGBX4444, DRM_FORMAT_RGB565,
-	DRM_FORMAT_YUYV, DRM_FORMAT_UYVY),
-
-	/* OMAP_DSS_VIDEO2 */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGBX4444, DRM_FORMAT_ARGB4444,
-	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_YUYV,
-	DRM_FORMAT_UYVY, DRM_FORMAT_ARGB8888,
-	DRM_FORMAT_RGBA8888, DRM_FORMAT_RGBX8888),
-};
-
-static const u32 *omap4_dss_supported_color_modes[] = {
-	/* OMAP_DSS_GFX */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGBX4444, DRM_FORMAT_ARGB4444,
-	DRM_FORMAT_RGB565, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_ARGB8888,
-	DRM_FORMAT_RGBA8888, DRM_FORMAT_RGBX8888,
-	DRM_FORMAT_ARGB1555, DRM_FORMAT_XRGB4444,
-	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB1555),
-
-	/* OMAP_DSS_VIDEO1 */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGB565, DRM_FORMAT_RGBX4444,
-	DRM_FORMAT_YUYV, DRM_FORMAT_ARGB1555,
-	DRM_FORMAT_RGBA8888, DRM_FORMAT_NV12,
-	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_UYVY,
-	DRM_FORMAT_ARGB4444, DRM_FORMAT_XRGB1555,
-	DRM_FORMAT_ARGB8888, DRM_FORMAT_XRGB4444,
-	DRM_FORMAT_RGBX8888),
-
-       /* OMAP_DSS_VIDEO2 */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGB565, DRM_FORMAT_RGBX4444,
-	DRM_FORMAT_YUYV, DRM_FORMAT_ARGB1555,
-	DRM_FORMAT_RGBA8888, DRM_FORMAT_NV12,
-	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_UYVY,
-	DRM_FORMAT_ARGB4444, DRM_FORMAT_XRGB1555,
-	DRM_FORMAT_ARGB8888, DRM_FORMAT_XRGB4444,
-	DRM_FORMAT_RGBX8888),
-
-	/* OMAP_DSS_VIDEO3 */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGB565, DRM_FORMAT_RGBX4444,
-	DRM_FORMAT_YUYV, DRM_FORMAT_ARGB1555,
-	DRM_FORMAT_RGBA8888, DRM_FORMAT_NV12,
-	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_UYVY,
-	DRM_FORMAT_ARGB4444, DRM_FORMAT_XRGB1555,
-	DRM_FORMAT_ARGB8888, DRM_FORMAT_XRGB4444,
-	DRM_FORMAT_RGBX8888),
-
-	/* OMAP_DSS_WB */
-	COLOR_ARRAY(
-	DRM_FORMAT_RGB565, DRM_FORMAT_RGBX4444,
-	DRM_FORMAT_YUYV, DRM_FORMAT_ARGB1555,
-	DRM_FORMAT_RGBA8888, DRM_FORMAT_NV12,
-	DRM_FORMAT_RGBA4444, DRM_FORMAT_XRGB8888,
-	DRM_FORMAT_RGB888, DRM_FORMAT_UYVY,
-	DRM_FORMAT_ARGB4444, DRM_FORMAT_XRGB1555,
-	DRM_FORMAT_ARGB8888, DRM_FORMAT_XRGB4444,
-	DRM_FORMAT_RGBX8888),
-};
-
-static const enum omap_overlay_caps omap2_dss_overlay_caps[] = {
-	/* OMAP_DSS_GFX */
-	OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO1 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_POS |
-		OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO2 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_POS |
-		OMAP_DSS_OVL_CAP_REPLICATION,
-};
-
-static const enum omap_overlay_caps omap3430_dss_overlay_caps[] = {
-	/* OMAP_DSS_GFX */
-	OMAP_DSS_OVL_CAP_GLOBAL_ALPHA | OMAP_DSS_OVL_CAP_POS |
-		OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO1 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_POS |
-		OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO2 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
-		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
-};
-
-static const enum omap_overlay_caps omap3630_dss_overlay_caps[] = {
-	/* OMAP_DSS_GFX */
-	OMAP_DSS_OVL_CAP_GLOBAL_ALPHA | OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA |
-		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO1 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_POS |
-		OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO2 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
-		OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA | OMAP_DSS_OVL_CAP_POS |
-		OMAP_DSS_OVL_CAP_REPLICATION,
-};
-
-static const enum omap_overlay_caps omap4_dss_overlay_caps[] = {
-	/* OMAP_DSS_GFX */
-	OMAP_DSS_OVL_CAP_GLOBAL_ALPHA | OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA |
-		OMAP_DSS_OVL_CAP_ZORDER | OMAP_DSS_OVL_CAP_POS |
-		OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO1 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
-		OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA | OMAP_DSS_OVL_CAP_ZORDER |
-		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO2 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
-		OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA | OMAP_DSS_OVL_CAP_ZORDER |
-		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
-
-	/* OMAP_DSS_VIDEO3 */
-	OMAP_DSS_OVL_CAP_SCALE | OMAP_DSS_OVL_CAP_GLOBAL_ALPHA |
-		OMAP_DSS_OVL_CAP_PRE_MULT_ALPHA | OMAP_DSS_OVL_CAP_ZORDER |
-		OMAP_DSS_OVL_CAP_POS | OMAP_DSS_OVL_CAP_REPLICATION,
-};
-
-static const struct dss_param_range omap2_dss_param_range[] = {
-	[FEAT_PARAM_DSS_FCK]			= { 0, 133000000 },
-	[FEAT_PARAM_DSS_PCD]			= { 2, 255 },
-	[FEAT_PARAM_DOWNSCALE]			= { 1, 2 },
-	/*
-	 * Assuming the line width buffer to be 768 pixels as OMAP2 DISPC
-	 * scaler cannot scale a image with width more than 768.
-	 */
-	[FEAT_PARAM_LINEWIDTH]			= { 1, 768 },
-};
-
-static const struct dss_param_range omap3_dss_param_range[] = {
-	[FEAT_PARAM_DSS_FCK]			= { 0, 173000000 },
-	[FEAT_PARAM_DSS_PCD]			= { 1, 255 },
-	[FEAT_PARAM_DSIPLL_LPDIV]		= { 1, (1 << 13) - 1},
-	[FEAT_PARAM_DSI_FCK]			= { 0, 173000000 },
-	[FEAT_PARAM_DOWNSCALE]			= { 1, 4 },
-	[FEAT_PARAM_LINEWIDTH]			= { 1, 1024 },
-};
-
-static const struct dss_param_range am43xx_dss_param_range[] = {
-	[FEAT_PARAM_DSS_FCK]			= { 0, 200000000 },
-	[FEAT_PARAM_DSS_PCD]			= { 1, 255 },
-	[FEAT_PARAM_DOWNSCALE]			= { 1, 4 },
-	[FEAT_PARAM_LINEWIDTH]			= { 1, 1024 },
-};
-
-static const struct dss_param_range omap4_dss_param_range[] = {
-	[FEAT_PARAM_DSS_FCK]			= { 0, 186000000 },
-	[FEAT_PARAM_DSS_PCD]			= { 1, 255 },
-	[FEAT_PARAM_DSIPLL_LPDIV]		= { 0, (1 << 13) - 1 },
-	[FEAT_PARAM_DSI_FCK]			= { 0, 170000000 },
-	[FEAT_PARAM_DOWNSCALE]			= { 1, 4 },
-	[FEAT_PARAM_LINEWIDTH]			= { 1, 2048 },
-};
-
-static const struct dss_param_range omap5_dss_param_range[] = {
-	[FEAT_PARAM_DSS_FCK]			= { 0, 209250000 },
-	[FEAT_PARAM_DSS_PCD]			= { 1, 255 },
-	[FEAT_PARAM_DSIPLL_LPDIV]		= { 0, (1 << 13) - 1 },
-	[FEAT_PARAM_DSI_FCK]			= { 0, 209250000 },
-	[FEAT_PARAM_DOWNSCALE]			= { 1, 4 },
-	[FEAT_PARAM_LINEWIDTH]			= { 1, 2048 },
-};
-
-static const enum dss_feat_id omap2_dss_feat_list[] = {
-	FEAT_LCDENABLEPOL,
-	FEAT_LCDENABLESIGNAL,
-	FEAT_PCKFREEENABLE,
-	FEAT_FUNCGATED,
-	FEAT_ROWREPEATENABLE,
-	FEAT_RESIZECONF,
-};
-
-static const enum dss_feat_id omap3430_dss_feat_list[] = {
-	FEAT_LCDENABLEPOL,
-	FEAT_LCDENABLESIGNAL,
-	FEAT_PCKFREEENABLE,
-	FEAT_FUNCGATED,
-	FEAT_LINEBUFFERSPLIT,
-	FEAT_ROWREPEATENABLE,
-	FEAT_RESIZECONF,
-	FEAT_DSI_REVERSE_TXCLKESC,
-	FEAT_VENC_REQUIRES_TV_DAC_CLK,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FIXED_ZORDER,
-	FEAT_FIFO_MERGE,
-	FEAT_OMAP3_DSI_FIFO_BUG,
-	FEAT_DPI_USES_VDDS_DSI,
-};
-
-static const enum dss_feat_id am35xx_dss_feat_list[] = {
-	FEAT_LCDENABLEPOL,
-	FEAT_LCDENABLESIGNAL,
-	FEAT_PCKFREEENABLE,
-	FEAT_FUNCGATED,
-	FEAT_LINEBUFFERSPLIT,
-	FEAT_ROWREPEATENABLE,
-	FEAT_RESIZECONF,
-	FEAT_DSI_REVERSE_TXCLKESC,
-	FEAT_VENC_REQUIRES_TV_DAC_CLK,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FIXED_ZORDER,
-	FEAT_FIFO_MERGE,
-	FEAT_OMAP3_DSI_FIFO_BUG,
-};
-
-static const enum dss_feat_id am43xx_dss_feat_list[] = {
-	FEAT_LCDENABLEPOL,
-	FEAT_LCDENABLESIGNAL,
-	FEAT_PCKFREEENABLE,
-	FEAT_FUNCGATED,
-	FEAT_LINEBUFFERSPLIT,
-	FEAT_ROWREPEATENABLE,
-	FEAT_RESIZECONF,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FIXED_ZORDER,
-	FEAT_FIFO_MERGE,
-};
-
-static const enum dss_feat_id omap3630_dss_feat_list[] = {
-	FEAT_LCDENABLEPOL,
-	FEAT_LCDENABLESIGNAL,
-	FEAT_PCKFREEENABLE,
-	FEAT_FUNCGATED,
-	FEAT_LINEBUFFERSPLIT,
-	FEAT_ROWREPEATENABLE,
-	FEAT_RESIZECONF,
-	FEAT_DSI_PLL_PWR_BUG,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FIXED_ZORDER,
-	FEAT_FIFO_MERGE,
-	FEAT_OMAP3_DSI_FIFO_BUG,
-	FEAT_DPI_USES_VDDS_DSI,
-};
-
-static const enum dss_feat_id omap4430_es1_0_dss_feat_list[] = {
-	FEAT_MGR_LCD2,
-	FEAT_CORE_CLK_DIV,
-	FEAT_LCD_CLK_SRC,
-	FEAT_DSI_DCS_CMD_CONFIG_VC,
-	FEAT_DSI_VC_OCP_WIDTH,
-	FEAT_DSI_GNQ,
-	FEAT_HANDLE_UV_SEPARATE,
-	FEAT_ATTR2,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FREE_ZORDER,
-	FEAT_FIFO_MERGE,
-	FEAT_BURST_2D,
-};
-
-static const enum dss_feat_id omap4430_es2_0_1_2_dss_feat_list[] = {
-	FEAT_MGR_LCD2,
-	FEAT_CORE_CLK_DIV,
-	FEAT_LCD_CLK_SRC,
-	FEAT_DSI_DCS_CMD_CONFIG_VC,
-	FEAT_DSI_VC_OCP_WIDTH,
-	FEAT_DSI_GNQ,
-	FEAT_HDMI_CTS_SWMODE,
-	FEAT_HANDLE_UV_SEPARATE,
-	FEAT_ATTR2,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FREE_ZORDER,
-	FEAT_FIFO_MERGE,
-	FEAT_BURST_2D,
-};
-
-static const enum dss_feat_id omap4_dss_feat_list[] = {
-	FEAT_MGR_LCD2,
-	FEAT_CORE_CLK_DIV,
-	FEAT_LCD_CLK_SRC,
-	FEAT_DSI_DCS_CMD_CONFIG_VC,
-	FEAT_DSI_VC_OCP_WIDTH,
-	FEAT_DSI_GNQ,
-	FEAT_HDMI_CTS_SWMODE,
-	FEAT_HDMI_AUDIO_USE_MCLK,
-	FEAT_HANDLE_UV_SEPARATE,
-	FEAT_ATTR2,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FREE_ZORDER,
-	FEAT_FIFO_MERGE,
-	FEAT_BURST_2D,
-};
-
-static const enum dss_feat_id omap5_dss_feat_list[] = {
-	FEAT_MGR_LCD2,
-	FEAT_MGR_LCD3,
-	FEAT_CORE_CLK_DIV,
-	FEAT_LCD_CLK_SRC,
-	FEAT_DSI_DCS_CMD_CONFIG_VC,
-	FEAT_DSI_VC_OCP_WIDTH,
-	FEAT_DSI_GNQ,
-	FEAT_HDMI_CTS_SWMODE,
-	FEAT_HDMI_AUDIO_USE_MCLK,
-	FEAT_HANDLE_UV_SEPARATE,
-	FEAT_ATTR2,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FREE_ZORDER,
-	FEAT_FIFO_MERGE,
-	FEAT_BURST_2D,
-	FEAT_DSI_PHY_DCC,
-	FEAT_MFLAG,
-};
-
-/* OMAP2 DSS Features */
-static const struct omap_dss_features omap2_dss_features = {
-	.reg_fields = omap2_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(omap2_dss_reg_fields),
-
-	.features = omap2_dss_feat_list,
-	.num_features = ARRAY_SIZE(omap2_dss_feat_list),
-
-	.num_mgrs = 2,
-	.num_ovls = 3,
-	.supported_displays = omap2_dss_supported_displays,
-	.supported_outputs = omap2_dss_supported_outputs,
-	.supported_color_modes = omap2_dss_supported_color_modes,
-	.overlay_caps = omap2_dss_overlay_caps,
-	.dss_params = omap2_dss_param_range,
-	.buffer_size_unit = 1,
-	.burst_size_unit = 8,
-};
-
-/* OMAP3 DSS Features */
-static const struct omap_dss_features omap3430_dss_features = {
-	.reg_fields = omap3_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(omap3_dss_reg_fields),
-
-	.features = omap3430_dss_feat_list,
-	.num_features = ARRAY_SIZE(omap3430_dss_feat_list),
-
-	.num_mgrs = 2,
-	.num_ovls = 3,
-	.supported_displays = omap3430_dss_supported_displays,
-	.supported_outputs = omap3430_dss_supported_outputs,
-	.supported_color_modes = omap3_dss_supported_color_modes,
-	.overlay_caps = omap3430_dss_overlay_caps,
-	.dss_params = omap3_dss_param_range,
-	.buffer_size_unit = 1,
-	.burst_size_unit = 8,
-};
-
-/*
- * AM35xx DSS Features. This is basically OMAP3 DSS Features without the
- * vdds_dsi regulator.
- */
-static const struct omap_dss_features am35xx_dss_features = {
-	.reg_fields = omap3_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(omap3_dss_reg_fields),
-
-	.features = am35xx_dss_feat_list,
-	.num_features = ARRAY_SIZE(am35xx_dss_feat_list),
-
-	.num_mgrs = 2,
-	.num_ovls = 3,
-	.supported_displays = omap3430_dss_supported_displays,
-	.supported_outputs = omap3430_dss_supported_outputs,
-	.supported_color_modes = omap3_dss_supported_color_modes,
-	.overlay_caps = omap3430_dss_overlay_caps,
-	.dss_params = omap3_dss_param_range,
-	.buffer_size_unit = 1,
-	.burst_size_unit = 8,
-};
-
-static const struct omap_dss_features am43xx_dss_features = {
-	.reg_fields = am43xx_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(am43xx_dss_reg_fields),
-
-	.features = am43xx_dss_feat_list,
-	.num_features = ARRAY_SIZE(am43xx_dss_feat_list),
-
-	.num_mgrs = 1,
-	.num_ovls = 3,
-	.supported_displays = am43xx_dss_supported_displays,
-	.supported_outputs = am43xx_dss_supported_outputs,
-	.supported_color_modes = omap3_dss_supported_color_modes,
-	.overlay_caps = omap3430_dss_overlay_caps,
-	.dss_params = am43xx_dss_param_range,
-	.buffer_size_unit = 1,
-	.burst_size_unit = 8,
-};
-
-static const struct omap_dss_features omap3630_dss_features = {
-	.reg_fields = omap3_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(omap3_dss_reg_fields),
-
-	.features = omap3630_dss_feat_list,
-	.num_features = ARRAY_SIZE(omap3630_dss_feat_list),
-
-	.num_mgrs = 2,
-	.num_ovls = 3,
-	.supported_displays = omap3630_dss_supported_displays,
-	.supported_outputs = omap3630_dss_supported_outputs,
-	.supported_color_modes = omap3_dss_supported_color_modes,
-	.overlay_caps = omap3630_dss_overlay_caps,
-	.dss_params = omap3_dss_param_range,
-	.buffer_size_unit = 1,
-	.burst_size_unit = 8,
-};
-
-/* OMAP4 DSS Features */
-/* For OMAP4430 ES 1.0 revision */
-static const struct omap_dss_features omap4430_es1_0_dss_features  = {
-	.reg_fields = omap4_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(omap4_dss_reg_fields),
-
-	.features = omap4430_es1_0_dss_feat_list,
-	.num_features = ARRAY_SIZE(omap4430_es1_0_dss_feat_list),
-
-	.num_mgrs = 3,
-	.num_ovls = 4,
-	.supported_displays = omap4_dss_supported_displays,
-	.supported_outputs = omap4_dss_supported_outputs,
-	.supported_color_modes = omap4_dss_supported_color_modes,
-	.overlay_caps = omap4_dss_overlay_caps,
-	.dss_params = omap4_dss_param_range,
-	.buffer_size_unit = 16,
-	.burst_size_unit = 16,
-};
-
-/* For OMAP4430 ES 2.0, 2.1 and 2.2 revisions */
-static const struct omap_dss_features omap4430_es2_0_1_2_dss_features = {
-	.reg_fields = omap4_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(omap4_dss_reg_fields),
-
-	.features = omap4430_es2_0_1_2_dss_feat_list,
-	.num_features = ARRAY_SIZE(omap4430_es2_0_1_2_dss_feat_list),
-
-	.num_mgrs = 3,
-	.num_ovls = 4,
-	.supported_displays = omap4_dss_supported_displays,
-	.supported_outputs = omap4_dss_supported_outputs,
-	.supported_color_modes = omap4_dss_supported_color_modes,
-	.overlay_caps = omap4_dss_overlay_caps,
-	.dss_params = omap4_dss_param_range,
-	.buffer_size_unit = 16,
-	.burst_size_unit = 16,
-};
-
-/* For all the other OMAP4 versions */
-static const struct omap_dss_features omap4_dss_features = {
-	.reg_fields = omap4_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(omap4_dss_reg_fields),
-
-	.features = omap4_dss_feat_list,
-	.num_features = ARRAY_SIZE(omap4_dss_feat_list),
-
-	.num_mgrs = 3,
-	.num_ovls = 4,
-	.supported_displays = omap4_dss_supported_displays,
-	.supported_outputs = omap4_dss_supported_outputs,
-	.supported_color_modes = omap4_dss_supported_color_modes,
-	.overlay_caps = omap4_dss_overlay_caps,
-	.dss_params = omap4_dss_param_range,
-	.buffer_size_unit = 16,
-	.burst_size_unit = 16,
-};
-
-/* OMAP5 DSS Features */
-static const struct omap_dss_features omap5_dss_features = {
-	.reg_fields = omap5_dss_reg_fields,
-	.num_reg_fields = ARRAY_SIZE(omap5_dss_reg_fields),
-
-	.features = omap5_dss_feat_list,
-	.num_features = ARRAY_SIZE(omap5_dss_feat_list),
-
-	.num_mgrs = 4,
-	.num_ovls = 4,
-	.supported_displays = omap5_dss_supported_displays,
-	.supported_outputs = omap5_dss_supported_outputs,
-	.supported_color_modes = omap4_dss_supported_color_modes,
-	.overlay_caps = omap4_dss_overlay_caps,
-	.dss_params = omap5_dss_param_range,
-	.buffer_size_unit = 16,
-	.burst_size_unit = 16,
-};
-
-/* Functions returning values related to a DSS feature */
-int dss_feat_get_num_mgrs(void)
-{
-	return omap_current_dss_features->num_mgrs;
-}
-
-int dss_feat_get_num_ovls(void)
-{
-	return omap_current_dss_features->num_ovls;
-}
-
-unsigned long dss_feat_get_param_min(enum dss_range_param param)
-{
-	return omap_current_dss_features->dss_params[param].min;
-}
-
-unsigned long dss_feat_get_param_max(enum dss_range_param param)
-{
-	return omap_current_dss_features->dss_params[param].max;
-}
-
-enum omap_display_type dss_feat_get_supported_displays(enum omap_channel channel)
-{
-	return omap_current_dss_features->supported_displays[channel];
-}
-
-enum omap_dss_output_id dss_feat_get_supported_outputs(enum omap_channel channel)
-{
-	return omap_current_dss_features->supported_outputs[channel];
-}
-
-const u32 *dss_feat_get_supported_color_modes(enum omap_plane_id plane)
-{
-	return omap_current_dss_features->supported_color_modes[plane];
-}
-
-enum omap_overlay_caps dss_feat_get_overlay_caps(enum omap_plane_id plane)
-{
-	return omap_current_dss_features->overlay_caps[plane];
-}
-
-bool dss_feat_color_mode_supported(enum omap_plane_id plane, u32 fourcc)
-{
-	const u32 *modes;
-	unsigned int i;
-
-	modes = omap_current_dss_features->supported_color_modes[plane];
-
-	for (i = 0; modes[i]; ++i) {
-		if (modes[i] == fourcc)
-			return true;
-	}
-
-	return false;
-}
-
-u32 dss_feat_get_buffer_size_unit(void)
-{
-	return omap_current_dss_features->buffer_size_unit;
-}
-
-u32 dss_feat_get_burst_size_unit(void)
-{
-	return omap_current_dss_features->burst_size_unit;
-}
-
-/* DSS has_feature check */
-bool dss_has_feature(enum dss_feat_id id)
-{
-	int i;
-	const enum dss_feat_id *features = omap_current_dss_features->features;
-	const int num_features = omap_current_dss_features->num_features;
-
-	for (i = 0; i < num_features; i++) {
-		if (features[i] == id)
-			return true;
-	}
-
-	return false;
-}
-
-void dss_feat_get_reg_field(enum dss_feat_reg_field id, u8 *start, u8 *end)
-{
-	if (id >= omap_current_dss_features->num_reg_fields)
-		BUG();
-
-	*start = omap_current_dss_features->reg_fields[id].start;
-	*end = omap_current_dss_features->reg_fields[id].end;
-}
-
-void dss_features_init(enum omapdss_version version)
-{
-	switch (version) {
-	case OMAPDSS_VER_OMAP24xx:
-		omap_current_dss_features = &omap2_dss_features;
-		break;
-
-	case OMAPDSS_VER_OMAP34xx_ES1:
-	case OMAPDSS_VER_OMAP34xx_ES3:
-		omap_current_dss_features = &omap3430_dss_features;
-		break;
-
-	case OMAPDSS_VER_OMAP3630:
-		omap_current_dss_features = &omap3630_dss_features;
-		break;
-
-	case OMAPDSS_VER_OMAP4430_ES1:
-		omap_current_dss_features = &omap4430_es1_0_dss_features;
-		break;
-
-	case OMAPDSS_VER_OMAP4430_ES2:
-		omap_current_dss_features = &omap4430_es2_0_1_2_dss_features;
-		break;
-
-	case OMAPDSS_VER_OMAP4:
-		omap_current_dss_features = &omap4_dss_features;
-		break;
-
-	case OMAPDSS_VER_OMAP5:
-	case OMAPDSS_VER_DRA7xx:
-		omap_current_dss_features = &omap5_dss_features;
-		break;
-
-	case OMAPDSS_VER_AM35xx:
-		omap_current_dss_features = &am35xx_dss_features;
-		break;
-
-	case OMAPDSS_VER_AM43xx:
-		omap_current_dss_features = &am43xx_dss_features;
-		break;
-
-	default:
-		DSSWARN("Unsupported OMAP version");
-		break;
-	}
-}
diff --git a/drivers/gpu/drm/omapdrm/dss/dss_features.h b/drivers/gpu/drm/omapdrm/dss/dss_features.h
deleted file mode 100644
index c36436d..0000000
--- a/drivers/gpu/drm/omapdrm/dss/dss_features.h
+++ /dev/null
@@ -1,109 +0,0 @@
-/*
- * linux/drivers/video/omap2/dss/dss_features.h
- *
- * Copyright (C) 2010 Texas Instruments
- * Author: Archit Taneja <archit@ti.com>
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License version 2 as published by
- * the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef __OMAP2_DSS_FEATURES_H
-#define __OMAP2_DSS_FEATURES_H
-
-#define MAX_DSS_MANAGERS	4
-#define MAX_DSS_OVERLAYS	4
-#define MAX_DSS_LCD_MANAGERS	3
-#define MAX_NUM_DSI		2
-
-/* DSS has feature id */
-enum dss_feat_id {
-	FEAT_LCDENABLEPOL,
-	FEAT_LCDENABLESIGNAL,
-	FEAT_PCKFREEENABLE,
-	FEAT_FUNCGATED,
-	FEAT_MGR_LCD2,
-	FEAT_MGR_LCD3,
-	FEAT_LINEBUFFERSPLIT,
-	FEAT_ROWREPEATENABLE,
-	FEAT_RESIZECONF,
-	/* Independent core clk divider */
-	FEAT_CORE_CLK_DIV,
-	FEAT_LCD_CLK_SRC,
-	/* DSI-PLL power command 0x3 is not working */
-	FEAT_DSI_PLL_PWR_BUG,
-	FEAT_DSI_DCS_CMD_CONFIG_VC,
-	FEAT_DSI_VC_OCP_WIDTH,
-	FEAT_DSI_REVERSE_TXCLKESC,
-	FEAT_DSI_GNQ,
-	FEAT_DPI_USES_VDDS_DSI,
-	FEAT_HDMI_CTS_SWMODE,
-	FEAT_HDMI_AUDIO_USE_MCLK,
-	FEAT_HANDLE_UV_SEPARATE,
-	FEAT_ATTR2,
-	FEAT_VENC_REQUIRES_TV_DAC_CLK,
-	FEAT_CPR,
-	FEAT_PRELOAD,
-	FEAT_FIR_COEF_V,
-	FEAT_ALPHA_FIXED_ZORDER,
-	FEAT_ALPHA_FREE_ZORDER,
-	FEAT_FIFO_MERGE,
-	/* An unknown HW bug causing the normal FIFO thresholds not to work */
-	FEAT_OMAP3_DSI_FIFO_BUG,
-	FEAT_BURST_2D,
-	FEAT_DSI_PHY_DCC,
-	FEAT_MFLAG,
-};
-
-/* DSS register field id */
-enum dss_feat_reg_field {
-	FEAT_REG_FIRHINC,
-	FEAT_REG_FIRVINC,
-	FEAT_REG_FIFOHIGHTHRESHOLD,
-	FEAT_REG_FIFOLOWTHRESHOLD,
-	FEAT_REG_FIFOSIZE,
-	FEAT_REG_HORIZONTALACCU,
-	FEAT_REG_VERTICALACCU,
-	FEAT_REG_DISPC_CLK_SWITCH,
-};
-
-enum dss_range_param {
-	FEAT_PARAM_DSS_FCK,
-	FEAT_PARAM_DSS_PCD,
-	FEAT_PARAM_DSIPLL_LPDIV,
-	FEAT_PARAM_DSI_FCK,
-	FEAT_PARAM_DOWNSCALE,
-	FEAT_PARAM_LINEWIDTH,
-};
-
-/* DSS Feature Functions */
-unsigned long dss_feat_get_param_min(enum dss_range_param param);
-unsigned long dss_feat_get_param_max(enum dss_range_param param);
-enum omap_overlay_caps dss_feat_get_overlay_caps(enum omap_plane_id plane);
-bool dss_feat_color_mode_supported(enum omap_plane_id plane,
-		u32 fourcc);
-
-u32 dss_feat_get_buffer_size_unit(void);	/* in bytes */
-u32 dss_feat_get_burst_size_unit(void);		/* in bytes */
-
-bool dss_has_feature(enum dss_feat_id id);
-void dss_feat_get_reg_field(enum dss_feat_reg_field id, u8 *start, u8 *end);
-void dss_features_init(enum omapdss_version version);
-
-enum omap_display_type dss_feat_get_supported_displays(enum omap_channel channel);
-enum omap_dss_output_id dss_feat_get_supported_outputs(enum omap_channel channel);
-
-int dss_feat_get_num_mgrs(void);
-int dss_feat_get_num_ovls(void);
-const u32 *dss_feat_get_supported_color_modes(enum omap_plane_id plane);
-
-#endif
diff --git a/drivers/gpu/drm/omapdrm/dss/hdmi.h b/drivers/gpu/drm/omapdrm/dss/hdmi.h
index fb6cccd..a820b39 100644
--- a/drivers/gpu/drm/omapdrm/dss/hdmi.h
+++ b/drivers/gpu/drm/omapdrm/dss/hdmi.h
@@ -234,6 +234,7 @@ struct hdmi_core_audio_config {
 struct hdmi_wp_data {
 	void __iomem *base;
 	phys_addr_t phys_base;
+	unsigned int version;
 };
 
 struct hdmi_pll_data {
@@ -245,15 +246,24 @@ struct hdmi_pll_data {
 	struct hdmi_wp_data *wp;
 };
 
+struct hdmi_phy_features {
+	bool bist_ctrl;
+	bool ldo_voltage;
+	unsigned long max_phy;
+};
+
 struct hdmi_phy_data {
 	void __iomem *base;
 
+	const struct hdmi_phy_features *features;
 	u8 lane_function[4];
 	u8 lane_polarity[4];
 };
 
 struct hdmi_core_data {
 	void __iomem *base;
+	bool cts_swmode;
+	bool audio_use_mclk;
 };
 
 static inline void hdmi_write_reg(void __iomem *base_addr, const u32 idx,
@@ -303,7 +313,8 @@ void hdmi_wp_video_config_timing(struct hdmi_wp_data *wp,
 		struct videomode *vm);
 void hdmi_wp_init_vid_fmt_timings(struct hdmi_video_format *video_fmt,
 		struct videomode *vm, struct hdmi_config *param);
-int hdmi_wp_init(struct platform_device *pdev, struct hdmi_wp_data *wp);
+int hdmi_wp_init(struct platform_device *pdev, struct hdmi_wp_data *wp,
+		 unsigned int version);
 phys_addr_t hdmi_wp_get_audio_dma_addr(struct hdmi_wp_data *wp);
 
 /* HDMI PLL funcs */
@@ -316,7 +327,8 @@ void hdmi_pll_uninit(struct hdmi_pll_data *hpll);
 int hdmi_phy_configure(struct hdmi_phy_data *phy, unsigned long hfbitclk,
 	unsigned long lfbitclk);
 void hdmi_phy_dump(struct hdmi_phy_data *phy, struct seq_file *s);
-int hdmi_phy_init(struct platform_device *pdev, struct hdmi_phy_data *phy);
+int hdmi_phy_init(struct platform_device *pdev, struct hdmi_phy_data *phy,
+		  unsigned int version);
 int hdmi_phy_parse_lanes(struct hdmi_phy_data *phy, const u32 *lanes);
 
 /* HDMI common funcs */
diff --git a/drivers/gpu/drm/omapdrm/dss/hdmi4.c b/drivers/gpu/drm/omapdrm/dss/hdmi4.c
index 284b494..f169348 100644
--- a/drivers/gpu/drm/omapdrm/dss/hdmi4.c
+++ b/drivers/gpu/drm/omapdrm/dss/hdmi4.c
@@ -40,7 +40,6 @@
 #include "omapdss.h"
 #include "hdmi4_core.h"
 #include "dss.h"
-#include "dss_features.h"
 #include "hdmi.h"
 
 static struct omap_hdmi hdmi;
@@ -668,7 +667,7 @@ static int hdmi_audio_register(struct device *dev)
 {
 	struct omap_hdmi_audio_pdata pdata = {
 		.dev = dev,
-		.dss_version = omapdss_get_version(),
+		.version = 4,
 		.audio_dma_addr = hdmi_wp_get_audio_dma_addr(&hdmi.wp),
 		.ops = &hdmi_audio_ops,
 	};
@@ -700,7 +699,7 @@ static int hdmi4_bind(struct device *dev, struct device *master, void *data)
 	if (r)
 		return r;
 
-	r = hdmi_wp_init(pdev, &hdmi.wp);
+	r = hdmi_wp_init(pdev, &hdmi.wp, 4);
 	if (r)
 		return r;
 
@@ -708,7 +707,7 @@ static int hdmi4_bind(struct device *dev, struct device *master, void *data)
 	if (r)
 		return r;
 
-	r = hdmi_phy_init(pdev, &hdmi.phy);
+	r = hdmi_phy_init(pdev, &hdmi.phy, 4);
 	if (r)
 		goto err;
 
diff --git a/drivers/gpu/drm/omapdrm/dss/hdmi4_core.c b/drivers/gpu/drm/omapdrm/dss/hdmi4_core.c
index ed60016..365cf07 100644
--- a/drivers/gpu/drm/omapdrm/dss/hdmi4_core.c
+++ b/drivers/gpu/drm/omapdrm/dss/hdmi4_core.c
@@ -31,11 +31,11 @@
 #include <linux/platform_device.h>
 #include <linux/string.h>
 #include <linux/seq_file.h>
+#include <linux/sys_soc.h>
 #include <sound/asound.h>
 #include <sound/asoundef.h>
 
 #include "hdmi4_core.h"
-#include "dss_features.h"
 
 #define HDMI_CORE_AV		0x500
 
@@ -757,10 +757,10 @@ int hdmi4_audio_config(struct hdmi_core_data *core, struct hdmi_wp_data *wp,
 	/* Audio clock regeneration settings */
 	acore.n = n;
 	acore.cts = cts;
-	if (dss_has_feature(FEAT_HDMI_CTS_SWMODE)) {
+	if (core->cts_swmode) {
 		acore.aud_par_busclk = 0;
 		acore.cts_mode = HDMI_AUDIO_CTS_MODE_SW;
-		acore.use_mclk = dss_has_feature(FEAT_HDMI_AUDIO_USE_MCLK);
+		acore.use_mclk = core->audio_use_mclk;
 	} else {
 		acore.aud_par_busclk = (((128 * 31) - 1) << 8);
 		acore.cts_mode = HDMI_AUDIO_CTS_MODE_HW;
@@ -884,10 +884,42 @@ void hdmi4_audio_stop(struct hdmi_core_data *core, struct hdmi_wp_data *wp)
 	hdmi_wp_audio_core_req_enable(wp, false);
 }
 
+struct hdmi4_features {
+	bool cts_swmode;
+	bool audio_use_mclk;
+};
+
+static const struct hdmi4_features hdmi4_es1_features = {
+	.cts_swmode = false,
+	.audio_use_mclk = false,
+};
+
+static const struct hdmi4_features hdmi4_es2_features = {
+	.cts_swmode = true,
+	.audio_use_mclk = false,
+};
+
+static const struct hdmi4_features hdmi4_es3_features = {
+	.cts_swmode = true,
+	.audio_use_mclk = true,
+};
+
+static const struct soc_device_attribute hdmi4_soc_devices[] = {
+	{ .family = "OMAP4", .revision = "ES1.?", .data = &hdmi4_es1_features },
+	{ .family = "OMAP4", .revision = "ES2.?", .data = &hdmi4_es2_features },
+	{ .family = "OMAP4",			  .data = &hdmi4_es3_features },
+	{ /* sentinel */ }
+};
+
 int hdmi4_core_init(struct platform_device *pdev, struct hdmi_core_data *core)
 {
+	const struct hdmi4_features *features;
 	struct resource *res;
 
+	features = soc_device_match(hdmi4_soc_devices)->data;
+	core->cts_swmode = features->cts_swmode;
+	core->audio_use_mclk = features->audio_use_mclk;
+
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "core");
 	core->base = devm_ioremap_resource(&pdev->dev, res);
 	if (IS_ERR(core->base))
diff --git a/drivers/gpu/drm/omapdrm/dss/hdmi5.c b/drivers/gpu/drm/omapdrm/dss/hdmi5.c
index 441e199..b3221ca 100644
--- a/drivers/gpu/drm/omapdrm/dss/hdmi5.c
+++ b/drivers/gpu/drm/omapdrm/dss/hdmi5.c
@@ -45,7 +45,6 @@
 #include "omapdss.h"
 #include "hdmi5_core.h"
 #include "dss.h"
-#include "dss_features.h"
 
 static struct omap_hdmi hdmi;
 
@@ -695,7 +694,7 @@ static int hdmi_audio_register(struct device *dev)
 {
 	struct omap_hdmi_audio_pdata pdata = {
 		.dev = dev,
-		.dss_version = omapdss_get_version(),
+		.version = 5,
 		.audio_dma_addr = hdmi_wp_get_audio_dma_addr(&hdmi.wp),
 		.ops = &hdmi_audio_ops,
 	};
@@ -732,7 +731,7 @@ static int hdmi5_bind(struct device *dev, struct device *master, void *data)
 	if (r)
 		return r;
 
-	r = hdmi_wp_init(pdev, &hdmi.wp);
+	r = hdmi_wp_init(pdev, &hdmi.wp, 5);
 	if (r)
 		return r;
 
@@ -740,7 +739,7 @@ static int hdmi5_bind(struct device *dev, struct device *master, void *data)
 	if (r)
 		return r;
 
-	r = hdmi_phy_init(pdev, &hdmi.phy);
+	r = hdmi_phy_init(pdev, &hdmi.phy, 5);
 	if (r)
 		goto err;
 
diff --git a/drivers/gpu/drm/omapdrm/dss/hdmi_phy.c b/drivers/gpu/drm/omapdrm/dss/hdmi_phy.c
index fb5e4c7..a156292 100644
--- a/drivers/gpu/drm/omapdrm/dss/hdmi_phy.c
+++ b/drivers/gpu/drm/omapdrm/dss/hdmi_phy.c
@@ -19,14 +19,6 @@
 #include "dss.h"
 #include "hdmi.h"
 
-struct hdmi_phy_features {
-	bool bist_ctrl;
-	bool ldo_voltage;
-	unsigned long max_phy;
-};
-
-static const struct hdmi_phy_features *phy_feat;
-
 void hdmi_phy_dump(struct hdmi_phy_data *phy, struct seq_file *s)
 {
 #define DUMPPHY(r) seq_printf(s, "%-35s %08x\n", #r,\
@@ -36,7 +28,7 @@ void hdmi_phy_dump(struct hdmi_phy_data *phy, struct seq_file *s)
 	DUMPPHY(HDMI_TXPHY_DIGITAL_CTRL);
 	DUMPPHY(HDMI_TXPHY_POWER_CTRL);
 	DUMPPHY(HDMI_TXPHY_PAD_CFG_CTRL);
-	if (phy_feat->bist_ctrl)
+	if (phy->features->bist_ctrl)
 		DUMPPHY(HDMI_TXPHY_BIST_CONTROL);
 }
 
@@ -146,7 +138,7 @@ int hdmi_phy_configure(struct hdmi_phy_data *phy, unsigned long hfbitclk,
 	 * In OMAP5+, the HFBITCLK must be divided by 2 before issuing the
 	 * HDMI_PHYPWRCMD_LDOON command.
 	*/
-	if (phy_feat->bist_ctrl)
+	if (phy->features->bist_ctrl)
 		REG_FLD_MOD(phy->base, HDMI_TXPHY_BIST_CONTROL, 1, 11, 11);
 
 	/*
@@ -155,7 +147,7 @@ int hdmi_phy_configure(struct hdmi_phy_data *phy, unsigned long hfbitclk,
 	 */
 	if (hfbitclk != lfbitclk)
 		freqout = 0;
-	else if (hfbitclk / 10 < phy_feat->max_phy)
+	else if (hfbitclk / 10 < phy->features->max_phy)
 		freqout = 1;
 	else
 		freqout = 2;
@@ -170,7 +162,7 @@ int hdmi_phy_configure(struct hdmi_phy_data *phy, unsigned long hfbitclk,
 	hdmi_write_reg(phy->base, HDMI_TXPHY_DIGITAL_CTRL, 0xF0000000);
 
 	/* Setup max LDO voltage */
-	if (phy_feat->ldo_voltage)
+	if (phy->features->ldo_voltage)
 		REG_FLD_MOD(phy->base, HDMI_TXPHY_POWER_CTRL, 0xB, 3, 0);
 
 	hdmi_phy_configure_lanes(phy);
@@ -190,47 +182,15 @@ static const struct hdmi_phy_features omap54xx_phy_feats = {
 	.max_phy	=	186000000,
 };
 
-static int hdmi_phy_init_features(struct platform_device *pdev)
+int hdmi_phy_init(struct platform_device *pdev, struct hdmi_phy_data *phy,
+		  unsigned int version)
 {
-	struct hdmi_phy_features *dst;
-	const struct hdmi_phy_features *src;
-
-	dst = devm_kzalloc(&pdev->dev, sizeof(*dst), GFP_KERNEL);
-	if (!dst) {
-		dev_err(&pdev->dev, "Failed to allocate HDMI PHY Features\n");
-		return -ENOMEM;
-	}
-
-	switch (omapdss_get_version()) {
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
-		src = &omap44xx_phy_feats;
-		break;
-
-	case OMAPDSS_VER_OMAP5:
-	case OMAPDSS_VER_DRA7xx:
-		src = &omap54xx_phy_feats;
-		break;
-
-	default:
-		return -ENODEV;
-	}
-
-	memcpy(dst, src, sizeof(*dst));
-	phy_feat = dst;
-
-	return 0;
-}
-
-int hdmi_phy_init(struct platform_device *pdev, struct hdmi_phy_data *phy)
-{
-	int r;
 	struct resource *res;
 
-	r = hdmi_phy_init_features(pdev);
-	if (r)
-		return r;
+	if (version == 4)
+		phy->features = &omap44xx_phy_feats;
+	else
+		phy->features = &omap54xx_phy_feats;
 
 	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "phy");
 	phy->base = devm_ioremap_resource(&pdev->dev, res);
diff --git a/drivers/gpu/drm/omapdrm/dss/hdmi_pll.c b/drivers/gpu/drm/omapdrm/dss/hdmi_pll.c
index 4623935..55bee81 100644
--- a/drivers/gpu/drm/omapdrm/dss/hdmi_pll.c
+++ b/drivers/gpu/drm/omapdrm/dss/hdmi_pll.c
@@ -71,7 +71,7 @@ static void hdmi_pll_disable(struct dss_pll *dsspll)
 	WARN_ON(r < 0 && r != -ENOSYS);
 }
 
-static const struct dss_pll_ops dsi_pll_ops = {
+static const struct dss_pll_ops hdmi_pll_ops = {
 	.enable = hdmi_pll_enable,
 	.disable = hdmi_pll_disable,
 	.set_config = dss_pll_write_config_type_b,
@@ -128,7 +128,8 @@ static const struct dss_pll_hw dss_omap5_hdmi_pll_hw = {
 	.has_refsel = true,
 };
 
-static int dsi_init_pll_data(struct platform_device *pdev, struct hdmi_pll_data *hpll)
+static int hdmi_init_pll_data(struct platform_device *pdev,
+			      struct hdmi_pll_data *hpll)
 {
 	struct dss_pll *pll = &hpll->pll;
 	struct clk *clk;
@@ -145,23 +146,12 @@ static int dsi_init_pll_data(struct platform_device *pdev, struct hdmi_pll_data
 	pll->base = hpll->base;
 	pll->clkin = clk;
 
-	switch (omapdss_get_version()) {
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
+	if (hpll->wp->version == 4)
 		pll->hw = &dss_omap4_hdmi_pll_hw;
-		break;
-
-	case OMAPDSS_VER_OMAP5:
-	case OMAPDSS_VER_DRA7xx:
+	else
 		pll->hw = &dss_omap5_hdmi_pll_hw;
-		break;
 
-	default:
-		return -ENODEV;
-	}
-
-	pll->ops = &dsi_pll_ops;
+	pll->ops = &hdmi_pll_ops;
 
 	r = dss_pll_register(pll);
 	if (r)
@@ -184,7 +174,7 @@ int hdmi_pll_init(struct platform_device *pdev, struct hdmi_pll_data *pll,
 	if (IS_ERR(pll->base))
 		return PTR_ERR(pll->base);
 
-	r = dsi_init_pll_data(pdev, pll);
+	r = hdmi_init_pll_data(pdev, pll);
 	if (r) {
 		DSSERR("failed to init HDMI PLL\n");
 		return r;
diff --git a/drivers/gpu/drm/omapdrm/dss/hdmi_wp.c b/drivers/gpu/drm/omapdrm/dss/hdmi_wp.c
index ab129df..88034fb 100644
--- a/drivers/gpu/drm/omapdrm/dss/hdmi_wp.c
+++ b/drivers/gpu/drm/omapdrm/dss/hdmi_wp.c
@@ -178,9 +178,7 @@ void hdmi_wp_video_config_timing(struct hdmi_wp_data *wp,
 	 * However, we don't support OMAP5 ES1 at all, so we can just check for
 	 * OMAP4 here.
 	 */
-	if (omapdss_get_version() == OMAPDSS_VER_OMAP4430_ES1 ||
-	    omapdss_get_version() == OMAPDSS_VER_OMAP4430_ES2 ||
-	    omapdss_get_version() == OMAPDSS_VER_OMAP4)
+	if (wp->version == 4)
 		hsync_len_offset = 0;
 
 	timing_h |= FLD_VAL(vm->hback_porch, 31, 20);
@@ -235,9 +233,7 @@ void hdmi_wp_audio_config_format(struct hdmi_wp_data *wp,
 	DSSDBG("Enter hdmi_wp_audio_config_format\n");
 
 	r = hdmi_read_reg(wp->base, HDMI_WP_AUDIO_CFG);
-	if (omapdss_get_version() == OMAPDSS_VER_OMAP4430_ES1 ||
-	    omapdss_get_version() == OMAPDSS_VER_OMAP4430_ES2 ||
-	    omapdss_get_version() == OMAPDSS_VER_OMAP4) {
+	if (wp->version == 4) {
 		r = FLD_MOD(r, aud_fmt->stereo_channels, 26, 24);
 		r = FLD_MOD(r, aud_fmt->active_chnnls_msk, 23, 16);
 	}
@@ -282,7 +278,8 @@ int hdmi_wp_audio_core_req_enable(struct hdmi_wp_data *wp, bool enable)
 	return 0;
 }
 
-int hdmi_wp_init(struct platform_device *pdev, struct hdmi_wp_data *wp)
+int hdmi_wp_init(struct platform_device *pdev, struct hdmi_wp_data *wp,
+		 unsigned int version)
 {
 	struct resource *res;
 
@@ -292,6 +289,7 @@ int hdmi_wp_init(struct platform_device *pdev, struct hdmi_wp_data *wp)
 		return PTR_ERR(wp->base);
 
 	wp->phys_base = res->start;
+	wp->version = version;
 
 	return 0;
 }
diff --git a/drivers/gpu/drm/omapdrm/dss/omapdss.h b/drivers/gpu/drm/omapdrm/dss/omapdss.h
index 85953a0b..47a3316 100644
--- a/drivers/gpu/drm/omapdrm/dss/omapdss.h
+++ b/drivers/gpu/drm/omapdrm/dss/omapdss.h
@@ -25,6 +25,7 @@
 #include <video/videomode.h>
 #include <linux/platform_data/omapdss.h>
 #include <uapi/drm/drm_mode.h>
+#include <drm/drm_crtc.h>
 
 #define DISPC_IRQ_FRAMEDONE		(1 << 0)
 #define DISPC_IRQ_VSYNC			(1 << 1)
@@ -241,13 +242,6 @@ struct omap_dss_dsi_config {
 	enum omap_dss_dsi_trans_mode trans_mode;
 };
 
-/* Hardcoded videomodes for tv. Venc only uses these to
- * identify the mode, and does not actually use the configs
- * itself. However, the configs should be something that
- * a normal monitor can also show */
-extern const struct videomode omap_dss_pal_vm;
-extern const struct videomode omap_dss_ntsc_vm;
-
 struct omap_dss_cpr_coefs {
 	s16 rr, rg, rb;
 	s16 gr, gg, gb;
@@ -403,6 +397,14 @@ struct omapdss_hdmi_ops {
 	int (*read_edid)(struct omap_dss_device *dssdev, u8 *buf, int len);
 	bool (*detect)(struct omap_dss_device *dssdev);
 
+	int (*register_hpd_cb)(struct omap_dss_device *dssdev,
+			       void (*cb)(void *cb_data,
+					  enum drm_connector_status status),
+			       void *cb_data);
+	void (*unregister_hpd_cb)(struct omap_dss_device *dssdev);
+	void (*enable_hpd)(struct omap_dss_device *dssdev);
+	void (*disable_hpd)(struct omap_dss_device *dssdev);
+
 	int (*set_hdmi_mode)(struct omap_dss_device *dssdev, bool hdmi_mode);
 	int (*set_infoframe)(struct omap_dss_device *dssdev,
 		const struct hdmi_avi_infoframe *avi);
@@ -567,12 +569,19 @@ struct omap_dss_driver {
 	int (*read_edid)(struct omap_dss_device *dssdev, u8 *buf, int len);
 	bool (*detect)(struct omap_dss_device *dssdev);
 
+	int (*register_hpd_cb)(struct omap_dss_device *dssdev,
+			       void (*cb)(void *cb_data,
+					  enum drm_connector_status status),
+			       void *cb_data);
+	void (*unregister_hpd_cb)(struct omap_dss_device *dssdev);
+	void (*enable_hpd)(struct omap_dss_device *dssdev);
+	void (*disable_hpd)(struct omap_dss_device *dssdev);
+
 	int (*set_hdmi_mode)(struct omap_dss_device *dssdev, bool hdmi_mode);
 	int (*set_hdmi_infoframe)(struct omap_dss_device *dssdev,
 		const struct hdmi_avi_infoframe *avi);
 };
 
-enum omapdss_version omapdss_get_version(void);
 bool omapdss_is_initialized(void);
 
 int omap_dss_register_driver(struct omap_dss_driver *);
diff --git a/drivers/gpu/drm/omapdrm/dss/pll.c b/drivers/gpu/drm/omapdrm/dss/pll.c
index 5e22130..9d9d9d4 100644
--- a/drivers/gpu/drm/omapdrm/dss/pll.c
+++ b/drivers/gpu/drm/omapdrm/dss/pll.c
@@ -215,8 +215,8 @@ bool dss_pll_calc_a(const struct dss_pll *pll, unsigned long clkin,
 		dss_pll_calc_func func, void *data)
 {
 	const struct dss_pll_hw *hw = pll->hw;
-	int n, n_min, n_max;
-	int m, m_min, m_max;
+	int n, n_start, n_stop, n_inc;
+	int m, m_start, m_stop, m_inc;
 	unsigned long fint, clkdco;
 	unsigned long pll_hw_max;
 	unsigned long fint_hw_min, fint_hw_max;
@@ -226,22 +226,33 @@ bool dss_pll_calc_a(const struct dss_pll *pll, unsigned long clkin,
 	fint_hw_min = hw->fint_min;
 	fint_hw_max = hw->fint_max;
 
-	n_min = max(DIV_ROUND_UP(clkin, fint_hw_max), 1ul);
-	n_max = min((unsigned)(clkin / fint_hw_min), hw->n_max);
+	n_start = max(DIV_ROUND_UP(clkin, fint_hw_max), 1ul);
+	n_stop = min((unsigned)(clkin / fint_hw_min), hw->n_max);
+	n_inc = 1;
+
+	if (hw->errata_i886) {
+		swap(n_start, n_stop);
+		n_inc = -1;
+	}
 
 	pll_max = pll_max ? pll_max : ULONG_MAX;
 
-	/* Try to find high N & M to avoid jitter (DRA7 errata i886) */
-	for (n = n_max; n >= n_min; --n) {
+	for (n = n_start; n != n_stop; n += n_inc) {
 		fint = clkin / n;
 
-		m_min = max(DIV_ROUND_UP(DIV_ROUND_UP(pll_min, fint), 2),
+		m_start = max(DIV_ROUND_UP(DIV_ROUND_UP(pll_min, fint), 2),
 				1ul);
-		m_max = min3((unsigned)(pll_max / fint / 2),
+		m_stop = min3((unsigned)(pll_max / fint / 2),
 				(unsigned)(pll_hw_max / fint / 2),
 				hw->m_max);
+		m_inc = 1;
 
-		for (m = m_max; m >= m_min; --m) {
+		if (hw->errata_i886) {
+			swap(m_start, m_stop);
+			m_inc = -1;
+		}
+
+		for (m = m_start; m != m_stop; m += m_inc) {
 			clkdco = 2 * m * fint;
 
 			if (func(n, m, fint, clkdco, data))
diff --git a/drivers/gpu/drm/omapdrm/dss/venc.c b/drivers/gpu/drm/omapdrm/dss/venc.c
index a6bfb39..d58da6f 100644
--- a/drivers/gpu/drm/omapdrm/dss/venc.c
+++ b/drivers/gpu/drm/omapdrm/dss/venc.c
@@ -37,10 +37,10 @@
 #include <linux/of.h>
 #include <linux/of_graph.h>
 #include <linux/component.h>
+#include <linux/sys_soc.h>
 
 #include "omapdss.h"
 #include "dss.h"
-#include "dss_features.h"
 
 /* Venc registers */
 #define VENC_REV_ID				0x00
@@ -263,7 +263,13 @@ static const struct venc_config venc_config_pal_bdghi = {
 	.fid_ext_start_y__fid_ext_offset_y	= 0x01380005,
 };
 
-const struct videomode omap_dss_pal_vm = {
+enum venc_videomode {
+	VENC_MODE_UNKNOWN,
+	VENC_MODE_PAL,
+	VENC_MODE_NTSC,
+};
+
+static const struct videomode omap_dss_pal_vm = {
 	.hactive	= 720,
 	.vactive	= 574,
 	.pixelclock	= 13500000,
@@ -279,9 +285,8 @@ const struct videomode omap_dss_pal_vm = {
 			  DISPLAY_FLAGS_PIXDATA_POSEDGE |
 			  DISPLAY_FLAGS_SYNC_NEGEDGE,
 };
-EXPORT_SYMBOL(omap_dss_pal_vm);
 
-const struct videomode omap_dss_ntsc_vm = {
+static const struct videomode omap_dss_ntsc_vm = {
 	.hactive	= 720,
 	.vactive	= 482,
 	.pixelclock	= 13500000,
@@ -297,7 +302,24 @@ const struct videomode omap_dss_ntsc_vm = {
 			  DISPLAY_FLAGS_PIXDATA_POSEDGE |
 			  DISPLAY_FLAGS_SYNC_NEGEDGE,
 };
-EXPORT_SYMBOL(omap_dss_ntsc_vm);
+
+static enum venc_videomode venc_get_videomode(const struct videomode *vm)
+{
+	if (!(vm->flags & DISPLAY_FLAGS_INTERLACED))
+		return VENC_MODE_UNKNOWN;
+
+	if (vm->pixelclock == omap_dss_pal_vm.pixelclock &&
+	    vm->hactive == omap_dss_pal_vm.hactive &&
+	    vm->vactive == omap_dss_pal_vm.vactive)
+		return VENC_MODE_PAL;
+
+	if (vm->pixelclock == omap_dss_ntsc_vm.pixelclock &&
+	    vm->hactive == omap_dss_ntsc_vm.hactive &&
+	    vm->vactive == omap_dss_ntsc_vm.vactive)
+		return VENC_MODE_NTSC;
+
+	return VENC_MODE_UNKNOWN;
+}
 
 static struct {
 	struct platform_device *pdev;
@@ -311,6 +333,7 @@ static struct {
 	struct videomode vm;
 	enum omap_dss_venc_type type;
 	bool invert_polarity;
+	bool requires_tv_dac_clk;
 
 	struct omap_dss_device output;
 } venc;
@@ -424,14 +447,14 @@ static void venc_runtime_put(void)
 
 static const struct venc_config *venc_timings_to_config(struct videomode *vm)
 {
-	if (memcmp(&omap_dss_pal_vm, vm, sizeof(*vm)) == 0)
+	switch (venc_get_videomode(vm)) {
+	default:
+		WARN_ON_ONCE(1);
+	case VENC_MODE_PAL:
 		return &venc_config_pal_trm;
-
-	if (memcmp(&omap_dss_ntsc_vm, vm, sizeof(*vm)) == 0)
+	case VENC_MODE_NTSC:
 		return &venc_config_ntsc_trm;
-
-	BUG();
-	return NULL;
+	}
 }
 
 static int venc_power_on(struct omap_dss_device *dssdev)
@@ -542,15 +565,28 @@ static void venc_display_disable(struct omap_dss_device *dssdev)
 static void venc_set_timings(struct omap_dss_device *dssdev,
 			     struct videomode *vm)
 {
+	struct videomode actual_vm;
+
 	DSSDBG("venc_set_timings\n");
 
 	mutex_lock(&venc.venc_lock);
 
+	switch (venc_get_videomode(vm)) {
+	default:
+		WARN_ON_ONCE(1);
+	case VENC_MODE_PAL:
+		actual_vm = omap_dss_pal_vm;
+		break;
+	case VENC_MODE_NTSC:
+		actual_vm = omap_dss_ntsc_vm;
+		break;
+	}
+
 	/* Reset WSS data when the TV standard changes. */
-	if (memcmp(&venc.vm, vm, sizeof(*vm)))
+	if (memcmp(&venc.vm, &actual_vm, sizeof(actual_vm)))
 		venc.wss_data = 0;
 
-	venc.vm = *vm;
+	venc.vm = actual_vm;
 
 	dispc_set_tv_pclk(13500000);
 
@@ -562,13 +598,13 @@ static int venc_check_timings(struct omap_dss_device *dssdev,
 {
 	DSSDBG("venc_check_timings\n");
 
-	if (memcmp(&omap_dss_pal_vm, vm, sizeof(*vm)) == 0)
+	switch (venc_get_videomode(vm)) {
+	case VENC_MODE_PAL:
+	case VENC_MODE_NTSC:
 		return 0;
-
-	if (memcmp(&omap_dss_ntsc_vm, vm, sizeof(*vm)) == 0)
-		return 0;
-
-	return -EINVAL;
+	default:
+		return -EINVAL;
+	}
 }
 
 static void venc_get_timings(struct omap_dss_device *dssdev,
@@ -693,7 +729,7 @@ static int venc_get_clocks(struct platform_device *pdev)
 {
 	struct clk *clk;
 
-	if (dss_has_feature(FEAT_VENC_REQUIRES_TV_DAC_CLK)) {
+	if (venc.requires_tv_dac_clk) {
 		clk = devm_clk_get(&pdev->dev, "tv_dac_clk");
 		if (IS_ERR(clk)) {
 			DSSERR("can't get tv_dac_clk\n");
@@ -828,6 +864,12 @@ static int venc_probe_of(struct platform_device *pdev)
 }
 
 /* VENC HW IP initialisation */
+static const struct soc_device_attribute venc_soc_devices[] = {
+	{ .machine = "OMAP3[45]*" },
+	{ .machine = "AM35*" },
+	{ /* sentinel */ }
+};
+
 static int venc_bind(struct device *dev, struct device *master, void *data)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -837,6 +879,10 @@ static int venc_bind(struct device *dev, struct device *master, void *data)
 
 	venc.pdev = pdev;
 
+	/* The OMAP34xx, OMAP35xx and AM35xx VENC require the TV DAC clock. */
+	if (soc_device_match(venc_soc_devices))
+		venc.requires_tv_dac_clk = true;
+
 	mutex_init(&venc.venc_lock);
 
 	venc.wss_data = 0;
diff --git a/drivers/gpu/drm/omapdrm/dss/video-pll.c b/drivers/gpu/drm/omapdrm/dss/video-pll.c
index fbd1263..38a239c 100644
--- a/drivers/gpu/drm/omapdrm/dss/video-pll.c
+++ b/drivers/gpu/drm/omapdrm/dss/video-pll.c
@@ -19,7 +19,6 @@
 
 #include "omapdss.h"
 #include "dss.h"
-#include "dss_features.h"
 
 struct dss_video_pll {
 	struct dss_pll pll;
@@ -131,6 +130,8 @@ static const struct dss_pll_hw dss_dra7_video_pll_hw = {
 	.mX_lsb[3] = 5,
 
 	.has_refsel = true,
+
+	.errata_i886 = true,
 };
 
 struct dss_pll *dss_video_pll_init(struct platform_device *pdev, int id,
diff --git a/drivers/gpu/drm/omapdrm/omap_connector.c b/drivers/gpu/drm/omapdrm/omap_connector.c
index c24b6b7..aa5ba9a 100644
--- a/drivers/gpu/drm/omapdrm/omap_connector.c
+++ b/drivers/gpu/drm/omapdrm/omap_connector.c
@@ -35,6 +35,23 @@ struct omap_connector {
 	bool hdmi_mode;
 };
 
+static void omap_connector_hpd_cb(void *cb_data,
+				  enum drm_connector_status status)
+{
+	struct omap_connector *omap_connector = cb_data;
+	struct drm_connector *connector = &omap_connector->base;
+	struct drm_device *dev = connector->dev;
+	enum drm_connector_status old_status;
+
+	mutex_lock(&dev->mode_config.mutex);
+	old_status = connector->status;
+	connector->status = status;
+	mutex_unlock(&dev->mode_config.mutex);
+
+	if (old_status != status)
+		drm_kms_helper_hotplug_event(dev);
+}
+
 bool omap_connector_get_hdmi_mode(struct drm_connector *connector)
 {
 	struct omap_connector *omap_connector = to_omap_connector(connector);
@@ -75,6 +92,10 @@ static void omap_connector_destroy(struct drm_connector *connector)
 	struct omap_dss_device *dssdev = omap_connector->dssdev;
 
 	DBG("%s", omap_connector->dssdev->name);
+	if (connector->polled == DRM_CONNECTOR_POLL_HPD &&
+	    dssdev->driver->unregister_hpd_cb) {
+		dssdev->driver->unregister_hpd_cb(dssdev);
+	}
 	drm_connector_unregister(connector);
 	drm_connector_cleanup(connector);
 	kfree(omap_connector);
@@ -195,7 +216,6 @@ static int omap_connector_mode_valid(struct drm_connector *connector,
 }
 
 static const struct drm_connector_funcs omap_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.detect = omap_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
@@ -216,6 +236,7 @@ struct drm_connector *omap_connector_init(struct drm_device *dev,
 {
 	struct drm_connector *connector = NULL;
 	struct omap_connector *omap_connector;
+	bool hpd_supported = false;
 
 	DBG("%s", dssdev->name);
 
@@ -233,7 +254,20 @@ struct drm_connector *omap_connector_init(struct drm_device *dev,
 				connector_type);
 	drm_connector_helper_add(connector, &omap_connector_helper_funcs);
 
-	if (dssdev->driver->detect)
+	if (dssdev->driver->register_hpd_cb) {
+		int ret = dssdev->driver->register_hpd_cb(dssdev,
+							  omap_connector_hpd_cb,
+							  omap_connector);
+		if (!ret)
+			hpd_supported = true;
+		else if (ret != -ENOTSUPP)
+			DBG("%s: Failed to register HPD callback (%d).",
+			    dssdev->name, ret);
+	}
+
+	if (hpd_supported)
+		connector->polled = DRM_CONNECTOR_POLL_HPD;
+	else if (dssdev->driver->detect)
 		connector->polled = DRM_CONNECTOR_POLL_CONNECT |
 				    DRM_CONNECTOR_POLL_DISCONNECT;
 	else
diff --git a/drivers/gpu/drm/omapdrm/omap_crtc.c b/drivers/gpu/drm/omapdrm/omap_crtc.c
index dd0ef40..cc85c16 100644
--- a/drivers/gpu/drm/omapdrm/omap_crtc.c
+++ b/drivers/gpu/drm/omapdrm/omap_crtc.c
@@ -26,6 +26,16 @@
 
 #include "omap_drv.h"
 
+#define to_omap_crtc_state(x) container_of(x, struct omap_crtc_state, base)
+
+struct omap_crtc_state {
+	/* Must be first. */
+	struct drm_crtc_state base;
+	/* Shadow values for legacy userspace support. */
+	unsigned int rotation;
+	unsigned int zpos;
+};
+
 #define to_omap_crtc(x) container_of(x, struct omap_crtc, base)
 
 struct omap_crtc {
@@ -356,7 +366,8 @@ static void omap_crtc_arm_event(struct drm_crtc *crtc)
 	}
 }
 
-static void omap_crtc_enable(struct drm_crtc *crtc)
+static void omap_crtc_atomic_enable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
 	int ret;
@@ -372,7 +383,8 @@ static void omap_crtc_enable(struct drm_crtc *crtc)
 	spin_unlock_irq(&crtc->dev->event_lock);
 }
 
-static void omap_crtc_disable(struct drm_crtc *crtc)
+static void omap_crtc_atomic_disable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	struct omap_crtc *omap_crtc = to_omap_crtc(crtc);
 
@@ -443,6 +455,8 @@ static void omap_crtc_mode_set_nofb(struct drm_crtc *crtc)
 static int omap_crtc_atomic_check(struct drm_crtc *crtc,
 				struct drm_crtc_state *state)
 {
+	struct drm_plane_state *pri_state;
+
 	if (state->color_mgmt_changed && state->gamma_lut) {
 		uint length = state->gamma_lut->length /
 			sizeof(struct drm_color_lut);
@@ -451,6 +465,16 @@ static int omap_crtc_atomic_check(struct drm_crtc *crtc,
 			return -EINVAL;
 	}
 
+	pri_state = drm_atomic_get_new_plane_state(state->state, crtc->primary);
+	if (pri_state) {
+		struct omap_crtc_state *omap_crtc_state =
+			to_omap_crtc_state(state);
+
+		/* Mirror new values for zpos and rotation in omap_crtc_state */
+		omap_crtc_state->zpos = pri_state->zpos;
+		omap_crtc_state->rotation = pri_state->rotation;
+	}
+
 	return 0;
 }
 
@@ -496,39 +520,32 @@ static void omap_crtc_atomic_flush(struct drm_crtc *crtc,
 	spin_unlock_irq(&crtc->dev->event_lock);
 }
 
-static bool omap_crtc_is_plane_prop(struct drm_crtc *crtc,
-	struct drm_property *property)
-{
-	struct drm_device *dev = crtc->dev;
-	struct omap_drm_private *priv = dev->dev_private;
-
-	return property == priv->zorder_prop ||
-		property == crtc->primary->rotation_property;
-}
-
 static int omap_crtc_atomic_set_property(struct drm_crtc *crtc,
 					 struct drm_crtc_state *state,
 					 struct drm_property *property,
 					 uint64_t val)
 {
-	if (omap_crtc_is_plane_prop(crtc, property)) {
-		struct drm_plane_state *plane_state;
-		struct drm_plane *plane = crtc->primary;
+	struct omap_drm_private *priv = crtc->dev->dev_private;
+	struct drm_plane_state *plane_state;
 
-		/*
-		 * Delegate property set to the primary plane. Get the plane
-		 * state and set the property directly.
-		 */
+	/*
+	 * Delegate property set to the primary plane. Get the plane state and
+	 * set the property directly; the shadow copy will be assigned in the
+	 * omap_crtc_atomic_check callback. This way updates to plane state will
+	 * always be mirrored in the crtc state correctly.
+	 */
+	plane_state = drm_atomic_get_plane_state(state->state, crtc->primary);
+	if (IS_ERR(plane_state))
+		return PTR_ERR(plane_state);
 
-		plane_state = drm_atomic_get_plane_state(state->state, plane);
-		if (IS_ERR(plane_state))
-			return PTR_ERR(plane_state);
+	if (property == crtc->primary->rotation_property)
+		plane_state->rotation = val;
+	else if (property == priv->zorder_prop)
+		plane_state->zpos = val;
+	else
+		return -EINVAL;
 
-		return drm_atomic_plane_set_property(plane, plane_state,
-				property, val);
-	}
-
-	return -EINVAL;
+	return 0;
 }
 
 static int omap_crtc_atomic_get_property(struct drm_crtc *crtc,
@@ -536,28 +553,60 @@ static int omap_crtc_atomic_get_property(struct drm_crtc *crtc,
 					 struct drm_property *property,
 					 uint64_t *val)
 {
-	if (omap_crtc_is_plane_prop(crtc, property)) {
-		/*
-		 * Delegate property get to the primary plane. The
-		 * drm_atomic_plane_get_property() function isn't exported, but
-		 * can be called through drm_object_property_get_value() as that
-		 * will call drm_atomic_get_property() for atomic drivers.
-		 */
-		return drm_object_property_get_value(&crtc->primary->base,
-				property, val);
-	}
+	struct omap_drm_private *priv = crtc->dev->dev_private;
+	struct omap_crtc_state *omap_state = to_omap_crtc_state(state);
 
-	return -EINVAL;
+	if (property == crtc->primary->rotation_property)
+		*val = omap_state->rotation;
+	else if (property == priv->zorder_prop)
+		*val = omap_state->zpos;
+	else
+		return -EINVAL;
+
+	return 0;
+}
+
+static void omap_crtc_reset(struct drm_crtc *crtc)
+{
+	if (crtc->state)
+		__drm_atomic_helper_crtc_destroy_state(crtc->state);
+
+	kfree(crtc->state);
+	crtc->state = kzalloc(sizeof(struct omap_crtc_state), GFP_KERNEL);
+
+	if (crtc->state)
+		crtc->state->crtc = crtc;
+}
+
+static struct drm_crtc_state *
+omap_crtc_duplicate_state(struct drm_crtc *crtc)
+{
+	struct omap_crtc_state *state, *current_state;
+
+	if (WARN_ON(!crtc->state))
+		return NULL;
+
+	current_state = to_omap_crtc_state(crtc->state);
+
+	state = kmalloc(sizeof(*state), GFP_KERNEL);
+	if (!state)
+		return NULL;
+
+	__drm_atomic_helper_crtc_duplicate_state(crtc, &state->base);
+
+	state->zpos = current_state->zpos;
+	state->rotation = current_state->rotation;
+
+	return &state->base;
 }
 
 static const struct drm_crtc_funcs omap_crtc_funcs = {
-	.reset = drm_atomic_helper_crtc_reset,
+	.reset = omap_crtc_reset,
 	.set_config = drm_atomic_helper_set_config,
 	.destroy = omap_crtc_destroy,
 	.page_flip = drm_atomic_helper_page_flip,
 	.gamma_set = drm_atomic_helper_legacy_gamma_set,
-	.set_property = drm_atomic_helper_crtc_set_property,
-	.atomic_duplicate_state = drm_atomic_helper_crtc_duplicate_state,
+	.atomic_duplicate_state = omap_crtc_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_crtc_destroy_state,
 	.atomic_set_property = omap_crtc_atomic_set_property,
 	.atomic_get_property = omap_crtc_atomic_get_property,
@@ -567,11 +616,11 @@ static const struct drm_crtc_funcs omap_crtc_funcs = {
 
 static const struct drm_crtc_helper_funcs omap_crtc_helper_funcs = {
 	.mode_set_nofb = omap_crtc_mode_set_nofb,
-	.disable = omap_crtc_disable,
-	.enable = omap_crtc_enable,
 	.atomic_check = omap_crtc_atomic_check,
 	.atomic_begin = omap_crtc_atomic_begin,
 	.atomic_flush = omap_crtc_atomic_flush,
+	.atomic_enable = omap_crtc_atomic_enable,
+	.atomic_disable = omap_crtc_atomic_disable,
 };
 
 /* -----------------------------------------------------------------------------
diff --git a/drivers/gpu/drm/omapdrm/omap_drv.c b/drivers/gpu/drm/omapdrm/omap_drv.c
index 022029e..cdf5b06 100644
--- a/drivers/gpu/drm/omapdrm/omap_drv.c
+++ b/drivers/gpu/drm/omapdrm/omap_drv.c
@@ -57,13 +57,13 @@ static void omap_fb_output_poll_changed(struct drm_device *dev)
 static void omap_atomic_wait_for_completion(struct drm_device *dev,
 					    struct drm_atomic_state *old_state)
 {
-	struct drm_crtc_state *old_crtc_state;
+	struct drm_crtc_state *new_crtc_state;
 	struct drm_crtc *crtc;
 	unsigned int i;
 	int ret;
 
-	for_each_crtc_in_state(old_state, crtc, old_crtc_state, i) {
-		if (!crtc->state->enable)
+	for_each_new_crtc_in_state(old_state, crtc, new_crtc_state, i) {
+		if (!new_crtc_state->active)
 			continue;
 
 		ret = omap_crtc_wait_pending(crtc);
@@ -84,23 +84,36 @@ static void omap_atomic_commit_tail(struct drm_atomic_state *old_state)
 	/* Apply the atomic update. */
 	drm_atomic_helper_commit_modeset_disables(dev, old_state);
 
-	/* With the current dss dispc implementation we have to enable
-	 * the new modeset before we can commit planes. The dispc ovl
-	 * configuration relies on the video mode configuration been
-	 * written into the HW when the ovl configuration is
-	 * calculated.
-	 *
-	 * This approach is not ideal because after a mode change the
-	 * plane update is executed only after the first vblank
-	 * interrupt. The dispc implementation should be fixed so that
-	 * it is able use uncommitted drm state information.
-	 */
-	drm_atomic_helper_commit_modeset_enables(dev, old_state);
-	omap_atomic_wait_for_completion(dev, old_state);
+	if (priv->omaprev != 0x3430) {
+		/* With the current dss dispc implementation we have to enable
+		 * the new modeset before we can commit planes. The dispc ovl
+		 * configuration relies on the video mode configuration being
+		 * written into the HW when the ovl configuration is
+		 * calculated.
+		 *
+		 * This approach is not ideal because after a mode change the
+		 * plane update is executed only after the first vblank
+		 * interrupt. The dispc implementation should be fixed so that
+		 * it is able to use uncommitted drm state information.
+		 */
+		drm_atomic_helper_commit_modeset_enables(dev, old_state);
+		omap_atomic_wait_for_completion(dev, old_state);
 
-	drm_atomic_helper_commit_planes(dev, old_state, 0);
+		drm_atomic_helper_commit_planes(dev, old_state, 0);
 
-	drm_atomic_helper_commit_hw_done(old_state);
+		drm_atomic_helper_commit_hw_done(old_state);
+	} else {
+		/*
+		 * OMAP3 DSS seems to have issues with the work-around above,
+		 * resulting in endless sync lost errors if a crtc is enabled
+		 * without a plane. For now, skip the WA for OMAP3.
+		 */
+		drm_atomic_helper_commit_planes(dev, old_state, 0);
+
+		drm_atomic_helper_commit_modeset_enables(dev, old_state);
+
+		drm_atomic_helper_commit_hw_done(old_state);
+	}
 
 	/*
 	 * Wait for completion of the page flips to ensure that old buffers
@@ -324,6 +337,32 @@ static int omap_modeset_init(struct drm_device *dev)
 }
 
 /*
+ * Enable the HPD in external components if supported
+ */
+static void omap_modeset_enable_external_hpd(void)
+{
+	struct omap_dss_device *dssdev = NULL;
+
+	for_each_dss_dev(dssdev) {
+		if (dssdev->driver->enable_hpd)
+			dssdev->driver->enable_hpd(dssdev);
+	}
+}
+
+/*
+ * Disable the HPD in external components if supported
+ */
+static void omap_modeset_disable_external_hpd(void)
+{
+	struct omap_dss_device *dssdev = NULL;
+
+	for_each_dss_dev(dssdev) {
+		if (dssdev->driver->disable_hpd)
+			dssdev->driver->disable_hpd(dssdev);
+	}
+}
+
+/*
  * drm ioctl funcs
  */
 
@@ -438,44 +477,11 @@ static int dev_open(struct drm_device *dev, struct drm_file *file)
  */
 static void dev_lastclose(struct drm_device *dev)
 {
-	int i;
-
-	/* we don't support vga_switcheroo.. so just make sure the fbdev
-	 * mode is active
-	 */
 	struct omap_drm_private *priv = dev->dev_private;
 	int ret;
 
 	DBG("lastclose: dev=%p", dev);
 
-	/* need to restore default rotation state.. not sure
-	 * if there is a cleaner way to restore properties to
-	 * default state?  Maybe a flag that properties should
-	 * automatically be restored to default state on
-	 * lastclose?
-	 */
-	for (i = 0; i < priv->num_crtcs; i++) {
-		struct drm_crtc *crtc = priv->crtcs[i];
-
-		if (!crtc->primary->rotation_property)
-			continue;
-
-		drm_object_property_set_value(&crtc->base,
-					      crtc->primary->rotation_property,
-					      DRM_MODE_ROTATE_0);
-	}
-
-	for (i = 0; i < priv->num_planes; i++) {
-		struct drm_plane *plane = priv->planes[i];
-
-		if (!plane->rotation_property)
-			continue;
-
-		drm_object_property_set_value(&plane->base,
-					      plane->rotation_property,
-					      DRM_MODE_ROTATE_0);
-	}
-
 	if (priv->fbdev) {
 		ret = drm_fb_helper_restore_fbdev_mode_unlocked(priv->fbdev);
 		if (ret)
@@ -517,7 +523,6 @@ static struct drm_driver omap_drm_driver = {
 	.gem_vm_ops = &omap_gem_vm_ops,
 	.dumb_create = omap_gem_dumb_create,
 	.dumb_map_offset = omap_gem_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.ioctls = ioctls,
 	.num_ioctls = DRM_OMAP_NUM_IOCTLS,
 	.fops = &omapdriver_fops,
@@ -550,6 +555,12 @@ static int pdev_probe(struct platform_device *pdev)
 	if (omapdss_is_initialized() == false)
 		return -EPROBE_DEFER;
 
+	ret = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
+	if (ret) {
+		dev_err(&pdev->dev, "Failed to set the DMA mask\n");
+		return ret;
+	}
+
 	omap_crtc_pre_init();
 
 	ret = omap_connect_dssdevs();
@@ -603,6 +614,7 @@ static int pdev_probe(struct platform_device *pdev)
 	priv->fbdev = omap_fbdev_init(ddev);
 
 	drm_kms_helper_poll_init(ddev);
+	omap_modeset_enable_external_hpd();
 
 	/*
 	 * Register the DRM device with the core and the connectors with
@@ -615,6 +627,7 @@ static int pdev_probe(struct platform_device *pdev)
 	return 0;
 
 err_cleanup_helpers:
+	omap_modeset_disable_external_hpd();
 	drm_kms_helper_poll_fini(ddev);
 	if (priv->fbdev)
 		omap_fbdev_free(ddev);
@@ -643,6 +656,7 @@ static int pdev_remove(struct platform_device *pdev)
 
 	drm_dev_unregister(ddev);
 
+	omap_modeset_disable_external_hpd();
 	drm_kms_helper_poll_fini(ddev);
 
 	if (priv->fbdev)
@@ -734,7 +748,7 @@ static SIMPLE_DEV_PM_OPS(omapdrm_pm_ops, omap_drm_suspend, omap_drm_resume);
 
 static struct platform_driver pdev = {
 	.driver = {
-		.name = DRIVER_NAME,
+		.name = "omapdrm",
 		.pm = &omapdrm_pm_ops,
 	},
 	.probe = pdev_probe,
diff --git a/drivers/gpu/drm/omapdrm/omap_encoder.c b/drivers/gpu/drm/omapdrm/omap_encoder.c
index 86c977b..624f5b50 100644
--- a/drivers/gpu/drm/omapdrm/omap_encoder.c
+++ b/drivers/gpu/drm/omapdrm/omap_encoder.c
@@ -85,7 +85,8 @@ static void omap_encoder_mode_set(struct drm_encoder *encoder,
 	if (hdmi_mode && dssdev->driver->set_hdmi_infoframe) {
 		struct hdmi_avi_infoframe avi;
 
-		r = drm_hdmi_avi_infoframe_from_display_mode(&avi, adjusted_mode);
+		r = drm_hdmi_avi_infoframe_from_display_mode(&avi, adjusted_mode,
+							     false);
 		if (r == 0)
 			dssdev->driver->set_hdmi_infoframe(dssdev, &avi);
 	}
diff --git a/drivers/gpu/drm/omapdrm/omap_fb.c b/drivers/gpu/drm/omapdrm/omap_fb.c
index ddf7a45..b1a762b 100644
--- a/drivers/gpu/drm/omapdrm/omap_fb.c
+++ b/drivers/gpu/drm/omapdrm/omap_fb.c
@@ -379,7 +379,7 @@ struct drm_framebuffer *omap_framebuffer_create(struct drm_device *dev,
 	return fb;
 
 error:
-	while (--i > 0)
+	while (--i >= 0)
 		drm_gem_object_unreference_unlocked(bos[i]);
 
 	return fb;
diff --git a/drivers/gpu/drm/omapdrm/omap_fbdev.c b/drivers/gpu/drm/omapdrm/omap_fbdev.c
index daf81a0..9273118 100644
--- a/drivers/gpu/drm/omapdrm/omap_fbdev.c
+++ b/drivers/gpu/drm/omapdrm/omap_fbdev.c
@@ -184,7 +184,6 @@ static int omap_fbdev_create(struct drm_fb_helper *helper,
 	helper->fb = fb;
 
 	fbi->par = helper;
-	fbi->flags = FBINFO_DEFAULT;
 	fbi->fbops = &omap_fb_ops;
 
 	strcpy(fbi->fix.id, MODULE_NAME);
diff --git a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
index 863a881..afdbad5 100644
--- a/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
+++ b/drivers/gpu/drm/omapdrm/omap_gem_dmabuf.c
@@ -144,7 +144,7 @@ static int omap_gem_dmabuf_mmap(struct dma_buf *buffer,
 	return omap_gem_mmap_obj(obj, vma);
 }
 
-static struct dma_buf_ops omap_dmabuf_ops = {
+static const struct dma_buf_ops omap_dmabuf_ops = {
 	.map_dma_buf = omap_gem_map_dma_buf,
 	.unmap_dma_buf = omap_gem_unmap_dma_buf,
 	.release = drm_gem_dmabuf_release,
diff --git a/drivers/gpu/drm/omapdrm/omap_plane.c b/drivers/gpu/drm/omapdrm/omap_plane.c
index 2160f64..15e5d5d 100644
--- a/drivers/gpu/drm/omapdrm/omap_plane.c
+++ b/drivers/gpu/drm/omapdrm/omap_plane.c
@@ -235,7 +235,6 @@ static const struct drm_plane_funcs omap_plane_funcs = {
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.reset = omap_plane_reset,
 	.destroy = omap_plane_destroy,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_plane_destroy_state,
 	.atomic_set_property = omap_plane_atomic_set_property,
@@ -291,7 +290,7 @@ struct drm_plane *omap_plane_init(struct drm_device *dev,
 
 	ret = drm_universal_plane_init(dev, plane, possible_crtcs,
 				       &omap_plane_funcs, formats,
-				       nformats, type, NULL);
+				       nformats, NULL, type, NULL);
 	if (ret < 0)
 		goto error;
 
diff --git a/drivers/gpu/drm/panel/panel-lvds.c b/drivers/gpu/drm/panel/panel-lvds.c
index 3216aa9..e2d57c0 100644
--- a/drivers/gpu/drm/panel/panel-lvds.c
+++ b/drivers/gpu/drm/panel/panel-lvds.c
@@ -143,14 +143,14 @@ static int panel_lvds_parse_dt(struct panel_lvds *lvds)
 
 	ret = of_property_read_u32(np, "width-mm", &lvds->width);
 	if (ret < 0) {
-		dev_err(lvds->dev, "%s: invalid or missing %s DT property\n",
-			of_node_full_name(np), "width-mm");
+		dev_err(lvds->dev, "%pOF: invalid or missing %s DT property\n",
+			np, "width-mm");
 		return -ENODEV;
 	}
 	ret = of_property_read_u32(np, "height-mm", &lvds->height);
 	if (ret < 0) {
-		dev_err(lvds->dev, "%s: invalid or missing %s DT property\n",
-			of_node_full_name(np), "height-mm");
+		dev_err(lvds->dev, "%pOF: invalid or missing %s DT property\n",
+			np, "height-mm");
 		return -ENODEV;
 	}
 
@@ -158,8 +158,8 @@ static int panel_lvds_parse_dt(struct panel_lvds *lvds)
 
 	ret = of_property_read_string(np, "data-mapping", &mapping);
 	if (ret < 0) {
-		dev_err(lvds->dev, "%s: invalid or missing %s DT property\n",
-			of_node_full_name(np), "data-mapping");
+		dev_err(lvds->dev, "%pOF: invalid or missing %s DT property\n",
+			np, "data-mapping");
 		return -ENODEV;
 	}
 
@@ -170,8 +170,8 @@ static int panel_lvds_parse_dt(struct panel_lvds *lvds)
 	} else if (!strcmp(mapping, "vesa-24")) {
 		lvds->bus_format = MEDIA_BUS_FMT_RGB888_1X7X4_SPWG;
 	} else {
-		dev_err(lvds->dev, "%s: invalid or missing %s DT property\n",
-			of_node_full_name(np), "data-mapping");
+		dev_err(lvds->dev, "%pOF: invalid or missing %s DT property\n",
+			np, "data-mapping");
 		return -EINVAL;
 	}
 
diff --git a/drivers/gpu/drm/pl111/pl111_connector.c b/drivers/gpu/drm/pl111/pl111_connector.c
index 3f213d7..d335f9a 100644
--- a/drivers/gpu/drm/pl111/pl111_connector.c
+++ b/drivers/gpu/drm/pl111/pl111_connector.c
@@ -69,7 +69,6 @@ const struct drm_connector_funcs connector_funcs = {
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = pl111_connector_destroy,
 	.detect = pl111_connector_detect,
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
diff --git a/drivers/gpu/drm/pl111/pl111_display.c b/drivers/gpu/drm/pl111/pl111_display.c
index c6ca4f1b..b58c988 100644
--- a/drivers/gpu/drm/pl111/pl111_display.c
+++ b/drivers/gpu/drm/pl111/pl111_display.c
@@ -23,6 +23,7 @@
 #include <drm/drmP.h>
 #include <drm/drm_panel.h>
 #include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 
 #include "pl111_drm.h"
@@ -274,7 +275,7 @@ void pl111_disable_vblank(struct drm_device *drm, unsigned int crtc)
 static int pl111_display_prepare_fb(struct drm_simple_display_pipe *pipe,
 				    struct drm_plane_state *plane_state)
 {
-	return drm_fb_cma_prepare_fb(&pipe->plane, plane_state);
+	return drm_gem_fb_prepare_fb(&pipe->plane, plane_state);
 }
 
 static const struct drm_simple_display_pipe_funcs pl111_display_funcs = {
@@ -457,7 +458,7 @@ int pl111_display_init(struct drm_device *drm)
 	ret = drm_simple_display_pipe_init(drm, &priv->pipe,
 					   &pl111_display_funcs,
 					   formats, ARRAY_SIZE(formats),
-					   &priv->connector.connector);
+					   NULL, &priv->connector.connector);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/pl111/pl111_drv.c b/drivers/gpu/drm/pl111/pl111_drv.c
index ac8771b..581c452 100644
--- a/drivers/gpu/drm/pl111/pl111_drv.c
+++ b/drivers/gpu/drm/pl111/pl111_drv.c
@@ -66,14 +66,15 @@
 #include <drm/drm_atomic_helper.h>
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_gem_cma_helper.h>
+#include <drm/drm_gem_framebuffer_helper.h>
 #include <drm/drm_fb_cma_helper.h>
 
 #include "pl111_drm.h"
 
 #define DRIVER_DESC      "DRM module for PL111"
 
-static struct drm_mode_config_funcs mode_config_funcs = {
-	.fb_create = drm_fb_cma_create,
+static const struct drm_mode_config_funcs mode_config_funcs = {
+	.fb_create = drm_gem_fb_create,
 	.atomic_check = drm_atomic_helper_check,
 	.atomic_commit = drm_atomic_helper_commit,
 };
@@ -159,9 +160,7 @@ static struct drm_driver pl111_drm_driver = {
 	.minor = 0,
 	.patchlevel = 0,
 	.dumb_create = drm_gem_cma_dumb_create,
-	.dumb_destroy = drm_gem_dumb_destroy,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.gem_free_object = drm_gem_cma_free_object,
+	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops = &drm_gem_cma_vm_ops,
 
 	.enable_vblank = pl111_enable_vblank,
diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c
index 03fe182..14c5613 100644
--- a/drivers/gpu/drm/qxl/qxl_display.c
+++ b/drivers/gpu/drm/qxl/qxl_display.c
@@ -378,10 +378,6 @@ qxl_framebuffer_init(struct drm_device *dev,
 	return 0;
 }
 
-static void qxl_crtc_dpms(struct drm_crtc *crtc, int mode)
-{
-}
-
 static bool qxl_crtc_mode_fixup(struct drm_crtc *crtc,
 				  const struct drm_display_mode *mode,
 				  struct drm_display_mode *adjusted_mode)
@@ -437,7 +433,7 @@ static void qxl_monitors_config_set(struct qxl_device *qdev,
 
 }
 
-void qxl_mode_set_nofb(struct drm_crtc *crtc)
+static void qxl_mode_set_nofb(struct drm_crtc *crtc)
 {
 	struct qxl_device *qdev = crtc->dev->dev_private;
 	struct qxl_crtc *qcrtc = to_qxl_crtc(crtc);
@@ -451,12 +447,14 @@ void qxl_mode_set_nofb(struct drm_crtc *crtc)
 
 }
 
-static void qxl_crtc_commit(struct drm_crtc *crtc)
+static void qxl_crtc_atomic_enable(struct drm_crtc *crtc,
+				   struct drm_crtc_state *old_state)
 {
 	DRM_DEBUG("\n");
 }
 
-static void qxl_crtc_disable(struct drm_crtc *crtc)
+static void qxl_crtc_atomic_disable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct qxl_crtc *qcrtc = to_qxl_crtc(crtc);
 	struct qxl_device *qdev = crtc->dev->dev_private;
@@ -467,16 +465,15 @@ static void qxl_crtc_disable(struct drm_crtc *crtc)
 }
 
 static const struct drm_crtc_helper_funcs qxl_crtc_helper_funcs = {
-	.dpms = qxl_crtc_dpms,
-	.disable = qxl_crtc_disable,
 	.mode_fixup = qxl_crtc_mode_fixup,
 	.mode_set_nofb = qxl_mode_set_nofb,
-	.commit = qxl_crtc_commit,
 	.atomic_flush = qxl_crtc_atomic_flush,
+	.atomic_enable = qxl_crtc_atomic_enable,
+	.atomic_disable = qxl_crtc_atomic_disable,
 };
 
-int qxl_primary_atomic_check(struct drm_plane *plane,
-			     struct drm_plane_state *state)
+static int qxl_primary_atomic_check(struct drm_plane *plane,
+				    struct drm_plane_state *state)
 {
 	struct qxl_device *qdev = plane->dev->dev_private;
 	struct qxl_framebuffer *qfb;
@@ -547,8 +544,8 @@ static void qxl_primary_atomic_disable(struct drm_plane *plane,
 	}
 }
 
-int qxl_plane_atomic_check(struct drm_plane *plane,
-			   struct drm_plane_state *state)
+static int qxl_plane_atomic_check(struct drm_plane *plane,
+				  struct drm_plane_state *state)
 {
 	return 0;
 }
@@ -647,8 +644,8 @@ static void qxl_cursor_atomic_update(struct drm_plane *plane,
 
 }
 
-void qxl_cursor_atomic_disable(struct drm_plane *plane,
-			       struct drm_plane_state *old_state)
+static void qxl_cursor_atomic_disable(struct drm_plane *plane,
+				      struct drm_plane_state *old_state)
 {
 	struct qxl_device *qdev = plane->dev->dev_private;
 	struct qxl_release *release;
@@ -675,8 +672,8 @@ void qxl_cursor_atomic_disable(struct drm_plane *plane,
 	qxl_release_fence_buffer_objects(release);
 }
 
-int qxl_plane_prepare_fb(struct drm_plane *plane,
-			 struct drm_plane_state *new_state)
+static int qxl_plane_prepare_fb(struct drm_plane *plane,
+				struct drm_plane_state *new_state)
 {
 	struct drm_gem_object *obj;
 	struct qxl_bo *user_bo;
@@ -787,7 +784,7 @@ static struct drm_plane *qxl_create_plane(struct qxl_device *qdev,
 
 	err = drm_universal_plane_init(&qdev->ddev, plane, possible_crtcs,
 				       funcs, formats, num_formats,
-				       type, NULL);
+				       NULL, type, NULL);
 	if (err)
 		goto free_plane;
 
diff --git a/drivers/gpu/drm/qxl/qxl_drv.c b/drivers/gpu/drm/qxl/qxl_drv.c
index c2fc201..2445e75 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.c
+++ b/drivers/gpu/drm/qxl/qxl_drv.c
@@ -37,7 +37,6 @@
 #include "qxl_drv.h"
 #include "qxl_object.h"
 
-extern int qxl_max_ioctls;
 static const struct pci_device_id pciidlist[] = {
 	{ 0x1b36, 0x100, PCI_ANY_ID, PCI_ANY_ID, PCI_CLASS_DISPLAY_VGA << 8,
 	  0xffff00, 0 },
@@ -262,11 +261,8 @@ static struct drm_driver qxl_driver = {
 			   DRIVER_HAVE_IRQ | DRIVER_IRQ_SHARED |
 			   DRIVER_ATOMIC,
 
-	.set_busid = drm_pci_set_busid,
-
 	.dumb_create = qxl_mode_dumb_create,
 	.dumb_map_offset = qxl_mode_dumb_mmap,
-	.dumb_destroy = drm_gem_dumb_destroy,
 #if defined(CONFIG_DEBUG_FS)
 	.debugfs_init = qxl_debugfs_init,
 #endif
@@ -303,12 +299,12 @@ static int __init qxl_init(void)
 	if (qxl_modeset == 0)
 		return -EINVAL;
 	qxl_driver.num_ioctls = qxl_max_ioctls;
-	return drm_pci_init(&qxl_driver, &qxl_pci_driver);
+	return pci_register_driver(&qxl_pci_driver);
 }
 
 static void __exit qxl_exit(void)
 {
-	drm_pci_exit(&qxl_driver, &qxl_pci_driver);
+	pci_unregister_driver(&qxl_pci_driver);
 }
 
 module_init(qxl_init);
diff --git a/drivers/gpu/drm/qxl/qxl_drv.h b/drivers/gpu/drm/qxl/qxl_drv.h
index 3591d23..3397a19 100644
--- a/drivers/gpu/drm/qxl/qxl_drv.h
+++ b/drivers/gpu/drm/qxl/qxl_drv.h
@@ -64,6 +64,7 @@
 
 extern int qxl_log_level;
 extern int qxl_num_crtc;
+extern int qxl_max_ioctls;
 
 enum {
 	QXL_INFO_LEVEL = 1,
diff --git a/drivers/gpu/drm/qxl/qxl_fb.c b/drivers/gpu/drm/qxl/qxl_fb.c
index 573e7e9..844c4a3 100644
--- a/drivers/gpu/drm/qxl/qxl_fb.c
+++ b/drivers/gpu/drm/qxl/qxl_fb.c
@@ -275,7 +275,6 @@ static int qxlfb_create(struct qxl_fbdev *qfbdev,
 
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
 
-	info->flags = FBINFO_DEFAULT | FBINFO_HWACCEL_COPYAREA | FBINFO_HWACCEL_FILLRECT;
 	info->fbops = &qxlfb_ops;
 
 	/*
diff --git a/drivers/gpu/drm/qxl/qxl_ioctl.c b/drivers/gpu/drm/qxl/qxl_ioctl.c
index 0b82a87..31effed 100644
--- a/drivers/gpu/drm/qxl/qxl_ioctl.c
+++ b/drivers/gpu/drm/qxl/qxl_ioctl.c
@@ -163,7 +163,7 @@ static int qxl_process_single_command(struct qxl_device *qdev,
 		return -EINVAL;
 
 	if (!access_ok(VERIFY_READ,
-		       (void *)(unsigned long)cmd->command,
+		       u64_to_user_ptr(cmd->command),
 		       cmd->command_size))
 		return -EFAULT;
 
@@ -183,7 +183,9 @@ static int qxl_process_single_command(struct qxl_device *qdev,
 
 	/* TODO copy slow path code from i915 */
 	fb_cmd = qxl_bo_kmap_atomic_page(qdev, cmd_bo, (release->release_offset & PAGE_SIZE));
-	unwritten = __copy_from_user_inatomic_nocache(fb_cmd + sizeof(union qxl_release_info) + (release->release_offset & ~PAGE_SIZE), (void *)(unsigned long)cmd->command, cmd->command_size);
+	unwritten = __copy_from_user_inatomic_nocache
+		(fb_cmd + sizeof(union qxl_release_info) + (release->release_offset & ~PAGE_SIZE),
+		 u64_to_user_ptr(cmd->command), cmd->command_size);
 
 	{
 		struct qxl_drawable *draw = fb_cmd;
@@ -201,10 +203,9 @@ static int qxl_process_single_command(struct qxl_device *qdev,
 	num_relocs = 0;
 	for (i = 0; i < cmd->relocs_num; ++i) {
 		struct drm_qxl_reloc reloc;
+		struct drm_qxl_reloc __user *u = u64_to_user_ptr(cmd->relocs);
 
-		if (copy_from_user(&reloc,
-				       &((struct drm_qxl_reloc *)(uintptr_t)cmd->relocs)[i],
-				       sizeof(reloc))) {
+		if (copy_from_user(&reloc, u + i, sizeof(reloc))) {
 			ret = -EFAULT;
 			goto out_free_bos;
 		}
@@ -282,10 +283,10 @@ static int qxl_execbuffer_ioctl(struct drm_device *dev, void *data,
 
 	for (cmd_num = 0; cmd_num < execbuffer->commands_num; ++cmd_num) {
 
-		struct drm_qxl_command *commands =
-			(struct drm_qxl_command *)(uintptr_t)execbuffer->commands;
+		struct drm_qxl_command __user *commands =
+			u64_to_user_ptr(execbuffer->commands);
 
-		if (copy_from_user(&user_cmd, &commands[cmd_num],
+		if (copy_from_user(&user_cmd, commands + cmd_num,
 				       sizeof(user_cmd)))
 			return -EFAULT;
 
diff --git a/drivers/gpu/drm/qxl/qxl_object.c b/drivers/gpu/drm/qxl/qxl_object.c
index 9a7eef7..0a67ddf 100644
--- a/drivers/gpu/drm/qxl/qxl_object.c
+++ b/drivers/gpu/drm/qxl/qxl_object.c
@@ -221,7 +221,7 @@ struct qxl_bo *qxl_bo_ref(struct qxl_bo *bo)
 	return bo;
 }
 
-int __qxl_bo_pin(struct qxl_bo *bo, u32 domain, u64 *gpu_addr)
+static int __qxl_bo_pin(struct qxl_bo *bo, u32 domain, u64 *gpu_addr)
 {
 	struct drm_device *ddev = bo->gem_base.dev;
 	int r;
@@ -244,7 +244,7 @@ int __qxl_bo_pin(struct qxl_bo *bo, u32 domain, u64 *gpu_addr)
 	return r;
 }
 
-int __qxl_bo_unpin(struct qxl_bo *bo)
+static int __qxl_bo_unpin(struct qxl_bo *bo)
 {
 	struct drm_device *ddev = bo->gem_base.dev;
 	int r, i;
diff --git a/drivers/gpu/drm/qxl/qxl_ttm.c b/drivers/gpu/drm/qxl/qxl_ttm.c
index 87fc1db..7ecf8a4 100644
--- a/drivers/gpu/drm/qxl/qxl_ttm.c
+++ b/drivers/gpu/drm/qxl/qxl_ttm.c
@@ -187,7 +187,7 @@ static void qxl_evict_flags(struct ttm_buffer_object *bo,
 				struct ttm_placement *placement)
 {
 	struct qxl_bo *qbo;
-	static struct ttm_place placements = {
+	static const struct ttm_place placements = {
 		.fpfn = 0,
 		.lpfn = 0,
 		.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM
diff --git a/drivers/gpu/drm/r128/r128_drv.c b/drivers/gpu/drm/r128/r128_drv.c
index a982be5..0d2b7e4 100644
--- a/drivers/gpu/drm/r128/r128_drv.c
+++ b/drivers/gpu/drm/r128/r128_drv.c
@@ -62,7 +62,6 @@ static struct drm_driver driver = {
 	.load = r128_driver_load,
 	.preclose = r128_driver_preclose,
 	.lastclose = r128_driver_lastclose,
-	.set_busid = drm_pci_set_busid,
 	.get_vblank_counter = r128_get_vblank_counter,
 	.enable_vblank = r128_enable_vblank,
 	.disable_vblank = r128_disable_vblank,
@@ -96,12 +95,12 @@ static int __init r128_init(void)
 {
 	driver.num_ioctls = r128_max_ioctl;
 
-	return drm_pci_init(&driver, &r128_pci_driver);
+	return drm_legacy_pci_init(&driver, &r128_pci_driver);
 }
 
 static void __exit r128_exit(void)
 {
-	drm_pci_exit(&driver, &r128_pci_driver);
+	drm_legacy_pci_exit(&driver, &r128_pci_driver);
 }
 
 module_init(r128_init);
diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index 3c492a0..02baaaf 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -2217,7 +2217,6 @@ static const struct drm_crtc_helper_funcs atombios_helper_funcs = {
 	.mode_set_base_atomic = atombios_crtc_set_base_atomic,
 	.prepare = atombios_crtc_prepare,
 	.commit = atombios_crtc_commit,
-	.load_lut = radeon_crtc_load_lut,
 	.disable = atombios_crtc_disable,
 };
 
diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 5008f3d..ec63bc5 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -464,7 +464,7 @@ struct radeon_bo_list {
 	struct radeon_bo		*robj;
 	struct ttm_validate_buffer	tv;
 	uint64_t			gpu_offset;
-	unsigned			prefered_domains;
+	unsigned			preferred_domains;
 	unsigned			allowed_domains;
 	uint32_t			tiling_flags;
 };
@@ -2327,7 +2327,7 @@ struct radeon_device {
 	uint8_t				*bios;
 	bool				is_atom_bios;
 	uint16_t			bios_header_start;
-	struct radeon_bo		*stollen_vga_memory;
+	struct radeon_bo		*stolen_vga_memory;
 	/* Register mmio */
 	resource_size_t			rmmio_base;
 	resource_size_t			rmmio_size;
diff --git a/drivers/gpu/drm/radeon/radeon_acpi.c b/drivers/gpu/drm/radeon/radeon_acpi.c
index 6efbd65..8d3251a 100644
--- a/drivers/gpu/drm/radeon/radeon_acpi.c
+++ b/drivers/gpu/drm/radeon/radeon_acpi.c
@@ -351,7 +351,7 @@ static int radeon_atif_get_sbios_requests(acpi_handle handle,
  * handles it.
  * Returns NOTIFY code
  */
-int radeon_atif_handler(struct radeon_device *rdev,
+static int radeon_atif_handler(struct radeon_device *rdev,
 		struct acpi_bus_event *event)
 {
 	struct radeon_atif *atif = &rdev->atif;
diff --git a/drivers/gpu/drm/radeon/radeon_acpi.h b/drivers/gpu/drm/radeon/radeon_acpi.h
index 7af1977..35202a4 100644
--- a/drivers/gpu/drm/radeon/radeon_acpi.h
+++ b/drivers/gpu/drm/radeon/radeon_acpi.h
@@ -27,9 +27,6 @@
 struct radeon_device;
 struct acpi_bus_event;
 
-int radeon_atif_handler(struct radeon_device *rdev,
-		struct acpi_bus_event *event);
-
 /* AMD hw uses four ACPI control methods:
  * 1. ATIF
  * ARG0: (ACPI_INTEGER) function code
diff --git a/drivers/gpu/drm/radeon/radeon_audio.c b/drivers/gpu/drm/radeon/radeon_audio.c
index aaacac1..770e31f 100644
--- a/drivers/gpu/drm/radeon/radeon_audio.c
+++ b/drivers/gpu/drm/radeon/radeon_audio.c
@@ -516,7 +516,7 @@ static int radeon_audio_set_avi_packet(struct drm_encoder *encoder,
 	if (!connector)
 		return -EINVAL;
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (err < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %d\n", err);
 		return err;
diff --git a/drivers/gpu/drm/radeon/radeon_connectors.c b/drivers/gpu/drm/radeon/radeon_connectors.c
index 27affbd..2f642cb 100644
--- a/drivers/gpu/drm/radeon/radeon_connectors.c
+++ b/drivers/gpu/drm/radeon/radeon_connectors.c
@@ -773,12 +773,15 @@ static int radeon_connector_set_property(struct drm_connector *connector, struct
 
 		if (connector->encoder->crtc) {
 			struct drm_crtc *crtc  = connector->encoder->crtc;
-			const struct drm_crtc_helper_funcs *crtc_funcs = crtc->helper_private;
 			struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
 
 			radeon_crtc->output_csc = radeon_encoder->output_csc;
 
-			(*crtc_funcs->load_lut)(crtc);
+			/*
+			 * Our .gamma_set assumes the .gamma_store has been
+			 * prefilled and doesn't care about its arguments.
+			 */
+			crtc->funcs->gamma_set(crtc, NULL, NULL, NULL, 0, NULL);
 		}
 	}
 
diff --git a/drivers/gpu/drm/radeon/radeon_cs.c b/drivers/gpu/drm/radeon/radeon_cs.c
index 00b22af..1ae31dbc 100644
--- a/drivers/gpu/drm/radeon/radeon_cs.c
+++ b/drivers/gpu/drm/radeon/radeon_cs.c
@@ -130,7 +130,7 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 		     p->rdev->family == CHIP_RS880)) {
 
 			/* TODO: is this still needed for NI+ ? */
-			p->relocs[i].prefered_domains =
+			p->relocs[i].preferred_domains =
 				RADEON_GEM_DOMAIN_VRAM;
 
 			p->relocs[i].allowed_domains =
@@ -148,14 +148,14 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 				return -EINVAL;
 			}
 
-			p->relocs[i].prefered_domains = domain;
+			p->relocs[i].preferred_domains = domain;
 			if (domain == RADEON_GEM_DOMAIN_VRAM)
 				domain |= RADEON_GEM_DOMAIN_GTT;
 			p->relocs[i].allowed_domains = domain;
 		}
 
 		if (radeon_ttm_tt_has_userptr(p->relocs[i].robj->tbo.ttm)) {
-			uint32_t domain = p->relocs[i].prefered_domains;
+			uint32_t domain = p->relocs[i].preferred_domains;
 			if (!(domain & RADEON_GEM_DOMAIN_GTT)) {
 				DRM_ERROR("Only RADEON_GEM_DOMAIN_GTT is "
 					  "allowed for userptr BOs\n");
@@ -163,7 +163,7 @@ static int radeon_cs_parser_relocs(struct radeon_cs_parser *p)
 			}
 			need_mmap_lock = true;
 			domain = RADEON_GEM_DOMAIN_GTT;
-			p->relocs[i].prefered_domains = domain;
+			p->relocs[i].preferred_domains = domain;
 			p->relocs[i].allowed_domains = domain;
 		}
 
@@ -437,7 +437,7 @@ static void radeon_cs_parser_fini(struct radeon_cs_parser *parser, int error, bo
 			if (bo == NULL)
 				continue;
 
-			drm_gem_object_unreference_unlocked(&bo->gem_base);
+			drm_gem_object_put_unlocked(&bo->gem_base);
 		}
 	}
 	kfree(parser->track);
diff --git a/drivers/gpu/drm/radeon/radeon_cursor.c b/drivers/gpu/drm/radeon/radeon_cursor.c
index 4a4f953..9195227 100644
--- a/drivers/gpu/drm/radeon/radeon_cursor.c
+++ b/drivers/gpu/drm/radeon/radeon_cursor.c
@@ -307,7 +307,7 @@ int radeon_crtc_cursor_set2(struct drm_crtc *crtc,
 	robj = gem_to_radeon_bo(obj);
 	ret = radeon_bo_reserve(robj, false);
 	if (ret != 0) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 	/* Only 27 bit offset for legacy cursor */
@@ -317,7 +317,7 @@ int radeon_crtc_cursor_set2(struct drm_crtc *crtc,
 	radeon_bo_unreserve(robj);
 	if (ret) {
 		DRM_ERROR("Failed to pin new cursor BO (%d)\n", ret);
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ret;
 	}
 
@@ -352,7 +352,7 @@ int radeon_crtc_cursor_set2(struct drm_crtc *crtc,
 			radeon_bo_unpin(robj);
 			radeon_bo_unreserve(robj);
 		}
-		drm_gem_object_unreference_unlocked(radeon_crtc->cursor_bo);
+		drm_gem_object_put_unlocked(radeon_crtc->cursor_bo);
 	}
 
 	radeon_crtc->cursor_bo = obj;
diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 17d3daf..ddfe91ef 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -42,6 +42,7 @@ static void avivo_crtc_load_lut(struct drm_crtc *crtc)
 	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct radeon_device *rdev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 
 	DRM_DEBUG_KMS("%d\n", radeon_crtc->crtc_id);
@@ -60,11 +61,14 @@ static void avivo_crtc_load_lut(struct drm_crtc *crtc)
 	WREG32(AVIVO_DC_LUT_WRITE_EN_MASK, 0x0000003f);
 
 	WREG8(AVIVO_DC_LUT_RW_INDEX, 0);
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 	for (i = 0; i < 256; i++) {
 		WREG32(AVIVO_DC_LUT_30_COLOR,
-			     (radeon_crtc->lut_r[i] << 20) |
-			     (radeon_crtc->lut_g[i] << 10) |
-			     (radeon_crtc->lut_b[i] << 0));
+		       ((*r++ & 0xffc0) << 14) |
+		       ((*g++ & 0xffc0) << 4) |
+		       (*b++ >> 6));
 	}
 
 	/* Only change bit 0 of LUT_SEL, other bits are set elsewhere */
@@ -76,6 +80,7 @@ static void dce4_crtc_load_lut(struct drm_crtc *crtc)
 	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct radeon_device *rdev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 
 	DRM_DEBUG_KMS("%d\n", radeon_crtc->crtc_id);
@@ -93,11 +98,14 @@ static void dce4_crtc_load_lut(struct drm_crtc *crtc)
 	WREG32(EVERGREEN_DC_LUT_WRITE_EN_MASK + radeon_crtc->crtc_offset, 0x00000007);
 
 	WREG32(EVERGREEN_DC_LUT_RW_INDEX + radeon_crtc->crtc_offset, 0);
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 	for (i = 0; i < 256; i++) {
 		WREG32(EVERGREEN_DC_LUT_30_COLOR + radeon_crtc->crtc_offset,
-		       (radeon_crtc->lut_r[i] << 20) |
-		       (radeon_crtc->lut_g[i] << 10) |
-		       (radeon_crtc->lut_b[i] << 0));
+		       ((*r++ & 0xffc0) << 14) |
+		       ((*g++ & 0xffc0) << 4) |
+		       (*b++ >> 6));
 	}
 }
 
@@ -106,6 +114,7 @@ static void dce5_crtc_load_lut(struct drm_crtc *crtc)
 	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct radeon_device *rdev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 
 	DRM_DEBUG_KMS("%d\n", radeon_crtc->crtc_id);
@@ -135,11 +144,14 @@ static void dce5_crtc_load_lut(struct drm_crtc *crtc)
 	WREG32(EVERGREEN_DC_LUT_WRITE_EN_MASK + radeon_crtc->crtc_offset, 0x00000007);
 
 	WREG32(EVERGREEN_DC_LUT_RW_INDEX + radeon_crtc->crtc_offset, 0);
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 	for (i = 0; i < 256; i++) {
 		WREG32(EVERGREEN_DC_LUT_30_COLOR + radeon_crtc->crtc_offset,
-		       (radeon_crtc->lut_r[i] << 20) |
-		       (radeon_crtc->lut_g[i] << 10) |
-		       (radeon_crtc->lut_b[i] << 0));
+		       ((*r++ & 0xffc0) << 14) |
+		       ((*g++ & 0xffc0) << 4) |
+		       (*b++ >> 6));
 	}
 
 	WREG32(NI_DEGAMMA_CONTROL + radeon_crtc->crtc_offset,
@@ -172,6 +184,7 @@ static void legacy_crtc_load_lut(struct drm_crtc *crtc)
 	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
 	struct drm_device *dev = crtc->dev;
 	struct radeon_device *rdev = dev->dev_private;
+	u16 *r, *g, *b;
 	int i;
 	uint32_t dac2_cntl;
 
@@ -183,11 +196,14 @@ static void legacy_crtc_load_lut(struct drm_crtc *crtc)
 	WREG32(RADEON_DAC_CNTL2, dac2_cntl);
 
 	WREG8(RADEON_PALETTE_INDEX, 0);
+	r = crtc->gamma_store;
+	g = r + crtc->gamma_size;
+	b = g + crtc->gamma_size;
 	for (i = 0; i < 256; i++) {
 		WREG32(RADEON_PALETTE_30_DATA,
-			     (radeon_crtc->lut_r[i] << 20) |
-			     (radeon_crtc->lut_g[i] << 10) |
-			     (radeon_crtc->lut_b[i] << 0));
+		       ((*r++ & 0xffc0) << 14) |
+		       ((*g++ & 0xffc0) << 4) |
+		       (*b++ >> 6));
 	}
 }
 
@@ -209,41 +225,10 @@ void radeon_crtc_load_lut(struct drm_crtc *crtc)
 		legacy_crtc_load_lut(crtc);
 }
 
-/** Sets the color ramps on behalf of fbcon */
-void radeon_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-			      u16 blue, int regno)
-{
-	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
-
-	radeon_crtc->lut_r[regno] = red >> 6;
-	radeon_crtc->lut_g[regno] = green >> 6;
-	radeon_crtc->lut_b[regno] = blue >> 6;
-}
-
-/** Gets the color ramps on behalf of fbcon */
-void radeon_crtc_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-			      u16 *blue, int regno)
-{
-	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
-
-	*red = radeon_crtc->lut_r[regno] << 6;
-	*green = radeon_crtc->lut_g[regno] << 6;
-	*blue = radeon_crtc->lut_b[regno] << 6;
-}
-
 static int radeon_crtc_gamma_set(struct drm_crtc *crtc, u16 *red, u16 *green,
 				 u16 *blue, uint32_t size,
 				 struct drm_modeset_acquire_ctx *ctx)
 {
-	struct radeon_crtc *radeon_crtc = to_radeon_crtc(crtc);
-	int i;
-
-	/* userspace palettes are always correct as is */
-	for (i = 0; i < size; i++) {
-		radeon_crtc->lut_r[i] = red[i] >> 6;
-		radeon_crtc->lut_g[i] = green[i] >> 6;
-		radeon_crtc->lut_b[i] = blue[i] >> 6;
-	}
 	radeon_crtc_load_lut(crtc);
 
 	return 0;
@@ -282,7 +267,7 @@ static void radeon_unpin_work_func(struct work_struct *__work)
 	} else
 		DRM_ERROR("failed to reserve buffer after flip\n");
 
-	drm_gem_object_unreference_unlocked(&work->old_rbo->gem_base);
+	drm_gem_object_put_unlocked(&work->old_rbo->gem_base);
 	kfree(work);
 }
 
@@ -519,7 +504,7 @@ static int radeon_crtc_page_flip_target(struct drm_crtc *crtc,
 	obj = old_radeon_fb->obj;
 
 	/* take a reference to the old object */
-	drm_gem_object_reference(obj);
+	drm_gem_object_get(obj);
 	work->old_rbo = gem_to_radeon_bo(obj);
 
 	new_radeon_fb = to_radeon_framebuffer(fb);
@@ -618,7 +603,7 @@ static int radeon_crtc_page_flip_target(struct drm_crtc *crtc,
 	radeon_bo_unreserve(new_rbo);
 
 cleanup:
-	drm_gem_object_unreference_unlocked(&work->old_rbo->gem_base);
+	drm_gem_object_put_unlocked(&work->old_rbo->gem_base);
 	dma_fence_put(work->fence);
 	kfree(work);
 	return r;
@@ -1303,7 +1288,7 @@ static void radeon_user_framebuffer_destroy(struct drm_framebuffer *fb)
 {
 	struct radeon_framebuffer *radeon_fb = to_radeon_framebuffer(fb);
 
-	drm_gem_object_unreference_unlocked(radeon_fb->obj);
+	drm_gem_object_put_unlocked(radeon_fb->obj);
 	drm_framebuffer_cleanup(fb);
 	kfree(radeon_fb);
 }
@@ -1363,14 +1348,14 @@ radeon_user_framebuffer_create(struct drm_device *dev,
 
 	radeon_fb = kzalloc(sizeof(*radeon_fb), GFP_KERNEL);
 	if (radeon_fb == NULL) {
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR(-ENOMEM);
 	}
 
 	ret = radeon_framebuffer_init(dev, radeon_fb, mode_cmd, obj);
 	if (ret) {
 		kfree(radeon_fb);
-		drm_gem_object_unreference_unlocked(obj);
+		drm_gem_object_put_unlocked(obj);
 		return ERR_PTR(ret);
 	}
 
@@ -1388,12 +1373,12 @@ static const struct drm_mode_config_funcs radeon_mode_funcs = {
 	.output_poll_changed = radeon_output_poll_changed
 };
 
-static struct drm_prop_enum_list radeon_tmds_pll_enum_list[] =
+static const struct drm_prop_enum_list radeon_tmds_pll_enum_list[] =
 {	{ 0, "driver" },
 	{ 1, "bios" },
 };
 
-static struct drm_prop_enum_list radeon_tv_std_enum_list[] =
+static const struct drm_prop_enum_list radeon_tv_std_enum_list[] =
 {	{ TV_STD_NTSC, "ntsc" },
 	{ TV_STD_PAL, "pal" },
 	{ TV_STD_PAL_M, "pal-m" },
@@ -1404,25 +1389,25 @@ static struct drm_prop_enum_list radeon_tv_std_enum_list[] =
 	{ TV_STD_SECAM, "secam" },
 };
 
-static struct drm_prop_enum_list radeon_underscan_enum_list[] =
+static const struct drm_prop_enum_list radeon_underscan_enum_list[] =
 {	{ UNDERSCAN_OFF, "off" },
 	{ UNDERSCAN_ON, "on" },
 	{ UNDERSCAN_AUTO, "auto" },
 };
 
-static struct drm_prop_enum_list radeon_audio_enum_list[] =
+static const struct drm_prop_enum_list radeon_audio_enum_list[] =
 {	{ RADEON_AUDIO_DISABLE, "off" },
 	{ RADEON_AUDIO_ENABLE, "on" },
 	{ RADEON_AUDIO_AUTO, "auto" },
 };
 
 /* XXX support different dither options? spatial, temporal, both, etc. */
-static struct drm_prop_enum_list radeon_dither_enum_list[] =
+static const struct drm_prop_enum_list radeon_dither_enum_list[] =
 {	{ RADEON_FMT_DITHER_DISABLE, "off" },
 	{ RADEON_FMT_DITHER_ENABLE, "on" },
 };
 
-static struct drm_prop_enum_list radeon_output_csc_enum_list[] =
+static const struct drm_prop_enum_list radeon_output_csc_enum_list[] =
 {	{ RADEON_OUTPUT_CSC_BYPASS, "bypass" },
 	{ RADEON_OUTPUT_CSC_TVRGB, "tvrgb" },
 	{ RADEON_OUTPUT_CSC_YCBCR601, "ycbcr601" },
diff --git a/drivers/gpu/drm/radeon/radeon_dp_mst.c b/drivers/gpu/drm/radeon/radeon_dp_mst.c
index 6598306..ebdf1b8 100644
--- a/drivers/gpu/drm/radeon/radeon_dp_mst.c
+++ b/drivers/gpu/drm/radeon/radeon_dp_mst.c
@@ -300,9 +300,7 @@ static void radeon_dp_register_mst_connector(struct drm_connector *connector)
 	struct drm_device *dev = connector->dev;
 	struct radeon_device *rdev = dev->dev_private;
 
-	drm_modeset_lock_all(dev);
 	radeon_fb_add_connector(rdev, connector);
-	drm_modeset_unlock_all(dev);
 
 	drm_connector_register(connector);
 }
@@ -315,13 +313,8 @@ static void radeon_dp_destroy_mst_connector(struct drm_dp_mst_topology_mgr *mgr,
 	struct radeon_device *rdev = dev->dev_private;
 
 	drm_connector_unregister(connector);
-	/* need to nuke the connector */
-	drm_modeset_lock_all(dev);
-	/* dpms off */
 	radeon_fb_remove_connector(rdev, connector);
-
 	drm_connector_cleanup(connector);
-	drm_modeset_unlock_all(dev);
 
 	kfree(connector);
 	DRM_DEBUG_KMS("\n");
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index 74abd16..f4becad 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -567,7 +567,6 @@ static struct drm_driver kms_driver = {
 	.open = radeon_driver_open_kms,
 	.postclose = radeon_driver_postclose_kms,
 	.lastclose = radeon_driver_lastclose_kms,
-	.set_busid = drm_pci_set_busid,
 	.unload = radeon_driver_unload_kms,
 	.get_vblank_counter = radeon_get_vblank_counter_kms,
 	.enable_vblank = radeon_enable_vblank_kms,
@@ -584,7 +583,6 @@ static struct drm_driver kms_driver = {
 	.gem_close_object = radeon_gem_object_close,
 	.dumb_create = radeon_mode_dumb_create,
 	.dumb_map_offset = radeon_mode_dumb_mmap,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.fops = &radeon_driver_kms_fops,
 
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
@@ -642,14 +640,13 @@ static int __init radeon_init(void)
 		return -EINVAL;
 	}
 
-	/* let modprobe override vga console setting */
-	return drm_pci_init(driver, pdriver);
+	return pci_register_driver(pdriver);
 }
 
 static void __exit radeon_exit(void)
 {
 	radeon_kfd_fini();
-	drm_pci_exit(driver, pdriver);
+	pci_unregister_driver(pdriver);
 	radeon_unregister_atpx_handler();
 }
 
diff --git a/drivers/gpu/drm/radeon/radeon_fb.c b/drivers/gpu/drm/radeon/radeon_fb.c
index 356ad90..fd25361 100644
--- a/drivers/gpu/drm/radeon/radeon_fb.c
+++ b/drivers/gpu/drm/radeon/radeon_fb.c
@@ -118,7 +118,7 @@ static void radeonfb_destroy_pinned_object(struct drm_gem_object *gobj)
 		radeon_bo_unpin(rbo);
 		radeon_bo_unreserve(rbo);
 	}
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 }
 
 static int radeonfb_create_pinned_object(struct radeon_fbdev *rfbdev,
@@ -264,7 +264,6 @@ static int radeonfb_create(struct drm_fb_helper *helper,
 
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
 
-	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &radeonfb_ops;
 
 	tmp = radeon_bo_gpu_offset(rbo) - rdev->mc.vram_start;
@@ -300,7 +299,7 @@ static int radeonfb_create(struct drm_fb_helper *helper,
 
 	}
 	if (fb && ret) {
-		drm_gem_object_unreference_unlocked(gobj);
+		drm_gem_object_put_unlocked(gobj);
 		drm_framebuffer_unregister_private(fb);
 		drm_framebuffer_cleanup(fb);
 		kfree(fb);
@@ -332,8 +331,6 @@ static int radeon_fbdev_destroy(struct drm_device *dev, struct radeon_fbdev *rfb
 }
 
 static const struct drm_fb_helper_funcs radeon_fb_helper_funcs = {
-	.gamma_set = radeon_crtc_fb_gamma_set,
-	.gamma_get = radeon_crtc_fb_gamma_get,
 	.fb_probe = radeonfb_create,
 };
 
@@ -347,9 +344,12 @@ int radeon_fbdev_init(struct radeon_device *rdev)
 	if (list_empty(&rdev->ddev->mode_config.connector_list))
 		return 0;
 
-	/* select 8 bpp console on RN50 or 16MB cards */
-	if (ASIC_IS_RN50(rdev) || rdev->mc.real_vram_size <= (32*1024*1024))
+	/* select 8 bpp console on 8MB cards, or 16 bpp on RN50 or 32MB */
+	if (rdev->mc.real_vram_size <= (8*1024*1024))
 		bpp_sel = 8;
+	else if (ASIC_IS_RN50(rdev) ||
+		 rdev->mc.real_vram_size <= (32*1024*1024))
+		bpp_sel = 16;
 
 	rfbdev = kzalloc(sizeof(struct radeon_fbdev), GFP_KERNEL);
 	if (!rfbdev)
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 574bf7e..3386452 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -271,7 +271,7 @@ int radeon_gem_create_ioctl(struct drm_device *dev, void *data,
 	}
 	r = drm_gem_handle_create(filp, gobj, &handle);
 	/* drop reference from allocate - handle holds it now */
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (r) {
 		up_read(&rdev->exclusive_lock);
 		r = radeon_gem_handle_lockup(rdev, r);
@@ -352,7 +352,7 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 
 	r = drm_gem_handle_create(filp, gobj, &handle);
 	/* drop reference from allocate - handle holds it now */
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (r)
 		goto handle_lockup;
 
@@ -361,7 +361,7 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 	return 0;
 
 release_object:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 
 handle_lockup:
 	up_read(&rdev->exclusive_lock);
@@ -395,7 +395,7 @@ int radeon_gem_set_domain_ioctl(struct drm_device *dev, void *data,
 
 	r = radeon_gem_set_domain(gobj, args->read_domains, args->write_domain);
 
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	up_read(&rdev->exclusive_lock);
 	r = radeon_gem_handle_lockup(robj->rdev, r);
 	return r;
@@ -414,11 +414,11 @@ int radeon_mode_dumb_mmap(struct drm_file *filp,
 	}
 	robj = gem_to_radeon_bo(gobj);
 	if (radeon_ttm_tt_has_userptr(robj->tbo.ttm)) {
-		drm_gem_object_unreference_unlocked(gobj);
+		drm_gem_object_put_unlocked(gobj);
 		return -EPERM;
 	}
 	*offset_p = radeon_bo_mmap_offset(robj);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return 0;
 }
 
@@ -453,7 +453,7 @@ int radeon_gem_busy_ioctl(struct drm_device *dev, void *data,
 
 	cur_placement = ACCESS_ONCE(robj->tbo.mem.mem_type);
 	args->domain = radeon_mem_type_to_domain(cur_placement);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -485,7 +485,7 @@ int radeon_gem_wait_idle_ioctl(struct drm_device *dev, void *data,
 	if (rdev->asic->mmio_hdp_flush &&
 	    radeon_mem_type_to_domain(cur_placement) == RADEON_GEM_DOMAIN_VRAM)
 		robj->rdev->asic->mmio_hdp_flush(rdev);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	r = radeon_gem_handle_lockup(rdev, r);
 	return r;
 }
@@ -504,7 +504,7 @@ int radeon_gem_set_tiling_ioctl(struct drm_device *dev, void *data,
 		return -ENOENT;
 	robj = gem_to_radeon_bo(gobj);
 	r = radeon_bo_set_tiling_flags(robj, args->tiling_flags, args->pitch);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -527,7 +527,7 @@ int radeon_gem_get_tiling_ioctl(struct drm_device *dev, void *data,
 	radeon_bo_get_tiling_flags(rbo, &args->tiling_flags, &args->pitch);
 	radeon_bo_unreserve(rbo);
 out:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -661,14 +661,14 @@ int radeon_gem_va_ioctl(struct drm_device *dev, void *data,
 	r = radeon_bo_reserve(rbo, false);
 	if (r) {
 		args->operation = RADEON_VA_RESULT_ERROR;
-		drm_gem_object_unreference_unlocked(gobj);
+		drm_gem_object_put_unlocked(gobj);
 		return r;
 	}
 	bo_va = radeon_vm_bo_find(&fpriv->vm, rbo);
 	if (!bo_va) {
 		args->operation = RADEON_VA_RESULT_ERROR;
 		radeon_bo_unreserve(rbo);
-		drm_gem_object_unreference_unlocked(gobj);
+		drm_gem_object_put_unlocked(gobj);
 		return -ENOENT;
 	}
 
@@ -695,7 +695,7 @@ int radeon_gem_va_ioctl(struct drm_device *dev, void *data,
 		args->operation = RADEON_VA_RESULT_ERROR;
 	}
 out:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -736,7 +736,7 @@ int radeon_gem_op_ioctl(struct drm_device *dev, void *data,
 
 	radeon_bo_unreserve(robj);
 out:
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	return r;
 }
 
@@ -762,7 +762,7 @@ int radeon_mode_dumb_create(struct drm_file *file_priv,
 
 	r = drm_gem_handle_create(file_priv, gobj, &handle);
 	/* drop reference from allocate - handle holds it now */
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (r) {
 		return r;
 	}
diff --git a/drivers/gpu/drm/radeon/radeon_irq_kms.c b/drivers/gpu/drm/radeon/radeon_irq_kms.c
index 7aacb44..afaf10d 100644
--- a/drivers/gpu/drm/radeon/radeon_irq_kms.c
+++ b/drivers/gpu/drm/radeon/radeon_irq_kms.c
@@ -283,6 +283,10 @@ int radeon_irq_kms_init(struct radeon_device *rdev)
 	int r = 0;
 
 	spin_lock_init(&rdev->irq.lock);
+
+	/* Disable vblank irqs aggressively for power-saving */
+	rdev->ddev->vblank_disable_immediate = true;
+
 	r = drm_vblank_init(rdev->ddev, rdev->num_crtc);
 	if (r) {
 		return r;
@@ -324,7 +328,6 @@ int radeon_irq_kms_init(struct radeon_device *rdev)
  */
 void radeon_irq_kms_fini(struct radeon_device *rdev)
 {
-	drm_vblank_cleanup(rdev->ddev);
 	if (rdev->irq.installed) {
 		drm_irq_uninstall(rdev->ddev);
 		rdev->irq.installed = false;
diff --git a/drivers/gpu/drm/radeon/radeon_kfd.c b/drivers/gpu/drm/radeon/radeon_kfd.c
index a2ab6dc..f6578c9 100644
--- a/drivers/gpu/drm/radeon/radeon_kfd.c
+++ b/drivers/gpu/drm/radeon/radeon_kfd.c
@@ -75,12 +75,14 @@ static int kgd_init_pipeline(struct kgd_dev *kgd, uint32_t pipe_id,
 				uint32_t hpd_size, uint64_t hpd_gpu_addr);
 static int kgd_init_interrupts(struct kgd_dev *kgd, uint32_t pipe_id);
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr);
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm);
 static int kgd_hqd_sdma_load(struct kgd_dev *kgd, void *mqd);
 static bool kgd_hqd_is_occupied(struct kgd_dev *kgd, uint64_t queue_address,
 				uint32_t pipe_id, uint32_t queue_id);
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
 				unsigned int timeout, uint32_t pipe_id,
 				uint32_t queue_id);
 static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd);
@@ -482,7 +484,9 @@ static inline struct cik_sdma_rlc_registers *get_sdma_mqd(void *mqd)
 }
 
 static int kgd_hqd_load(struct kgd_dev *kgd, void *mqd, uint32_t pipe_id,
-			uint32_t queue_id, uint32_t __user *wptr)
+			uint32_t queue_id, uint32_t __user *wptr,
+			uint32_t wptr_shift, uint32_t wptr_mask,
+			struct mm_struct *mm)
 {
 	uint32_t wptr_shadow, is_wptr_shadow_valid;
 	struct cik_mqd *m;
@@ -636,7 +640,7 @@ static bool kgd_hqd_sdma_is_occupied(struct kgd_dev *kgd, void *mqd)
 	return false;
 }
 
-static int kgd_hqd_destroy(struct kgd_dev *kgd, uint32_t reset_type,
+static int kgd_hqd_destroy(struct kgd_dev *kgd, void *mqd, uint32_t reset_type,
 				unsigned int timeout, uint32_t pipe_id,
 				uint32_t queue_id)
 {
@@ -785,7 +789,8 @@ static uint32_t kgd_address_watch_get_offset(struct kgd_dev *kgd,
 					unsigned int watch_point_id,
 					unsigned int reg_offset)
 {
-	return watchRegs[watch_point_id * ADDRESS_WATCH_REG_MAX + reg_offset];
+	return watchRegs[watch_point_id * ADDRESS_WATCH_REG_MAX + reg_offset]
+		/ 4;
 }
 
 static bool get_atc_vmid_pasid_mapping_valid(struct kgd_dev *kgd, uint8_t vmid)
diff --git a/drivers/gpu/drm/radeon/radeon_legacy_crtc.c b/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
index ce6cb66..1f1856e 100644
--- a/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
+++ b/drivers/gpu/drm/radeon/radeon_legacy_crtc.c
@@ -1116,7 +1116,6 @@ static const struct drm_crtc_helper_funcs legacy_helper_funcs = {
 	.mode_set_base_atomic = radeon_crtc_set_base_atomic,
 	.prepare = radeon_crtc_prepare,
 	.commit = radeon_crtc_commit,
-	.load_lut = radeon_crtc_load_lut,
 	.disable = radeon_crtc_disable
 };
 
diff --git a/drivers/gpu/drm/radeon/radeon_mode.h b/drivers/gpu/drm/radeon/radeon_mode.h
index 00f5ec5..da44ac2 100644
--- a/drivers/gpu/drm/radeon/radeon_mode.h
+++ b/drivers/gpu/drm/radeon/radeon_mode.h
@@ -935,10 +935,6 @@ extern void
 radeon_combios_encoder_crtc_scratch_regs(struct drm_encoder *encoder, int crtc);
 extern void
 radeon_combios_encoder_dpms_scratch_regs(struct drm_encoder *encoder, bool on);
-extern void radeon_crtc_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-				     u16 blue, int regno);
-extern void radeon_crtc_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-				     u16 *blue, int regno);
 int radeon_framebuffer_init(struct drm_device *dev,
 			     struct radeon_framebuffer *rfb,
 			     const struct drm_mode_fb_cmd2 *mode_cmd,
diff --git a/drivers/gpu/drm/radeon/radeon_object.c b/drivers/gpu/drm/radeon/radeon_object.c
index 8b72229..0935949 100644
--- a/drivers/gpu/drm/radeon/radeon_object.c
+++ b/drivers/gpu/drm/radeon/radeon_object.c
@@ -445,7 +445,7 @@ void radeon_bo_force_delete(struct radeon_device *rdev)
 		list_del_init(&bo->list);
 		mutex_unlock(&bo->rdev->gem.mutex);
 		/* this should unref the ttm bo */
-		drm_gem_object_unreference_unlocked(&bo->gem_base);
+		drm_gem_object_put_unlocked(&bo->gem_base);
 	}
 }
 
@@ -546,7 +546,7 @@ int radeon_bo_list_validate(struct radeon_device *rdev,
 	list_for_each_entry(lobj, head, tv.head) {
 		struct radeon_bo *bo = lobj->robj;
 		if (!bo->pin_count) {
-			u32 domain = lobj->prefered_domains;
+			u32 domain = lobj->preferred_domains;
 			u32 allowed = lobj->allowed_domains;
 			u32 current_domain =
 				radeon_mem_type_to_domain(bo->tbo.mem.mem_type);
diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c b/drivers/gpu/drm/radeon/radeon_ttm.c
index faa0213..bf69bf9 100644
--- a/drivers/gpu/drm/radeon/radeon_ttm.c
+++ b/drivers/gpu/drm/radeon/radeon_ttm.c
@@ -178,7 +178,7 @@ static int radeon_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
 static void radeon_evict_flags(struct ttm_buffer_object *bo,
 				struct ttm_placement *placement)
 {
-	static struct ttm_place placements = {
+	static const struct ttm_place placements = {
 		.fpfn = 0,
 		.lpfn = 0,
 		.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM
@@ -907,17 +907,17 @@ int radeon_ttm_init(struct radeon_device *rdev)
 
 	r = radeon_bo_create(rdev, 256 * 1024, PAGE_SIZE, true,
 			     RADEON_GEM_DOMAIN_VRAM, 0, NULL,
-			     NULL, &rdev->stollen_vga_memory);
+			     NULL, &rdev->stolen_vga_memory);
 	if (r) {
 		return r;
 	}
-	r = radeon_bo_reserve(rdev->stollen_vga_memory, false);
+	r = radeon_bo_reserve(rdev->stolen_vga_memory, false);
 	if (r)
 		return r;
-	r = radeon_bo_pin(rdev->stollen_vga_memory, RADEON_GEM_DOMAIN_VRAM, NULL);
-	radeon_bo_unreserve(rdev->stollen_vga_memory);
+	r = radeon_bo_pin(rdev->stolen_vga_memory, RADEON_GEM_DOMAIN_VRAM, NULL);
+	radeon_bo_unreserve(rdev->stolen_vga_memory);
 	if (r) {
-		radeon_bo_unref(&rdev->stollen_vga_memory);
+		radeon_bo_unref(&rdev->stolen_vga_memory);
 		return r;
 	}
 	DRM_INFO("radeon: %uM of VRAM memory ready\n",
@@ -946,13 +946,13 @@ void radeon_ttm_fini(struct radeon_device *rdev)
 	if (!rdev->mman.initialized)
 		return;
 	radeon_ttm_debugfs_fini(rdev);
-	if (rdev->stollen_vga_memory) {
-		r = radeon_bo_reserve(rdev->stollen_vga_memory, false);
+	if (rdev->stolen_vga_memory) {
+		r = radeon_bo_reserve(rdev->stolen_vga_memory, false);
 		if (r == 0) {
-			radeon_bo_unpin(rdev->stollen_vga_memory);
-			radeon_bo_unreserve(rdev->stollen_vga_memory);
+			radeon_bo_unpin(rdev->stolen_vga_memory);
+			radeon_bo_unreserve(rdev->stolen_vga_memory);
 		}
-		radeon_bo_unref(&rdev->stollen_vga_memory);
+		radeon_bo_unref(&rdev->stolen_vga_memory);
 	}
 	ttm_bo_clean_mm(&rdev->mman.bdev, TTM_PL_VRAM);
 	ttm_bo_clean_mm(&rdev->mman.bdev, TTM_PL_TT);
@@ -1030,19 +1030,17 @@ int radeon_mmap(struct file *filp, struct vm_area_struct *vma)
 static int radeon_mm_dump_table(struct seq_file *m, void *data)
 {
 	struct drm_info_node *node = (struct drm_info_node *)m->private;
-	unsigned ttm_pl = *(int *)node->info_ent->data;
+	unsigned ttm_pl = *(int*)node->info_ent->data;
 	struct drm_device *dev = node->minor->dev;
 	struct radeon_device *rdev = dev->dev_private;
-	struct drm_mm *mm = (struct drm_mm *)rdev->mman.bdev.man[ttm_pl].priv;
-	struct ttm_bo_global *glob = rdev->mman.bdev.glob;
+	struct ttm_mem_type_manager *man = &rdev->mman.bdev.man[ttm_pl];
 	struct drm_printer p = drm_seq_file_printer(m);
 
-	spin_lock(&glob->lru_lock);
-	drm_mm_print(mm, &p);
-	spin_unlock(&glob->lru_lock);
+	man->func->debug(man, &p);
 	return 0;
 }
 
+
 static int ttm_pl_vram = TTM_PL_VRAM;
 static int ttm_pl_tt = TTM_PL_TT;
 
diff --git a/drivers/gpu/drm/radeon/radeon_vm.c b/drivers/gpu/drm/radeon/radeon_vm.c
index 5f68245..5e82b40 100644
--- a/drivers/gpu/drm/radeon/radeon_vm.c
+++ b/drivers/gpu/drm/radeon/radeon_vm.c
@@ -139,7 +139,7 @@ struct radeon_bo_list *radeon_vm_get_bos(struct radeon_device *rdev,
 
 	/* add the vm page table to the list */
 	list[0].robj = vm->page_directory;
-	list[0].prefered_domains = RADEON_GEM_DOMAIN_VRAM;
+	list[0].preferred_domains = RADEON_GEM_DOMAIN_VRAM;
 	list[0].allowed_domains = RADEON_GEM_DOMAIN_VRAM;
 	list[0].tv.bo = &vm->page_directory->tbo;
 	list[0].tv.shared = true;
@@ -151,7 +151,7 @@ struct radeon_bo_list *radeon_vm_get_bos(struct radeon_device *rdev,
 			continue;
 
 		list[idx].robj = vm->page_tables[i].bo;
-		list[idx].prefered_domains = RADEON_GEM_DOMAIN_VRAM;
+		list[idx].preferred_domains = RADEON_GEM_DOMAIN_VRAM;
 		list[idx].allowed_domains = RADEON_GEM_DOMAIN_VRAM;
 		list[idx].tv.bo = &list[idx].robj->tbo;
 		list[idx].tv.shared = true;
diff --git a/drivers/gpu/drm/radeon/vce_v2_0.c b/drivers/gpu/drm/radeon/vce_v2_0.c
index fce2144..b0a43b6 100644
--- a/drivers/gpu/drm/radeon/vce_v2_0.c
+++ b/drivers/gpu/drm/radeon/vce_v2_0.c
@@ -104,6 +104,10 @@ static void vce_v2_0_disable_cg(struct radeon_device *rdev)
 	WREG32(VCE_CGTT_CLK_OVERRIDE, 7);
 }
 
+/*
+ * Local variable sw_cg is used for debugging purposes, in case we
+ * ran into problems with dynamic clock gating. Don't remove it.
+ */
 void vce_v2_0_enable_mgcg(struct radeon_device *rdev, bool enable)
 {
 	bool sw_cg = false;
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
index 345eff7..301ea1a 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.c
@@ -13,6 +13,7 @@
 
 #include <linux/clk.h>
 #include <linux/mutex.h>
+#include <linux/sys_soc.h>
 
 #include <drm/drmP.h>
 #include <drm/drm_atomic.h>
@@ -129,10 +130,8 @@ static void rcar_du_dpll_divider(struct rcar_du_crtc *rcrtc,
 			for (fdpll = 1; fdpll < 32; fdpll++) {
 				unsigned long output;
 
-				/* 1/2 (FRQSEL=1) for duty rate 50% */
 				output = input * (n + 1) / (m + 1)
-				       / (fdpll + 1) / 2;
-
+				       / (fdpll + 1);
 				if (output >= 400000000)
 					continue;
 
@@ -158,6 +157,11 @@ static void rcar_du_dpll_divider(struct rcar_du_crtc *rcrtc,
 		 best_diff);
 }
 
+static const struct soc_device_attribute rcar_du_r8a7795_es1[] = {
+	{ .soc_id = "r8a7795", .revision = "ES1.*" },
+	{ /* sentinel */ }
+};
+
 static void rcar_du_crtc_set_display_timing(struct rcar_du_crtc *rcrtc)
 {
 	const struct drm_display_mode *mode = &rcrtc->crtc.state->adjusted_mode;
@@ -168,7 +172,8 @@ static void rcar_du_crtc_set_display_timing(struct rcar_du_crtc *rcrtc)
 	u32 escr;
 	u32 div;
 
-	/* Compute the clock divisor and select the internal or external dot
+	/*
+	 * Compute the clock divisor and select the internal or external dot
 	 * clock based on the requested frequency.
 	 */
 	clk = clk_get_rate(rcrtc->clock);
@@ -185,7 +190,20 @@ static void rcar_du_crtc_set_display_timing(struct rcar_du_crtc *rcrtc)
 
 		extclk = clk_get_rate(rcrtc->extclock);
 		if (rcdu->info->dpll_ch & (1 << rcrtc->index)) {
-			rcar_du_dpll_divider(rcrtc, &dpll, extclk, mode_clock);
+			unsigned long target = mode_clock;
+
+			/*
+			 * The H3 ES1.x exhibits dot clock duty cycle stability
+			 * issues. We can work around them by configuring the
+			 * DPLL to twice the desired frequency, coupled with a
+			 * /2 post-divider. This isn't needed on other SoCs and
+			 * breaks HDMI output on M3-W for a currently unknown
+			 * reason, so restrict the workaround to H3 ES1.x.
+			 */
+			if (soc_device_match(rcar_du_r8a7795_es1))
+				target *= 2;
+
+			rcar_du_dpll_divider(rcrtc, &dpll, extclk, target);
 			extclk = dpll.output;
 		}
 
@@ -197,8 +215,6 @@ static void rcar_du_crtc_set_display_timing(struct rcar_du_crtc *rcrtc)
 
 		if (abs((long)extrate - (long)mode_clock) <
 		    abs((long)rate - (long)mode_clock)) {
-			dev_dbg(rcrtc->group->dev->dev,
-				"crtc%u: using external clock\n", rcrtc->index);
 
 			if (rcdu->info->dpll_ch & (1 << rcrtc->index)) {
 				u32 dpllcr = DPLLCR_CODE | DPLLCR_CLKE
@@ -215,12 +231,14 @@ static void rcar_du_crtc_set_display_timing(struct rcar_du_crtc *rcrtc)
 
 				rcar_du_group_write(rcrtc->group, DPLLCR,
 						    dpllcr);
-
-				escr = ESCR_DCLKSEL_DCLKIN | 1;
-			} else {
-				escr = ESCR_DCLKSEL_DCLKIN | extdiv;
 			}
+
+			escr = ESCR_DCLKSEL_DCLKIN | extdiv;
 		}
+
+		dev_dbg(rcrtc->group->dev->dev,
+			"mode clock %lu extrate %lu rate %lu ESCR 0x%08x\n",
+			mode_clock, extrate, rate, escr);
 	}
 
 	rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? ESCR2 : ESCR,
@@ -261,12 +279,14 @@ void rcar_du_crtc_route_output(struct drm_crtc *crtc,
 	struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
 	struct rcar_du_device *rcdu = rcrtc->group->dev;
 
-	/* Store the route from the CRTC output to the DU output. The DU will be
+	/*
+	 * Store the route from the CRTC output to the DU output. The DU will be
 	 * configured when starting the CRTC.
 	 */
 	rcrtc->outputs |= BIT(output);
 
-	/* Store RGB routing to DPAD0, the hardware will be configured when
+	/*
+	 * Store RGB routing to DPAD0, the hardware will be configured when
 	 * starting the CRTC.
 	 */
 	if (output == RCAR_DU_OUTPUT_DPAD0)
@@ -342,7 +362,8 @@ static void rcar_du_crtc_update_planes(struct rcar_du_crtc *rcrtc)
 		}
 	}
 
-	/* Update the planes to display timing and dot clock generator
+	/*
+	 * Update the planes to display timing and dot clock generator
 	 * associations.
 	 *
 	 * Updating the DPTSR register requires restarting the CRTC group,
@@ -431,14 +452,8 @@ static void rcar_du_crtc_wait_page_flip(struct rcar_du_crtc *rcrtc)
  * Start/Stop and Suspend/Resume
  */
 
-static void rcar_du_crtc_start(struct rcar_du_crtc *rcrtc)
+static void rcar_du_crtc_setup(struct rcar_du_crtc *rcrtc)
 {
-	struct drm_crtc *crtc = &rcrtc->crtc;
-	bool interlaced;
-
-	if (rcrtc->started)
-		return;
-
 	/* Set display off and background to black */
 	rcar_du_crtc_write(rcrtc, DOOR, DOOR_RGB(0, 0, 0));
 	rcar_du_crtc_write(rcrtc, BPOR, BPOR_RGB(0, 0, 0));
@@ -450,7 +465,20 @@ static void rcar_du_crtc_start(struct rcar_du_crtc *rcrtc)
 	/* Start with all planes disabled. */
 	rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
 
-	/* Select master sync mode. This enables display operation in master
+	/* Enable the VSP compositor. */
+	if (rcar_du_has(rcrtc->group->dev, RCAR_DU_FEATURE_VSP1_SOURCE))
+		rcar_du_vsp_enable(rcrtc);
+
+	/* Turn vertical blanking interrupt reporting on. */
+	drm_crtc_vblank_on(&rcrtc->crtc);
+}
+
+static void rcar_du_crtc_start(struct rcar_du_crtc *rcrtc)
+{
+	bool interlaced;
+
+	/*
+	 * Select master sync mode. This enables display operation in master
 	 * sync mode (with the HSYNC and VSYNC signals configured as outputs and
 	 * actively driven).
 	 */
@@ -460,38 +488,56 @@ static void rcar_du_crtc_start(struct rcar_du_crtc *rcrtc)
 			     DSYSR_TVM_MASTER);
 
 	rcar_du_group_start_stop(rcrtc->group, true);
+}
 
-	/* Enable the VSP compositor. */
-	if (rcar_du_has(rcrtc->group->dev, RCAR_DU_FEATURE_VSP1_SOURCE))
-		rcar_du_vsp_enable(rcrtc);
+static void rcar_du_crtc_disable_planes(struct rcar_du_crtc *rcrtc)
+{
+	struct rcar_du_device *rcdu = rcrtc->group->dev;
+	struct drm_crtc *crtc = &rcrtc->crtc;
+	u32 status;
 
-	/* Turn vertical blanking interrupt reporting back on. */
-	drm_crtc_vblank_on(crtc);
+	/* Make sure vblank interrupts are enabled. */
+	drm_crtc_vblank_get(crtc);
 
-	rcrtc->started = true;
+	/*
+	 * Disable planes and calculate how many vertical blanking interrupts we
+	 * have to wait for. If a vertical blanking interrupt has been triggered
+	 * but not processed yet, we don't know whether it occurred before or
+	 * after the planes got disabled. We thus have to wait for two vblank
+	 * interrupts in that case.
+	 */
+	spin_lock_irq(&rcrtc->vblank_lock);
+	rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
+	status = rcar_du_crtc_read(rcrtc, DSSR);
+	rcrtc->vblank_count = status & DSSR_VBK ? 2 : 1;
+	spin_unlock_irq(&rcrtc->vblank_lock);
+
+	if (!wait_event_timeout(rcrtc->vblank_wait, rcrtc->vblank_count == 0,
+				msecs_to_jiffies(100)))
+		dev_warn(rcdu->dev, "vertical blanking timeout\n");
+
+	drm_crtc_vblank_put(crtc);
 }
 
 static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
 {
 	struct drm_crtc *crtc = &rcrtc->crtc;
 
-	if (!rcrtc->started)
-		return;
-
-	/* Disable all planes and wait for the change to take effect. This is
-	 * required as the DSnPR registers are updated on vblank, and no vblank
-	 * will occur once the CRTC is stopped. Disabling planes when starting
-	 * the CRTC thus wouldn't be enough as it would start scanning out
-	 * immediately from old frame buffers until the next vblank.
+	/*
+	 * Disable all planes and wait for the change to take effect. This is
+	 * required as the plane enable registers are updated on vblank, and no
+	 * vblank will occur once the CRTC is stopped. Disabling planes when
+	 * starting the CRTC thus wouldn't be enough as it would start scanning
+	 * out immediately from old frame buffers until the next vblank.
 	 *
 	 * This increases the CRTC stop delay, especially when multiple CRTCs
 	 * are stopped in one operation as we now wait for one vblank per CRTC.
 	 * Whether this can be improved needs to be researched.
 	 */
-	rcar_du_group_write(rcrtc->group, rcrtc->index % 2 ? DS2PR : DS1PR, 0);
-	drm_crtc_wait_one_vblank(crtc);
+	rcar_du_crtc_disable_planes(rcrtc);
 
-	/* Disable vertical blanking interrupt reporting. We first need to wait
+	/*
+	 * Disable vertical blanking interrupt reporting. We first need to wait
 	 * for page flip completion before stopping the CRTC as userspace
 	 * expects page flips to eventually complete.
 	 */
@@ -502,14 +548,13 @@ static void rcar_du_crtc_stop(struct rcar_du_crtc *rcrtc)
 	if (rcar_du_has(rcrtc->group->dev, RCAR_DU_FEATURE_VSP1_SOURCE))
 		rcar_du_vsp_disable(rcrtc);
 
-	/* Select switch sync mode. This stops display operation and configures
+	/*
+	 * Select switch sync mode. This stops display operation and configures
 	 * the HSYNC and VSYNC signals as inputs.
 	 */
 	rcar_du_crtc_clr_set(rcrtc, DSYSR, DSYSR_TVM_MASK, DSYSR_TVM_SWITCH);
 
 	rcar_du_group_start_stop(rcrtc->group, false);
-
-	rcrtc->started = false;
 }
 
 void rcar_du_crtc_suspend(struct rcar_du_crtc *rcrtc)
@@ -529,12 +574,10 @@ void rcar_du_crtc_resume(struct rcar_du_crtc *rcrtc)
 		return;
 
 	rcar_du_crtc_get(rcrtc);
-	rcar_du_crtc_start(rcrtc);
+	rcar_du_crtc_setup(rcrtc);
 
 	/* Commit the planes state. */
-	if (rcar_du_has(rcrtc->group->dev, RCAR_DU_FEATURE_VSP1_SOURCE)) {
-		rcar_du_vsp_enable(rcrtc);
-	} else {
+	if (!rcar_du_has(rcrtc->group->dev, RCAR_DU_FEATURE_VSP1_SOURCE)) {
 		for (i = 0; i < rcrtc->group->num_planes; ++i) {
 			struct rcar_du_plane *plane = &rcrtc->group->planes[i];
 
@@ -546,21 +589,33 @@ void rcar_du_crtc_resume(struct rcar_du_crtc *rcrtc)
 	}
 
 	rcar_du_crtc_update_planes(rcrtc);
+	rcar_du_crtc_start(rcrtc);
 }
 
 /* -----------------------------------------------------------------------------
  * CRTC Functions
  */
 
-static void rcar_du_crtc_enable(struct drm_crtc *crtc)
+static void rcar_du_crtc_atomic_enable(struct drm_crtc *crtc,
+				       struct drm_crtc_state *old_state)
 {
 	struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
 
-	rcar_du_crtc_get(rcrtc);
+	/*
+	 * If the CRTC has already been setup by the .atomic_begin() handler we
+	 * can skip the setup stage.
+	 */
+	if (!rcrtc->initialized) {
+		rcar_du_crtc_get(rcrtc);
+		rcar_du_crtc_setup(rcrtc);
+		rcrtc->initialized = true;
+	}
+
 	rcar_du_crtc_start(rcrtc);
 }
 
-static void rcar_du_crtc_disable(struct drm_crtc *crtc)
+static void rcar_du_crtc_atomic_disable(struct drm_crtc *crtc,
+					struct drm_crtc_state *old_state)
 {
 	struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
 
@@ -574,6 +629,7 @@ static void rcar_du_crtc_disable(struct drm_crtc *crtc)
 	}
 	spin_unlock_irq(&crtc->dev->event_lock);
 
+	rcrtc->initialized = false;
 	rcrtc->outputs = 0;
 }
 
@@ -582,6 +638,19 @@ static void rcar_du_crtc_atomic_begin(struct drm_crtc *crtc,
 {
 	struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
 
+	WARN_ON(!crtc->state->enable);
+
+	/*
+	 * If a mode set is in progress we can be called with the CRTC disabled.
+	 * We then need to first setup the CRTC in order to configure planes.
+	 * The .atomic_enable() handler will notice and skip the CRTC setup.
+	 */
+	if (!rcrtc->initialized) {
+		rcar_du_crtc_get(rcrtc);
+		rcar_du_crtc_setup(rcrtc);
+		rcrtc->initialized = true;
+	}
+
 	if (rcar_du_has(rcrtc->group->dev, RCAR_DU_FEATURE_VSP1_SOURCE))
 		rcar_du_vsp_atomic_begin(rcrtc);
 }
@@ -609,10 +678,10 @@ static void rcar_du_crtc_atomic_flush(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs crtc_helper_funcs = {
-	.disable = rcar_du_crtc_disable,
-	.enable = rcar_du_crtc_enable,
 	.atomic_begin = rcar_du_crtc_atomic_begin,
 	.atomic_flush = rcar_du_crtc_atomic_flush,
+	.atomic_enable = rcar_du_crtc_atomic_enable,
+	.atomic_disable = rcar_du_crtc_atomic_disable,
 };
 
 static int rcar_du_crtc_enable_vblank(struct drm_crtc *crtc)
@@ -621,6 +690,7 @@ static int rcar_du_crtc_enable_vblank(struct drm_crtc *crtc)
 
 	rcar_du_crtc_write(rcrtc, DSRCR, DSRCR_VBCL);
 	rcar_du_crtc_set(rcrtc, DIER, DIER_VBE);
+	rcrtc->vblank_enable = true;
 
 	return 0;
 }
@@ -630,6 +700,7 @@ static void rcar_du_crtc_disable_vblank(struct drm_crtc *crtc)
 	struct rcar_du_crtc *rcrtc = to_rcar_crtc(crtc);
 
 	rcar_du_crtc_clr(rcrtc, DIER, DIER_VBE);
+	rcrtc->vblank_enable = false;
 }
 
 static const struct drm_crtc_funcs crtc_funcs = {
@@ -654,14 +725,30 @@ static irqreturn_t rcar_du_crtc_irq(int irq, void *arg)
 	irqreturn_t ret = IRQ_NONE;
 	u32 status;
 
+	spin_lock(&rcrtc->vblank_lock);
+
 	status = rcar_du_crtc_read(rcrtc, DSSR);
 	rcar_du_crtc_write(rcrtc, DSRCR, status & DSRCR_MASK);
 
-	if (status & DSSR_FRM) {
-		drm_crtc_handle_vblank(&rcrtc->crtc);
+	if (status & DSSR_VBK) {
+		/*
+		 * Wake up the vblank wait if the counter reaches 0. This must
+		 * be protected by the vblank_lock to avoid races in
+		 * rcar_du_crtc_disable_planes().
+		 */
+		if (rcrtc->vblank_count) {
+			if (--rcrtc->vblank_count == 0)
+				wake_up(&rcrtc->vblank_wait);
+		}
+	}
 
-		if (rcdu->info->gen < 3)
+	spin_unlock(&rcrtc->vblank_lock);
+
+	if (status & DSSR_VBK) {
+		if (rcdu->info->gen < 3) {
+			drm_crtc_handle_vblank(&rcrtc->crtc);
 			rcar_du_crtc_finish_page_flip(rcrtc);
+		}
 
 		ret = IRQ_HANDLED;
 	}
@@ -715,13 +802,15 @@ int rcar_du_crtc_create(struct rcar_du_group *rgrp, unsigned int index)
 	}
 
 	init_waitqueue_head(&rcrtc->flip_wait);
+	init_waitqueue_head(&rcrtc->vblank_wait);
+	spin_lock_init(&rcrtc->vblank_lock);
 
 	rcrtc->group = rgrp;
 	rcrtc->mmio_offset = mmio_offsets[index];
 	rcrtc->index = index;
 
 	if (rcar_du_has(rcdu, RCAR_DU_FEATURE_VSP1_SOURCE))
-		primary = &rcrtc->vsp->planes[0].plane;
+		primary = &rcrtc->vsp->planes[rcrtc->vsp_pipe].plane;
 	else
 		primary = &rgrp->planes[index % 2].plane;
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
index b199ed5..fdc2bf9 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_crtc.h
@@ -15,6 +15,7 @@
 #define __RCAR_DU_CRTC_H__
 
 #include <linux/mutex.h>
+#include <linux/spinlock.h>
 #include <linux/wait.h>
 
 #include <drm/drmP.h>
@@ -30,11 +31,17 @@ struct rcar_du_vsp;
  * @extclock: external pixel dot clock (optional)
  * @mmio_offset: offset of the CRTC registers in the DU MMIO block
  * @index: CRTC software and hardware index
- * @started: whether the CRTC has been started and is running
+ * @initialized: whether the CRTC has been initialized and clocks enabled
+ * @vblank_enable: whether vblank events are enabled on this CRTC
  * @event: event to post when the pending page flip completes
  * @flip_wait: wait queue used to signal page flip completion
+ * @vblank_lock: protects vblank_wait and vblank_count
+ * @vblank_wait: wait queue used to signal vertical blanking
+ * @vblank_count: number of vertical blanking interrupts to wait for
  * @outputs: bitmask of the outputs (enum rcar_du_output) driven by this CRTC
  * @group: CRTC group this CRTC belongs to
+ * @vsp: VSP feeding video to this CRTC
+ * @vsp_pipe: index of the VSP pipeline feeding video to this CRTC
  */
 struct rcar_du_crtc {
 	struct drm_crtc crtc;
@@ -43,15 +50,21 @@ struct rcar_du_crtc {
 	struct clk *extclock;
 	unsigned int mmio_offset;
 	unsigned int index;
-	bool started;
+	bool initialized;
 
+	bool vblank_enable;
 	struct drm_pending_vblank_event *event;
 	wait_queue_head_t flip_wait;
 
+	spinlock_t vblank_lock;
+	wait_queue_head_t vblank_wait;
+	unsigned int vblank_count;
+
 	unsigned int outputs;
 
 	struct rcar_du_group *group;
 	struct rcar_du_vsp *vsp;
+	unsigned int vsp_pipe;
 };
 
 #define to_rcar_crtc(c)	container_of(c, struct rcar_du_crtc, crtc)
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_drv.c b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
index d6a0255..d2f29e6 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_drv.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_drv.c
@@ -39,7 +39,8 @@ static const struct rcar_du_device_info rcar_du_r8a7779_info = {
 	.features = 0,
 	.num_crtcs = 2,
 	.routes = {
-		/* R8A7779 has two RGB outputs and one (currently unsupported)
+		/*
+		 * R8A7779 has two RGB outputs and one (currently unsupported)
 		 * TCON output.
 		 */
 		[RCAR_DU_OUTPUT_DPAD0] = {
@@ -61,7 +62,8 @@ static const struct rcar_du_device_info rcar_du_r8a7790_info = {
 	.quirks = RCAR_DU_QUIRK_ALIGN_128B | RCAR_DU_QUIRK_LVDS_LANES,
 	.num_crtcs = 3,
 	.routes = {
-		/* R8A7790 has one RGB output, two LVDS outputs and one
+		/*
+		 * R8A7790 has one RGB output, two LVDS outputs and one
 		 * (currently unsupported) TCON output.
 		 */
 		[RCAR_DU_OUTPUT_DPAD0] = {
@@ -87,7 +89,8 @@ static const struct rcar_du_device_info rcar_du_r8a7791_info = {
 		  | RCAR_DU_FEATURE_EXT_CTRL_REGS,
 	.num_crtcs = 2,
 	.routes = {
-		/* R8A779[13] has one RGB output, one LVDS output and one
+		/*
+		 * R8A779[13] has one RGB output, one LVDS output and one
 		 * (currently unsupported) TCON output.
 		 */
 		[RCAR_DU_OUTPUT_DPAD0] = {
@@ -127,7 +130,8 @@ static const struct rcar_du_device_info rcar_du_r8a7794_info = {
 		  | RCAR_DU_FEATURE_EXT_CTRL_REGS,
 	.num_crtcs = 2,
 	.routes = {
-		/* R8A7794 has two RGB outputs and one (currently unsupported)
+		/*
+		 * R8A7794 has two RGB outputs and one (currently unsupported)
 		 * TCON output.
 		 */
 		[RCAR_DU_OUTPUT_DPAD0] = {
@@ -149,7 +153,8 @@ static const struct rcar_du_device_info rcar_du_r8a7795_info = {
 		  | RCAR_DU_FEATURE_VSP1_SOURCE,
 	.num_crtcs = 4,
 	.routes = {
-		/* R8A7795 has one RGB output, two HDMI outputs and one
+		/*
+		 * R8A7795 has one RGB output, two HDMI outputs and one
 		 * LVDS output.
 		 */
 		[RCAR_DU_OUTPUT_DPAD0] = {
@@ -180,19 +185,25 @@ static const struct rcar_du_device_info rcar_du_r8a7796_info = {
 		  | RCAR_DU_FEATURE_VSP1_SOURCE,
 	.num_crtcs = 3,
 	.routes = {
-		/* R8A7796 has one RGB output, one LVDS output and one
-		 * (currently unsupported) HDMI output.
+		/*
+		 * R8A7796 has one RGB output, one LVDS output and one HDMI
+		 * output.
 		 */
 		[RCAR_DU_OUTPUT_DPAD0] = {
 			.possible_crtcs = BIT(2),
 			.port = 0,
 		},
+		[RCAR_DU_OUTPUT_HDMI0] = {
+			.possible_crtcs = BIT(1),
+			.port = 1,
+		},
 		[RCAR_DU_OUTPUT_LVDS0] = {
 			.possible_crtcs = BIT(0),
 			.port = 2,
 		},
 	},
 	.num_lvds = 1,
+	.dpll_ch =  BIT(1),
 };
 
 static const struct of_device_id rcar_du_of_table[] = {
@@ -238,8 +249,6 @@ static struct drm_driver rcar_du_driver = {
 	.gem_prime_vunmap	= drm_gem_cma_prime_vunmap,
 	.gem_prime_mmap		= drm_gem_cma_prime_mmap,
 	.dumb_create		= rcar_du_dumb_create,
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset,
-	.dumb_destroy		= drm_gem_dumb_destroy,
 	.fops			= &rcar_du_fops,
 	.name			= "rcar-du",
 	.desc			= "Renesas R-Car Display Unit",
@@ -341,7 +350,8 @@ static int rcar_du_probe(struct platform_device *pdev)
 
 	ddev->irq_enabled = 1;
 
-	/* Register the DRM device with the core and the connectors with
+	/*
+	 * Register the DRM device with the core and the connectors with
 	 * sysfs.
 	 */
 	ret = drm_dev_register(ddev, 0);
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_encoder.c b/drivers/gpu/drm/rcar-du/rcar_du_encoder.c
index 3e048dd..ba8d280 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_encoder.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_encoder.c
@@ -186,8 +186,8 @@ int rcar_du_encoder_init(struct rcar_du_device *rcdu,
 	}
 
 	if (enc_node) {
-		dev_dbg(rcdu->dev, "initializing encoder %s for output %u\n",
-			of_node_full_name(enc_node), output);
+		dev_dbg(rcdu->dev, "initializing encoder %pOF for output %u\n",
+			enc_node, output);
 
 		/* Locate the DRM bridge from the encoder DT node. */
 		bridge = of_drm_find_bridge(enc_node);
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_group.c b/drivers/gpu/drm/rcar-du/rcar_du_group.c
index 64738fc..2f37ea9 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_group.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_group.c
@@ -64,7 +64,8 @@ static void rcar_du_group_setup_defr8(struct rcar_du_group *rgrp)
 	if (rcdu->info->gen < 3) {
 		defr8 |= DEFR8_DEFE8;
 
-		/* On Gen2 the DEFR8 register for the first group also controls
+		/*
+		 * On Gen2 the DEFR8 register for the first group also controls
 		 * RGB output routing to DPAD0 and VSPD1 routing to DU0/1/2 for
 		 * DU instances that support it.
 		 */
@@ -75,7 +76,8 @@ static void rcar_du_group_setup_defr8(struct rcar_du_group *rgrp)
 				defr8 |= DEFR8_VSCS;
 		}
 	} else {
-		/* On Gen3 VSPD routing can't be configured, but DPAD routing
+		/*
+		 * On Gen3 VSPD routing can't be configured, but DPAD routing
 		 * needs to be set despite having a single option available.
 		 */
 		u32 crtc = ffs(possible_crtcs) - 1;
@@ -124,7 +126,8 @@ static void rcar_du_group_setup(struct rcar_du_group *rgrp)
 	if (rcdu->info->gen >= 3)
 		rcar_du_group_write(rgrp, DEFR10, DEFR10_CODE | DEFR10_DEFE10);
 
-	/* Use DS1PR and DS2PR to configure planes priorities and connects the
+	/*
+	 * Use DS1PR and DS2PR to configure planes priorities and connects the
 	 * superposition 0 to DU0 pins. DU1 pins will be configured dynamically.
 	 */
 	rcar_du_group_write(rgrp, DORCR, DORCR_PG1D_DS1 | DORCR_DPRS);
@@ -177,7 +180,8 @@ static void __rcar_du_group_start_stop(struct rcar_du_group *rgrp, bool start)
 
 void rcar_du_group_start_stop(struct rcar_du_group *rgrp, bool start)
 {
-	/* Many of the configuration bits are only updated when the display
+	/*
+	 * Many of the configuration bits are only updated when the display
 	 * reset (DRES) bit in DSYSR is set to 1, disabling *both* CRTCs. Some
 	 * of those bits could be pre-configured, but others (especially the
 	 * bits related to plane assignment to display timing controllers) need
@@ -208,23 +212,32 @@ void rcar_du_group_restart(struct rcar_du_group *rgrp)
 
 int rcar_du_set_dpad0_vsp1_routing(struct rcar_du_device *rcdu)
 {
+	struct rcar_du_group *rgrp;
+	struct rcar_du_crtc *crtc;
+	unsigned int index;
 	int ret;
 
 	if (!rcar_du_has(rcdu, RCAR_DU_FEATURE_EXT_CTRL_REGS))
 		return 0;
 
-	/* RGB output routing to DPAD0 and VSP1D routing to DU0/1/2 are
-	 * configured in the DEFR8 register of the first group. As this function
-	 * can be called with the DU0 and DU1 CRTCs disabled, we need to enable
-	 * the first group clock before accessing the register.
+	/*
+	 * RGB output routing to DPAD0 and VSP1D routing to DU0/1/2 are
+	 * configured in the DEFR8 register of the first group on Gen2 and the
+	 * last group on Gen3. As this function can be called with the DU
+	 * channels of the corresponding CRTCs disabled, we need to enable the
+	 * group clock before accessing the register.
 	 */
-	ret = clk_prepare_enable(rcdu->crtcs[0].clock);
+	index = rcdu->info->gen < 3 ? 0 : DIV_ROUND_UP(rcdu->num_crtcs, 2) - 1;
+	rgrp = &rcdu->groups[index];
+	crtc = &rcdu->crtcs[index * 2];
+
+	ret = clk_prepare_enable(crtc->clock);
 	if (ret < 0)
 		return ret;
 
-	rcar_du_group_setup_defr8(&rcdu->groups[0]);
+	rcar_du_group_setup_defr8(rgrp);
 
-	clk_disable_unprepare(rcdu->crtcs[0].clock);
+	clk_disable_unprepare(crtc->clock);
 
 	return 0;
 }
@@ -236,7 +249,8 @@ int rcar_du_group_set_routing(struct rcar_du_group *rgrp)
 
 	dorcr &= ~(DORCR_PG2T | DORCR_DK2S | DORCR_PG2D_MASK);
 
-	/* Set the DPAD1 pins sources. Select CRTC 0 if explicitly requested and
+	/*
+	 * Set the DPAD1 pins sources. Select CRTC 0 if explicitly requested and
 	 * CRTC 1 in all other cases to avoid cloning CRTC 0 to DPAD0 and DPAD1
 	 * by default.
 	 */
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_kms.c b/drivers/gpu/drm/rcar-du/rcar_du_kms.c
index f4125c8..7278b97 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_kms.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_kms.c
@@ -96,7 +96,8 @@ static const struct rcar_du_format_info rcar_du_format_infos[] = {
 		.pnmr = PnMR_SPIM_TP_OFF | PnMR_DDDF_YC,
 		.edf = PnDDCR4_EDF_NONE,
 	},
-	/* The following formats are not supported on Gen2 and thus have no
+	/*
+	 * The following formats are not supported on Gen2 and thus have no
 	 * associated .pnmr or .edf settings.
 	 */
 	{
@@ -153,7 +154,8 @@ int rcar_du_dumb_create(struct drm_file *file, struct drm_device *dev,
 	unsigned int min_pitch = DIV_ROUND_UP(args->width * args->bpp, 8);
 	unsigned int align;
 
-	/* The R8A7779 DU requires a 16 pixels pitch alignment as documented,
+	/*
+	 * The R8A7779 DU requires a 16 pixels pitch alignment as documented,
 	 * but the R8A7790 DU seems to require a 128 bytes pitch alignment.
 	 */
 	if (rcar_du_needs(rcdu, RCAR_DU_QUIRK_ALIGN_128B))
@@ -255,12 +257,12 @@ static void rcar_du_atomic_commit_tail(struct drm_atomic_state *old_state)
 
 	/* Apply the atomic update. */
 	drm_atomic_helper_commit_modeset_disables(dev, old_state);
-	drm_atomic_helper_commit_modeset_enables(dev, old_state);
 	drm_atomic_helper_commit_planes(dev, old_state,
 					DRM_PLANE_COMMIT_ACTIVE_ONLY);
+	drm_atomic_helper_commit_modeset_enables(dev, old_state);
 
 	drm_atomic_helper_commit_hw_done(old_state);
-	drm_atomic_helper_wait_for_vblanks(dev, old_state);
+	drm_atomic_helper_wait_for_flip_done(dev, old_state);
 
 	drm_atomic_helper_cleanup_planes(dev, old_state);
 }
@@ -297,19 +299,19 @@ static int rcar_du_encoders_init_one(struct rcar_du_device *rcdu,
 	 */
 	entity = of_graph_get_remote_port_parent(ep->local_node);
 	if (!entity) {
-		dev_dbg(rcdu->dev, "unconnected endpoint %s, skipping\n",
-			ep->local_node->full_name);
+		dev_dbg(rcdu->dev, "unconnected endpoint %pOF, skipping\n",
+			ep->local_node);
 		return -ENODEV;
 	}
 
 	if (!of_device_is_available(entity)) {
 		dev_dbg(rcdu->dev,
-			"connected entity %s is disabled, skipping\n",
-			entity->full_name);
+			"connected entity %pOF is disabled, skipping\n",
+			entity);
 		return -ENODEV;
 	}
 
-	entity_ep_node = of_parse_phandle(ep->local_node, "remote-endpoint", 0);
+	entity_ep_node = of_graph_get_remote_endpoint(ep->local_node);
 
 	for_each_endpoint_of_node(entity, ep_node) {
 		if (ep_node == entity_ep_node)
@@ -325,8 +327,8 @@ static int rcar_du_encoders_init_one(struct rcar_du_device *rcdu,
 
 		if (!connector) {
 			dev_warn(rcdu->dev,
-				 "no connector for encoder %s, skipping\n",
-				 encoder->full_name);
+				 "no connector for encoder %pOF, skipping\n",
+				 encoder);
 			of_node_put(entity_ep_node);
 			of_node_put(encoder);
 			return -ENODEV;
@@ -348,8 +350,8 @@ static int rcar_du_encoders_init_one(struct rcar_du_device *rcdu,
 	ret = rcar_du_encoder_init(rcdu, output, encoder, connector);
 	if (ret && ret != -EPROBE_DEFER)
 		dev_warn(rcdu->dev,
-			 "failed to initialize encoder %s on output %u (%d), skipping\n",
-			 of_node_full_name(encoder), output, ret);
+			 "failed to initialize encoder %pOF on output %u (%d), skipping\n",
+			 encoder, output, ret);
 
 	of_node_put(encoder);
 	of_node_put(connector);
@@ -419,7 +421,8 @@ static int rcar_du_properties_init(struct rcar_du_device *rcdu)
 	if (rcdu->props.alpha == NULL)
 		return -ENOMEM;
 
-	/* The color key is expressed as an RGB888 triplet stored in a 32-bit
+	/*
+	 * The color key is expressed as an RGB888 triplet stored in a 32-bit
 	 * integer in XRGB8888 format. Bit 24 is used as a flag to disable (0)
 	 * or enable source color keying (1).
 	 */
@@ -432,6 +435,81 @@ static int rcar_du_properties_init(struct rcar_du_device *rcdu)
 	return 0;
 }
 
+static int rcar_du_vsps_init(struct rcar_du_device *rcdu)
+{
+	const struct device_node *np = rcdu->dev->of_node;
+	struct of_phandle_args args;
+	struct {
+		struct device_node *np;
+		unsigned int crtcs_mask;
+	} vsps[RCAR_DU_MAX_VSPS] = { { 0, }, };
+	unsigned int vsps_count = 0;
+	unsigned int cells;
+	unsigned int i;
+	int ret;
+
+	/*
+	 * First parse the DT vsps property to populate the list of VSPs. Each
+	 * entry contains a pointer to the VSP DT node and a bitmask of the
+	 * connected DU CRTCs.
+	 */
+	cells = of_property_count_u32_elems(np, "vsps") / rcdu->num_crtcs - 1;
+	if (cells > 1)
+		return -EINVAL;
+
+	for (i = 0; i < rcdu->num_crtcs; ++i) {
+		unsigned int j;
+
+		ret = of_parse_phandle_with_fixed_args(np, "vsps", cells, i,
+						       &args);
+		if (ret < 0)
+			goto error;
+
+		/*
+		 * Add the VSP to the list or update the corresponding existing
+		 * entry if the VSP has already been added.
+		 */
+		for (j = 0; j < vsps_count; ++j) {
+			if (vsps[j].np == args.np)
+				break;
+		}
+
+		if (j < vsps_count)
+			of_node_put(args.np);
+		else
+			vsps[vsps_count++].np = args.np;
+
+		vsps[j].crtcs_mask |= BIT(i);
+
+		/* Store the VSP pointer and pipe index in the CRTC. */
+		rcdu->crtcs[i].vsp = &rcdu->vsps[j];
+		rcdu->crtcs[i].vsp_pipe = cells >= 1 ? args.args[0] : 0;
+	}
+
+	/*
+	 * Then initialize all the VSPs from the node pointers and CRTCs bitmask
+	 * computed previously.
+	 */
+	for (i = 0; i < vsps_count; ++i) {
+		struct rcar_du_vsp *vsp = &rcdu->vsps[i];
+
+		vsp->index = i;
+		vsp->dev = rcdu;
+
+		ret = rcar_du_vsp_init(vsp, vsps[i].np, vsps[i].crtcs_mask);
+		if (ret < 0)
+			goto error;
+	}
+
+	return 0;
+
+error:
+	for (i = 0; i < ARRAY_SIZE(vsps); ++i)
+		of_node_put(vsps[i].np);
+
+	return ret;
+}
+
 int rcar_du_modeset_init(struct rcar_du_device *rcdu)
 {
 	static const unsigned int mmio_offsets[] = {
@@ -461,7 +539,8 @@ int rcar_du_modeset_init(struct rcar_du_device *rcdu)
 	if (ret < 0)
 		return ret;
 
-	/* Initialize vertical blanking interrupts handling. Start with vblank
+	/*
+	 * Initialize vertical blanking interrupts handling. Start with vblank
 	 * disabled for all CRTCs.
 	 */
 	ret = drm_vblank_init(dev, (1 << rcdu->info->num_crtcs) - 1);
@@ -481,7 +560,8 @@ int rcar_du_modeset_init(struct rcar_du_device *rcdu)
 		rgrp->index = i;
 		rgrp->num_crtcs = min(rcdu->num_crtcs - 2 * i, 2U);
 
-		/* If we have more than one CRTCs in this group pre-associate
+		/*
+		 * If we have more than one CRTC in this group, pre-associate
 		 * the low-order planes with CRTC 0 and the high-order planes
 		 * with CRTC 1 to minimize flicker occurring when the
 		 * association is changed.
@@ -499,17 +579,9 @@ int rcar_du_modeset_init(struct rcar_du_device *rcdu)
 
 	/* Initialize the compositors. */
 	if (rcar_du_has(rcdu, RCAR_DU_FEATURE_VSP1_SOURCE)) {
-		for (i = 0; i < rcdu->num_crtcs; ++i) {
-			struct rcar_du_vsp *vsp = &rcdu->vsps[i];
-
-			vsp->index = i;
-			vsp->dev = rcdu;
-			rcdu->crtcs[i].vsp = vsp;
-
-			ret = rcar_du_vsp_init(vsp);
-			if (ret < 0)
-				return ret;
-		}
+		ret = rcar_du_vsps_init(rcdu);
+		if (ret < 0)
+			return ret;
 	}
 
 	/* Create the CRTCs. */
@@ -537,7 +609,8 @@ int rcar_du_modeset_init(struct rcar_du_device *rcdu)
 
 	num_encoders = ret;
 
-	/* Set the possible CRTCs and possible clones. There's always at least
+	/*
+	 * Set the possible CRTCs and possible clones. There's always at least
 	 * one way for all encoders to clone each other, set all bits in the
 	 * possible clones field.
 	 */
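Note on rcar_du_vsps_init() in the rcar_du_kms.c hunks above: the first loop
groups CRTCs by the VSP node they reference and accumulates a per-VSP bitmask
of connected CRTCs. The following is a minimal userspace sketch of that
grouping step only, not the driver code. Plain integers stand in for
struct device_node pointers, and MAX_VSPS, struct vsp_entry and
group_crtcs_by_vsp() are hypothetical names introduced for illustration.

```c
#include <stddef.h>

#define MAX_VSPS 4 /* hypothetical bound, stands in for RCAR_DU_MAX_VSPS */

struct vsp_entry {
	int node;                /* stands in for the VSP device_node pointer */
	unsigned int crtcs_mask; /* bit i is set when CRTC i uses this VSP */
};

/*
 * Group CRTCs by the VSP they reference. crtc_to_node[i] is the VSP
 * identifier found for CRTC i. Returns the number of distinct VSPs, or -1
 * when more than MAX_VSPS distinct identifiers are seen.
 */
static int group_crtcs_by_vsp(const int *crtc_to_node, size_t num_crtcs,
			      struct vsp_entry *vsps)
{
	size_t count = 0;
	size_t i, j;

	for (i = 0; i < num_crtcs; ++i) {
		/* Look for an existing entry for this VSP. */
		for (j = 0; j < count; ++j) {
			if (vsps[j].node == crtc_to_node[i])
				break;
		}

		/* Not found: append a new entry with an empty mask. */
		if (j == count) {
			if (count == MAX_VSPS)
				return -1;
			vsps[count].node = crtc_to_node[i];
			vsps[count].crtcs_mask = 0;
			count++;
		}

		/* j now indexes the entry for this VSP, new or existing. */
		vsps[j].crtcs_mask |= 1u << i;
	}

	return (int)count;
}
```

With four CRTCs alternating between two VSPs, the sketch yields two entries
whose masks cover CRTCs {0, 2} and {1, 3}. The driver additionally drops the
extra node reference with of_node_put() when a VSP is seen again, which this
sketch has no equivalent for.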
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_lvdscon.c b/drivers/gpu/drm/rcar-du/rcar_du_lvdscon.c
index ee914811..b373ad4 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_lvdscon.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_lvdscon.c
@@ -46,7 +46,6 @@ static void rcar_du_lvds_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = rcar_du_lvds_connector_destroy,
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c b/drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c
index 1661f62..12d22f3 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_lvdsenc.c
@@ -59,7 +59,8 @@ static void rcar_du_lvdsenc_start_gen2(struct rcar_du_lvdsenc *lvds,
 
 	rcar_lvds_write(lvds, LVDPLLCR, pllcr);
 
-	/* Select the input, hardcode mode 0, enable LVDS operation and turn
+	/*
+	 * Select the input, hardcode mode 0, enable LVDS operation and turn
 	 * bias circuitry on.
 	 */
 	lvdcr0 = (lvds->mode << LVDCR0_LVMD_SHIFT) | LVDCR0_BEN | LVDCR0_LVEN;
@@ -73,7 +74,8 @@ static void rcar_du_lvdsenc_start_gen2(struct rcar_du_lvdsenc *lvds,
 			LVDCR1_CHSTBY_GEN2(1) | LVDCR1_CHSTBY_GEN2(0) |
 			LVDCR1_CLKSTBY_GEN2);
 
-	/* Turn the PLL on, wait for the startup delay, and turn the output
+	/*
+	 * Turn the PLL on, wait for the startup delay, and turn the output
 	 * on.
 	 */
 	lvdcr0 |= LVDCR0_PLLON;
@@ -140,7 +142,8 @@ static int rcar_du_lvdsenc_start(struct rcar_du_lvdsenc *lvds,
 	if (ret < 0)
 		return ret;
 
-	/* Hardcode the channels and control signals routing for now.
+	/*
+	 * Hardcode the channels and control signals routing for now.
 	 *
 	 * HSYNC -> CTRL0
 	 * VSYNC -> CTRL1
@@ -202,7 +205,8 @@ void rcar_du_lvdsenc_atomic_check(struct rcar_du_lvdsenc *lvds,
 {
 	struct rcar_du_device *rcdu = lvds->dev;
 
-	/* The internal LVDS encoder has a restricted clock frequency operating
+	/*
+	 * The internal LVDS encoder has a restricted clock frequency operating
 	 * range (30MHz to 150MHz on Gen2, 25.175MHz to 148.5MHz on Gen3). Clamp
 	 * the clock accordingly.
 	 */
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_plane.c b/drivers/gpu/drm/rcar-du/rcar_du_plane.c
index dcde628..61833cc 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_plane.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_plane.c
@@ -50,23 +50,21 @@
  * automatically when the core swaps the old and new states.
  */
 
-static bool rcar_du_plane_needs_realloc(struct rcar_du_plane *plane,
-					struct rcar_du_plane_state *new_state)
+static bool rcar_du_plane_needs_realloc(
+				const struct rcar_du_plane_state *old_state,
+				const struct rcar_du_plane_state *new_state)
 {
-	struct rcar_du_plane_state *cur_state;
-
-	cur_state = to_rcar_plane_state(plane->plane.state);
-
-	/* Lowering the number of planes doesn't strictly require reallocation
+	/*
+	 * Lowering the number of planes doesn't strictly require reallocation
 	 * as the extra hardware plane will be freed when committing, but doing
 	 * so could lead to more fragmentation.
 	 */
-	if (!cur_state->format ||
-	    cur_state->format->planes != new_state->format->planes)
+	if (!old_state->format ||
+	    old_state->format->planes != new_state->format->planes)
 		return true;
 
 	/* Reallocate hardware planes if the source has changed. */
-	if (cur_state->source != new_state->source)
+	if (old_state->source != new_state->source)
 		return true;
 
 	return false;
@@ -141,37 +139,43 @@ int rcar_du_atomic_check_planes(struct drm_device *dev,
 	unsigned int groups = 0;
 	unsigned int i;
 	struct drm_plane *drm_plane;
-	struct drm_plane_state *drm_plane_state;
+	struct drm_plane_state *old_drm_plane_state;
+	struct drm_plane_state *new_drm_plane_state;
 
 	/* Check if hardware planes need to be reallocated. */
-	for_each_plane_in_state(state, drm_plane, drm_plane_state, i) {
-		struct rcar_du_plane_state *plane_state;
+	for_each_oldnew_plane_in_state(state, drm_plane, old_drm_plane_state,
+				       new_drm_plane_state, i) {
+		struct rcar_du_plane_state *old_plane_state;
+		struct rcar_du_plane_state *new_plane_state;
 		struct rcar_du_plane *plane;
 		unsigned int index;
 
 		plane = to_rcar_plane(drm_plane);
-		plane_state = to_rcar_plane_state(drm_plane_state);
+		old_plane_state = to_rcar_plane_state(old_drm_plane_state);
+		new_plane_state = to_rcar_plane_state(new_drm_plane_state);
 
 		dev_dbg(rcdu->dev, "%s: checking plane (%u,%tu)\n", __func__,
 			plane->group->index, plane - plane->group->planes);
 
-		/* If the plane is being disabled we don't need to go through
+		/*
+		 * If the plane is being disabled we don't need to go through
 		 * the full reallocation procedure. Just mark the hardware
 		 * plane(s) as freed.
 		 */
-		if (!plane_state->format) {
+		if (!new_plane_state->format) {
 			dev_dbg(rcdu->dev, "%s: plane is being disabled\n",
 				__func__);
 			index = plane - plane->group->planes;
 			group_freed_planes[plane->group->index] |= 1 << index;
-			plane_state->hwindex = -1;
+			new_plane_state->hwindex = -1;
 			continue;
 		}
 
-		/* If the plane needs to be reallocated mark it as such, and
+		/*
+		 * If the plane needs to be reallocated mark it as such, and
 		 * mark the hardware plane(s) as free.
 		 */
-		if (rcar_du_plane_needs_realloc(plane, plane_state)) {
+		if (rcar_du_plane_needs_realloc(old_plane_state, new_plane_state)) {
 			dev_dbg(rcdu->dev, "%s: plane needs reallocation\n",
 				__func__);
 			groups |= 1 << plane->group->index;
@@ -179,14 +183,15 @@ int rcar_du_atomic_check_planes(struct drm_device *dev,
 
 			index = plane - plane->group->planes;
 			group_freed_planes[plane->group->index] |= 1 << index;
-			plane_state->hwindex = -1;
+			new_plane_state->hwindex = -1;
 		}
 	}
 
 	if (!needs_realloc)
 		return 0;
 
-	/* Grab all plane states for the groups that need reallocation to ensure
+	/*
+	 * Grab all plane states for the groups that need reallocation to ensure
 	 * locking and avoid racy updates. This serializes the update operation,
 	 * but there's not much we can do about it as that's the hardware
 	 * design.
@@ -204,14 +209,15 @@ int rcar_du_atomic_check_planes(struct drm_device *dev,
 
 		for (i = 0; i < group->num_planes; ++i) {
 			struct rcar_du_plane *plane = &group->planes[i];
-			struct rcar_du_plane_state *plane_state;
+			struct rcar_du_plane_state *new_plane_state;
 			struct drm_plane_state *s;
 
 			s = drm_atomic_get_plane_state(state, &plane->plane);
 			if (IS_ERR(s))
 				return PTR_ERR(s);
 
-			/* If the plane has been freed in the above loop its
+			/*
+			 * If the plane has been freed in the above loop its
 			 * hardware planes must not be added to the used planes
 			 * bitmask. However, the current state doesn't reflect
 			 * the free state yet, as we've modified the new state
@@ -226,16 +232,16 @@ int rcar_du_atomic_check_planes(struct drm_device *dev,
 				continue;
 			}
 
-			plane_state = to_rcar_plane_state(plane->plane.state);
-			used_planes |= rcar_du_plane_hwmask(plane_state);
+			new_plane_state = to_rcar_plane_state(s);
+			used_planes |= rcar_du_plane_hwmask(new_plane_state);
 
 			dev_dbg(rcdu->dev,
 				"%s: plane (%u,%tu) uses %u hwplanes (index %d)\n",
 				__func__, plane->group->index,
 				plane - plane->group->planes,
-				plane_state->format ?
-				plane_state->format->planes : 0,
-				plane_state->hwindex);
+				new_plane_state->format ?
+				new_plane_state->format->planes : 0,
+				new_plane_state->hwindex);
 		}
 
 		group_free_planes[index] = 0xff & ~used_planes;
@@ -246,40 +252,45 @@ int rcar_du_atomic_check_planes(struct drm_device *dev,
 	}
 
 	/* Reallocate hardware planes for each plane that needs it. */
-	for_each_plane_in_state(state, drm_plane, drm_plane_state, i) {
-		struct rcar_du_plane_state *plane_state;
+	for_each_oldnew_plane_in_state(state, drm_plane, old_drm_plane_state,
+				       new_drm_plane_state, i) {
+		struct rcar_du_plane_state *old_plane_state;
+		struct rcar_du_plane_state *new_plane_state;
 		struct rcar_du_plane *plane;
 		unsigned int crtc_planes;
 		unsigned int free;
 		int idx;
 
 		plane = to_rcar_plane(drm_plane);
-		plane_state = to_rcar_plane_state(drm_plane_state);
+		old_plane_state = to_rcar_plane_state(old_drm_plane_state);
+		new_plane_state = to_rcar_plane_state(new_drm_plane_state);
 
 		dev_dbg(rcdu->dev, "%s: allocating plane (%u,%tu)\n", __func__,
 			plane->group->index, plane - plane->group->planes);
 
-		/* Skip planes that are being disabled or don't need to be
+		/*
+		 * Skip planes that are being disabled or don't need to be
 		 * reallocated.
 		 */
-		if (!plane_state->format ||
-		    !rcar_du_plane_needs_realloc(plane, plane_state))
+		if (!new_plane_state->format ||
+		    !rcar_du_plane_needs_realloc(old_plane_state, new_plane_state))
 			continue;
 
-		/* Try to allocate the plane from the free planes currently
+		/*
+		 * Try to allocate the plane from the free planes currently
 		 * associated with the target CRTC to avoid restarting the CRTC
 		 * group and thus minimize flicker. If it fails fall back to
 		 * allocating from all free planes.
 		 */
-		crtc_planes = to_rcar_crtc(plane_state->state.crtc)->index % 2
+		crtc_planes = to_rcar_crtc(new_plane_state->state.crtc)->index % 2
 			    ? plane->group->dptsr_planes
 			    : ~plane->group->dptsr_planes;
 		free = group_free_planes[plane->group->index];
 
-		idx = rcar_du_plane_hwalloc(plane, plane_state,
+		idx = rcar_du_plane_hwalloc(plane, new_plane_state,
 					    free & crtc_planes);
 		if (idx < 0)
-			idx = rcar_du_plane_hwalloc(plane, plane_state,
+			idx = rcar_du_plane_hwalloc(plane, new_plane_state,
 						    free);
 		if (idx < 0) {
 			dev_dbg(rcdu->dev, "%s: no available hardware plane\n",
@@ -288,12 +299,12 @@ int rcar_du_atomic_check_planes(struct drm_device *dev,
 		}
 
 		dev_dbg(rcdu->dev, "%s: allocated %u hwplanes (index %u)\n",
-			__func__, plane_state->format->planes, idx);
+			__func__, new_plane_state->format->planes, idx);
 
-		plane_state->hwindex = idx;
+		new_plane_state->hwindex = idx;
 
 		group_free_planes[plane->group->index] &=
-			~rcar_du_plane_hwmask(plane_state);
+			~rcar_du_plane_hwmask(new_plane_state);
 
 		dev_dbg(rcdu->dev, "%s: group %u free planes mask 0x%02x\n",
 			__func__, plane->group->index,
@@ -351,14 +362,16 @@ static void rcar_du_plane_setup_scanout(struct rcar_du_group *rgrp,
 		dma[1] = 0;
 	}
 
-	/* Memory pitch (expressed in pixels). Must be doubled for interlaced
+	/*
+	 * Memory pitch (expressed in pixels). Must be doubled for interlaced
 	 * operation with 32bpp formats.
 	 */
 	rcar_du_plane_write(rgrp, index, PnMWR,
 			    (interlaced && state->format->bpp == 32) ?
 			    pitch * 2 : pitch);
 
-	/* The Y position is expressed in raster line units and must be doubled
+	/*
+	 * The Y position is expressed in raster line units and must be doubled
 	 * for 32bpp formats, according to the R8A7790 datasheet. No mention of
 	 * doubling the Y position is found in the R8A7779 datasheet, but the
 	 * rule seems to apply there as well.
@@ -396,7 +409,8 @@ static void rcar_du_plane_setup_mode(struct rcar_du_group *rgrp,
 	u32 colorkey;
 	u32 pnmr;
 
-	/* The PnALPHAR register controls alpha-blending in 16bpp formats
+	/*
+	 * The PnALPHAR register controls alpha-blending in 16bpp formats
 	 * (ARGB1555 and XRGB1555).
 	 *
 	 * For ARGB, set the alpha value to 0, and enable alpha-blending when
@@ -413,7 +427,8 @@ static void rcar_du_plane_setup_mode(struct rcar_du_group *rgrp,
 
 	pnmr = PnMR_BM_MD | state->format->pnmr;
 
-	/* Disable color keying when requested. YUV formats have the
+	/*
+	 * Disable color keying when requested. YUV formats have the
 	 * PnMR_SPIM_TP_OFF bit set in their pnmr field, disabling color keying
 	 * automatically.
 	 */
@@ -457,7 +472,8 @@ static void rcar_du_plane_setup_format_gen2(struct rcar_du_group *rgrp,
 	u32 ddcr2 = PnDDCR2_CODE;
 	u32 ddcr4;
 
-	/* Data format
+	/*
+	 * Data format
 	 *
 	 * The data format is selected by the DDDF field in PnMR and the EDF
 	 * field in DDCR4.
@@ -589,7 +605,8 @@ static void rcar_du_plane_atomic_update(struct drm_plane *plane,
 
 	rcar_du_plane_setup(rplane);
 
-	/* Check whether the source has changed from memory to live source or
+	/*
+	 * Check whether the source has changed from memory to live source or
 	 * from live source to memory. The source has been configured by the
 	 * VSPS bit in the PnDDCR4 register. Although the datasheet states that
 	 * the bit is updated during vertical blanking, it seems that updates
@@ -698,7 +715,6 @@ static const struct drm_plane_funcs rcar_du_plane_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.reset = rcar_du_plane_reset,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.destroy = drm_plane_cleanup,
 	.atomic_duplicate_state = rcar_du_plane_atomic_duplicate_state,
 	.atomic_destroy_state = rcar_du_plane_atomic_destroy_state,
@@ -726,7 +742,8 @@ int rcar_du_planes_init(struct rcar_du_group *rgrp)
 	unsigned int i;
 	int ret;
 
-	 /* Create one primary plane per CRTC in this group and seven overlay
+	 /*
+	  * Create one primary plane per CRTC in this group and seven overlay
 	  * planes.
 	  */
 	rgrp->num_planes = rgrp->num_crtcs + 7;
@@ -743,8 +760,8 @@ int rcar_du_planes_init(struct rcar_du_group *rgrp)
 
 		ret = drm_universal_plane_init(rcdu->ddev, &plane->plane, crtcs,
 					       &rcar_du_plane_funcs, formats,
-					       ARRAY_SIZE(formats), type,
-					       NULL);
+					       ARRAY_SIZE(formats),
+					       NULL, type, NULL);
 		if (ret < 0)
 			return ret;
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_plane.h b/drivers/gpu/drm/rcar-du/rcar_du_plane.h
index 8b91dd3..f62e09f 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_plane.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_plane.h
@@ -20,7 +20,8 @@
 struct rcar_du_format_info;
 struct rcar_du_group;
 
-/* The RCAR DU has 8 hardware planes, shared between primary and overlay planes.
+/*
+ * The RCAR DU has 8 hardware planes, shared between primary and overlay planes.
  * As using overlay planes requires at least one of the CRTCs being enabled, no
  * more than 7 overlay planes can be available. We thus create 1 primary plane
  * per CRTC and 7 overlay planes, for a total of up to 9 KMS planes.
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
index f870445..2c96147 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
+++ b/drivers/gpu/drm/rcar-du/rcar_du_vsp.c
@@ -19,6 +19,7 @@
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_plane_helper.h>
 
+#include <linux/bitops.h>
 #include <linux/dma-mapping.h>
 #include <linux/of_platform.h>
 #include <linux/scatterlist.h>
@@ -30,11 +31,15 @@
 #include "rcar_du_kms.h"
 #include "rcar_du_vsp.h"
 
-static void rcar_du_vsp_complete(void *private)
+static void rcar_du_vsp_complete(void *private, bool completed)
 {
 	struct rcar_du_crtc *crtc = private;
 
-	rcar_du_crtc_finish_page_flip(crtc);
+	if (crtc->vblank_enable)
+		drm_crtc_handle_vblank(&crtc->crtc);
+
+	if (completed)
+		rcar_du_crtc_finish_page_flip(crtc);
 }
 
 void rcar_du_vsp_enable(struct rcar_du_crtc *crtc)
@@ -73,7 +78,8 @@ void rcar_du_vsp_enable(struct rcar_du_crtc *crtc)
 
 	__rcar_du_plane_setup(crtc->group, &state);
 
-	/* Ensure that the plane source configuration takes effect by requesting
+	/*
+	 * Ensure that the plane source configuration takes effect by requesting
 	 * a restart of the group. See rcar_du_plane_atomic_update() for a more
 	 * detailed explanation.
 	 *
@@ -81,22 +87,22 @@ void rcar_du_vsp_enable(struct rcar_du_crtc *crtc)
 	 */
 	crtc->group->need_restart = true;
 
-	vsp1_du_setup_lif(crtc->vsp->vsp, &cfg);
+	vsp1_du_setup_lif(crtc->vsp->vsp, crtc->vsp_pipe, &cfg);
 }
 
 void rcar_du_vsp_disable(struct rcar_du_crtc *crtc)
 {
-	vsp1_du_setup_lif(crtc->vsp->vsp, NULL);
+	vsp1_du_setup_lif(crtc->vsp->vsp, crtc->vsp_pipe, NULL);
 }
 
 void rcar_du_vsp_atomic_begin(struct rcar_du_crtc *crtc)
 {
-	vsp1_du_atomic_begin(crtc->vsp->vsp);
+	vsp1_du_atomic_begin(crtc->vsp->vsp, crtc->vsp_pipe);
 }
 
 void rcar_du_vsp_atomic_flush(struct rcar_du_crtc *crtc)
 {
-	vsp1_du_atomic_flush(crtc->vsp->vsp);
+	vsp1_du_atomic_flush(crtc->vsp->vsp, crtc->vsp_pipe);
 }
 
 /* Keep the two tables in sync. */
@@ -162,6 +168,7 @@ static void rcar_du_vsp_plane_setup(struct rcar_du_vsp_plane *plane)
 {
 	struct rcar_du_vsp_plane_state *state =
 		to_rcar_vsp_plane_state(plane->plane.state);
+	struct rcar_du_crtc *crtc = to_rcar_crtc(state->state.crtc);
 	struct drm_framebuffer *fb = plane->plane.state->fb;
 	struct vsp1_du_atomic_config cfg = {
 		.pixelformat = 0,
@@ -192,7 +199,8 @@ static void rcar_du_vsp_plane_setup(struct rcar_du_vsp_plane *plane)
 		}
 	}
 
-	vsp1_du_atomic_update(plane->vsp->vsp, plane->index, &cfg);
+	vsp1_du_atomic_update(plane->vsp->vsp, crtc->vsp_pipe,
+			      plane->index, &cfg);
 }
 
 static int rcar_du_vsp_plane_prepare_fb(struct drm_plane *plane,
@@ -288,11 +296,13 @@ static void rcar_du_vsp_plane_atomic_update(struct drm_plane *plane,
 					struct drm_plane_state *old_state)
 {
 	struct rcar_du_vsp_plane *rplane = to_rcar_vsp_plane(plane);
+	struct rcar_du_crtc *crtc = to_rcar_crtc(old_state->crtc);
 
 	if (plane->state->crtc)
 		rcar_du_vsp_plane_setup(rplane);
 	else
-		vsp1_du_atomic_update(rplane->vsp->vsp, rplane->index, NULL);
+		vsp1_du_atomic_update(rplane->vsp->vsp, crtc->vsp_pipe,
+				      rplane->index, NULL);
 }
 
 static const struct drm_plane_helper_funcs rcar_du_vsp_plane_helper_funcs = {
@@ -383,7 +393,6 @@ static const struct drm_plane_funcs rcar_du_vsp_plane_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.reset = rcar_du_vsp_plane_reset,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.destroy = drm_plane_cleanup,
 	.atomic_duplicate_state = rcar_du_vsp_plane_atomic_duplicate_state,
 	.atomic_destroy_state = rcar_du_vsp_plane_atomic_destroy_state,
@@ -391,23 +400,17 @@ static const struct drm_plane_funcs rcar_du_vsp_plane_funcs = {
 	.atomic_get_property = rcar_du_vsp_plane_atomic_get_property,
 };
 
-int rcar_du_vsp_init(struct rcar_du_vsp *vsp)
+int rcar_du_vsp_init(struct rcar_du_vsp *vsp, struct device_node *np,
+		     unsigned int crtcs)
 {
 	struct rcar_du_device *rcdu = vsp->dev;
 	struct platform_device *pdev;
-	struct device_node *np;
+	unsigned int num_crtcs = hweight32(crtcs);
 	unsigned int i;
 	int ret;
 
 	/* Find the VSP device and initialize it. */
-	np = of_parse_phandle(rcdu->dev->of_node, "vsps", vsp->index);
-	if (!np) {
-		dev_err(rcdu->dev, "vsps node not found\n");
-		return -ENXIO;
-	}
-
 	pdev = of_find_device_by_node(np);
-	of_node_put(np);
 	if (!pdev)
 		return -ENXIO;
 
@@ -417,7 +420,8 @@ int rcar_du_vsp_init(struct rcar_du_vsp *vsp)
 	if (ret < 0)
 		return ret;
 
-	 /* The VSP2D (Gen3) has 5 RPFs, but the VSP1D (Gen2) is limited to
+	 /*
+	  * The VSP2D (Gen3) has 5 RPFs, but the VSP1D (Gen2) is limited to
 	  * 4 RPFs.
 	  */
 	vsp->num_planes = rcdu->info->gen >= 3 ? 5 : 4;
@@ -428,19 +432,19 @@ int rcar_du_vsp_init(struct rcar_du_vsp *vsp)
 		return -ENOMEM;
 
 	for (i = 0; i < vsp->num_planes; ++i) {
-		enum drm_plane_type type = i ? DRM_PLANE_TYPE_OVERLAY
-					 : DRM_PLANE_TYPE_PRIMARY;
+		enum drm_plane_type type = i < num_crtcs
+					 ? DRM_PLANE_TYPE_PRIMARY
+					 : DRM_PLANE_TYPE_OVERLAY;
 		struct rcar_du_vsp_plane *plane = &vsp->planes[i];
 
 		plane->vsp = vsp;
 		plane->index = i;
 
-		ret = drm_universal_plane_init(rcdu->ddev, &plane->plane,
-					       1 << vsp->index,
+		ret = drm_universal_plane_init(rcdu->ddev, &plane->plane, crtcs,
 					       &rcar_du_vsp_plane_funcs,
 					       formats_kms,
-					       ARRAY_SIZE(formats_kms), type,
-					       NULL);
+					       ARRAY_SIZE(formats_kms),
+					       NULL, type, NULL);
 		if (ret < 0)
 			return ret;
 
diff --git a/drivers/gpu/drm/rcar-du/rcar_du_vsp.h b/drivers/gpu/drm/rcar-du/rcar_du_vsp.h
index 8861661..f876c51 100644
--- a/drivers/gpu/drm/rcar-du/rcar_du_vsp.h
+++ b/drivers/gpu/drm/rcar-du/rcar_du_vsp.h
@@ -64,13 +64,19 @@ to_rcar_vsp_plane_state(struct drm_plane_state *state)
 }
 
 #ifdef CONFIG_DRM_RCAR_VSP
-int rcar_du_vsp_init(struct rcar_du_vsp *vsp);
+int rcar_du_vsp_init(struct rcar_du_vsp *vsp, struct device_node *np,
+		     unsigned int crtcs);
 void rcar_du_vsp_enable(struct rcar_du_crtc *crtc);
 void rcar_du_vsp_disable(struct rcar_du_crtc *crtc);
 void rcar_du_vsp_atomic_begin(struct rcar_du_crtc *crtc);
 void rcar_du_vsp_atomic_flush(struct rcar_du_crtc *crtc);
 #else
-static inline int rcar_du_vsp_init(struct rcar_du_vsp *vsp) { return -ENXIO; };
+static inline int rcar_du_vsp_init(struct rcar_du_vsp *vsp,
+				   struct device_node *np,
+				   unsigned int crtcs)
+{
+	return -ENXIO;
+}
 static inline void rcar_du_vsp_enable(struct rcar_du_crtc *crtc) { };
 static inline void rcar_du_vsp_disable(struct rcar_du_crtc *crtc) { };
 static inline void rcar_du_vsp_atomic_begin(struct rcar_du_crtc *crtc) { };
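Note on the new `crtcs` argument of rcar_du_vsp_init(): in the rcar_du_vsp.c
hunks above, the number of primary planes is derived from the population
count of that bitmask (hweight32() in the kernel), so a VSP shared by two
CRTCs gets two primary planes. A minimal sketch of that rule, assuming
nothing beyond what the diff shows; popcount32(), enum plane_type and
vsp_plane_type() are hypothetical stand-ins, not kernel symbols.

```c
enum plane_type { PLANE_PRIMARY, PLANE_OVERLAY };

/* Portable population count, standing in for the kernel's hweight32(). */
static unsigned int popcount32(unsigned int mask)
{
	unsigned int n = 0;

	/* Clear the lowest set bit on each iteration. */
	for (; mask; mask &= mask - 1)
		n++;

	return n;
}

/* The first popcount32(crtcs_mask) planes are primary, the rest overlays. */
static enum plane_type vsp_plane_type(unsigned int index,
				      unsigned int crtcs_mask)
{
	return index < popcount32(crtcs_mask) ? PLANE_PRIMARY : PLANE_OVERLAY;
}
```

For a mask of 0x5 (CRTCs 0 and 2), planes 0 and 1 come out as primary and
plane 2 onwards as overlays. For a mask with a single bit set, only plane 0
is primary, which matches the behaviour before the change.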
diff --git a/drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c b/drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c
index 7539626..dc85b53 100644
--- a/drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c
+++ b/drivers/gpu/drm/rcar-du/rcar_dw_hdmi.c
@@ -45,7 +45,7 @@ static int rcar_hdmi_phy_configure(struct dw_hdmi *hdmi,
 {
 	const struct rcar_hdmi_phy_params *params = rcar_hdmi_phy_params;
 
-	for (; params && params->mpixelclock != ~0UL; ++params) {
+	for (; params->mpixelclock != ~0UL; ++params) {
 		if (mpixelclock <= params->mpixelclock)
 			break;
 	}
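Note on the rcar_dw_hdmi.c hunk above: `params` is initialized from a static
array, so it can never be NULL and the `params &&` test was redundant. The
loop relies only on the `~0UL` sentinel that terminates the table. A minimal
sketch of this sentinel-terminated lookup follows. The table contents,
struct phy_params and phy_lookup() are hypothetical and chosen for
illustration, they are not the driver's actual PHY parameters.

```c
#include <stddef.h>

struct phy_params {
	unsigned long mpixelclock; /* upper bound of the range, ~0UL ends the table */
	unsigned int setting;      /* hypothetical payload */
};

static const struct phy_params phy_table[] = {
	{ 35500000, 1 },
	{ 44900000, 2 },
	{ 71000000, 3 },
	{ ~0UL,     0 }, /* sentinel */
};

/* Return the first entry covering the clock, or NULL if none does. */
static const struct phy_params *phy_lookup(unsigned long mpixelclock)
{
	const struct phy_params *p = phy_table;

	/* The array always ends with the sentinel, so no NULL check is needed. */
	for (; p->mpixelclock != ~0UL; ++p) {
		if (mpixelclock <= p->mpixelclock)
			break;
	}

	return p->mpixelclock != ~0UL ? p : NULL;
}
```

A clock at or below an entry's bound selects that entry. A clock above every
bound stops on the sentinel, which the sketch reports as NULL.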
diff --git a/drivers/gpu/drm/rockchip/cdn-dp-core.c b/drivers/gpu/drm/rockchip/cdn-dp-core.c
index 9b0b058..a57da05 100644
--- a/drivers/gpu/drm/rockchip/cdn-dp-core.c
+++ b/drivers/gpu/drm/rockchip/cdn-dp-core.c
@@ -254,7 +254,6 @@ static void cdn_dp_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs cdn_dp_atomic_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = cdn_dp_connector_detect,
 	.destroy = cdn_dp_connector_destroy,
 	.fill_modes = drm_helper_probe_single_connector_modes,
diff --git a/drivers/gpu/drm/rockchip/dw-mipi-dsi.c b/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
index 21b9737..9a20b9d 100644
--- a/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
+++ b/drivers/gpu/drm/rockchip/dw-mipi-dsi.c
@@ -1080,7 +1080,6 @@ static void dw_mipi_dsi_drm_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs dw_mipi_dsi_atomic_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = dw_mipi_dsi_drm_connector_destroy,
 	.reset = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
index f820848..ccd5d59 100644
--- a/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
+++ b/drivers/gpu/drm/rockchip/dw_hdmi-rockchip.c
@@ -7,10 +7,12 @@
  * (at your option) any later version.
  */
 
+#include <linux/clk.h>
+#include <linux/mfd/syscon.h>
 #include <linux/module.h>
 #include <linux/platform_device.h>
-#include <linux/mfd/syscon.h>
 #include <linux/regmap.h>
+
 #include <drm/drm_of.h>
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
@@ -20,13 +22,32 @@
 #include "rockchip_drm_drv.h"
 #include "rockchip_drm_vop.h"
 
-#define GRF_SOC_CON6                    0x025c
-#define HDMI_SEL_VOP_LIT                (1 << 4)
+#define RK3288_GRF_SOC_CON6		0x025C
+#define RK3288_HDMI_LCDC_SEL		BIT(4)
+#define RK3399_GRF_SOC_CON20		0x6250
+#define RK3399_HDMI_LCDC_SEL		BIT(6)
+
+#define HIWORD_UPDATE(val, mask)	((val) | (mask) << 16)
+
+/**
+ * struct rockchip_hdmi_chip_data - per-chip GRF settings for HDMI
+ * @lcdsel_grf_reg: grf register offset of lcdc select
+ * @lcdsel_big: reg value of selecting vop big for HDMI
+ * @lcdsel_lit: reg value of selecting vop little for HDMI
+ */
+struct rockchip_hdmi_chip_data {
+	u32	lcdsel_grf_reg;
+	u32	lcdsel_big;
+	u32	lcdsel_lit;
+};
 
 struct rockchip_hdmi {
 	struct device *dev;
 	struct regmap *regmap;
 	struct drm_encoder encoder;
+	const struct rockchip_hdmi_chip_data *chip_data;
+	struct clk *vpll_clk;
+	struct clk *grf_clk;
 };
 
 #define to_rockchip_hdmi(x)	container_of(x, struct rockchip_hdmi, x)
@@ -143,6 +164,7 @@ static const struct dw_hdmi_phy_config rockchip_phy_config[] = {
 static int rockchip_hdmi_parse_dt(struct rockchip_hdmi *hdmi)
 {
 	struct device_node *np = hdmi->dev->of_node;
+	int ret;
 
 	hdmi->regmap = syscon_regmap_lookup_by_phandle(np, "rockchip,grf");
 	if (IS_ERR(hdmi->regmap)) {
@@ -150,6 +172,32 @@ static int rockchip_hdmi_parse_dt(struct rockchip_hdmi *hdmi)
 		return PTR_ERR(hdmi->regmap);
 	}
 
+	hdmi->vpll_clk = devm_clk_get(hdmi->dev, "vpll");
+	if (PTR_ERR(hdmi->vpll_clk) == -ENOENT) {
+		hdmi->vpll_clk = NULL;
+	} else if (PTR_ERR(hdmi->vpll_clk) == -EPROBE_DEFER) {
+		return -EPROBE_DEFER;
+	} else if (IS_ERR(hdmi->vpll_clk)) {
+		dev_err(hdmi->dev, "failed to get vpll clock\n");
+		return PTR_ERR(hdmi->vpll_clk);
+	}
+
+	hdmi->grf_clk = devm_clk_get(hdmi->dev, "grf");
+	if (PTR_ERR(hdmi->grf_clk) == -ENOENT) {
+		hdmi->grf_clk = NULL;
+	} else if (PTR_ERR(hdmi->grf_clk) == -EPROBE_DEFER) {
+		return -EPROBE_DEFER;
+	} else if (IS_ERR(hdmi->grf_clk)) {
+		dev_err(hdmi->dev, "failed to get grf clock\n");
+		return PTR_ERR(hdmi->grf_clk);
+	}
+
+	ret = clk_prepare_enable(hdmi->vpll_clk);
+	if (ret) {
+		dev_err(hdmi->dev, "Failed to enable HDMI vpll: %d\n", ret);
+		return ret;
+	}
+
 	return 0;
 }
 
@@ -192,23 +240,36 @@ static void dw_hdmi_rockchip_encoder_mode_set(struct drm_encoder *encoder,
 					      struct drm_display_mode *mode,
 					      struct drm_display_mode *adj_mode)
 {
+	struct rockchip_hdmi *hdmi = to_rockchip_hdmi(encoder);
+
+	clk_set_rate(hdmi->vpll_clk, adj_mode->clock * 1000);
 }
 
 static void dw_hdmi_rockchip_encoder_enable(struct drm_encoder *encoder)
 {
 	struct rockchip_hdmi *hdmi = to_rockchip_hdmi(encoder);
 	u32 val;
-	int mux;
+	int ret;
 
-	mux = drm_of_encoder_active_endpoint_id(hdmi->dev->of_node, encoder);
-	if (mux)
-		val = HDMI_SEL_VOP_LIT | (HDMI_SEL_VOP_LIT << 16);
+	ret = drm_of_encoder_active_endpoint_id(hdmi->dev->of_node, encoder);
+	if (ret)
+		val = hdmi->chip_data->lcdsel_lit;
 	else
-		val = HDMI_SEL_VOP_LIT << 16;
+		val = hdmi->chip_data->lcdsel_big;
 
-	regmap_write(hdmi->regmap, GRF_SOC_CON6, val);
+	ret = clk_prepare_enable(hdmi->grf_clk);
+	if (ret < 0) {
+		dev_err(hdmi->dev, "failed to enable grfclk %d\n", ret);
+		return;
+	}
+
+	ret = regmap_write(hdmi->regmap, hdmi->chip_data->lcdsel_grf_reg, val);
+	if (ret != 0)
+		dev_err(hdmi->dev, "Could not write to GRF: %d\n", ret);
+
+	clk_disable_unprepare(hdmi->grf_clk);
 	dev_dbg(hdmi->dev, "vop %s output to hdmi\n",
-		(mux) ? "LIT" : "BIG");
+		val == hdmi->chip_data->lcdsel_lit ? "LIT" : "BIG");
 }
 
 static int
@@ -232,16 +293,40 @@ static const struct drm_encoder_helper_funcs dw_hdmi_rockchip_encoder_helper_fun
 	.atomic_check = dw_hdmi_rockchip_encoder_atomic_check,
 };
 
-static const struct dw_hdmi_plat_data rockchip_hdmi_drv_data = {
+static struct rockchip_hdmi_chip_data rk3288_chip_data = {
+	.lcdsel_grf_reg = RK3288_GRF_SOC_CON6,
+	.lcdsel_big = HIWORD_UPDATE(0, RK3288_HDMI_LCDC_SEL),
+	.lcdsel_lit = HIWORD_UPDATE(RK3288_HDMI_LCDC_SEL, RK3288_HDMI_LCDC_SEL),
+};
+
+static const struct dw_hdmi_plat_data rk3288_hdmi_drv_data = {
 	.mode_valid = dw_hdmi_rockchip_mode_valid,
 	.mpll_cfg   = rockchip_mpll_cfg,
 	.cur_ctr    = rockchip_cur_ctr,
 	.phy_config = rockchip_phy_config,
+	.phy_data = &rk3288_chip_data,
+};
+
+static struct rockchip_hdmi_chip_data rk3399_chip_data = {
+	.lcdsel_grf_reg = RK3399_GRF_SOC_CON20,
+	.lcdsel_big = HIWORD_UPDATE(0, RK3399_HDMI_LCDC_SEL),
+	.lcdsel_lit = HIWORD_UPDATE(RK3399_HDMI_LCDC_SEL, RK3399_HDMI_LCDC_SEL),
+};
+
+static const struct dw_hdmi_plat_data rk3399_hdmi_drv_data = {
+	.mode_valid = dw_hdmi_rockchip_mode_valid,
+	.mpll_cfg   = rockchip_mpll_cfg,
+	.cur_ctr    = rockchip_cur_ctr,
+	.phy_config = rockchip_phy_config,
+	.phy_data = &rk3399_chip_data,
 };
 
 static const struct of_device_id dw_hdmi_rockchip_dt_ids[] = {
 	{ .compatible = "rockchip,rk3288-dw-hdmi",
-	  .data = &rockchip_hdmi_drv_data
+	  .data = &rk3288_hdmi_drv_data
+	},
+	{ .compatible = "rockchip,rk3399-dw-hdmi",
+	  .data = &rk3399_hdmi_drv_data
 	},
 	{},
 };
@@ -268,6 +353,7 @@ static int dw_hdmi_rockchip_bind(struct device *dev, struct device *master,
 	match = of_match_node(dw_hdmi_rockchip_dt_ids, pdev->dev.of_node);
 	plat_data = match->data;
 	hdmi->dev = &pdev->dev;
+	hdmi->chip_data = plat_data->phy_data;
 	encoder = &hdmi->encoder;
 
 	encoder->possible_crtcs = drm_of_find_possible_crtcs(drm, dev->of_node);
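Note on HIWORD_UPDATE() in the dw_hdmi-rockchip.c hunks above: the macro
places a write-enable mask in the upper 16 bits of the word and the new value
in the lower 16 bits, so a single register write can change one field without
a read-modify-write cycle. The sketch below models that behaviour in
userspace. masked_write() is a hypothetical model of the register, not a
kernel or regmap function, and BIT() is redefined locally.

```c
#include <stdint.h>

#define BIT(n)                   (1u << (n))
#define HIWORD_UPDATE(val, mask) ((val) | (mask) << 16)

/*
 * Model of a write-masked register: the upper 16 bits of the written word
 * select which of the lower 16 bits are updated. Unselected bits keep their
 * previous value.
 */
static uint16_t masked_write(uint16_t reg, uint32_t word)
{
	uint16_t mask = (uint16_t)(word >> 16);
	uint16_t val = (uint16_t)(word & 0xffff);

	return (uint16_t)((reg & ~mask) | (val & mask));
}
```

Under this model, HIWORD_UPDATE(BIT(4), BIT(4)) sets bit 4 and
HIWORD_UPDATE(0, BIT(4)) clears it, leaving all other bits untouched. This
mirrors how the lcdsel_lit and lcdsel_big values are built for the RK3288 and
RK3399 entries.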
diff --git a/drivers/gpu/drm/rockchip/inno_hdmi.c b/drivers/gpu/drm/rockchip/inno_hdmi.c
index 7d9b75e..7a251a5 100644
--- a/drivers/gpu/drm/rockchip/inno_hdmi.c
+++ b/drivers/gpu/drm/rockchip/inno_hdmi.c
@@ -294,7 +294,7 @@ static int inno_hdmi_config_video_avi(struct inno_hdmi *hdmi,
 	union hdmi_infoframe frame;
 	int rc;
 
-	rc = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode);
+	rc = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
 
 	if (hdmi->hdmi_data.enc_out_format == HDMI_COLORSPACE_YUV444)
 		frame.avi.colorspace = HDMI_COLORSPACE_YUV444;
@@ -592,8 +592,7 @@ static void inno_hdmi_connector_destroy(struct drm_connector *connector)
 	drm_connector_cleanup(connector);
 }
 
-static struct drm_connector_funcs inno_hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
+static const struct drm_connector_funcs inno_hdmi_connector_funcs = {
 	.fill_modes = inno_hdmi_probe_single_connector_modes,
 	.detect = inno_hdmi_connector_detect,
 	.destroy = inno_hdmi_connector_destroy,
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
index c16bc0a..ff3d0f5 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
@@ -161,23 +161,21 @@ static int rockchip_drm_bind(struct device *dev)
 	 */
 	drm_dev->irq_enabled = true;
 
+	ret = rockchip_drm_fbdev_init(drm_dev);
+	if (ret)
+		goto err_unbind_all;
+
 	/* init kms poll for handling hpd */
 	drm_kms_helper_poll_init(drm_dev);
 
-	ret = rockchip_drm_fbdev_init(drm_dev);
+	ret = drm_dev_register(drm_dev, 0);
 	if (ret)
 		goto err_kms_helper_poll_fini;
 
-	ret = drm_dev_register(drm_dev, 0);
-	if (ret)
-		goto err_fbdev_fini;
-
 	return 0;
-err_fbdev_fini:
-	rockchip_drm_fbdev_fini(drm_dev);
 err_kms_helper_poll_fini:
 	drm_kms_helper_poll_fini(drm_dev);
-	drm_vblank_cleanup(drm_dev);
+	rockchip_drm_fbdev_fini(drm_dev);
 err_unbind_all:
 	component_unbind_all(dev, drm_dev);
 err_mode_config_cleanup:
@@ -200,7 +198,6 @@ static void rockchip_drm_unbind(struct device *dev)
 	drm_kms_helper_poll_fini(drm_dev);
 
 	drm_atomic_helper_shutdown(drm_dev);
-	drm_vblank_cleanup(drm_dev);
 	component_unbind_all(dev, drm_dev);
 	drm_mode_config_cleanup(drm_dev);
 	rockchip_iommu_cleanup(drm_dev);
@@ -235,8 +232,6 @@ static struct drm_driver rockchip_drm_driver = {
 	.gem_vm_ops		= &drm_gem_cma_vm_ops,
 	.gem_free_object_unlocked = rockchip_gem_free_object,
 	.dumb_create		= rockchip_gem_dumb_create,
-	.dumb_map_offset	= rockchip_gem_dumb_map_offset,
-	.dumb_destroy		= drm_gem_dumb_destroy,
 	.prime_handle_to_fd	= drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle	= drm_gem_prime_fd_to_handle,
 	.gem_prime_import	= drm_gem_prime_import,
@@ -378,8 +373,8 @@ static int rockchip_drm_platform_of_probe(struct device *dev)
 
 		iommu = of_parse_phandle(port->parent, "iommus", 0);
 		if (!iommu || !of_device_is_available(iommu->parent)) {
-			dev_dbg(dev, "no iommu attached for %s, using non-iommu buffers\n",
-				port->parent->full_name);
+			dev_dbg(dev, "no iommu attached for %pOF, using non-iommu buffers\n",
+				port->parent);
 			/*
 			 * if there is a crtc not support iommu, force set all
 			 * crtc use non-iommu buffer.
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
index 81f9548..7077304 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_fb.c
@@ -48,7 +48,7 @@ static void rockchip_drm_fb_destroy(struct drm_framebuffer *fb)
 	int i;
 
 	for (i = 0; i < ROCKCHIP_MAX_FB_BUFFER; i++)
-		drm_gem_object_unreference_unlocked(rockchip_fb->obj[i]);
+		drm_gem_object_put_unlocked(rockchip_fb->obj[i]);
 
 	drm_framebuffer_cleanup(fb);
 	kfree(rockchip_fb);
@@ -144,7 +144,7 @@ rockchip_user_fb_create(struct drm_device *dev, struct drm_file *file_priv,
 			width * drm_format_plane_cpp(mode_cmd->pixel_format, i);
 
 		if (obj->size < min_size) {
-			drm_gem_object_unreference_unlocked(obj);
+			drm_gem_object_put_unlocked(obj);
 			ret = -EINVAL;
 			goto err_gem_object_unreference;
 		}
@@ -161,40 +161,19 @@ rockchip_user_fb_create(struct drm_device *dev, struct drm_file *file_priv,
 
 err_gem_object_unreference:
 	for (i--; i >= 0; i--)
-		drm_gem_object_unreference_unlocked(objs[i]);
+		drm_gem_object_put_unlocked(objs[i]);
 	return ERR_PTR(ret);
 }
 
 static void rockchip_drm_output_poll_changed(struct drm_device *dev)
 {
 	struct rockchip_drm_private *private = dev->dev_private;
-	struct drm_fb_helper *fb_helper = &private->fbdev_helper;
 
-	if (fb_helper)
-		drm_fb_helper_hotplug_event(fb_helper);
-}
-
-static void
-rockchip_atomic_commit_tail(struct drm_atomic_state *state)
-{
-	struct drm_device *dev = state->dev;
-
-	drm_atomic_helper_commit_modeset_disables(dev, state);
-
-	drm_atomic_helper_commit_modeset_enables(dev, state);
-
-	drm_atomic_helper_commit_planes(dev, state,
-					DRM_PLANE_COMMIT_ACTIVE_ONLY);
-
-	drm_atomic_helper_commit_hw_done(state);
-
-	drm_atomic_helper_wait_for_vblanks(dev, state);
-
-	drm_atomic_helper_cleanup_planes(dev, state);
+	drm_fb_helper_hotplug_event(&private->fbdev_helper);
 }
 
 static const struct drm_mode_config_helper_funcs rockchip_mode_config_helpers = {
-	.atomic_commit_tail = rockchip_atomic_commit_tail,
+	.atomic_commit_tail = drm_atomic_helper_commit_tail_rpm,
 };
 
 static const struct drm_mode_config_funcs rockchip_drm_mode_config_funcs = {
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c b/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
index ce946b9..724579e 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_fbdev.c
@@ -173,7 +173,7 @@ void rockchip_drm_fbdev_fini(struct drm_device *dev)
 	drm_fb_helper_unregister_fbi(helper);
 
 	if (helper->fb)
-		drm_framebuffer_unreference(helper->fb);
+		drm_framebuffer_put(helper->fb);
 
 	drm_fb_helper_fini(helper);
 }
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
index b74ac71..1869c8b 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.c
@@ -383,7 +383,7 @@ rockchip_gem_create_with_handle(struct drm_file *file_priv,
 		goto err_handle_create;
 
 	/* drop reference from allocate - handle holds it now. */
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return rk_obj;
 
@@ -393,32 +393,6 @@ rockchip_gem_create_with_handle(struct drm_file *file_priv,
 	return ERR_PTR(ret);
 }
 
-int rockchip_gem_dumb_map_offset(struct drm_file *file_priv,
-				 struct drm_device *dev, uint32_t handle,
-				 uint64_t *offset)
-{
-	struct drm_gem_object *obj;
-	int ret;
-
-	obj = drm_gem_object_lookup(file_priv, handle);
-	if (!obj) {
-		DRM_ERROR("failed to lookup gem object.\n");
-		return -EINVAL;
-	}
-
-	ret = drm_gem_create_mmap_offset(obj);
-	if (ret)
-		goto out;
-
-	*offset = drm_vma_node_offset_addr(&obj->vma_node);
-	DRM_DEBUG_KMS("offset = 0x%llx\n", *offset);
-
-out:
-	drm_gem_object_unreference_unlocked(obj);
-
-	return 0;
-}
-
 /*
  * rockchip_gem_dumb_create - (struct drm_driver)->dumb_create callback
  * function
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_gem.h b/drivers/gpu/drm/rockchip/rockchip_drm_gem.h
index 3f6ea4d..f237375 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_gem.h
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_gem.h
@@ -57,7 +57,4 @@ void rockchip_gem_free_object(struct drm_gem_object *obj);
 int rockchip_gem_dumb_create(struct drm_file *file_priv,
 			     struct drm_device *dev,
 			     struct drm_mode_create_dumb *args);
-int rockchip_gem_dumb_map_offset(struct drm_file *file_priv,
-				 struct drm_device *dev, uint32_t handle,
-				 uint64_t *offset);
 #endif /* _ROCKCHIP_DRM_GEM_H */
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
index 2900f14..bf9ed0e 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.c
@@ -42,33 +42,20 @@
 #include "rockchip_drm_psr.h"
 #include "rockchip_drm_vop.h"
 
-#define __REG_SET_RELAXED(x, off, mask, shift, v, write_mask) \
-		vop_mask_write(x, off, mask, shift, v, write_mask, true)
-
-#define __REG_SET_NORMAL(x, off, mask, shift, v, write_mask) \
-		vop_mask_write(x, off, mask, shift, v, write_mask, false)
-
-#define REG_SET(x, base, reg, v, mode) \
-		__REG_SET_##mode(x, base + reg.offset, \
-				 reg.mask, reg.shift, v, reg.write_mask)
-#define REG_SET_MASK(x, base, reg, mask, v, mode) \
-		__REG_SET_##mode(x, base + reg.offset, \
-				 mask, reg.shift, v, reg.write_mask)
-
 #define VOP_WIN_SET(x, win, name, v) \
-		REG_SET(x, win->base, win->phy->name, v, RELAXED)
+		vop_reg_set(vop, &win->phy->name, win->base, ~0, v, #name)
 #define VOP_SCL_SET(x, win, name, v) \
-		REG_SET(x, win->base, win->phy->scl->name, v, RELAXED)
+		vop_reg_set(vop, &win->phy->scl->name, win->base, ~0, v, #name)
 #define VOP_SCL_SET_EXT(x, win, name, v) \
-		REG_SET(x, win->base, win->phy->scl->ext->name, v, RELAXED)
-#define VOP_CTRL_SET(x, name, v) \
-		REG_SET(x, 0, (x)->data->ctrl->name, v, NORMAL)
+		vop_reg_set(vop, &win->phy->scl->ext->name, \
+			    win->base, ~0, v, #name)
 
-#define VOP_INTR_GET(vop, name) \
-		vop_read_reg(vop, 0, &vop->data->ctrl->name)
+#define VOP_INTR_SET_MASK(vop, name, mask, v) \
+		vop_reg_set(vop, &vop->data->intr->name, 0, mask, v, #name)
 
-#define VOP_INTR_SET(vop, name, mask, v) \
-		REG_SET_MASK(vop, 0, vop->data->intr->name, mask, v, NORMAL)
+#define VOP_REG_SET(vop, group, name, v) \
+		    vop_reg_set(vop, &vop->data->group->name, 0, ~0, v, #name)
+
 #define VOP_INTR_SET_TYPE(vop, name, type, v) \
 	do { \
 		int i, reg = 0, mask = 0; \
@@ -78,13 +65,13 @@
 				mask |= 1 << i; \
 			} \
 		} \
-		VOP_INTR_SET(vop, name, mask, reg); \
+		VOP_INTR_SET_MASK(vop, name, mask, reg); \
 	} while (0)
 #define VOP_INTR_GET_TYPE(vop, name, type) \
 		vop_get_intr_type(vop, &vop->data->intr->name, type)
 
 #define VOP_WIN_GET(x, win, name) \
-		vop_read_reg(x, win->base, &win->phy->name)
+		vop_read_reg(x, win->offset, win->phy->name)
 
 #define VOP_WIN_GET_YRGBADDR(vop, win) \
 		vop_readl(vop, win->base + win->phy->yrgb_mst.offset)
@@ -166,14 +153,22 @@ static inline uint32_t vop_read_reg(struct vop *vop, uint32_t base,
 	return (vop_readl(vop, base + reg->offset) >> reg->shift) & reg->mask;
 }
 
-static inline void vop_mask_write(struct vop *vop, uint32_t offset,
-				  uint32_t mask, uint32_t shift, uint32_t v,
-				  bool write_mask, bool relaxed)
+static void vop_reg_set(struct vop *vop, const struct vop_reg *reg,
+			uint32_t _offset, uint32_t _mask, uint32_t v,
+			const char *reg_name)
 {
-	if (!mask)
-		return;
+	int offset, mask, shift;
 
-	if (write_mask) {
+	if (!reg || !reg->mask) {
+		dev_dbg(vop->dev, "Warning: not support %s\n", reg_name);
+		return;
+	}
+
+	offset = reg->offset + _offset;
+	mask = reg->mask & _mask;
+	shift = reg->shift;
+
+	if (reg->write_mask) {
 		v = ((v << shift) & 0xffff) | (mask << (shift + 16));
 	} else {
 		uint32_t cached_val = vop->regsbak[offset >> 2];
@@ -182,7 +177,7 @@ static inline void vop_mask_write(struct vop *vop, uint32_t offset,
 		vop->regsbak[offset >> 2] = v;
 	}
 
-	if (relaxed)
+	if (reg->relaxed)
 		writel_relaxed(v, vop->regs + offset);
 	else
 		writel(v, vop->regs + offset);
@@ -204,7 +199,7 @@ static inline uint32_t vop_get_intr_type(struct vop *vop,
 
 static inline void vop_cfg_done(struct vop *vop)
 {
-	VOP_CTRL_SET(vop, cfg_done, 1);
+	VOP_REG_SET(vop, common, cfg_done, 1);
 }
 
 static bool has_rb_swapped(uint32_t format)
@@ -556,7 +551,7 @@ static int vop_enable(struct drm_crtc *crtc)
 
 	spin_lock(&vop->reg_lock);
 
-	VOP_CTRL_SET(vop, standby, 0);
+	VOP_REG_SET(vop, common, standby, 1);
 
 	spin_unlock(&vop->reg_lock);
 
@@ -577,7 +572,8 @@ static int vop_enable(struct drm_crtc *crtc)
 	return ret;
 }
 
-static void vop_crtc_disable(struct drm_crtc *crtc)
+static void vop_crtc_atomic_disable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct vop *vop = to_vop(crtc);
 
@@ -599,7 +595,7 @@ static void vop_crtc_disable(struct drm_crtc *crtc)
 
 	spin_lock(&vop->reg_lock);
 
-	VOP_CTRL_SET(vop, standby, 1);
+	VOP_REG_SET(vop, common, standby, 1);
 
 	spin_unlock(&vop->reg_lock);
 
@@ -870,7 +866,8 @@ static bool vop_crtc_mode_fixup(struct drm_crtc *crtc,
 	return true;
 }
 
-static void vop_crtc_enable(struct drm_crtc *crtc)
+static void vop_crtc_atomic_enable(struct drm_crtc *crtc,
+				   struct drm_crtc_state *old_state)
 {
 	struct vop *vop = to_vop(crtc);
 	const struct vop_data *vop_data = vop->data;
@@ -897,70 +894,34 @@ static void vop_crtc_enable(struct drm_crtc *crtc)
 		return;
 	}
 
-	/*
-	 * If dclk rate is zero, mean that scanout is stop,
-	 * we don't need wait any more.
-	 */
-	if (clk_get_rate(vop->dclk)) {
-		/*
-		 * Rk3288 vop timing register is immediately, when configure
-		 * display timing on display time, may cause tearing.
-		 *
-		 * Vop standby will take effect at end of current frame,
-		 * if dsp hold valid irq happen, it means standby complete.
-		 *
-		 * mode set:
-		 *    standby and wait complete --> |----
-		 *                                  | display time
-		 *                                  |----
-		 *                                  |---> dsp hold irq
-		 *     configure display timing --> |
-		 *         standby exit             |
-		 *                                  | new frame start.
-		 */
-
-		reinit_completion(&vop->dsp_hold_completion);
-		vop_dsp_hold_valid_irq_enable(vop);
-
-		spin_lock(&vop->reg_lock);
-
-		VOP_CTRL_SET(vop, standby, 1);
-
-		spin_unlock(&vop->reg_lock);
-
-		wait_for_completion(&vop->dsp_hold_completion);
-
-		vop_dsp_hold_valid_irq_disable(vop);
-	}
-
 	pin_pol = BIT(DCLK_INVERT);
 	pin_pol |= (adjusted_mode->flags & DRM_MODE_FLAG_PHSYNC) ?
 		   BIT(HSYNC_POSITIVE) : 0;
 	pin_pol |= (adjusted_mode->flags & DRM_MODE_FLAG_PVSYNC) ?
 		   BIT(VSYNC_POSITIVE) : 0;
-	VOP_CTRL_SET(vop, pin_pol, pin_pol);
+	VOP_REG_SET(vop, output, pin_pol, pin_pol);
 
 	switch (s->output_type) {
 	case DRM_MODE_CONNECTOR_LVDS:
-		VOP_CTRL_SET(vop, rgb_en, 1);
-		VOP_CTRL_SET(vop, rgb_pin_pol, pin_pol);
+		VOP_REG_SET(vop, output, rgb_en, 1);
+		VOP_REG_SET(vop, output, rgb_pin_pol, pin_pol);
 		break;
 	case DRM_MODE_CONNECTOR_eDP:
-		VOP_CTRL_SET(vop, edp_pin_pol, pin_pol);
-		VOP_CTRL_SET(vop, edp_en, 1);
+		VOP_REG_SET(vop, output, edp_pin_pol, pin_pol);
+		VOP_REG_SET(vop, output, edp_en, 1);
 		break;
 	case DRM_MODE_CONNECTOR_HDMIA:
-		VOP_CTRL_SET(vop, hdmi_pin_pol, pin_pol);
-		VOP_CTRL_SET(vop, hdmi_en, 1);
+		VOP_REG_SET(vop, output, hdmi_pin_pol, pin_pol);
+		VOP_REG_SET(vop, output, hdmi_en, 1);
 		break;
 	case DRM_MODE_CONNECTOR_DSI:
-		VOP_CTRL_SET(vop, mipi_pin_pol, pin_pol);
-		VOP_CTRL_SET(vop, mipi_en, 1);
+		VOP_REG_SET(vop, output, mipi_pin_pol, pin_pol);
+		VOP_REG_SET(vop, output, mipi_en, 1);
 		break;
 	case DRM_MODE_CONNECTOR_DisplayPort:
 		pin_pol &= ~BIT(DCLK_INVERT);
-		VOP_CTRL_SET(vop, dp_pin_pol, pin_pol);
-		VOP_CTRL_SET(vop, dp_en, 1);
+		VOP_REG_SET(vop, output, dp_pin_pol, pin_pol);
+		VOP_REG_SET(vop, output, dp_en, 1);
 		break;
 	default:
 		DRM_DEV_ERROR(vop->dev, "unsupported connector_type [%d]\n",
@@ -973,25 +934,25 @@ static void vop_crtc_enable(struct drm_crtc *crtc)
 	if (s->output_mode == ROCKCHIP_OUT_MODE_AAAA &&
 	    !(vop_data->feature & VOP_FEATURE_OUTPUT_RGB10))
 		s->output_mode = ROCKCHIP_OUT_MODE_P888;
-	VOP_CTRL_SET(vop, out_mode, s->output_mode);
+	VOP_REG_SET(vop, common, out_mode, s->output_mode);
 
-	VOP_CTRL_SET(vop, htotal_pw, (htotal << 16) | hsync_len);
+	VOP_REG_SET(vop, modeset, htotal_pw, (htotal << 16) | hsync_len);
 	val = hact_st << 16;
 	val |= hact_end;
-	VOP_CTRL_SET(vop, hact_st_end, val);
-	VOP_CTRL_SET(vop, hpost_st_end, val);
+	VOP_REG_SET(vop, modeset, hact_st_end, val);
+	VOP_REG_SET(vop, modeset, hpost_st_end, val);
 
-	VOP_CTRL_SET(vop, vtotal_pw, (vtotal << 16) | vsync_len);
+	VOP_REG_SET(vop, modeset, vtotal_pw, (vtotal << 16) | vsync_len);
 	val = vact_st << 16;
 	val |= vact_end;
-	VOP_CTRL_SET(vop, vact_st_end, val);
-	VOP_CTRL_SET(vop, vpost_st_end, val);
+	VOP_REG_SET(vop, modeset, vact_st_end, val);
+	VOP_REG_SET(vop, modeset, vpost_st_end, val);
 
-	VOP_CTRL_SET(vop, line_flag_num[0], vact_end);
+	VOP_REG_SET(vop, intr, line_flag_num[0], vact_end);
 
 	clk_set_rate(vop->dclk, adjusted_mode->clock * 1000);
 
-	VOP_CTRL_SET(vop, standby, 0);
+	VOP_REG_SET(vop, common, standby, 0);
 
 	rockchip_drm_psr_activate(&vop->crtc);
 }
@@ -1026,7 +987,7 @@ static void vop_crtc_atomic_flush(struct drm_crtc *crtc,
 				  struct drm_crtc_state *old_crtc_state)
 {
 	struct drm_atomic_state *old_state = old_crtc_state->state;
-	struct drm_plane_state *old_plane_state;
+	struct drm_plane_state *old_plane_state, *new_plane_state;
 	struct vop *vop = to_vop(crtc);
 	struct drm_plane *plane;
 	int i;
@@ -1057,14 +1018,15 @@ static void vop_crtc_atomic_flush(struct drm_crtc *crtc,
 	}
 	spin_unlock_irq(&crtc->dev->event_lock);
 
-	for_each_plane_in_state(old_state, plane, old_plane_state, i) {
+	for_each_oldnew_plane_in_state(old_state, plane, old_plane_state,
+				       new_plane_state, i) {
 		if (!old_plane_state->fb)
 			continue;
 
-		if (old_plane_state->fb == plane->state->fb)
+		if (old_plane_state->fb == new_plane_state->fb)
 			continue;
 
-		drm_framebuffer_reference(old_plane_state->fb);
+		drm_framebuffer_get(old_plane_state->fb);
 		drm_flip_work_queue(&vop->fb_unref_work, old_plane_state->fb);
 		set_bit(VOP_PENDING_FB_UNREF, &vop->pending);
 		WARN_ON(drm_crtc_vblank_get(crtc) != 0);
@@ -1078,11 +1040,11 @@ static void vop_crtc_atomic_begin(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs vop_crtc_helper_funcs = {
-	.enable = vop_crtc_enable,
-	.disable = vop_crtc_disable,
 	.mode_fixup = vop_crtc_mode_fixup,
 	.atomic_flush = vop_crtc_atomic_flush,
 	.atomic_begin = vop_crtc_atomic_begin,
+	.atomic_enable = vop_crtc_atomic_enable,
+	.atomic_disable = vop_crtc_atomic_disable,
 };
 
 static void vop_crtc_destroy(struct drm_crtc *crtc)
@@ -1188,7 +1150,7 @@ static void vop_fb_unref_worker(struct drm_flip_work *work, void *val)
 	struct drm_framebuffer *fb = val;
 
 	drm_crtc_vblank_put(&vop->crtc);
-	drm_framebuffer_unreference(fb);
+	drm_framebuffer_put(fb);
 }
 
 static void vop_handle_vblank(struct vop *vop)
@@ -1289,7 +1251,7 @@ static int vop_create_crtc(struct vop *vop)
 					       0, &vop_plane_funcs,
 					       win_data->phy->data_formats,
 					       win_data->phy->nformats,
-					       win_data->type, NULL);
+					       NULL, win_data->type, NULL);
 		if (ret) {
 			DRM_DEV_ERROR(vop->dev, "failed to init plane %d\n",
 				      ret);
@@ -1328,7 +1290,7 @@ static int vop_create_crtc(struct vop *vop)
 					       &vop_plane_funcs,
 					       win_data->phy->data_formats,
 					       win_data->phy->nformats,
-					       win_data->type, NULL);
+					       NULL, win_data->type, NULL);
 		if (ret) {
 			DRM_DEV_ERROR(vop->dev, "failed to init overlay %d\n",
 				      ret);
@@ -1339,8 +1301,8 @@ static int vop_create_crtc(struct vop *vop)
 
 	port = of_get_child_by_name(dev->of_node, "port");
 	if (!port) {
-		DRM_DEV_ERROR(vop->dev, "no port node found in %s\n",
-			      dev->of_node->full_name);
+		DRM_DEV_ERROR(vop->dev, "no port node found in %pOF\n",
+			      dev->of_node);
 		ret = -ENOENT;
 		goto err_cleanup_crtc;
 	}
@@ -1394,7 +1356,6 @@ static void vop_destroy_crtc(struct vop *vop)
 static int vop_initial(struct vop *vop)
 {
 	const struct vop_data *vop_data = vop->data;
-	const struct vop_reg_data *init_table = vop_data->init_table;
 	struct reset_control *ahb_rst;
 	int i, ret;
 
@@ -1454,13 +1415,16 @@ static int vop_initial(struct vop *vop)
 
 	memcpy(vop->regsbak, vop->regs, vop->len);
 
-	for (i = 0; i < vop_data->table_size; i++)
-		vop_writel(vop, init_table[i].offset, init_table[i].value);
+	VOP_REG_SET(vop, misc, global_regdone_en, 1);
+	VOP_REG_SET(vop, common, dsp_blank, 0);
 
 	for (i = 0; i < vop_data->win_size; i++) {
 		const struct vop_win_data *win = &vop_data->win[i];
+		int channel = i * 2 + 1;
 
+		VOP_WIN_SET(vop, win, channel, (channel + 1) << 4 | channel);
 		VOP_WIN_SET(vop, win, enable, 0);
+		VOP_WIN_SET(vop, win, gate, 1);
 	}
 
 	vop_cfg_done(vop);
diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_vop.h b/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
index 27eefbf..56bbd2e 100644
--- a/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
+++ b/drivers/gpu/drm/rockchip/rockchip_drm_vop.h
@@ -15,6 +15,14 @@
 #ifndef _ROCKCHIP_DRM_VOP_H
 #define _ROCKCHIP_DRM_VOP_H
 
+/*
+ * major: IP major version, used for IP structure
+ * minor: big feature change under same structure
+ */
+#define VOP_VERSION(major, minor)	((major) << 8 | (minor))
+#define VOP_MAJOR(version)		((version) >> 8)
+#define VOP_MINOR(version)		((version) & 0xff)
+
 enum vop_data_format {
 	VOP_FMT_ARGB8888 = 0,
 	VOP_FMT_RGB888,
@@ -24,53 +32,58 @@ enum vop_data_format {
 	VOP_FMT_YUV444SP,
 };
 
-struct vop_reg_data {
-	uint32_t offset;
-	uint32_t value;
-};
-
 struct vop_reg {
-	uint32_t offset;
-	uint32_t shift;
 	uint32_t mask;
+	uint16_t offset;
+	uint8_t shift;
 	bool write_mask;
+	bool relaxed;
 };
 
-struct vop_ctrl {
-	struct vop_reg standby;
-	struct vop_reg data_blank;
-	struct vop_reg gate_en;
-	struct vop_reg mmu_en;
-	struct vop_reg rgb_en;
+struct vop_modeset {
+	struct vop_reg htotal_pw;
+	struct vop_reg hact_st_end;
+	struct vop_reg hpost_st_end;
+	struct vop_reg vtotal_pw;
+	struct vop_reg vact_st_end;
+	struct vop_reg vpost_st_end;
+};
+
+struct vop_output {
+	struct vop_reg pin_pol;
+	struct vop_reg dp_pin_pol;
+	struct vop_reg edp_pin_pol;
+	struct vop_reg hdmi_pin_pol;
+	struct vop_reg mipi_pin_pol;
+	struct vop_reg rgb_pin_pol;
+	struct vop_reg dp_en;
 	struct vop_reg edp_en;
 	struct vop_reg hdmi_en;
 	struct vop_reg mipi_en;
-	struct vop_reg dp_en;
-	struct vop_reg out_mode;
+	struct vop_reg rgb_en;
+};
+
+struct vop_common {
+	struct vop_reg cfg_done;
+	struct vop_reg dsp_blank;
+	struct vop_reg data_blank;
 	struct vop_reg dither_down;
 	struct vop_reg dither_up;
-	struct vop_reg pin_pol;
-	struct vop_reg rgb_pin_pol;
-	struct vop_reg hdmi_pin_pol;
-	struct vop_reg edp_pin_pol;
-	struct vop_reg mipi_pin_pol;
-	struct vop_reg dp_pin_pol;
+	struct vop_reg gate_en;
+	struct vop_reg mmu_en;
+	struct vop_reg out_mode;
+	struct vop_reg standby;
+};
 
-	struct vop_reg htotal_pw;
-	struct vop_reg hact_st_end;
-	struct vop_reg vtotal_pw;
-	struct vop_reg vact_st_end;
-	struct vop_reg hpost_st_end;
-	struct vop_reg vpost_st_end;
-
-	struct vop_reg line_flag_num[2];
-
-	struct vop_reg cfg_done;
+struct vop_misc {
+	struct vop_reg global_regdone_en;
 };
 
 struct vop_intr {
 	const int *intrs;
 	uint32_t nintrs;
+
+	struct vop_reg line_flag_num[2];
 	struct vop_reg enable;
 	struct vop_reg clear;
 	struct vop_reg status;
@@ -115,6 +128,7 @@ struct vop_win_phy {
 	uint32_t nformats;
 
 	struct vop_reg enable;
+	struct vop_reg gate;
 	struct vop_reg format;
 	struct vop_reg rb_swap;
 	struct vop_reg act_info;
@@ -127,6 +141,7 @@ struct vop_win_phy {
 
 	struct vop_reg dst_alpha_ctl;
 	struct vop_reg src_alpha_ctl;
+	struct vop_reg channel;
 };
 
 struct vop_win_data {
@@ -136,10 +151,12 @@ struct vop_win_data {
 };
 
 struct vop_data {
-	const struct vop_reg_data *init_table;
-	unsigned int table_size;
-	const struct vop_ctrl *ctrl;
+	uint32_t version;
 	const struct vop_intr *intr;
+	const struct vop_common *common;
+	const struct vop_misc *misc;
+	const struct vop_modeset *modeset;
+	const struct vop_output *output;
 	const struct vop_win_data *win;
 	unsigned int win_size;
 
diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
index bafd698..94de7b9 100644
--- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
+++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.c
@@ -20,17 +20,23 @@
 #include "rockchip_drm_vop.h"
 #include "rockchip_vop_reg.h"
 
-#define VOP_REG(off, _mask, s) \
-		{.offset = off, \
+#define _VOP_REG(off, _mask, _shift, _write_mask, _relaxed) \
+		{ \
+		 .offset = off, \
 		 .mask = _mask, \
-		 .shift = s, \
-		 .write_mask = false,}
+		 .shift = _shift, \
+		 .write_mask = _write_mask, \
+		 .relaxed = _relaxed, \
+		}
 
-#define VOP_REG_MASK(off, _mask, s) \
-		{.offset = off, \
-		 .mask = _mask, \
-		 .shift = s, \
-		 .write_mask = true,}
+#define VOP_REG(off, _mask, _shift) \
+		_VOP_REG(off, _mask, _shift, false, true)
+
+#define VOP_REG_SYNC(off, _mask, _shift) \
+		_VOP_REG(off, _mask, _shift, false, false)
+
+#define VOP_REG_MASK_SYNC(off, _mask, _shift) \
+		_VOP_REG(off, _mask, _shift, true, false)
 
 static const uint32_t formats_win_full[] = {
 	DRM_FORMAT_XRGB8888,
@@ -110,32 +116,35 @@ static const int rk3036_vop_intrs[] = {
 static const struct vop_intr rk3036_intr = {
 	.intrs = rk3036_vop_intrs,
 	.nintrs = ARRAY_SIZE(rk3036_vop_intrs),
-	.status = VOP_REG(RK3036_INT_STATUS, 0xf, 0),
-	.enable = VOP_REG(RK3036_INT_STATUS, 0xf, 4),
-	.clear = VOP_REG(RK3036_INT_STATUS, 0xf, 8),
+	.line_flag_num[0] = VOP_REG(RK3036_INT_STATUS, 0xfff, 12),
+	.status = VOP_REG_SYNC(RK3036_INT_STATUS, 0xf, 0),
+	.enable = VOP_REG_SYNC(RK3036_INT_STATUS, 0xf, 4),
+	.clear = VOP_REG_SYNC(RK3036_INT_STATUS, 0xf, 8),
 };
 
-static const struct vop_ctrl rk3036_ctrl_data = {
-	.standby = VOP_REG(RK3036_SYS_CTRL, 0x1, 30),
-	.out_mode = VOP_REG(RK3036_DSP_CTRL0, 0xf, 0),
-	.pin_pol = VOP_REG(RK3036_DSP_CTRL0, 0xf, 4),
+static const struct vop_modeset rk3036_modeset = {
 	.htotal_pw = VOP_REG(RK3036_DSP_HTOTAL_HS_END, 0x1fff1fff, 0),
 	.hact_st_end = VOP_REG(RK3036_DSP_HACT_ST_END, 0x1fff1fff, 0),
 	.vtotal_pw = VOP_REG(RK3036_DSP_VTOTAL_VS_END, 0x1fff1fff, 0),
 	.vact_st_end = VOP_REG(RK3036_DSP_VACT_ST_END, 0x1fff1fff, 0),
-	.line_flag_num[0] = VOP_REG(RK3036_INT_STATUS, 0xfff, 12),
-	.cfg_done = VOP_REG(RK3036_REG_CFG_DONE, 0x1, 0),
 };
 
-static const struct vop_reg_data rk3036_vop_init_reg_table[] = {
-	{RK3036_DSP_CTRL1, 0x00000000},
+static const struct vop_output rk3036_output = {
+	.pin_pol = VOP_REG(RK3036_DSP_CTRL0, 0xf, 4),
+};
+
+static const struct vop_common rk3036_common = {
+	.standby = VOP_REG_SYNC(RK3036_SYS_CTRL, 0x1, 30),
+	.out_mode = VOP_REG(RK3036_DSP_CTRL0, 0xf, 0),
+	.dsp_blank = VOP_REG(RK3036_DSP_CTRL1, 0x1, 24),
+	.cfg_done = VOP_REG_SYNC(RK3036_REG_CFG_DONE, 0x1, 0),
 };
 
 static const struct vop_data rk3036_vop = {
-	.init_table = rk3036_vop_init_reg_table,
-	.table_size = ARRAY_SIZE(rk3036_vop_init_reg_table),
-	.ctrl = &rk3036_ctrl_data,
 	.intr = &rk3036_intr,
+	.common = &rk3036_common,
+	.modeset = &rk3036_modeset,
+	.output = &rk3036_output,
 	.win = rk3036_vop_win_data,
 	.win_size = ARRAY_SIZE(rk3036_vop_win_data),
 };
@@ -188,12 +197,14 @@ static const struct vop_win_phy rk3288_win01_data = {
 	.uv_vir = VOP_REG(RK3288_WIN0_VIR, 0x3fff, 16),
 	.src_alpha_ctl = VOP_REG(RK3288_WIN0_SRC_ALPHA_CTRL, 0xff, 0),
 	.dst_alpha_ctl = VOP_REG(RK3288_WIN0_DST_ALPHA_CTRL, 0xff, 0),
+	.channel = VOP_REG(RK3288_WIN0_CTRL2, 0xff, 0),
 };
 
 static const struct vop_win_phy rk3288_win23_data = {
 	.data_formats = formats_win_lite,
 	.nformats = ARRAY_SIZE(formats_win_lite),
-	.enable = VOP_REG(RK3288_WIN2_CTRL0, 0x1, 0),
+	.enable = VOP_REG(RK3288_WIN2_CTRL0, 0x1, 4),
+	.gate = VOP_REG(RK3288_WIN2_CTRL0, 0x1, 0),
 	.format = VOP_REG(RK3288_WIN2_CTRL0, 0x7, 1),
 	.rb_swap = VOP_REG(RK3288_WIN2_CTRL0, 0x1, 12),
 	.dsp_info = VOP_REG(RK3288_WIN2_DSP_INFO0, 0x0fff0fff, 0),
@@ -204,40 +215,33 @@ static const struct vop_win_phy rk3288_win23_data = {
 	.dst_alpha_ctl = VOP_REG(RK3288_WIN2_DST_ALPHA_CTRL, 0xff, 0),
 };
 
-static const struct vop_ctrl rk3288_ctrl_data = {
-	.standby = VOP_REG(RK3288_SYS_CTRL, 0x1, 22),
-	.gate_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 23),
-	.mmu_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 20),
-	.rgb_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 12),
-	.hdmi_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 13),
-	.edp_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 14),
-	.mipi_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 15),
-	.dither_down = VOP_REG(RK3288_DSP_CTRL1, 0xf, 1),
-	.dither_up = VOP_REG(RK3288_DSP_CTRL1, 0x1, 6),
-	.data_blank = VOP_REG(RK3288_DSP_CTRL0, 0x1, 19),
-	.out_mode = VOP_REG(RK3288_DSP_CTRL0, 0xf, 0),
-	.pin_pol = VOP_REG(RK3288_DSP_CTRL0, 0xf, 4),
+static const struct vop_modeset rk3288_modeset = {
 	.htotal_pw = VOP_REG(RK3288_DSP_HTOTAL_HS_END, 0x1fff1fff, 0),
 	.hact_st_end = VOP_REG(RK3288_DSP_HACT_ST_END, 0x1fff1fff, 0),
 	.vtotal_pw = VOP_REG(RK3288_DSP_VTOTAL_VS_END, 0x1fff1fff, 0),
 	.vact_st_end = VOP_REG(RK3288_DSP_VACT_ST_END, 0x1fff1fff, 0),
 	.hpost_st_end = VOP_REG(RK3288_POST_DSP_HACT_INFO, 0x1fff1fff, 0),
 	.vpost_st_end = VOP_REG(RK3288_POST_DSP_VACT_INFO, 0x1fff1fff, 0),
-	.line_flag_num[0] = VOP_REG(RK3288_INTR_CTRL0, 0x1fff, 12),
-	.cfg_done = VOP_REG(RK3288_REG_CFG_DONE, 0x1, 0),
 };
 
-static const struct vop_reg_data rk3288_init_reg_table[] = {
-	{RK3288_SYS_CTRL, 0x00c00000},
-	{RK3288_DSP_CTRL0, 0x00000000},
-	{RK3288_WIN0_CTRL0, 0x00000080},
-	{RK3288_WIN1_CTRL0, 0x00000080},
-	/* TODO: Win2/3 support multiple area function, but we haven't found
-	 * a suitable way to use it yet, so let's just use them as other windows
-	 * with only area 0 enabled.
-	 */
-	{RK3288_WIN2_CTRL0, 0x00000010},
-	{RK3288_WIN3_CTRL0, 0x00000010},
+static const struct vop_output rk3288_output = {
+	.pin_pol = VOP_REG(RK3288_DSP_CTRL0, 0xf, 4),
+	.rgb_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 12),
+	.hdmi_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 13),
+	.edp_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 14),
+	.mipi_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 15),
+};
+
+static const struct vop_common rk3288_common = {
+	.standby = VOP_REG_SYNC(RK3288_SYS_CTRL, 0x1, 22),
+	.gate_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 23),
+	.mmu_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 20),
+	.dither_down = VOP_REG(RK3288_DSP_CTRL1, 0xf, 1),
+	.dither_up = VOP_REG(RK3288_DSP_CTRL1, 0x1, 6),
+	.data_blank = VOP_REG(RK3288_DSP_CTRL0, 0x1, 19),
+	.dsp_blank = VOP_REG(RK3288_DSP_CTRL0, 0x3, 18),
+	.out_mode = VOP_REG(RK3288_DSP_CTRL0, 0xf, 0),
+	.cfg_done = VOP_REG_SYNC(RK3288_REG_CFG_DONE, 0x1, 0),
 };
 
 /*
@@ -267,50 +271,24 @@ static const int rk3288_vop_intrs[] = {
 static const struct vop_intr rk3288_vop_intr = {
 	.intrs = rk3288_vop_intrs,
 	.nintrs = ARRAY_SIZE(rk3288_vop_intrs),
+	.line_flag_num[0] = VOP_REG(RK3288_INTR_CTRL0, 0x1fff, 12),
 	.status = VOP_REG(RK3288_INTR_CTRL0, 0xf, 0),
 	.enable = VOP_REG(RK3288_INTR_CTRL0, 0xf, 4),
 	.clear = VOP_REG(RK3288_INTR_CTRL0, 0xf, 8),
 };
 
 static const struct vop_data rk3288_vop = {
-	.init_table = rk3288_init_reg_table,
-	.table_size = ARRAY_SIZE(rk3288_init_reg_table),
+	.version = VOP_VERSION(3, 1),
 	.feature = VOP_FEATURE_OUTPUT_RGB10,
 	.intr = &rk3288_vop_intr,
-	.ctrl = &rk3288_ctrl_data,
+	.common = &rk3288_common,
+	.modeset = &rk3288_modeset,
+	.output = &rk3288_output,
 	.win = rk3288_vop_win_data,
 	.win_size = ARRAY_SIZE(rk3288_vop_win_data),
 };
 
-static const struct vop_ctrl rk3399_ctrl_data = {
-	.standby = VOP_REG(RK3399_SYS_CTRL, 0x1, 22),
-	.gate_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 23),
-	.dp_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 11),
-	.rgb_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 12),
-	.hdmi_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 13),
-	.edp_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 14),
-	.mipi_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 15),
-	.dither_down = VOP_REG(RK3399_DSP_CTRL1, 0xf, 1),
-	.dither_up = VOP_REG(RK3399_DSP_CTRL1, 0x1, 6),
-	.data_blank = VOP_REG(RK3399_DSP_CTRL0, 0x1, 19),
-	.out_mode = VOP_REG(RK3399_DSP_CTRL0, 0xf, 0),
-	.rgb_pin_pol = VOP_REG(RK3399_DSP_CTRL1, 0xf, 16),
-	.dp_pin_pol = VOP_REG(RK3399_DSP_CTRL1, 0xf, 16),
-	.hdmi_pin_pol = VOP_REG(RK3399_DSP_CTRL1, 0xf, 20),
-	.edp_pin_pol = VOP_REG(RK3399_DSP_CTRL1, 0xf, 24),
-	.mipi_pin_pol = VOP_REG(RK3399_DSP_CTRL1, 0xf, 28),
-	.htotal_pw = VOP_REG(RK3399_DSP_HTOTAL_HS_END, 0x1fff1fff, 0),
-	.hact_st_end = VOP_REG(RK3399_DSP_HACT_ST_END, 0x1fff1fff, 0),
-	.vtotal_pw = VOP_REG(RK3399_DSP_VTOTAL_VS_END, 0x1fff1fff, 0),
-	.vact_st_end = VOP_REG(RK3399_DSP_VACT_ST_END, 0x1fff1fff, 0),
-	.hpost_st_end = VOP_REG(RK3399_POST_DSP_HACT_INFO, 0x1fff1fff, 0),
-	.vpost_st_end = VOP_REG(RK3399_POST_DSP_VACT_INFO, 0x1fff1fff, 0),
-	.line_flag_num[0] = VOP_REG(RK3399_LINE_FLAG, 0xffff, 0),
-	.line_flag_num[1] = VOP_REG(RK3399_LINE_FLAG, 0xffff, 16),
-	.cfg_done = VOP_REG_MASK(RK3399_REG_CFG_DONE, 0x1, 0),
-};
-
-static const int rk3399_vop_intrs[] = {
+static const int rk3368_vop_intrs[] = {
 	FS_INTR,
 	0, 0,
 	LINE_FLAG_INTR,
@@ -320,69 +298,232 @@ static const int rk3399_vop_intrs[] = {
 	DSP_HOLD_VALID_INTR,
 };
 
-static const struct vop_intr rk3399_vop_intr = {
-	.intrs = rk3399_vop_intrs,
-	.nintrs = ARRAY_SIZE(rk3399_vop_intrs),
-	.status = VOP_REG_MASK(RK3399_INTR_STATUS0, 0xffff, 0),
-	.enable = VOP_REG_MASK(RK3399_INTR_EN0, 0xffff, 0),
-	.clear = VOP_REG_MASK(RK3399_INTR_CLEAR0, 0xffff, 0),
+static const struct vop_intr rk3368_vop_intr = {
+	.intrs = rk3368_vop_intrs,
+	.nintrs = ARRAY_SIZE(rk3368_vop_intrs),
+	.line_flag_num[0] = VOP_REG(RK3368_LINE_FLAG, 0xffff, 0),
+	.line_flag_num[1] = VOP_REG(RK3368_LINE_FLAG, 0xffff, 16),
+	.status = VOP_REG_MASK_SYNC(RK3368_INTR_STATUS, 0x3fff, 0),
+	.enable = VOP_REG_MASK_SYNC(RK3368_INTR_EN, 0x3fff, 0),
+	.clear = VOP_REG_MASK_SYNC(RK3368_INTR_CLEAR, 0x3fff, 0),
 };
 
-static const struct vop_reg_data rk3399_init_reg_table[] = {
-	{RK3399_SYS_CTRL, 0x2000f800},
-	{RK3399_DSP_CTRL0, 0x00000000},
-	{RK3399_WIN0_CTRL0, 0x00000080},
-	{RK3399_WIN1_CTRL0, 0x00000080},
-	/* TODO: Win2/3 support multiple area function, but we haven't found
-	 * a suitable way to use it yet, so let's just use them as other windows
-	 * with only area 0 enabled.
-	 */
-	{RK3399_WIN2_CTRL0, 0x00000010},
-	{RK3399_WIN3_CTRL0, 0x00000010},
+static const struct vop_win_phy rk3368_win23_data = {
+	.data_formats = formats_win_lite,
+	.nformats = ARRAY_SIZE(formats_win_lite),
+	.gate = VOP_REG(RK3368_WIN2_CTRL0, 0x1, 0),
+	.enable = VOP_REG(RK3368_WIN2_CTRL0, 0x1, 4),
+	.format = VOP_REG(RK3368_WIN2_CTRL0, 0x3, 5),
+	.rb_swap = VOP_REG(RK3368_WIN2_CTRL0, 0x1, 20),
+	.dsp_info = VOP_REG(RK3368_WIN2_DSP_INFO0, 0x0fff0fff, 0),
+	.dsp_st = VOP_REG(RK3368_WIN2_DSP_ST0, 0x1fff1fff, 0),
+	.yrgb_mst = VOP_REG(RK3368_WIN2_MST0, 0xffffffff, 0),
+	.yrgb_vir = VOP_REG(RK3368_WIN2_VIR0_1, 0x1fff, 0),
+	.src_alpha_ctl = VOP_REG(RK3368_WIN2_SRC_ALPHA_CTRL, 0xff, 0),
+	.dst_alpha_ctl = VOP_REG(RK3368_WIN2_DST_ALPHA_CTRL, 0xff, 0),
+};
+
+static const struct vop_win_data rk3368_vop_win_data[] = {
+	{ .base = 0x00, .phy = &rk3288_win01_data,
+	  .type = DRM_PLANE_TYPE_PRIMARY },
+	{ .base = 0x40, .phy = &rk3288_win01_data,
+	  .type = DRM_PLANE_TYPE_OVERLAY },
+	{ .base = 0x00, .phy = &rk3368_win23_data,
+	  .type = DRM_PLANE_TYPE_OVERLAY },
+	{ .base = 0x50, .phy = &rk3368_win23_data,
+	  .type = DRM_PLANE_TYPE_CURSOR },
+};
+
+static const struct vop_output rk3368_output = {
+	.rgb_pin_pol = VOP_REG(RK3368_DSP_CTRL1, 0xf, 16),
+	.hdmi_pin_pol = VOP_REG(RK3368_DSP_CTRL1, 0xf, 20),
+	.edp_pin_pol = VOP_REG(RK3368_DSP_CTRL1, 0xf, 24),
+	.mipi_pin_pol = VOP_REG(RK3368_DSP_CTRL1, 0xf, 28),
+	.rgb_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 12),
+	.hdmi_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 13),
+	.edp_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 14),
+	.mipi_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 15),
+};
+
+static const struct vop_misc rk3368_misc = {
+	.global_regdone_en = VOP_REG(RK3368_SYS_CTRL, 0x1, 11),
+};
+
+static const struct vop_data rk3368_vop = {
+	.version = VOP_VERSION(3, 2),
+	.intr = &rk3368_vop_intr,
+	.common = &rk3288_common,
+	.modeset = &rk3288_modeset,
+	.output = &rk3368_output,
+	.misc = &rk3368_misc,
+	.win = rk3368_vop_win_data,
+	.win_size = ARRAY_SIZE(rk3368_vop_win_data),
+};
+
+static const struct vop_intr rk3366_vop_intr = {
+	.intrs = rk3368_vop_intrs,
+	.nintrs = ARRAY_SIZE(rk3368_vop_intrs),
+	.line_flag_num[0] = VOP_REG(RK3366_LINE_FLAG, 0xffff, 0),
+	.line_flag_num[1] = VOP_REG(RK3366_LINE_FLAG, 0xffff, 16),
+	.status = VOP_REG_MASK_SYNC(RK3366_INTR_STATUS0, 0xffff, 0),
+	.enable = VOP_REG_MASK_SYNC(RK3366_INTR_EN0, 0xffff, 0),
+	.clear = VOP_REG_MASK_SYNC(RK3366_INTR_CLEAR0, 0xffff, 0),
+};
+
+static const struct vop_data rk3366_vop = {
+	.version = VOP_VERSION(3, 4),
+	.intr = &rk3366_vop_intr,
+	.common = &rk3288_common,
+	.modeset = &rk3288_modeset,
+	.output = &rk3368_output,
+	.misc = &rk3368_misc,
+	.win = rk3368_vop_win_data,
+	.win_size = ARRAY_SIZE(rk3368_vop_win_data),
+};
+
+static const struct vop_output rk3399_output = {
+	.dp_pin_pol = VOP_REG(RK3399_DSP_CTRL1, 0xf, 16),
+	.rgb_pin_pol = VOP_REG(RK3368_DSP_CTRL1, 0xf, 16),
+	.hdmi_pin_pol = VOP_REG(RK3368_DSP_CTRL1, 0xf, 20),
+	.edp_pin_pol = VOP_REG(RK3368_DSP_CTRL1, 0xf, 24),
+	.mipi_pin_pol = VOP_REG(RK3368_DSP_CTRL1, 0xf, 28),
+	.dp_en = VOP_REG(RK3399_SYS_CTRL, 0x1, 11),
+	.rgb_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 12),
+	.hdmi_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 13),
+	.edp_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 14),
+	.mipi_en = VOP_REG(RK3288_SYS_CTRL, 0x1, 15),
 };
 
 static const struct vop_data rk3399_vop_big = {
-	.init_table = rk3399_init_reg_table,
-	.table_size = ARRAY_SIZE(rk3399_init_reg_table),
+	.version = VOP_VERSION(3, 5),
 	.feature = VOP_FEATURE_OUTPUT_RGB10,
-	.intr = &rk3399_vop_intr,
-	.ctrl = &rk3399_ctrl_data,
-	/*
-	 * rk3399 vop big windows register layout is same as rk3288.
-	 */
-	.win = rk3288_vop_win_data,
-	.win_size = ARRAY_SIZE(rk3288_vop_win_data),
+	.intr = &rk3366_vop_intr,
+	.common = &rk3288_common,
+	.modeset = &rk3288_modeset,
+	.output = &rk3399_output,
+	.misc = &rk3368_misc,
+	.win = rk3368_vop_win_data,
+	.win_size = ARRAY_SIZE(rk3368_vop_win_data),
 };
 
 static const struct vop_win_data rk3399_vop_lit_win_data[] = {
 	{ .base = 0x00, .phy = &rk3288_win01_data,
 	  .type = DRM_PLANE_TYPE_PRIMARY },
-	{ .base = 0x00, .phy = &rk3288_win23_data,
+	{ .base = 0x00, .phy = &rk3368_win23_data,
 	  .type = DRM_PLANE_TYPE_CURSOR},
 };
 
 static const struct vop_data rk3399_vop_lit = {
-	.init_table = rk3399_init_reg_table,
-	.table_size = ARRAY_SIZE(rk3399_init_reg_table),
-	.intr = &rk3399_vop_intr,
-	.ctrl = &rk3399_ctrl_data,
-	/*
-	 * rk3399 vop lit windows register layout is same as rk3288,
-	 * but cut off the win1 and win3 windows.
-	 */
+	.version = VOP_VERSION(3, 6),
+	.intr = &rk3366_vop_intr,
+	.common = &rk3288_common,
+	.modeset = &rk3288_modeset,
+	.output = &rk3399_output,
+	.misc = &rk3368_misc,
 	.win = rk3399_vop_lit_win_data,
 	.win_size = ARRAY_SIZE(rk3399_vop_lit_win_data),
 };
 
+static const struct vop_win_data rk3228_vop_win_data[] = {
+	{ .base = 0x00, .phy = &rk3288_win01_data,
+	  .type = DRM_PLANE_TYPE_PRIMARY },
+	{ .base = 0x40, .phy = &rk3288_win01_data,
+	  .type = DRM_PLANE_TYPE_CURSOR },
+};
+
+static const struct vop_data rk3228_vop = {
+	.version = VOP_VERSION(3, 7),
+	.feature = VOP_FEATURE_OUTPUT_RGB10,
+	.intr = &rk3366_vop_intr,
+	.common = &rk3288_common,
+	.modeset = &rk3288_modeset,
+	.output = &rk3399_output,
+	.misc = &rk3368_misc,
+	.win = rk3228_vop_win_data,
+	.win_size = ARRAY_SIZE(rk3228_vop_win_data),
+};
+
+static const struct vop_modeset rk3328_modeset = {
+	.htotal_pw = VOP_REG(RK3328_DSP_HTOTAL_HS_END, 0x1fff1fff, 0),
+	.hact_st_end = VOP_REG(RK3328_DSP_HACT_ST_END, 0x1fff1fff, 0),
+	.vtotal_pw = VOP_REG(RK3328_DSP_VTOTAL_VS_END, 0x1fff1fff, 0),
+	.vact_st_end = VOP_REG(RK3328_DSP_VACT_ST_END, 0x1fff1fff, 0),
+	.hpost_st_end = VOP_REG(RK3328_POST_DSP_HACT_INFO, 0x1fff1fff, 0),
+	.vpost_st_end = VOP_REG(RK3328_POST_DSP_VACT_INFO, 0x1fff1fff, 0),
+};
+
+static const struct vop_output rk3328_output = {
+	.rgb_en = VOP_REG(RK3328_SYS_CTRL, 0x1, 12),
+	.hdmi_en = VOP_REG(RK3328_SYS_CTRL, 0x1, 13),
+	.edp_en = VOP_REG(RK3328_SYS_CTRL, 0x1, 14),
+	.mipi_en = VOP_REG(RK3328_SYS_CTRL, 0x1, 15),
+	.rgb_pin_pol = VOP_REG(RK3328_DSP_CTRL1, 0xf, 16),
+	.hdmi_pin_pol = VOP_REG(RK3328_DSP_CTRL1, 0xf, 20),
+	.edp_pin_pol = VOP_REG(RK3328_DSP_CTRL1, 0xf, 24),
+	.mipi_pin_pol = VOP_REG(RK3328_DSP_CTRL1, 0xf, 28),
+};
+
+static const struct vop_misc rk3328_misc = {
+	.global_regdone_en = VOP_REG(RK3328_SYS_CTRL, 0x1, 11),
+};
+
+static const struct vop_common rk3328_common = {
+	.standby = VOP_REG_SYNC(RK3328_SYS_CTRL, 0x1, 22),
+	.dither_down = VOP_REG(RK3328_DSP_CTRL1, 0xf, 1),
+	.dither_up = VOP_REG(RK3328_DSP_CTRL1, 0x1, 6),
+	.dsp_blank = VOP_REG(RK3328_DSP_CTRL0, 0x3, 18),
+	.out_mode = VOP_REG(RK3328_DSP_CTRL0, 0xf, 0),
+	.cfg_done = VOP_REG_SYNC(RK3328_REG_CFG_DONE, 0x1, 0),
+};
+
+static const struct vop_intr rk3328_vop_intr = {
+	.intrs = rk3368_vop_intrs,
+	.nintrs = ARRAY_SIZE(rk3368_vop_intrs),
+	.line_flag_num[0] = VOP_REG(RK3328_LINE_FLAG, 0xffff, 0),
+	.line_flag_num[1] = VOP_REG(RK3328_LINE_FLAG, 0xffff, 16),
+	.status = VOP_REG_MASK_SYNC(RK3328_INTR_STATUS0, 0xffff, 0),
+	.enable = VOP_REG_MASK_SYNC(RK3328_INTR_EN0, 0xffff, 0),
+	.clear = VOP_REG_MASK_SYNC(RK3328_INTR_CLEAR0, 0xffff, 0),
+};
+
+static const struct vop_win_data rk3328_vop_win_data[] = {
+	{ .base = 0xd0, .phy = &rk3288_win01_data,
+	  .type = DRM_PLANE_TYPE_PRIMARY },
+	{ .base = 0x1d0, .phy = &rk3288_win01_data,
+	  .type = DRM_PLANE_TYPE_OVERLAY },
+	{ .base = 0x2d0, .phy = &rk3288_win01_data,
+	  .type = DRM_PLANE_TYPE_CURSOR },
+};
+
+static const struct vop_data rk3328_vop = {
+	.version = VOP_VERSION(3, 8),
+	.feature = VOP_FEATURE_OUTPUT_RGB10,
+	.intr = &rk3328_vop_intr,
+	.common = &rk3328_common,
+	.modeset = &rk3328_modeset,
+	.output = &rk3328_output,
+	.misc = &rk3328_misc,
+	.win = rk3328_vop_win_data,
+	.win_size = ARRAY_SIZE(rk3328_vop_win_data),
+};
+
 static const struct of_device_id vop_driver_dt_match[] = {
 	{ .compatible = "rockchip,rk3036-vop",
 	  .data = &rk3036_vop },
 	{ .compatible = "rockchip,rk3288-vop",
 	  .data = &rk3288_vop },
+	{ .compatible = "rockchip,rk3368-vop",
+	  .data = &rk3368_vop },
+	{ .compatible = "rockchip,rk3366-vop",
+	  .data = &rk3366_vop },
 	{ .compatible = "rockchip,rk3399-vop-big",
 	  .data = &rk3399_vop_big },
 	{ .compatible = "rockchip,rk3399-vop-lit",
 	  .data = &rk3399_vop_lit },
+	{ .compatible = "rockchip,rk3228-vop",
+	  .data = &rk3228_vop },
+	{ .compatible = "rockchip,rk3328-vop",
+	  .data = &rk3328_vop },
 	{},
 };
 MODULE_DEVICE_TABLE(of, vop_driver_dt_match);
diff --git a/drivers/gpu/drm/rockchip/rockchip_vop_reg.h b/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
index cd19726..4a4799f 100644
--- a/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
+++ b/drivers/gpu/drm/rockchip/rockchip_vop_reg.h
@@ -41,6 +41,7 @@
 #define RK3288_WIN0_SRC_ALPHA_CTRL		0x0060
 #define RK3288_WIN0_DST_ALPHA_CTRL		0x0064
 #define RK3288_WIN0_FADING_CTRL			0x0068
+#define RK3288_WIN0_CTRL2			0x006c
 
 /* win1 register */
 #define RK3288_WIN1_CTRL0			0x0070
@@ -122,6 +123,717 @@
 #define RK3288_DSP_VACT_ST_END_F1		0x019c
 /* register definition end */
 
+/* rk3368 register definition */
+#define RK3368_REG_CFG_DONE			0x0000
+#define RK3368_VERSION_INFO			0x0004
+#define RK3368_SYS_CTRL				0x0008
+#define RK3368_SYS_CTRL1			0x000c
+#define RK3368_DSP_CTRL0			0x0010
+#define RK3368_DSP_CTRL1			0x0014
+#define RK3368_DSP_BG				0x0018
+#define RK3368_MCU_CTRL				0x001c
+#define RK3368_LINE_FLAG			0x0020
+#define RK3368_INTR_EN				0x0024
+#define RK3368_INTR_CLEAR			0x0028
+#define RK3368_INTR_STATUS			0x002c
+#define RK3368_WIN0_CTRL0			0x0030
+#define RK3368_WIN0_CTRL1			0x0034
+#define RK3368_WIN0_COLOR_KEY			0x0038
+#define RK3368_WIN0_VIR				0x003c
+#define RK3368_WIN0_YRGB_MST			0x0040
+#define RK3368_WIN0_CBR_MST			0x0044
+#define RK3368_WIN0_ACT_INFO			0x0048
+#define RK3368_WIN0_DSP_INFO			0x004c
+#define RK3368_WIN0_DSP_ST			0x0050
+#define RK3368_WIN0_SCL_FACTOR_YRGB		0x0054
+#define RK3368_WIN0_SCL_FACTOR_CBR		0x0058
+#define RK3368_WIN0_SCL_OFFSET			0x005c
+#define RK3368_WIN0_SRC_ALPHA_CTRL		0x0060
+#define RK3368_WIN0_DST_ALPHA_CTRL		0x0064
+#define RK3368_WIN0_FADING_CTRL			0x0068
+#define RK3368_WIN0_CTRL2			0x006c
+#define RK3368_WIN1_CTRL0			0x0070
+#define RK3368_WIN1_CTRL1			0x0074
+#define RK3368_WIN1_COLOR_KEY			0x0078
+#define RK3368_WIN1_VIR				0x007c
+#define RK3368_WIN1_YRGB_MST			0x0080
+#define RK3368_WIN1_CBR_MST			0x0084
+#define RK3368_WIN1_ACT_INFO			0x0088
+#define RK3368_WIN1_DSP_INFO			0x008c
+#define RK3368_WIN1_DSP_ST			0x0090
+#define RK3368_WIN1_SCL_FACTOR_YRGB		0x0094
+#define RK3368_WIN1_SCL_FACTOR_CBR		0x0098
+#define RK3368_WIN1_SCL_OFFSET			0x009c
+#define RK3368_WIN1_SRC_ALPHA_CTRL		0x00a0
+#define RK3368_WIN1_DST_ALPHA_CTRL		0x00a4
+#define RK3368_WIN1_FADING_CTRL			0x00a8
+#define RK3368_WIN1_CTRL2			0x00ac
+#define RK3368_WIN2_CTRL0			0x00b0
+#define RK3368_WIN2_CTRL1			0x00b4
+#define RK3368_WIN2_VIR0_1			0x00b8
+#define RK3368_WIN2_VIR2_3			0x00bc
+#define RK3368_WIN2_MST0			0x00c0
+#define RK3368_WIN2_DSP_INFO0			0x00c4
+#define RK3368_WIN2_DSP_ST0			0x00c8
+#define RK3368_WIN2_COLOR_KEY			0x00cc
+#define RK3368_WIN2_MST1			0x00d0
+#define RK3368_WIN2_DSP_INFO1			0x00d4
+#define RK3368_WIN2_DSP_ST1			0x00d8
+#define RK3368_WIN2_SRC_ALPHA_CTRL		0x00dc
+#define RK3368_WIN2_MST2			0x00e0
+#define RK3368_WIN2_DSP_INFO2			0x00e4
+#define RK3368_WIN2_DSP_ST2			0x00e8
+#define RK3368_WIN2_DST_ALPHA_CTRL		0x00ec
+#define RK3368_WIN2_MST3			0x00f0
+#define RK3368_WIN2_DSP_INFO3			0x00f4
+#define RK3368_WIN2_DSP_ST3			0x00f8
+#define RK3368_WIN2_FADING_CTRL			0x00fc
+#define RK3368_WIN3_CTRL0			0x0100
+#define RK3368_WIN3_CTRL1			0x0104
+#define RK3368_WIN3_VIR0_1			0x0108
+#define RK3368_WIN3_VIR2_3			0x010c
+#define RK3368_WIN3_MST0			0x0110
+#define RK3368_WIN3_DSP_INFO0			0x0114
+#define RK3368_WIN3_DSP_ST0			0x0118
+#define RK3368_WIN3_COLOR_KEY			0x011c
+#define RK3368_WIN3_MST1			0x0120
+#define RK3368_WIN3_DSP_INFO1			0x0124
+#define RK3368_WIN3_DSP_ST1			0x0128
+#define RK3368_WIN3_SRC_ALPHA_CTRL		0x012c
+#define RK3368_WIN3_MST2			0x0130
+#define RK3368_WIN3_DSP_INFO2			0x0134
+#define RK3368_WIN3_DSP_ST2			0x0138
+#define RK3368_WIN3_DST_ALPHA_CTRL		0x013c
+#define RK3368_WIN3_MST3			0x0140
+#define RK3368_WIN3_DSP_INFO3			0x0144
+#define RK3368_WIN3_DSP_ST3			0x0148
+#define RK3368_WIN3_FADING_CTRL			0x014c
+#define RK3368_HWC_CTRL0			0x0150
+#define RK3368_HWC_CTRL1			0x0154
+#define RK3368_HWC_MST				0x0158
+#define RK3368_HWC_DSP_ST			0x015c
+#define RK3368_HWC_SRC_ALPHA_CTRL		0x0160
+#define RK3368_HWC_DST_ALPHA_CTRL		0x0164
+#define RK3368_HWC_FADING_CTRL			0x0168
+#define RK3368_HWC_RESERVED1			0x016c
+#define RK3368_POST_DSP_HACT_INFO		0x0170
+#define RK3368_POST_DSP_VACT_INFO		0x0174
+#define RK3368_POST_SCL_FACTOR_YRGB		0x0178
+#define RK3368_POST_RESERVED			0x017c
+#define RK3368_POST_SCL_CTRL			0x0180
+#define RK3368_POST_DSP_VACT_INFO_F1		0x0184
+#define RK3368_DSP_HTOTAL_HS_END		0x0188
+#define RK3368_DSP_HACT_ST_END			0x018c
+#define RK3368_DSP_VTOTAL_VS_END		0x0190
+#define RK3368_DSP_VACT_ST_END			0x0194
+#define RK3368_DSP_VS_ST_END_F1			0x0198
+#define RK3368_DSP_VACT_ST_END_F1		0x019c
+#define RK3368_PWM_CTRL				0x01a0
+#define RK3368_PWM_PERIOD_HPR			0x01a4
+#define RK3368_PWM_DUTY_LPR			0x01a8
+#define RK3368_PWM_CNT				0x01ac
+#define RK3368_BCSH_COLOR_BAR			0x01b0
+#define RK3368_BCSH_BCS				0x01b4
+#define RK3368_BCSH_H				0x01b8
+#define RK3368_BCSH_CTRL			0x01bc
+#define RK3368_CABC_CTRL0			0x01c0
+#define RK3368_CABC_CTRL1			0x01c4
+#define RK3368_CABC_CTRL2			0x01c8
+#define RK3368_CABC_CTRL3			0x01cc
+#define RK3368_CABC_GAUSS_LINE0_0		0x01d0
+#define RK3368_CABC_GAUSS_LINE0_1		0x01d4
+#define RK3368_CABC_GAUSS_LINE1_0		0x01d8
+#define RK3368_CABC_GAUSS_LINE1_1		0x01dc
+#define RK3368_CABC_GAUSS_LINE2_0		0x01e0
+#define RK3368_CABC_GAUSS_LINE2_1		0x01e4
+#define RK3368_FRC_LOWER01_0			0x01e8
+#define RK3368_FRC_LOWER01_1			0x01ec
+#define RK3368_FRC_LOWER10_0			0x01f0
+#define RK3368_FRC_LOWER10_1			0x01f4
+#define RK3368_FRC_LOWER11_0			0x01f8
+#define RK3368_FRC_LOWER11_1			0x01fc
+#define RK3368_IFBDC_CTRL			0x0200
+#define RK3368_IFBDC_TILES_NUM			0x0204
+#define RK3368_IFBDC_FRAME_RST_CYCLE		0x0208
+#define RK3368_IFBDC_BASE_ADDR			0x020c
+#define RK3368_IFBDC_MB_SIZE			0x0210
+#define RK3368_IFBDC_CMP_INDEX_INIT		0x0214
+#define RK3368_IFBDC_VIR			0x0220
+#define RK3368_IFBDC_DEBUG0			0x0230
+#define RK3368_IFBDC_DEBUG1			0x0234
+#define RK3368_LATENCY_CTRL0			0x0250
+#define RK3368_RD_MAX_LATENCY_NUM0		0x0254
+#define RK3368_RD_LATENCY_THR_NUM0		0x0258
+#define RK3368_RD_LATENCY_SAMP_NUM0		0x025c
+#define RK3368_WIN0_DSP_BG			0x0260
+#define RK3368_WIN1_DSP_BG			0x0264
+#define RK3368_WIN2_DSP_BG			0x0268
+#define RK3368_WIN3_DSP_BG			0x026c
+#define RK3368_SCAN_LINE_NUM			0x0270
+#define RK3368_CABC_DEBUG0			0x0274
+#define RK3368_CABC_DEBUG1			0x0278
+#define RK3368_CABC_DEBUG2			0x027c
+#define RK3368_DBG_REG_000			0x0280
+#define RK3368_DBG_REG_001			0x0284
+#define RK3368_DBG_REG_002			0x0288
+#define RK3368_DBG_REG_003			0x028c
+#define RK3368_DBG_REG_004			0x0290
+#define RK3368_DBG_REG_005			0x0294
+#define RK3368_DBG_REG_006			0x0298
+#define RK3368_DBG_REG_007			0x029c
+#define RK3368_DBG_REG_008			0x02a0
+#define RK3368_DBG_REG_016			0x02c0
+#define RK3368_DBG_REG_017			0x02c4
+#define RK3368_DBG_REG_018			0x02c8
+#define RK3368_DBG_REG_019			0x02cc
+#define RK3368_DBG_REG_020			0x02d0
+#define RK3368_DBG_REG_021			0x02d4
+#define RK3368_DBG_REG_022			0x02d8
+#define RK3368_DBG_REG_023			0x02dc
+#define RK3368_DBG_REG_028			0x02f0
+#define RK3368_MMU_DTE_ADDR			0x0300
+#define RK3368_MMU_STATUS			0x0304
+#define RK3368_MMU_COMMAND			0x0308
+#define RK3368_MMU_PAGE_FAULT_ADDR		0x030c
+#define RK3368_MMU_ZAP_ONE_LINE			0x0310
+#define RK3368_MMU_INT_RAWSTAT			0x0314
+#define RK3368_MMU_INT_CLEAR			0x0318
+#define RK3368_MMU_INT_MASK			0x031c
+#define RK3368_MMU_INT_STATUS			0x0320
+#define RK3368_MMU_AUTO_GATING			0x0324
+#define RK3368_WIN2_LUT_ADDR			0x0400
+#define RK3368_WIN3_LUT_ADDR			0x0800
+#define RK3368_HWC_LUT_ADDR			0x0c00
+#define RK3368_GAMMA_LUT_ADDR			0x1000
+#define RK3368_CABC_GAMMA_LUT_ADDR		0x1800
+#define RK3368_MCU_BYPASS_WPORT			0x2200
+#define RK3368_MCU_BYPASS_RPORT			0x2300
+/* rk3368 register definition end */
+
+#define RK3366_REG_CFG_DONE			0x0000
+#define RK3366_VERSION_INFO			0x0004
+#define RK3366_SYS_CTRL				0x0008
+#define RK3366_SYS_CTRL1			0x000c
+#define RK3366_DSP_CTRL0			0x0010
+#define RK3366_DSP_CTRL1			0x0014
+#define RK3366_DSP_BG				0x0018
+#define RK3366_MCU_CTRL				0x001c
+#define RK3366_WB_CTRL0				0x0020
+#define RK3366_WB_CTRL1				0x0024
+#define RK3366_WB_YRGB_MST			0x0028
+#define RK3366_WB_CBR_MST			0x002c
+#define RK3366_WIN0_CTRL0			0x0030
+#define RK3366_WIN0_CTRL1			0x0034
+#define RK3366_WIN0_COLOR_KEY			0x0038
+#define RK3366_WIN0_VIR				0x003c
+#define RK3366_WIN0_YRGB_MST			0x0040
+#define RK3366_WIN0_CBR_MST			0x0044
+#define RK3366_WIN0_ACT_INFO			0x0048
+#define RK3366_WIN0_DSP_INFO			0x004c
+#define RK3366_WIN0_DSP_ST			0x0050
+#define RK3366_WIN0_SCL_FACTOR_YRGB		0x0054
+#define RK3366_WIN0_SCL_FACTOR_CBR		0x0058
+#define RK3366_WIN0_SCL_OFFSET			0x005c
+#define RK3366_WIN0_SRC_ALPHA_CTRL		0x0060
+#define RK3366_WIN0_DST_ALPHA_CTRL		0x0064
+#define RK3366_WIN0_FADING_CTRL			0x0068
+#define RK3366_WIN0_CTRL2			0x006c
+#define RK3366_WIN1_CTRL0			0x0070
+#define RK3366_WIN1_CTRL1			0x0074
+#define RK3366_WIN1_COLOR_KEY			0x0078
+#define RK3366_WIN1_VIR				0x007c
+#define RK3366_WIN1_YRGB_MST			0x0080
+#define RK3366_WIN1_CBR_MST			0x0084
+#define RK3366_WIN1_ACT_INFO			0x0088
+#define RK3366_WIN1_DSP_INFO			0x008c
+#define RK3366_WIN1_DSP_ST			0x0090
+#define RK3366_WIN1_SCL_FACTOR_YRGB		0x0094
+#define RK3366_WIN1_SCL_FACTOR_CBR		0x0098
+#define RK3366_WIN1_SCL_OFFSET			0x009c
+#define RK3366_WIN1_SRC_ALPHA_CTRL		0x00a0
+#define RK3366_WIN1_DST_ALPHA_CTRL		0x00a4
+#define RK3366_WIN1_FADING_CTRL			0x00a8
+#define RK3366_WIN1_CTRL2			0x00ac
+#define RK3366_WIN2_CTRL0			0x00b0
+#define RK3366_WIN2_CTRL1			0x00b4
+#define RK3366_WIN2_VIR0_1			0x00b8
+#define RK3366_WIN2_VIR2_3			0x00bc
+#define RK3366_WIN2_MST0			0x00c0
+#define RK3366_WIN2_DSP_INFO0			0x00c4
+#define RK3366_WIN2_DSP_ST0			0x00c8
+#define RK3366_WIN2_COLOR_KEY			0x00cc
+#define RK3366_WIN2_MST1			0x00d0
+#define RK3366_WIN2_DSP_INFO1			0x00d4
+#define RK3366_WIN2_DSP_ST1			0x00d8
+#define RK3366_WIN2_SRC_ALPHA_CTRL		0x00dc
+#define RK3366_WIN2_MST2			0x00e0
+#define RK3366_WIN2_DSP_INFO2			0x00e4
+#define RK3366_WIN2_DSP_ST2			0x00e8
+#define RK3366_WIN2_DST_ALPHA_CTRL		0x00ec
+#define RK3366_WIN2_MST3			0x00f0
+#define RK3366_WIN2_DSP_INFO3			0x00f4
+#define RK3366_WIN2_DSP_ST3			0x00f8
+#define RK3366_WIN2_FADING_CTRL			0x00fc
+#define RK3366_WIN3_CTRL0			0x0100
+#define RK3366_WIN3_CTRL1			0x0104
+#define RK3366_WIN3_VIR0_1			0x0108
+#define RK3366_WIN3_VIR2_3			0x010c
+#define RK3366_WIN3_MST0			0x0110
+#define RK3366_WIN3_DSP_INFO0			0x0114
+#define RK3366_WIN3_DSP_ST0			0x0118
+#define RK3366_WIN3_COLOR_KEY			0x011c
+#define RK3366_WIN3_MST1			0x0120
+#define RK3366_WIN3_DSP_INFO1			0x0124
+#define RK3366_WIN3_DSP_ST1			0x0128
+#define RK3366_WIN3_SRC_ALPHA_CTRL		0x012c
+#define RK3366_WIN3_MST2			0x0130
+#define RK3366_WIN3_DSP_INFO2			0x0134
+#define RK3366_WIN3_DSP_ST2			0x0138
+#define RK3366_WIN3_DST_ALPHA_CTRL		0x013c
+#define RK3366_WIN3_MST3			0x0140
+#define RK3366_WIN3_DSP_INFO3			0x0144
+#define RK3366_WIN3_DSP_ST3			0x0148
+#define RK3366_WIN3_FADING_CTRL			0x014c
+#define RK3366_HWC_CTRL0			0x0150
+#define RK3366_HWC_CTRL1			0x0154
+#define RK3366_HWC_MST				0x0158
+#define RK3366_HWC_DSP_ST			0x015c
+#define RK3366_HWC_SRC_ALPHA_CTRL		0x0160
+#define RK3366_HWC_DST_ALPHA_CTRL		0x0164
+#define RK3366_HWC_FADING_CTRL			0x0168
+#define RK3366_HWC_RESERVED1			0x016c
+#define RK3366_POST_DSP_HACT_INFO		0x0170
+#define RK3366_POST_DSP_VACT_INFO		0x0174
+#define RK3366_POST_SCL_FACTOR_YRGB		0x0178
+#define RK3366_POST_RESERVED			0x017c
+#define RK3366_POST_SCL_CTRL			0x0180
+#define RK3366_POST_DSP_VACT_INFO_F1		0x0184
+#define RK3366_DSP_HTOTAL_HS_END		0x0188
+#define RK3366_DSP_HACT_ST_END			0x018c
+#define RK3366_DSP_VTOTAL_VS_END		0x0190
+#define RK3366_DSP_VACT_ST_END			0x0194
+#define RK3366_DSP_VS_ST_END_F1			0x0198
+#define RK3366_DSP_VACT_ST_END_F1		0x019c
+#define RK3366_PWM_CTRL				0x01a0
+#define RK3366_PWM_PERIOD_HPR			0x01a4
+#define RK3366_PWM_DUTY_LPR			0x01a8
+#define RK3366_PWM_CNT				0x01ac
+#define RK3366_BCSH_COLOR_BAR			0x01b0
+#define RK3366_BCSH_BCS				0x01b4
+#define RK3366_BCSH_H				0x01b8
+#define RK3366_BCSH_CTRL			0x01bc
+#define RK3366_CABC_CTRL0			0x01c0
+#define RK3366_CABC_CTRL1			0x01c4
+#define RK3366_CABC_CTRL2			0x01c8
+#define RK3366_CABC_CTRL3			0x01cc
+#define RK3366_CABC_GAUSS_LINE0_0		0x01d0
+#define RK3366_CABC_GAUSS_LINE0_1		0x01d4
+#define RK3366_CABC_GAUSS_LINE1_0		0x01d8
+#define RK3366_CABC_GAUSS_LINE1_1		0x01dc
+#define RK3366_CABC_GAUSS_LINE2_0		0x01e0
+#define RK3366_CABC_GAUSS_LINE2_1		0x01e4
+#define RK3366_FRC_LOWER01_0			0x01e8
+#define RK3366_FRC_LOWER01_1			0x01ec
+#define RK3366_FRC_LOWER10_0			0x01f0
+#define RK3366_FRC_LOWER10_1			0x01f4
+#define RK3366_FRC_LOWER11_0			0x01f8
+#define RK3366_FRC_LOWER11_1			0x01fc
+#define RK3366_INTR_EN0				0x0280
+#define RK3366_INTR_CLEAR0			0x0284
+#define RK3366_INTR_STATUS0			0x0288
+#define RK3366_INTR_RAW_STATUS0			0x028c
+#define RK3366_INTR_EN1				0x0290
+#define RK3366_INTR_CLEAR1			0x0294
+#define RK3366_INTR_STATUS1			0x0298
+#define RK3366_INTR_RAW_STATUS1			0x029c
+#define RK3366_LINE_FLAG			0x02a0
+#define RK3366_VOP_STATUS			0x02a4
+#define RK3366_BLANKING_VALUE			0x02a8
+#define RK3366_WIN0_DSP_BG			0x02b0
+#define RK3366_WIN1_DSP_BG			0x02b4
+#define RK3366_WIN2_DSP_BG			0x02b8
+#define RK3366_WIN3_DSP_BG			0x02bc
+#define RK3366_WIN2_LUT_ADDR			0x0400
+#define RK3366_WIN3_LUT_ADDR			0x0800
+#define RK3366_HWC_LUT_ADDR			0x0c00
+#define RK3366_GAMMA0_LUT_ADDR			0x1000
+#define RK3366_GAMMA1_LUT_ADDR			0x1400
+#define RK3366_CABC_GAMMA_LUT_ADDR		0x1800
+#define RK3366_MCU_BYPASS_WPORT			0x2200
+#define RK3366_MCU_BYPASS_RPORT			0x2300
+#define RK3366_MMU_DTE_ADDR			0x2400
+#define RK3366_MMU_STATUS			0x2404
+#define RK3366_MMU_COMMAND			0x2408
+#define RK3366_MMU_PAGE_FAULT_ADDR		0x240c
+#define RK3366_MMU_ZAP_ONE_LINE			0x2410
+#define RK3366_MMU_INT_RAWSTAT			0x2414
+#define RK3366_MMU_INT_CLEAR			0x2418
+#define RK3366_MMU_INT_MASK			0x241c
+#define RK3366_MMU_INT_STATUS			0x2420
+#define RK3366_MMU_AUTO_GATING			0x2424
+
+/* rk3399 register definition */
+#define RK3399_REG_CFG_DONE			0x0000
+#define RK3399_VERSION_INFO			0x0004
+#define RK3399_SYS_CTRL				0x0008
+#define RK3399_SYS_CTRL1			0x000c
+#define RK3399_DSP_CTRL0			0x0010
+#define RK3399_DSP_CTRL1			0x0014
+#define RK3399_DSP_BG				0x0018
+#define RK3399_MCU_CTRL				0x001c
+#define RK3399_WB_CTRL0				0x0020
+#define RK3399_WB_CTRL1				0x0024
+#define RK3399_WB_YRGB_MST			0x0028
+#define RK3399_WB_CBR_MST			0x002c
+#define RK3399_WIN0_CTRL0			0x0030
+#define RK3399_WIN0_CTRL1			0x0034
+#define RK3399_WIN0_COLOR_KEY			0x0038
+#define RK3399_WIN0_VIR				0x003c
+#define RK3399_WIN0_YRGB_MST			0x0040
+#define RK3399_WIN0_CBR_MST			0x0044
+#define RK3399_WIN0_ACT_INFO			0x0048
+#define RK3399_WIN0_DSP_INFO			0x004c
+#define RK3399_WIN0_DSP_ST			0x0050
+#define RK3399_WIN0_SCL_FACTOR_YRGB		0x0054
+#define RK3399_WIN0_SCL_FACTOR_CBR		0x0058
+#define RK3399_WIN0_SCL_OFFSET			0x005c
+#define RK3399_WIN0_SRC_ALPHA_CTRL		0x0060
+#define RK3399_WIN0_DST_ALPHA_CTRL		0x0064
+#define RK3399_WIN0_FADING_CTRL			0x0068
+#define RK3399_WIN0_CTRL2			0x006c
+#define RK3399_WIN1_CTRL0			0x0070
+#define RK3399_WIN1_CTRL1			0x0074
+#define RK3399_WIN1_COLOR_KEY			0x0078
+#define RK3399_WIN1_VIR				0x007c
+#define RK3399_WIN1_YRGB_MST			0x0080
+#define RK3399_WIN1_CBR_MST			0x0084
+#define RK3399_WIN1_ACT_INFO			0x0088
+#define RK3399_WIN1_DSP_INFO			0x008c
+#define RK3399_WIN1_DSP_ST			0x0090
+#define RK3399_WIN1_SCL_FACTOR_YRGB		0x0094
+#define RK3399_WIN1_SCL_FACTOR_CBR		0x0098
+#define RK3399_WIN1_SCL_OFFSET			0x009c
+#define RK3399_WIN1_SRC_ALPHA_CTRL		0x00a0
+#define RK3399_WIN1_DST_ALPHA_CTRL		0x00a4
+#define RK3399_WIN1_FADING_CTRL			0x00a8
+#define RK3399_WIN1_CTRL2			0x00ac
+#define RK3399_WIN2_CTRL0			0x00b0
+#define RK3399_WIN2_CTRL1			0x00b4
+#define RK3399_WIN2_VIR0_1			0x00b8
+#define RK3399_WIN2_VIR2_3			0x00bc
+#define RK3399_WIN2_MST0			0x00c0
+#define RK3399_WIN2_DSP_INFO0			0x00c4
+#define RK3399_WIN2_DSP_ST0			0x00c8
+#define RK3399_WIN2_COLOR_KEY			0x00cc
+#define RK3399_WIN2_MST1			0x00d0
+#define RK3399_WIN2_DSP_INFO1			0x00d4
+#define RK3399_WIN2_DSP_ST1			0x00d8
+#define RK3399_WIN2_SRC_ALPHA_CTRL		0x00dc
+#define RK3399_WIN2_MST2			0x00e0
+#define RK3399_WIN2_DSP_INFO2			0x00e4
+#define RK3399_WIN2_DSP_ST2			0x00e8
+#define RK3399_WIN2_DST_ALPHA_CTRL		0x00ec
+#define RK3399_WIN2_MST3			0x00f0
+#define RK3399_WIN2_DSP_INFO3			0x00f4
+#define RK3399_WIN2_DSP_ST3			0x00f8
+#define RK3399_WIN2_FADING_CTRL			0x00fc
+#define RK3399_WIN3_CTRL0			0x0100
+#define RK3399_WIN3_CTRL1			0x0104
+#define RK3399_WIN3_VIR0_1			0x0108
+#define RK3399_WIN3_VIR2_3			0x010c
+#define RK3399_WIN3_MST0			0x0110
+#define RK3399_WIN3_DSP_INFO0			0x0114
+#define RK3399_WIN3_DSP_ST0			0x0118
+#define RK3399_WIN3_COLOR_KEY			0x011c
+#define RK3399_WIN3_MST1			0x0120
+#define RK3399_WIN3_DSP_INFO1			0x0124
+#define RK3399_WIN3_DSP_ST1			0x0128
+#define RK3399_WIN3_SRC_ALPHA_CTRL		0x012c
+#define RK3399_WIN3_MST2			0x0130
+#define RK3399_WIN3_DSP_INFO2			0x0134
+#define RK3399_WIN3_DSP_ST2			0x0138
+#define RK3399_WIN3_DST_ALPHA_CTRL		0x013c
+#define RK3399_WIN3_MST3			0x0140
+#define RK3399_WIN3_DSP_INFO3			0x0144
+#define RK3399_WIN3_DSP_ST3			0x0148
+#define RK3399_WIN3_FADING_CTRL			0x014c
+#define RK3399_HWC_CTRL0			0x0150
+#define RK3399_HWC_CTRL1			0x0154
+#define RK3399_HWC_MST				0x0158
+#define RK3399_HWC_DSP_ST			0x015c
+#define RK3399_HWC_SRC_ALPHA_CTRL		0x0160
+#define RK3399_HWC_DST_ALPHA_CTRL		0x0164
+#define RK3399_HWC_FADING_CTRL			0x0168
+#define RK3399_HWC_RESERVED1			0x016c
+#define RK3399_POST_DSP_HACT_INFO		0x0170
+#define RK3399_POST_DSP_VACT_INFO		0x0174
+#define RK3399_POST_SCL_FACTOR_YRGB		0x0178
+#define RK3399_POST_RESERVED			0x017c
+#define RK3399_POST_SCL_CTRL			0x0180
+#define RK3399_POST_DSP_VACT_INFO_F1		0x0184
+#define RK3399_DSP_HTOTAL_HS_END		0x0188
+#define RK3399_DSP_HACT_ST_END			0x018c
+#define RK3399_DSP_VTOTAL_VS_END		0x0190
+#define RK3399_DSP_VACT_ST_END			0x0194
+#define RK3399_DSP_VS_ST_END_F1			0x0198
+#define RK3399_DSP_VACT_ST_END_F1		0x019c
+#define RK3399_PWM_CTRL				0x01a0
+#define RK3399_PWM_PERIOD_HPR			0x01a4
+#define RK3399_PWM_DUTY_LPR			0x01a8
+#define RK3399_PWM_CNT				0x01ac
+#define RK3399_BCSH_COLOR_BAR			0x01b0
+#define RK3399_BCSH_BCS				0x01b4
+#define RK3399_BCSH_H				0x01b8
+#define RK3399_BCSH_CTRL			0x01bc
+#define RK3399_CABC_CTRL0			0x01c0
+#define RK3399_CABC_CTRL1			0x01c4
+#define RK3399_CABC_CTRL2			0x01c8
+#define RK3399_CABC_CTRL3			0x01cc
+#define RK3399_CABC_GAUSS_LINE0_0		0x01d0
+#define RK3399_CABC_GAUSS_LINE0_1		0x01d4
+#define RK3399_CABC_GAUSS_LINE1_0		0x01d8
+#define RK3399_CABC_GAUSS_LINE1_1		0x01dc
+#define RK3399_CABC_GAUSS_LINE2_0		0x01e0
+#define RK3399_CABC_GAUSS_LINE2_1		0x01e4
+#define RK3399_FRC_LOWER01_0			0x01e8
+#define RK3399_FRC_LOWER01_1			0x01ec
+#define RK3399_FRC_LOWER10_0			0x01f0
+#define RK3399_FRC_LOWER10_1			0x01f4
+#define RK3399_FRC_LOWER11_0			0x01f8
+#define RK3399_FRC_LOWER11_1			0x01fc
+#define RK3399_AFBCD0_CTRL			0x0200
+#define RK3399_AFBCD0_HDR_PTR			0x0204
+#define RK3399_AFBCD0_PIC_SIZE			0x0208
+#define RK3399_AFBCD0_STATUS			0x020c
+#define RK3399_AFBCD1_CTRL			0x0220
+#define RK3399_AFBCD1_HDR_PTR			0x0224
+#define RK3399_AFBCD1_PIC_SIZE			0x0228
+#define RK3399_AFBCD1_STATUS			0x022c
+#define RK3399_AFBCD2_CTRL			0x0240
+#define RK3399_AFBCD2_HDR_PTR			0x0244
+#define RK3399_AFBCD2_PIC_SIZE			0x0248
+#define RK3399_AFBCD2_STATUS			0x024c
+#define RK3399_AFBCD3_CTRL			0x0260
+#define RK3399_AFBCD3_HDR_PTR			0x0264
+#define RK3399_AFBCD3_PIC_SIZE			0x0268
+#define RK3399_AFBCD3_STATUS			0x026c
+#define RK3399_INTR_EN0				0x0280
+#define RK3399_INTR_CLEAR0			0x0284
+#define RK3399_INTR_STATUS0			0x0288
+#define RK3399_INTR_RAW_STATUS0			0x028c
+#define RK3399_INTR_EN1				0x0290
+#define RK3399_INTR_CLEAR1			0x0294
+#define RK3399_INTR_STATUS1			0x0298
+#define RK3399_INTR_RAW_STATUS1			0x029c
+#define RK3399_LINE_FLAG			0x02a0
+#define RK3399_VOP_STATUS			0x02a4
+#define RK3399_BLANKING_VALUE			0x02a8
+#define RK3399_MCU_BYPASS_PORT			0x02ac
+#define RK3399_WIN0_DSP_BG			0x02b0
+#define RK3399_WIN1_DSP_BG			0x02b4
+#define RK3399_WIN2_DSP_BG			0x02b8
+#define RK3399_WIN3_DSP_BG			0x02bc
+#define RK3399_YUV2YUV_WIN			0x02c0
+#define RK3399_YUV2YUV_POST			0x02c4
+#define RK3399_AUTO_GATING_EN			0x02cc
+#define RK3399_WIN0_CSC_COE			0x03a0
+#define RK3399_WIN1_CSC_COE			0x03c0
+#define RK3399_WIN2_CSC_COE			0x03e0
+#define RK3399_WIN3_CSC_COE			0x0400
+#define RK3399_HWC_CSC_COE			0x0420
+#define RK3399_BCSH_R2Y_CSC_COE			0x0440
+#define RK3399_BCSH_Y2R_CSC_COE			0x0460
+#define RK3399_POST_YUV2YUV_Y2R_COE		0x0480
+#define RK3399_POST_YUV2YUV_3X3_COE		0x04a0
+#define RK3399_POST_YUV2YUV_R2Y_COE		0x04c0
+#define RK3399_WIN0_YUV2YUV_Y2R			0x04e0
+#define RK3399_WIN0_YUV2YUV_3X3			0x0500
+#define RK3399_WIN0_YUV2YUV_R2Y			0x0520
+#define RK3399_WIN1_YUV2YUV_Y2R			0x0540
+#define RK3399_WIN1_YUV2YUV_3X3			0x0560
+#define RK3399_WIN1_YUV2YUV_R2Y			0x0580
+#define RK3399_WIN2_YUV2YUV_Y2R			0x05a0
+#define RK3399_WIN2_YUV2YUV_3X3			0x05c0
+#define RK3399_WIN2_YUV2YUV_R2Y			0x05e0
+#define RK3399_WIN3_YUV2YUV_Y2R			0x0600
+#define RK3399_WIN3_YUV2YUV_3X3			0x0620
+#define RK3399_WIN3_YUV2YUV_R2Y			0x0640
+#define RK3399_WIN2_LUT_ADDR			0x1000
+#define RK3399_WIN3_LUT_ADDR			0x1400
+#define RK3399_HWC_LUT_ADDR			0x1800
+#define RK3399_CABC_GAMMA_LUT_ADDR		0x1c00
+#define RK3399_GAMMA_LUT_ADDR			0x2000
+/* rk3399 register definition end */
+
+/* rk3328 register definition */
+#define RK3328_REG_CFG_DONE			0x00000000
+#define RK3328_VERSION_INFO			0x00000004
+#define RK3328_SYS_CTRL				0x00000008
+#define RK3328_SYS_CTRL1			0x0000000c
+#define RK3328_DSP_CTRL0			0x00000010
+#define RK3328_DSP_CTRL1			0x00000014
+#define RK3328_DSP_BG				0x00000018
+#define RK3328_AUTO_GATING_EN			0x0000003c
+#define RK3328_LINE_FLAG			0x00000040
+#define RK3328_VOP_STATUS			0x00000044
+#define RK3328_BLANKING_VALUE			0x00000048
+#define RK3328_WIN0_DSP_BG			0x00000050
+#define RK3328_WIN1_DSP_BG			0x00000054
+#define RK3328_DBG_PERF_LATENCY_CTRL0		0x000000c0
+#define RK3328_DBG_PERF_RD_MAX_LATENCY_NUM0	0x000000c4
+#define RK3328_DBG_PERF_RD_LATENCY_THR_NUM0	0x000000c8
+#define RK3328_DBG_PERF_RD_LATENCY_SAMP_NUM0	0x000000cc
+#define RK3328_INTR_EN0				0x000000e0
+#define RK3328_INTR_CLEAR0			0x000000e4
+#define RK3328_INTR_STATUS0			0x000000e8
+#define RK3328_INTR_RAW_STATUS0			0x000000ec
+#define RK3328_INTR_EN1				0x000000f0
+#define RK3328_INTR_CLEAR1			0x000000f4
+#define RK3328_INTR_STATUS1			0x000000f8
+#define RK3328_INTR_RAW_STATUS1			0x000000fc
+#define RK3328_WIN0_CTRL0			0x00000100
+#define RK3328_WIN0_CTRL1			0x00000104
+#define RK3328_WIN0_COLOR_KEY			0x00000108
+#define RK3328_WIN0_VIR				0x0000010c
+#define RK3328_WIN0_YRGB_MST			0x00000110
+#define RK3328_WIN0_CBR_MST			0x00000114
+#define RK3328_WIN0_ACT_INFO			0x00000118
+#define RK3328_WIN0_DSP_INFO			0x0000011c
+#define RK3328_WIN0_DSP_ST			0x00000120
+#define RK3328_WIN0_SCL_FACTOR_YRGB		0x00000124
+#define RK3328_WIN0_SCL_FACTOR_CBR		0x00000128
+#define RK3328_WIN0_SCL_OFFSET			0x0000012c
+#define RK3328_WIN0_SRC_ALPHA_CTRL		0x00000130
+#define RK3328_WIN0_DST_ALPHA_CTRL		0x00000134
+#define RK3328_WIN0_FADING_CTRL			0x00000138
+#define RK3328_WIN0_CTRL2			0x0000013c
+#define RK3328_DBG_WIN0_REG0			0x000001f0
+#define RK3328_DBG_WIN0_REG1			0x000001f4
+#define RK3328_DBG_WIN0_REG2			0x000001f8
+#define RK3328_DBG_WIN0_RESERVED		0x000001fc
+#define RK3328_WIN1_CTRL0			0x00000200
+#define RK3328_WIN1_CTRL1			0x00000204
+#define RK3328_WIN1_COLOR_KEY			0x00000208
+#define RK3328_WIN1_VIR				0x0000020c
+#define RK3328_WIN1_YRGB_MST			0x00000210
+#define RK3328_WIN1_CBR_MST			0x00000214
+#define RK3328_WIN1_ACT_INFO			0x00000218
+#define RK3328_WIN1_DSP_INFO			0x0000021c
+#define RK3328_WIN1_DSP_ST			0x00000220
+#define RK3328_WIN1_SCL_FACTOR_YRGB		0x00000224
+#define RK3328_WIN1_SCL_FACTOR_CBR		0x00000228
+#define RK3328_WIN1_SCL_OFFSET			0x0000022c
+#define RK3328_WIN1_SRC_ALPHA_CTRL		0x00000230
+#define RK3328_WIN1_DST_ALPHA_CTRL		0x00000234
+#define RK3328_WIN1_FADING_CTRL			0x00000238
+#define RK3328_WIN1_CTRL2			0x0000023c
+#define RK3328_DBG_WIN1_REG0			0x000002f0
+#define RK3328_DBG_WIN1_REG1			0x000002f4
+#define RK3328_DBG_WIN1_REG2			0x000002f8
+#define RK3328_DBG_WIN1_RESERVED		0x000002fc
+#define RK3328_WIN2_CTRL0			0x00000300
+#define RK3328_WIN2_CTRL1			0x00000304
+#define RK3328_WIN2_COLOR_KEY			0x00000308
+#define RK3328_WIN2_VIR				0x0000030c
+#define RK3328_WIN2_YRGB_MST			0x00000310
+#define RK3328_WIN2_CBR_MST			0x00000314
+#define RK3328_WIN2_ACT_INFO			0x00000318
+#define RK3328_WIN2_DSP_INFO			0x0000031c
+#define RK3328_WIN2_DSP_ST			0x00000320
+#define RK3328_WIN2_SCL_FACTOR_YRGB		0x00000324
+#define RK3328_WIN2_SCL_FACTOR_CBR		0x00000328
+#define RK3328_WIN2_SCL_OFFSET			0x0000032c
+#define RK3328_WIN2_SRC_ALPHA_CTRL		0x00000330
+#define RK3328_WIN2_DST_ALPHA_CTRL		0x00000334
+#define RK3328_WIN2_FADING_CTRL			0x00000338
+#define RK3328_WIN2_CTRL2			0x0000033c
+#define RK3328_DBG_WIN2_REG0			0x000003f0
+#define RK3328_DBG_WIN2_REG1			0x000003f4
+#define RK3328_DBG_WIN2_REG2			0x000003f8
+#define RK3328_DBG_WIN2_RESERVED		0x000003fc
+#define RK3328_WIN3_CTRL0			0x00000400
+#define RK3328_WIN3_CTRL1			0x00000404
+#define RK3328_WIN3_COLOR_KEY			0x00000408
+#define RK3328_WIN3_VIR				0x0000040c
+#define RK3328_WIN3_YRGB_MST			0x00000410
+#define RK3328_WIN3_CBR_MST			0x00000414
+#define RK3328_WIN3_ACT_INFO			0x00000418
+#define RK3328_WIN3_DSP_INFO			0x0000041c
+#define RK3328_WIN3_DSP_ST			0x00000420
+#define RK3328_WIN3_SCL_FACTOR_YRGB		0x00000424
+#define RK3328_WIN3_SCL_FACTOR_CBR		0x00000428
+#define RK3328_WIN3_SCL_OFFSET			0x0000042c
+#define RK3328_WIN3_SRC_ALPHA_CTRL		0x00000430
+#define RK3328_WIN3_DST_ALPHA_CTRL		0x00000434
+#define RK3328_WIN3_FADING_CTRL			0x00000438
+#define RK3328_WIN3_CTRL2			0x0000043c
+#define RK3328_DBG_WIN3_REG0			0x000004f0
+#define RK3328_DBG_WIN3_REG1			0x000004f4
+#define RK3328_DBG_WIN3_REG2			0x000004f8
+#define RK3328_DBG_WIN3_RESERVED		0x000004fc
+
+#define RK3328_HWC_CTRL0			0x00000500
+#define RK3328_HWC_CTRL1			0x00000504
+#define RK3328_HWC_MST				0x00000508
+#define RK3328_HWC_DSP_ST			0x0000050c
+#define RK3328_HWC_SRC_ALPHA_CTRL		0x00000510
+#define RK3328_HWC_DST_ALPHA_CTRL		0x00000514
+#define RK3328_HWC_FADING_CTRL			0x00000518
+#define RK3328_HWC_RESERVED1			0x0000051c
+#define RK3328_POST_DSP_HACT_INFO		0x00000600
+#define RK3328_POST_DSP_VACT_INFO		0x00000604
+#define RK3328_POST_SCL_FACTOR_YRGB		0x00000608
+#define RK3328_POST_RESERVED			0x0000060c
+#define RK3328_POST_SCL_CTRL			0x00000610
+#define RK3328_POST_DSP_VACT_INFO_F1		0x00000614
+#define RK3328_DSP_HTOTAL_HS_END		0x00000618
+#define RK3328_DSP_HACT_ST_END			0x0000061c
+#define RK3328_DSP_VTOTAL_VS_END		0x00000620
+#define RK3328_DSP_VACT_ST_END			0x00000624
+#define RK3328_DSP_VS_ST_END_F1			0x00000628
+#define RK3328_DSP_VACT_ST_END_F1		0x0000062c
+#define RK3328_BCSH_COLOR_BAR			0x00000640
+#define RK3328_BCSH_BCS				0x00000644
+#define RK3328_BCSH_H				0x00000648
+#define RK3328_BCSH_CTRL			0x0000064c
+#define RK3328_FRC_LOWER01_0			0x00000678
+#define RK3328_FRC_LOWER01_1			0x0000067c
+#define RK3328_FRC_LOWER10_0			0x00000680
+#define RK3328_FRC_LOWER10_1			0x00000684
+#define RK3328_FRC_LOWER11_0			0x00000688
+#define RK3328_FRC_LOWER11_1			0x0000068c
+#define RK3328_DBG_POST_REG0			0x000006e8
+#define RK3328_DBG_POST_RESERVED		0x000006ec
+#define RK3328_DBG_DATAO			0x000006f0
+#define RK3328_DBG_DATAO_2			0x000006f4
+
+/* sdr to hdr */
+#define RK3328_SDR2HDR_CTRL			0x00000700
+#define RK3328_EOTF_OETF_Y0			0x00000704
+#define RK3328_RESERVED0001			0x00000708
+#define RK3328_RESERVED0002			0x0000070c
+#define RK3328_EOTF_OETF_Y1			0x00000710
+#define RK3328_EOTF_OETF_Y64			0x0000080c
+#define RK3328_OETF_DX_DXPOW1			0x00000810
+#define RK3328_OETF_DX_DXPOW64			0x0000090c
+#define RK3328_OETF_XN1				0x00000910
+#define RK3328_OETF_XN63			0x00000a08
+
+/* hdr to sdr */
+#define RK3328_HDR2SDR_CTRL			0x00000a10
+#define RK3328_HDR2SDR_SRC_RANGE		0x00000a14
+#define RK3328_HDR2SDR_NORMFACEETF		0x00000a18
+#define RK3328_RESERVED0003			0x00000a1c
+#define RK3328_HDR2SDR_DST_RANGE		0x00000a20
+#define RK3328_HDR2SDR_NORMFACCGAMMA		0x00000a24
+#define RK3328_EETF_OETF_Y0			0x00000a28
+#define RK3328_SAT_Y0				0x00000a2c
+#define RK3328_EETF_OETF_Y1			0x00000a30
+#define RK3328_SAT_Y1				0x00000ab0
+#define RK3328_SAT_Y8				0x00000acc
+
+#define RK3328_HWC_LUT_ADDR			0x00000c00
+
 /* rk3036 register definition */
 #define RK3036_SYS_CTRL			0x00
 #define RK3036_DSP_CTRL0		0x04
@@ -166,197 +878,4 @@
 #define RK3036_HWC_LUT_ADDR		0x800
 /* rk3036 register definition end */
 
-/* rk3399 register definition */
-#define RK3399_REG_CFG_DONE		0x00000
-#define RK3399_VERSION_INFO		0x00004
-#define RK3399_SYS_CTRL			0x00008
-#define RK3399_SYS_CTRL1		0x0000c
-#define RK3399_DSP_CTRL0		0x00010
-#define RK3399_DSP_CTRL1		0x00014
-#define RK3399_DSP_BG			0x00018
-#define RK3399_MCU_CTRL			0x0001c
-#define RK3399_WB_CTRL0			0x00020
-#define RK3399_WB_CTRL1			0x00024
-#define RK3399_WB_YRGB_MST		0x00028
-#define RK3399_WB_CBR_MST		0x0002c
-#define RK3399_WIN0_CTRL0		0x00030
-#define RK3399_WIN0_CTRL1		0x00034
-#define RK3399_WIN0_COLOR_KEY		0x00038
-#define RK3399_WIN0_VIR			0x0003c
-#define RK3399_WIN0_YRGB_MST		0x00040
-#define RK3399_WIN0_CBR_MST		0x00044
-#define RK3399_WIN0_ACT_INFO		0x00048
-#define RK3399_WIN0_DSP_INFO		0x0004c
-#define RK3399_WIN0_DSP_ST		0x00050
-#define RK3399_WIN0_SCL_FACTOR_YRGB	0x00054
-#define RK3399_WIN0_SCL_FACTOR_CBR	0x00058
-#define RK3399_WIN0_SCL_OFFSET		0x0005c
-#define RK3399_WIN0_SRC_ALPHA_CTRL	0x00060
-#define RK3399_WIN0_DST_ALPHA_CTRL	0x00064
-#define RK3399_WIN0_FADING_CTRL		0x00068
-#define RK3399_WIN0_CTRL2		0x0006c
-#define RK3399_WIN1_CTRL0		0x00070
-#define RK3399_WIN1_CTRL1		0x00074
-#define RK3399_WIN1_COLOR_KEY		0x00078
-#define RK3399_WIN1_VIR			0x0007c
-#define RK3399_WIN1_YRGB_MST		0x00080
-#define RK3399_WIN1_CBR_MST		0x00084
-#define RK3399_WIN1_ACT_INFO		0x00088
-#define RK3399_WIN1_DSP_INFO		0x0008c
-#define RK3399_WIN1_DSP_ST		0x00090
-#define RK3399_WIN1_SCL_FACTOR_YRGB	0x00094
-#define RK3399_WIN1_SCL_FACTOR_CBR	0x00098
-#define RK3399_WIN1_SCL_OFFSET		0x0009c
-#define RK3399_WIN1_SRC_ALPHA_CTRL	0x000a0
-#define RK3399_WIN1_DST_ALPHA_CTRL	0x000a4
-#define RK3399_WIN1_FADING_CTRL		0x000a8
-#define RK3399_WIN1_CTRL2		0x000ac
-#define RK3399_WIN2_CTRL0		0x000b0
-#define RK3399_WIN2_CTRL1		0x000b4
-#define RK3399_WIN2_VIR0_1		0x000b8
-#define RK3399_WIN2_VIR2_3		0x000bc
-#define RK3399_WIN2_MST0		0x000c0
-#define RK3399_WIN2_DSP_INFO0		0x000c4
-#define RK3399_WIN2_DSP_ST0		0x000c8
-#define RK3399_WIN2_COLOR_KEY		0x000cc
-#define RK3399_WIN2_MST1		0x000d0
-#define RK3399_WIN2_DSP_INFO1		0x000d4
-#define RK3399_WIN2_DSP_ST1		0x000d8
-#define RK3399_WIN2_SRC_ALPHA_CTRL	0x000dc
-#define RK3399_WIN2_MST2		0x000e0
-#define RK3399_WIN2_DSP_INFO2		0x000e4
-#define RK3399_WIN2_DSP_ST2		0x000e8
-#define RK3399_WIN2_DST_ALPHA_CTRL	0x000ec
-#define RK3399_WIN2_MST3		0x000f0
-#define RK3399_WIN2_DSP_INFO3		0x000f4
-#define RK3399_WIN2_DSP_ST3		0x000f8
-#define RK3399_WIN2_FADING_CTRL		0x000fc
-#define RK3399_WIN3_CTRL0		0x00100
-#define RK3399_WIN3_CTRL1		0x00104
-#define RK3399_WIN3_VIR0_1		0x00108
-#define RK3399_WIN3_VIR2_3		0x0010c
-#define RK3399_WIN3_MST0		0x00110
-#define RK3399_WIN3_DSP_INFO0		0x00114
-#define RK3399_WIN3_DSP_ST0		0x00118
-#define RK3399_WIN3_COLOR_KEY		0x0011c
-#define RK3399_WIN3_MST1		0x00120
-#define RK3399_WIN3_DSP_INFO1		0x00124
-#define RK3399_WIN3_DSP_ST1		0x00128
-#define RK3399_WIN3_SRC_ALPHA_CTRL	0x0012c
-#define RK3399_WIN3_MST2		0x00130
-#define RK3399_WIN3_DSP_INFO2		0x00134
-#define RK3399_WIN3_DSP_ST2		0x00138
-#define RK3399_WIN3_DST_ALPHA_CTRL	0x0013c
-#define RK3399_WIN3_MST3		0x00140
-#define RK3399_WIN3_DSP_INFO3		0x00144
-#define RK3399_WIN3_DSP_ST3		0x00148
-#define RK3399_WIN3_FADING_CTRL		0x0014c
-#define RK3399_HWC_CTRL0		0x00150
-#define RK3399_HWC_CTRL1		0x00154
-#define RK3399_HWC_MST			0x00158
-#define RK3399_HWC_DSP_ST		0x0015c
-#define RK3399_HWC_SRC_ALPHA_CTRL	0x00160
-#define RK3399_HWC_DST_ALPHA_CTRL	0x00164
-#define RK3399_HWC_FADING_CTRL		0x00168
-#define RK3399_HWC_RESERVED1		0x0016c
-#define RK3399_POST_DSP_HACT_INFO	0x00170
-#define RK3399_POST_DSP_VACT_INFO	0x00174
-#define RK3399_POST_SCL_FACTOR_YRGB	0x00178
-#define RK3399_POST_RESERVED		0x0017c
-#define RK3399_POST_SCL_CTRL		0x00180
-#define RK3399_POST_DSP_VACT_INFO_F1	0x00184
-#define RK3399_DSP_HTOTAL_HS_END	0x00188
-#define RK3399_DSP_HACT_ST_END		0x0018c
-#define RK3399_DSP_VTOTAL_VS_END	0x00190
-#define RK3399_DSP_VACT_ST_END		0x00194
-#define RK3399_DSP_VS_ST_END_F1		0x00198
-#define RK3399_DSP_VACT_ST_END_F1	0x0019c
-#define RK3399_PWM_CTRL			0x001a0
-#define RK3399_PWM_PERIOD_HPR		0x001a4
-#define RK3399_PWM_DUTY_LPR		0x001a8
-#define RK3399_PWM_CNT			0x001ac
-#define RK3399_BCSH_COLOR_BAR		0x001b0
-#define RK3399_BCSH_BCS			0x001b4
-#define RK3399_BCSH_H			0x001b8
-#define RK3399_BCSH_CTRL		0x001bc
-#define RK3399_CABC_CTRL0		0x001c0
-#define RK3399_CABC_CTRL1		0x001c4
-#define RK3399_CABC_CTRL2		0x001c8
-#define RK3399_CABC_CTRL3		0x001cc
-#define RK3399_CABC_GAUSS_LINE0_0	0x001d0
-#define RK3399_CABC_GAUSS_LINE0_1	0x001d4
-#define RK3399_CABC_GAUSS_LINE1_0	0x001d8
-#define RK3399_CABC_GAUSS_LINE1_1	0x001dc
-#define RK3399_CABC_GAUSS_LINE2_0	0x001e0
-#define RK3399_CABC_GAUSS_LINE2_1	0x001e4
-#define RK3399_FRC_LOWER01_0		0x001e8
-#define RK3399_FRC_LOWER01_1		0x001ec
-#define RK3399_FRC_LOWER10_0		0x001f0
-#define RK3399_FRC_LOWER10_1		0x001f4
-#define RK3399_FRC_LOWER11_0		0x001f8
-#define RK3399_FRC_LOWER11_1		0x001fc
-#define RK3399_AFBCD0_CTRL		0x00200
-#define RK3399_AFBCD0_HDR_PTR		0x00204
-#define RK3399_AFBCD0_PIC_SIZE		0x00208
-#define RK3399_AFBCD0_STATUS		0x0020c
-#define RK3399_AFBCD1_CTRL		0x00220
-#define RK3399_AFBCD1_HDR_PTR		0x00224
-#define RK3399_AFBCD1_PIC_SIZE		0x00228
-#define RK3399_AFBCD1_STATUS		0x0022c
-#define RK3399_AFBCD2_CTRL		0x00240
-#define RK3399_AFBCD2_HDR_PTR		0x00244
-#define RK3399_AFBCD2_PIC_SIZE		0x00248
-#define RK3399_AFBCD2_STATUS		0x0024c
-#define RK3399_AFBCD3_CTRL		0x00260
-#define RK3399_AFBCD3_HDR_PTR		0x00264
-#define RK3399_AFBCD3_PIC_SIZE		0x00268
-#define RK3399_AFBCD3_STATUS		0x0026c
-#define RK3399_INTR_EN0			0x00280
-#define RK3399_INTR_CLEAR0		0x00284
-#define RK3399_INTR_STATUS0		0x00288
-#define RK3399_INTR_RAW_STATUS0		0x0028c
-#define RK3399_INTR_EN1			0x00290
-#define RK3399_INTR_CLEAR1		0x00294
-#define RK3399_INTR_STATUS1		0x00298
-#define RK3399_INTR_RAW_STATUS1		0x0029c
-#define RK3399_LINE_FLAG		0x002a0
-#define RK3399_VOP_STATUS		0x002a4
-#define RK3399_BLANKING_VALUE		0x002a8
-#define RK3399_MCU_BYPASS_PORT		0x002ac
-#define RK3399_WIN0_DSP_BG		0x002b0
-#define RK3399_WIN1_DSP_BG		0x002b4
-#define RK3399_WIN2_DSP_BG		0x002b8
-#define RK3399_WIN3_DSP_BG		0x002bc
-#define RK3399_YUV2YUV_WIN		0x002c0
-#define RK3399_YUV2YUV_POST		0x002c4
-#define RK3399_AUTO_GATING_EN		0x002cc
-#define RK3399_WIN0_CSC_COE		0x003a0
-#define RK3399_WIN1_CSC_COE		0x003c0
-#define RK3399_WIN2_CSC_COE		0x003e0
-#define RK3399_WIN3_CSC_COE		0x00400
-#define RK3399_HWC_CSC_COE		0x00420
-#define RK3399_BCSH_R2Y_CSC_COE		0x00440
-#define RK3399_BCSH_Y2R_CSC_COE		0x00460
-#define RK3399_POST_YUV2YUV_Y2R_COE	0x00480
-#define RK3399_POST_YUV2YUV_3X3_COE	0x004a0
-#define RK3399_POST_YUV2YUV_R2Y_COE	0x004c0
-#define RK3399_WIN0_YUV2YUV_Y2R		0x004e0
-#define RK3399_WIN0_YUV2YUV_3X3		0x00500
-#define RK3399_WIN0_YUV2YUV_R2Y		0x00520
-#define RK3399_WIN1_YUV2YUV_Y2R		0x00540
-#define RK3399_WIN1_YUV2YUV_3X3		0x00560
-#define RK3399_WIN1_YUV2YUV_R2Y		0x00580
-#define RK3399_WIN2_YUV2YUV_Y2R		0x005a0
-#define RK3399_WIN2_YUV2YUV_3X3		0x005c0
-#define RK3399_WIN2_YUV2YUV_R2Y		0x005e0
-#define RK3399_WIN3_YUV2YUV_Y2R		0x00600
-#define RK3399_WIN3_YUV2YUV_3X3		0x00620
-#define RK3399_WIN3_YUV2YUV_R2Y		0x00640
-#define RK3399_WIN2_LUT_ADDR		0x01000
-#define RK3399_WIN3_LUT_ADDR		0x01400
-#define RK3399_HWC_LUT_ADDR		0x01800
-#define RK3399_CABC_GAMMA_LUT_ADDR	0x01c00
-#define RK3399_GAMMA_LUT_ADDR		0x02000
-/* rk3399 register definition end */
-
 #endif /* _ROCKCHIP_VOP_REG_H */
diff --git a/drivers/gpu/drm/savage/savage_drv.c b/drivers/gpu/drm/savage/savage_drv.c
index 78c6d8e..2bddeb8 100644
--- a/drivers/gpu/drm/savage/savage_drv.c
+++ b/drivers/gpu/drm/savage/savage_drv.c
@@ -55,7 +55,6 @@ static struct drm_driver driver = {
 	.preclose = savage_reclaim_buffers,
 	.lastclose = savage_driver_lastclose,
 	.unload = savage_driver_unload,
-	.set_busid = drm_pci_set_busid,
 	.ioctls = savage_ioctls,
 	.dma_ioctl = savage_bci_buffers,
 	.fops = &savage_driver_fops,
@@ -75,12 +74,12 @@ static struct pci_driver savage_pci_driver = {
 static int __init savage_init(void)
 {
 	driver.num_ioctls = savage_max_ioctl;
-	return drm_pci_init(&driver, &savage_pci_driver);
+	return drm_legacy_pci_init(&driver, &savage_pci_driver);
 }
 
 static void __exit savage_exit(void)
 {
-	drm_pci_exit(&driver, &savage_pci_driver);
+	drm_legacy_pci_exit(&driver, &savage_pci_driver);
 }
 
 module_init(savage_init);
diff --git a/drivers/gpu/drm/shmobile/shmob_drm_drv.c b/drivers/gpu/drm/shmobile/shmob_drm_drv.c
index 800d1d2..5925725 100644
--- a/drivers/gpu/drm/shmobile/shmob_drm_drv.c
+++ b/drivers/gpu/drm/shmobile/shmob_drm_drv.c
@@ -145,8 +145,6 @@ static struct drm_driver shmob_drm_driver = {
 	.gem_prime_vunmap	= drm_gem_cma_prime_vunmap,
 	.gem_prime_mmap		= drm_gem_cma_prime_mmap,
 	.dumb_create		= drm_gem_cma_dumb_create,
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset,
-	.dumb_destroy		= drm_gem_dumb_destroy,
 	.fops			= &shmob_drm_fops,
 	.name			= "shmob-drm",
 	.desc			= "Renesas SH Mobile DRM",
@@ -277,7 +275,7 @@ static int shmob_drm_probe(struct platform_device *pdev)
 	ret = drm_irq_install(ddev, platform_get_irq(pdev, 0));
 	if (ret < 0) {
 		dev_err(&pdev->dev, "failed to install IRQ handler\n");
-		goto err_vblank_cleanup;
+		goto err_modeset_cleanup;
 	}
 
 	/*
@@ -292,8 +290,6 @@ static int shmob_drm_probe(struct platform_device *pdev)
 
 err_irq_uninstall:
 	drm_irq_uninstall(ddev);
-err_vblank_cleanup:
-	drm_vblank_cleanup(ddev);
 err_modeset_cleanup:
 	drm_kms_helper_poll_fini(ddev);
 	drm_mode_config_cleanup(ddev);
diff --git a/drivers/gpu/drm/sis/sis_drv.c b/drivers/gpu/drm/sis/sis_drv.c
index 7f05da1..e04a926 100644
--- a/drivers/gpu/drm/sis/sis_drv.c
+++ b/drivers/gpu/drm/sis/sis_drv.c
@@ -104,7 +104,6 @@ static struct drm_driver driver = {
 	.open = sis_driver_open,
 	.preclose = sis_reclaim_buffers_locked,
 	.postclose = sis_driver_postclose,
-	.set_busid = drm_pci_set_busid,
 	.dma_quiescent = sis_idle,
 	.lastclose = sis_lastclose,
 	.ioctls = sis_ioctls,
@@ -125,12 +124,12 @@ static struct pci_driver sis_pci_driver = {
 static int __init sis_init(void)
 {
 	driver.num_ioctls = sis_max_ioctl;
-	return drm_pci_init(&driver, &sis_pci_driver);
+	return drm_legacy_pci_init(&driver, &sis_pci_driver);
 }
 
 static void __exit sis_exit(void)
 {
-	drm_pci_exit(&driver, &sis_pci_driver);
+	drm_legacy_pci_exit(&driver, &sis_pci_driver);
 }
 
 module_init(sis_init);
diff --git a/drivers/gpu/drm/sti/sti_crtc.c b/drivers/gpu/drm/sti/sti_crtc.c
index d45a433..e8a4d48 100644
--- a/drivers/gpu/drm/sti/sti_crtc.c
+++ b/drivers/gpu/drm/sti/sti_crtc.c
@@ -20,7 +20,8 @@
 #include "sti_vid.h"
 #include "sti_vtg.h"
 
-static void sti_crtc_enable(struct drm_crtc *crtc)
+static void sti_crtc_atomic_enable(struct drm_crtc *crtc,
+				   struct drm_crtc_state *old_state)
 {
 	struct sti_mixer *mixer = to_sti_mixer(crtc);
 
@@ -31,7 +32,8 @@ static void sti_crtc_enable(struct drm_crtc *crtc)
 	drm_crtc_vblank_on(crtc);
 }
 
-static void sti_crtc_disabling(struct drm_crtc *crtc)
+static void sti_crtc_atomic_disable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct sti_mixer *mixer = to_sti_mixer(crtc);
 
@@ -222,10 +224,10 @@ static void sti_crtc_atomic_flush(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs sti_crtc_helper_funcs = {
-	.enable = sti_crtc_enable,
-	.disable = sti_crtc_disabling,
 	.mode_set_nofb = sti_crtc_mode_set_nofb,
 	.atomic_flush = sti_crtc_atomic_flush,
+	.atomic_enable = sti_crtc_atomic_enable,
+	.atomic_disable = sti_crtc_atomic_disable,
 };
 
 static void sti_crtc_destroy(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/sti/sti_cursor.c b/drivers/gpu/drm/sti/sti_cursor.c
index 5b3a41f..b709ebb 100644
--- a/drivers/gpu/drm/sti/sti_cursor.c
+++ b/drivers/gpu/drm/sti/sti_cursor.c
@@ -348,7 +348,6 @@ static const struct drm_plane_funcs sti_cursor_plane_helpers_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.destroy = sti_cursor_destroy,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.reset = sti_plane_reset,
 	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_plane_destroy_state,
@@ -392,7 +391,7 @@ struct drm_plane *sti_cursor_create(struct drm_device *drm_dev,
 				       &sti_cursor_plane_helpers_funcs,
 				       cursor_supported_formats,
 				       ARRAY_SIZE(cursor_supported_formats),
-				       DRM_PLANE_TYPE_CURSOR, NULL);
+				       NULL, DRM_PLANE_TYPE_CURSOR, NULL);
 	if (res) {
 		DRM_ERROR("Failed to initialize universal plane\n");
 		goto err_plane;
diff --git a/drivers/gpu/drm/sti/sti_drv.c b/drivers/gpu/drm/sti/sti_drv.c
index a4b57428..1700c54 100644
--- a/drivers/gpu/drm/sti/sti_drv.c
+++ b/drivers/gpu/drm/sti/sti_drv.c
@@ -175,8 +175,6 @@ static struct drm_driver sti_driver = {
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops = &drm_gem_cma_vm_ops,
 	.dumb_create = drm_gem_cma_dumb_create,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.fops = &sti_driver_fops,
 
 	.enable_vblank = sti_crtc_enable_vblank,
@@ -237,7 +235,6 @@ static void sti_cleanup(struct drm_device *ddev)
 	}
 
 	drm_kms_helper_poll_fini(ddev);
-	drm_vblank_cleanup(ddev);
 	component_unbind_all(ddev->dev, ddev);
 	kfree(private);
 	ddev->dev_private = NULL;
diff --git a/drivers/gpu/drm/sti/sti_dvo.c b/drivers/gpu/drm/sti/sti_dvo.c
index 24ebc6b..852bf22 100644
--- a/drivers/gpu/drm/sti/sti_dvo.c
+++ b/drivers/gpu/drm/sti/sti_dvo.c
@@ -412,7 +412,6 @@ static int sti_dvo_late_register(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs sti_dvo_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = sti_dvo_connector_detect,
 	.destroy = drm_connector_cleanup,
@@ -582,7 +581,7 @@ static int sti_dvo_remove(struct platform_device *pdev)
 	return 0;
 }
 
-static struct of_device_id dvo_of_match[] = {
+static const struct of_device_id dvo_of_match[] = {
 	{ .compatible = "st,stih407-dvo", },
 	{ /* end node */ }
 };
diff --git a/drivers/gpu/drm/sti/sti_gdp.c b/drivers/gpu/drm/sti/sti_gdp.c
index 5ee0503..b65eea4 100644
--- a/drivers/gpu/drm/sti/sti_gdp.c
+++ b/drivers/gpu/drm/sti/sti_gdp.c
@@ -895,7 +895,6 @@ static const struct drm_plane_funcs sti_gdp_plane_helpers_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.destroy = sti_gdp_destroy,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.reset = sti_plane_reset,
 	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_plane_destroy_state,
@@ -931,7 +930,7 @@ struct drm_plane *sti_gdp_create(struct drm_device *drm_dev,
 				       &sti_gdp_plane_helpers_funcs,
 				       gdp_supported_formats,
 				       ARRAY_SIZE(gdp_supported_formats),
-				       type, NULL);
+				       NULL, type, NULL);
 	if (res) {
 		DRM_ERROR("Failed to initialize universal plane\n");
 		goto err;
diff --git a/drivers/gpu/drm/sti/sti_hda.c b/drivers/gpu/drm/sti/sti_hda.c
index d6ed909..cf65e32 100644
--- a/drivers/gpu/drm/sti/sti_hda.c
+++ b/drivers/gpu/drm/sti/sti_hda.c
@@ -647,7 +647,6 @@ static int sti_hda_late_register(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs sti_hda_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/sti/sti_hdmi.c b/drivers/gpu/drm/sti/sti_hdmi.c
index a59c95a..30f02d2 100644
--- a/drivers/gpu/drm/sti/sti_hdmi.c
+++ b/drivers/gpu/drm/sti/sti_hdmi.c
@@ -434,7 +434,7 @@ static int hdmi_avi_infoframe_config(struct sti_hdmi *hdmi)
 
 	DRM_DEBUG_DRIVER("\n");
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&infoframe, mode);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&infoframe, mode, false);
 	if (ret < 0) {
 		DRM_ERROR("failed to setup AVI infoframe: %d\n", ret);
 		return ret;
@@ -1113,12 +1113,10 @@ static int sti_hdmi_late_register(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs sti_hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = sti_hdmi_connector_detect,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.atomic_set_property = sti_hdmi_connector_set_property,
 	.atomic_get_property = sti_hdmi_connector_get_property,
 	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
diff --git a/drivers/gpu/drm/sti/sti_hqvdp.c b/drivers/gpu/drm/sti/sti_hqvdp.c
index a1c161f..b19b343 100644
--- a/drivers/gpu/drm/sti/sti_hqvdp.c
+++ b/drivers/gpu/drm/sti/sti_hqvdp.c
@@ -958,6 +958,7 @@ static void sti_hqvdp_start_xp70(struct sti_hqvdp *hqvdp)
 	}
 	if (i == POLL_MAX_ATTEMPT) {
 		DRM_ERROR("Could not reset\n");
+		clk_disable_unprepare(hqvdp->clk);
 		goto out;
 	}
 
@@ -994,6 +995,7 @@ static void sti_hqvdp_start_xp70(struct sti_hqvdp *hqvdp)
 	}
 	if (i == POLL_MAX_ATTEMPT) {
 		DRM_ERROR("Could not boot\n");
+		clk_disable_unprepare(hqvdp->clk);
 		goto out;
 	}
 
@@ -1081,6 +1083,7 @@ static int sti_hqvdp_atomic_check(struct drm_plane *drm_plane,
 					    &hqvdp->vtg_nb,
 					    crtc)) {
 			DRM_ERROR("Cannot register VTG notifier\n");
+			clk_disable_unprepare(hqvdp->clk_pix_main);
 			return -EINVAL;
 		}
 		hqvdp->vtg_registered = true;
@@ -1273,7 +1276,6 @@ static const struct drm_plane_funcs sti_hqvdp_plane_helpers_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.destroy = sti_hqvdp_destroy,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.reset = sti_plane_reset,
 	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_plane_destroy_state,
@@ -1295,7 +1297,7 @@ static struct drm_plane *sti_hqvdp_create(struct drm_device *drm_dev,
 				       &sti_hqvdp_plane_helpers_funcs,
 				       hqvdp_supported_formats,
 				       ARRAY_SIZE(hqvdp_supported_formats),
-				       DRM_PLANE_TYPE_OVERLAY, NULL);
+				       NULL, DRM_PLANE_TYPE_OVERLAY, NULL);
 	if (res) {
 		DRM_ERROR("Failed to initialize universal plane\n");
 		return NULL;
@@ -1395,7 +1397,7 @@ static int sti_hqvdp_remove(struct platform_device *pdev)
 	return 0;
 }
 
-static struct of_device_id hqvdp_of_match[] = {
+static const struct of_device_id hqvdp_of_match[] = {
 	{ .compatible = "st,stih407-hqvdp", },
 	{ /* end node */ }
 };
diff --git a/drivers/gpu/drm/stm/Kconfig b/drivers/gpu/drm/stm/Kconfig
index 8fe5b18..35367ad 100644
--- a/drivers/gpu/drm/stm/Kconfig
+++ b/drivers/gpu/drm/stm/Kconfig
@@ -4,7 +4,7 @@
 	select DRM_KMS_HELPER
 	select DRM_GEM_CMA_HELPER
 	select DRM_KMS_CMA_HELPER
-	select DRM_PANEL
+	select DRM_PANEL_BRIDGE
 	select VIDEOMODE_HELPERS
 	select FB_PROVIDE_GET_FB_UNMAPPED_AREA
 
@@ -13,3 +13,10 @@
 	  STMicroelectronics STM32 MCUs.
 	  To compile this driver as a module, choose M here: the module
 	  will be called stm-drm.
+
+config DRM_STM_DSI
+	tristate "STMicroelectronics specific extensions for Synopsys MIPI DSI"
+	depends on DRM_STM
+	select DRM_DW_MIPI_DSI
+	help
+	  Choose this option for MIPI DSI support on STMicroelectronics SoC.
diff --git a/drivers/gpu/drm/stm/Makefile b/drivers/gpu/drm/stm/Makefile
index a09ecf4..d883adc 100644
--- a/drivers/gpu/drm/stm/Makefile
+++ b/drivers/gpu/drm/stm/Makefile
@@ -2,4 +2,6 @@
 	drv.o \
 	ltdc.o
 
+obj-$(CONFIG_DRM_STM_DSI) += dw_mipi_dsi-stm.o
+
 obj-$(CONFIG_DRM_STM) += stm-drm.o
diff --git a/drivers/gpu/drm/stm/drv.c b/drivers/gpu/drm/stm/drv.c
index 83ab48f..b333b37 100644
--- a/drivers/gpu/drm/stm/drv.c
+++ b/drivers/gpu/drm/stm/drv.c
@@ -20,13 +20,6 @@
 
 #include "ltdc.h"
 
-#define DRIVER_NAME		"stm"
-#define DRIVER_DESC		"STMicroelectronics SoC DRM"
-#define DRIVER_DATE		"20170330"
-#define DRIVER_MAJOR		1
-#define DRIVER_MINOR		0
-#define DRIVER_PATCH_LEVEL	0
-
 #define STM_MAX_FB_WIDTH	2048
 #define STM_MAX_FB_HEIGHT	2048 /* same as width to handle orientation */
 
@@ -59,16 +52,14 @@ static struct drm_driver drv_driver = {
 	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME |
 			   DRIVER_ATOMIC,
 	.lastclose = drv_lastclose,
-	.name = DRIVER_NAME,
-	.desc = DRIVER_DESC,
-	.date = DRIVER_DATE,
-	.major = DRIVER_MAJOR,
-	.minor = DRIVER_MINOR,
-	.patchlevel = DRIVER_PATCH_LEVEL,
+	.name = "stm",
+	.desc = "STMicroelectronics SoC DRM",
+	.date = "20170330",
+	.major = 1,
+	.minor = 0,
+	.patchlevel = 0,
 	.fops = &drv_driver_fops,
 	.dumb_create = drm_gem_cma_dumb_create,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
@@ -206,7 +197,7 @@ static struct platform_driver stm_drm_platform_driver = {
 	.probe = stm_drm_platform_probe,
 	.remove = stm_drm_platform_remove,
 	.driver = {
-		.name = DRIVER_NAME,
+		.name = "stm32-display",
 		.of_match_table = drv_dt_ids,
 	},
 };
diff --git a/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
new file mode 100644
index 0000000..568c5d0
--- /dev/null
+++ b/drivers/gpu/drm/stm/dw_mipi_dsi-stm.c
@@ -0,0 +1,352 @@
+/*
+ * Copyright (C) STMicroelectronics SA 2017
+ *
+ * Authors: Philippe Cornu <philippe.cornu@st.com>
+ *          Yannick Fertre <yannick.fertre@st.com>
+ *
+ * License terms:  GNU General Public License (GPL), version 2
+ */
+
+#include <linux/clk.h>
+#include <linux/iopoll.h>
+#include <linux/module.h>
+#include <drm/drmP.h>
+#include <drm/drm_mipi_dsi.h>
+#include <drm/bridge/dw_mipi_dsi.h>
+#include <video/mipi_display.h>
+
+/* DSI wrapper register & bit definitions */
+/* Note: registers are named as in the Reference Manual */
+#define DSI_WCFGR	0x0400		/* Wrapper ConFiGuration Reg */
+#define WCFGR_DSIM	BIT(0)		/* DSI Mode */
+#define WCFGR_COLMUX	GENMASK(3, 1)	/* COLor MUltipleXing */
+
+#define DSI_WCR		0x0404		/* Wrapper Control Reg */
+#define WCR_DSIEN	BIT(3)		/* DSI ENable */
+
+#define DSI_WISR	0x040C		/* Wrapper Interrupt and Status Reg */
+#define WISR_PLLLS	BIT(8)		/* PLL Lock Status */
+#define WISR_RRS	BIT(12)		/* Regulator Ready Status */
+
+#define DSI_WPCR0	0x0418		/* Wrapper Phy Conf Reg 0 */
+#define WPCR0_UIX4	GENMASK(5, 0)	/* Unit Interval X 4 */
+#define WPCR0_TDDL	BIT(16)		/* Turn Disable Data Lanes */
+
+#define DSI_WRPCR	0x0430		/* Wrapper Regulator & Pll Ctrl Reg */
+#define WRPCR_PLLEN	BIT(0)		/* PLL ENable */
+#define WRPCR_NDIV	GENMASK(8, 2)	/* pll loop DIVision Factor */
+#define WRPCR_IDF	GENMASK(14, 11)	/* pll Input Division Factor */
+#define WRPCR_ODF	GENMASK(17, 16)	/* pll Output Division Factor */
+#define WRPCR_REGEN	BIT(24)		/* REGulator ENable */
+#define WRPCR_BGREN	BIT(28)		/* BandGap Reference ENable */
+#define IDF_MIN		1
+#define IDF_MAX		7
+#define NDIV_MIN	10
+#define NDIV_MAX	125
+#define ODF_MIN		1
+#define ODF_MAX		8
+
+/* dsi color format coding according to the datasheet */
+enum dsi_color {
+	DSI_RGB565_CONF1,
+	DSI_RGB565_CONF2,
+	DSI_RGB565_CONF3,
+	DSI_RGB666_CONF1,
+	DSI_RGB666_CONF2,
+	DSI_RGB888,
+};
+
+#define LANE_MIN_KBPS	31250
+#define LANE_MAX_KBPS	500000
+
+/* Sleep & timeout for regulator on/off, pll lock/unlock & fifo empty */
+#define SLEEP_US	1000
+#define TIMEOUT_US	200000
+
+struct dw_mipi_dsi_stm {
+	void __iomem *base;
+	struct clk *pllref_clk;
+};
+
+static inline void dsi_write(struct dw_mipi_dsi_stm *dsi, u32 reg, u32 val)
+{
+	writel(val, dsi->base + reg);
+}
+
+static inline u32 dsi_read(struct dw_mipi_dsi_stm *dsi, u32 reg)
+{
+	return readl(dsi->base + reg);
+}
+
+static inline void dsi_set(struct dw_mipi_dsi_stm *dsi, u32 reg, u32 mask)
+{
+	dsi_write(dsi, reg, dsi_read(dsi, reg) | mask);
+}
+
+static inline void dsi_clear(struct dw_mipi_dsi_stm *dsi, u32 reg, u32 mask)
+{
+	dsi_write(dsi, reg, dsi_read(dsi, reg) & ~mask);
+}
+
+static inline void dsi_update_bits(struct dw_mipi_dsi_stm *dsi, u32 reg,
+				   u32 mask, u32 val)
+{
+	dsi_write(dsi, reg, (dsi_read(dsi, reg) & ~mask) | val);
+}
+
+static enum dsi_color dsi_color_from_mipi(enum mipi_dsi_pixel_format fmt)
+{
+	switch (fmt) {
+	case MIPI_DSI_FMT_RGB888:
+		return DSI_RGB888;
+	case MIPI_DSI_FMT_RGB666:
+		return DSI_RGB666_CONF2;
+	case MIPI_DSI_FMT_RGB666_PACKED:
+		return DSI_RGB666_CONF1;
+	case MIPI_DSI_FMT_RGB565:
+		return DSI_RGB565_CONF1;
+	default:
+		DRM_DEBUG_DRIVER("MIPI color invalid, so we use rgb888\n");
+	}
+	return DSI_RGB888;
+}
+
+static int dsi_pll_get_clkout_khz(int clkin_khz, int idf, int ndiv, int odf)
+{
+	/* prevent division by 0 */
+	if (idf * odf)
+		return DIV_ROUND_CLOSEST(clkin_khz * ndiv, idf * odf);
+
+	return 0;
+}
+
+static int dsi_pll_get_params(int clkin_khz, int clkout_khz,
+			      int *idf, int *ndiv, int *odf)
+{
+	int i, o, n, n_min, n_max;
+	int fvco_min, fvco_max, delta, best_delta; /* all in khz */
+
+	/* Early checks preventing division by 0 & odd results */
+	if ((clkin_khz <= 0) || (clkout_khz <= 0))
+		return -EINVAL;
+
+	fvco_min = LANE_MIN_KBPS * 2 * ODF_MAX;
+	fvco_max = LANE_MAX_KBPS * 2 * ODF_MIN;
+
+	best_delta = 1000000; /* large starting value (1000000 kHz) */
+
+	for (i = IDF_MIN; i <= IDF_MAX; i++) {
+		/* Compute ndiv range according to Fvco */
+		n_min = ((fvco_min * i) / (2 * clkin_khz)) + 1;
+		n_max = (fvco_max * i) / (2 * clkin_khz);
+
+		/* No need to continue idf loop if we reach ndiv max */
+		if (n_min >= NDIV_MAX)
+			break;
+
+		/* Clamp ndiv to valid values */
+		if (n_min < NDIV_MIN)
+			n_min = NDIV_MIN;
+		if (n_max > NDIV_MAX)
+			n_max = NDIV_MAX;
+
+		for (o = ODF_MIN; o <= ODF_MAX; o *= 2) {
+			n = DIV_ROUND_CLOSEST(i * o * clkout_khz, clkin_khz);
+			/* Check ndiv according to vco range */
+			if ((n < n_min) || (n > n_max))
+				continue;
+			/* Check if the new delta is better & save the parameters */
+			delta = dsi_pll_get_clkout_khz(clkin_khz, i, n, o) -
+				clkout_khz;
+			if (delta < 0)
+				delta = -delta;
+			if (delta < best_delta) {
+				*idf = i;
+				*ndiv = n;
+				*odf = o;
+				best_delta = delta;
+			}
+			/* fast return in case of "perfect result" */
+			if (!delta)
+				return 0;
+		}
+	}
+
+	return 0;
+}
+
+static int dw_mipi_dsi_phy_init(void *priv_data)
+{
+	struct dw_mipi_dsi_stm *dsi = priv_data;
+	u32 val;
+	int ret;
+
+	/* Enable the regulator */
+	dsi_set(dsi, DSI_WRPCR, WRPCR_REGEN | WRPCR_BGREN);
+	ret = readl_poll_timeout(dsi->base + DSI_WISR, val, val & WISR_RRS,
+				 SLEEP_US, TIMEOUT_US);
+	if (ret)
+		DRM_DEBUG_DRIVER("!TIMEOUT! waiting REGU, let's continue\n");
+
+	/* Enable the DSI PLL & wait for its lock */
+	dsi_set(dsi, DSI_WRPCR, WRPCR_PLLEN);
+	ret = readl_poll_timeout(dsi->base + DSI_WISR, val, val & WISR_PLLLS,
+				 SLEEP_US, TIMEOUT_US);
+	if (ret)
+		DRM_DEBUG_DRIVER("!TIMEOUT! waiting PLL, let's continue\n");
+
+	/* Enable the DSI wrapper */
+	dsi_set(dsi, DSI_WCR, WCR_DSIEN);
+
+	return 0;
+}
+
+static int
+dw_mipi_dsi_get_lane_mbps(void *priv_data, struct drm_display_mode *mode,
+			  unsigned long mode_flags, u32 lanes, u32 format,
+			  unsigned int *lane_mbps)
+{
+	struct dw_mipi_dsi_stm *dsi = priv_data;
+	unsigned int idf, ndiv, odf, pll_in_khz, pll_out_khz;
+	int ret, bpp;
+	u32 val;
+
+	pll_in_khz = (unsigned int)(clk_get_rate(dsi->pllref_clk) / 1000);
+
+	/* Compute requested pll out */
+	bpp = mipi_dsi_pixel_format_to_bpp(format);
+	pll_out_khz = mode->clock * bpp / lanes;
+	/* Add 20% to pll out to be higher than pixel bw (burst mode only) */
+	pll_out_khz = (pll_out_khz * 12) / 10;
+	if (pll_out_khz > LANE_MAX_KBPS) {
+		pll_out_khz = LANE_MAX_KBPS;
+		DRM_WARN("Warning max phy mbps is used\n");
+	}
+	if (pll_out_khz < LANE_MIN_KBPS) {
+		pll_out_khz = LANE_MIN_KBPS;
+		DRM_WARN("Warning min phy mbps is used\n");
+	}
+
+	/* Compute best pll parameters */
+	idf = 0;
+	ndiv = 0;
+	odf = 0;
+	ret = dsi_pll_get_params(pll_in_khz, pll_out_khz, &idf, &ndiv, &odf);
+	if (ret)
+		DRM_WARN("Warning dsi_pll_get_params(): bad params\n");
+
+	/* Get the adjusted pll out value */
+	pll_out_khz = dsi_pll_get_clkout_khz(pll_in_khz, idf, ndiv, odf);
+
+	/* Set the PLL division factors */
+	dsi_update_bits(dsi, DSI_WRPCR,	WRPCR_NDIV | WRPCR_IDF | WRPCR_ODF,
+			(ndiv << 2) | (idf << 11) | ((ffs(odf) - 1) << 16));
+
+	/* Compute uix4 & set the bit period in high-speed mode */
+	val = 4000000 / pll_out_khz;
+	dsi_update_bits(dsi, DSI_WPCR0, WPCR0_UIX4, val);
+
+	/* Select video mode by resetting DSIM bit */
+	dsi_clear(dsi, DSI_WCFGR, WCFGR_DSIM);
+
+	/* Select the color coding */
+	dsi_update_bits(dsi, DSI_WCFGR, WCFGR_COLMUX,
+			dsi_color_from_mipi(format) << 1);
+
+	*lane_mbps = pll_out_khz / 1000;
+
+	DRM_DEBUG_DRIVER("pll_in %ukHz pll_out %ukHz lane_mbps %uMbps\n",
+			 pll_in_khz, pll_out_khz, *lane_mbps);
+
+	return 0;
+}
+
+static const struct dw_mipi_dsi_phy_ops dw_mipi_dsi_stm_phy_ops = {
+	.init = dw_mipi_dsi_phy_init,
+	.get_lane_mbps = dw_mipi_dsi_get_lane_mbps,
+};
+
+static struct dw_mipi_dsi_plat_data dw_mipi_dsi_stm_plat_data = {
+	.max_data_lanes = 2,
+	.phy_ops = &dw_mipi_dsi_stm_phy_ops,
+};
+
+static const struct of_device_id dw_mipi_dsi_stm_dt_ids[] = {
+	{ .compatible = "st,stm32-dsi", .data = &dw_mipi_dsi_stm_plat_data, },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, dw_mipi_dsi_stm_dt_ids);
+
+static int dw_mipi_dsi_stm_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct dw_mipi_dsi_stm *dsi;
+	struct resource *res;
+	int ret;
+
+	dsi = devm_kzalloc(dev, sizeof(*dsi), GFP_KERNEL);
+	if (!dsi)
+		return -ENOMEM;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		DRM_ERROR("Unable to get resource\n");
+		return -ENODEV;
+	}
+
+	dsi->base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(dsi->base)) {
+		DRM_ERROR("Unable to get dsi registers\n");
+		return PTR_ERR(dsi->base);
+	}
+
+	dsi->pllref_clk = devm_clk_get(dev, "ref");
+	if (IS_ERR(dsi->pllref_clk)) {
+		ret = PTR_ERR(dsi->pllref_clk);
+		dev_err(dev, "Unable to get pll reference clock: %d\n", ret);
+		return ret;
+	}
+
+	ret = clk_prepare_enable(dsi->pllref_clk);
+	if (ret) {
+		dev_err(dev, "%s: Failed to enable pllref_clk\n", __func__);
+		return ret;
+	}
+
+	dw_mipi_dsi_stm_plat_data.base = dsi->base;
+	dw_mipi_dsi_stm_plat_data.priv_data = dsi;
+
+	ret = dw_mipi_dsi_probe(pdev, &dw_mipi_dsi_stm_plat_data);
+	if (ret) {
+		DRM_ERROR("Failed to initialize mipi dsi host\n");
+		clk_disable_unprepare(dsi->pllref_clk);
+	}
+
+	return ret;
+}
+
+static int dw_mipi_dsi_stm_remove(struct platform_device *pdev)
+{
+	struct dw_mipi_dsi_stm *dsi = dw_mipi_dsi_stm_plat_data.priv_data;
+
+	clk_disable_unprepare(dsi->pllref_clk);
+	dw_mipi_dsi_remove(pdev);
+
+	return 0;
+}
+
+static struct platform_driver dw_mipi_dsi_stm_driver = {
+	.probe		= dw_mipi_dsi_stm_probe,
+	.remove		= dw_mipi_dsi_stm_remove,
+	.driver		= {
+		.of_match_table = dw_mipi_dsi_stm_dt_ids,
+		.name	= "dw_mipi_dsi-stm",
+	},
+};
+
+module_platform_driver(dw_mipi_dsi_stm_driver);
+
+MODULE_AUTHOR("Philippe Cornu <philippe.cornu@st.com>");
+MODULE_AUTHOR("Yannick Fertre <yannick.fertre@st.com>");
+MODULE_DESCRIPTION("STMicroelectronics DW MIPI DSI host controller driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/gpu/drm/stm/ltdc.c b/drivers/gpu/drm/stm/ltdc.c
index 1b9483d..d394a0363 100644
--- a/drivers/gpu/drm/stm/ltdc.c
+++ b/drivers/gpu/drm/stm/ltdc.c
@@ -21,7 +21,7 @@
 #include <drm/drm_fb_cma_helper.h>
 #include <drm/drm_gem_cma_helper.h>
 #include <drm/drm_of.h>
-#include <drm/drm_panel.h>
+#include <drm/drm_bridge.h>
 #include <drm/drm_plane_helper.h>
 
 #include <video/videomode.h>
@@ -42,52 +42,52 @@
  * an extra offset specified with reg_ofs.
  */
 #define REG_OFS_NONE	0
-#define REG_OFS_4	4 /* Insertion of "Layer Configuration 2" reg */
+#define REG_OFS_4	4		/* Insertion of "Layer Conf. 2" reg */
 #define REG_OFS		(ldev->caps.reg_ofs)
-#define LAY_OFS		0x80	/* Register Offset between 2 layers */
+#define LAY_OFS		0x80		/* Register Offset between 2 layers */
 
 /* Global register offsets */
-#define LTDC_IDR	0x0000 /* IDentification */
-#define LTDC_LCR	0x0004 /* Layer Count */
-#define LTDC_SSCR	0x0008 /* Synchronization Size Configuration */
-#define LTDC_BPCR	0x000C /* Back Porch Configuration */
-#define LTDC_AWCR	0x0010 /* Active Width Configuration */
-#define LTDC_TWCR	0x0014 /* Total Width Configuration */
-#define LTDC_GCR	0x0018 /* Global Control */
-#define LTDC_GC1R	0x001C /* Global Configuration 1 */
-#define LTDC_GC2R	0x0020 /* Global Configuration 2 */
-#define LTDC_SRCR	0x0024 /* Shadow Reload Configuration */
-#define LTDC_GACR	0x0028 /* GAmma Correction */
-#define LTDC_BCCR	0x002C /* Background Color Configuration */
-#define LTDC_IER	0x0034 /* Interrupt Enable */
-#define LTDC_ISR	0x0038 /* Interrupt Status */
-#define LTDC_ICR	0x003C /* Interrupt Clear */
-#define LTDC_LIPCR	0x0040 /* Line Interrupt Position Configuration */
-#define LTDC_CPSR	0x0044 /* Current Position Status */
-#define LTDC_CDSR	0x0048 /* Current Display Status */
+#define LTDC_IDR	0x0000		/* IDentification */
+#define LTDC_LCR	0x0004		/* Layer Count */
+#define LTDC_SSCR	0x0008		/* Synchronization Size Configuration */
+#define LTDC_BPCR	0x000C		/* Back Porch Configuration */
+#define LTDC_AWCR	0x0010		/* Active Width Configuration */
+#define LTDC_TWCR	0x0014		/* Total Width Configuration */
+#define LTDC_GCR	0x0018		/* Global Control */
+#define LTDC_GC1R	0x001C		/* Global Configuration 1 */
+#define LTDC_GC2R	0x0020		/* Global Configuration 2 */
+#define LTDC_SRCR	0x0024		/* Shadow Reload Configuration */
+#define LTDC_GACR	0x0028		/* GAmma Correction */
+#define LTDC_BCCR	0x002C		/* Background Color Configuration */
+#define LTDC_IER	0x0034		/* Interrupt Enable */
+#define LTDC_ISR	0x0038		/* Interrupt Status */
+#define LTDC_ICR	0x003C		/* Interrupt Clear */
+#define LTDC_LIPCR	0x0040		/* Line Interrupt Position Conf. */
+#define LTDC_CPSR	0x0044		/* Current Position Status */
+#define LTDC_CDSR	0x0048		/* Current Display Status */
 
 /* Layer register offsets */
-#define LTDC_L1LC1R	(0x0080)	   /* L1 Layer Configuration 1 */
-#define LTDC_L1LC2R	(0x0084)	   /* L1 Layer Configuration 2 */
-#define LTDC_L1CR	(0x0084 + REG_OFS) /* L1 Control */
-#define LTDC_L1WHPCR	(0x0088 + REG_OFS) /* L1 Window Hor Position Config */
-#define LTDC_L1WVPCR	(0x008C + REG_OFS) /* L1 Window Vert Position Config */
-#define LTDC_L1CKCR	(0x0090 + REG_OFS) /* L1 Color Keying Configuration */
-#define LTDC_L1PFCR	(0x0094 + REG_OFS) /* L1 Pixel Format Configuration */
-#define LTDC_L1CACR	(0x0098 + REG_OFS) /* L1 Constant Alpha Config */
-#define LTDC_L1DCCR	(0x009C + REG_OFS) /* L1 Default Color Configuration */
-#define LTDC_L1BFCR	(0x00A0 + REG_OFS) /* L1 Blend Factors Configuration */
-#define LTDC_L1FBBCR	(0x00A4 + REG_OFS) /* L1 FrameBuffer Bus Control */
-#define LTDC_L1AFBCR	(0x00A8 + REG_OFS) /* L1 AuxFB Control */
-#define LTDC_L1CFBAR	(0x00AC + REG_OFS) /* L1 Color FrameBuffer Address */
-#define LTDC_L1CFBLR	(0x00B0 + REG_OFS) /* L1 Color FrameBuffer Length */
-#define LTDC_L1CFBLNR	(0x00B4 + REG_OFS) /* L1 Color FrameBuffer Line Nb */
-#define LTDC_L1AFBAR	(0x00B8 + REG_OFS) /* L1 AuxFB Address */
-#define LTDC_L1AFBLR	(0x00BC + REG_OFS) /* L1 AuxFB Length */
-#define LTDC_L1AFBLNR	(0x00C0 + REG_OFS) /* L1 AuxFB Line Number */
-#define LTDC_L1CLUTWR	(0x00C4 + REG_OFS) /* L1 CLUT Write */
-#define LTDC_L1YS1R	(0x00E0 + REG_OFS) /* L1 YCbCr Scale 1 */
-#define LTDC_L1YS2R	(0x00E4 + REG_OFS) /* L1 YCbCr Scale 2 */
+#define LTDC_L1LC1R	(0x80)		/* L1 Layer Configuration 1 */
+#define LTDC_L1LC2R	(0x84)		/* L1 Layer Configuration 2 */
+#define LTDC_L1CR	(0x84 + REG_OFS)/* L1 Control */
+#define LTDC_L1WHPCR	(0x88 + REG_OFS)/* L1 Window Hor Position Config */
+#define LTDC_L1WVPCR	(0x8C + REG_OFS)/* L1 Window Vert Position Config */
+#define LTDC_L1CKCR	(0x90 + REG_OFS)/* L1 Color Keying Configuration */
+#define LTDC_L1PFCR	(0x94 + REG_OFS)/* L1 Pixel Format Configuration */
+#define LTDC_L1CACR	(0x98 + REG_OFS)/* L1 Constant Alpha Config */
+#define LTDC_L1DCCR	(0x9C + REG_OFS)/* L1 Default Color Configuration */
+#define LTDC_L1BFCR	(0xA0 + REG_OFS)/* L1 Blend Factors Configuration */
+#define LTDC_L1FBBCR	(0xA4 + REG_OFS)/* L1 FrameBuffer Bus Control */
+#define LTDC_L1AFBCR	(0xA8 + REG_OFS)/* L1 AuxFB Control */
+#define LTDC_L1CFBAR	(0xAC + REG_OFS)/* L1 Color FrameBuffer Address */
+#define LTDC_L1CFBLR	(0xB0 + REG_OFS)/* L1 Color FrameBuffer Length */
+#define LTDC_L1CFBLNR	(0xB4 + REG_OFS)/* L1 Color FrameBuffer Line Nb */
+#define LTDC_L1AFBAR	(0xB8 + REG_OFS)/* L1 AuxFB Address */
+#define LTDC_L1AFBLR	(0xBC + REG_OFS)/* L1 AuxFB Length */
+#define LTDC_L1AFBLNR	(0xC0 + REG_OFS)/* L1 AuxFB Line Number */
+#define LTDC_L1CLUTWR	(0xC4 + REG_OFS)/* L1 CLUT Write */
+#define LTDC_L1YS1R	(0xE0 + REG_OFS)/* L1 YCbCr Scale 1 */
+#define LTDC_L1YS2R	(0xE4 + REG_OFS)/* L1 YCbCr Scale 2 */
 
 /* Bit definitions */
 #define SSCR_VSH	GENMASK(10, 0)	/* Vertical Synchronization Height */
@@ -104,10 +104,10 @@
 
 #define GCR_LTDCEN	BIT(0)		/* LTDC ENable */
 #define GCR_DEN		BIT(16)		/* Dither ENable */
-#define GCR_PCPOL	BIT(28)		/* Pixel Clock POLarity */
-#define GCR_DEPOL	BIT(29)		/* Data Enable POLarity */
-#define GCR_VSPOL	BIT(30)		/* Vertical Synchro POLarity */
-#define GCR_HSPOL	BIT(31)		/* Horizontal Synchro POLarity */
+#define GCR_PCPOL	BIT(28)		/* Pixel Clock POLarity-Inverted */
+#define GCR_DEPOL	BIT(29)		/* Data Enable POLarity-High */
+#define GCR_VSPOL	BIT(30)		/* Vertical Synchro POLarity-High */
+#define GCR_HSPOL	BIT(31)		/* Horizontal Synchro POLarity-High */
 
 #define GC1R_WBCH	GENMASK(3, 0)	/* Width of Blue CHannel output */
 #define GC1R_WGCH	GENMASK(7, 4)	/* Width of Green Channel output */
@@ -172,60 +172,52 @@
 #define LXCFBLR_CFBLL	GENMASK(12, 0)	/* Color Frame Buffer Line Length */
 #define LXCFBLR_CFBP	GENMASK(28, 16)	/* Color Frame Buffer Pitch in bytes */
 
-#define LXCFBLNR_CFBLN	GENMASK(10, 0)	 /* Color Frame Buffer Line Number */
+#define LXCFBLNR_CFBLN	GENMASK(10, 0)	/* Color Frame Buffer Line Number */
 
-#define HSPOL_AL   0		/* Horizontal Sync POLarity Active Low */
-#define VSPOL_AL   0		/* Vertical Sync POLarity Active Low */
-#define DEPOL_AL   0		/* Data Enable POLarity Active Low */
-#define PCPOL_IPC  0		/* Input Pixel Clock */
-#define HSPOL_AH   GCR_HSPOL	/* Horizontal Sync POLarity Active High */
-#define VSPOL_AH   GCR_VSPOL	/* Vertical Sync POLarity Active High */
-#define DEPOL_AH   GCR_DEPOL	/* Data Enable POLarity Active High */
-#define PCPOL_IIPC GCR_PCPOL	/* Inverted Input Pixel Clock */
-#define CONSTA_MAX 0xFF		/* CONSTant Alpha MAX= 1.0 */
-#define BF1_PAXCA  0x600	/* Pixel Alpha x Constant Alpha */
-#define BF1_CA     0x400	/* Constant Alpha */
-#define BF2_1PAXCA 0x007	/* 1 - (Pixel Alpha x Constant Alpha) */
-#define BF2_1CA	   0x005	/* 1 - Constant Alpha */
+#define CONSTA_MAX	0xFF		/* CONSTant Alpha MAX= 1.0 */
+#define BF1_PAXCA	0x600		/* Pixel Alpha x Constant Alpha */
+#define BF1_CA		0x400		/* Constant Alpha */
+#define BF2_1PAXCA	0x007		/* 1 - (Pixel Alpha x Constant Alpha) */
+#define BF2_1CA		0x005		/* 1 - Constant Alpha */
 
-#define NB_PF           8       /* Max nb of HW pixel format */
+#define NB_PF		8		/* Max number of HW pixel formats */
 
 enum ltdc_pix_fmt {
 	PF_NONE,
 	/* RGB formats */
-	PF_ARGB8888,    /* ARGB [32 bits] */
-	PF_RGBA8888,    /* RGBA [32 bits] */
-	PF_RGB888,      /* RGB [24 bits] */
-	PF_RGB565,      /* RGB [16 bits] */
-	PF_ARGB1555,    /* ARGB A:1 bit RGB:15 bits [16 bits] */
-	PF_ARGB4444,    /* ARGB A:4 bits R/G/B: 4 bits each [16 bits] */
+	PF_ARGB8888,		/* ARGB [32 bits] */
+	PF_RGBA8888,		/* RGBA [32 bits] */
+	PF_RGB888,		/* RGB [24 bits] */
+	PF_RGB565,		/* RGB [16 bits] */
+	PF_ARGB1555,		/* ARGB A:1 bit RGB:15 bits [16 bits] */
+	PF_ARGB4444,		/* ARGB A:4 bits R/G/B: 4 bits each [16 bits] */
 	/* Indexed formats */
-	PF_L8,          /* Indexed 8 bits [8 bits] */
-	PF_AL44,        /* Alpha:4 bits + indexed 4 bits [8 bits] */
-	PF_AL88         /* Alpha:8 bits + indexed 8 bits [16 bits] */
+	PF_L8,			/* Indexed 8 bits [8 bits] */
+	PF_AL44,		/* Alpha:4 bits + indexed 4 bits [8 bits] */
+	PF_AL88			/* Alpha:8 bits + indexed 8 bits [16 bits] */
 };
 
 /* The index gives the encoding of the pixel format for an HW version */
 static const enum ltdc_pix_fmt ltdc_pix_fmt_a0[NB_PF] = {
-	PF_ARGB8888,	/* 0x00 */
-	PF_RGB888,	/* 0x01 */
-	PF_RGB565,	/* 0x02 */
-	PF_ARGB1555,	/* 0x03 */
-	PF_ARGB4444,	/* 0x04 */
-	PF_L8,		/* 0x05 */
-	PF_AL44,	/* 0x06 */
-	PF_AL88		/* 0x07 */
+	PF_ARGB8888,		/* 0x00 */
+	PF_RGB888,		/* 0x01 */
+	PF_RGB565,		/* 0x02 */
+	PF_ARGB1555,		/* 0x03 */
+	PF_ARGB4444,		/* 0x04 */
+	PF_L8,			/* 0x05 */
+	PF_AL44,		/* 0x06 */
+	PF_AL88			/* 0x07 */
 };
 
 static const enum ltdc_pix_fmt ltdc_pix_fmt_a1[NB_PF] = {
-	PF_ARGB8888,	/* 0x00 */
-	PF_RGB888,	/* 0x01 */
-	PF_RGB565,	/* 0x02 */
-	PF_RGBA8888,	/* 0x03 */
-	PF_AL44,	/* 0x04 */
-	PF_L8,		/* 0x05 */
-	PF_ARGB1555,	/* 0x06 */
-	PF_ARGB4444	/* 0x07 */
+	PF_ARGB8888,		/* 0x00 */
+	PF_RGB888,		/* 0x01 */
+	PF_RGB565,		/* 0x02 */
+	PF_RGBA8888,		/* 0x03 */
+	PF_AL44,		/* 0x04 */
+	PF_L8,			/* 0x05 */
+	PF_ARGB1555,		/* 0x06 */
+	PF_ARGB4444		/* 0x07 */
 };
 
 static inline u32 reg_read(void __iomem *base, u32 reg)
@@ -269,11 +261,6 @@ static inline struct ltdc_device *encoder_to_ltdc(struct drm_encoder *enc)
 	return (struct ltdc_device *)enc->dev->dev_private;
 }
 
-static inline struct ltdc_device *connector_to_ltdc(struct drm_connector *con)
-{
-	return (struct ltdc_device *)con->dev->dev_private;
-}
-
 static inline enum ltdc_pix_fmt to_ltdc_pixelformat(u32 drm_fmt)
 {
 	enum ltdc_pix_fmt pf;
@@ -307,7 +294,7 @@ static inline enum ltdc_pix_fmt to_ltdc_pixelformat(u32 drm_fmt)
 	default:
 		pf = PF_NONE;
 		break;
-	/* Note: There are no DRM_FORMAT for AL44 and AL88 */
+		/* Note: There is no DRM_FORMAT for AL44 and AL88 */
 	}
 
 	return pf;
@@ -330,8 +317,8 @@ static inline u32 to_drm_pixelformat(enum ltdc_pix_fmt pf)
 		return DRM_FORMAT_ARGB4444;
 	case PF_L8:
 		return DRM_FORMAT_C8;
-	case PF_AL44: /* No DRM support */
-	case PF_AL88: /* No DRM support */
+	case PF_AL44:		/* No DRM support */
+	case PF_AL88:		/* No DRM support */
 	case PF_NONE:
 	default:
 		return 0;
@@ -375,18 +362,8 @@ static irqreturn_t ltdc_irq(int irq, void *arg)
  * DRM_CRTC
  */
 
-static void ltdc_crtc_load_lut(struct drm_crtc *crtc)
-{
-	struct ltdc_device *ldev = crtc_to_ltdc(crtc);
-	unsigned int i, lay;
-
-	for (lay = 0; lay < ldev->caps.nb_layers; lay++)
-		for (i = 0; i < 256; i++)
-			reg_write(ldev->regs, LTDC_L1CLUTWR + lay * LAY_OFS,
-				  ldev->clut[i]);
-}
-
-static void ltdc_crtc_enable(struct drm_crtc *crtc)
+static void ltdc_crtc_atomic_enable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct ltdc_device *ldev = crtc_to_ltdc(crtc);
 
@@ -407,7 +384,8 @@ static void ltdc_crtc_enable(struct drm_crtc *crtc)
 	drm_crtc_vblank_on(crtc);
 }
 
-static void ltdc_crtc_disable(struct drm_crtc *crtc)
+static void ltdc_crtc_atomic_disable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	struct ltdc_device *ldev = crtc_to_ltdc(crtc);
 
@@ -462,20 +440,20 @@ static void ltdc_crtc_mode_set_nofb(struct drm_crtc *crtc)
 
 	clk_enable(ldev->pixel_clk);
 
-	/* Configures the HS, VS, DE and PC polarities. */
-	val = HSPOL_AL | VSPOL_AL | DEPOL_AL | PCPOL_IPC;
+	/* Configures the HS, VS, DE and PC polarities. Default Active Low */
+	val = 0;
 
 	if (vm.flags & DISPLAY_FLAGS_HSYNC_HIGH)
-		val |= HSPOL_AH;
+		val |= GCR_HSPOL;
 
 	if (vm.flags & DISPLAY_FLAGS_VSYNC_HIGH)
-		val |= VSPOL_AH;
+		val |= GCR_VSPOL;
 
 	if (vm.flags & DISPLAY_FLAGS_DE_HIGH)
-		val |= DEPOL_AH;
+		val |= GCR_DEPOL;
 
 	if (vm.flags & DISPLAY_FLAGS_PIXDATA_NEGEDGE)
-		val |= PCPOL_IIPC;
+		val |= GCR_PCPOL;
 
 	reg_update_bits(ldev->regs, LTDC_GCR,
 			GCR_HSPOL | GCR_VSPOL | GCR_DEPOL | GCR_PCPOL, val);
@@ -522,12 +500,11 @@ static void ltdc_crtc_atomic_flush(struct drm_crtc *crtc,
 	}
 }
 
-static struct drm_crtc_helper_funcs ltdc_crtc_helper_funcs = {
-	.load_lut = ltdc_crtc_load_lut,
-	.enable = ltdc_crtc_enable,
-	.disable = ltdc_crtc_disable,
+static const struct drm_crtc_helper_funcs ltdc_crtc_helper_funcs = {
 	.mode_set_nofb = ltdc_crtc_mode_set_nofb,
 	.atomic_flush = ltdc_crtc_atomic_flush,
+	.atomic_enable = ltdc_crtc_atomic_enable,
+	.atomic_disable = ltdc_crtc_atomic_disable,
 };
 
 int ltdc_crtc_enable_vblank(struct drm_device *ddev, unsigned int pipe)
@@ -548,7 +525,7 @@ void ltdc_crtc_disable_vblank(struct drm_device *ddev, unsigned int pipe)
 	reg_clear(ldev->regs, LTDC_IER, IER_LIE);
 }
 
-static struct drm_crtc_funcs ltdc_crtc_funcs = {
+static const struct drm_crtc_funcs ltdc_crtc_funcs = {
 	.destroy = drm_crtc_cleanup,
 	.set_config = drm_atomic_helper_set_config,
 	.page_flip = drm_atomic_helper_page_flip,
@@ -613,11 +590,11 @@ static void ltdc_plane_atomic_update(struct drm_plane *plane,
 	src_w = state->src_w >> 16;
 	src_h = state->src_h >> 16;
 
-	DRM_DEBUG_DRIVER(
-		"plane:%d fb:%d (%dx%d)@(%d,%d) -> (%dx%d)@(%d,%d)\n",
-		plane->base.id, fb->base.id,
-		src_w, src_h, src_x, src_y,
-		state->crtc_w, state->crtc_h, state->crtc_x, state->crtc_y);
+	DRM_DEBUG_DRIVER("plane:%d fb:%d (%dx%d)@(%d,%d) -> (%dx%d)@(%d,%d)\n",
+			 plane->base.id, fb->base.id,
+			 src_w, src_h, src_x, src_y,
+			 state->crtc_w, state->crtc_h,
+			 state->crtc_x, state->crtc_y);
 
 	bpcr = reg_read(ldev->regs, LTDC_BPCR);
 	ahbp = (bpcr & BPCR_AHBP) >> 16;
@@ -642,7 +619,7 @@ static void ltdc_plane_atomic_update(struct drm_plane *plane,
 	if (val == NB_PF) {
 		DRM_ERROR("Pixel format %.4s not supported\n",
 			  (char *)&fb->format->format);
-		val = 0; /* set by default ARGB 32 bits */
+		val = 0;	/* set by default ARGB 32 bits */
 	}
 	reg_update_bits(ldev->regs, LTDC_L1PFCR + lofs, LXPFCR_PF, val);
 
@@ -656,8 +633,7 @@ static void ltdc_plane_atomic_update(struct drm_plane *plane,
 
 	/* Specifies the constant alpha value */
 	val = CONSTA_MAX;
-	reg_update_bits(ldev->regs, LTDC_L1CACR + lofs,
-			LXCACR_CONSTA, val);
+	reg_update_bits(ldev->regs, LTDC_L1CACR + lofs, LXCACR_CONSTA, val);
 
 	/* Specifies the blending factors */
 	val = BF1_PAXCA | BF2_1PAXCA;
@@ -666,8 +642,7 @@ static void ltdc_plane_atomic_update(struct drm_plane *plane,
 
 	/* Configures the frame buffer line number */
 	val = y1 - y0 + 1;
-	reg_update_bits(ldev->regs, LTDC_L1CFBLNR + lofs,
-			LXCFBLNR_CFBLN, val);
+	reg_update_bits(ldev->regs, LTDC_L1CFBLNR + lofs, LXCFBLNR_CFBLN, val);
 
 	/* Sets the FB address */
 	paddr = (u32)drm_fb_cma_get_gem_addr(fb, state, 0);
@@ -706,11 +681,10 @@ static void ltdc_plane_atomic_disable(struct drm_plane *plane,
 			 oldstate->crtc->base.id, plane->base.id);
 }
 
-static struct drm_plane_funcs ltdc_plane_funcs = {
+static const struct drm_plane_funcs ltdc_plane_funcs = {
 	.update_plane = drm_atomic_helper_update_plane,
 	.disable_plane = drm_atomic_helper_disable_plane,
 	.destroy = drm_plane_cleanup,
-	.set_property = drm_atomic_helper_plane_set_property,
 	.reset = drm_atomic_helper_plane_reset,
 	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_plane_destroy_state,
@@ -748,7 +722,7 @@ static struct drm_plane *ltdc_plane_create(struct drm_device *ddev,
 
 	ret = drm_universal_plane_init(ddev, plane, possible_crtcs,
 				       &ltdc_plane_funcs, formats, nb_fmt,
-				       type, NULL);
+				       NULL, type, NULL);
 	if (ret < 0)
 		return 0;
 
@@ -773,7 +747,7 @@ static int ltdc_crtc_init(struct drm_device *ddev, struct drm_crtc *crtc)
 	struct ltdc_device *ldev = ddev->dev_private;
 	struct drm_plane *primary, *overlay;
 	unsigned int i;
-	int res;
+	int ret;
 
 	primary = ltdc_plane_create(ddev, DRM_PLANE_TYPE_PRIMARY);
 	if (!primary) {
@@ -781,9 +755,9 @@ static int ltdc_crtc_init(struct drm_device *ddev, struct drm_crtc *crtc)
 		return -EINVAL;
 	}
 
-	res = drm_crtc_init_with_planes(ddev, crtc, primary, NULL,
+	ret = drm_crtc_init_with_planes(ddev, crtc, primary, NULL,
 					&ltdc_crtc_funcs, NULL);
-	if (res) {
+	if (ret) {
 		DRM_ERROR("Can not initialize CRTC\n");
 		goto cleanup;
 	}
@@ -796,7 +770,7 @@ static int ltdc_crtc_init(struct drm_device *ddev, struct drm_crtc *crtc)
 	for (i = 1; i < ldev->caps.nb_layers; i++) {
 		overlay = ltdc_plane_create(ddev, DRM_PLANE_TYPE_OVERLAY);
 		if (!overlay) {
-			res = -ENOMEM;
+			ret = -ENOMEM;
 			DRM_ERROR("Can not create overlay plane %d\n", i);
 			goto cleanup;
 		}
@@ -806,137 +780,42 @@ static int ltdc_crtc_init(struct drm_device *ddev, struct drm_crtc *crtc)
 
 cleanup:
 	ltdc_plane_destroy_all(ddev);
-	return res;
+	return ret;
 }
 
 /*
  * DRM_ENCODER
  */
 
-static void ltdc_rgb_encoder_enable(struct drm_encoder *encoder)
-{
-	struct ltdc_device *ldev = encoder_to_ltdc(encoder);
-
-	DRM_DEBUG_DRIVER("\n");
-
-	drm_panel_prepare(ldev->panel);
-	drm_panel_enable(ldev->panel);
-}
-
-static void ltdc_rgb_encoder_disable(struct drm_encoder *encoder)
-{
-	struct ltdc_device *ldev = encoder_to_ltdc(encoder);
-
-	DRM_DEBUG_DRIVER("\n");
-
-	drm_panel_disable(ldev->panel);
-	drm_panel_unprepare(ldev->panel);
-}
-
-static const struct drm_encoder_helper_funcs ltdc_rgb_encoder_helper_funcs = {
-	.enable = ltdc_rgb_encoder_enable,
-	.disable = ltdc_rgb_encoder_disable,
-};
-
-static const struct drm_encoder_funcs ltdc_rgb_encoder_funcs = {
+static const struct drm_encoder_funcs ltdc_encoder_funcs = {
 	.destroy = drm_encoder_cleanup,
 };
 
-static struct drm_encoder *ltdc_rgb_encoder_create(struct drm_device *ddev)
+static int ltdc_encoder_init(struct drm_device *ddev)
 {
+	struct ltdc_device *ldev = ddev->dev_private;
 	struct drm_encoder *encoder;
+	int ret;
 
 	encoder = devm_kzalloc(ddev->dev, sizeof(*encoder), GFP_KERNEL);
 	if (!encoder)
-		return NULL;
+		return -ENOMEM;
 
 	encoder->possible_crtcs = CRTC_MASK;
-	encoder->possible_clones = 0; /* No cloning support */
+	encoder->possible_clones = 0;	/* No cloning support */
 
-	drm_encoder_init(ddev, encoder, &ltdc_rgb_encoder_funcs,
+	drm_encoder_init(ddev, encoder, &ltdc_encoder_funcs,
 			 DRM_MODE_ENCODER_DPI, NULL);
 
-	drm_encoder_helper_add(encoder, &ltdc_rgb_encoder_helper_funcs);
-
-	DRM_DEBUG_DRIVER("RGB encoder:%d created\n", encoder->base.id);
-
-	return encoder;
-}
-
-/*
- * DRM_CONNECTOR
- */
-
-static int ltdc_rgb_connector_get_modes(struct drm_connector *connector)
-{
-	struct drm_device *ddev = connector->dev;
-	struct ltdc_device *ldev = ddev->dev_private;
-	int ret = 0;
-
-	DRM_DEBUG_DRIVER("\n");
-
-	if (ldev->panel)
-		ret = drm_panel_get_modes(ldev->panel);
-
-	return ret < 0 ? 0 : ret;
-}
-
-static struct drm_connector_helper_funcs ltdc_rgb_connector_helper_funcs = {
-	.get_modes = ltdc_rgb_connector_get_modes,
-};
-
-static enum drm_connector_status
-ltdc_rgb_connector_detect(struct drm_connector *connector, bool force)
-{
-	struct ltdc_device *ldev = connector_to_ltdc(connector);
-
-	return ldev->panel ? connector_status_connected :
-	       connector_status_disconnected;
-}
-
-static void ltdc_rgb_connector_destroy(struct drm_connector *connector)
-{
-	DRM_DEBUG_DRIVER("\n");
-
-	drm_connector_unregister(connector);
-	drm_connector_cleanup(connector);
-}
-
-static const struct drm_connector_funcs ltdc_rgb_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
-	.fill_modes = drm_helper_probe_single_connector_modes,
-	.detect = ltdc_rgb_connector_detect,
-	.destroy = ltdc_rgb_connector_destroy,
-	.reset = drm_atomic_helper_connector_reset,
-	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
-	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
-};
-
-struct drm_connector *ltdc_rgb_connector_create(struct drm_device *ddev)
-{
-	struct drm_connector *connector;
-	int err;
-
-	connector = devm_kzalloc(ddev->dev, sizeof(*connector), GFP_KERNEL);
-	if (!connector) {
-		DRM_ERROR("Failed to allocate connector\n");
-		return NULL;
+	ret = drm_bridge_attach(encoder, ldev->bridge, NULL);
+	if (ret) {
+		drm_encoder_cleanup(encoder);
+		return -EINVAL;
 	}
 
-	connector->polled = DRM_CONNECTOR_POLL_HPD;
+	DRM_DEBUG_DRIVER("Bridge encoder:%d created\n", encoder->base.id);
 
-	err = drm_connector_init(ddev, connector, &ltdc_rgb_connector_funcs,
-				 DRM_MODE_CONNECTOR_DPI);
-	if (err) {
-		DRM_ERROR("Failed to initialize connector\n");
-		return NULL;
-	}
-
-	drm_connector_helper_add(connector, &ltdc_rgb_connector_helper_funcs);
-
-	DRM_DEBUG_DRIVER("RGB connector:%d created\n", connector->base.id);
-
-	return connector;
+	return 0;
 }
 
 static int ltdc_get_caps(struct drm_device *ddev)
@@ -972,61 +851,26 @@ static int ltdc_get_caps(struct drm_device *ddev)
 	return 0;
 }
 
-static struct drm_panel *ltdc_get_panel(struct drm_device *ddev)
-{
-	struct device *dev = ddev->dev;
-	struct device_node *np = dev->of_node;
-	struct device_node *entity, *port = NULL;
-	struct drm_panel *panel = NULL;
-
-	DRM_DEBUG_DRIVER("\n");
-
-	/*
-	 * Parse ltdc node to get remote port and find RGB panel / HDMI slave
-	 * If a dsi or a bridge (hdmi, lvds...) is connected to ltdc,
-	 * a remote port & RGB panel will not be found.
-	 */
-	for_each_endpoint_of_node(np, entity) {
-		if (!of_device_is_available(entity))
-			continue;
-
-		port = of_graph_get_remote_port_parent(entity);
-		if (port) {
-			panel = of_drm_find_panel(port);
-			of_node_put(port);
-			if (panel) {
-				DRM_DEBUG_DRIVER("remote panel %s\n",
-						 port->full_name);
-			} else {
-				DRM_DEBUG_DRIVER("panel missing\n");
-				of_node_put(entity);
-			}
-		}
-	}
-
-	return panel;
-}
-
 int ltdc_load(struct drm_device *ddev)
 {
 	struct platform_device *pdev = to_platform_device(ddev->dev);
 	struct ltdc_device *ldev = ddev->dev_private;
 	struct device *dev = ddev->dev;
 	struct device_node *np = dev->of_node;
-	struct drm_encoder *encoder;
-	struct drm_connector *connector = NULL;
+	struct drm_bridge *bridge;
+	struct drm_panel *panel;
 	struct drm_crtc *crtc;
 	struct reset_control *rstc;
-	struct resource res;
+	struct resource *res;
 	int irq, ret, i;
 
 	DRM_DEBUG_DRIVER("\n");
 
-	ldev->panel = ltdc_get_panel(ddev);
-	if (!ldev->panel)
-		return -EPROBE_DEFER;
+	ret = drm_of_find_panel_or_bridge(np, 0, 0, &panel, &bridge);
+	if (ret)
+		return ret;
 
-	rstc = of_reset_control_get(np, NULL);
+	rstc = devm_reset_control_get_exclusive(dev, NULL);
 
 	mutex_init(&ldev->err_lock);
 
@@ -1041,15 +885,18 @@ int ltdc_load(struct drm_device *ddev)
 		return -ENODEV;
 	}
 
-	if (of_address_to_resource(np, 0, &res)) {
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
 		DRM_ERROR("Unable to get resource\n");
-		return -ENODEV;
+		ret = -ENODEV;
+		goto err;
 	}
 
-	ldev->regs = devm_ioremap_resource(dev, &res);
+	ldev->regs = devm_ioremap_resource(dev, res);
 	if (IS_ERR(ldev->regs)) {
 		DRM_ERROR("Unable to get ltdc registers\n");
-		return PTR_ERR(ldev->regs);
+		ret = PTR_ERR(ldev->regs);
+		goto err;
 	}
 
 	for (i = 0; i < MAX_IRQ; i++) {
@@ -1062,7 +909,7 @@ int ltdc_load(struct drm_device *ddev)
 						dev_name(dev), ddev);
 		if (ret) {
 			DRM_ERROR("Failed to register LTDC interrupt\n");
-			return ret;
+			goto err;
 		}
 	}
 
@@ -1077,33 +924,27 @@ int ltdc_load(struct drm_device *ddev)
 	if (ret) {
 		DRM_ERROR("hardware identifier (0x%08x) not supported!\n",
 			  ldev->caps.hw_version);
-		return ret;
+		goto err;
 	}
 
 	DRM_INFO("ltdc hw version 0x%08x - ready\n", ldev->caps.hw_version);
 
-	if (ldev->panel) {
-		encoder = ltdc_rgb_encoder_create(ddev);
-		if (!encoder) {
-			DRM_ERROR("Failed to create RGB encoder\n");
-			ret = -EINVAL;
+	if (panel) {
+		bridge = drm_panel_bridge_add(panel, DRM_MODE_CONNECTOR_DPI);
+		if (IS_ERR(bridge)) {
+			DRM_ERROR("Failed to create panel-bridge\n");
+			ret = PTR_ERR(bridge);
 			goto err;
 		}
+		ldev->is_panel_bridge = true;
+	}
 
-		connector = ltdc_rgb_connector_create(ddev);
-		if (!connector) {
-			DRM_ERROR("Failed to create RGB connector\n");
-			ret = -EINVAL;
-			goto err;
-		}
+	ldev->bridge = bridge;
 
-		ret = drm_mode_connector_attach_encoder(connector, encoder);
-		if (ret) {
-			DRM_ERROR("Failed to attach connector to encoder\n");
-			goto err;
-		}
-
-		drm_panel_attach(ldev->panel, connector);
+	ret = ltdc_encoder_init(ddev);
+	if (ret) {
+		DRM_ERROR("Failed to init encoder\n");
+		goto err;
 	}
 
 	crtc = devm_kzalloc(dev, sizeof(*crtc), GFP_KERNEL);
@@ -1129,9 +970,10 @@ int ltdc_load(struct drm_device *ddev)
 	ddev->irq_enabled = 1;
 
 	return 0;
+
 err:
-	if (ldev->panel)
-		drm_panel_detach(ldev->panel);
+	if (ldev->is_panel_bridge)
+		drm_panel_bridge_remove(bridge);
 
 	clk_disable_unprepare(ldev->pixel_clk);
 
@@ -1144,8 +986,8 @@ void ltdc_unload(struct drm_device *ddev)
 
 	DRM_DEBUG_DRIVER("\n");
 
-	if (ldev->panel)
-		drm_panel_detach(ldev->panel);
+	if (ldev->is_panel_bridge)
+		drm_panel_bridge_remove(ldev->bridge);
 
 	clk_disable_unprepare(ldev->pixel_clk);
 }
diff --git a/drivers/gpu/drm/stm/ltdc.h b/drivers/gpu/drm/stm/ltdc.h
index d7a9c73..bc6d6f6 100644
--- a/drivers/gpu/drm/stm/ltdc.h
+++ b/drivers/gpu/drm/stm/ltdc.h
@@ -24,10 +24,10 @@ struct ltdc_device {
 	struct drm_fbdev_cma *fbdev;
 	void __iomem *regs;
 	struct clk *pixel_clk;	/* lcd pixel clock */
-	struct drm_panel *panel;
+	struct drm_bridge *bridge;
+	bool is_panel_bridge;
 	struct mutex err_lock;	/* protecting error_status */
 	struct ltdc_caps caps;
-	u32 clut[256];		/* color look up table */
 	u32 error_status;
 	u32 irq_status;
 };
diff --git a/drivers/gpu/drm/sun4i/Kconfig b/drivers/gpu/drm/sun4i/Kconfig
index 5bcad8f..06f0530 100644
--- a/drivers/gpu/drm/sun4i/Kconfig
+++ b/drivers/gpu/drm/sun4i/Kconfig
@@ -13,17 +13,26 @@
 	  Display Engine. If M is selected the module will be called
 	  sun4i-drm.
 
+if DRM_SUN4I
+
 config DRM_SUN4I_HDMI
        tristate "Allwinner A10 HDMI Controller Support"
-       depends on DRM_SUN4I
        default DRM_SUN4I
        help
 	  Choose this option if you have an Allwinner SoC with an HDMI
 	  controller.
 
+config DRM_SUN4I_HDMI_CEC
+       bool "Allwinner A10 HDMI CEC Support"
+       depends on DRM_SUN4I_HDMI
+       select CEC_CORE
+       depends on CEC_PIN
+       help
+	  Choose this option if you have an Allwinner SoC with an HDMI
+	  controller and want to use CEC.
+
 config DRM_SUN4I_BACKEND
 	tristate "Support for Allwinner A10 Display Engine Backend"
-	depends on DRM_SUN4I
 	default DRM_SUN4I
 	help
 	  Choose this option if you have an Allwinner SoC with the
@@ -33,10 +42,11 @@
 
 config DRM_SUN8I_MIXER
 	tristate "Support for Allwinner Display Engine 2.0 Mixer"
-	depends on DRM_SUN4I
 	default MACH_SUN8I
 	help
 	  Choose this option if you have an Allwinner SoC with the
 	  Allwinner Display Engine 2.0, which has a mixer to do some
 	  graphics mixture and feed graphics to TCON, If M is
 	  selected the module will be called sun8i-mixer.
+
+endif
diff --git a/drivers/gpu/drm/sun4i/Makefile b/drivers/gpu/drm/sun4i/Makefile
index e29fd3a..43c753c 100644
--- a/drivers/gpu/drm/sun4i/Makefile
+++ b/drivers/gpu/drm/sun4i/Makefile
@@ -2,6 +2,7 @@
 sun4i-drm-y += sun4i_framebuffer.o
 
 sun4i-drm-hdmi-y += sun4i_hdmi_enc.o
+sun4i-drm-hdmi-y += sun4i_hdmi_i2c.o
 sun4i-drm-hdmi-y += sun4i_hdmi_ddc_clk.o
 sun4i-drm-hdmi-y += sun4i_hdmi_tmds_clk.o
 
diff --git a/drivers/gpu/drm/sun4i/sun4i_backend.c b/drivers/gpu/drm/sun4i/sun4i_backend.c
index cf48021..ec59436 100644
--- a/drivers/gpu/drm/sun4i/sun4i_backend.c
+++ b/drivers/gpu/drm/sun4i/sun4i_backend.c
@@ -312,7 +312,7 @@ static int sun4i_backend_of_get_id(struct device_node *node)
 		struct device_node *remote;
 		u32 reg;
 
-		remote = of_parse_phandle(ep, "remote-endpoint", 0);
+		remote = of_graph_get_remote_endpoint(ep);
 		if (!remote)
 			continue;
 
diff --git a/drivers/gpu/drm/sun4i/sun4i_crtc.c b/drivers/gpu/drm/sun4i/sun4i_crtc.c
index f8c7043..d097c6f 100644
--- a/drivers/gpu/drm/sun4i/sun4i_crtc.c
+++ b/drivers/gpu/drm/sun4i/sun4i_crtc.c
@@ -69,7 +69,8 @@ static void sun4i_crtc_atomic_flush(struct drm_crtc *crtc,
 	}
 }
 
-static void sun4i_crtc_disable(struct drm_crtc *crtc)
+static void sun4i_crtc_atomic_disable(struct drm_crtc *crtc,
+				      struct drm_crtc_state *old_state)
 {
 	struct sun4i_crtc *scrtc = drm_crtc_to_sun4i_crtc(crtc);
 
@@ -86,7 +87,8 @@ static void sun4i_crtc_disable(struct drm_crtc *crtc)
 	}
 }
 
-static void sun4i_crtc_enable(struct drm_crtc *crtc)
+static void sun4i_crtc_atomic_enable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	struct sun4i_crtc *scrtc = drm_crtc_to_sun4i_crtc(crtc);
 
@@ -98,8 +100,8 @@ static void sun4i_crtc_enable(struct drm_crtc *crtc)
 static const struct drm_crtc_helper_funcs sun4i_crtc_helper_funcs = {
 	.atomic_begin	= sun4i_crtc_atomic_begin,
 	.atomic_flush	= sun4i_crtc_atomic_flush,
-	.disable	= sun4i_crtc_disable,
-	.enable		= sun4i_crtc_enable,
+	.atomic_enable	= sun4i_crtc_atomic_enable,
+	.atomic_disable	= sun4i_crtc_atomic_disable,
 };
 
 static int sun4i_crtc_enable_vblank(struct drm_crtc *crtc)
diff --git a/drivers/gpu/drm/sun4i/sun4i_drv.c b/drivers/gpu/drm/sun4i/sun4i_drv.c
index a45a627..ace5965 100644
--- a/drivers/gpu/drm/sun4i/sun4i_drv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_drv.c
@@ -48,8 +48,6 @@ static struct drm_driver sun4i_drv_driver = {
 
 	/* GEM Operations */
 	.dumb_create		= drm_gem_cma_dumb_create,
-	.dumb_destroy		= drm_gem_dumb_destroy,
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset,
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops		= &drm_gem_cma_vm_ops,
 
@@ -195,9 +193,9 @@ static bool sun4i_drv_node_is_tcon(struct device_node *node)
 
 static int compare_of(struct device *dev, void *data)
 {
-	DRM_DEBUG_DRIVER("Comparing of node %s with %s\n",
-			 of_node_full_name(dev->of_node),
-			 of_node_full_name(data));
+	DRM_DEBUG_DRIVER("Comparing of node %pOF with %pOF\n",
+			 dev->of_node,
+			 data);
 
 	return dev->of_node == data;
 }
@@ -227,8 +225,7 @@ static int sun4i_drv_add_endpoints(struct device *dev,
 
 	if (!sun4i_drv_node_is_frontend(node)) {
 		/* Add current component */
-		DRM_DEBUG_DRIVER("Adding component %s\n",
-				 of_node_full_name(node));
+		DRM_DEBUG_DRIVER("Adding component %pOF\n", node);
 		drm_of_component_match_add(dev, match, compare_of, node);
 		count++;
 	}
diff --git a/drivers/gpu/drm/sun4i/sun4i_hdmi.h b/drivers/gpu/drm/sun4i/sun4i_hdmi.h
index 2f2f2ff..1457750 100644
--- a/drivers/gpu/drm/sun4i/sun4i_hdmi.h
+++ b/drivers/gpu/drm/sun4i/sun4i_hdmi.h
@@ -15,6 +15,8 @@
 #include <drm/drm_connector.h>
 #include <drm/drm_encoder.h>
 
+#include <media/cec.h>
+
 #define SUN4I_HDMI_CTRL_REG		0x004
 #define SUN4I_HDMI_CTRL_ENABLE			BIT(31)
 
@@ -86,6 +88,11 @@
 #define SUN4I_HDMI_PLL_DBG0_TMDS_PARENT_MASK	BIT(21)
 #define SUN4I_HDMI_PLL_DBG0_TMDS_PARENT_SHIFT	21
 
+#define SUN4I_HDMI_CEC			0x214
+#define SUN4I_HDMI_CEC_ENABLE			BIT(11)
+#define SUN4I_HDMI_CEC_TX			BIT(9)
+#define SUN4I_HDMI_CEC_RX			BIT(8)
+
 #define SUN4I_HDMI_PKT_CTRL_REG(n)	(0x2f0 + (4 * (n)))
 #define SUN4I_HDMI_PKT_CTRL_TYPE(n, t)		((t) << (((n) % 4) * 4))
 
@@ -96,6 +103,7 @@
 #define SUN4I_HDMI_DDC_CTRL_ENABLE		BIT(31)
 #define SUN4I_HDMI_DDC_CTRL_START_CMD		BIT(30)
 #define SUN4I_HDMI_DDC_CTRL_FIFO_DIR_MASK	BIT(8)
+#define SUN4I_HDMI_DDC_CTRL_FIFO_DIR_WRITE	(1 << 8)
 #define SUN4I_HDMI_DDC_CTRL_FIFO_DIR_READ	(0 << 8)
 #define SUN4I_HDMI_DDC_CTRL_RESET		BIT(0)
 
@@ -105,14 +113,34 @@
 #define SUN4I_HDMI_DDC_ADDR_OFFSET(off)		(((off) & 0xff) << 8)
 #define SUN4I_HDMI_DDC_ADDR_SLAVE(addr)		((addr) & 0xff)
 
+#define SUN4I_HDMI_DDC_INT_STATUS_REG		0x50c
+#define SUN4I_HDMI_DDC_INT_STATUS_ILLEGAL_FIFO_OPERATION	BIT(7)
+#define SUN4I_HDMI_DDC_INT_STATUS_DDC_RX_FIFO_UNDERFLOW		BIT(6)
+#define SUN4I_HDMI_DDC_INT_STATUS_DDC_TX_FIFO_OVERFLOW		BIT(5)
+#define SUN4I_HDMI_DDC_INT_STATUS_FIFO_REQUEST			BIT(4)
+#define SUN4I_HDMI_DDC_INT_STATUS_ARBITRATION_ERROR		BIT(3)
+#define SUN4I_HDMI_DDC_INT_STATUS_ACK_ERROR			BIT(2)
+#define SUN4I_HDMI_DDC_INT_STATUS_BUS_ERROR			BIT(1)
+#define SUN4I_HDMI_DDC_INT_STATUS_TRANSFER_COMPLETE		BIT(0)
+
 #define SUN4I_HDMI_DDC_FIFO_CTRL_REG	0x510
 #define SUN4I_HDMI_DDC_FIFO_CTRL_CLEAR		BIT(31)
+#define SUN4I_HDMI_DDC_FIFO_CTRL_RX_THRES(n)	(((n) & 0xf) << 4)
+#define SUN4I_HDMI_DDC_FIFO_CTRL_RX_THRES_MASK	GENMASK(7, 4)
+#define SUN4I_HDMI_DDC_FIFO_CTRL_RX_THRES_MAX	(BIT(4) - 1)
+#define SUN4I_HDMI_DDC_FIFO_CTRL_TX_THRES(n)	((n) & 0xf)
+#define SUN4I_HDMI_DDC_FIFO_CTRL_TX_THRES_MASK	GENMASK(3, 0)
+#define SUN4I_HDMI_DDC_FIFO_CTRL_TX_THRES_MAX	(BIT(4) - 1)
 
 #define SUN4I_HDMI_DDC_FIFO_DATA_REG	0x518
+
 #define SUN4I_HDMI_DDC_BYTE_COUNT_REG	0x51c
+#define SUN4I_HDMI_DDC_BYTE_COUNT_MAX		(BIT(10) - 1)
 
 #define SUN4I_HDMI_DDC_CMD_REG		0x520
 #define SUN4I_HDMI_DDC_CMD_EXPLICIT_EDDC_READ	6
+#define SUN4I_HDMI_DDC_CMD_IMPLICIT_READ	5
+#define SUN4I_HDMI_DDC_CMD_IMPLICIT_WRITE	3
 
 #define SUN4I_HDMI_DDC_CLK_REG		0x528
 #define SUN4I_HDMI_DDC_CLK_M(m)			(((m) & 0x7) << 3)
@@ -146,12 +174,16 @@ struct sun4i_hdmi {
 	struct clk		*ddc_clk;
 	struct clk		*tmds_clk;
 
+	struct i2c_adapter	*i2c;
+
 	struct sun4i_drv	*drv;
 
 	bool			hdmi_monitor;
+	struct cec_adapter	*cec_adap;
 };
 
 int sun4i_ddc_create(struct sun4i_hdmi *hdmi, struct clk *clk);
 int sun4i_tmds_create(struct sun4i_hdmi *hdmi);
+int sun4i_hdmi_i2c_create(struct device *dev, struct sun4i_hdmi *hdmi);
 
 #endif /* _SUN4I_HDMI_H_ */
diff --git a/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c b/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c
index d3398f62..9ea6cd5 100644
--- a/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c
+++ b/drivers/gpu/drm/sun4i/sun4i_hdmi_enc.c
@@ -29,8 +29,6 @@
 #include "sun4i_hdmi.h"
 #include "sun4i_tcon.h"
 
-#define DDC_SEGMENT_ADDR	0x30
-
 static inline struct sun4i_hdmi *
 drm_encoder_to_sun4i_hdmi(struct drm_encoder *encoder)
 {
@@ -52,7 +50,7 @@ static int sun4i_hdmi_setup_avi_infoframes(struct sun4i_hdmi *hdmi,
 	u8 buffer[17];
 	int i, ret;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (ret < 0) {
 		DRM_ERROR("Failed to get infoframes from mode\n");
 		return ret;
@@ -184,93 +182,13 @@ static const struct drm_encoder_funcs sun4i_hdmi_funcs = {
 	.destroy	= drm_encoder_cleanup,
 };
 
-static int sun4i_hdmi_read_sub_block(struct sun4i_hdmi *hdmi,
-				     unsigned int blk, unsigned int offset,
-				     u8 *buf, unsigned int count)
-{
-	unsigned long reg;
-	int i;
-
-	reg = readl(hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
-	reg &= ~SUN4I_HDMI_DDC_CTRL_FIFO_DIR_MASK;
-	writel(reg | SUN4I_HDMI_DDC_CTRL_FIFO_DIR_READ,
-	       hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
-
-	writel(SUN4I_HDMI_DDC_ADDR_SEGMENT(offset >> 8) |
-	       SUN4I_HDMI_DDC_ADDR_EDDC(DDC_SEGMENT_ADDR << 1) |
-	       SUN4I_HDMI_DDC_ADDR_OFFSET(offset) |
-	       SUN4I_HDMI_DDC_ADDR_SLAVE(DDC_ADDR),
-	       hdmi->base + SUN4I_HDMI_DDC_ADDR_REG);
-
-	reg = readl(hdmi->base + SUN4I_HDMI_DDC_FIFO_CTRL_REG);
-	writel(reg | SUN4I_HDMI_DDC_FIFO_CTRL_CLEAR,
-	       hdmi->base + SUN4I_HDMI_DDC_FIFO_CTRL_REG);
-
-	writel(count, hdmi->base + SUN4I_HDMI_DDC_BYTE_COUNT_REG);
-	writel(SUN4I_HDMI_DDC_CMD_EXPLICIT_EDDC_READ,
-	       hdmi->base + SUN4I_HDMI_DDC_CMD_REG);
-
-	reg = readl(hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
-	writel(reg | SUN4I_HDMI_DDC_CTRL_START_CMD,
-	       hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
-
-	if (readl_poll_timeout(hdmi->base + SUN4I_HDMI_DDC_CTRL_REG, reg,
-			       !(reg & SUN4I_HDMI_DDC_CTRL_START_CMD),
-			       100, 100000))
-		return -EIO;
-
-	for (i = 0; i < count; i++)
-		buf[i] = readb(hdmi->base + SUN4I_HDMI_DDC_FIFO_DATA_REG);
-
-	return 0;
-}
-
-static int sun4i_hdmi_read_edid_block(void *data, u8 *buf, unsigned int blk,
-				      size_t length)
-{
-	struct sun4i_hdmi *hdmi = data;
-	int retry = 2, i;
-
-	do {
-		for (i = 0; i < length; i += SUN4I_HDMI_DDC_FIFO_SIZE) {
-			unsigned char offset = blk * EDID_LENGTH + i;
-			unsigned int count = min((unsigned int)SUN4I_HDMI_DDC_FIFO_SIZE,
-						 length - i);
-			int ret;
-
-			ret = sun4i_hdmi_read_sub_block(hdmi, blk, offset,
-							buf + i, count);
-			if (ret)
-				return ret;
-		}
-	} while (!drm_edid_block_valid(buf, blk, true, NULL) && (retry--));
-
-	return 0;
-}
-
 static int sun4i_hdmi_get_modes(struct drm_connector *connector)
 {
 	struct sun4i_hdmi *hdmi = drm_connector_to_sun4i_hdmi(connector);
-	unsigned long reg;
 	struct edid *edid;
 	int ret;
 
-	/* Reset i2c controller */
-	writel(SUN4I_HDMI_DDC_CTRL_ENABLE | SUN4I_HDMI_DDC_CTRL_RESET,
-	       hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
-	if (readl_poll_timeout(hdmi->base + SUN4I_HDMI_DDC_CTRL_REG, reg,
-			       !(reg & SUN4I_HDMI_DDC_CTRL_RESET),
-			       100, 2000))
-		return -EIO;
-
-	writel(SUN4I_HDMI_DDC_LINE_CTRL_SDA_ENABLE |
-	       SUN4I_HDMI_DDC_LINE_CTRL_SCL_ENABLE,
-	       hdmi->base + SUN4I_HDMI_DDC_LINE_CTRL_REG);
-
-	clk_prepare_enable(hdmi->ddc_clk);
-	clk_set_rate(hdmi->ddc_clk, 100000);
-
-	edid = drm_do_get_edid(connector, sun4i_hdmi_read_edid_block, hdmi);
+	edid = drm_get_edid(connector, hdmi->i2c);
 	if (!edid)
 		return 0;
 
@@ -279,11 +197,10 @@ static int sun4i_hdmi_get_modes(struct drm_connector *connector)
 			 hdmi->hdmi_monitor ? "an HDMI" : "a DVI");
 
 	drm_mode_connector_update_edid_property(connector, edid);
+	cec_s_phys_addr_from_edid(hdmi->cec_adap, edid);
 	ret = drm_add_edid_modes(connector, edid);
 	kfree(edid);
 
-	clk_disable_unprepare(hdmi->ddc_clk);
-
 	return ret;
 }
 
@@ -299,14 +216,15 @@ sun4i_hdmi_connector_detect(struct drm_connector *connector, bool force)
 
 	if (readl_poll_timeout(hdmi->base + SUN4I_HDMI_HPD_REG, reg,
 			       reg & SUN4I_HDMI_HPD_HIGH,
-			       0, 500000))
+			       0, 500000)) {
+		cec_phys_addr_invalidate(hdmi->cec_adap);
 		return connector_status_disconnected;
+	}
 
 	return connector_status_connected;
 }
 
 static const struct drm_connector_funcs sun4i_hdmi_connector_funcs = {
-	.dpms			= drm_atomic_helper_connector_dpms,
 	.detect			= sun4i_hdmi_connector_detect,
 	.fill_modes		= drm_helper_probe_single_connector_modes,
 	.destroy		= drm_connector_cleanup,
@@ -315,6 +233,40 @@ static const struct drm_connector_funcs sun4i_hdmi_connector_funcs = {
 	.atomic_destroy_state	= drm_atomic_helper_connector_destroy_state,
 };
 
+#ifdef CONFIG_DRM_SUN4I_HDMI_CEC
+static bool sun4i_hdmi_cec_pin_read(struct cec_adapter *adap)
+{
+	struct sun4i_hdmi *hdmi = cec_get_drvdata(adap);
+
+	return readl(hdmi->base + SUN4I_HDMI_CEC) & SUN4I_HDMI_CEC_RX;
+}
+
+static void sun4i_hdmi_cec_pin_low(struct cec_adapter *adap)
+{
+	struct sun4i_hdmi *hdmi = cec_get_drvdata(adap);
+
+	/* Start driving the CEC pin low */
+	writel(SUN4I_HDMI_CEC_ENABLE, hdmi->base + SUN4I_HDMI_CEC);
+}
+
+static void sun4i_hdmi_cec_pin_high(struct cec_adapter *adap)
+{
+	struct sun4i_hdmi *hdmi = cec_get_drvdata(adap);
+
+	/*
+	 * Stop driving the CEC pin, the pull up will take over
+	 * unless another CEC device is driving the pin low.
+	 */
+	writel(0, hdmi->base + SUN4I_HDMI_CEC);
+}
+
+static const struct cec_pin_ops sun4i_hdmi_cec_pin_ops = {
+	.read = sun4i_hdmi_cec_pin_read,
+	.low = sun4i_hdmi_cec_pin_low,
+	.high = sun4i_hdmi_cec_pin_high,
+};
+#endif
+
 static int sun4i_hdmi_bind(struct device *dev, struct device *master,
 			   void *data)
 {
@@ -407,9 +359,9 @@ static int sun4i_hdmi_bind(struct device *dev, struct device *master,
 		SUN4I_HDMI_PLL_CTRL_PLL_EN;
 	writel(reg, hdmi->base + SUN4I_HDMI_PLL_CTRL_REG);
 
-	ret = sun4i_ddc_create(hdmi, hdmi->tmds_clk);
+	ret = sun4i_hdmi_i2c_create(dev, hdmi);
 	if (ret) {
-		dev_err(dev, "Couldn't create the DDC clock\n");
+		dev_err(dev, "Couldn't create the HDMI I2C adapter\n");
 		return ret;
 	}
 
@@ -422,13 +374,26 @@ static int sun4i_hdmi_bind(struct device *dev, struct device *master,
 			       NULL);
 	if (ret) {
 		dev_err(dev, "Couldn't initialise the HDMI encoder\n");
-		return ret;
+		goto err_del_i2c_adapter;
 	}
 
 	hdmi->encoder.possible_crtcs = drm_of_find_possible_crtcs(drm,
 								  dev->of_node);
-	if (!hdmi->encoder.possible_crtcs)
-		return -EPROBE_DEFER;
+	if (!hdmi->encoder.possible_crtcs) {
+		ret = -EPROBE_DEFER;
+		goto err_del_i2c_adapter;
+	}
+
+#ifdef CONFIG_DRM_SUN4I_HDMI_CEC
+	hdmi->cec_adap = cec_pin_allocate_adapter(&sun4i_hdmi_cec_pin_ops,
+		hdmi, "sun4i", CEC_CAP_TRANSMIT | CEC_CAP_LOG_ADDRS |
+		CEC_CAP_PASSTHROUGH | CEC_CAP_RC);
+	ret = PTR_ERR_OR_ZERO(hdmi->cec_adap);
+	if (ret < 0)
+		goto err_cleanup_connector;
+	writel(readl(hdmi->base + SUN4I_HDMI_CEC) & ~SUN4I_HDMI_CEC_TX,
+	       hdmi->base + SUN4I_HDMI_CEC);
+#endif
 
 	drm_connector_helper_add(&hdmi->connector,
 				 &sun4i_hdmi_connector_helper_funcs);
@@ -445,12 +410,18 @@ static int sun4i_hdmi_bind(struct device *dev, struct device *master,
 	hdmi->connector.polled = DRM_CONNECTOR_POLL_CONNECT |
 		DRM_CONNECTOR_POLL_DISCONNECT;
 
+	ret = cec_register_adapter(hdmi->cec_adap, dev);
+	if (ret < 0)
+		goto err_cleanup_connector;
 	drm_mode_connector_attach_encoder(&hdmi->connector, &hdmi->encoder);
 
 	return 0;
 
 err_cleanup_connector:
+	cec_delete_adapter(hdmi->cec_adap);
 	drm_encoder_cleanup(&hdmi->encoder);
+err_del_i2c_adapter:
+	i2c_del_adapter(hdmi->i2c);
 	return ret;
 }
 
@@ -459,8 +430,10 @@ static void sun4i_hdmi_unbind(struct device *dev, struct device *master,
 {
 	struct sun4i_hdmi *hdmi = dev_get_drvdata(dev);
 
+	cec_unregister_adapter(hdmi->cec_adap);
 	drm_connector_cleanup(&hdmi->connector);
 	drm_encoder_cleanup(&hdmi->encoder);
+	i2c_del_adapter(hdmi->i2c);
 }
 
 static const struct component_ops sun4i_hdmi_ops = {
diff --git a/drivers/gpu/drm/sun4i/sun4i_hdmi_i2c.c b/drivers/gpu/drm/sun4i/sun4i_hdmi_i2c.c
new file mode 100644
index 0000000..2e42d09
--- /dev/null
+++ b/drivers/gpu/drm/sun4i/sun4i_hdmi_i2c.c
@@ -0,0 +1,220 @@
+/*
+ * Copyright (C) 2016 Maxime Ripard <maxime.ripard@free-electrons.com>
+ * Copyright (C) 2017 Jonathan Liu <net147@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of
+ * the License, or (at your option) any later version.
+ */
+
+#include <linux/clk.h>
+#include <linux/i2c.h>
+#include <linux/iopoll.h>
+
+#include "sun4i_hdmi.h"
+
+#define SUN4I_HDMI_DDC_INT_STATUS_ERROR_MASK ( \
+	SUN4I_HDMI_DDC_INT_STATUS_ILLEGAL_FIFO_OPERATION | \
+	SUN4I_HDMI_DDC_INT_STATUS_DDC_RX_FIFO_UNDERFLOW | \
+	SUN4I_HDMI_DDC_INT_STATUS_DDC_TX_FIFO_OVERFLOW | \
+	SUN4I_HDMI_DDC_INT_STATUS_ARBITRATION_ERROR | \
+	SUN4I_HDMI_DDC_INT_STATUS_ACK_ERROR | \
+	SUN4I_HDMI_DDC_INT_STATUS_BUS_ERROR \
+)
+
+/* FIFO request bit is set when FIFO level is above RX_THRESHOLD during read */
+#define RX_THRESHOLD SUN4I_HDMI_DDC_FIFO_CTRL_RX_THRES_MAX
+/* FIFO request bit is set when FIFO level is below TX_THRESHOLD during write */
+#define TX_THRESHOLD 1
+
+static int fifo_transfer(struct sun4i_hdmi *hdmi, u8 *buf, int len, bool read)
+{
+	/*
+	 * 1 byte takes 9 clock cycles (8 bits + 1 ACK) = 90 us for 100 kHz
+	 * clock. As clock rate is fixed, just round it up to 100 us.
+	 */
+	const unsigned long byte_time_us = 100;
+	const u32 mask = SUN4I_HDMI_DDC_INT_STATUS_ERROR_MASK |
+			 SUN4I_HDMI_DDC_INT_STATUS_FIFO_REQUEST |
+			 SUN4I_HDMI_DDC_INT_STATUS_TRANSFER_COMPLETE;
+	u32 reg;
+
+	/* Limit transfer length by FIFO threshold */
+	len = min_t(int, len, read ? (RX_THRESHOLD + 1) :
+			      (SUN4I_HDMI_DDC_FIFO_SIZE - TX_THRESHOLD + 1));
+
+	/* Wait until error, FIFO request bit set or transfer complete */
+	if (readl_poll_timeout(hdmi->base + SUN4I_HDMI_DDC_INT_STATUS_REG, reg,
+			       reg & mask, len * byte_time_us, 100000))
+		return -ETIMEDOUT;
+
+	if (reg & SUN4I_HDMI_DDC_INT_STATUS_ERROR_MASK)
+		return -EIO;
+
+	if (read)
+		readsb(hdmi->base + SUN4I_HDMI_DDC_FIFO_DATA_REG, buf, len);
+	else
+		writesb(hdmi->base + SUN4I_HDMI_DDC_FIFO_DATA_REG, buf, len);
+
+	/* Clear FIFO request bit */
+	writel(SUN4I_HDMI_DDC_INT_STATUS_FIFO_REQUEST,
+	       hdmi->base + SUN4I_HDMI_DDC_INT_STATUS_REG);
+
+	return len;
+}
+
+static int xfer_msg(struct sun4i_hdmi *hdmi, struct i2c_msg *msg)
+{
+	int i, len;
+	u32 reg;
+
+	/* Set FIFO direction */
+	reg = readl(hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
+	reg &= ~SUN4I_HDMI_DDC_CTRL_FIFO_DIR_MASK;
+	reg |= (msg->flags & I2C_M_RD) ?
+	       SUN4I_HDMI_DDC_CTRL_FIFO_DIR_READ :
+	       SUN4I_HDMI_DDC_CTRL_FIFO_DIR_WRITE;
+	writel(reg, hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
+
+	/* Set I2C address */
+	writel(SUN4I_HDMI_DDC_ADDR_SLAVE(msg->addr),
+	       hdmi->base + SUN4I_HDMI_DDC_ADDR_REG);
+
+	/* Set FIFO RX/TX thresholds and clear FIFO */
+	reg = readl(hdmi->base + SUN4I_HDMI_DDC_FIFO_CTRL_REG);
+	reg |= SUN4I_HDMI_DDC_FIFO_CTRL_CLEAR;
+	reg &= ~SUN4I_HDMI_DDC_FIFO_CTRL_RX_THRES_MASK;
+	reg |= SUN4I_HDMI_DDC_FIFO_CTRL_RX_THRES(RX_THRESHOLD);
+	reg &= ~SUN4I_HDMI_DDC_FIFO_CTRL_TX_THRES_MASK;
+	reg |= SUN4I_HDMI_DDC_FIFO_CTRL_TX_THRES(TX_THRESHOLD);
+	writel(reg, hdmi->base + SUN4I_HDMI_DDC_FIFO_CTRL_REG);
+	if (readl_poll_timeout(hdmi->base + SUN4I_HDMI_DDC_FIFO_CTRL_REG,
+			       reg,
+			       !(reg & SUN4I_HDMI_DDC_FIFO_CTRL_CLEAR),
+			       100, 2000))
+		return -EIO;
+
+	/* Set transfer length */
+	writel(msg->len, hdmi->base + SUN4I_HDMI_DDC_BYTE_COUNT_REG);
+
+	/* Set command */
+	writel(msg->flags & I2C_M_RD ?
+	       SUN4I_HDMI_DDC_CMD_IMPLICIT_READ :
+	       SUN4I_HDMI_DDC_CMD_IMPLICIT_WRITE,
+	       hdmi->base + SUN4I_HDMI_DDC_CMD_REG);
+
+	/* Clear interrupt status bits */
+	writel(SUN4I_HDMI_DDC_INT_STATUS_ERROR_MASK |
+	       SUN4I_HDMI_DDC_INT_STATUS_FIFO_REQUEST |
+	       SUN4I_HDMI_DDC_INT_STATUS_TRANSFER_COMPLETE,
+	       hdmi->base + SUN4I_HDMI_DDC_INT_STATUS_REG);
+
+	/* Start command */
+	reg = readl(hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
+	writel(reg | SUN4I_HDMI_DDC_CTRL_START_CMD,
+	       hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
+
+	/* Transfer bytes */
+	for (i = 0; i < msg->len; i += len) {
+		len = fifo_transfer(hdmi, msg->buf + i, msg->len - i,
+				    msg->flags & I2C_M_RD);
+		if (len <= 0)
+			return len;
+	}
+
+	/* Wait for command to finish */
+	if (readl_poll_timeout(hdmi->base + SUN4I_HDMI_DDC_CTRL_REG,
+			       reg,
+			       !(reg & SUN4I_HDMI_DDC_CTRL_START_CMD),
+			       100, 100000))
+		return -EIO;
+
+	/* Check for errors */
+	reg = readl(hdmi->base + SUN4I_HDMI_DDC_INT_STATUS_REG);
+	if ((reg & SUN4I_HDMI_DDC_INT_STATUS_ERROR_MASK) ||
+	    !(reg & SUN4I_HDMI_DDC_INT_STATUS_TRANSFER_COMPLETE)) {
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static int sun4i_hdmi_i2c_xfer(struct i2c_adapter *adap,
+			       struct i2c_msg *msgs, int num)
+{
+	struct sun4i_hdmi *hdmi = i2c_get_adapdata(adap);
+	u32 reg;
+	int err, i, ret = num;
+
+	for (i = 0; i < num; i++) {
+		if (!msgs[i].len)
+			return -EINVAL;
+		if (msgs[i].len > SUN4I_HDMI_DDC_BYTE_COUNT_MAX)
+			return -EINVAL;
+	}
+
+	/* Reset I2C controller */
+	writel(SUN4I_HDMI_DDC_CTRL_ENABLE | SUN4I_HDMI_DDC_CTRL_RESET,
+	       hdmi->base + SUN4I_HDMI_DDC_CTRL_REG);
+	if (readl_poll_timeout(hdmi->base + SUN4I_HDMI_DDC_CTRL_REG, reg,
+			       !(reg & SUN4I_HDMI_DDC_CTRL_RESET),
+			       100, 2000))
+		return -EIO;
+
+	writel(SUN4I_HDMI_DDC_LINE_CTRL_SDA_ENABLE |
+	       SUN4I_HDMI_DDC_LINE_CTRL_SCL_ENABLE,
+	       hdmi->base + SUN4I_HDMI_DDC_LINE_CTRL_REG);
+
+	clk_prepare_enable(hdmi->ddc_clk);
+	clk_set_rate(hdmi->ddc_clk, 100000);
+
+	for (i = 0; i < num; i++) {
+		err = xfer_msg(hdmi, &msgs[i]);
+		if (err) {
+			ret = err;
+			break;
+		}
+	}
+
+	clk_disable_unprepare(hdmi->ddc_clk);
+	return ret;
+}
+
+static u32 sun4i_hdmi_i2c_func(struct i2c_adapter *adap)
+{
+	return I2C_FUNC_I2C | I2C_FUNC_SMBUS_EMUL;
+}
+
+static const struct i2c_algorithm sun4i_hdmi_i2c_algorithm = {
+	.master_xfer	= sun4i_hdmi_i2c_xfer,
+	.functionality	= sun4i_hdmi_i2c_func,
+};
+
+int sun4i_hdmi_i2c_create(struct device *dev, struct sun4i_hdmi *hdmi)
+{
+	struct i2c_adapter *adap;
+	int ret = 0;
+
+	ret = sun4i_ddc_create(hdmi, hdmi->tmds_clk);
+	if (ret)
+		return ret;
+
+	adap = devm_kzalloc(dev, sizeof(*adap), GFP_KERNEL);
+	if (!adap)
+		return -ENOMEM;
+
+	adap->owner = THIS_MODULE;
+	adap->class = I2C_CLASS_DDC;
+	adap->algo = &sun4i_hdmi_i2c_algorithm;
+	strlcpy(adap->name, "sun4i_hdmi_i2c adapter", sizeof(adap->name));
+	i2c_set_adapdata(adap, hdmi);
+
+	ret = i2c_add_adapter(adap);
+	if (ret)
+		return ret;
+
+	hdmi->i2c = adap;
+
+	return ret;
+}
diff --git a/drivers/gpu/drm/sun4i/sun4i_layer.c b/drivers/gpu/drm/sun4i/sun4i_layer.c
index ead4f9d..7bddf12 100644
--- a/drivers/gpu/drm/sun4i/sun4i_layer.c
+++ b/drivers/gpu/drm/sun4i/sun4i_layer.c
@@ -25,12 +25,6 @@ struct sun4i_plane_desc {
 	       uint32_t                nformats;
 };
 
-static int sun4i_backend_layer_atomic_check(struct drm_plane *plane,
-					    struct drm_plane_state *state)
-{
-	return 0;
-}
-
 static void sun4i_backend_layer_atomic_disable(struct drm_plane *plane,
 					       struct drm_plane_state *old_state)
 {
@@ -52,8 +46,7 @@ static void sun4i_backend_layer_atomic_update(struct drm_plane *plane,
 	sun4i_backend_layer_enable(backend, layer->id, true);
 }
 
-static struct drm_plane_helper_funcs sun4i_backend_layer_helper_funcs = {
-	.atomic_check	= sun4i_backend_layer_atomic_check,
+static const struct drm_plane_helper_funcs sun4i_backend_layer_helper_funcs = {
 	.atomic_disable	= sun4i_backend_layer_atomic_disable,
 	.atomic_update	= sun4i_backend_layer_atomic_update,
 };
@@ -115,7 +108,7 @@ static struct sun4i_layer *sun4i_layer_init_one(struct drm_device *drm,
 	ret = drm_universal_plane_init(drm, &layer->plane, 0,
 				       &sun4i_backend_layer_funcs,
 				       plane->formats, plane->nformats,
-				       plane->type, NULL);
+				       NULL, plane->type, NULL);
 	if (ret) {
 		dev_err(drm->dev, "Couldn't initialize layer\n");
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/sun4i/sun4i_rgb.c b/drivers/gpu/drm/sun4i/sun4i_rgb.c
index 422b191..7cd7090 100644
--- a/drivers/gpu/drm/sun4i/sun4i_rgb.c
+++ b/drivers/gpu/drm/sun4i/sun4i_rgb.c
@@ -119,8 +119,7 @@ sun4i_rgb_connector_destroy(struct drm_connector *connector)
 	drm_connector_cleanup(connector);
 }
 
-static struct drm_connector_funcs sun4i_rgb_con_funcs = {
-	.dpms			= drm_atomic_helper_connector_dpms,
+static const struct drm_connector_funcs sun4i_rgb_con_funcs = {
 	.fill_modes		= drm_helper_probe_single_connector_modes,
 	.destroy		= sun4i_rgb_connector_destroy,
 	.reset			= drm_atomic_helper_connector_reset,
@@ -128,13 +127,6 @@ static struct drm_connector_funcs sun4i_rgb_con_funcs = {
 	.atomic_destroy_state	= drm_atomic_helper_connector_destroy_state,
 };
 
-static int sun4i_rgb_atomic_check(struct drm_encoder *encoder,
-				  struct drm_crtc_state *crtc_state,
-				  struct drm_connector_state *conn_state)
-{
-	return 0;
-}
-
 static void sun4i_rgb_encoder_enable(struct drm_encoder *encoder)
 {
 	struct sun4i_rgb *rgb = drm_encoder_to_sun4i_rgb(encoder);
@@ -182,7 +174,6 @@ static void sun4i_rgb_encoder_mode_set(struct drm_encoder *encoder,
 }
 
 static struct drm_encoder_helper_funcs sun4i_rgb_enc_helper_funcs = {
-	.atomic_check	= sun4i_rgb_atomic_check,
 	.mode_set	= sun4i_rgb_encoder_mode_set,
 	.disable	= sun4i_rgb_encoder_disable,
 	.enable		= sun4i_rgb_encoder_enable,
diff --git a/drivers/gpu/drm/sun4i/sun4i_tcon.h b/drivers/gpu/drm/sun4i/sun4i_tcon.h
index e3c50ec..552c88e 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tcon.h
+++ b/drivers/gpu/drm/sun4i/sun4i_tcon.h
@@ -194,8 +194,6 @@ void sun4i_tcon_channel_enable(struct sun4i_tcon *tcon, int channel);
 void sun4i_tcon_enable_vblank(struct sun4i_tcon *tcon, bool enable);
 
 /* Mode Related Controls */
-void sun4i_tcon_switch_interlace(struct sun4i_tcon *tcon,
-				 bool enable);
 void sun4i_tcon_set_mux(struct sun4i_tcon *tcon, int channel,
 			struct drm_encoder *encoder);
 void sun4i_tcon0_mode_set(struct sun4i_tcon *tcon,
diff --git a/drivers/gpu/drm/sun4i/sun4i_tv.c b/drivers/gpu/drm/sun4i/sun4i_tv.c
index 338b9e5..050cfd4 100644
--- a/drivers/gpu/drm/sun4i/sun4i_tv.c
+++ b/drivers/gpu/drm/sun4i/sun4i_tv.c
@@ -341,13 +341,6 @@ static void sun4i_tv_mode_to_drm_mode(const struct tv_mode *tv_mode,
 	mode->vtotal = mode->vsync_end  + tv_mode->vback_porch;
 }
 
-static int sun4i_tv_atomic_check(struct drm_encoder *encoder,
-				 struct drm_crtc_state *crtc_state,
-				 struct drm_connector_state *conn_state)
-{
-	return 0;
-}
-
 static void sun4i_tv_disable(struct drm_encoder *encoder)
 {
 	struct sun4i_tv *tv = drm_encoder_to_sun4i_tv(encoder);
@@ -489,7 +482,6 @@ static void sun4i_tv_mode_set(struct drm_encoder *encoder,
 }
 
 static struct drm_encoder_helper_funcs sun4i_tv_helper_funcs = {
-	.atomic_check	= sun4i_tv_atomic_check,
 	.disable	= sun4i_tv_disable,
 	.enable		= sun4i_tv_enable,
 	.mode_set	= sun4i_tv_mode_set,
@@ -545,8 +537,7 @@ sun4i_tv_comp_connector_destroy(struct drm_connector *connector)
 	drm_connector_cleanup(connector);
 }
 
-static struct drm_connector_funcs sun4i_tv_comp_connector_funcs = {
-	.dpms			= drm_atomic_helper_connector_dpms,
+static const struct drm_connector_funcs sun4i_tv_comp_connector_funcs = {
 	.fill_modes		= drm_helper_probe_single_connector_modes,
 	.destroy		= sun4i_tv_comp_connector_destroy,
 	.reset			= drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/sun4i/sun8i_layer.c b/drivers/gpu/drm/sun4i/sun8i_layer.c
index e627eee..23810ff 100644
--- a/drivers/gpu/drm/sun4i/sun8i_layer.c
+++ b/drivers/gpu/drm/sun4i/sun8i_layer.c
@@ -90,7 +90,7 @@ static struct sun8i_layer *sun8i_layer_init_one(struct drm_device *drm,
 	ret = drm_universal_plane_init(drm, &layer->plane, 0,
 				       &sun8i_mixer_layer_funcs,
 				       plane->formats, plane->nformats,
-				       plane->type, NULL);
+				       NULL, plane->type, NULL);
 	if (ret) {
 		dev_err(drm->dev, "Couldn't initialize layer\n");
 		return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/tdfx/tdfx_drv.c b/drivers/gpu/drm/tdfx/tdfx_drv.c
index c54138c..3a14768 100644
--- a/drivers/gpu/drm/tdfx/tdfx_drv.c
+++ b/drivers/gpu/drm/tdfx/tdfx_drv.c
@@ -55,7 +55,6 @@ static const struct file_operations tdfx_driver_fops = {
 
 static struct drm_driver driver = {
 	.driver_features = DRIVER_LEGACY,
-	.set_busid = drm_pci_set_busid,
 	.fops = &tdfx_driver_fops,
 	.name = DRIVER_NAME,
 	.desc = DRIVER_DESC,
@@ -72,12 +71,12 @@ static struct pci_driver tdfx_pci_driver = {
 
 static int __init tdfx_init(void)
 {
-	return drm_pci_init(&driver, &tdfx_pci_driver);
+	return drm_legacy_pci_init(&driver, &tdfx_pci_driver);
 }
 
 static void __exit tdfx_exit(void)
 {
-	drm_pci_exit(&driver, &tdfx_pci_driver);
+	drm_legacy_pci_exit(&driver, &tdfx_pci_driver);
 }
 
 module_init(tdfx_init);
diff --git a/drivers/gpu/drm/tegra/Kconfig b/drivers/gpu/drm/tegra/Kconfig
index 2db29d6..dc58ab1 100644
--- a/drivers/gpu/drm/tegra/Kconfig
+++ b/drivers/gpu/drm/tegra/Kconfig
@@ -3,6 +3,7 @@
 	depends on ARCH_TEGRA || (ARM && COMPILE_TEST)
 	depends on COMMON_CLK
 	depends on DRM
+	depends on OF
 	select DRM_KMS_HELPER
 	select DRM_MIPI_DSI
 	select DRM_PANEL
diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile
index 6af3a9a..8927784 100644
--- a/drivers/gpu/drm/tegra/Makefile
+++ b/drivers/gpu/drm/tegra/Makefile
@@ -17,4 +17,6 @@
 	falcon.o \
 	vic.o
 
+tegra-drm-y += trace.o
+
 obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o
diff --git a/drivers/gpu/drm/tegra/dc.c b/drivers/gpu/drm/tegra/dc.c
index c875f11..4df3911 100644
--- a/drivers/gpu/drm/tegra/dc.c
+++ b/drivers/gpu/drm/tegra/dc.c
@@ -678,8 +678,8 @@ static struct drm_plane *tegra_dc_primary_plane_create(struct drm_device *drm,
 
 	err = drm_universal_plane_init(drm, &plane->base, possible_crtcs,
 				       &tegra_primary_plane_funcs, formats,
-				       num_formats, DRM_PLANE_TYPE_PRIMARY,
-				       NULL);
+				       num_formats, NULL,
+				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (err < 0) {
 		kfree(plane);
 		return ERR_PTR(err);
@@ -844,8 +844,8 @@ static struct drm_plane *tegra_dc_cursor_plane_create(struct drm_device *drm,
 
 	err = drm_universal_plane_init(drm, &plane->base, 1 << dc->pipe,
 				       &tegra_cursor_plane_funcs, formats,
-				       num_formats, DRM_PLANE_TYPE_CURSOR,
-				       NULL);
+				       num_formats, NULL,
+				       DRM_PLANE_TYPE_CURSOR, NULL);
 	if (err < 0) {
 		kfree(plane);
 		return ERR_PTR(err);
@@ -906,8 +906,8 @@ static struct drm_plane *tegra_dc_overlay_plane_create(struct drm_device *drm,
 
 	err = drm_universal_plane_init(drm, &plane->base, 1 << dc->pipe,
 				       &tegra_overlay_plane_funcs, formats,
-				       num_formats, DRM_PLANE_TYPE_OVERLAY,
-				       NULL);
+				       num_formats, NULL,
+				       DRM_PLANE_TYPE_OVERLAY, NULL);
 	if (err < 0) {
 		kfree(plane);
 		return ERR_PTR(err);
@@ -1199,7 +1199,8 @@ static int tegra_dc_wait_idle(struct tegra_dc *dc, unsigned long timeout)
 	return -ETIMEDOUT;
 }
 
-static void tegra_crtc_disable(struct drm_crtc *crtc)
+static void tegra_crtc_atomic_disable(struct drm_crtc *crtc,
+				      struct drm_crtc_state *old_state)
 {
 	struct tegra_dc *dc = to_tegra_dc(crtc);
 	u32 value;
@@ -1243,7 +1244,8 @@ static void tegra_crtc_disable(struct drm_crtc *crtc)
 	pm_runtime_put_sync(dc->dev);
 }
 
-static void tegra_crtc_enable(struct drm_crtc *crtc)
+static void tegra_crtc_atomic_enable(struct drm_crtc *crtc,
+				     struct drm_crtc_state *old_state)
 {
 	struct drm_display_mode *mode = &crtc->state->adjusted_mode;
 	struct tegra_dc_state *state = to_dc_state(crtc->state);
@@ -1351,11 +1353,11 @@ static void tegra_crtc_atomic_flush(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs tegra_crtc_helper_funcs = {
-	.disable = tegra_crtc_disable,
-	.enable = tegra_crtc_enable,
 	.atomic_check = tegra_crtc_atomic_check,
 	.atomic_begin = tegra_crtc_atomic_begin,
 	.atomic_flush = tegra_crtc_atomic_flush,
+	.atomic_enable = tegra_crtc_atomic_enable,
+	.atomic_disable = tegra_crtc_atomic_disable,
 };
 
 static irqreturn_t tegra_dc_irq(int irq, void *data)
diff --git a/drivers/gpu/drm/tegra/dpaux.c b/drivers/gpu/drm/tegra/dpaux.c
index 2fde44c3..e4da041 100644
--- a/drivers/gpu/drm/tegra/dpaux.c
+++ b/drivers/gpu/drm/tegra/dpaux.c
@@ -25,6 +25,7 @@
 
 #include "dpaux.h"
 #include "drm.h"
+#include "trace.h"
 
 static DEFINE_MUTEX(dpaux_lock);
 static LIST_HEAD(dpaux_list);
@@ -65,14 +66,19 @@ static inline struct tegra_dpaux *work_to_dpaux(struct work_struct *work)
 }
 
 static inline u32 tegra_dpaux_readl(struct tegra_dpaux *dpaux,
-				    unsigned long offset)
+				    unsigned int offset)
 {
-	return readl(dpaux->regs + (offset << 2));
+	u32 value = readl(dpaux->regs + (offset << 2));
+
+	trace_dpaux_readl(dpaux->dev, offset, value);
+
+	return value;
 }
 
 static inline void tegra_dpaux_writel(struct tegra_dpaux *dpaux,
-				      u32 value, unsigned long offset)
+				      u32 value, unsigned int offset)
 {
+	trace_dpaux_writel(dpaux->dev, offset, value);
 	writel(value, dpaux->regs + (offset << 2));
 }
 
diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c
index 518f4b6..597d563 100644
--- a/drivers/gpu/drm/tegra/drm.c
+++ b/drivers/gpu/drm/tegra/drm.c
@@ -100,7 +100,12 @@ static int tegra_atomic_commit(struct drm_device *drm,
 	 * the software side now.
 	 */
 
-	drm_atomic_helper_swap_state(state, true);
+	err = drm_atomic_helper_swap_state(state, true);
+	if (err) {
+		mutex_unlock(&tegra->commit.lock);
+		drm_atomic_helper_cleanup_planes(drm, state);
+		return err;
+	}
 
 	drm_atomic_state_get(state);
 	if (nonblock)
@@ -214,12 +219,10 @@ static int tegra_drm_load(struct drm_device *drm, unsigned long flags)
 
 	err = tegra_drm_fb_init(drm);
 	if (err < 0)
-		goto vblank;
+		goto device;
 
 	return 0;
 
-vblank:
-	drm_vblank_cleanup(drm);
 device:
 	host1x_device_exit(device);
 fbdev:
@@ -248,7 +251,6 @@ static void tegra_drm_unload(struct drm_device *drm)
 	drm_kms_helper_poll_fini(drm);
 	tegra_drm_fb_exit(drm);
 	drm_mode_config_cleanup(drm);
-	drm_vblank_cleanup(drm);
 
 	err = host1x_device_exit(device);
 	if (err < 0)
@@ -304,8 +306,6 @@ host1x_bo_lookup(struct drm_file *file, u32 handle)
 	if (!gem)
 		return NULL;
 
-	drm_gem_object_unreference_unlocked(gem);
-
 	bo = to_tegra_bo(gem);
 	return &bo->base;
 }
@@ -394,8 +394,10 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 		(void __user *)(uintptr_t)args->waitchks;
 	struct drm_tegra_syncpt syncpt;
 	struct host1x *host1x = dev_get_drvdata(drm->dev->parent);
+	struct drm_gem_object **refs;
 	struct host1x_syncpt *sp;
 	struct host1x_job *job;
+	unsigned int num_refs;
 	int err;
 
 	/* We don't yet support other than one syncpt_incr struct per submit */
@@ -417,6 +419,21 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 	job->class = context->client->base.class;
 	job->serialize = true;
 
+	/*
+	 * Track referenced BOs so that they can be unreferenced after the
+	 * submission is complete.
+	 */
+	num_refs = num_cmdbufs + num_relocs * 2 + num_waitchks;
+
+	refs = kmalloc_array(num_refs, sizeof(*refs), GFP_KERNEL);
+	if (!refs) {
+		err = -ENOMEM;
+		goto put;
+	}
+
+	/* reuse as an iterator later */
+	num_refs = 0;
+
 	while (num_cmdbufs) {
 		struct drm_tegra_cmdbuf cmdbuf;
 		struct host1x_bo *bo;
@@ -445,6 +462,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 
 		offset = (u64)cmdbuf.offset + (u64)cmdbuf.words * sizeof(u32);
 		obj = host1x_to_tegra_bo(bo);
+		refs[num_refs++] = &obj->gem;
 
 		/*
 		 * Gather buffer base address must be 4-bytes aligned,
@@ -474,6 +492,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 
 		reloc = &job->relocarray[num_relocs];
 		obj = host1x_to_tegra_bo(reloc->cmdbuf.bo);
+		refs[num_refs++] = &obj->gem;
 
 		/*
 		 * The unaligned cmdbuf offset will cause an unaligned write
@@ -487,6 +506,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 		}
 
 		obj = host1x_to_tegra_bo(reloc->target.bo);
+		refs[num_refs++] = &obj->gem;
 
 		if (reloc->target.offset >= obj->gem.size) {
 			err = -EINVAL;
@@ -506,6 +526,7 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 			goto fail;
 
 		obj = host1x_to_tegra_bo(wait->bo);
+		refs[num_refs++] = &obj->gem;
 
 		/*
 		 * The unaligned offset will cause an unaligned write during
@@ -545,17 +566,20 @@ int tegra_drm_submit(struct tegra_drm_context *context,
 		goto fail;
 
 	err = host1x_job_submit(job);
-	if (err)
-		goto fail_submit;
+	if (err) {
+		host1x_job_unpin(job);
+		goto fail;
+	}
 
 	args->fence = job->syncpt_end;
 
-	host1x_job_put(job);
-	return 0;
-
-fail_submit:
-	host1x_job_unpin(job);
 fail:
+	while (num_refs--)
+		drm_gem_object_put_unlocked(refs[num_refs]);
+
+	kfree(refs);
+
+put:
 	host1x_job_put(job);
 	return err;
 }
@@ -591,7 +615,7 @@ static int tegra_gem_mmap(struct drm_device *drm, void *data,
 
 	args->offset = drm_vma_node_offset_addr(&bo->gem.vma_node);
 
-	drm_gem_object_unreference_unlocked(gem);
+	drm_gem_object_put_unlocked(gem);
 
 	return 0;
 }
@@ -858,7 +882,7 @@ static int tegra_gem_set_tiling(struct drm_device *drm, void *data,
 	bo->tiling.mode = mode;
 	bo->tiling.value = value;
 
-	drm_gem_object_unreference_unlocked(gem);
+	drm_gem_object_put_unlocked(gem);
 
 	return 0;
 }
@@ -898,7 +922,7 @@ static int tegra_gem_get_tiling(struct drm_device *drm, void *data,
 		break;
 	}
 
-	drm_gem_object_unreference_unlocked(gem);
+	drm_gem_object_put_unlocked(gem);
 
 	return err;
 }
@@ -923,7 +947,7 @@ static int tegra_gem_set_flags(struct drm_device *drm, void *data,
 	if (args->flags & DRM_TEGRA_GEM_BOTTOM_UP)
 		bo->flags |= TEGRA_BO_BOTTOM_UP;
 
-	drm_gem_object_unreference_unlocked(gem);
+	drm_gem_object_put_unlocked(gem);
 
 	return 0;
 }
@@ -945,7 +969,7 @@ static int tegra_gem_get_flags(struct drm_device *drm, void *data,
 	if (bo->flags & TEGRA_BO_BOTTOM_UP)
 		args->flags |= DRM_TEGRA_GEM_BOTTOM_UP;
 
-	drm_gem_object_unreference_unlocked(gem);
+	drm_gem_object_put_unlocked(gem);
 
 	return 0;
 }
@@ -953,20 +977,34 @@ static int tegra_gem_get_flags(struct drm_device *drm, void *data,
 
 static const struct drm_ioctl_desc tegra_drm_ioctls[] = {
 #ifdef CONFIG_DRM_TEGRA_STAGING
-	DRM_IOCTL_DEF_DRV(TEGRA_GEM_CREATE, tegra_gem_create, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_GEM_MMAP, tegra_gem_mmap, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_SYNCPT_READ, tegra_syncpt_read, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_SYNCPT_INCR, tegra_syncpt_incr, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_SYNCPT_WAIT, tegra_syncpt_wait, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_OPEN_CHANNEL, tegra_open_channel, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_CLOSE_CHANNEL, tegra_close_channel, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_GET_SYNCPT, tegra_get_syncpt, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_SUBMIT, tegra_submit, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_GET_SYNCPT_BASE, tegra_get_syncpt_base, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_GEM_SET_TILING, tegra_gem_set_tiling, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_GEM_GET_TILING, tegra_gem_get_tiling, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_GEM_SET_FLAGS, tegra_gem_set_flags, 0),
-	DRM_IOCTL_DEF_DRV(TEGRA_GEM_GET_FLAGS, tegra_gem_get_flags, 0),
+	DRM_IOCTL_DEF_DRV(TEGRA_GEM_CREATE, tegra_gem_create,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_GEM_MMAP, tegra_gem_mmap,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_SYNCPT_READ, tegra_syncpt_read,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_SYNCPT_INCR, tegra_syncpt_incr,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_SYNCPT_WAIT, tegra_syncpt_wait,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_OPEN_CHANNEL, tegra_open_channel,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_CLOSE_CHANNEL, tegra_close_channel,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_GET_SYNCPT, tegra_get_syncpt,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_SUBMIT, tegra_submit,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_GET_SYNCPT_BASE, tegra_get_syncpt_base,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_GEM_SET_TILING, tegra_gem_set_tiling,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_GEM_GET_TILING, tegra_gem_get_tiling,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_GEM_SET_FLAGS, tegra_gem_set_flags,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(TEGRA_GEM_GET_FLAGS, tegra_gem_get_flags,
+			  DRM_UNLOCKED | DRM_RENDER_ALLOW),
 #endif
 };
 
@@ -1033,9 +1071,11 @@ static int tegra_debugfs_iova(struct seq_file *s, void *data)
 	struct tegra_drm *tegra = drm->dev_private;
 	struct drm_printer p = drm_seq_file_printer(s);
 
-	mutex_lock(&tegra->mm_lock);
-	drm_mm_print(&tegra->mm, &p);
-	mutex_unlock(&tegra->mm_lock);
+	if (tegra->domain) {
+		mutex_lock(&tegra->mm_lock);
+		drm_mm_print(&tegra->mm, &p);
+		mutex_unlock(&tegra->mm_lock);
+	}
 
 	return 0;
 }
@@ -1055,7 +1095,7 @@ static int tegra_debugfs_init(struct drm_minor *minor)
 
 static struct drm_driver tegra_drm_driver = {
 	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME |
-			   DRIVER_ATOMIC,
+			   DRIVER_ATOMIC | DRIVER_RENDER,
 	.load = tegra_drm_load,
 	.unload = tegra_drm_unload,
 	.open = tegra_drm_open,
@@ -1075,8 +1115,6 @@ static struct drm_driver tegra_drm_driver = {
 	.gem_prime_import = tegra_gem_prime_import,
 
 	.dumb_create = tegra_bo_dumb_create,
-	.dumb_map_offset = tegra_bo_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 
 	.ioctls = tegra_drm_ioctls,
 	.num_ioctls = ARRAY_SIZE(tegra_drm_ioctls),
diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h
index 6d6da01..063f5d3 100644
--- a/drivers/gpu/drm/tegra/drm.h
+++ b/drivers/gpu/drm/tegra/drm.h
@@ -23,6 +23,7 @@
 #include <drm/drm_fixed.h>
 
 #include "gem.h"
+#include "trace.h"
 
 struct reset_control;
 
@@ -172,14 +173,19 @@ static inline struct tegra_dc *to_tegra_dc(struct drm_crtc *crtc)
 }
 
 static inline void tegra_dc_writel(struct tegra_dc *dc, u32 value,
-				   unsigned long offset)
+				   unsigned int offset)
 {
+	trace_dc_writel(dc->dev, offset, value);
 	writel(value, dc->regs + (offset << 2));
 }
 
-static inline u32 tegra_dc_readl(struct tegra_dc *dc, unsigned long offset)
+static inline u32 tegra_dc_readl(struct tegra_dc *dc, unsigned int offset)
 {
-	return readl(dc->regs + (offset << 2));
+	u32 value = readl(dc->regs + (offset << 2));
+
+	trace_dc_readl(dc->dev, offset, value);
+
+	return value;
 }
 
 struct tegra_dc_window {
diff --git a/drivers/gpu/drm/tegra/dsi.c b/drivers/gpu/drm/tegra/dsi.c
index 3dea121..046649e 100644
--- a/drivers/gpu/drm/tegra/dsi.c
+++ b/drivers/gpu/drm/tegra/dsi.c
@@ -28,6 +28,7 @@
 #include "drm.h"
 #include "dsi.h"
 #include "mipi-phy.h"
+#include "trace.h"
 
 struct tegra_dsi_state {
 	struct drm_connector_state base;
@@ -105,15 +106,20 @@ static struct tegra_dsi_state *tegra_dsi_get_state(struct tegra_dsi *dsi)
 	return to_dsi_state(dsi->output.connector.state);
 }
 
-static inline u32 tegra_dsi_readl(struct tegra_dsi *dsi, unsigned long reg)
+static inline u32 tegra_dsi_readl(struct tegra_dsi *dsi, unsigned int offset)
 {
-	return readl(dsi->regs + (reg << 2));
+	u32 value = readl(dsi->regs + (offset << 2));
+
+	trace_dsi_readl(dsi->dev, offset, value);
+
+	return value;
 }
 
 static inline void tegra_dsi_writel(struct tegra_dsi *dsi, u32 value,
-				    unsigned long reg)
+				    unsigned int offset)
 {
-	writel(value, dsi->regs + (reg << 2));
+	trace_dsi_writel(dsi->dev, offset, value);
+	writel(value, dsi->regs + (offset << 2));
 }
 
 static int tegra_dsi_show_regs(struct seq_file *s, void *data)
@@ -815,7 +821,6 @@ tegra_dsi_connector_duplicate_state(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs tegra_dsi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = tegra_dsi_connector_reset,
 	.detect = tegra_output_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
diff --git a/drivers/gpu/drm/tegra/fb.c b/drivers/gpu/drm/tegra/fb.c
index 25acb73..80540c1 100644
--- a/drivers/gpu/drm/tegra/fb.c
+++ b/drivers/gpu/drm/tegra/fb.c
@@ -88,7 +88,7 @@ static void tegra_fb_destroy(struct drm_framebuffer *framebuffer)
 			if (bo->pages)
 				vunmap(bo->vaddr);
 
-			drm_gem_object_unreference_unlocked(&bo->gem);
+			drm_gem_object_put_unlocked(&bo->gem);
 		}
 	}
 
@@ -195,7 +195,7 @@ struct drm_framebuffer *tegra_fb_create(struct drm_device *drm,
 
 unreference:
 	while (i--)
-		drm_gem_object_unreference_unlocked(&planes[i]->gem);
+		drm_gem_object_put_unlocked(&planes[i]->gem);
 
 	return ERR_PTR(err);
 }
@@ -242,7 +242,7 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
 	info = drm_fb_helper_alloc_fbi(helper);
 	if (IS_ERR(info)) {
 		dev_err(drm->dev, "failed to allocate framebuffer info\n");
-		drm_gem_object_unreference_unlocked(&bo->gem);
+		drm_gem_object_put_unlocked(&bo->gem);
 		return PTR_ERR(info);
 	}
 
@@ -251,7 +251,7 @@ static int tegra_fbdev_probe(struct drm_fb_helper *helper,
 		err = PTR_ERR(fbdev->fb);
 		dev_err(drm->dev, "failed to allocate DRM framebuffer: %d\n",
 			err);
-		drm_gem_object_unreference_unlocked(&bo->gem);
+		drm_gem_object_put_unlocked(&bo->gem);
 		return PTR_ERR(fbdev->fb);
 	}
 
diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c
index 7a39a35..ab1e53d 100644
--- a/drivers/gpu/drm/tegra/gem.c
+++ b/drivers/gpu/drm/tegra/gem.c
@@ -24,7 +24,7 @@ static void tegra_bo_put(struct host1x_bo *bo)
 {
 	struct tegra_bo *obj = host1x_to_tegra_bo(bo);
 
-	drm_gem_object_unreference_unlocked(&obj->gem);
+	drm_gem_object_put_unlocked(&obj->gem);
 }
 
 static dma_addr_t tegra_bo_pin(struct host1x_bo *bo, struct sg_table **sgt)
@@ -95,7 +95,7 @@ static struct host1x_bo *tegra_bo_get(struct host1x_bo *bo)
 {
 	struct tegra_bo *obj = host1x_to_tegra_bo(bo);
 
-	drm_gem_object_reference(&obj->gem);
+	drm_gem_object_get(&obj->gem);
 
 	return bo;
 }
@@ -325,7 +325,7 @@ struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file,
 		return ERR_PTR(err);
 	}
 
-	drm_gem_object_unreference_unlocked(&bo->gem);
+	drm_gem_object_put_unlocked(&bo->gem);
 
 	return bo;
 }
@@ -423,27 +423,6 @@ int tegra_bo_dumb_create(struct drm_file *file, struct drm_device *drm,
 	return 0;
 }
 
-int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
-			     u32 handle, u64 *offset)
-{
-	struct drm_gem_object *gem;
-	struct tegra_bo *bo;
-
-	gem = drm_gem_object_lookup(file, handle);
-	if (!gem) {
-		dev_err(drm->dev, "failed to lookup GEM object\n");
-		return -EINVAL;
-	}
-
-	bo = to_tegra_bo(gem);
-
-	*offset = drm_vma_node_offset_addr(&bo->gem.vma_node);
-
-	drm_gem_object_unreference_unlocked(gem);
-
-	return 0;
-}
-
 static int tegra_bo_fault(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
@@ -481,30 +460,28 @@ const struct vm_operations_struct tegra_bo_vm_ops = {
 	.close = drm_gem_vm_close,
 };
 
-int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
+static int tegra_gem_mmap(struct drm_gem_object *gem,
+			  struct vm_area_struct *vma)
 {
-	struct drm_gem_object *gem;
-	struct tegra_bo *bo;
-	int ret;
-
-	ret = drm_gem_mmap(file, vma);
-	if (ret)
-		return ret;
-
-	gem = vma->vm_private_data;
-	bo = to_tegra_bo(gem);
+	struct tegra_bo *bo = to_tegra_bo(gem);
 
 	if (!bo->pages) {
 		unsigned long vm_pgoff = vma->vm_pgoff;
+		int err;
 
+		/*
+		 * Clear the VM_PFNMAP flag that was set by drm_gem_mmap(),
+		 * and set the vm_pgoff (used as a fake buffer offset by DRM)
+		 * to 0 as we want to map the whole buffer.
+		 */
 		vma->vm_flags &= ~VM_PFNMAP;
 		vma->vm_pgoff = 0;
 
-		ret = dma_mmap_wc(gem->dev->dev, vma, bo->vaddr, bo->paddr,
+		err = dma_mmap_wc(gem->dev->dev, vma, bo->vaddr, bo->paddr,
 				  gem->size);
-		if (ret) {
+		if (err < 0) {
 			drm_gem_vm_close(vma);
-			return ret;
+			return err;
 		}
 
 		vma->vm_pgoff = vm_pgoff;
@@ -520,6 +497,20 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
 	return 0;
 }
 
+int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct drm_gem_object *gem;
+	int err;
+
+	err = drm_gem_mmap(file, vma);
+	if (err < 0)
+		return err;
+
+	gem = vma->vm_private_data;
+
+	return tegra_gem_mmap(gem, vma);
+}
+
 static struct sg_table *
 tegra_gem_prime_map_dma_buf(struct dma_buf_attachment *attach,
 			    enum dma_data_direction dir)
@@ -603,7 +594,14 @@ static void tegra_gem_prime_kunmap(struct dma_buf *buf, unsigned long page,
 
 static int tegra_gem_prime_mmap(struct dma_buf *buf, struct vm_area_struct *vma)
 {
-	return -EINVAL;
+	struct drm_gem_object *gem = buf->priv;
+	int err;
+
+	err = drm_gem_mmap_obj(gem, gem->size, vma);
+	if (err < 0)
+		return err;
+
+	return tegra_gem_mmap(gem, vma);
 }
 
 static void *tegra_gem_prime_vmap(struct dma_buf *buf)
@@ -654,7 +652,7 @@ struct drm_gem_object *tegra_gem_prime_import(struct drm_device *drm,
 		struct drm_gem_object *gem = buf->priv;
 
 		if (gem->dev == drm) {
-			drm_gem_object_reference(gem);
+			drm_gem_object_get(gem);
 			return gem;
 		}
 	}
diff --git a/drivers/gpu/drm/tegra/gem.h b/drivers/gpu/drm/tegra/gem.h
index 8b32a6f..8eb9fd2 100644
--- a/drivers/gpu/drm/tegra/gem.h
+++ b/drivers/gpu/drm/tegra/gem.h
@@ -67,8 +67,6 @@ struct tegra_bo *tegra_bo_create_with_handle(struct drm_file *file,
 void tegra_bo_free_object(struct drm_gem_object *gem);
 int tegra_bo_dumb_create(struct drm_file *file, struct drm_device *drm,
 			 struct drm_mode_create_dumb *args);
-int tegra_bo_dumb_map_offset(struct drm_file *file, struct drm_device *drm,
-			     u32 handle, u64 *offset);
 
 int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma);
 
diff --git a/drivers/gpu/drm/tegra/hdmi.c b/drivers/gpu/drm/tegra/hdmi.c
index cda0491e..5b9d83b 100644
--- a/drivers/gpu/drm/tegra/hdmi.c
+++ b/drivers/gpu/drm/tegra/hdmi.c
@@ -24,6 +24,7 @@
 #include "hdmi.h"
 #include "drm.h"
 #include "dc.h"
+#include "trace.h"
 
 #define HDMI_ELD_BUFFER_SIZE 96
 
@@ -100,14 +101,19 @@ enum {
 };
 
 static inline u32 tegra_hdmi_readl(struct tegra_hdmi *hdmi,
-				   unsigned long offset)
+				   unsigned int offset)
 {
-	return readl(hdmi->regs + (offset << 2));
+	u32 value = readl(hdmi->regs + (offset << 2));
+
+	trace_hdmi_readl(hdmi->dev, offset, value);
+
+	return value;
 }
 
 static inline void tegra_hdmi_writel(struct tegra_hdmi *hdmi, u32 value,
-				     unsigned long offset)
+				     unsigned int offset)
 {
+	trace_hdmi_writel(hdmi->dev, offset, value);
 	writel(value, hdmi->regs + (offset << 2));
 }
 
@@ -734,7 +740,7 @@ static void tegra_hdmi_setup_avi_infoframe(struct tegra_hdmi *hdmi,
 	u8 buffer[17];
 	ssize_t err;
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (err < 0) {
 		dev_err(hdmi->dev, "failed to setup AVI infoframe: %zd\n", err);
 		return;
@@ -902,7 +908,6 @@ tegra_hdmi_connector_detect(struct drm_connector *connector, bool force)
 }
 
 static const struct drm_connector_funcs tegra_hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.detect = tegra_hdmi_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
diff --git a/drivers/gpu/drm/tegra/rgb.c b/drivers/gpu/drm/tegra/rgb.c
index a131b44..78ec519 100644
--- a/drivers/gpu/drm/tegra/rgb.c
+++ b/drivers/gpu/drm/tegra/rgb.c
@@ -88,7 +88,6 @@ static void tegra_dc_write_regs(struct tegra_dc *dc,
 }
 
 static const struct drm_connector_funcs tegra_rgb_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.detect = tegra_output_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
diff --git a/drivers/gpu/drm/tegra/sor.c b/drivers/gpu/drm/tegra/sor.c
index a8f5289..7ab1d1d 100644
--- a/drivers/gpu/drm/tegra/sor.c
+++ b/drivers/gpu/drm/tegra/sor.c
@@ -26,6 +26,7 @@
 #include "dc.h"
 #include "drm.h"
 #include "sor.h"
+#include "trace.h"
 
 #define SOR_REKEY 0x38
 
@@ -232,14 +233,19 @@ static inline struct tegra_sor *to_sor(struct tegra_output *output)
 	return container_of(output, struct tegra_sor, output);
 }
 
-static inline u32 tegra_sor_readl(struct tegra_sor *sor, unsigned long offset)
+static inline u32 tegra_sor_readl(struct tegra_sor *sor, unsigned int offset)
 {
-	return readl(sor->regs + (offset << 2));
+	u32 value = readl(sor->regs + (offset << 2));
+
+	trace_sor_readl(sor->dev, offset, value);
+
+	return value;
 }
 
 static inline void tegra_sor_writel(struct tegra_sor *sor, u32 value,
-				    unsigned long offset)
+				    unsigned int offset)
 {
+	trace_sor_writel(sor->dev, offset, value);
 	writel(value, sor->regs + (offset << 2));
 }
 
@@ -1340,7 +1346,6 @@ tegra_sor_connector_duplicate_state(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs tegra_sor_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = tegra_sor_connector_reset,
 	.detect = tegra_sor_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
@@ -1904,7 +1909,7 @@ tegra_sor_hdmi_setup_avi_infoframe(struct tegra_sor *sor,
 	value &= ~INFOFRAME_CTRL_ENABLE;
 	tegra_sor_writel(sor, value, SOR_HDMI_AVI_INFOFRAME_CTRL);
 
-	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode);
+	err = drm_hdmi_avi_infoframe_from_display_mode(&frame, mode, false);
 	if (err < 0) {
 		dev_err(sor->dev, "failed to setup AVI infoframe: %d\n", err);
 		return err;
diff --git a/drivers/gpu/drm/tegra/trace.c b/drivers/gpu/drm/tegra/trace.c
new file mode 100644
index 0000000..006f65c
--- /dev/null
+++ b/drivers/gpu/drm/tegra/trace.c
@@ -0,0 +1,2 @@
+#define CREATE_TRACE_POINTS
+#include "trace.h"
diff --git a/drivers/gpu/drm/tegra/trace.h b/drivers/gpu/drm/tegra/trace.h
new file mode 100644
index 0000000..e9b7cdad
--- /dev/null
+++ b/drivers/gpu/drm/tegra/trace.h
@@ -0,0 +1,68 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM tegra
+
+#if !defined(DRM_TEGRA_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
+#define DRM_TEGRA_TRACE_H 1
+
+#include <linux/device.h>
+#include <linux/tracepoint.h>
+
+DECLARE_EVENT_CLASS(register_access,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value),
+	TP_STRUCT__entry(
+		__field(struct device *, dev)
+		__field(unsigned int, offset)
+		__field(u32, value)
+	),
+	TP_fast_assign(
+		__entry->dev = dev;
+		__entry->offset = offset;
+		__entry->value = value;
+	),
+	TP_printk("%s %04x %08x", dev_name(__entry->dev), __entry->offset,
+		  __entry->value)
+);
+
+DEFINE_EVENT(register_access, dc_writel,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+DEFINE_EVENT(register_access, dc_readl,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+
+DEFINE_EVENT(register_access, hdmi_writel,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+DEFINE_EVENT(register_access, hdmi_readl,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+
+DEFINE_EVENT(register_access, dsi_writel,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+DEFINE_EVENT(register_access, dsi_readl,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+
+DEFINE_EVENT(register_access, dpaux_writel,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+DEFINE_EVENT(register_access, dpaux_readl,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+
+DEFINE_EVENT(register_access, sor_writel,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+DEFINE_EVENT(register_access, sor_readl,
+	TP_PROTO(struct device *dev, unsigned int offset, u32 value),
+	TP_ARGS(dev, offset, value));
+
+#endif /* DRM_TEGRA_TRACE_H */
+
+/* This part must be outside protection */
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace
+#include <trace/define_trace.h>
diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c
index 47cb1aa..2448229 100644
--- a/drivers/gpu/drm/tegra/vic.c
+++ b/drivers/gpu/drm/tegra/vic.c
@@ -258,12 +258,16 @@ static const struct tegra_drm_client_ops vic_ops = {
 	.submit = tegra_drm_submit,
 };
 
+#define NVIDIA_TEGRA_124_VIC_FIRMWARE "nvidia/tegra124/vic03_ucode.bin"
+
 static const struct vic_config vic_t124_config = {
-	.firmware = "nvidia/tegra124/vic03_ucode.bin",
+	.firmware = NVIDIA_TEGRA_124_VIC_FIRMWARE,
 };
 
+#define NVIDIA_TEGRA_210_VIC_FIRMWARE "nvidia/tegra210/vic04_ucode.bin"
+
 static const struct vic_config vic_t210_config = {
-	.firmware = "nvidia/tegra210/vic04_ucode.bin",
+	.firmware = NVIDIA_TEGRA_210_VIC_FIRMWARE,
 };
 
 static const struct of_device_id vic_match[] = {
@@ -394,3 +398,10 @@ struct platform_driver tegra_vic_driver = {
 	.probe = vic_probe,
 	.remove = vic_remove,
 };
+
+#if IS_ENABLED(CONFIG_ARCH_TEGRA_124_SOC)
+MODULE_FIRMWARE(NVIDIA_TEGRA_124_VIC_FIRMWARE);
+#endif
+#if IS_ENABLED(CONFIG_ARCH_TEGRA_210_SOC)
+MODULE_FIRMWARE(NVIDIA_TEGRA_210_VIC_FIRMWARE);
+#endif
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
index d524ed0..406fe45 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_crtc.c
@@ -504,6 +504,12 @@ static void tilcdc_crtc_enable(struct drm_crtc *crtc)
 	mutex_unlock(&tilcdc_crtc->enable_lock);
 }
 
+static void tilcdc_crtc_atomic_enable(struct drm_crtc *crtc,
+				      struct drm_crtc_state *old_state)
+{
+	tilcdc_crtc_enable(crtc);
+}
+
 static void tilcdc_crtc_off(struct drm_crtc *crtc, bool shutdown)
 {
 	struct tilcdc_crtc *tilcdc_crtc = to_tilcdc_crtc(crtc);
@@ -562,6 +568,12 @@ static void tilcdc_crtc_disable(struct drm_crtc *crtc)
 	tilcdc_crtc_off(crtc, false);
 }
 
+static void tilcdc_crtc_atomic_disable(struct drm_crtc *crtc,
+				       struct drm_crtc_state *old_state)
+{
+	tilcdc_crtc_disable(crtc);
+}
+
 void tilcdc_crtc_shutdown(struct drm_crtc *crtc)
 {
 	tilcdc_crtc_off(crtc, true);
@@ -729,9 +741,9 @@ static const struct drm_crtc_funcs tilcdc_crtc_funcs = {
 
 static const struct drm_crtc_helper_funcs tilcdc_crtc_helper_funcs = {
 		.mode_fixup     = tilcdc_crtc_mode_fixup,
-		.enable		= tilcdc_crtc_enable,
-		.disable	= tilcdc_crtc_disable,
 		.atomic_check	= tilcdc_crtc_atomic_check,
+		.atomic_enable	= tilcdc_crtc_atomic_enable,
+		.atomic_disable	= tilcdc_crtc_atomic_disable,
 };
 
 int tilcdc_crtc_max_width(struct drm_crtc *crtc)
@@ -1038,8 +1050,8 @@ int tilcdc_crtc_create(struct drm_device *dev)
 	if (priv->is_componentized) {
 		crtc->port = of_graph_get_port_by_id(dev->dev->of_node, 0);
 		if (!crtc->port) { /* This should never happen */
-			dev_err(dev->dev, "Port node not found in %s\n",
-				dev->dev->of_node->full_name);
+			dev_err(dev->dev, "Port node not found in %pOF\n",
+				dev->dev->of_node);
 			ret = -EINVAL;
 			goto fail;
 		}
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_drv.c b/drivers/gpu/drm/tilcdc/tilcdc_drv.c
index d67e189..b0d70f94 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_drv.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_drv.c
@@ -108,7 +108,11 @@ static int tilcdc_commit(struct drm_device *dev,
 	if (ret)
 		return ret;
 
-	drm_atomic_helper_swap_state(state, true);
+	ret = drm_atomic_helper_swap_state(state, true);
+	if (ret) {
+		drm_atomic_helper_cleanup_planes(dev, state);
+		return ret;
+	}
 
 	/*
 	 * Everything below can be run asynchronously without the need to grab
@@ -538,8 +542,6 @@ static struct drm_driver tilcdc_driver = {
 	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops         = &drm_gem_cma_vm_ops,
 	.dumb_create        = drm_gem_cma_dumb_create,
-	.dumb_map_offset    = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy       = drm_gem_dumb_destroy,
 
 	.prime_handle_to_fd	= drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle	= drm_gem_prime_fd_to_handle,
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_panel.c b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
index 28c3e2f..1813a36 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_panel.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_panel.c
@@ -189,7 +189,6 @@ static struct drm_encoder *panel_connector_best_encoder(
 
 static const struct drm_connector_funcs panel_connector_funcs = {
 	.destroy            = panel_connector_destroy,
-	.dpms               = drm_atomic_helper_connector_dpms,
 	.fill_modes         = drm_helper_probe_single_connector_modes,
 	.reset              = drm_atomic_helper_connector_reset,
 	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_plane.c b/drivers/gpu/drm/tilcdc/tilcdc_plane.c
index ba0d66c..7667b03 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_plane.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_plane.c
@@ -28,7 +28,6 @@ static struct drm_plane_funcs tilcdc_plane_funcs = {
 	.update_plane	= drm_atomic_helper_update_plane,
 	.disable_plane	= drm_atomic_helper_disable_plane,
 	.destroy	= drm_plane_cleanup,
-	.set_property	= drm_atomic_helper_plane_set_property,
 	.reset		= drm_atomic_helper_plane_reset,
 	.atomic_duplicate_state = drm_atomic_helper_plane_duplicate_state,
 	.atomic_destroy_state = drm_atomic_helper_plane_destroy_state,
diff --git a/drivers/gpu/drm/tilcdc/tilcdc_tfp410.c b/drivers/gpu/drm/tilcdc/tilcdc_tfp410.c
index aabfad8..1e2dfb1 100644
--- a/drivers/gpu/drm/tilcdc/tilcdc_tfp410.c
+++ b/drivers/gpu/drm/tilcdc/tilcdc_tfp410.c
@@ -202,7 +202,6 @@ static struct drm_encoder *tfp410_connector_best_encoder(
 
 static const struct drm_connector_funcs tfp410_connector_funcs = {
 	.destroy            = tfp410_connector_destroy,
-	.dpms               = drm_atomic_helper_connector_dpms,
 	.detect             = tfp410_connector_detect,
 	.fill_modes         = drm_helper_probe_single_connector_modes,
 	.reset              = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/tinydrm/Kconfig b/drivers/gpu/drm/tinydrm/Kconfig
index 3504c538..2e790e7 100644
--- a/drivers/gpu/drm/tinydrm/Kconfig
+++ b/drivers/gpu/drm/tinydrm/Kconfig
@@ -19,3 +19,26 @@
 	help
 	  DRM driver for the Multi-Inno MI0283QT display panel
 	  If M is selected the module will be called mi0283qt.
+
+config TINYDRM_REPAPER
+	tristate "DRM support for Pervasive Displays RePaper panels (V231)"
+	depends on DRM_TINYDRM && SPI
+	depends on THERMAL || !THERMAL
+	help
+	  DRM driver for the following Pervasive Displays panels:
+	  1.44" TFT EPD Panel (E1144CS021)
+	  1.90" TFT EPD Panel (E1190CS021)
+	  2.00" TFT EPD Panel (E2200CS021)
+	  2.71" TFT EPD Panel (E2271CS021)
+
+	  If M is selected the module will be called repaper.
+
+config TINYDRM_ST7586
+	tristate "DRM support for Sitronix ST7586 display panels"
+	depends on DRM_TINYDRM && SPI
+	select TINYDRM_MIPI_DBI
+	help
+	  DRM driver for the following Sitronix ST7586 panels:
+	  * LEGO MINDSTORMS EV3
+
+	  If M is selected the module will be called st7586.
diff --git a/drivers/gpu/drm/tinydrm/Makefile b/drivers/gpu/drm/tinydrm/Makefile
index 7a3604c..0c184bd 100644
--- a/drivers/gpu/drm/tinydrm/Makefile
+++ b/drivers/gpu/drm/tinydrm/Makefile
@@ -5,3 +5,5 @@
 
 # Displays
 obj-$(CONFIG_TINYDRM_MI0283QT)		+= mi0283qt.o
+obj-$(CONFIG_TINYDRM_REPAPER)		+= repaper.o
+obj-$(CONFIG_TINYDRM_ST7586)		+= st7586.o
diff --git a/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c b/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
index d4cda33..bd6cce0 100644
--- a/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
+++ b/drivers/gpu/drm/tinydrm/core/tinydrm-helpers.c
@@ -7,13 +7,15 @@
  * (at your option) any later version.
  */
 
-#include <drm/tinydrm/tinydrm.h>
-#include <drm/tinydrm/tinydrm-helpers.h>
 #include <linux/backlight.h>
+#include <linux/dma-buf.h>
 #include <linux/pm.h>
 #include <linux/spi/spi.h>
 #include <linux/swab.h>
 
+#include <drm/tinydrm/tinydrm.h>
+#include <drm/tinydrm/tinydrm-helpers.h>
+
 static unsigned int spi_max;
 module_param(spi_max, uint, 0400);
 MODULE_PARM_DESC(spi_max, "Set a lower SPI max transfer size");
@@ -181,6 +183,60 @@ void tinydrm_xrgb8888_to_rgb565(u16 *dst, void *vaddr,
 EXPORT_SYMBOL(tinydrm_xrgb8888_to_rgb565);
 
 /**
+ * tinydrm_xrgb8888_to_gray8 - Convert XRGB8888 to grayscale
+ * @dst: 8-bit grayscale destination buffer
+ * @vaddr: XRGB8888 source buffer
+ * @fb: DRM framebuffer
+ * @clip: Clip rectangle area to copy
+ *
+ * DRM doesn't have native monochrome or grayscale support.
+ * Drivers for such displays can announce the commonly supported XR24 format
+ * to userspace and use this function to convert to the native format.
+ *
+ * Monochrome drivers will use the most significant bit,
+ * where 1 means foreground color and 0 background color.
+ *
+ * ITU BT.601 is used for the RGB -> luma (brightness) conversion.
+ */
+void tinydrm_xrgb8888_to_gray8(u8 *dst, void *vaddr, struct drm_framebuffer *fb,
+			       struct drm_clip_rect *clip)
+{
+	unsigned int len = (clip->x2 - clip->x1) * sizeof(u32);
+	unsigned int x, y;
+	void *buf;
+	u32 *src;
+
+	if (WARN_ON(fb->format->format != DRM_FORMAT_XRGB8888))
+		return;
+	/*
+	 * The cma memory is write-combined so reads are uncached.
+	 * Speed up by fetching one line at a time.
+	 */
+	buf = kmalloc(len, GFP_KERNEL);
+	if (!buf)
+		return;
+
+	for (y = clip->y1; y < clip->y2; y++) {
+		src = vaddr + (y * fb->pitches[0]);
+		src += clip->x1;
+		memcpy(buf, src, len);
+		src = buf;
+		for (x = clip->x1; x < clip->x2; x++) {
+			u8 r = (*src & 0x00ff0000) >> 16;
+			u8 g = (*src & 0x0000ff00) >> 8;
+			u8 b =  *src & 0x000000ff;
+
+			/* ITU BT.601 luma rounded to 0.3 R + 0.6 G + 0.1 B */
+			*dst++ = (3 * r + 6 * g + b) / 10;
+			src++;
+		}
+	}
+
+	kfree(buf);
+}
+EXPORT_SYMBOL(tinydrm_xrgb8888_to_gray8);
+
+/**
  * tinydrm_of_find_backlight - Find backlight device in device-tree
  * @dev: Device
  *
diff --git a/drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c b/drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c
index ec43fb7..177e9d8 100644
--- a/drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c
+++ b/drivers/gpu/drm/tinydrm/core/tinydrm-pipe.c
@@ -56,7 +56,7 @@ static const struct drm_connector_helper_funcs tinydrm_connector_hfuncs = {
 static enum drm_connector_status
 tinydrm_connector_detect(struct drm_connector *connector, bool force)
 {
-	if (drm_device_is_unplugged(connector->dev))
+	if (drm_dev_is_unplugged(connector->dev))
 		return connector_status_disconnected;
 
 	return connector->status;
@@ -71,7 +71,6 @@ static void tinydrm_connector_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs tinydrm_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.reset = drm_atomic_helper_connector_reset,
 	.detect = tinydrm_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
@@ -225,7 +224,7 @@ tinydrm_display_pipe_init(struct tinydrm_device *tdev,
 		return PTR_ERR(connector);
 
 	ret = drm_simple_display_pipe_init(drm, &tdev->pipe, funcs, formats,
-					   format_count, connector);
+					   format_count, NULL, connector);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/tinydrm/mi0283qt.c b/drivers/gpu/drm/tinydrm/mi0283qt.c
index 482ff1c3..7e5bb7d 100644
--- a/drivers/gpu/drm/tinydrm/mi0283qt.c
+++ b/drivers/gpu/drm/tinydrm/mi0283qt.c
@@ -195,8 +195,12 @@ static int mi0283qt_probe(struct spi_device *spi)
 
 	device_property_read_u32(dev, "rotation", &rotation);
 
-	ret = mipi_dbi_spi_init(spi, mipi, dc, &mi0283qt_pipe_funcs,
-				&mi0283qt_driver, &mi0283qt_mode, rotation);
+	ret = mipi_dbi_spi_init(spi, mipi, dc);
+	if (ret)
+		return ret;
+
+	ret = mipi_dbi_init(&spi->dev, mipi, &mi0283qt_pipe_funcs,
+			    &mi0283qt_driver, &mi0283qt_mode, rotation);
 	if (ret)
 		return ret;
 
diff --git a/drivers/gpu/drm/tinydrm/mipi-dbi.c b/drivers/gpu/drm/tinydrm/mipi-dbi.c
index c83eeb7..2caeabc 100644
--- a/drivers/gpu/drm/tinydrm/mipi-dbi.c
+++ b/drivers/gpu/drm/tinydrm/mipi-dbi.c
@@ -776,15 +776,12 @@ static int mipi_dbi_typec3_command(struct mipi_dbi *mipi, u8 cmd,
 /**
  * mipi_dbi_spi_init - Initialize MIPI DBI SPI interfaced controller
  * @spi: SPI device
- * @dc: D/C gpio (optional)
  * @mipi: &mipi_dbi structure to initialize
- * @pipe_funcs: Display pipe functions
- * @driver: DRM driver
- * @mode: Display mode
- * @rotation: Initial rotation in degrees Counter Clock Wise
+ * @dc: D/C gpio (optional)
  *
  * This function sets &mipi_dbi->command, enables &mipi->read_commands for the
- * usual read commands and initializes @mipi using mipi_dbi_init().
+ * usual read commands. It should be followed by a call to mipi_dbi_init() or
+ * a driver-specific init.
  *
  * If @dc is set, a Type C Option 3 interface is assumed, if not
  * Type C Option 1.
@@ -799,11 +796,7 @@ static int mipi_dbi_typec3_command(struct mipi_dbi *mipi, u8 cmd,
  * Zero on success, negative error code on failure.
  */
 int mipi_dbi_spi_init(struct spi_device *spi, struct mipi_dbi *mipi,
-		      struct gpio_desc *dc,
-		      const struct drm_simple_display_pipe_funcs *pipe_funcs,
-		      struct drm_driver *driver,
-		      const struct drm_display_mode *mode,
-		      unsigned int rotation)
+		      struct gpio_desc *dc)
 {
 	size_t tx_size = tinydrm_spi_max_transfer_size(spi, 0);
 	struct device *dev = &spi->dev;
@@ -849,7 +842,7 @@ int mipi_dbi_spi_init(struct spi_device *spi, struct mipi_dbi *mipi,
 			return -ENOMEM;
 	}
 
-	return mipi_dbi_init(dev, mipi, pipe_funcs, driver, mode, rotation);
+	return 0;
 }
 EXPORT_SYMBOL(mipi_dbi_spi_init);
 
diff --git a/drivers/gpu/drm/tinydrm/repaper.c b/drivers/gpu/drm/tinydrm/repaper.c
new file mode 100644
index 0000000..30dc97b
--- /dev/null
+++ b/drivers/gpu/drm/tinydrm/repaper.c
@@ -0,0 +1,1117 @@
+/*
+ * DRM driver for Pervasive Displays RePaper branded e-ink panels
+ *
+ * Copyright 2013-2017 Pervasive Displays, Inc.
+ * Copyright 2017 Noralf Trønnes
+ *
+ * The driver supports:
+ * Material Film: Aurora Mb (V231)
+ * Driver IC: G2 (eTC)
+ *
+ * The controller code was taken from the userspace driver:
+ * https://github.com/repaper/gratis
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/delay.h>
+#include <linux/dma-buf.h>
+#include <linux/gpio/consumer.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/sched/clock.h>
+#include <linux/spi/spi.h>
+#include <linux/thermal.h>
+
+#include <drm/tinydrm/tinydrm.h>
+#include <drm/tinydrm/tinydrm-helpers.h>
+
+#define REPAPER_RID_G2_COG_ID	0x12
+
+enum repaper_model {
+	E1144CS021 = 1,
+	E1190CS021,
+	E2200CS021,
+	E2271CS021,
+};
+
+enum repaper_stage {         /* Image pixel -> Display pixel */
+	REPAPER_COMPENSATE,  /* B -> W, W -> B (Current Image) */
+	REPAPER_WHITE,       /* B -> N, W -> W (Current Image) */
+	REPAPER_INVERSE,     /* B -> N, W -> B (New Image) */
+	REPAPER_NORMAL       /* B -> B, W -> W (New Image) */
+};
+
+enum repaper_epd_border_byte {
+	REPAPER_BORDER_BYTE_NONE,
+	REPAPER_BORDER_BYTE_ZERO,
+	REPAPER_BORDER_BYTE_SET,
+};
+
+struct repaper_epd {
+	struct tinydrm_device tinydrm;
+	struct spi_device *spi;
+
+	struct gpio_desc *panel_on;
+	struct gpio_desc *border;
+	struct gpio_desc *discharge;
+	struct gpio_desc *reset;
+	struct gpio_desc *busy;
+
+	struct thermal_zone_device *thermal;
+
+	unsigned int height;
+	unsigned int width;
+	unsigned int bytes_per_scan;
+	const u8 *channel_select;
+	unsigned int stage_time;
+	unsigned int factored_stage_time;
+	bool middle_scan;
+	bool pre_border_byte;
+	enum repaper_epd_border_byte border_byte;
+
+	u8 *line_buffer;
+	void *current_frame;
+
+	bool enabled;
+	bool cleared;
+	bool partial;
+};
+
+static inline struct repaper_epd *
+epd_from_tinydrm(struct tinydrm_device *tdev)
+{
+	return container_of(tdev, struct repaper_epd, tinydrm);
+}
+
+static int repaper_spi_transfer(struct spi_device *spi, u8 header,
+				const void *tx, void *rx, size_t len)
+{
+	void *txbuf = NULL, *rxbuf = NULL;
+	struct spi_transfer tr[2] = {};
+	u8 *headerbuf;
+	int ret;
+
+	headerbuf = kmalloc(1, GFP_KERNEL);
+	if (!headerbuf)
+		return -ENOMEM;
+
+	headerbuf[0] = header;
+	tr[0].tx_buf = headerbuf;
+	tr[0].len = 1;
+
+	/* Stack allocated tx? Copy small buffers to DMA-safe memory */
+	if (tx && len <= 32) {
+		txbuf = kmalloc(len, GFP_KERNEL);
+		if (!txbuf) {
+			ret = -ENOMEM;
+			goto out_free;
+		}
+		memcpy(txbuf, tx, len);
+	}
+
+	if (rx) {
+		rxbuf = kmalloc(len, GFP_KERNEL);
+		if (!rxbuf) {
+			ret = -ENOMEM;
+			goto out_free;
+		}
+	}
+
+	tr[1].tx_buf = txbuf ? txbuf : tx;
+	tr[1].rx_buf = rxbuf;
+	tr[1].len = len;
+
+	ndelay(80);
+	ret = spi_sync_transfer(spi, tr, 2);
+	if (rx && !ret)
+		memcpy(rx, rxbuf, len);
+
+out_free:
+	kfree(headerbuf);
+	kfree(txbuf);
+	kfree(rxbuf);
+
+	return ret;
+}
+
+static int repaper_write_buf(struct spi_device *spi, u8 reg,
+			     const u8 *buf, size_t len)
+{
+	int ret;
+
+	ret = repaper_spi_transfer(spi, 0x70, &reg, NULL, 1);
+	if (ret)
+		return ret;
+
+	return repaper_spi_transfer(spi, 0x72, buf, NULL, len);
+}
+
+static int repaper_write_val(struct spi_device *spi, u8 reg, u8 val)
+{
+	return repaper_write_buf(spi, reg, &val, 1);
+}
+
+static int repaper_read_val(struct spi_device *spi, u8 reg)
+{
+	int ret;
+	u8 val;
+
+	ret = repaper_spi_transfer(spi, 0x70, &reg, NULL, 1);
+	if (ret)
+		return ret;
+
+	ret = repaper_spi_transfer(spi, 0x73, NULL, &val, 1);
+
+	return ret ? ret : val;
+}
+
+static int repaper_read_id(struct spi_device *spi)
+{
+	int ret;
+	u8 id;
+
+	ret = repaper_spi_transfer(spi, 0x71, NULL, &id, 1);
+
+	return ret ? ret : id;
+}
+
+static void repaper_spi_mosi_low(struct spi_device *spi)
+{
+	const u8 buf[1] = { 0 };
+
+	spi_write(spi, buf, 1);
+}
+
+/* pixels on display are numbered from 1 so even is actually bits 1,3,5,... */
+static void repaper_even_pixels(struct repaper_epd *epd, u8 **pp,
+				const u8 *data, u8 fixed_value, const u8 *mask,
+				enum repaper_stage stage)
+{
+	unsigned int b;
+
+	for (b = 0; b < (epd->width / 8); b++) {
+		if (data) {
+			u8 pixels = data[b] & 0xaa;
+			u8 pixel_mask = 0xff;
+			u8 p1, p2, p3, p4;
+
+			if (mask) {
+				pixel_mask = (mask[b] ^ pixels) & 0xaa;
+				pixel_mask |= pixel_mask >> 1;
+			}
+
+			switch (stage) {
+			case REPAPER_COMPENSATE: /* B -> W, W -> B (Current) */
+				pixels = 0xaa | ((pixels ^ 0xaa) >> 1);
+				break;
+			case REPAPER_WHITE:      /* B -> N, W -> W (Current) */
+				pixels = 0x55 + ((pixels ^ 0xaa) >> 1);
+				break;
+			case REPAPER_INVERSE:    /* B -> N, W -> B (New) */
+				pixels = 0x55 | (pixels ^ 0xaa);
+				break;
+			case REPAPER_NORMAL:     /* B -> B, W -> W (New) */
+				pixels = 0xaa | (pixels >> 1);
+				break;
+			}
+
+			pixels = (pixels & pixel_mask) | (~pixel_mask & 0x55);
+			p1 = (pixels >> 6) & 0x03;
+			p2 = (pixels >> 4) & 0x03;
+			p3 = (pixels >> 2) & 0x03;
+			p4 = (pixels >> 0) & 0x03;
+			pixels = (p1 << 0) | (p2 << 2) | (p3 << 4) | (p4 << 6);
+			*(*pp)++ = pixels;
+		} else {
+			*(*pp)++ = fixed_value;
+		}
+	}
+}
+
+/* pixels on display are numbered from 1 so odd is actually bits 0,2,4,... */
+static void repaper_odd_pixels(struct repaper_epd *epd, u8 **pp,
+			       const u8 *data, u8 fixed_value, const u8 *mask,
+			       enum repaper_stage stage)
+{
+	unsigned int b;
+
+	for (b = epd->width / 8; b > 0; b--) {
+		if (data) {
+			u8 pixels = data[b - 1] & 0x55;
+			u8 pixel_mask = 0xff;
+
+			if (mask) {
+				pixel_mask = (mask[b - 1] ^ pixels) & 0x55;
+				pixel_mask |= pixel_mask << 1;
+			}
+
+			switch (stage) {
+			case REPAPER_COMPENSATE: /* B -> W, W -> B (Current) */
+				pixels = 0xaa | (pixels ^ 0x55);
+				break;
+			case REPAPER_WHITE:      /* B -> N, W -> W (Current) */
+				pixels = 0x55 + (pixels ^ 0x55);
+				break;
+			case REPAPER_INVERSE:    /* B -> N, W -> B (New) */
+				pixels = 0x55 | ((pixels ^ 0x55) << 1);
+				break;
+			case REPAPER_NORMAL:     /* B -> B, W -> W (New) */
+				pixels = 0xaa | pixels;
+				break;
+			}
+
+			pixels = (pixels & pixel_mask) | (~pixel_mask & 0x55);
+			*(*pp)++ = pixels;
+		} else {
+			*(*pp)++ = fixed_value;
+		}
+	}
+}
+
+/* interleave bits: (byte)76543210 -> (16 bit).7.6.5.4.3.2.1.0 */
+static inline u16 repaper_interleave_bits(u16 value)
+{
+	value = (value | (value << 4)) & 0x0f0f;
+	value = (value | (value << 2)) & 0x3333;
+	value = (value | (value << 1)) & 0x5555;
+
+	return value;
+}
+
+/* pixels on display are numbered from 1 */
+static void repaper_all_pixels(struct repaper_epd *epd, u8 **pp,
+			       const u8 *data, u8 fixed_value, const u8 *mask,
+			       enum repaper_stage stage)
+{
+	unsigned int b;
+
+	for (b = epd->width / 8; b > 0; b--) {
+		if (data) {
+			u16 pixels = repaper_interleave_bits(data[b - 1]);
+			u16 pixel_mask = 0xffff;
+
+			if (mask) {
+				pixel_mask = repaper_interleave_bits(mask[b - 1]);
+
+				pixel_mask = (pixel_mask ^ pixels) & 0x5555;
+				pixel_mask |= pixel_mask << 1;
+			}
+
+			switch (stage) {
+			case REPAPER_COMPENSATE: /* B -> W, W -> B (Current) */
+				pixels = 0xaaaa | (pixels ^ 0x5555);
+				break;
+			case REPAPER_WHITE:      /* B -> N, W -> W (Current) */
+				pixels = 0x5555 + (pixels ^ 0x5555);
+				break;
+			case REPAPER_INVERSE:    /* B -> N, W -> B (New) */
+				pixels = 0x5555 | ((pixels ^ 0x5555) << 1);
+				break;
+			case REPAPER_NORMAL:     /* B -> B, W -> W (New) */
+				pixels = 0xaaaa | pixels;
+				break;
+			}
+
+			pixels = (pixels & pixel_mask) | (~pixel_mask & 0x5555);
+			*(*pp)++ = pixels >> 8;
+			*(*pp)++ = pixels;
+		} else {
+			*(*pp)++ = fixed_value;
+			*(*pp)++ = fixed_value;
+		}
+	}
+}
+
+/* output one line of scan and data bytes to the display */
+static void repaper_one_line(struct repaper_epd *epd, unsigned int line,
+			     const u8 *data, u8 fixed_value, const u8 *mask,
+			     enum repaper_stage stage)
+{
+	u8 *p = epd->line_buffer;
+	unsigned int b;
+
+	repaper_spi_mosi_low(epd->spi);
+
+	if (epd->pre_border_byte)
+		*p++ = 0x00;
+
+	if (epd->middle_scan) {
+		/* data bytes */
+		repaper_odd_pixels(epd, &p, data, fixed_value, mask, stage);
+
+		/* scan line */
+		for (b = epd->bytes_per_scan; b > 0; b--) {
+			if (line / 4 == b - 1)
+				*p++ = 0x03 << (2 * (line & 0x03));
+			else
+				*p++ = 0x00;
+		}
+
+		/* data bytes */
+		repaper_even_pixels(epd, &p, data, fixed_value, mask, stage);
+	} else {
+		/*
+		 * even scan line, but as lines on display are numbered from 1,
+		 * line: 1,3,5,...
+		 */
+		for (b = 0; b < epd->bytes_per_scan; b++) {
+			if (0 != (line & 0x01) && line / 8 == b)
+				*p++ = 0xc0 >> (line & 0x06);
+			else
+				*p++ = 0x00;
+		}
+
+		/* data bytes */
+		repaper_all_pixels(epd, &p, data, fixed_value, mask, stage);
+
+		/*
+		 * odd scan line, but as lines on display are numbered from 1,
+		 * line: 0,2,4,6,...
+		 */
+		for (b = epd->bytes_per_scan; b > 0; b--) {
+			if (0 == (line & 0x01) && line / 8 == b - 1)
+				*p++ = 0x03 << (line & 0x06);
+			else
+				*p++ = 0x00;
+		}
+	}
+
+	switch (epd->border_byte) {
+	case REPAPER_BORDER_BYTE_NONE:
+		break;
+
+	case REPAPER_BORDER_BYTE_ZERO:
+		*p++ = 0x00;
+		break;
+
+	case REPAPER_BORDER_BYTE_SET:
+		switch (stage) {
+		case REPAPER_COMPENSATE:
+		case REPAPER_WHITE:
+		case REPAPER_INVERSE:
+			*p++ = 0x00;
+			break;
+		case REPAPER_NORMAL:
+			*p++ = 0xaa;
+			break;
+		}
+		break;
+	}
+
+	repaper_write_buf(epd->spi, 0x0a, epd->line_buffer,
+			  p - epd->line_buffer);
+
+	/* Output data to panel */
+	repaper_write_val(epd->spi, 0x02, 0x07);
+
+	repaper_spi_mosi_low(epd->spi);
+}
+
+static void repaper_frame_fixed(struct repaper_epd *epd, u8 fixed_value,
+				enum repaper_stage stage)
+{
+	unsigned int line;
+
+	for (line = 0; line < epd->height; line++)
+		repaper_one_line(epd, line, NULL, fixed_value, NULL, stage);
+}
+
+static void repaper_frame_data(struct repaper_epd *epd, const u8 *image,
+			       const u8 *mask, enum repaper_stage stage)
+{
+	unsigned int line;
+
+	if (!mask) {
+		for (line = 0; line < epd->height; line++) {
+			repaper_one_line(epd, line,
+					 &image[line * (epd->width / 8)],
+					 0, NULL, stage);
+		}
+	} else {
+		for (line = 0; line < epd->height; line++) {
+			size_t n = line * epd->width / 8;
+
+			repaper_one_line(epd, line, &image[n], 0, &mask[n],
+					 stage);
+		}
+	}
+}
+
+static void repaper_frame_fixed_repeat(struct repaper_epd *epd, u8 fixed_value,
+				       enum repaper_stage stage)
+{
+	u64 start = local_clock();
+	u64 end = start + ((u64)epd->factored_stage_time * 1000 * 1000);
+
+	do {
+		repaper_frame_fixed(epd, fixed_value, stage);
+	} while (local_clock() < end);
+}
+
+static void repaper_frame_data_repeat(struct repaper_epd *epd, const u8 *image,
+				      const u8 *mask, enum repaper_stage stage)
+{
+	u64 start = local_clock();
+	u64 end = start + ((u64)epd->factored_stage_time * 1000 * 1000);
+
+	do {
+		repaper_frame_data(epd, image, mask, stage);
+	} while (local_clock() < end);
+}
+
+static void repaper_get_temperature(struct repaper_epd *epd)
+{
+	int ret, temperature = 0;
+	unsigned int factor10x;
+
+	if (!epd->thermal)
+		return;
+
+	ret = thermal_zone_get_temp(epd->thermal, &temperature);
+	if (ret) {
+		dev_err(&epd->spi->dev, "Failed to get temperature (%d)\n",
+			ret);
+		return;
+	}
+
+	temperature /= 1000;
+
+	if (temperature <= -10)
+		factor10x = 170;
+	else if (temperature <= -5)
+		factor10x = 120;
+	else if (temperature <= 5)
+		factor10x = 80;
+	else if (temperature <= 10)
+		factor10x = 40;
+	else if (temperature <= 15)
+		factor10x = 30;
+	else if (temperature <= 20)
+		factor10x = 20;
+	else if (temperature <= 40)
+		factor10x = 10;
+	else
+		factor10x = 7;
+
+	epd->factored_stage_time = epd->stage_time * factor10x / 10;
+}
+
+static void repaper_gray8_to_mono_reversed(u8 *buf, u32 width, u32 height)
+{
+	u8 *gray8 = buf, *mono = buf;
+	int y, xb, i;
+
+	for (y = 0; y < height; y++)
+		for (xb = 0; xb < width / 8; xb++) {
+			u8 byte = 0x00;
+
+			for (i = 0; i < 8; i++) {
+				int x = xb * 8 + i;
+
+				byte >>= 1;
+				if (gray8[y * width + x] >> 7)
+					byte |= BIT(7);
+			}
+			*mono++ = byte;
+		}
+}
+
+static int repaper_fb_dirty(struct drm_framebuffer *fb,
+			    struct drm_file *file_priv,
+			    unsigned int flags, unsigned int color,
+			    struct drm_clip_rect *clips,
+			    unsigned int num_clips)
+{
+	struct drm_gem_cma_object *cma_obj = drm_fb_cma_get_gem_obj(fb, 0);
+	struct dma_buf_attachment *import_attach = cma_obj->base.import_attach;
+	struct tinydrm_device *tdev = fb->dev->dev_private;
+	struct repaper_epd *epd = epd_from_tinydrm(tdev);
+	struct drm_clip_rect clip;
+	u8 *buf = NULL;
+	int ret = 0;
+
+	/* repaper can't do partial updates */
+	clip.x1 = 0;
+	clip.x2 = fb->width;
+	clip.y1 = 0;
+	clip.y2 = fb->height;
+
+	mutex_lock(&tdev->dirty_lock);
+
+	if (!epd->enabled)
+		goto out_unlock;
+
+	/* fbdev can flush even when we're not interested */
+	if (tdev->pipe.plane.fb != fb)
+		goto out_unlock;
+
+	repaper_get_temperature(epd);
+
+	DRM_DEBUG("Flushing [FB:%d] st=%ums\n", fb->base.id,
+		  epd->factored_stage_time);
+
+	buf = kmalloc(fb->width * fb->height, GFP_KERNEL);
+	if (!buf) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	if (import_attach) {
+		ret = dma_buf_begin_cpu_access(import_attach->dmabuf,
+					       DMA_FROM_DEVICE);
+		if (ret)
+			goto out_unlock;
+	}
+
+	tinydrm_xrgb8888_to_gray8(buf, cma_obj->vaddr, fb, &clip);
+
+	if (import_attach) {
+		ret = dma_buf_end_cpu_access(import_attach->dmabuf,
+					     DMA_FROM_DEVICE);
+		if (ret)
+			goto out_unlock;
+	}
+
+	repaper_gray8_to_mono_reversed(buf, fb->width, fb->height);
+
+	if (epd->partial) {
+		repaper_frame_data_repeat(epd, buf, epd->current_frame,
+					  REPAPER_NORMAL);
+	} else if (epd->cleared) {
+		repaper_frame_data_repeat(epd, epd->current_frame, NULL,
+					  REPAPER_COMPENSATE);
+		repaper_frame_data_repeat(epd, epd->current_frame, NULL,
+					  REPAPER_WHITE);
+		repaper_frame_data_repeat(epd, buf, NULL, REPAPER_INVERSE);
+		repaper_frame_data_repeat(epd, buf, NULL, REPAPER_NORMAL);
+
+		epd->partial = true;
+	} else {
+		/* Clear display (anything -> white) */
+		repaper_frame_fixed_repeat(epd, 0xff, REPAPER_COMPENSATE);
+		repaper_frame_fixed_repeat(epd, 0xff, REPAPER_WHITE);
+		repaper_frame_fixed_repeat(epd, 0xaa, REPAPER_INVERSE);
+		repaper_frame_fixed_repeat(epd, 0xaa, REPAPER_NORMAL);
+
+		/* Assuming a clear (white) screen, output an image */
+		repaper_frame_fixed_repeat(epd, 0xaa, REPAPER_COMPENSATE);
+		repaper_frame_fixed_repeat(epd, 0xaa, REPAPER_WHITE);
+		repaper_frame_data_repeat(epd, buf, NULL, REPAPER_INVERSE);
+		repaper_frame_data_repeat(epd, buf, NULL, REPAPER_NORMAL);
+
+		epd->cleared = true;
+		epd->partial = true;
+	}
+
+	memcpy(epd->current_frame, buf, fb->width * fb->height / 8);
+
+	/*
+	 * An extra frame write is needed if pixels are set in the bottom line,
+	 * or else grey lines rise up from the pixels.
+	 */
+	if (epd->pre_border_byte) {
+		unsigned int x;
+
+		for (x = 0; x < (fb->width / 8); x++)
+			if (buf[x + (fb->width * (fb->height - 1) / 8)]) {
+				repaper_frame_data_repeat(epd, buf,
+							  epd->current_frame,
+							  REPAPER_NORMAL);
+				break;
+			}
+	}
+
+out_unlock:
+	mutex_unlock(&tdev->dirty_lock);
+
+	if (ret)
+		dev_err(fb->dev->dev, "Failed to update display (%d)\n", ret);
+	kfree(buf);
+
+	return ret;
+}
+
+static const struct drm_framebuffer_funcs repaper_fb_funcs = {
+	.destroy	= drm_fb_cma_destroy,
+	.create_handle	= drm_fb_cma_create_handle,
+	.dirty		= repaper_fb_dirty,
+};
+
+static void power_off(struct repaper_epd *epd)
+{
+	/* Turn off power and all signals */
+	gpiod_set_value_cansleep(epd->reset, 0);
+	gpiod_set_value_cansleep(epd->panel_on, 0);
+	if (epd->border)
+		gpiod_set_value_cansleep(epd->border, 0);
+
+	/* Ensure SPI MOSI and CLOCK are Low before CS Low */
+	repaper_spi_mosi_low(epd->spi);
+
+	/* Discharge pulse */
+	gpiod_set_value_cansleep(epd->discharge, 1);
+	msleep(150);
+	gpiod_set_value_cansleep(epd->discharge, 0);
+}
+
+static void repaper_pipe_enable(struct drm_simple_display_pipe *pipe,
+				struct drm_crtc_state *crtc_state)
+{
+	struct tinydrm_device *tdev = pipe_to_tinydrm(pipe);
+	struct repaper_epd *epd = epd_from_tinydrm(tdev);
+	struct spi_device *spi = epd->spi;
+	struct device *dev = &spi->dev;
+	bool dc_ok = false;
+	int i, ret;
+
+	DRM_DEBUG_DRIVER("\n");
+
+	/* Power up sequence */
+	gpiod_set_value_cansleep(epd->reset, 0);
+	gpiod_set_value_cansleep(epd->panel_on, 0);
+	gpiod_set_value_cansleep(epd->discharge, 0);
+	if (epd->border)
+		gpiod_set_value_cansleep(epd->border, 0);
+	repaper_spi_mosi_low(spi);
+	usleep_range(5000, 10000);
+
+	gpiod_set_value_cansleep(epd->panel_on, 1);
+	/*
+	 * This delay comes from the repaper.org userspace driver; it's not
+	 * mentioned in the datasheet.
+	 */
+	usleep_range(10000, 15000);
+	gpiod_set_value_cansleep(epd->reset, 1);
+	if (epd->border)
+		gpiod_set_value_cansleep(epd->border, 1);
+	usleep_range(5000, 10000);
+	gpiod_set_value_cansleep(epd->reset, 0);
+	usleep_range(5000, 10000);
+	gpiod_set_value_cansleep(epd->reset, 1);
+	usleep_range(5000, 10000);
+
+	/* Wait for COG to become ready */
+	for (i = 100; i > 0; i--) {
+		if (!gpiod_get_value_cansleep(epd->busy))
+			break;
+
+		usleep_range(10, 100);
+	}
+
+	if (!i) {
+		dev_err(dev, "timeout waiting for panel to become ready.\n");
+		power_off(epd);
+		return;
+	}
+
+	repaper_read_id(spi);
+	ret = repaper_read_id(spi);
+	if (ret != REPAPER_RID_G2_COG_ID) {
+		if (ret < 0)
+			dev_err(dev, "failed to read chip (%d)\n", ret);
+		else
+			dev_err(dev, "wrong COG ID 0x%02x\n", ret);
+		power_off(epd);
+		return;
+	}
+
+	/* Disable OE */
+	repaper_write_val(spi, 0x02, 0x40);
+
+	ret = repaper_read_val(spi, 0x0f);
+	if (ret < 0 || !(ret & 0x80)) {
+		if (ret < 0)
+			dev_err(dev, "failed to read chip (%d)\n", ret);
+		else
+			dev_err(dev, "panel is reported broken\n");
+		power_off(epd);
+		return;
+	}
+
+	/* Power saving mode */
+	repaper_write_val(spi, 0x0b, 0x02);
+	/* Channel select */
+	repaper_write_buf(spi, 0x01, epd->channel_select, 8);
+	/* High power mode osc */
+	repaper_write_val(spi, 0x07, 0xd1);
+	/* Power setting */
+	repaper_write_val(spi, 0x08, 0x02);
+	/* Vcom level */
+	repaper_write_val(spi, 0x09, 0xc2);
+	/* Power setting */
+	repaper_write_val(spi, 0x04, 0x03);
+	/* Driver latch on */
+	repaper_write_val(spi, 0x03, 0x01);
+	/* Driver latch off */
+	repaper_write_val(spi, 0x03, 0x00);
+	usleep_range(5000, 10000);
+
+	/* Start chargepump */
+	for (i = 0; i < 4; ++i) {
+		/* Charge pump positive voltage on - VGH/VDL on */
+		repaper_write_val(spi, 0x05, 0x01);
+		msleep(240);
+
+		/* Charge pump negative voltage on - VGL/VDL on */
+		repaper_write_val(spi, 0x05, 0x03);
+		msleep(40);
+
+		/* Charge pump Vcom on - Vcom driver on */
+		repaper_write_val(spi, 0x05, 0x0f);
+		msleep(40);
+
+		/* check DC/DC */
+		ret = repaper_read_val(spi, 0x0f);
+		if (ret < 0) {
+			dev_err(dev, "failed to read chip (%d)\n", ret);
+			power_off(epd);
+			return;
+		}
+
+		if (ret & 0x40) {
+			dc_ok = true;
+			break;
+		}
+	}
+
+	if (!dc_ok) {
+		dev_err(dev, "dc/dc failed\n");
+		power_off(epd);
+		return;
+	}
+
+	/*
+	 * Output enable to disable
+	 * The userspace driver sets this to 0x04, but the datasheet says 0x06
+	 */
+	repaper_write_val(spi, 0x02, 0x04);
+
+	epd->enabled = true;
+	epd->partial = false;
+}
+
+static void repaper_pipe_disable(struct drm_simple_display_pipe *pipe)
+{
+	struct tinydrm_device *tdev = pipe_to_tinydrm(pipe);
+	struct repaper_epd *epd = epd_from_tinydrm(tdev);
+	struct spi_device *spi = epd->spi;
+	unsigned int line;
+
+	DRM_DEBUG_DRIVER("\n");
+
+	mutex_lock(&tdev->dirty_lock);
+	epd->enabled = false;
+	mutex_unlock(&tdev->dirty_lock);
+
+	/* Nothing frame */
+	for (line = 0; line < epd->height; line++)
+		repaper_one_line(epd, 0x7fffu, NULL, 0x00, NULL,
+				 REPAPER_COMPENSATE);
+
+	/* 2.7" */
+	if (epd->border) {
+		/* Dummy line */
+		repaper_one_line(epd, 0x7fffu, NULL, 0x00, NULL,
+				 REPAPER_COMPENSATE);
+		msleep(25);
+		gpiod_set_value_cansleep(epd->border, 0);
+		msleep(200);
+		gpiod_set_value_cansleep(epd->border, 1);
+	} else {
+		/* Border dummy line */
+		repaper_one_line(epd, 0x7fffu, NULL, 0x00, NULL,
+				 REPAPER_NORMAL);
+		msleep(200);
+	}
+
+	/* not described in datasheet */
+	repaper_write_val(spi, 0x0b, 0x00);
+	/* Latch reset turn on */
+	repaper_write_val(spi, 0x03, 0x01);
+	/* Power off charge pump Vcom */
+	repaper_write_val(spi, 0x05, 0x03);
+	/* Power off charge pump neg voltage */
+	repaper_write_val(spi, 0x05, 0x01);
+	msleep(120);
+	/* Discharge internal */
+	repaper_write_val(spi, 0x04, 0x80);
+	/* turn off all charge pumps */
+	repaper_write_val(spi, 0x05, 0x00);
+	/* Turn off osc */
+	repaper_write_val(spi, 0x07, 0x01);
+	msleep(50);
+
+	power_off(epd);
+}
+
+static const struct drm_simple_display_pipe_funcs repaper_pipe_funcs = {
+	.enable = repaper_pipe_enable,
+	.disable = repaper_pipe_disable,
+	.update = tinydrm_display_pipe_update,
+	.prepare_fb = tinydrm_display_pipe_prepare_fb,
+};
+
+static const uint32_t repaper_formats[] = {
+	DRM_FORMAT_XRGB8888,
+};
+
+static const struct drm_display_mode repaper_e1144cs021_mode = {
+	TINYDRM_MODE(128, 96, 29, 22),
+};
+
+static const u8 repaper_e1144cs021_cs[] = { 0x00, 0x00, 0x00, 0x00,
+					    0x00, 0x0f, 0xff, 0x00 };
+
+static const struct drm_display_mode repaper_e1190cs021_mode = {
+	TINYDRM_MODE(144, 128, 36, 32),
+};
+
+static const u8 repaper_e1190cs021_cs[] = { 0x00, 0x00, 0x00, 0x03,
+					    0xfc, 0x00, 0x00, 0xff };
+
+static const struct drm_display_mode repaper_e2200cs021_mode = {
+	TINYDRM_MODE(200, 96, 46, 22),
+};
+
+static const u8 repaper_e2200cs021_cs[] = { 0x00, 0x00, 0x00, 0x00,
+					    0x01, 0xff, 0xe0, 0x00 };
+
+static const struct drm_display_mode repaper_e2271cs021_mode = {
+	TINYDRM_MODE(264, 176, 57, 38),
+};
+
+static const u8 repaper_e2271cs021_cs[] = { 0x00, 0x00, 0x00, 0x7f,
+					    0xff, 0xfe, 0x00, 0x00 };
+
+DEFINE_DRM_GEM_CMA_FOPS(repaper_fops);
+
+static struct drm_driver repaper_driver = {
+	.driver_features	= DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME |
+				  DRIVER_ATOMIC,
+	.fops			= &repaper_fops,
+	TINYDRM_GEM_DRIVER_OPS,
+	.name			= "repaper",
+	.desc			= "Pervasive Displays RePaper e-ink panels",
+	.date			= "20170405",
+	.major			= 1,
+	.minor			= 0,
+};
+
+static const struct of_device_id repaper_of_match[] = {
+	{ .compatible = "pervasive,e1144cs021", .data = (void *)E1144CS021 },
+	{ .compatible = "pervasive,e1190cs021", .data = (void *)E1190CS021 },
+	{ .compatible = "pervasive,e2200cs021", .data = (void *)E2200CS021 },
+	{ .compatible = "pervasive,e2271cs021", .data = (void *)E2271CS021 },
+	{},
+};
+MODULE_DEVICE_TABLE(of, repaper_of_match);
+
+static const struct spi_device_id repaper_id[] = {
+	{ "e1144cs021", E1144CS021 },
+	{ "e1190cs021", E1190CS021 },
+	{ "e2200cs021", E2200CS021 },
+	{ "e2271cs021", E2271CS021 },
+	{ },
+};
+MODULE_DEVICE_TABLE(spi, repaper_id);
+
+static int repaper_probe(struct spi_device *spi)
+{
+	const struct drm_display_mode *mode;
+	const struct spi_device_id *spi_id;
+	const struct of_device_id *match;
+	struct device *dev = &spi->dev;
+	struct tinydrm_device *tdev;
+	enum repaper_model model;
+	const char *thermal_zone;
+	struct repaper_epd *epd;
+	size_t line_buffer_size;
+	int ret;
+
+	match = of_match_device(repaper_of_match, dev);
+	if (match) {
+		model = (enum repaper_model)match->data;
+	} else {
+		spi_id = spi_get_device_id(spi);
+		model = spi_id->driver_data;
+	}
+
+	/* The SPI device is used to allocate dma memory */
+	if (!dev->coherent_dma_mask) {
+		ret = dma_coerce_mask_and_coherent(dev, DMA_BIT_MASK(32));
+		if (ret) {
+			dev_warn(dev, "Failed to set dma mask %d\n", ret);
+			return ret;
+		}
+	}
+
+	epd = devm_kzalloc(dev, sizeof(*epd), GFP_KERNEL);
+	if (!epd)
+		return -ENOMEM;
+
+	epd->spi = spi;
+
+	epd->panel_on = devm_gpiod_get(dev, "panel-on", GPIOD_OUT_LOW);
+	if (IS_ERR(epd->panel_on)) {
+		ret = PTR_ERR(epd->panel_on);
+		if (ret != -EPROBE_DEFER)
+			dev_err(dev, "Failed to get gpio 'panel-on'\n");
+		return ret;
+	}
+
+	epd->discharge = devm_gpiod_get(dev, "discharge", GPIOD_OUT_LOW);
+	if (IS_ERR(epd->discharge)) {
+		ret = PTR_ERR(epd->discharge);
+		if (ret != -EPROBE_DEFER)
+			dev_err(dev, "Failed to get gpio 'discharge'\n");
+		return ret;
+	}
+
+	epd->reset = devm_gpiod_get(dev, "reset", GPIOD_OUT_LOW);
+	if (IS_ERR(epd->reset)) {
+		ret = PTR_ERR(epd->reset);
+		if (ret != -EPROBE_DEFER)
+			dev_err(dev, "Failed to get gpio 'reset'\n");
+		return ret;
+	}
+
+	epd->busy = devm_gpiod_get(dev, "busy", GPIOD_IN);
+	if (IS_ERR(epd->busy)) {
+		ret = PTR_ERR(epd->busy);
+		if (ret != -EPROBE_DEFER)
+			dev_err(dev, "Failed to get gpio 'busy'\n");
+		return ret;
+	}
+
+	if (!device_property_read_string(dev, "pervasive,thermal-zone",
+					 &thermal_zone)) {
+		epd->thermal = thermal_zone_get_zone_by_name(thermal_zone);
+		if (IS_ERR(epd->thermal)) {
+			dev_err(dev, "Failed to get thermal zone: %s\n",
+				thermal_zone);
+			return PTR_ERR(epd->thermal);
+		}
+	}
+
+	switch (model) {
+	case E1144CS021:
+		mode = &repaper_e1144cs021_mode;
+		epd->channel_select = repaper_e1144cs021_cs;
+		epd->stage_time = 480;
+		epd->bytes_per_scan = 96 / 4;
+		epd->middle_scan = true; /* data-scan-data */
+		epd->pre_border_byte = false;
+		epd->border_byte = REPAPER_BORDER_BYTE_ZERO;
+		break;
+
+	case E1190CS021:
+		mode = &repaper_e1190cs021_mode;
+		epd->channel_select = repaper_e1190cs021_cs;
+		epd->stage_time = 480;
+		epd->bytes_per_scan = 128 / 4 / 2;
+		epd->middle_scan = false; /* scan-data-scan */
+		epd->pre_border_byte = false;
+		epd->border_byte = REPAPER_BORDER_BYTE_SET;
+		break;
+
+	case E2200CS021:
+		mode = &repaper_e2200cs021_mode;
+		epd->channel_select = repaper_e2200cs021_cs;
+		epd->stage_time = 480;
+		epd->bytes_per_scan = 96 / 4;
+		epd->middle_scan = true; /* data-scan-data */
+		epd->pre_border_byte = true;
+		epd->border_byte = REPAPER_BORDER_BYTE_NONE;
+		break;
+
+	case E2271CS021:
+		epd->border = devm_gpiod_get(dev, "border", GPIOD_OUT_LOW);
+		if (IS_ERR(epd->border)) {
+			ret = PTR_ERR(epd->border);
+			if (ret != -EPROBE_DEFER)
+				dev_err(dev, "Failed to get gpio 'border'\n");
+			return ret;
+		}
+
+		mode = &repaper_e2271cs021_mode;
+		epd->channel_select = repaper_e2271cs021_cs;
+		epd->stage_time = 630;
+		epd->bytes_per_scan = 176 / 4;
+		epd->middle_scan = true; /* data-scan-data */
+		epd->pre_border_byte = true;
+		epd->border_byte = REPAPER_BORDER_BYTE_NONE;
+		break;
+
+	default:
+		return -ENODEV;
+	}
+
+	epd->width = mode->hdisplay;
+	epd->height = mode->vdisplay;
+	epd->factored_stage_time = epd->stage_time;
+
+	line_buffer_size = 2 * epd->width / 8 + epd->bytes_per_scan + 2;
+	epd->line_buffer = devm_kzalloc(dev, line_buffer_size, GFP_KERNEL);
+	if (!epd->line_buffer)
+		return -ENOMEM;
+
+	epd->current_frame = devm_kzalloc(dev, epd->width * epd->height / 8,
+					  GFP_KERNEL);
+	if (!epd->current_frame)
+		return -ENOMEM;
+
+	tdev = &epd->tinydrm;
+
+	ret = devm_tinydrm_init(dev, tdev, &repaper_fb_funcs, &repaper_driver);
+	if (ret)
+		return ret;
+
+	ret = tinydrm_display_pipe_init(tdev, &repaper_pipe_funcs,
+					DRM_MODE_CONNECTOR_VIRTUAL,
+					repaper_formats,
+					ARRAY_SIZE(repaper_formats), mode, 0);
+	if (ret)
+		return ret;
+
+	drm_mode_config_reset(tdev->drm);
+
+	ret = devm_tinydrm_register(tdev);
+	if (ret)
+		return ret;
+
+	spi_set_drvdata(spi, tdev);
+
+	DRM_DEBUG_DRIVER("Initialized %s:%s @%uMHz on minor %d\n",
+			 tdev->drm->driver->name, dev_name(dev),
+			 spi->max_speed_hz / 1000000,
+			 tdev->drm->primary->index);
+
+	return 0;
+}
+
+static void repaper_shutdown(struct spi_device *spi)
+{
+	struct tinydrm_device *tdev = spi_get_drvdata(spi);
+
+	tinydrm_shutdown(tdev);
+}
+
+static struct spi_driver repaper_spi_driver = {
+	.driver = {
+		.name = "repaper",
+		.owner = THIS_MODULE,
+		.of_match_table = repaper_of_match,
+	},
+	.id_table = repaper_id,
+	.probe = repaper_probe,
+	.shutdown = repaper_shutdown,
+};
+module_spi_driver(repaper_spi_driver);
+
+MODULE_DESCRIPTION("Pervasive Displays RePaper DRM driver");
+MODULE_AUTHOR("Noralf Trønnes");
+MODULE_LICENSE("GPL");
diff --git a/drivers/gpu/drm/tinydrm/st7586.c b/drivers/gpu/drm/tinydrm/st7586.c
new file mode 100644
index 0000000..b439956
--- /dev/null
+++ b/drivers/gpu/drm/tinydrm/st7586.c
@@ -0,0 +1,428 @@
+/*
+ * DRM driver for Sitronix ST7586 panels
+ *
+ * Copyright 2017 David Lechner <david@lechnology.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/delay.h>
+#include <linux/dma-buf.h>
+#include <linux/gpio/consumer.h>
+#include <linux/module.h>
+#include <linux/property.h>
+#include <linux/spi/spi.h>
+#include <video/mipi_display.h>
+
+#include <drm/tinydrm/mipi-dbi.h>
+#include <drm/tinydrm/tinydrm-helpers.h>
+
+/* controller-specific commands */
+#define ST7586_DISP_MODE_GRAY	0x38
+#define ST7586_DISP_MODE_MONO	0x39
+#define ST7586_ENABLE_DDRAM	0x3a
+#define ST7586_SET_DISP_DUTY	0xb0
+#define ST7586_SET_PART_DISP	0xb4
+#define ST7586_SET_NLINE_INV	0xb5
+#define ST7586_SET_VOP		0xc0
+#define ST7586_SET_BIAS_SYSTEM	0xc3
+#define ST7586_SET_BOOST_LEVEL	0xc4
+#define ST7586_SET_VOP_OFFSET	0xc7
+#define ST7586_ENABLE_ANALOG	0xd0
+#define ST7586_AUTO_READ_CTRL	0xd7
+#define ST7586_OTP_RW_CTRL	0xe0
+#define ST7586_OTP_CTRL_OUT	0xe1
+#define ST7586_OTP_READ		0xe3
+
+#define ST7586_DISP_CTRL_MX	BIT(6)
+#define ST7586_DISP_CTRL_MY	BIT(7)
+
+/*
+ * The ST7586 controller has an unusual pixel format where 2bpp grayscale is
+ * packed 3 pixels per byte with the first two pixels using 3 bits and the 3rd
+ * pixel using only 2 bits.
+ *
+ * |  D7  |  D6  |  D5  ||      |      || 2bpp |
+ * | (D4) | (D3) | (D2) ||  D1  |  D0  || GRAY |
+ * +------+------+------++------+------++------+
+ * |  1   |  1   |  1   ||  1   |  1   || 0  0 | black
+ * |  1   |  0   |  0   ||  1   |  0   || 0  1 | dark gray
+ * |  0   |  1   |  0   ||  0   |  1   || 1  0 | light gray
+ * |  0   |  0   |  0   ||  0   |  0   || 1  1 | white
+ */
+
+static const u8 st7586_lookup[] = { 0x7, 0x4, 0x2, 0x0 };
+
+static void st7586_xrgb8888_to_gray332(u8 *dst, void *vaddr,
+				       struct drm_framebuffer *fb,
+				       struct drm_clip_rect *clip)
+{
+	size_t len = (clip->x2 - clip->x1) * (clip->y2 - clip->y1);
+	unsigned int x, y;
+	u8 *src, *buf, val;
+
+	buf = kmalloc(len, GFP_KERNEL);
+	if (!buf)
+		return;
+
+	tinydrm_xrgb8888_to_gray8(buf, vaddr, fb, clip);
+	src = buf;
+
+	for (y = clip->y1; y < clip->y2; y++) {
+		for (x = clip->x1; x < clip->x2; x += 3) {
+			val = st7586_lookup[*src++ >> 6] << 5;
+			val |= st7586_lookup[*src++ >> 6] << 2;
+			val |= st7586_lookup[*src++ >> 6] >> 1;
+			*dst++ = val;
+		}
+	}
+
+	kfree(buf);
+}
+
+static int st7586_buf_copy(void *dst, struct drm_framebuffer *fb,
+			   struct drm_clip_rect *clip)
+{
+	struct drm_gem_cma_object *cma_obj = drm_fb_cma_get_gem_obj(fb, 0);
+	struct dma_buf_attachment *import_attach = cma_obj->base.import_attach;
+	void *src = cma_obj->vaddr;
+	int ret = 0;
+
+	if (import_attach) {
+		ret = dma_buf_begin_cpu_access(import_attach->dmabuf,
+					       DMA_FROM_DEVICE);
+		if (ret)
+			return ret;
+	}
+
+	st7586_xrgb8888_to_gray332(dst, src, fb, clip);
+
+	if (import_attach)
+		ret = dma_buf_end_cpu_access(import_attach->dmabuf,
+					     DMA_FROM_DEVICE);
+
+	return ret;
+}
+
+static int st7586_fb_dirty(struct drm_framebuffer *fb,
+			   struct drm_file *file_priv, unsigned int flags,
+			   unsigned int color, struct drm_clip_rect *clips,
+			   unsigned int num_clips)
+{
+	struct tinydrm_device *tdev = fb->dev->dev_private;
+	struct mipi_dbi *mipi = mipi_dbi_from_tinydrm(tdev);
+	struct drm_clip_rect clip;
+	int start, end;
+	int ret = 0;
+
+	mutex_lock(&tdev->dirty_lock);
+
+	if (!mipi->enabled)
+		goto out_unlock;
+
+	/* fbdev can flush even when we're not interested */
+	if (tdev->pipe.plane.fb != fb)
+		goto out_unlock;
+
+	tinydrm_merge_clips(&clip, clips, num_clips, flags, fb->width,
+			    fb->height);
+
+	/* 3 pixels per byte, so grow clip to nearest multiple of 3 */
+	clip.x1 = rounddown(clip.x1, 3);
+	clip.x2 = roundup(clip.x2, 3);
+
+	DRM_DEBUG("Flushing [FB:%d] x1=%u, x2=%u, y1=%u, y2=%u\n", fb->base.id,
+		  clip.x1, clip.x2, clip.y1, clip.y2);
+
+	ret = st7586_buf_copy(mipi->tx_buf, fb, &clip);
+	if (ret)
+		goto out_unlock;
+
+	/* Pixels are packed 3 per byte */
+	start = clip.x1 / 3;
+	end = clip.x2 / 3;
+
+	mipi_dbi_command(mipi, MIPI_DCS_SET_COLUMN_ADDRESS,
+			 (start >> 8) & 0xFF, start & 0xFF,
+			 (end >> 8) & 0xFF, (end - 1) & 0xFF);
+	mipi_dbi_command(mipi, MIPI_DCS_SET_PAGE_ADDRESS,
+			 (clip.y1 >> 8) & 0xFF, clip.y1 & 0xFF,
+			 (clip.y2 >> 8) & 0xFF, (clip.y2 - 1) & 0xFF);
+
+	ret = mipi_dbi_command_buf(mipi, MIPI_DCS_WRITE_MEMORY_START,
+				   (u8 *)mipi->tx_buf,
+				   (end - start) * (clip.y2 - clip.y1));
+
+out_unlock:
+	mutex_unlock(&tdev->dirty_lock);
+
+	if (ret)
+		dev_err_once(fb->dev->dev, "Failed to update display %d\n",
+			     ret);
+
+	return ret;
+}
+
+static const struct drm_framebuffer_funcs st7586_fb_funcs = {
+	.destroy	= drm_fb_cma_destroy,
+	.create_handle	= drm_fb_cma_create_handle,
+	.dirty		= st7586_fb_dirty,
+};
+
+static void st7586_pipe_enable(struct drm_simple_display_pipe *pipe,
+			       struct drm_crtc_state *crtc_state)
+{
+	struct tinydrm_device *tdev = pipe_to_tinydrm(pipe);
+	struct mipi_dbi *mipi = mipi_dbi_from_tinydrm(tdev);
+	struct drm_framebuffer *fb = pipe->plane.fb;
+	struct device *dev = tdev->drm->dev;
+	int ret;
+	u8 addr_mode;
+
+	DRM_DEBUG_KMS("\n");
+
+	mipi_dbi_hw_reset(mipi);
+	ret = mipi_dbi_command(mipi, ST7586_AUTO_READ_CTRL, 0x9f);
+	if (ret) {
+		dev_err(dev, "Error sending command %d\n", ret);
+		return;
+	}
+
+	mipi_dbi_command(mipi, ST7586_OTP_RW_CTRL, 0x00);
+
+	msleep(10);
+
+	mipi_dbi_command(mipi, ST7586_OTP_READ);
+
+	msleep(20);
+
+	mipi_dbi_command(mipi, ST7586_OTP_CTRL_OUT);
+	mipi_dbi_command(mipi, MIPI_DCS_EXIT_SLEEP_MODE);
+	mipi_dbi_command(mipi, MIPI_DCS_SET_DISPLAY_OFF);
+
+	msleep(50);
+
+	mipi_dbi_command(mipi, ST7586_SET_VOP_OFFSET, 0x00);
+	mipi_dbi_command(mipi, ST7586_SET_VOP, 0xe3, 0x00);
+	mipi_dbi_command(mipi, ST7586_SET_BIAS_SYSTEM, 0x02);
+	mipi_dbi_command(mipi, ST7586_SET_BOOST_LEVEL, 0x04);
+	mipi_dbi_command(mipi, ST7586_ENABLE_ANALOG, 0x1d);
+	mipi_dbi_command(mipi, ST7586_SET_NLINE_INV, 0x00);
+	mipi_dbi_command(mipi, ST7586_DISP_MODE_GRAY);
+	mipi_dbi_command(mipi, ST7586_ENABLE_DDRAM, 0x02);
+
+	switch (mipi->rotation) {
+	default:
+		addr_mode = 0x00;
+		break;
+	case 90:
+		addr_mode = ST7586_DISP_CTRL_MY;
+		break;
+	case 180:
+		addr_mode = ST7586_DISP_CTRL_MX | ST7586_DISP_CTRL_MY;
+		break;
+	case 270:
+		addr_mode = ST7586_DISP_CTRL_MX;
+		break;
+	}
+	mipi_dbi_command(mipi, MIPI_DCS_SET_ADDRESS_MODE, addr_mode);
+
+	mipi_dbi_command(mipi, ST7586_SET_DISP_DUTY, 0x7f);
+	mipi_dbi_command(mipi, ST7586_SET_PART_DISP, 0xa0);
+	mipi_dbi_command(mipi, MIPI_DCS_SET_PARTIAL_AREA, 0x00, 0x00, 0x00, 0x77);
+	mipi_dbi_command(mipi, MIPI_DCS_EXIT_INVERT_MODE);
+
+	msleep(100);
+
+	mipi_dbi_command(mipi, MIPI_DCS_SET_DISPLAY_ON);
+
+	mipi->enabled = true;
+
+	if (fb)
+		fb->funcs->dirty(fb, NULL, 0, 0, NULL, 0);
+}
+
+static void st7586_pipe_disable(struct drm_simple_display_pipe *pipe)
+{
+	struct tinydrm_device *tdev = pipe_to_tinydrm(pipe);
+	struct mipi_dbi *mipi = mipi_dbi_from_tinydrm(tdev);
+
+	DRM_DEBUG_KMS("\n");
+
+	if (!mipi->enabled)
+		return;
+
+	mipi_dbi_command(mipi, MIPI_DCS_SET_DISPLAY_OFF);
+	mipi->enabled = false;
+}
+
+static const u32 st7586_formats[] = {
+	DRM_FORMAT_XRGB8888,
+};
+
+static int st7586_init(struct device *dev, struct mipi_dbi *mipi,
+		const struct drm_simple_display_pipe_funcs *pipe_funcs,
+		struct drm_driver *driver, const struct drm_display_mode *mode,
+		unsigned int rotation)
+{
+	size_t bufsize = (mode->vdisplay + 2) / 3 * mode->hdisplay;
+	struct tinydrm_device *tdev = &mipi->tinydrm;
+	int ret;
+
+	mutex_init(&mipi->cmdlock);
+
+	mipi->tx_buf = devm_kmalloc(dev, bufsize, GFP_KERNEL);
+	if (!mipi->tx_buf)
+		return -ENOMEM;
+
+	ret = devm_tinydrm_init(dev, tdev, &st7586_fb_funcs, driver);
+	if (ret)
+		return ret;
+
+	ret = tinydrm_display_pipe_init(tdev, pipe_funcs,
+					DRM_MODE_CONNECTOR_VIRTUAL,
+					st7586_formats,
+					ARRAY_SIZE(st7586_formats),
+					mode, rotation);
+	if (ret)
+		return ret;
+
+	tdev->drm->mode_config.preferred_depth = 32;
+	mipi->rotation = rotation;
+
+	drm_mode_config_reset(tdev->drm);
+
+	DRM_DEBUG_KMS("preferred_depth=%u, rotation = %u\n",
+		      tdev->drm->mode_config.preferred_depth, rotation);
+
+	return 0;
+}
+
+static const struct drm_simple_display_pipe_funcs st7586_pipe_funcs = {
+	.enable		= st7586_pipe_enable,
+	.disable	= st7586_pipe_disable,
+	.update		= tinydrm_display_pipe_update,
+	.prepare_fb	= tinydrm_display_pipe_prepare_fb,
+};
+
+static const struct drm_display_mode st7586_mode = {
+	TINYDRM_MODE(178, 128, 37, 27),
+};
+
+DEFINE_DRM_GEM_CMA_FOPS(st7586_fops);
+
+static struct drm_driver st7586_driver = {
+	.driver_features	= DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME |
+				  DRIVER_ATOMIC,
+	.fops			= &st7586_fops,
+	TINYDRM_GEM_DRIVER_OPS,
+	.lastclose		= tinydrm_lastclose,
+	.debugfs_init		= mipi_dbi_debugfs_init,
+	.name			= "st7586",
+	.desc			= "Sitronix ST7586",
+	.date			= "20170801",
+	.major			= 1,
+	.minor			= 0,
+};
+
+static const struct of_device_id st7586_of_match[] = {
+	{ .compatible = "lego,ev3-lcd" },
+	{},
+};
+MODULE_DEVICE_TABLE(of, st7586_of_match);
+
+static const struct spi_device_id st7586_id[] = {
+	{ "ev3-lcd", 0 },
+	{ },
+};
+MODULE_DEVICE_TABLE(spi, st7586_id);
+
+static int st7586_probe(struct spi_device *spi)
+{
+	struct device *dev = &spi->dev;
+	struct tinydrm_device *tdev;
+	struct mipi_dbi *mipi;
+	struct gpio_desc *a0;
+	u32 rotation = 0;
+	int ret;
+
+	mipi = devm_kzalloc(dev, sizeof(*mipi), GFP_KERNEL);
+	if (!mipi)
+		return -ENOMEM;
+
+	mipi->reset = devm_gpiod_get(dev, "reset", GPIOD_OUT_HIGH);
+	if (IS_ERR(mipi->reset)) {
+		dev_err(dev, "Failed to get gpio 'reset'\n");
+		return PTR_ERR(mipi->reset);
+	}
+
+	a0 = devm_gpiod_get(dev, "a0", GPIOD_OUT_LOW);
+	if (IS_ERR(a0)) {
+		dev_err(dev, "Failed to get gpio 'a0'\n");
+		return PTR_ERR(a0);
+	}
+
+	device_property_read_u32(dev, "rotation", &rotation);
+
+	ret = mipi_dbi_spi_init(spi, mipi, a0);
+	if (ret)
+		return ret;
+
+	/* Cannot read from this controller via SPI */
+	mipi->read_commands = NULL;
+
+	/*
+	 * We are using 8-bit data, so we are not actually swapping anything,
+	 * but setting mipi->swap_bytes makes mipi_dbi_typec3_command() do the
+	 * right thing and not use 16-bit transfers (which results in swapped
+	 * bytes on little-endian systems and causes out of order data to be
+	 * sent to the display).
+	 */
+	mipi->swap_bytes = true;
+
+	ret = st7586_init(&spi->dev, mipi, &st7586_pipe_funcs, &st7586_driver,
+			  &st7586_mode, rotation);
+	if (ret)
+		return ret;
+
+	tdev = &mipi->tinydrm;
+
+	ret = devm_tinydrm_register(tdev);
+	if (ret)
+		return ret;
+
+	spi_set_drvdata(spi, mipi);
+
+	DRM_DEBUG_DRIVER("Initialized %s:%s @%uMHz on minor %d\n",
+			 tdev->drm->driver->name, dev_name(dev),
+			 spi->max_speed_hz / 1000000,
+			 tdev->drm->primary->index);
+
+	return 0;
+}
+
+static void st7586_shutdown(struct spi_device *spi)
+{
+	struct mipi_dbi *mipi = spi_get_drvdata(spi);
+
+	tinydrm_shutdown(&mipi->tinydrm);
+}
+
+static struct spi_driver st7586_spi_driver = {
+	.driver = {
+		.name = "st7586",
+		.owner = THIS_MODULE,
+		.of_match_table = st7586_of_match,
+	},
+	.id_table = st7586_id,
+	.probe = st7586_probe,
+	.shutdown = st7586_shutdown,
+};
+module_spi_driver(st7586_spi_driver);
+
+MODULE_DESCRIPTION("Sitronix ST7586 DRM driver");
+MODULE_AUTHOR("David Lechner <david@lechnology.com>");
+MODULE_LICENSE("GPL");
diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 22b5702..cba11f1 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -70,6 +70,7 @@ static inline int ttm_mem_type_from_place(const struct ttm_place *place,
 static void ttm_mem_type_debug(struct ttm_bo_device *bdev, int mem_type)
 {
 	struct ttm_mem_type_manager *man = &bdev->man[mem_type];
+	struct drm_printer p = drm_debug_printer(TTM_PFX);
 
 	pr_err("    has_type: %d\n", man->has_type);
 	pr_err("    use_type: %d\n", man->use_type);
@@ -79,7 +80,7 @@ static void ttm_mem_type_debug(struct ttm_bo_device *bdev, int mem_type)
 	pr_err("    available_caching: 0x%08X\n", man->available_caching);
 	pr_err("    default_caching: 0x%08X\n", man->default_caching);
 	if (mem_type != TTM_PL_SYSTEM)
-		(*man->func->debug)(man, TTM_PFX);
+		(*man->func->debug)(man, &p);
 }
 
 static void ttm_bo_mem_space_debug(struct ttm_buffer_object *bo,
@@ -394,14 +395,33 @@ static void ttm_bo_cleanup_memtype_use(struct ttm_buffer_object *bo)
 	ww_mutex_unlock (&bo->resv->lock);
 }
 
+static int ttm_bo_individualize_resv(struct ttm_buffer_object *bo)
+{
+	int r;
+
+	if (bo->resv == &bo->ttm_resv)
+		return 0;
+
+	reservation_object_init(&bo->ttm_resv);
+	BUG_ON(!reservation_object_trylock(&bo->ttm_resv));
+
+	r = reservation_object_copy_fences(&bo->ttm_resv, bo->resv);
+	if (r) {
+		reservation_object_unlock(&bo->ttm_resv);
+		reservation_object_fini(&bo->ttm_resv);
+	}
+
+	return r;
+}
+
 static void ttm_bo_flush_all_fences(struct ttm_buffer_object *bo)
 {
 	struct reservation_object_list *fobj;
 	struct dma_fence *fence;
 	int i;
 
-	fobj = reservation_object_get_list(bo->resv);
-	fence = reservation_object_get_excl(bo->resv);
+	fobj = reservation_object_get_list(&bo->ttm_resv);
+	fence = reservation_object_get_excl(&bo->ttm_resv);
 	if (fence && !fence->ops->signaled)
 		dma_fence_enable_sw_signaling(fence);
 
@@ -430,8 +450,19 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
 			ttm_bo_cleanup_memtype_use(bo);
 
 			return;
-		} else
-			ttm_bo_flush_all_fences(bo);
+		}
+
+		ret = ttm_bo_individualize_resv(bo);
+		if (ret) {
+			/* Last resort: if we fail to allocate memory for the
+			 * fences, block for the BO to become idle and free it.
+			 */
+			spin_unlock(&glob->lru_lock);
+			ttm_bo_wait(bo, true, true);
+			ttm_bo_cleanup_memtype_use(bo);
+			return;
+		}
+		ttm_bo_flush_all_fences(bo);
 
 		/*
 		 * Make NO_EVICT bos immediately available to
@@ -443,6 +474,8 @@ static void ttm_bo_cleanup_refs_or_queue(struct ttm_buffer_object *bo)
 			ttm_bo_add_to_lru(bo);
 		}
 
+		if (bo->resv != &bo->ttm_resv)
+			reservation_object_unlock(&bo->ttm_resv);
 		__ttm_bo_unreserve(bo);
 	}
 
@@ -471,17 +504,25 @@ static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo,
 					  bool no_wait_gpu)
 {
 	struct ttm_bo_global *glob = bo->glob;
+	struct reservation_object *resv;
 	int ret;
 
-	ret = ttm_bo_wait(bo, false, true);
+	if (unlikely(list_empty(&bo->ddestroy)))
+		resv = bo->resv;
+	else
+		resv = &bo->ttm_resv;
+
+	if (reservation_object_test_signaled_rcu(resv, true))
+		ret = 0;
+	else
+		ret = -EBUSY;
 
 	if (ret && !no_wait_gpu) {
 		long lret;
 		ww_mutex_unlock(&bo->resv->lock);
 		spin_unlock(&glob->lru_lock);
 
-		lret = reservation_object_wait_timeout_rcu(bo->resv,
-							   true,
+		lret = reservation_object_wait_timeout_rcu(resv, true,
 							   interruptible,
 							   30 * HZ);
 
@@ -505,13 +546,6 @@ static int ttm_bo_cleanup_refs_and_unlock(struct ttm_buffer_object *bo,
 			spin_unlock(&glob->lru_lock);
 			return 0;
 		}
-
-		/*
-		 * remove sync_obj with ttm_bo_wait, the wait should be
-		 * finished, and no new wait object should have been added.
-		 */
-		ret = ttm_bo_wait(bo, false, true);
-		WARN_ON(ret);
 	}
 
 	if (ret || unlikely(list_empty(&bo->ddestroy))) {
diff --git a/drivers/gpu/drm/ttm/ttm_bo_manager.c b/drivers/gpu/drm/ttm/ttm_bo_manager.c
index 90a6c0b..a7c232d 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_manager.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_manager.c
@@ -136,13 +136,12 @@ static int ttm_bo_man_takedown(struct ttm_mem_type_manager *man)
 }
 
 static void ttm_bo_man_debug(struct ttm_mem_type_manager *man,
-			     const char *prefix)
+			     struct drm_printer *printer)
 {
 	struct ttm_range_manager *rman = (struct ttm_range_manager *) man->priv;
-	struct drm_printer p = drm_debug_printer(prefix);
 
 	spin_lock(&rman->lock);
-	drm_mm_print(&rman->mm, &p);
+	drm_mm_print(&rman->mm, printer);
 	spin_unlock(&rman->lock);
 }
 
diff --git a/drivers/gpu/drm/ttm/ttm_bo_vm.c b/drivers/gpu/drm/ttm/ttm_bo_vm.c
index b442d12..a01e5c9 100644
--- a/drivers/gpu/drm/ttm/ttm_bo_vm.c
+++ b/drivers/gpu/drm/ttm/ttm_bo_vm.c
@@ -294,10 +294,87 @@ static void ttm_bo_vm_close(struct vm_area_struct *vma)
 	vma->vm_private_data = NULL;
 }
 
+static int ttm_bo_vm_access_kmap(struct ttm_buffer_object *bo,
+				 unsigned long offset,
+				 void *buf, int len, int write)
+{
+	unsigned long page = offset >> PAGE_SHIFT;
+	unsigned long bytes_left = len;
+	int ret;
+
+	/* Copy a page at a time so that no extra virtual address
+	 * mapping is needed.
+	 */
+	offset -= page << PAGE_SHIFT;
+	do {
+		unsigned long bytes = min(bytes_left, PAGE_SIZE - offset);
+		struct ttm_bo_kmap_obj map;
+		void *ptr;
+		bool is_iomem;
+
+		ret = ttm_bo_kmap(bo, page, 1, &map);
+		if (ret)
+			return ret;
+
+		ptr = (uint8_t *)ttm_kmap_obj_virtual(&map, &is_iomem) + offset;
+		WARN_ON_ONCE(is_iomem);
+		if (write)
+			memcpy(ptr, buf, bytes);
+		else
+			memcpy(buf, ptr, bytes);
+		ttm_bo_kunmap(&map);
+
+		page++;
+		bytes_left -= bytes;
+		offset = 0;
+	} while (bytes_left);
+
+	return len;
+}
+
+static int ttm_bo_vm_access(struct vm_area_struct *vma, unsigned long addr,
+			    void *buf, int len, int write)
+{
+	unsigned long offset = (addr) - vma->vm_start;
+	struct ttm_buffer_object *bo = vma->vm_private_data;
+	int ret;
+
+	if (len < 1 || (offset + len) >> PAGE_SHIFT > bo->num_pages)
+		return -EIO;
+
+	ret = ttm_bo_reserve(bo, true, false, NULL);
+	if (ret)
+		return ret;
+
+	switch (bo->mem.mem_type) {
+	case TTM_PL_SYSTEM:
+		if (unlikely(bo->ttm->page_flags & TTM_PAGE_FLAG_SWAPPED)) {
+			ret = ttm_tt_swapin(bo->ttm);
+			if (unlikely(ret != 0))
+				break;
+		}
+		/* fall through */
+	case TTM_PL_TT:
+		ret = ttm_bo_vm_access_kmap(bo, offset, buf, len, write);
+		break;
+	default:
+		if (bo->bdev->driver->access_memory)
+			ret = bo->bdev->driver->access_memory(
+				bo, offset, buf, len, write);
+		else
+			ret = -EIO;
+	}
+
+	ttm_bo_unreserve(bo);
+
+	return ret;
+}
+
 static const struct vm_operations_struct ttm_bo_vm_ops = {
 	.fault = ttm_bo_vm_fault,
 	.open = ttm_bo_vm_open,
-	.close = ttm_bo_vm_close
+	.close = ttm_bo_vm_close,
+	.access = ttm_bo_vm_access
 };
 
 static struct ttm_buffer_object *ttm_bo_vm_lookup(struct ttm_bo_device *bdev,
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index eeddc1e..8715998 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -615,7 +615,7 @@ static void ttm_page_pool_fill_locked(struct ttm_page_pool *pool,
 		} else {
 			pr_err("Failed to fill pool (%p)\n", pool);
 			/* If we have any pages left put them to the pool. */
-			list_for_each_entry(p, &pool->list, lru) {
+			list_for_each_entry(p, &new_pages, lru) {
 				++cpages;
 			}
 			list_splice(&new_pages, &pool->list);
diff --git a/drivers/gpu/drm/udl/udl_connector.c b/drivers/gpu/drm/udl/udl_connector.c
index d2f57c5..9f9a497 100644
--- a/drivers/gpu/drm/udl/udl_connector.c
+++ b/drivers/gpu/drm/udl/udl_connector.c
@@ -96,7 +96,7 @@ static int udl_mode_valid(struct drm_connector *connector,
 static enum drm_connector_status
 udl_detect(struct drm_connector *connector, bool force)
 {
-	if (drm_device_is_unplugged(connector->dev))
+	if (drm_dev_is_unplugged(connector->dev))
 		return connector_status_disconnected;
 	return connector_status_connected;
 }
diff --git a/drivers/gpu/drm/udl/udl_dmabuf.c b/drivers/gpu/drm/udl/udl_dmabuf.c
index 2e031a89..2867ed1 100644
--- a/drivers/gpu/drm/udl/udl_dmabuf.c
+++ b/drivers/gpu/drm/udl/udl_dmabuf.c
@@ -186,7 +186,7 @@ static int udl_dmabuf_mmap(struct dma_buf *dma_buf,
 	return -EINVAL;
 }
 
-static struct dma_buf_ops udl_dmabuf_ops = {
+static const struct dma_buf_ops udl_dmabuf_ops = {
 	.attach			= udl_attach_dma_buf,
 	.detach			= udl_detach_dma_buf,
 	.map_dma_buf		= udl_map_dma_buf,
diff --git a/drivers/gpu/drm/udl/udl_drv.c b/drivers/gpu/drm/udl/udl_drv.c
index cd8b017..31421b6 100644
--- a/drivers/gpu/drm/udl/udl_drv.c
+++ b/drivers/gpu/drm/udl/udl_drv.c
@@ -11,11 +11,6 @@
 #include <drm/drm_crtc_helper.h>
 #include "udl_drv.h"
 
-static int udl_driver_set_busid(struct drm_device *d, struct drm_master *m)
-{
-	return 0;
-}
-
 static int udl_usb_suspend(struct usb_interface *interface,
 			   pm_message_t message)
 {
@@ -52,7 +47,6 @@ static struct drm_driver driver = {
 	.driver_features = DRIVER_MODESET | DRIVER_GEM | DRIVER_PRIME,
 	.load = udl_driver_load,
 	.unload = udl_driver_unload,
-	.set_busid = udl_driver_set_busid,
 
 	/* gem hooks */
 	.gem_free_object = udl_gem_free_object,
@@ -60,7 +54,6 @@ static struct drm_driver driver = {
 
 	.dumb_create = udl_dumb_create,
 	.dumb_map_offset = udl_gem_mmap,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.fops = &udl_driver_fops,
 
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
@@ -108,7 +101,7 @@ static void udl_usb_disconnect(struct usb_interface *interface)
 	drm_kms_helper_poll_disable(dev);
 	udl_fbdev_unplug(dev);
 	udl_drop_usb(dev);
-	drm_unplug_dev(dev);
+	drm_dev_unplug(dev);
 }
 
 /*
@@ -118,7 +111,7 @@ static void udl_usb_disconnect(struct usb_interface *interface)
  * which is compatible with all known USB 2.0 era graphics chips and firmware,
  * but allows DisplayLink to increment those for any future incompatible chips
  */
-static struct usb_device_id id_table[] = {
+static const struct usb_device_id id_table[] = {
 	{.idVendor = 0x17e9, .bInterfaceClass = 0xff,
 	 .bInterfaceSubClass = 0x00,
 	 .bInterfaceProtocol = 0x00,
diff --git a/drivers/gpu/drm/udl/udl_fb.c b/drivers/gpu/drm/udl/udl_fb.c
index 4a65003..b7ca90d 100644
--- a/drivers/gpu/drm/udl/udl_fb.c
+++ b/drivers/gpu/drm/udl/udl_fb.c
@@ -198,7 +198,7 @@ static int udl_fb_open(struct fb_info *info, int user)
 	struct udl_device *udl = dev->dev_private;
 
 	/* If the USB device is gone, we don't accept new opens */
-	if (drm_device_is_unplugged(udl->ddev))
+	if (drm_dev_is_unplugged(udl->ddev))
 		return -ENODEV;
 
 	ufbdev->fb_count++;
@@ -309,7 +309,7 @@ static void udl_user_framebuffer_destroy(struct drm_framebuffer *fb)
 	struct udl_framebuffer *ufb = to_udl_fb(fb);
 
 	if (ufb->obj)
-		drm_gem_object_unreference_unlocked(&ufb->obj->base);
+		drm_gem_object_put_unlocked(&ufb->obj->base);
 
 	drm_framebuffer_cleanup(fb);
 	kfree(ufb);
@@ -393,7 +393,6 @@ static int udlfb_create(struct drm_fb_helper *helper,
 	info->fix.smem_len = size;
 	info->fix.smem_start = (unsigned long)ufbdev->ufb.obj->vmapping;
 
-	info->flags = FBINFO_DEFAULT | FBINFO_CAN_FORCE_OUTPUT;
 	info->fbops = &udlfb_ops;
 	drm_fb_helper_fill_fix(info, fb->pitches[0], fb->format->depth);
 	drm_fb_helper_fill_var(info, &ufbdev->helper, sizes->fb_width, sizes->fb_height);
@@ -404,7 +403,7 @@ static int udlfb_create(struct drm_fb_helper *helper,
 
 	return ret;
 out_gfree:
-	drm_gem_object_unreference_unlocked(&ufbdev->ufb.obj->base);
+	drm_gem_object_put_unlocked(&ufbdev->ufb.obj->base);
 out:
 	return ret;
 }
@@ -420,7 +419,7 @@ static void udl_fbdev_destroy(struct drm_device *dev,
 	drm_fb_helper_fini(&ufbdev->helper);
 	drm_framebuffer_unregister_private(&ufbdev->ufb.base);
 	drm_framebuffer_cleanup(&ufbdev->ufb.base);
-	drm_gem_object_unreference_unlocked(&ufbdev->ufb.obj->base);
+	drm_gem_object_put_unlocked(&ufbdev->ufb.obj->base);
 }
 
 int udl_fbdev_init(struct drm_device *dev)
diff --git a/drivers/gpu/drm/udl/udl_gem.c b/drivers/gpu/drm/udl/udl_gem.c
index db9cece..dee6bd9 100644
--- a/drivers/gpu/drm/udl/udl_gem.c
+++ b/drivers/gpu/drm/udl/udl_gem.c
@@ -52,7 +52,7 @@ udl_gem_create(struct drm_file *file,
 		return ret;
 	}
 
-	drm_gem_object_unreference_unlocked(&obj->base);
+	drm_gem_object_put_unlocked(&obj->base);
 	*handle_p = handle;
 	return 0;
 }
@@ -234,7 +234,7 @@ int udl_gem_mmap(struct drm_file *file, struct drm_device *dev,
 	*offset = drm_vma_node_offset_addr(&gobj->base.vma_node);
 
 out:
-	drm_gem_object_unreference(&gobj->base);
+	drm_gem_object_put(&gobj->base);
 unlock:
 	mutex_unlock(&dev->struct_mutex);
 	return ret;
diff --git a/drivers/gpu/drm/udl/udl_main.c b/drivers/gpu/drm/udl/udl_main.c
index a9d93b8..0328b2c 100644
--- a/drivers/gpu/drm/udl/udl_main.c
+++ b/drivers/gpu/drm/udl/udl_main.c
@@ -371,8 +371,6 @@ void udl_driver_unload(struct drm_device *dev)
 {
 	struct udl_device *udl = dev->dev_private;
 
-	drm_vblank_cleanup(dev);
-
 	if (udl->urbs.count)
 		udl_free_urb_list(dev);
 
diff --git a/drivers/gpu/drm/vc4/Kconfig b/drivers/gpu/drm/vc4/Kconfig
index 4361bdc..fdae18a 100644
--- a/drivers/gpu/drm/vc4/Kconfig
+++ b/drivers/gpu/drm/vc4/Kconfig
@@ -19,3 +19,11 @@
 	  This driver requires that "avoid_warnings=2" be present in
 	  the config.txt for the firmware, to keep it from smashing
 	  our display setup.
+
+config DRM_VC4_HDMI_CEC
+       bool "Broadcom VC4 HDMI CEC Support"
+       depends on DRM_VC4
+       select CEC_CORE
+       help
+	  Choose this option if you have a Broadcom VC4 GPU
+	  and want to use CEC.
diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c
index 487f964..3afdbf4 100644
--- a/drivers/gpu/drm/vc4/vc4_bo.c
+++ b/drivers/gpu/drm/vc4/vc4_bo.c
@@ -24,21 +24,35 @@
 #include "vc4_drv.h"
 #include "uapi/drm/vc4_drm.h"
 
+static const char * const bo_type_names[] = {
+	"kernel",
+	"V3D",
+	"V3D shader",
+	"dumb",
+	"binner",
+	"RCL",
+	"BCL",
+	"kernel BO cache",
+};
+
+static bool is_user_label(int label)
+{
+	return label >= VC4_BO_TYPE_COUNT;
+}
+
 static void vc4_bo_stats_dump(struct vc4_dev *vc4)
 {
-	DRM_INFO("num bos allocated: %d\n",
-		 vc4->bo_stats.num_allocated);
-	DRM_INFO("size bos allocated: %dkb\n",
-		 vc4->bo_stats.size_allocated / 1024);
-	DRM_INFO("num bos used: %d\n",
-		 vc4->bo_stats.num_allocated - vc4->bo_stats.num_cached);
-	DRM_INFO("size bos used: %dkb\n",
-		 (vc4->bo_stats.size_allocated -
-		  vc4->bo_stats.size_cached) / 1024);
-	DRM_INFO("num bos cached: %d\n",
-		 vc4->bo_stats.num_cached);
-	DRM_INFO("size bos cached: %dkb\n",
-		 vc4->bo_stats.size_cached / 1024);
+	int i;
+
+	for (i = 0; i < vc4->num_labels; i++) {
+		if (!vc4->bo_labels[i].num_allocated)
+			continue;
+
+		DRM_INFO("%30s: %6dkb BOs (%d)\n",
+			 vc4->bo_labels[i].name,
+			 vc4->bo_labels[i].size_allocated / 1024,
+			 vc4->bo_labels[i].num_allocated);
+	}
 }
 
 #ifdef CONFIG_DEBUG_FS
@@ -47,64 +61,133 @@ int vc4_bo_stats_debugfs(struct seq_file *m, void *unused)
 	struct drm_info_node *node = (struct drm_info_node *)m->private;
 	struct drm_device *dev = node->minor->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
-	struct vc4_bo_stats stats;
+	int i;
 
-	/* Take a snapshot of the current stats with the lock held. */
 	mutex_lock(&vc4->bo_lock);
-	stats = vc4->bo_stats;
-	mutex_unlock(&vc4->bo_lock);
+	for (i = 0; i < vc4->num_labels; i++) {
+		if (!vc4->bo_labels[i].num_allocated)
+			continue;
 
-	seq_printf(m, "num bos allocated: %d\n",
-		   stats.num_allocated);
-	seq_printf(m, "size bos allocated: %dkb\n",
-		   stats.size_allocated / 1024);
-	seq_printf(m, "num bos used: %d\n",
-		   stats.num_allocated - stats.num_cached);
-	seq_printf(m, "size bos used: %dkb\n",
-		   (stats.size_allocated - stats.size_cached) / 1024);
-	seq_printf(m, "num bos cached: %d\n",
-		   stats.num_cached);
-	seq_printf(m, "size bos cached: %dkb\n",
-		   stats.size_cached / 1024);
+		seq_printf(m, "%30s: %6dkb BOs (%d)\n",
+			   vc4->bo_labels[i].name,
+			   vc4->bo_labels[i].size_allocated / 1024,
+			   vc4->bo_labels[i].num_allocated);
+	}
+	mutex_unlock(&vc4->bo_lock);
 
 	return 0;
 }
 #endif
 
+/* Takes ownership of *name and returns the appropriate slot for it in
+ * the bo_labels[] array, extending it as necessary.
+ *
+ * This is inefficient and could use a hash table instead of walking
+ * an array and strcmp()ing.  However, the assumption is that user
+ * labeling will be infrequent (scanout buffers and other long-lived
+ * objects, or debug driver builds), so we can live with it for now.
+ */
+static int vc4_get_user_label(struct vc4_dev *vc4, const char *name)
+{
+	int i;
+	int free_slot = -1;
+
+	for (i = 0; i < vc4->num_labels; i++) {
+		if (!vc4->bo_labels[i].name) {
+			free_slot = i;
+		} else if (strcmp(vc4->bo_labels[i].name, name) == 0) {
+			kfree(name);
+			return i;
+		}
+	}
+
+	if (free_slot != -1) {
+		WARN_ON(vc4->bo_labels[free_slot].num_allocated != 0);
+		vc4->bo_labels[free_slot].name = name;
+		return free_slot;
+	} else {
+		u32 new_label_count = vc4->num_labels + 1;
+		struct vc4_label *new_labels =
+			krealloc(vc4->bo_labels,
+				 new_label_count * sizeof(*new_labels),
+				 GFP_KERNEL);
+
+		if (!new_labels) {
+			kfree(name);
+			return -1;
+		}
+
+		free_slot = vc4->num_labels;
+		vc4->bo_labels = new_labels;
+		vc4->num_labels = new_label_count;
+
+		vc4->bo_labels[free_slot].name = name;
+		vc4->bo_labels[free_slot].num_allocated = 0;
+		vc4->bo_labels[free_slot].size_allocated = 0;
+
+		return free_slot;
+	}
+}
+
+static void vc4_bo_set_label(struct drm_gem_object *gem_obj, int label)
+{
+	struct vc4_bo *bo = to_vc4_bo(gem_obj);
+	struct vc4_dev *vc4 = to_vc4_dev(gem_obj->dev);
+
+	lockdep_assert_held(&vc4->bo_lock);
+
+	if (label != -1) {
+		vc4->bo_labels[label].num_allocated++;
+		vc4->bo_labels[label].size_allocated += gem_obj->size;
+	}
+
+	vc4->bo_labels[bo->label].num_allocated--;
+	vc4->bo_labels[bo->label].size_allocated -= gem_obj->size;
+
+	if (vc4->bo_labels[bo->label].num_allocated == 0 &&
+	    is_user_label(bo->label)) {
+		/* Free user BO label slots on last unreference.
+		 * Slots are just where we track the stats for a given
+		 * name, and once a name is unused we can reuse that
+		 * slot.
+		 */
+		kfree(vc4->bo_labels[bo->label].name);
+		vc4->bo_labels[bo->label].name = NULL;
+	}
+
+	bo->label = label;
+}
+
 static uint32_t bo_page_index(size_t size)
 {
 	return (size / PAGE_SIZE) - 1;
 }
 
-/* Must be called with bo_lock held. */
 static void vc4_bo_destroy(struct vc4_bo *bo)
 {
 	struct drm_gem_object *obj = &bo->base.base;
 	struct vc4_dev *vc4 = to_vc4_dev(obj->dev);
 
+	lockdep_assert_held(&vc4->bo_lock);
+
+	vc4_bo_set_label(obj, -1);
+
 	if (bo->validated_shader) {
 		kfree(bo->validated_shader->texture_samples);
 		kfree(bo->validated_shader);
 		bo->validated_shader = NULL;
 	}
 
-	vc4->bo_stats.num_allocated--;
-	vc4->bo_stats.size_allocated -= obj->size;
-
 	reservation_object_fini(&bo->_resv);
 
 	drm_gem_cma_free_object(obj);
 }
 
-/* Must be called with bo_lock held. */
 static void vc4_bo_remove_from_cache(struct vc4_bo *bo)
 {
-	struct drm_gem_object *obj = &bo->base.base;
-	struct vc4_dev *vc4 = to_vc4_dev(obj->dev);
+	struct vc4_dev *vc4 = to_vc4_dev(bo->base.base.dev);
 
-	vc4->bo_stats.num_cached--;
-	vc4->bo_stats.size_cached -= obj->size;
-
+	lockdep_assert_held(&vc4->bo_lock);
 	list_del(&bo->unref_head);
 	list_del(&bo->size_head);
 }
@@ -165,7 +248,8 @@ static void vc4_bo_cache_purge(struct drm_device *dev)
 }
 
 static struct vc4_bo *vc4_bo_get_from_cache(struct drm_device *dev,
-					    uint32_t size)
+					    uint32_t size,
+					    enum vc4_kernel_bo_type type)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	uint32_t page_index = bo_page_index(size);
@@ -186,6 +270,8 @@ static struct vc4_bo *vc4_bo_get_from_cache(struct drm_device *dev,
 	kref_init(&bo->base.base.refcount);
 
 out:
+	if (bo)
+		vc4_bo_set_label(&bo->base.base, type);
 	mutex_unlock(&vc4->bo_lock);
 	return bo;
 }
@@ -208,8 +294,9 @@ struct drm_gem_object *vc4_create_object(struct drm_device *dev, size_t size)
 		return ERR_PTR(-ENOMEM);
 
 	mutex_lock(&vc4->bo_lock);
-	vc4->bo_stats.num_allocated++;
-	vc4->bo_stats.size_allocated += size;
+	bo->label = VC4_BO_TYPE_KERNEL;
+	vc4->bo_labels[VC4_BO_TYPE_KERNEL].num_allocated++;
+	vc4->bo_labels[VC4_BO_TYPE_KERNEL].size_allocated += size;
 	mutex_unlock(&vc4->bo_lock);
 	bo->resv = &bo->_resv;
 	reservation_object_init(bo->resv);
@@ -218,7 +305,7 @@ struct drm_gem_object *vc4_create_object(struct drm_device *dev, size_t size)
 }
 
 struct vc4_bo *vc4_bo_create(struct drm_device *dev, size_t unaligned_size,
-			     bool allow_unzeroed)
+			     bool allow_unzeroed, enum vc4_kernel_bo_type type)
 {
 	size_t size = roundup(unaligned_size, PAGE_SIZE);
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
@@ -229,7 +316,7 @@ struct vc4_bo *vc4_bo_create(struct drm_device *dev, size_t unaligned_size,
 		return ERR_PTR(-EINVAL);
 
 	/* First, try to get a vc4_bo from the kernel BO cache. */
-	bo = vc4_bo_get_from_cache(dev, size);
+	bo = vc4_bo_get_from_cache(dev, size, type);
 	if (bo) {
 		if (!allow_unzeroed)
 			memset(bo->base.vaddr, 0, bo->base.base.size);
@@ -251,7 +338,13 @@ struct vc4_bo *vc4_bo_create(struct drm_device *dev, size_t unaligned_size,
 			return ERR_PTR(-ENOMEM);
 		}
 	}
-	return to_vc4_bo(&cma_obj->base);
+	bo = to_vc4_bo(&cma_obj->base);
+
+	mutex_lock(&vc4->bo_lock);
+	vc4_bo_set_label(&cma_obj->base, type);
+	mutex_unlock(&vc4->bo_lock);
+
+	return bo;
 }
 
 int vc4_dumb_create(struct drm_file *file_priv,
@@ -268,22 +361,23 @@ int vc4_dumb_create(struct drm_file *file_priv,
 	if (args->size < args->pitch * args->height)
 		args->size = args->pitch * args->height;
 
-	bo = vc4_bo_create(dev, args->size, false);
+	bo = vc4_bo_create(dev, args->size, false, VC4_BO_TYPE_DUMB);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
 
 	ret = drm_gem_handle_create(file_priv, &bo->base.base, &args->handle);
-	drm_gem_object_unreference_unlocked(&bo->base.base);
+	drm_gem_object_put_unlocked(&bo->base.base);
 
 	return ret;
 }
 
-/* Must be called with bo_lock held. */
 static void vc4_bo_cache_free_old(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	unsigned long expire_time = jiffies - msecs_to_jiffies(1000);
 
+	lockdep_assert_held(&vc4->bo_lock);
+
 	while (!list_empty(&vc4->bo_cache.time_list)) {
 		struct vc4_bo *bo = list_last_entry(&vc4->bo_cache.time_list,
 						    struct vc4_bo, unref_head);
@@ -348,8 +442,7 @@ void vc4_free_object(struct drm_gem_object *gem_bo)
 	list_add(&bo->size_head, cache_list);
 	list_add(&bo->unref_head, &vc4->bo_cache.time_list);
 
-	vc4->bo_stats.num_cached++;
-	vc4->bo_stats.size_cached += gem_bo->size;
+	vc4_bo_set_label(&bo->base.base, VC4_BO_TYPE_KERNEL_CACHE);
 
 	vc4_bo_cache_free_old(dev);
 
@@ -389,7 +482,7 @@ vc4_prime_export(struct drm_device *dev, struct drm_gem_object *obj, int flags)
 	struct vc4_bo *bo = to_vc4_bo(obj);
 
 	if (bo->validated_shader) {
-		DRM_ERROR("Attempting to export shader BO\n");
+		DRM_DEBUG("Attempting to export shader BO\n");
 		return ERR_PTR(-EINVAL);
 	}
 
@@ -410,7 +503,7 @@ int vc4_mmap(struct file *filp, struct vm_area_struct *vma)
 	bo = to_vc4_bo(gem_obj);
 
 	if (bo->validated_shader && (vma->vm_flags & VM_WRITE)) {
-		DRM_ERROR("mmaping of shader BOs for writing not allowed.\n");
+		DRM_DEBUG("mmaping of shader BOs for writing not allowed.\n");
 		return -EINVAL;
 	}
 
@@ -435,7 +528,7 @@ int vc4_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma)
 	struct vc4_bo *bo = to_vc4_bo(obj);
 
 	if (bo->validated_shader && (vma->vm_flags & VM_WRITE)) {
-		DRM_ERROR("mmaping of shader BOs for writing not allowed.\n");
+		DRM_DEBUG("mmaping of shader BOs for writing not allowed.\n");
 		return -EINVAL;
 	}
 
@@ -447,7 +540,7 @@ void *vc4_prime_vmap(struct drm_gem_object *obj)
 	struct vc4_bo *bo = to_vc4_bo(obj);
 
 	if (bo->validated_shader) {
-		DRM_ERROR("mmaping of shader BOs not allowed.\n");
+		DRM_DEBUG("mmaping of shader BOs not allowed.\n");
 		return ERR_PTR(-EINVAL);
 	}
 
@@ -483,12 +576,12 @@ int vc4_create_bo_ioctl(struct drm_device *dev, void *data,
 	 * We can't allocate from the BO cache, because the BOs don't
 	 * get zeroed, and that might leak data between users.
 	 */
-	bo = vc4_bo_create(dev, args->size, false);
+	bo = vc4_bo_create(dev, args->size, false, VC4_BO_TYPE_V3D);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
 
 	ret = drm_gem_handle_create(file_priv, &bo->base.base, &args->handle);
-	drm_gem_object_unreference_unlocked(&bo->base.base);
+	drm_gem_object_put_unlocked(&bo->base.base);
 
 	return ret;
 }
@@ -501,14 +594,14 @@ int vc4_mmap_bo_ioctl(struct drm_device *dev, void *data,
 
 	gem_obj = drm_gem_object_lookup(file_priv, args->handle);
 	if (!gem_obj) {
-		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		DRM_DEBUG("Failed to look up GEM BO %d\n", args->handle);
 		return -EINVAL;
 	}
 
 	/* The mmap offset was set up at BO allocation time. */
 	args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);
 
-	drm_gem_object_unreference_unlocked(gem_obj);
+	drm_gem_object_put_unlocked(gem_obj);
 	return 0;
 }
 
@@ -536,7 +629,7 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
 		return -EINVAL;
 	}
 
-	bo = vc4_bo_create(dev, args->size, true);
+	bo = vc4_bo_create(dev, args->size, true, VC4_BO_TYPE_V3D_SHADER);
 	if (IS_ERR(bo))
 		return PTR_ERR(bo);
 
@@ -564,7 +657,7 @@ vc4_create_shader_bo_ioctl(struct drm_device *dev, void *data,
 	ret = drm_gem_handle_create(file_priv, &bo->base.base, &args->handle);
 
  fail:
-	drm_gem_object_unreference_unlocked(&bo->base.base);
+	drm_gem_object_put_unlocked(&bo->base.base);
 
 	return ret;
 }
@@ -605,13 +698,13 @@ int vc4_set_tiling_ioctl(struct drm_device *dev, void *data,
 
 	gem_obj = drm_gem_object_lookup(file_priv, args->handle);
 	if (!gem_obj) {
-		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		DRM_DEBUG("Failed to look up GEM BO %d\n", args->handle);
 		return -ENOENT;
 	}
 	bo = to_vc4_bo(gem_obj);
 	bo->t_format = t_format;
 
-	drm_gem_object_unreference_unlocked(gem_obj);
+	drm_gem_object_put_unlocked(gem_obj);
 
 	return 0;
 }
@@ -636,7 +729,7 @@ int vc4_get_tiling_ioctl(struct drm_device *dev, void *data,
 
 	gem_obj = drm_gem_object_lookup(file_priv, args->handle);
 	if (!gem_obj) {
-		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		DRM_DEBUG("Failed to look up GEM BO %d\n", args->handle);
 		return -ENOENT;
 	}
 	bo = to_vc4_bo(gem_obj);
@@ -646,14 +739,29 @@ int vc4_get_tiling_ioctl(struct drm_device *dev, void *data,
 	else
 		args->modifier = DRM_FORMAT_MOD_NONE;
 
-	drm_gem_object_unreference_unlocked(gem_obj);
+	drm_gem_object_put_unlocked(gem_obj);
 
 	return 0;
 }
 
-void vc4_bo_cache_init(struct drm_device *dev)
+int vc4_bo_cache_init(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	int i;
+
+	/* Create the initial set of BO labels that the kernel will
+	 * use.  This lets us avoid a bunch of string reallocation in
+	 * the kernel's draw and BO allocation paths.
+	 */
+	vc4->bo_labels = kcalloc(VC4_BO_TYPE_COUNT, sizeof(*vc4->bo_labels),
+				 GFP_KERNEL);
+	if (!vc4->bo_labels)
+		return -ENOMEM;
+	vc4->num_labels = VC4_BO_TYPE_COUNT;
+
+	BUILD_BUG_ON(ARRAY_SIZE(bo_type_names) != VC4_BO_TYPE_COUNT);
+	for (i = 0; i < VC4_BO_TYPE_COUNT; i++)
+		vc4->bo_labels[i].name = bo_type_names[i];
 
 	mutex_init(&vc4->bo_lock);
 
@@ -663,19 +771,66 @@ void vc4_bo_cache_init(struct drm_device *dev)
 	setup_timer(&vc4->bo_cache.time_timer,
 		    vc4_bo_cache_time_timer,
 		    (unsigned long)dev);
+
+	return 0;
 }
 
 void vc4_bo_cache_destroy(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	int i;
 
 	del_timer(&vc4->bo_cache.time_timer);
 	cancel_work_sync(&vc4->bo_cache.time_work);
 
 	vc4_bo_cache_purge(dev);
 
-	if (vc4->bo_stats.num_allocated) {
-		DRM_ERROR("Destroying BO cache while BOs still allocated:\n");
-		vc4_bo_stats_dump(vc4);
+	for (i = 0; i < vc4->num_labels; i++) {
+		if (vc4->bo_labels[i].num_allocated) {
+			DRM_ERROR("Destroying BO cache with %d %s "
+				  "BOs still allocated\n",
+				  vc4->bo_labels[i].num_allocated,
+				  vc4->bo_labels[i].name);
+		}
+
+		if (is_user_label(i))
+			kfree(vc4->bo_labels[i].name);
 	}
+	kfree(vc4->bo_labels);
+}
+
+int vc4_label_bo_ioctl(struct drm_device *dev, void *data,
+		       struct drm_file *file_priv)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+	struct drm_vc4_label_bo *args = data;
+	char *name;
+	struct drm_gem_object *gem_obj;
+	int ret = 0, label;
+
+	if (!args->len)
+		return -EINVAL;
+
+	name = strndup_user(u64_to_user_ptr(args->name), args->len + 1);
+	if (IS_ERR(name))
+		return PTR_ERR(name);
+
+	gem_obj = drm_gem_object_lookup(file_priv, args->handle);
+	if (!gem_obj) {
+		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		kfree(name);
+		return -ENOENT;
+	}
+
+	mutex_lock(&vc4->bo_lock);
+	label = vc4_get_user_label(vc4, name);
+	if (label != -1)
+		vc4_bo_set_label(gem_obj, label);
+	else
+		ret = -ENOMEM;
+	mutex_unlock(&vc4->bo_lock);
+
+	drm_gem_object_put_unlocked(gem_obj);
+
+	return ret;
 }
diff --git a/drivers/gpu/drm/vc4/vc4_crtc.c b/drivers/gpu/drm/vc4/vc4_crtc.c
index a12cc7e..ce1e3b9 100644
--- a/drivers/gpu/drm/vc4/vc4_crtc.c
+++ b/drivers/gpu/drm/vc4/vc4_crtc.c
@@ -479,7 +479,8 @@ static void require_hvs_enabled(struct drm_device *dev)
 		     SCALER_DISPCTRL_ENABLE);
 }
 
-static void vc4_crtc_disable(struct drm_crtc *crtc)
+static void vc4_crtc_atomic_disable(struct drm_crtc *crtc,
+				    struct drm_crtc_state *old_state)
 {
 	struct drm_device *dev = crtc->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
@@ -518,6 +519,19 @@ static void vc4_crtc_disable(struct drm_crtc *crtc)
 	WARN_ON_ONCE((HVS_READ(SCALER_DISPSTATX(chan)) &
 		      (SCALER_DISPSTATX_FULL | SCALER_DISPSTATX_EMPTY)) !=
 		     SCALER_DISPSTATX_EMPTY);
+
+	/*
+	 * Make sure we issue a vblank event after disabling the CRTC if
+	 * someone was waiting for it.
+	 */
+	if (crtc->state->event) {
+		unsigned long flags;
+
+		spin_lock_irqsave(&dev->event_lock, flags);
+		drm_crtc_send_vblank_event(crtc, crtc->state->event);
+		crtc->state->event = NULL;
+		spin_unlock_irqrestore(&dev->event_lock, flags);
+	}
 }
 
 static void vc4_crtc_update_dlist(struct drm_crtc *crtc)
@@ -548,7 +562,8 @@ static void vc4_crtc_update_dlist(struct drm_crtc *crtc)
 	}
 }
 
-static void vc4_crtc_enable(struct drm_crtc *crtc)
+static void vc4_crtc_atomic_enable(struct drm_crtc *crtc,
+				   struct drm_crtc_state *old_state)
 {
 	struct drm_device *dev = crtc->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
@@ -577,18 +592,17 @@ static void vc4_crtc_enable(struct drm_crtc *crtc)
 		   CRTC_READ(PV_V_CONTROL) | PV_VCONTROL_VIDEN);
 }
 
-static bool vc4_crtc_mode_fixup(struct drm_crtc *crtc,
-				const struct drm_display_mode *mode,
-				struct drm_display_mode *adjusted_mode)
+static enum drm_mode_status vc4_crtc_mode_valid(struct drm_crtc *crtc,
+						const struct drm_display_mode *mode)
 {
 	/* Do not allow doublescan modes from user space */
-	if (adjusted_mode->flags & DRM_MODE_FLAG_DBLSCAN) {
+	if (mode->flags & DRM_MODE_FLAG_DBLSCAN) {
 		DRM_DEBUG_KMS("[CRTC:%d] Doublescan mode rejected.\n",
 			      crtc->base.id);
-		return false;
+		return MODE_NO_DBLESCAN;
 	}
 
-	return true;
+	return MODE_OK;
 }
 
 static int vc4_crtc_atomic_check(struct drm_crtc *crtc,
@@ -682,14 +696,6 @@ static void vc4_disable_vblank(struct drm_crtc *crtc)
 	CRTC_WRITE(PV_INTEN, 0);
 }
 
-/* Must be called with the event lock held */
-bool vc4_event_pending(struct drm_crtc *crtc)
-{
-	struct vc4_crtc *vc4_crtc = to_vc4_crtc(crtc);
-
-	return !!vc4_crtc->event;
-}
-
 static void vc4_crtc_handle_page_flip(struct vc4_crtc *vc4_crtc)
 {
 	struct drm_crtc *crtc = &vc4_crtc->base;
@@ -757,7 +763,7 @@ vc4_async_page_flip_complete(struct vc4_seqno_cb *cb)
 	}
 
 	drm_crtc_vblank_put(crtc);
-	drm_framebuffer_unreference(flip_state->fb);
+	drm_framebuffer_put(flip_state->fb);
 	kfree(flip_state);
 
 	up(&vc4->async_modeset);
@@ -786,7 +792,7 @@ static int vc4_async_page_flip(struct drm_crtc *crtc,
 	if (!flip_state)
 		return -ENOMEM;
 
-	drm_framebuffer_reference(fb);
+	drm_framebuffer_get(fb);
 	flip_state->fb = fb;
 	flip_state->crtc = crtc;
 	flip_state->event = event;
@@ -794,7 +800,7 @@ static int vc4_async_page_flip(struct drm_crtc *crtc,
 	/* Make sure all other async modesetes have landed. */
 	ret = down_interruptible(&vc4->async_modeset);
 	if (ret) {
-		drm_framebuffer_unreference(fb);
+		drm_framebuffer_put(fb);
 		kfree(flip_state);
 		return ret;
 	}
@@ -885,11 +891,11 @@ static const struct drm_crtc_funcs vc4_crtc_funcs = {
 
 static const struct drm_crtc_helper_funcs vc4_crtc_helper_funcs = {
 	.mode_set_nofb = vc4_crtc_mode_set_nofb,
-	.disable = vc4_crtc_disable,
-	.enable = vc4_crtc_enable,
-	.mode_fixup = vc4_crtc_mode_fixup,
+	.mode_valid = vc4_crtc_mode_valid,
 	.atomic_check = vc4_crtc_atomic_check,
 	.atomic_flush = vc4_crtc_atomic_flush,
+	.atomic_enable = vc4_crtc_atomic_enable,
+	.atomic_disable = vc4_crtc_atomic_disable,
 };
 
 static const struct vc4_crtc_data pv0_data = {
diff --git a/drivers/gpu/drm/vc4/vc4_dpi.c b/drivers/gpu/drm/vc4/vc4_dpi.c
index 2e0fe46..519cefe 100644
--- a/drivers/gpu/drm/vc4/vc4_dpi.c
+++ b/drivers/gpu/drm/vc4/vc4_dpi.c
@@ -224,20 +224,19 @@ static void vc4_dpi_encoder_enable(struct drm_encoder *encoder)
 		DRM_ERROR("Failed to set clock rate: %d\n", ret);
 }
 
-static bool vc4_dpi_encoder_mode_fixup(struct drm_encoder *encoder,
-				       const struct drm_display_mode *mode,
-				       struct drm_display_mode *adjusted_mode)
+static enum drm_mode_status vc4_dpi_encoder_mode_valid(struct drm_encoder *encoder,
+						       const struct drm_display_mode *mode)
 {
-	if (adjusted_mode->flags & DRM_MODE_FLAG_INTERLACE)
-		return false;
+	if (mode->flags & DRM_MODE_FLAG_INTERLACE)
+		return MODE_NO_INTERLACE;
 
-	return true;
+	return MODE_OK;
 }
 
 static const struct drm_encoder_helper_funcs vc4_dpi_encoder_helper_funcs = {
 	.disable = vc4_dpi_encoder_disable,
 	.enable = vc4_dpi_encoder_enable,
-	.mode_fixup = vc4_dpi_encoder_mode_fixup,
+	.mode_valid = vc4_dpi_encoder_mode_valid,
 };
 
 static const struct of_device_id vc4_dpi_dt_match[] = {
diff --git a/drivers/gpu/drm/vc4/vc4_drv.c b/drivers/gpu/drm/vc4/vc4_drv.c
index c6b487c..1c96edc 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.c
+++ b/drivers/gpu/drm/vc4/vc4_drv.c
@@ -99,6 +99,7 @@ static int vc4_get_param_ioctl(struct drm_device *dev, void *data,
 	case DRM_VC4_PARAM_SUPPORTS_BRANCHES:
 	case DRM_VC4_PARAM_SUPPORTS_ETC1:
 	case DRM_VC4_PARAM_SUPPORTS_THREADED_FS:
+	case DRM_VC4_PARAM_SUPPORTS_FIXED_RCL_ORDER:
 		args->value = true;
 		break;
 	default:
@@ -140,6 +141,7 @@ static const struct drm_ioctl_desc vc4_drm_ioctls[] = {
 	DRM_IOCTL_DEF_DRV(VC4_GET_PARAM, vc4_get_param_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(VC4_SET_TILING, vc4_set_tiling_ioctl, DRM_RENDER_ALLOW),
 	DRM_IOCTL_DEF_DRV(VC4_GET_TILING, vc4_get_tiling_ioctl, DRM_RENDER_ALLOW),
+	DRM_IOCTL_DEF_DRV(VC4_LABEL_BO, vc4_label_bo_ioctl, DRM_RENDER_ALLOW),
 };
 
 static struct drm_driver vc4_drm_driver = {
@@ -178,8 +180,6 @@ static struct drm_driver vc4_drm_driver = {
 	.gem_prime_mmap = vc4_prime_mmap,
 
 	.dumb_create = vc4_dumb_create,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 
 	.ioctls = vc4_drm_ioctls,
 	.num_ioctls = ARRAY_SIZE(vc4_drm_ioctls),
@@ -257,7 +257,9 @@ static int vc4_drm_bind(struct device *dev)
 	vc4->dev = drm;
 	drm->dev_private = vc4;
 
-	vc4_bo_cache_init(drm);
+	ret = vc4_bo_cache_init(drm);
+	if (ret)
+		goto dev_unref;
 
 	drm_mode_config_init(drm);
 
@@ -281,8 +283,9 @@ static int vc4_drm_bind(struct device *dev)
 	component_unbind_all(dev, drm);
 gem_destroy:
 	vc4_gem_destroy(drm);
-	drm_dev_unref(drm);
 	vc4_bo_cache_destroy(drm);
+dev_unref:
+	drm_dev_unref(drm);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/vc4/vc4_drv.h b/drivers/gpu/drm/vc4/vc4_drv.h
index df22698..87f2d8e 100644
--- a/drivers/gpu/drm/vc4/vc4_drv.h
+++ b/drivers/gpu/drm/vc4/vc4_drv.h
@@ -11,6 +11,24 @@
 #include <drm/drm_encoder.h>
 #include <drm/drm_gem_cma_helper.h>
 
+/* Don't forget to update vc4_bo.c: bo_type_names[] when adding to
+ * this.
+ */
+enum vc4_kernel_bo_type {
+	/* Any kernel allocation (gem_create_object hook) before it
+	 * gets another type set.
+	 */
+	VC4_BO_TYPE_KERNEL,
+	VC4_BO_TYPE_V3D,
+	VC4_BO_TYPE_V3D_SHADER,
+	VC4_BO_TYPE_DUMB,
+	VC4_BO_TYPE_BIN,
+	VC4_BO_TYPE_RCL,
+	VC4_BO_TYPE_BCL,
+	VC4_BO_TYPE_KERNEL_CACHE,
+	VC4_BO_TYPE_COUNT
+};
+
 struct vc4_dev {
 	struct drm_device *dev;
 
@@ -46,14 +64,14 @@ struct vc4_dev {
 		struct timer_list time_timer;
 	} bo_cache;
 
-	struct vc4_bo_stats {
+	u32 num_labels;
+	struct vc4_label {
+		const char *name;
 		u32 num_allocated;
 		u32 size_allocated;
-		u32 num_cached;
-		u32 size_cached;
-	} bo_stats;
+	} *bo_labels;
 
-	/* Protects bo_cache and the BO stats. */
+	/* Protects bo_cache and bo_labels. */
 	struct mutex bo_lock;
 
 	uint64_t dma_fence_context;
@@ -169,6 +187,11 @@ struct vc4_bo {
 	/* normally (resv == &_resv) except for imported bo's */
 	struct reservation_object *resv;
 	struct reservation_object _resv;
+
+	/* One of enum vc4_kernel_bo_type, or VC4_BO_TYPE_COUNT + i
+	 * for user-allocated labels.
+	 */
+	int label;
 };
 
 static inline struct vc4_bo *
@@ -460,7 +483,7 @@ struct vc4_validated_shader_info {
 struct drm_gem_object *vc4_create_object(struct drm_device *dev, size_t size);
 void vc4_free_object(struct drm_gem_object *gem_obj);
 struct vc4_bo *vc4_bo_create(struct drm_device *dev, size_t size,
-			     bool from_cache);
+			     bool from_cache, enum vc4_kernel_bo_type type);
 int vc4_dumb_create(struct drm_file *file_priv,
 		    struct drm_device *dev,
 		    struct drm_mode_create_dumb *args);
@@ -478,6 +501,8 @@ int vc4_get_tiling_ioctl(struct drm_device *dev, void *data,
 			 struct drm_file *file_priv);
 int vc4_get_hang_state_ioctl(struct drm_device *dev, void *data,
 			     struct drm_file *file_priv);
+int vc4_label_bo_ioctl(struct drm_device *dev, void *data,
+		       struct drm_file *file_priv);
 int vc4_mmap(struct file *filp, struct vm_area_struct *vma);
 struct reservation_object *vc4_prime_res_obj(struct drm_gem_object *obj);
 int vc4_prime_mmap(struct drm_gem_object *obj, struct vm_area_struct *vma);
@@ -485,13 +510,12 @@ struct drm_gem_object *vc4_prime_import_sg_table(struct drm_device *dev,
 						 struct dma_buf_attachment *attach,
 						 struct sg_table *sgt);
 void *vc4_prime_vmap(struct drm_gem_object *obj);
-void vc4_bo_cache_init(struct drm_device *dev);
+int vc4_bo_cache_init(struct drm_device *dev);
 void vc4_bo_cache_destroy(struct drm_device *dev);
 int vc4_bo_stats_debugfs(struct seq_file *m, void *arg);
 
 /* vc4_crtc.c */
 extern struct platform_driver vc4_crtc_driver;
-bool vc4_event_pending(struct drm_crtc *crtc);
 int vc4_crtc_debugfs_regs(struct seq_file *m, void *arg);
 bool vc4_crtc_get_scanoutpos(struct drm_device *dev, unsigned int crtc_id,
 			     bool in_vblank_irq, int *vpos, int *hpos,
diff --git a/drivers/gpu/drm/vc4/vc4_dsi.c b/drivers/gpu/drm/vc4/vc4_dsi.c
index 5e8b81e..d1e0dc9 100644
--- a/drivers/gpu/drm/vc4/vc4_dsi.c
+++ b/drivers/gpu/drm/vc4/vc4_dsi.c
@@ -736,18 +736,18 @@ static void vc4_dsi_latch_ulps(struct vc4_dsi *dsi, bool latch)
 /* Enters or exits Ultra Low Power State. */
 static void vc4_dsi_ulps(struct vc4_dsi *dsi, bool ulps)
 {
-	bool continuous = dsi->mode_flags & MIPI_DSI_CLOCK_NON_CONTINUOUS;
-	u32 phyc_ulps = ((continuous ? DSI_PORT_BIT(PHYC_CLANE_ULPS) : 0) |
+	bool non_continuous = dsi->mode_flags & MIPI_DSI_CLOCK_NON_CONTINUOUS;
+	u32 phyc_ulps = ((non_continuous ? DSI_PORT_BIT(PHYC_CLANE_ULPS) : 0) |
 			 DSI_PHYC_DLANE0_ULPS |
 			 (dsi->lanes > 1 ? DSI_PHYC_DLANE1_ULPS : 0) |
 			 (dsi->lanes > 2 ? DSI_PHYC_DLANE2_ULPS : 0) |
 			 (dsi->lanes > 3 ? DSI_PHYC_DLANE3_ULPS : 0));
-	u32 stat_ulps = ((continuous ? DSI1_STAT_PHY_CLOCK_ULPS : 0) |
+	u32 stat_ulps = ((non_continuous ? DSI1_STAT_PHY_CLOCK_ULPS : 0) |
 			 DSI1_STAT_PHY_D0_ULPS |
 			 (dsi->lanes > 1 ? DSI1_STAT_PHY_D1_ULPS : 0) |
 			 (dsi->lanes > 2 ? DSI1_STAT_PHY_D2_ULPS : 0) |
 			 (dsi->lanes > 3 ? DSI1_STAT_PHY_D3_ULPS : 0));
-	u32 stat_stop = ((continuous ? DSI1_STAT_PHY_CLOCK_STOP : 0) |
+	u32 stat_stop = ((non_continuous ? DSI1_STAT_PHY_CLOCK_STOP : 0) |
 			 DSI1_STAT_PHY_D0_STOP |
 			 (dsi->lanes > 1 ? DSI1_STAT_PHY_D1_STOP : 0) |
 			 (dsi->lanes > 2 ? DSI1_STAT_PHY_D2_STOP : 0) |
@@ -1035,7 +1035,17 @@ static void vc4_dsi_encoder_enable(struct drm_encoder *encoder)
 				     DSI_HS_DLT4_TRAIL) |
 		       VC4_SET_FIELD(0, DSI_HS_DLT4_ANLAT));
 
-	DSI_PORT_WRITE(HS_DLT5, VC4_SET_FIELD(dsi_hs_timing(ui_ns, 1000, 5000),
+	/* T_INIT is how long STOP is driven after power-up to
+	 * indicate to the slave (also coming out of power-up) that
+	 * master init is complete, and should be greater than the
+	 * maximum of two values: T_INIT,MASTER and T_INIT,SLAVE.  The
+	 * D-PHY spec gives a minimum 100us for T_INIT,MASTER and
+	 * T_INIT,SLAVE, while allowing protocols on top of it to give
+	 * greater minimums.  The vc4 firmware uses an extremely
+	 * conservative 5ms, and we maintain that here.
+	 */
+	DSI_PORT_WRITE(HS_DLT5, VC4_SET_FIELD(dsi_hs_timing(ui_ns,
+							    5 * 1000 * 1000, 0),
 					      DSI_HS_DLT5_INIT));
 
 	DSI_PORT_WRITE(HS_DLT6,
@@ -1626,14 +1636,10 @@ static void vc4_dsi_unbind(struct device *dev, struct device *master,
 
 	pm_runtime_disable(dev);
 
-	drm_bridge_remove(dsi->bridge);
 	vc4_dsi_encoder_destroy(dsi->encoder);
 
 	mipi_dsi_host_unregister(&dsi->dsi_host);
 
-	clk_disable_unprepare(dsi->pll_phy_clock);
-	clk_disable_unprepare(dsi->escape_clock);
-
 	if (dsi->port == 1)
 		vc4->dsi1 = NULL;
 }
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index d5b821a..d0c6bfb 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -55,7 +55,7 @@ vc4_free_hang_state(struct drm_device *dev, struct vc4_hang_state *state)
 	unsigned int i;
 
 	for (i = 0; i < state->user_state.bo_count; i++)
-		drm_gem_object_unreference_unlocked(state->bo[i]);
+		drm_gem_object_put_unlocked(state->bo[i]);
 
 	kfree(state);
 }
@@ -119,7 +119,7 @@ vc4_get_hang_state_ioctl(struct drm_device *dev, void *data,
 		bo_state[i].size = vc4_bo->base.base.size;
 	}
 
-	if (copy_to_user((void __user *)(uintptr_t)get_state->bo,
+	if (copy_to_user(u64_to_user_ptr(get_state->bo),
 			 bo_state,
 			 state->bo_count * sizeof(*bo_state)))
 		ret = -EFAULT;
@@ -188,12 +188,12 @@ vc4_save_hang_state(struct drm_device *dev)
 			continue;
 
 		for (j = 0; j < exec[i]->bo_count; j++) {
-			drm_gem_object_reference(&exec[i]->bo[j]->base);
+			drm_gem_object_get(&exec[i]->bo[j]->base);
 			kernel_state->bo[j + prev_idx] = &exec[i]->bo[j]->base;
 		}
 
 		list_for_each_entry(bo, &exec[i]->unref_list, unref_head) {
-			drm_gem_object_reference(&bo->base.base);
+			drm_gem_object_get(&bo->base.base);
 			kernel_state->bo[j + prev_idx] = &bo->base.base;
 			j++;
 		}
@@ -659,7 +659,7 @@ vc4_cl_lookup_bos(struct drm_device *dev,
 		/* See comment on bo_index for why we have to check
 		 * this.
 		 */
-		DRM_ERROR("Rendering requires BOs to validate\n");
+		DRM_DEBUG("Rendering requires BOs to validate\n");
 		return -EINVAL;
 	}
 
@@ -678,8 +678,7 @@ vc4_cl_lookup_bos(struct drm_device *dev,
 		goto fail;
 	}
 
-	if (copy_from_user(handles,
-			   (void __user *)(uintptr_t)args->bo_handles,
+	if (copy_from_user(handles, u64_to_user_ptr(args->bo_handles),
 			   exec->bo_count * sizeof(uint32_t))) {
 		ret = -EFAULT;
 		DRM_ERROR("Failed to copy in GEM handles\n");
@@ -691,13 +690,13 @@ vc4_cl_lookup_bos(struct drm_device *dev,
 		struct drm_gem_object *bo = idr_find(&file_priv->object_idr,
 						     handles[i]);
 		if (!bo) {
-			DRM_ERROR("Failed to look up GEM BO %d: %d\n",
+			DRM_DEBUG("Failed to look up GEM BO %d: %d\n",
 				  i, handles[i]);
 			ret = -EINVAL;
 			spin_unlock(&file_priv->table_lock);
 			goto fail;
 		}
-		drm_gem_object_reference(bo);
+		drm_gem_object_get(bo);
 		exec->bo[i] = (struct drm_gem_cma_object *)bo;
 	}
 	spin_unlock(&file_priv->table_lock);
@@ -729,7 +728,7 @@ vc4_get_bcl(struct drm_device *dev, struct vc4_exec_info *exec)
 	    args->shader_rec_count >= (UINT_MAX /
 					  sizeof(struct vc4_shader_state)) ||
 	    temp_size < exec_size) {
-		DRM_ERROR("overflow in exec arguments\n");
+		DRM_DEBUG("overflow in exec arguments\n");
 		ret = -EINVAL;
 		goto fail;
 	}
@@ -755,27 +754,27 @@ vc4_get_bcl(struct drm_device *dev, struct vc4_exec_info *exec)
 	exec->shader_state_size = args->shader_rec_count;
 
 	if (copy_from_user(bin,
-			   (void __user *)(uintptr_t)args->bin_cl,
+			   u64_to_user_ptr(args->bin_cl),
 			   args->bin_cl_size)) {
 		ret = -EFAULT;
 		goto fail;
 	}
 
 	if (copy_from_user(exec->shader_rec_u,
-			   (void __user *)(uintptr_t)args->shader_rec,
+			   u64_to_user_ptr(args->shader_rec),
 			   args->shader_rec_size)) {
 		ret = -EFAULT;
 		goto fail;
 	}
 
 	if (copy_from_user(exec->uniforms_u,
-			   (void __user *)(uintptr_t)args->uniforms,
+			   u64_to_user_ptr(args->uniforms),
 			   args->uniforms_size)) {
 		ret = -EFAULT;
 		goto fail;
 	}
 
-	bo = vc4_bo_create(dev, exec_size, true);
+	bo = vc4_bo_create(dev, exec_size, true, VC4_BO_TYPE_BCL);
 	if (IS_ERR(bo)) {
 		DRM_ERROR("Couldn't allocate BO for binning\n");
 		ret = PTR_ERR(bo);
@@ -835,7 +834,7 @@ vc4_complete_exec(struct drm_device *dev, struct vc4_exec_info *exec)
 
 	if (exec->bo) {
 		for (i = 0; i < exec->bo_count; i++)
-			drm_gem_object_unreference_unlocked(&exec->bo[i]->base);
+			drm_gem_object_put_unlocked(&exec->bo[i]->base);
 		kvfree(exec->bo);
 	}
 
@@ -843,7 +842,7 @@ vc4_complete_exec(struct drm_device *dev, struct vc4_exec_info *exec)
 		struct vc4_bo *bo = list_first_entry(&exec->unref_list,
 						     struct vc4_bo, unref_head);
 		list_del(&bo->unref_head);
-		drm_gem_object_unreference_unlocked(&bo->base.base);
+		drm_gem_object_put_unlocked(&bo->base.base);
 	}
 
 	/* Free up the allocation of any bin slots we used. */
@@ -974,7 +973,7 @@ vc4_wait_bo_ioctl(struct drm_device *dev, void *data,
 
 	gem_obj = drm_gem_object_lookup(file_priv, args->handle);
 	if (!gem_obj) {
-		DRM_ERROR("Failed to look up GEM BO %d\n", args->handle);
+		DRM_DEBUG("Failed to look up GEM BO %d\n", args->handle);
 		return -EINVAL;
 	}
 	bo = to_vc4_bo(gem_obj);
@@ -982,7 +981,7 @@ vc4_wait_bo_ioctl(struct drm_device *dev, void *data,
 	ret = vc4_wait_for_seqno_ioctl_helper(dev, bo->seqno,
 					      &args->timeout_ns);
 
-	drm_gem_object_unreference_unlocked(gem_obj);
+	drm_gem_object_put_unlocked(gem_obj);
 	return ret;
 }
 
@@ -1008,8 +1007,11 @@ vc4_submit_cl_ioctl(struct drm_device *dev, void *data,
 	struct ww_acquire_ctx acquire_ctx;
 	int ret = 0;
 
-	if ((args->flags & ~VC4_SUBMIT_CL_USE_CLEAR_COLOR) != 0) {
-		DRM_ERROR("Unknown flags: 0x%02x\n", args->flags);
+	if ((args->flags & ~(VC4_SUBMIT_CL_USE_CLEAR_COLOR |
+			     VC4_SUBMIT_CL_FIXED_RCL_ORDER |
+			     VC4_SUBMIT_CL_RCL_ORDER_INCREASING_X |
+			     VC4_SUBMIT_CL_RCL_ORDER_INCREASING_Y)) != 0) {
+		DRM_DEBUG("Unknown flags: 0x%02x\n", args->flags);
 		return -EINVAL;
 	}
 
@@ -1118,6 +1120,4 @@ vc4_gem_destroy(struct drm_device *dev)
 
 	if (vc4->hang_state)
 		vc4_free_hang_state(dev, vc4->hang_state);
-
-	vc4_bo_cache_destroy(dev);
 }
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c
index ed63d4e..937da8d 100644
--- a/drivers/gpu/drm/vc4/vc4_hdmi.c
+++ b/drivers/gpu/drm/vc4/vc4_hdmi.c
@@ -57,9 +57,14 @@
 #include <sound/pcm_drm_eld.h>
 #include <sound/pcm_params.h>
 #include <sound/soc.h>
+#include "media/cec.h"
 #include "vc4_drv.h"
 #include "vc4_regs.h"
 
+#define HSM_CLOCK_FREQ 163682864
+#define CEC_CLOCK_FREQ 40000
+#define CEC_CLOCK_DIV  (HSM_CLOCK_FREQ / CEC_CLOCK_FREQ)
+
 /* HDMI audio information */
 struct vc4_hdmi_audio {
 	struct snd_soc_card card;
@@ -85,6 +90,11 @@ struct vc4_hdmi {
 	int hpd_gpio;
 	bool hpd_active_low;
 
+	struct cec_adapter *cec_adap;
+	struct cec_msg cec_rx_msg;
+	bool cec_tx_ok;
+	bool cec_irq_was_rx;
+
 	struct clk *pixel_clock;
 	struct clk *hsm_clock;
 };
@@ -149,6 +159,23 @@ static const struct {
 	HDMI_REG(VC4_HDMI_VERTB1),
 	HDMI_REG(VC4_HDMI_TX_PHY_RESET_CTL),
 	HDMI_REG(VC4_HDMI_TX_PHY_CTL0),
+
+	HDMI_REG(VC4_HDMI_CEC_CNTRL_1),
+	HDMI_REG(VC4_HDMI_CEC_CNTRL_2),
+	HDMI_REG(VC4_HDMI_CEC_CNTRL_3),
+	HDMI_REG(VC4_HDMI_CEC_CNTRL_4),
+	HDMI_REG(VC4_HDMI_CEC_CNTRL_5),
+	HDMI_REG(VC4_HDMI_CPU_STATUS),
+	HDMI_REG(VC4_HDMI_CPU_MASK_STATUS),
+
+	HDMI_REG(VC4_HDMI_CEC_RX_DATA_1),
+	HDMI_REG(VC4_HDMI_CEC_RX_DATA_2),
+	HDMI_REG(VC4_HDMI_CEC_RX_DATA_3),
+	HDMI_REG(VC4_HDMI_CEC_RX_DATA_4),
+	HDMI_REG(VC4_HDMI_CEC_TX_DATA_1),
+	HDMI_REG(VC4_HDMI_CEC_TX_DATA_2),
+	HDMI_REG(VC4_HDMI_CEC_TX_DATA_3),
+	HDMI_REG(VC4_HDMI_CEC_TX_DATA_4),
 };
 
 static const struct {
@@ -216,8 +243,8 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, bool force)
 		if (gpio_get_value_cansleep(vc4->hdmi->hpd_gpio) ^
 		    vc4->hdmi->hpd_active_low)
 			return connector_status_connected;
-		else
-			return connector_status_disconnected;
+		cec_phys_addr_invalidate(vc4->hdmi->cec_adap);
+		return connector_status_disconnected;
 	}
 
 	if (drm_probe_ddc(vc4->hdmi->ddc))
@@ -225,8 +252,8 @@ vc4_hdmi_connector_detect(struct drm_connector *connector, bool force)
 
 	if (HDMI_READ(VC4_HDMI_HOTPLUG) & VC4_HDMI_HOTPLUG_CONNECTED)
 		return connector_status_connected;
-	else
-		return connector_status_disconnected;
+	cec_phys_addr_invalidate(vc4->hdmi->cec_adap);
+	return connector_status_disconnected;
 }
 
 static void vc4_hdmi_connector_destroy(struct drm_connector *connector)
@@ -247,6 +274,7 @@ static int vc4_hdmi_connector_get_modes(struct drm_connector *connector)
 	struct edid *edid;
 
 	edid = drm_get_edid(connector, vc4->hdmi->ddc);
+	cec_s_phys_addr_from_edid(vc4->hdmi->cec_adap, edid);
 	if (!edid)
 		return -ENODEV;
 
@@ -260,12 +288,12 @@ static int vc4_hdmi_connector_get_modes(struct drm_connector *connector)
 	drm_mode_connector_update_edid_property(connector, edid);
 	ret = drm_add_edid_modes(connector, edid);
 	drm_edid_to_eld(connector, edid);
+	kfree(edid);
 
 	return ret;
 }
 
 static const struct drm_connector_funcs vc4_hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = vc4_hdmi_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = vc4_hdmi_connector_destroy,
@@ -395,7 +423,7 @@ static void vc4_hdmi_set_avi_infoframe(struct drm_encoder *encoder)
 	union hdmi_infoframe frame;
 	int ret;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
 	if (ret < 0) {
 		DRM_ERROR("couldn't fill AVI infoframe\n");
 		return;
@@ -463,11 +491,6 @@ static void vc4_hdmi_encoder_disable(struct drm_encoder *encoder)
 	HD_WRITE(VC4_HD_VID_CTL,
 		 HD_READ(VC4_HD_VID_CTL) & ~VC4_HD_VID_CTL_ENABLE);
 
-	HD_WRITE(VC4_HD_M_CTL, VC4_HD_M_SW_RST);
-	udelay(1);
-	HD_WRITE(VC4_HD_M_CTL, 0);
-
-	clk_disable_unprepare(hdmi->hsm_clock);
 	clk_disable_unprepare(hdmi->pixel_clock);
 
 	ret = pm_runtime_put(&hdmi->pdev->dev);
@@ -509,16 +532,6 @@ static void vc4_hdmi_encoder_enable(struct drm_encoder *encoder)
 		return;
 	}
 
-	/* This is the rate that is set by the firmware.  The number
-	 * needs to be a bit higher than the pixel clock rate
-	 * (generally 148.5Mhz).
-	 */
-	ret = clk_set_rate(hdmi->hsm_clock, 163682864);
-	if (ret) {
-		DRM_ERROR("Failed to set HSM clock rate: %d\n", ret);
-		return;
-	}
-
 	ret = clk_set_rate(hdmi->pixel_clock,
 			   mode->clock * 1000 *
 			   ((mode->flags & DRM_MODE_FLAG_DBLCLK) ? 2 : 1));
@@ -533,20 +546,6 @@ static void vc4_hdmi_encoder_enable(struct drm_encoder *encoder)
 		return;
 	}
 
-	ret = clk_prepare_enable(hdmi->hsm_clock);
-	if (ret) {
-		DRM_ERROR("Failed to turn on HDMI state machine clock: %d\n",
-			  ret);
-		clk_disable_unprepare(hdmi->pixel_clock);
-		return;
-	}
-
-	HD_WRITE(VC4_HD_M_CTL, VC4_HD_M_SW_RST);
-	udelay(1);
-	HD_WRITE(VC4_HD_M_CTL, 0);
-
-	HD_WRITE(VC4_HD_M_CTL, VC4_HD_M_ENABLE);
-
 	HDMI_WRITE(VC4_HDMI_SW_RESET_CONTROL,
 		   VC4_HDMI_SW_RESET_HDMI |
 		   VC4_HDMI_SW_RESET_FORMAT_DETECT);
@@ -1150,6 +1149,159 @@ static void vc4_hdmi_audio_cleanup(struct vc4_hdmi *hdmi)
 		snd_soc_unregister_codec(dev);
 }
 
+#ifdef CONFIG_DRM_VC4_HDMI_CEC
+static irqreturn_t vc4_cec_irq_handler_thread(int irq, void *priv)
+{
+	struct vc4_dev *vc4 = priv;
+	struct vc4_hdmi *hdmi = vc4->hdmi;
+
+	if (hdmi->cec_irq_was_rx) {
+		if (hdmi->cec_rx_msg.len)
+			cec_received_msg(hdmi->cec_adap, &hdmi->cec_rx_msg);
+	} else if (hdmi->cec_tx_ok) {
+		cec_transmit_done(hdmi->cec_adap, CEC_TX_STATUS_OK,
+				  0, 0, 0, 0);
+	} else {
+		/*
+		 * This CEC implementation makes 1 retry, so if we
+		 * get a NACK, then that means it made 2 attempts.
+		 */
+		cec_transmit_done(hdmi->cec_adap, CEC_TX_STATUS_NACK,
+				  0, 2, 0, 0);
+	}
+	return IRQ_HANDLED;
+}
+
+static void vc4_cec_read_msg(struct vc4_dev *vc4, u32 cntrl1)
+{
+	struct cec_msg *msg = &vc4->hdmi->cec_rx_msg;
+	unsigned int i;
+
+	msg->len = 1 + ((cntrl1 & VC4_HDMI_CEC_REC_WRD_CNT_MASK) >>
+					VC4_HDMI_CEC_REC_WRD_CNT_SHIFT);
+	for (i = 0; i < msg->len; i += 4) {
+		u32 val = HDMI_READ(VC4_HDMI_CEC_RX_DATA_1 + i);
+
+		msg->msg[i] = val & 0xff;
+		msg->msg[i + 1] = (val >> 8) & 0xff;
+		msg->msg[i + 2] = (val >> 16) & 0xff;
+		msg->msg[i + 3] = (val >> 24) & 0xff;
+	}
+}
+
+static irqreturn_t vc4_cec_irq_handler(int irq, void *priv)
+{
+	struct vc4_dev *vc4 = priv;
+	struct vc4_hdmi *hdmi = vc4->hdmi;
+	u32 stat = HDMI_READ(VC4_HDMI_CPU_STATUS);
+	u32 cntrl1, cntrl5;
+
+	if (!(stat & VC4_HDMI_CPU_CEC))
+		return IRQ_NONE;
+	hdmi->cec_rx_msg.len = 0;
+	cntrl1 = HDMI_READ(VC4_HDMI_CEC_CNTRL_1);
+	cntrl5 = HDMI_READ(VC4_HDMI_CEC_CNTRL_5);
+	hdmi->cec_irq_was_rx = cntrl5 & VC4_HDMI_CEC_RX_CEC_INT;
+	if (hdmi->cec_irq_was_rx) {
+		vc4_cec_read_msg(vc4, cntrl1);
+		cntrl1 |= VC4_HDMI_CEC_CLEAR_RECEIVE_OFF;
+		HDMI_WRITE(VC4_HDMI_CEC_CNTRL_1, cntrl1);
+		cntrl1 &= ~VC4_HDMI_CEC_CLEAR_RECEIVE_OFF;
+	} else {
+		hdmi->cec_tx_ok = cntrl1 & VC4_HDMI_CEC_TX_STATUS_GOOD;
+		cntrl1 &= ~VC4_HDMI_CEC_START_XMIT_BEGIN;
+	}
+	HDMI_WRITE(VC4_HDMI_CEC_CNTRL_1, cntrl1);
+	HDMI_WRITE(VC4_HDMI_CPU_CLEAR, VC4_HDMI_CPU_CEC);
+
+	return IRQ_WAKE_THREAD;
+}
+
+static int vc4_hdmi_cec_adap_enable(struct cec_adapter *adap, bool enable)
+{
+	struct vc4_dev *vc4 = cec_get_drvdata(adap);
+	/* clock period in microseconds */
+	const u32 usecs = 1000000 / CEC_CLOCK_FREQ;
+	u32 val = HDMI_READ(VC4_HDMI_CEC_CNTRL_5);
+
+	val &= ~(VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET |
+		 VC4_HDMI_CEC_CNT_TO_4700_US_MASK |
+		 VC4_HDMI_CEC_CNT_TO_4500_US_MASK);
+	val |= ((4700 / usecs) << VC4_HDMI_CEC_CNT_TO_4700_US_SHIFT) |
+	       ((4500 / usecs) << VC4_HDMI_CEC_CNT_TO_4500_US_SHIFT);
+
+	if (enable) {
+		HDMI_WRITE(VC4_HDMI_CEC_CNTRL_5, val |
+			   VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET);
+		HDMI_WRITE(VC4_HDMI_CEC_CNTRL_5, val);
+		HDMI_WRITE(VC4_HDMI_CEC_CNTRL_2,
+			 ((1500 / usecs) << VC4_HDMI_CEC_CNT_TO_1500_US_SHIFT) |
+			 ((1300 / usecs) << VC4_HDMI_CEC_CNT_TO_1300_US_SHIFT) |
+			 ((800 / usecs) << VC4_HDMI_CEC_CNT_TO_800_US_SHIFT) |
+			 ((600 / usecs) << VC4_HDMI_CEC_CNT_TO_600_US_SHIFT) |
+			 ((400 / usecs) << VC4_HDMI_CEC_CNT_TO_400_US_SHIFT));
+		HDMI_WRITE(VC4_HDMI_CEC_CNTRL_3,
+			 ((2750 / usecs) << VC4_HDMI_CEC_CNT_TO_2750_US_SHIFT) |
+			 ((2400 / usecs) << VC4_HDMI_CEC_CNT_TO_2400_US_SHIFT) |
+			 ((2050 / usecs) << VC4_HDMI_CEC_CNT_TO_2050_US_SHIFT) |
+			 ((1700 / usecs) << VC4_HDMI_CEC_CNT_TO_1700_US_SHIFT));
+		HDMI_WRITE(VC4_HDMI_CEC_CNTRL_4,
+			 ((4300 / usecs) << VC4_HDMI_CEC_CNT_TO_4300_US_SHIFT) |
+			 ((3900 / usecs) << VC4_HDMI_CEC_CNT_TO_3900_US_SHIFT) |
+			 ((3600 / usecs) << VC4_HDMI_CEC_CNT_TO_3600_US_SHIFT) |
+			 ((3500 / usecs) << VC4_HDMI_CEC_CNT_TO_3500_US_SHIFT));
+
+		HDMI_WRITE(VC4_HDMI_CPU_MASK_CLEAR, VC4_HDMI_CPU_CEC);
+	} else {
+		HDMI_WRITE(VC4_HDMI_CPU_MASK_SET, VC4_HDMI_CPU_CEC);
+		HDMI_WRITE(VC4_HDMI_CEC_CNTRL_5, val |
+			   VC4_HDMI_CEC_TX_SW_RESET | VC4_HDMI_CEC_RX_SW_RESET);
+	}
+	return 0;
+}
+
+static int vc4_hdmi_cec_adap_log_addr(struct cec_adapter *adap, u8 log_addr)
+{
+	struct vc4_dev *vc4 = cec_get_drvdata(adap);
+
+	HDMI_WRITE(VC4_HDMI_CEC_CNTRL_1,
+		   (HDMI_READ(VC4_HDMI_CEC_CNTRL_1) & ~VC4_HDMI_CEC_ADDR_MASK) |
+		   (log_addr & 0xf) << VC4_HDMI_CEC_ADDR_SHIFT);
+	return 0;
+}
+
+static int vc4_hdmi_cec_adap_transmit(struct cec_adapter *adap, u8 attempts,
+				      u32 signal_free_time, struct cec_msg *msg)
+{
+	struct vc4_dev *vc4 = cec_get_drvdata(adap);
+	u32 val;
+	unsigned int i;
+
+	for (i = 0; i < msg->len; i += 4)
+		HDMI_WRITE(VC4_HDMI_CEC_TX_DATA_1 + i,
+			   (msg->msg[i]) |
+			   (msg->msg[i + 1] << 8) |
+			   (msg->msg[i + 2] << 16) |
+			   (msg->msg[i + 3] << 24));
+
+	val = HDMI_READ(VC4_HDMI_CEC_CNTRL_1);
+	val &= ~VC4_HDMI_CEC_START_XMIT_BEGIN;
+	HDMI_WRITE(VC4_HDMI_CEC_CNTRL_1, val);
+	val &= ~VC4_HDMI_CEC_MESSAGE_LENGTH_MASK;
+	val |= (msg->len - 1) << VC4_HDMI_CEC_MESSAGE_LENGTH_SHIFT;
+	val |= VC4_HDMI_CEC_START_XMIT_BEGIN;
+
+	HDMI_WRITE(VC4_HDMI_CEC_CNTRL_1, val);
+	return 0;
+}
+
+static const struct cec_adap_ops vc4_hdmi_cec_adap_ops = {
+	.adap_enable = vc4_hdmi_cec_adap_enable,
+	.adap_log_addr = vc4_hdmi_cec_adap_log_addr,
+	.adap_transmit = vc4_hdmi_cec_adap_transmit,
+};
+#endif
+
 static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
 {
 	struct platform_device *pdev = to_platform_device(dev);
@@ -1205,6 +1357,23 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
 		return -EPROBE_DEFER;
 	}
 
+	/* This is the rate that is set by the firmware.  The number
+	 * needs to be a bit higher than the pixel clock rate
+	 * (generally 148.5MHz).
+	 */
+	ret = clk_set_rate(hdmi->hsm_clock, HSM_CLOCK_FREQ);
+	if (ret) {
+		DRM_ERROR("Failed to set HSM clock rate: %d\n", ret);
+		goto err_put_i2c;
+	}
+
+	ret = clk_prepare_enable(hdmi->hsm_clock);
+	if (ret) {
+		DRM_ERROR("Failed to turn on HDMI state machine clock: %d\n",
+			  ret);
+		goto err_put_i2c;
+	}
+
 	/* Only use the GPIO HPD pin if present in the DT, otherwise
 	 * we'll use the HDMI core's register.
 	 */
@@ -1216,7 +1385,7 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
 							 &hpd_gpio_flags);
 		if (hdmi->hpd_gpio < 0) {
 			ret = hdmi->hpd_gpio;
-			goto err_put_i2c;
+			goto err_unprepare_hsm;
 		}
 
 		hdmi->hpd_active_low = hpd_gpio_flags & OF_GPIO_ACTIVE_LOW;
@@ -1224,6 +1393,14 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
 
 	vc4->hdmi = hdmi;
 
+	/* HDMI core must be enabled. */
+	if (!(HD_READ(VC4_HD_M_CTL) & VC4_HD_M_ENABLE)) {
+		HD_WRITE(VC4_HD_M_CTL, VC4_HD_M_SW_RST);
+		udelay(1);
+		HD_WRITE(VC4_HD_M_CTL, 0);
+
+		HD_WRITE(VC4_HD_M_CTL, VC4_HD_M_ENABLE);
+	}
 	pm_runtime_enable(dev);
 
 	drm_encoder_init(drm, hdmi->encoder, &vc4_hdmi_encoder_funcs,
@@ -1235,6 +1412,37 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
 		ret = PTR_ERR(hdmi->connector);
 		goto err_destroy_encoder;
 	}
+#ifdef CONFIG_DRM_VC4_HDMI_CEC
+	hdmi->cec_adap = cec_allocate_adapter(&vc4_hdmi_cec_adap_ops,
+					      vc4, "vc4",
+					      CEC_CAP_TRANSMIT |
+					      CEC_CAP_LOG_ADDRS |
+					      CEC_CAP_PASSTHROUGH |
+					      CEC_CAP_RC, 1);
+	ret = PTR_ERR_OR_ZERO(hdmi->cec_adap);
+	if (ret < 0)
+		goto err_destroy_conn;
+	HDMI_WRITE(VC4_HDMI_CPU_MASK_SET, 0xffffffff);
+	value = HDMI_READ(VC4_HDMI_CEC_CNTRL_1);
+	value &= ~VC4_HDMI_CEC_DIV_CLK_CNT_MASK;
+	/*
+	 * Set the logical address to Unregistered and set the clock
+	 * divider: the hsm_clock rate and this divider setting will
+	 * give a 40 kHz CEC clock.
+	 */
+	value |= VC4_HDMI_CEC_ADDR_MASK |
+		 (4091 << VC4_HDMI_CEC_DIV_CLK_CNT_SHIFT);
+	HDMI_WRITE(VC4_HDMI_CEC_CNTRL_1, value);
+	ret = devm_request_threaded_irq(dev, platform_get_irq(pdev, 0),
+					vc4_cec_irq_handler,
+					vc4_cec_irq_handler_thread, 0,
+					"vc4 hdmi cec", vc4);
+	if (ret)
+		goto err_delete_cec_adap;
+	ret = cec_register_adapter(hdmi->cec_adap, dev);
+	if (ret < 0)
+		goto err_delete_cec_adap;
+#endif
 
 	ret = vc4_hdmi_audio_init(hdmi);
 	if (ret)
@@ -1242,8 +1450,16 @@ static int vc4_hdmi_bind(struct device *dev, struct device *master, void *data)
 
 	return 0;
 
+#ifdef CONFIG_DRM_VC4_HDMI_CEC
+err_delete_cec_adap:
+	cec_delete_adapter(hdmi->cec_adap);
+err_destroy_conn:
+	vc4_hdmi_connector_destroy(hdmi->connector);
+#endif
 err_destroy_encoder:
 	vc4_hdmi_encoder_destroy(hdmi->encoder);
+err_unprepare_hsm:
+	clk_disable_unprepare(hdmi->hsm_clock);
 	pm_runtime_disable(dev);
 err_put_i2c:
 	put_device(&hdmi->ddc->dev);
@@ -1259,10 +1475,11 @@ static void vc4_hdmi_unbind(struct device *dev, struct device *master,
 	struct vc4_hdmi *hdmi = vc4->hdmi;
 
 	vc4_hdmi_audio_cleanup(hdmi);
-
+	cec_unregister_adapter(hdmi->cec_adap);
 	vc4_hdmi_connector_destroy(hdmi->connector);
 	vc4_hdmi_encoder_destroy(hdmi->encoder);
 
+	clk_disable_unprepare(hdmi->hsm_clock);
 	pm_runtime_disable(dev);
 
 	put_device(&hdmi->ddc->dev);
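The transmit and receive paths in the vc4_hdmi.c hunks above move CEC message bytes through four 32-bit data registers, with the first byte in the low byte of the first word. A minimal user-space sketch of that packing follows. The helper names `cec_pack_words` and `cec_unpack_words` are hypothetical and not part of the patch. The sketch assumes a 16-byte maximum message and zero-pads unused bytes, which the driver code does not do explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed limit: four 32-bit data registers hold at most 16 bytes. */
#define SKETCH_CEC_MAX_MSG_SIZE 16

/*
 * Hypothetical helper: pack len message bytes into four words.
 * Byte 0 lands in bits 7:0 of words[0], byte 1 in bits 15:8, and so on.
 */
static void cec_pack_words(const uint8_t *msg, size_t len, uint32_t words[4])
{
	size_t i;

	assert(len >= 1 && len <= SKETCH_CEC_MAX_MSG_SIZE);

	for (i = 0; i < 4; i++)
		words[i] = 0;

	for (i = 0; i < len; i++)
		words[i / 4] |= (uint32_t)msg[i] << (8 * (i % 4));
}

/* Hypothetical helper: the inverse of cec_pack_words(). */
static void cec_unpack_words(const uint32_t words[4], size_t len, uint8_t *msg)
{
	size_t i;

	assert(len >= 1 && len <= SKETCH_CEC_MAX_MSG_SIZE);

	for (i = 0; i < len; i++)
		msg[i] = (words[i / 4] >> (8 * (i % 4))) & 0xff;
}
```

In the driver the words would go to VC4_HDMI_CEC_TX_DATA_1..4 or come from VC4_HDMI_CEC_RX_DATA_1..4. Here they are plain arrays so the byte order can be checked in isolation.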
diff --git a/drivers/gpu/drm/vc4/vc4_kms.c b/drivers/gpu/drm/vc4/vc4_kms.c
index bc6ecdc..50c4959 100644
--- a/drivers/gpu/drm/vc4/vc4_kms.c
+++ b/drivers/gpu/drm/vc4/vc4_kms.c
@@ -20,6 +20,7 @@
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_plane_helper.h>
 #include <drm/drm_fb_cma_helper.h>
+#include <drm/drm_gem_framebuffer_helper.h>
 #include "vc4_drv.h"
 
 static void vc4_output_poll_changed(struct drm_device *dev)
@@ -29,16 +30,9 @@ static void vc4_output_poll_changed(struct drm_device *dev)
 	drm_fbdev_cma_hotplug_event(vc4->fbdev);
 }
 
-struct vc4_commit {
-	struct drm_device *dev;
-	struct drm_atomic_state *state;
-	struct vc4_seqno_cb cb;
-};
-
 static void
-vc4_atomic_complete_commit(struct vc4_commit *c)
+vc4_atomic_complete_commit(struct drm_atomic_state *state)
 {
-	struct drm_atomic_state *state = c->state;
 	struct drm_device *dev = state->dev;
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 
@@ -72,28 +66,14 @@ vc4_atomic_complete_commit(struct vc4_commit *c)
 	drm_atomic_state_put(state);
 
 	up(&vc4->async_modeset);
-
-	kfree(c);
 }
 
-static void
-vc4_atomic_complete_commit_seqno_cb(struct vc4_seqno_cb *cb)
+static void commit_work(struct work_struct *work)
 {
-	struct vc4_commit *c = container_of(cb, struct vc4_commit, cb);
-
-	vc4_atomic_complete_commit(c);
-}
-
-static struct vc4_commit *commit_init(struct drm_atomic_state *state)
-{
-	struct vc4_commit *c = kzalloc(sizeof(*c), GFP_KERNEL);
-
-	if (!c)
-		return NULL;
-	c->dev = state->dev;
-	c->state = state;
-
-	return c;
+	struct drm_atomic_state *state = container_of(work,
+						      struct drm_atomic_state,
+						      commit_work);
+	vc4_atomic_complete_commit(state);
 }
 
 /**
@@ -115,40 +95,29 @@ static int vc4_atomic_commit(struct drm_device *dev,
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 	int ret;
-	int i;
-	uint64_t wait_seqno = 0;
-	struct vc4_commit *c;
-	struct drm_plane *plane;
-	struct drm_plane_state *new_state;
-
-	c = commit_init(state);
-	if (!c)
-		return -ENOMEM;
 
 	ret = drm_atomic_helper_setup_commit(state, nonblock);
 	if (ret)
 		return ret;
 
+	INIT_WORK(&state->commit_work, commit_work);
+
 	ret = down_interruptible(&vc4->async_modeset);
-	if (ret) {
-		kfree(c);
+	if (ret)
 		return ret;
-	}
 
 	ret = drm_atomic_helper_prepare_planes(dev, state);
 	if (ret) {
-		kfree(c);
 		up(&vc4->async_modeset);
 		return ret;
 	}
 
-	for_each_plane_in_state(state, plane, new_state, i) {
-		if ((plane->state->fb != new_state->fb) && new_state->fb) {
-			struct drm_gem_cma_object *cma_bo =
-				drm_fb_cma_get_gem_obj(new_state->fb, 0);
-			struct vc4_bo *bo = to_vc4_bo(&cma_bo->base);
-
-			wait_seqno = max(bo->seqno, wait_seqno);
+	if (!nonblock) {
+		ret = drm_atomic_helper_wait_for_fences(dev, state, true);
+		if (ret) {
+			drm_atomic_helper_cleanup_planes(dev, state);
+			up(&vc4->async_modeset);
+			return ret;
 		}
 	}
 
@@ -158,7 +127,7 @@ static int vc4_atomic_commit(struct drm_device *dev,
 	 * the software side now.
 	 */
 
-	drm_atomic_helper_swap_state(state, true);
+	BUG_ON(drm_atomic_helper_swap_state(state, false) < 0);
 
 	/*
 	 * Everything below can be run asynchronously without the need to grab
@@ -177,13 +146,10 @@ static int vc4_atomic_commit(struct drm_device *dev,
 	 */
 
 	drm_atomic_state_get(state);
-	if (nonblock) {
-		vc4_queue_seqno_cb(dev, &c->cb, wait_seqno,
-				   vc4_atomic_complete_commit_seqno_cb);
-	} else {
-		vc4_wait_for_seqno(dev, wait_seqno, ~0ull, false);
-		vc4_atomic_complete_commit(c);
-	}
+	if (nonblock)
+		queue_work(system_unbound_wq, &state->commit_work);
+	else
+		vc4_atomic_complete_commit(state);
 
 	return 0;
 }
@@ -204,7 +170,7 @@ static struct drm_framebuffer *vc4_fb_create(struct drm_device *dev,
 		gem_obj = drm_gem_object_lookup(file_priv,
 						mode_cmd->handles[0]);
 		if (!gem_obj) {
-			DRM_ERROR("Failed to look up GEM BO %d\n",
+			DRM_DEBUG("Failed to look up GEM BO %d\n",
 				  mode_cmd->handles[0]);
 			return ERR_PTR(-ENOENT);
 		}
@@ -219,12 +185,12 @@ static struct drm_framebuffer *vc4_fb_create(struct drm_device *dev,
 			mode_cmd_local.modifier[0] = DRM_FORMAT_MOD_NONE;
 		}
 
-		drm_gem_object_unreference_unlocked(gem_obj);
+		drm_gem_object_put_unlocked(gem_obj);
 
 		mode_cmd = &mode_cmd_local;
 	}
 
-	return drm_fb_cma_create(dev, file_priv, mode_cmd);
+	return drm_gem_fb_create(dev, file_priv, mode_cmd);
 }
 
 static const struct drm_mode_config_funcs vc4_mode_funcs = {
@@ -241,6 +207,9 @@ int vc4_kms_load(struct drm_device *dev)
 
 	sema_init(&vc4->async_modeset, 1);
 
+	/* Set support for vblank irq fast disable, before drm_vblank_init() */
+	dev->vblank_disable_immediate = true;
+
 	ret = drm_vblank_init(dev, dev->mode_config.num_crtc);
 	if (ret < 0) {
 		dev_err(dev->dev, "failed to initialize vblank\n");
diff --git a/drivers/gpu/drm/vc4/vc4_plane.c b/drivers/gpu/drm/vc4/vc4_plane.c
index fa6809d..2968b3e 100644
--- a/drivers/gpu/drm/vc4/vc4_plane.c
+++ b/drivers/gpu/drm/vc4/vc4_plane.c
@@ -759,9 +759,26 @@ void vc4_plane_async_set_fb(struct drm_plane *plane, struct drm_framebuffer *fb)
 	vc4_state->dlist[vc4_state->ptr0_offset] = addr;
 }
 
+static int vc4_prepare_fb(struct drm_plane *plane,
+			  struct drm_plane_state *state)
+{
+	struct vc4_bo *bo;
+	struct dma_fence *fence;
+
+	if ((plane->state->fb == state->fb) || !state->fb)
+		return 0;
+
+	bo = to_vc4_bo(&drm_fb_cma_get_gem_obj(state->fb, 0)->base);
+	fence = reservation_object_get_excl_rcu(bo->resv);
+	drm_atomic_set_fence_for_plane(state, fence);
+
+	return 0;
+}
+
 static const struct drm_plane_helper_funcs vc4_plane_helper_funcs = {
 	.atomic_check = vc4_plane_atomic_check,
 	.atomic_update = vc4_plane_atomic_update,
+	.prepare_fb = vc4_prepare_fb,
 };
 
 static void vc4_plane_destroy(struct drm_plane *plane)
@@ -885,7 +902,7 @@ struct drm_plane *vc4_plane_init(struct drm_device *dev,
 	ret = drm_universal_plane_init(dev, plane, 0,
 				       &vc4_plane_funcs,
 				       formats, num_formats,
-				       type, NULL);
+				       NULL, type, NULL);
 
 	drm_plane_helper_add(plane, &vc4_plane_helper_funcs);
 
diff --git a/drivers/gpu/drm/vc4/vc4_regs.h b/drivers/gpu/drm/vc4/vc4_regs.h
index d382c34..55677bd 100644
--- a/drivers/gpu/drm/vc4/vc4_regs.h
+++ b/drivers/gpu/drm/vc4/vc4_regs.h
@@ -561,16 +561,129 @@
 # define VC4_HDMI_VERTB_VBP_MASK		VC4_MASK(8, 0)
 # define VC4_HDMI_VERTB_VBP_SHIFT		0
 
+#define VC4_HDMI_CEC_CNTRL_1			0x0e8
+/* Set when the transmission has ended. */
+# define VC4_HDMI_CEC_TX_EOM			BIT(31)
+/* If set, transmission was acked on the 1st or 2nd attempt (only one
+ * retry is attempted).  If in continuous mode, this means TX needs to
+ * be filled if !TX_EOM.
+ */
+# define VC4_HDMI_CEC_TX_STATUS_GOOD		BIT(30)
+# define VC4_HDMI_CEC_RX_EOM			BIT(29)
+# define VC4_HDMI_CEC_RX_STATUS_GOOD		BIT(28)
+/* Number of bytes received for the message. */
+# define VC4_HDMI_CEC_REC_WRD_CNT_MASK		VC4_MASK(27, 24)
+# define VC4_HDMI_CEC_REC_WRD_CNT_SHIFT		24
+/* Sets continuous receive mode.  Generates interrupt after each 8
+ * bytes to signal that RX_DATA should be consumed, and at RX_EOM.
+ *
+ * If disabled, maximum 16 bytes will be received (including header),
+ * and interrupt at RX_EOM.  Later bytes will be acked but not put
+ * into the RX_DATA.
+ */
+# define VC4_HDMI_CEC_RX_CONTINUE		BIT(23)
+# define VC4_HDMI_CEC_TX_CONTINUE		BIT(22)
+/* Set this after a CEC interrupt. */
+# define VC4_HDMI_CEC_CLEAR_RECEIVE_OFF		BIT(21)
+/* Starts a TX.  Will wait for appropriate idle time before CEC
+ * activity. Must be cleared in between transmits.
+ */
+# define VC4_HDMI_CEC_START_XMIT_BEGIN		BIT(20)
+# define VC4_HDMI_CEC_MESSAGE_LENGTH_MASK	VC4_MASK(19, 16)
+# define VC4_HDMI_CEC_MESSAGE_LENGTH_SHIFT	16
+/* Device's CEC address */
+# define VC4_HDMI_CEC_ADDR_MASK			VC4_MASK(15, 12)
+# define VC4_HDMI_CEC_ADDR_SHIFT		12
+/* Divides off of HSM clock to generate CEC bit clock. */
+/* With the current defaults the CEC bit clock is 40 kHz = 25 usec */
+# define VC4_HDMI_CEC_DIV_CLK_CNT_MASK		VC4_MASK(11, 0)
+# define VC4_HDMI_CEC_DIV_CLK_CNT_SHIFT		0
+
+/* Set each field to the number of bit clock cycles that make up the
+ * number of microseconds named by the field.
+ */
+#define VC4_HDMI_CEC_CNTRL_2			0x0ec
+# define VC4_HDMI_CEC_CNT_TO_1500_US_MASK	VC4_MASK(30, 24)
+# define VC4_HDMI_CEC_CNT_TO_1500_US_SHIFT	24
+# define VC4_HDMI_CEC_CNT_TO_1300_US_MASK	VC4_MASK(23, 17)
+# define VC4_HDMI_CEC_CNT_TO_1300_US_SHIFT	17
+# define VC4_HDMI_CEC_CNT_TO_800_US_MASK	VC4_MASK(16, 11)
+# define VC4_HDMI_CEC_CNT_TO_800_US_SHIFT	11
+# define VC4_HDMI_CEC_CNT_TO_600_US_MASK	VC4_MASK(10, 5)
+# define VC4_HDMI_CEC_CNT_TO_600_US_SHIFT	5
+# define VC4_HDMI_CEC_CNT_TO_400_US_MASK	VC4_MASK(4, 0)
+# define VC4_HDMI_CEC_CNT_TO_400_US_SHIFT	0
+
+#define VC4_HDMI_CEC_CNTRL_3			0x0f0
+# define VC4_HDMI_CEC_CNT_TO_2750_US_MASK	VC4_MASK(31, 24)
+# define VC4_HDMI_CEC_CNT_TO_2750_US_SHIFT	24
+# define VC4_HDMI_CEC_CNT_TO_2400_US_MASK	VC4_MASK(23, 16)
+# define VC4_HDMI_CEC_CNT_TO_2400_US_SHIFT	16
+# define VC4_HDMI_CEC_CNT_TO_2050_US_MASK	VC4_MASK(15, 8)
+# define VC4_HDMI_CEC_CNT_TO_2050_US_SHIFT	8
+# define VC4_HDMI_CEC_CNT_TO_1700_US_MASK	VC4_MASK(7, 0)
+# define VC4_HDMI_CEC_CNT_TO_1700_US_SHIFT	0
+
+#define VC4_HDMI_CEC_CNTRL_4			0x0f4
+# define VC4_HDMI_CEC_CNT_TO_4300_US_MASK	VC4_MASK(31, 24)
+# define VC4_HDMI_CEC_CNT_TO_4300_US_SHIFT	24
+# define VC4_HDMI_CEC_CNT_TO_3900_US_MASK	VC4_MASK(23, 16)
+# define VC4_HDMI_CEC_CNT_TO_3900_US_SHIFT	16
+# define VC4_HDMI_CEC_CNT_TO_3600_US_MASK	VC4_MASK(15, 8)
+# define VC4_HDMI_CEC_CNT_TO_3600_US_SHIFT	8
+# define VC4_HDMI_CEC_CNT_TO_3500_US_MASK	VC4_MASK(7, 0)
+# define VC4_HDMI_CEC_CNT_TO_3500_US_SHIFT	0
+
+#define VC4_HDMI_CEC_CNTRL_5			0x0f8
+# define VC4_HDMI_CEC_TX_SW_RESET		BIT(27)
+# define VC4_HDMI_CEC_RX_SW_RESET		BIT(26)
+# define VC4_HDMI_CEC_PAD_SW_RESET		BIT(25)
+# define VC4_HDMI_CEC_MUX_TP_OUT_CEC		BIT(24)
+# define VC4_HDMI_CEC_RX_CEC_INT		BIT(23)
+# define VC4_HDMI_CEC_CLK_PRELOAD_MASK		VC4_MASK(22, 16)
+# define VC4_HDMI_CEC_CLK_PRELOAD_SHIFT		16
+# define VC4_HDMI_CEC_CNT_TO_4700_US_MASK	VC4_MASK(15, 8)
+# define VC4_HDMI_CEC_CNT_TO_4700_US_SHIFT	8
+# define VC4_HDMI_CEC_CNT_TO_4500_US_MASK	VC4_MASK(7, 0)
+# define VC4_HDMI_CEC_CNT_TO_4500_US_SHIFT	0
+
+/* Transmit data, first byte is low byte of the 32-bit reg.  MSB of
+ * each byte transmitted first.
+ */
+#define VC4_HDMI_CEC_TX_DATA_1			0x0fc
+#define VC4_HDMI_CEC_TX_DATA_2			0x100
+#define VC4_HDMI_CEC_TX_DATA_3			0x104
+#define VC4_HDMI_CEC_TX_DATA_4			0x108
+#define VC4_HDMI_CEC_RX_DATA_1			0x10c
+#define VC4_HDMI_CEC_RX_DATA_2			0x110
+#define VC4_HDMI_CEC_RX_DATA_3			0x114
+#define VC4_HDMI_CEC_RX_DATA_4			0x118
+
 #define VC4_HDMI_TX_PHY_RESET_CTL		0x2c0
 
 #define VC4_HDMI_TX_PHY_CTL0			0x2c4
 # define VC4_HDMI_TX_PHY_RNG_PWRDN		BIT(25)
 
+/* Interrupt status bits */
+#define VC4_HDMI_CPU_STATUS			0x340
+#define VC4_HDMI_CPU_SET			0x344
+#define VC4_HDMI_CPU_CLEAR			0x348
+# define VC4_HDMI_CPU_CEC			BIT(6)
+# define VC4_HDMI_CPU_HOTPLUG			BIT(0)
+
+#define VC4_HDMI_CPU_MASK_STATUS		0x34c
+#define VC4_HDMI_CPU_MASK_SET			0x350
+#define VC4_HDMI_CPU_MASK_CLEAR			0x354
+
 #define VC4_HDMI_GCP(x)				(0x400 + ((x) * 0x4))
 #define VC4_HDMI_RAM_PACKET(x)			(0x400 + ((x) * 0x24))
 #define VC4_HDMI_PACKET_STRIDE			0x24
 
 #define VC4_HD_M_CTL				0x00c
+/* Debug: Current receive value on the CEC pad. */
+# define VC4_HD_CECRXD				BIT(9)
+/* Debug: Override CEC output to 0. */
+# define VC4_HD_CECOVR				BIT(8)
 # define VC4_HD_M_REGISTER_FILE_STANDBY		(3 << 6)
 # define VC4_HD_M_RAM_STANDBY			(3 << 4)
 # define VC4_HD_M_SW_RST			BIT(2)
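The CNT_TO_*_US fields defined above are filled by vc4_hdmi_cec_adap_enable() as `interval / usecs`, where `usecs` is the CEC bit clock period. The sketch below reproduces that arithmetic in user space for CEC_CNTRL_2 only. The helper names are hypothetical, and the shift values are copied from the vc4_regs.h hunk. Note that HSM_CLOCK_FREQ / CEC_CLOCK_FREQ evaluates to 4092 with integer division, while the bind code writes the literal 4091 into the divider field. The patch does not say why, so the sketch does not model the divider.

```c
#include <assert.h>
#include <stdint.h>

#define HSM_CLOCK_FREQ 163682864u /* value from the patch */
#define CEC_CLOCK_FREQ 40000u     /* value from the patch */

/* CEC_CNTRL_2 field shifts, copied from the vc4_regs.h hunk. */
#define CNT_TO_1500_US_SHIFT 24
#define CNT_TO_1300_US_SHIFT 17
#define CNT_TO_800_US_SHIFT  11
#define CNT_TO_600_US_SHIFT  5
#define CNT_TO_400_US_SHIFT  0

/* Hypothetical helper: bit clock period in whole microseconds. */
static uint32_t cec_bit_period_us(uint32_t cec_clock_hz)
{
	assert(cec_clock_hz > 0 && cec_clock_hz <= 1000000u);
	return 1000000u / cec_clock_hz;
}

/* Hypothetical helper: bit clock cycles in interval_us (rounds down). */
static uint32_t cec_cycles(uint32_t interval_us, uint32_t period_us)
{
	assert(period_us > 0);
	return interval_us / period_us;
}

/* Hypothetical helper: the value the driver would write to CEC_CNTRL_2. */
static uint32_t cec_cntrl2_value(uint32_t cec_clock_hz)
{
	uint32_t p = cec_bit_period_us(cec_clock_hz);

	return (cec_cycles(1500, p) << CNT_TO_1500_US_SHIFT) |
	       (cec_cycles(1300, p) << CNT_TO_1300_US_SHIFT) |
	       (cec_cycles(800, p) << CNT_TO_800_US_SHIFT) |
	       (cec_cycles(600, p) << CNT_TO_600_US_SHIFT) |
	       (cec_cycles(400, p) << CNT_TO_400_US_SHIFT);
}
```

With a 40 kHz bit clock the period is 25 microseconds, so each count is the interval divided by 25, and every count fits in the field width given by its VC4_MASK() definition.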
diff --git a/drivers/gpu/drm/vc4/vc4_render_cl.c b/drivers/gpu/drm/vc4/vc4_render_cl.c
index 5dc1942..273984f 100644
--- a/drivers/gpu/drm/vc4/vc4_render_cl.c
+++ b/drivers/gpu/drm/vc4/vc4_render_cl.c
@@ -261,8 +261,17 @@ static int vc4_create_rcl_bo(struct drm_device *dev, struct vc4_exec_info *exec,
 	uint8_t max_y_tile = args->max_y_tile;
 	uint8_t xtiles = max_x_tile - min_x_tile + 1;
 	uint8_t ytiles = max_y_tile - min_y_tile + 1;
-	uint8_t x, y;
+	uint8_t xi, yi;
 	uint32_t size, loop_body_size;
+	bool positive_x = true;
+	bool positive_y = true;
+
+	if (args->flags & VC4_SUBMIT_CL_FIXED_RCL_ORDER) {
+		if (!(args->flags & VC4_SUBMIT_CL_RCL_ORDER_INCREASING_X))
+			positive_x = false;
+		if (!(args->flags & VC4_SUBMIT_CL_RCL_ORDER_INCREASING_Y))
+			positive_y = false;
+	}
 
 	size = VC4_PACKET_TILE_RENDERING_MODE_CONFIG_SIZE;
 	loop_body_size = VC4_PACKET_TILE_COORDINATES_SIZE;
@@ -320,7 +329,7 @@ static int vc4_create_rcl_bo(struct drm_device *dev, struct vc4_exec_info *exec,
 
 	size += xtiles * ytiles * loop_body_size;
 
-	setup->rcl = &vc4_bo_create(dev, size, true)->base;
+	setup->rcl = &vc4_bo_create(dev, size, true, VC4_BO_TYPE_RCL)->base;
 	if (IS_ERR(setup->rcl))
 		return PTR_ERR(setup->rcl);
 	list_add_tail(&to_vc4_bo(&setup->rcl->base)->unref_head,
@@ -354,10 +363,12 @@ static int vc4_create_rcl_bo(struct drm_device *dev, struct vc4_exec_info *exec,
 	rcl_u16(setup, args->height);
 	rcl_u16(setup, args->color_write.bits);
 
-	for (y = min_y_tile; y <= max_y_tile; y++) {
-		for (x = min_x_tile; x <= max_x_tile; x++) {
-			bool first = (x == min_x_tile && y == min_y_tile);
-			bool last = (x == max_x_tile && y == max_y_tile);
+	for (yi = 0; yi < ytiles; yi++) {
+		int y = positive_y ? min_y_tile + yi : max_y_tile - yi;
+		for (xi = 0; xi < xtiles; xi++) {
+			int x = positive_x ? min_x_tile + xi : max_x_tile - xi;
+			bool first = (xi == 0 && yi == 0);
+			bool last = (xi == xtiles - 1 && yi == ytiles - 1);
 
 			emit_tile(exec, setup, x, y, first, last);
 		}
@@ -378,14 +389,14 @@ static int vc4_full_res_bounds_check(struct vc4_exec_info *exec,
 	u32 render_tiles_stride = DIV_ROUND_UP(exec->args->width, 32);
 
 	if (surf->offset > obj->base.size) {
-		DRM_ERROR("surface offset %d > BO size %zd\n",
+		DRM_DEBUG("surface offset %d > BO size %zd\n",
 			  surf->offset, obj->base.size);
 		return -EINVAL;
 	}
 
 	if ((obj->base.size - surf->offset) / VC4_TILE_BUFFER_SIZE <
 	    render_tiles_stride * args->max_y_tile + args->max_x_tile) {
-		DRM_ERROR("MSAA tile %d, %d out of bounds "
+		DRM_DEBUG("MSAA tile %d, %d out of bounds "
 			  "(bo size %zd, offset %d).\n",
 			  args->max_x_tile, args->max_y_tile,
 			  obj->base.size,
@@ -401,7 +412,7 @@ static int vc4_rcl_msaa_surface_setup(struct vc4_exec_info *exec,
 				      struct drm_vc4_submit_rcl_surface *surf)
 {
 	if (surf->flags != 0 || surf->bits != 0) {
-		DRM_ERROR("MSAA surface had nonzero flags/bits\n");
+		DRM_DEBUG("MSAA surface had nonzero flags/bits\n");
 		return -EINVAL;
 	}
 
@@ -415,7 +426,7 @@ static int vc4_rcl_msaa_surface_setup(struct vc4_exec_info *exec,
 	exec->rcl_write_bo[exec->rcl_write_bo_count++] = *obj;
 
 	if (surf->offset & 0xf) {
-		DRM_ERROR("MSAA write must be 16b aligned.\n");
+		DRM_DEBUG("MSAA write must be 16b aligned.\n");
 		return -EINVAL;
 	}
 
@@ -437,7 +448,7 @@ static int vc4_rcl_surface_setup(struct vc4_exec_info *exec,
 	int ret;
 
 	if (surf->flags & ~VC4_SUBMIT_RCL_SURFACE_READ_IS_FULL_RES) {
-		DRM_ERROR("Extra flags set\n");
+		DRM_DEBUG("Extra flags set\n");
 		return -EINVAL;
 	}
 
@@ -453,12 +464,12 @@ static int vc4_rcl_surface_setup(struct vc4_exec_info *exec,
 
 	if (surf->flags & VC4_SUBMIT_RCL_SURFACE_READ_IS_FULL_RES) {
 		if (surf == &exec->args->zs_write) {
-			DRM_ERROR("general zs write may not be a full-res.\n");
+			DRM_DEBUG("general zs write may not be a full-res.\n");
 			return -EINVAL;
 		}
 
 		if (surf->bits != 0) {
-			DRM_ERROR("load/store general bits set with "
+			DRM_DEBUG("load/store general bits set with "
 				  "full res load/store.\n");
 			return -EINVAL;
 		}
@@ -473,19 +484,19 @@ static int vc4_rcl_surface_setup(struct vc4_exec_info *exec,
 	if (surf->bits & ~(VC4_LOADSTORE_TILE_BUFFER_TILING_MASK |
 			   VC4_LOADSTORE_TILE_BUFFER_BUFFER_MASK |
 			   VC4_LOADSTORE_TILE_BUFFER_FORMAT_MASK)) {
-		DRM_ERROR("Unknown bits in load/store: 0x%04x\n",
+		DRM_DEBUG("Unknown bits in load/store: 0x%04x\n",
 			  surf->bits);
 		return -EINVAL;
 	}
 
 	if (tiling > VC4_TILING_FORMAT_LT) {
-		DRM_ERROR("Bad tiling format\n");
+		DRM_DEBUG("Bad tiling format\n");
 		return -EINVAL;
 	}
 
 	if (buffer == VC4_LOADSTORE_TILE_BUFFER_ZS) {
 		if (format != 0) {
-			DRM_ERROR("No color format should be set for ZS\n");
+			DRM_DEBUG("No color format should be set for ZS\n");
 			return -EINVAL;
 		}
 		cpp = 4;
@@ -499,16 +510,16 @@ static int vc4_rcl_surface_setup(struct vc4_exec_info *exec,
 			cpp = 4;
 			break;
 		default:
-			DRM_ERROR("Bad tile buffer format\n");
+			DRM_DEBUG("Bad tile buffer format\n");
 			return -EINVAL;
 		}
 	} else {
-		DRM_ERROR("Bad load/store buffer %d.\n", buffer);
+		DRM_DEBUG("Bad load/store buffer %d.\n", buffer);
 		return -EINVAL;
 	}
 
 	if (surf->offset & 0xf) {
-		DRM_ERROR("load/store buffer must be 16b aligned.\n");
+		DRM_DEBUG("load/store buffer must be 16b aligned.\n");
 		return -EINVAL;
 	}
 
@@ -533,7 +544,7 @@ vc4_rcl_render_config_surface_setup(struct vc4_exec_info *exec,
 	int cpp;
 
 	if (surf->flags != 0) {
-		DRM_ERROR("No flags supported on render config.\n");
+		DRM_DEBUG("No flags supported on render config.\n");
 		return -EINVAL;
 	}
 
@@ -541,7 +552,7 @@ vc4_rcl_render_config_surface_setup(struct vc4_exec_info *exec,
 			   VC4_RENDER_CONFIG_FORMAT_MASK |
 			   VC4_RENDER_CONFIG_MS_MODE_4X |
 			   VC4_RENDER_CONFIG_DECIMATE_MODE_4X)) {
-		DRM_ERROR("Unknown bits in render config: 0x%04x\n",
+		DRM_DEBUG("Unknown bits in render config: 0x%04x\n",
 			  surf->bits);
 		return -EINVAL;
 	}
@@ -556,7 +567,7 @@ vc4_rcl_render_config_surface_setup(struct vc4_exec_info *exec,
 	exec->rcl_write_bo[exec->rcl_write_bo_count++] = *obj;
 
 	if (tiling > VC4_TILING_FORMAT_LT) {
-		DRM_ERROR("Bad tiling format\n");
+		DRM_DEBUG("Bad tiling format\n");
 		return -EINVAL;
 	}
 
@@ -569,7 +580,7 @@ vc4_rcl_render_config_surface_setup(struct vc4_exec_info *exec,
 		cpp = 4;
 		break;
 	default:
-		DRM_ERROR("Bad tile buffer format\n");
+		DRM_DEBUG("Bad tile buffer format\n");
 		return -EINVAL;
 	}
 
@@ -590,7 +601,7 @@ int vc4_get_rcl(struct drm_device *dev, struct vc4_exec_info *exec)
 
 	if (args->min_x_tile > args->max_x_tile ||
 	    args->min_y_tile > args->max_y_tile) {
-		DRM_ERROR("Bad render tile set (%d,%d)-(%d,%d)\n",
+		DRM_DEBUG("Bad render tile set (%d,%d)-(%d,%d)\n",
 			  args->min_x_tile, args->min_y_tile,
 			  args->max_x_tile, args->max_y_tile);
 		return -EINVAL;
@@ -599,7 +610,7 @@ int vc4_get_rcl(struct drm_device *dev, struct vc4_exec_info *exec)
 	if (has_bin &&
 	    (args->max_x_tile > exec->bin_tiles_x ||
 	     args->max_y_tile > exec->bin_tiles_y)) {
-		DRM_ERROR("Render tiles (%d,%d) outside of bin config "
+		DRM_DEBUG("Render tiles (%d,%d) outside of bin config "
 			  "(%d,%d)\n",
 			  args->max_x_tile, args->max_y_tile,
 			  exec->bin_tiles_x, exec->bin_tiles_y);
@@ -642,7 +653,7 @@ int vc4_get_rcl(struct drm_device *dev, struct vc4_exec_info *exec)
 	 */
 	if (!setup.color_write && !setup.zs_write &&
 	    !setup.msaa_color_write && !setup.msaa_zs_write) {
-		DRM_ERROR("RCL requires color or Z/S write\n");
+		DRM_DEBUG("RCL requires color or Z/S write\n");
 		return -EINVAL;
 	}
 
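The vc4_create_rcl_bo() hunk above replaces the fixed ascending tile loop with index counters (xi, yi) that are mapped to tile coordinates in either direction, depending on the RCL order flags. The following user-space sketch shows that mapping on its own. The `struct tile` type and the `rcl_tile_order` function are hypothetical names for illustration, and the sketch uses `unsigned int` counters rather than the `uint8_t` ones in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: one tile coordinate in emission order. */
struct tile {
	int x;
	int y;
};

/*
 * Hypothetical helper: write the tile visiting order into out[] (at most
 * cap entries) and return the total number of tiles in the range.
 */
static size_t rcl_tile_order(uint8_t min_x, uint8_t max_x,
			     uint8_t min_y, uint8_t max_y,
			     bool positive_x, bool positive_y,
			     struct tile *out, size_t cap)
{
	unsigned int xtiles, ytiles, xi, yi;
	size_t n = 0;

	assert(min_x <= max_x && min_y <= max_y);
	xtiles = max_x - min_x + 1;
	ytiles = max_y - min_y + 1;

	for (yi = 0; yi < ytiles; yi++) {
		/* Walk rows upward or downward depending on the flag. */
		int y = positive_y ? min_y + (int)yi : max_y - (int)yi;

		for (xi = 0; xi < xtiles; xi++) {
			int x = positive_x ? min_x + (int)xi : max_x - (int)xi;

			if (n < cap) {
				out[n].x = x;
				out[n].y = y;
			}
			n++;
		}
	}
	return n;
}
```

Because "first" and "last" are derived from the index counters rather than the coordinates, the first emitted tile is always index (0, 0) whichever corner that maps to, which matches how the patch computes `first` and `last`.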
diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index 8c723da..622cd43 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -236,7 +236,8 @@ vc4_allocate_bin_bo(struct drm_device *drm)
 	INIT_LIST_HEAD(&list);
 
 	while (true) {
-		struct vc4_bo *bo = vc4_bo_create(drm, size, true);
+		struct vc4_bo *bo = vc4_bo_create(drm, size, true,
+						  VC4_BO_TYPE_BIN);
 
 		if (IS_ERR(bo)) {
 			ret = PTR_ERR(bo);
diff --git a/drivers/gpu/drm/vc4/vc4_validate.c b/drivers/gpu/drm/vc4/vc4_validate.c
index 814b512..2db485a 100644
--- a/drivers/gpu/drm/vc4/vc4_validate.c
+++ b/drivers/gpu/drm/vc4/vc4_validate.c
@@ -109,7 +109,7 @@ vc4_use_bo(struct vc4_exec_info *exec, uint32_t hindex)
 	struct vc4_bo *bo;
 
 	if (hindex >= exec->bo_count) {
-		DRM_ERROR("BO index %d greater than BO count %d\n",
+		DRM_DEBUG("BO index %d greater than BO count %d\n",
 			  hindex, exec->bo_count);
 		return NULL;
 	}
@@ -117,7 +117,7 @@ vc4_use_bo(struct vc4_exec_info *exec, uint32_t hindex)
 	bo = to_vc4_bo(&obj->base);
 
 	if (bo->validated_shader) {
-		DRM_ERROR("Trying to use shader BO as something other than "
+		DRM_DEBUG("Trying to use shader BO as something other than "
 			  "a shader\n");
 		return NULL;
 	}
@@ -172,7 +172,7 @@ vc4_check_tex_size(struct vc4_exec_info *exec, struct drm_gem_cma_object *fbo,
 	 * our math.
 	 */
 	if (width > 4096 || height > 4096) {
-		DRM_ERROR("Surface dimensions (%d,%d) too large",
+		DRM_DEBUG("Surface dimensions (%d,%d) too large",
 			  width, height);
 		return false;
 	}
@@ -191,7 +191,7 @@ vc4_check_tex_size(struct vc4_exec_info *exec, struct drm_gem_cma_object *fbo,
 		aligned_height = round_up(height, utile_h);
 		break;
 	default:
-		DRM_ERROR("buffer tiling %d unsupported\n", tiling_format);
+		DRM_DEBUG("buffer tiling %d unsupported\n", tiling_format);
 		return false;
 	}
 
@@ -200,7 +200,7 @@ vc4_check_tex_size(struct vc4_exec_info *exec, struct drm_gem_cma_object *fbo,
 
 	if (size + offset < size ||
 	    size + offset > fbo->base.size) {
-		DRM_ERROR("Overflow in %dx%d (%dx%d) fbo size (%d + %d > %zd)\n",
+		DRM_DEBUG("Overflow in %dx%d (%dx%d) fbo size (%d + %d > %zd)\n",
 			  width, height,
 			  aligned_width, aligned_height,
 			  size, offset, fbo->base.size);
@@ -214,7 +214,7 @@ static int
 validate_flush(VALIDATE_ARGS)
 {
 	if (!validate_bin_pos(exec, untrusted, exec->args->bin_cl_size - 1)) {
-		DRM_ERROR("Bin CL must end with VC4_PACKET_FLUSH\n");
+		DRM_DEBUG("Bin CL must end with VC4_PACKET_FLUSH\n");
 		return -EINVAL;
 	}
 	exec->found_flush = true;
@@ -226,13 +226,13 @@ static int
 validate_start_tile_binning(VALIDATE_ARGS)
 {
 	if (exec->found_start_tile_binning_packet) {
-		DRM_ERROR("Duplicate VC4_PACKET_START_TILE_BINNING\n");
+		DRM_DEBUG("Duplicate VC4_PACKET_START_TILE_BINNING\n");
 		return -EINVAL;
 	}
 	exec->found_start_tile_binning_packet = true;
 
 	if (!exec->found_tile_binning_mode_config_packet) {
-		DRM_ERROR("missing VC4_PACKET_TILE_BINNING_MODE_CONFIG\n");
+		DRM_DEBUG("missing VC4_PACKET_TILE_BINNING_MODE_CONFIG\n");
 		return -EINVAL;
 	}
 
@@ -243,7 +243,7 @@ static int
 validate_increment_semaphore(VALIDATE_ARGS)
 {
 	if (!validate_bin_pos(exec, untrusted, exec->args->bin_cl_size - 2)) {
-		DRM_ERROR("Bin CL must end with "
+		DRM_DEBUG("Bin CL must end with "
 			  "VC4_PACKET_INCREMENT_SEMAPHORE\n");
 		return -EINVAL;
 	}
@@ -264,7 +264,7 @@ validate_indexed_prim_list(VALIDATE_ARGS)
 
 	/* Check overflow condition */
 	if (exec->shader_state_count == 0) {
-		DRM_ERROR("shader state must precede primitives\n");
+		DRM_DEBUG("shader state must precede primitives\n");
 		return -EINVAL;
 	}
 	shader_state = &exec->shader_state[exec->shader_state_count - 1];
@@ -281,7 +281,7 @@ validate_indexed_prim_list(VALIDATE_ARGS)
 
 	if (offset > ib->base.size ||
 	    (ib->base.size - offset) / index_size < length) {
-		DRM_ERROR("IB access overflow (%d + %d*%d > %zd)\n",
+		DRM_DEBUG("IB access overflow (%d + %d*%d > %zd)\n",
 			  offset, length, index_size, ib->base.size);
 		return -EINVAL;
 	}
@@ -301,13 +301,13 @@ validate_gl_array_primitive(VALIDATE_ARGS)
 
 	/* Check overflow condition */
 	if (exec->shader_state_count == 0) {
-		DRM_ERROR("shader state must precede primitives\n");
+		DRM_DEBUG("shader state must precede primitives\n");
 		return -EINVAL;
 	}
 	shader_state = &exec->shader_state[exec->shader_state_count - 1];
 
 	if (length + base_index < length) {
-		DRM_ERROR("primitive vertex count overflow\n");
+		DRM_DEBUG("primitive vertex count overflow\n");
 		return -EINVAL;
 	}
 	max_index = length + base_index - 1;
@@ -324,7 +324,7 @@ validate_gl_shader_state(VALIDATE_ARGS)
 	uint32_t i = exec->shader_state_count++;
 
 	if (i >= exec->shader_state_size) {
-		DRM_ERROR("More requests for shader states than declared\n");
+		DRM_DEBUG("More requests for shader states than declared\n");
 		return -EINVAL;
 	}
 
@@ -332,7 +332,7 @@ validate_gl_shader_state(VALIDATE_ARGS)
 	exec->shader_state[i].max_index = 0;
 
 	if (exec->shader_state[i].addr & ~0xf) {
-		DRM_ERROR("high bits set in GL shader rec reference\n");
+		DRM_DEBUG("high bits set in GL shader rec reference\n");
 		return -EINVAL;
 	}
 
@@ -356,7 +356,7 @@ validate_tile_binning_config(VALIDATE_ARGS)
 	int bin_slot;
 
 	if (exec->found_tile_binning_mode_config_packet) {
-		DRM_ERROR("Duplicate VC4_PACKET_TILE_BINNING_MODE_CONFIG\n");
+		DRM_DEBUG("Duplicate VC4_PACKET_TILE_BINNING_MODE_CONFIG\n");
 		return -EINVAL;
 	}
 	exec->found_tile_binning_mode_config_packet = true;
@@ -368,14 +368,14 @@ validate_tile_binning_config(VALIDATE_ARGS)
 
 	if (exec->bin_tiles_x == 0 ||
 	    exec->bin_tiles_y == 0) {
-		DRM_ERROR("Tile binning config of %dx%d too small\n",
+		DRM_DEBUG("Tile binning config of %dx%d too small\n",
 			  exec->bin_tiles_x, exec->bin_tiles_y);
 		return -EINVAL;
 	}
 
 	if (flags & (VC4_BIN_CONFIG_DB_NON_MS |
 		     VC4_BIN_CONFIG_TILE_BUFFER_64BIT)) {
-		DRM_ERROR("unsupported binning config flags 0x%02x\n", flags);
+		DRM_DEBUG("unsupported binning config flags 0x%02x\n", flags);
 		return -EINVAL;
 	}
 
@@ -493,20 +493,20 @@ vc4_validate_bin_cl(struct drm_device *dev,
 		const struct cmd_info *info;
 
 		if (cmd >= ARRAY_SIZE(cmd_info)) {
-			DRM_ERROR("0x%08x: packet %d out of bounds\n",
+			DRM_DEBUG("0x%08x: packet %d out of bounds\n",
 				  src_offset, cmd);
 			return -EINVAL;
 		}
 
 		info = &cmd_info[cmd];
 		if (!info->name) {
-			DRM_ERROR("0x%08x: packet %d invalid\n",
+			DRM_DEBUG("0x%08x: packet %d invalid\n",
 				  src_offset, cmd);
 			return -EINVAL;
 		}
 
 		if (src_offset + info->len > len) {
-			DRM_ERROR("0x%08x: packet %d (%s) length 0x%08x "
+			DRM_DEBUG("0x%08x: packet %d (%s) length 0x%08x "
 				  "exceeds bounds (0x%08x)\n",
 				  src_offset, cmd, info->name, info->len,
 				  src_offset + len);
@@ -519,7 +519,7 @@ vc4_validate_bin_cl(struct drm_device *dev,
 		if (info->func && info->func(exec,
 					     dst_pkt + 1,
 					     src_pkt + 1)) {
-			DRM_ERROR("0x%08x: packet %d (%s) failed to validate\n",
+			DRM_DEBUG("0x%08x: packet %d (%s) failed to validate\n",
 				  src_offset, cmd, info->name);
 			return -EINVAL;
 		}
@@ -537,7 +537,7 @@ vc4_validate_bin_cl(struct drm_device *dev,
 	exec->ct0ea = exec->ct0ca + dst_offset;
 
 	if (!exec->found_start_tile_binning_packet) {
-		DRM_ERROR("Bin CL missing VC4_PACKET_START_TILE_BINNING\n");
+		DRM_DEBUG("Bin CL missing VC4_PACKET_START_TILE_BINNING\n");
 		return -EINVAL;
 	}
 
@@ -549,7 +549,7 @@ vc4_validate_bin_cl(struct drm_device *dev,
 	 * semaphore increment.
 	 */
 	if (!exec->found_increment_semaphore_packet || !exec->found_flush) {
-		DRM_ERROR("Bin CL missing VC4_PACKET_INCREMENT_SEMAPHORE + "
+		DRM_DEBUG("Bin CL missing VC4_PACKET_INCREMENT_SEMAPHORE + "
 			  "VC4_PACKET_FLUSH\n");
 		return -EINVAL;
 	}
@@ -588,11 +588,11 @@ reloc_tex(struct vc4_exec_info *exec,
 		uint32_t remaining_size = tex->base.size - p0;
 
 		if (p0 > tex->base.size - 4) {
-			DRM_ERROR("UBO offset greater than UBO size\n");
+			DRM_DEBUG("UBO offset greater than UBO size\n");
 			goto fail;
 		}
 		if (p1 > remaining_size - 4) {
-			DRM_ERROR("UBO clamp would allow reads "
+			DRM_DEBUG("UBO clamp would allow reads "
 				  "outside of UBO\n");
 			goto fail;
 		}
@@ -612,14 +612,14 @@ reloc_tex(struct vc4_exec_info *exec,
 		if (VC4_GET_FIELD(p3, VC4_TEX_P2_PTYPE) ==
 		    VC4_TEX_P2_PTYPE_CUBE_MAP_STRIDE) {
 			if (cube_map_stride) {
-				DRM_ERROR("Cube map stride set twice\n");
+				DRM_DEBUG("Cube map stride set twice\n");
 				goto fail;
 			}
 
 			cube_map_stride = p3 & VC4_TEX_P2_CMST_MASK;
 		}
 		if (!cube_map_stride) {
-			DRM_ERROR("Cube map stride not set\n");
+			DRM_DEBUG("Cube map stride not set\n");
 			goto fail;
 		}
 	}
@@ -660,7 +660,7 @@ reloc_tex(struct vc4_exec_info *exec,
 	case VC4_TEXTURE_TYPE_RGBA64:
 	case VC4_TEXTURE_TYPE_YUV422R:
 	default:
-		DRM_ERROR("Texture format %d unsupported\n", type);
+		DRM_DEBUG("Texture format %d unsupported\n", type);
 		goto fail;
 	}
 	utile_w = utile_width(cpp);
@@ -713,7 +713,7 @@ reloc_tex(struct vc4_exec_info *exec,
 		level_size = aligned_width * cpp * aligned_height;
 
 		if (offset < level_size) {
-			DRM_ERROR("Level %d (%dx%d -> %dx%d) size %db "
+			DRM_DEBUG("Level %d (%dx%d -> %dx%d) size %db "
 				  "overflowed buffer bounds (offset %d)\n",
 				  i, level_width, level_height,
 				  aligned_width, aligned_height,
@@ -764,7 +764,7 @@ validate_gl_shader_rec(struct drm_device *dev,
 
 	nr_relocs = ARRAY_SIZE(shader_reloc_offsets) + nr_attributes;
 	if (nr_relocs * 4 > exec->shader_rec_size) {
-		DRM_ERROR("overflowed shader recs reading %d handles "
+		DRM_DEBUG("overflowed shader recs reading %d handles "
 			  "from %d bytes left\n",
 			  nr_relocs, exec->shader_rec_size);
 		return -EINVAL;
@@ -774,7 +774,7 @@ validate_gl_shader_rec(struct drm_device *dev,
 	exec->shader_rec_size -= nr_relocs * 4;
 
 	if (packet_size > exec->shader_rec_size) {
-		DRM_ERROR("overflowed shader recs copying %db packet "
+		DRM_DEBUG("overflowed shader recs copying %db packet "
 			  "from %d bytes left\n",
 			  packet_size, exec->shader_rec_size);
 		return -EINVAL;
@@ -794,7 +794,7 @@ validate_gl_shader_rec(struct drm_device *dev,
 
 	for (i = 0; i < shader_reloc_count; i++) {
 		if (src_handles[i] > exec->bo_count) {
-			DRM_ERROR("Shader handle %d too big\n", src_handles[i]);
+			DRM_DEBUG("Shader handle %d too big\n", src_handles[i]);
 			return -EINVAL;
 		}
 
@@ -810,13 +810,13 @@ validate_gl_shader_rec(struct drm_device *dev,
 
 	if (((*(uint16_t *)pkt_u & VC4_SHADER_FLAG_FS_SINGLE_THREAD) == 0) !=
 	    to_vc4_bo(&bo[0]->base)->validated_shader->is_threaded) {
-		DRM_ERROR("Thread mode of CL and FS do not match\n");
+		DRM_DEBUG("Thread mode of CL and FS do not match\n");
 		return -EINVAL;
 	}
 
 	if (to_vc4_bo(&bo[1]->base)->validated_shader->is_threaded ||
 	    to_vc4_bo(&bo[2]->base)->validated_shader->is_threaded) {
-		DRM_ERROR("cs and vs cannot be threaded\n");
+		DRM_DEBUG("cs and vs cannot be threaded\n");
 		return -EINVAL;
 	}
 
@@ -831,7 +831,7 @@ validate_gl_shader_rec(struct drm_device *dev,
 		*(uint32_t *)(pkt_v + o) = bo[i]->paddr + src_offset;
 
 		if (src_offset != 0) {
-			DRM_ERROR("Shaders must be at offset 0 of "
+			DRM_DEBUG("Shaders must be at offset 0 of "
 				  "the BO.\n");
 			return -EINVAL;
 		}
@@ -842,7 +842,7 @@ validate_gl_shader_rec(struct drm_device *dev,
 
 		if (validated_shader->uniforms_src_size >
 		    exec->uniforms_size) {
-			DRM_ERROR("Uniforms src buffer overflow\n");
+			DRM_DEBUG("Uniforms src buffer overflow\n");
 			return -EINVAL;
 		}
 
@@ -900,7 +900,7 @@ validate_gl_shader_rec(struct drm_device *dev,
 
 		if (vbo->base.size < offset ||
 		    vbo->base.size - offset < attr_size) {
-			DRM_ERROR("BO offset overflow (%d + %d > %zu)\n",
+			DRM_DEBUG("BO offset overflow (%d + %d > %zu)\n",
 				  offset, attr_size, vbo->base.size);
 			return -EINVAL;
 		}
@@ -909,7 +909,7 @@ validate_gl_shader_rec(struct drm_device *dev,
 			max_index = ((vbo->base.size - offset - attr_size) /
 				     stride);
 			if (state->max_index > max_index) {
-				DRM_ERROR("primitives use index %d out of "
+				DRM_DEBUG("primitives use index %d out of "
 					  "supplied %d\n",
 					  state->max_index, max_index);
 				return -EINVAL;
diff --git a/drivers/gpu/drm/vc4/vc4_validate_shaders.c b/drivers/gpu/drm/vc4/vc4_validate_shaders.c
index 0b2df5c..d3f15bf 100644
--- a/drivers/gpu/drm/vc4/vc4_validate_shaders.c
+++ b/drivers/gpu/drm/vc4/vc4_validate_shaders.c
@@ -200,7 +200,7 @@ check_tmu_write(struct vc4_validated_shader_info *validated_shader,
 		uint32_t clamp_reg, clamp_offset;
 
 		if (sig == QPU_SIG_SMALL_IMM) {
-			DRM_ERROR("direct TMU read used small immediate\n");
+			DRM_DEBUG("direct TMU read used small immediate\n");
 			return false;
 		}
 
@@ -209,7 +209,7 @@ check_tmu_write(struct vc4_validated_shader_info *validated_shader,
 		 */
 		if (is_mul ||
 		    QPU_GET_FIELD(inst, QPU_OP_ADD) != QPU_A_ADD) {
-			DRM_ERROR("direct TMU load wasn't an add\n");
+			DRM_DEBUG("direct TMU load wasn't an add\n");
 			return false;
 		}
 
@@ -220,13 +220,13 @@ check_tmu_write(struct vc4_validated_shader_info *validated_shader,
 		 */
 		clamp_reg = raddr_add_a_to_live_reg_index(inst);
 		if (clamp_reg == ~0) {
-			DRM_ERROR("direct TMU load wasn't clamped\n");
+			DRM_DEBUG("direct TMU load wasn't clamped\n");
 			return false;
 		}
 
 		clamp_offset = validation_state->live_min_clamp_offsets[clamp_reg];
 		if (clamp_offset == ~0) {
-			DRM_ERROR("direct TMU load wasn't clamped\n");
+			DRM_DEBUG("direct TMU load wasn't clamped\n");
 			return false;
 		}
 
@@ -238,7 +238,7 @@ check_tmu_write(struct vc4_validated_shader_info *validated_shader,
 
 		if (!(add_b == QPU_MUX_A && raddr_a == QPU_R_UNIF) &&
 		    !(add_b == QPU_MUX_B && raddr_b == QPU_R_UNIF)) {
-			DRM_ERROR("direct TMU load didn't add to a uniform\n");
+			DRM_DEBUG("direct TMU load didn't add to a uniform\n");
 			return false;
 		}
 
@@ -246,14 +246,14 @@ check_tmu_write(struct vc4_validated_shader_info *validated_shader,
 	} else {
 		if (raddr_a == QPU_R_UNIF || (sig != QPU_SIG_SMALL_IMM &&
 					      raddr_b == QPU_R_UNIF)) {
-			DRM_ERROR("uniform read in the same instruction as "
+			DRM_DEBUG("uniform read in the same instruction as "
 				  "texture setup.\n");
 			return false;
 		}
 	}
 
 	if (validation_state->tmu_write_count[tmu] >= 4) {
-		DRM_ERROR("TMU%d got too many parameters before dispatch\n",
+		DRM_DEBUG("TMU%d got too many parameters before dispatch\n",
 			  tmu);
 		return false;
 	}
@@ -265,7 +265,7 @@ check_tmu_write(struct vc4_validated_shader_info *validated_shader,
 	 */
 	if (!is_direct) {
 		if (validation_state->needs_uniform_address_update) {
-			DRM_ERROR("Texturing with undefined uniform address\n");
+			DRM_DEBUG("Texturing with undefined uniform address\n");
 			return false;
 		}
 
@@ -336,35 +336,35 @@ validate_uniform_address_write(struct vc4_validated_shader_info *validated_shade
 	case QPU_SIG_LOAD_TMU1:
 		break;
 	default:
-		DRM_ERROR("uniforms address change must be "
+		DRM_DEBUG("uniforms address change must be "
 			  "normal math\n");
 		return false;
 	}
 
 	if (is_mul || QPU_GET_FIELD(inst, QPU_OP_ADD) != QPU_A_ADD) {
-		DRM_ERROR("Uniform address reset must be an ADD.\n");
+		DRM_DEBUG("Uniform address reset must be an ADD.\n");
 		return false;
 	}
 
 	if (QPU_GET_FIELD(inst, QPU_COND_ADD) != QPU_COND_ALWAYS) {
-		DRM_ERROR("Uniform address reset must be unconditional.\n");
+		DRM_DEBUG("Uniform address reset must be unconditional.\n");
 		return false;
 	}
 
 	if (QPU_GET_FIELD(inst, QPU_PACK) != QPU_PACK_A_NOP &&
 	    !(inst & QPU_PM)) {
-		DRM_ERROR("No packing allowed on uniforms reset\n");
+		DRM_DEBUG("No packing allowed on uniforms reset\n");
 		return false;
 	}
 
 	if (add_lri == -1) {
-		DRM_ERROR("First argument of uniform address write must be "
+		DRM_DEBUG("First argument of uniform address write must be "
 			  "an immediate value.\n");
 		return false;
 	}
 
 	if (validation_state->live_immediates[add_lri] != expected_offset) {
-		DRM_ERROR("Resetting uniforms with offset %db instead of %db\n",
+		DRM_DEBUG("Resetting uniforms with offset %db instead of %db\n",
 			  validation_state->live_immediates[add_lri],
 			  expected_offset);
 		return false;
@@ -372,7 +372,7 @@ validate_uniform_address_write(struct vc4_validated_shader_info *validated_shade
 
 	if (!(add_b == QPU_MUX_A && raddr_a == QPU_R_UNIF) &&
 	    !(add_b == QPU_MUX_B && raddr_b == QPU_R_UNIF)) {
-		DRM_ERROR("Second argument of uniform address write must be "
+		DRM_DEBUG("Second argument of uniform address write must be "
 			  "a uniform.\n");
 		return false;
 	}
@@ -417,7 +417,7 @@ check_reg_write(struct vc4_validated_shader_info *validated_shader,
 	switch (waddr) {
 	case QPU_W_UNIFORMS_ADDRESS:
 		if (is_b) {
-			DRM_ERROR("relative uniforms address change "
+			DRM_DEBUG("relative uniforms address change "
 				  "unsupported\n");
 			return false;
 		}
@@ -452,11 +452,11 @@ check_reg_write(struct vc4_validated_shader_info *validated_shader,
 		/* XXX: I haven't thought about these, so don't support them
 		 * for now.
 		 */
-		DRM_ERROR("Unsupported waddr %d\n", waddr);
+		DRM_DEBUG("Unsupported waddr %d\n", waddr);
 		return false;
 
 	case QPU_W_VPM_ADDR:
-		DRM_ERROR("General VPM DMA unsupported\n");
+		DRM_DEBUG("General VPM DMA unsupported\n");
 		return false;
 
 	case QPU_W_VPM:
@@ -559,7 +559,7 @@ check_instruction_writes(struct vc4_validated_shader_info *validated_shader,
 	bool ok;
 
 	if (is_tmu_write(waddr_add) && is_tmu_write(waddr_mul)) {
-		DRM_ERROR("ADD and MUL both set up textures\n");
+		DRM_DEBUG("ADD and MUL both set up textures\n");
 		return false;
 	}
 
@@ -588,7 +588,7 @@ check_branch(uint64_t inst,
 	 * there's no need for it.
 	 */
 	if (waddr_add != QPU_W_NOP || waddr_mul != QPU_W_NOP) {
-		DRM_ERROR("branch instruction at %d wrote a register.\n",
+		DRM_DEBUG("branch instruction at %d wrote a register.\n",
 			  validation_state->ip);
 		return false;
 	}
@@ -614,7 +614,7 @@ check_instruction_reads(struct vc4_validated_shader_info *validated_shader,
 		validated_shader->uniforms_size += 4;
 
 		if (validation_state->needs_uniform_address_update) {
-			DRM_ERROR("Uniform read with undefined uniform "
+			DRM_DEBUG("Uniform read with undefined uniform "
 				  "address\n");
 			return false;
 		}
@@ -660,19 +660,19 @@ vc4_validate_branches(struct vc4_shader_validation_state *validation_state)
 			continue;
 
 		if (ip - last_branch < 4) {
-			DRM_ERROR("Branch at %d during delay slots\n", ip);
+			DRM_DEBUG("Branch at %d during delay slots\n", ip);
 			return false;
 		}
 		last_branch = ip;
 
 		if (inst & QPU_BRANCH_REG) {
-			DRM_ERROR("branching from register relative "
+			DRM_DEBUG("branching from register relative "
 				  "not supported\n");
 			return false;
 		}
 
 		if (!(inst & QPU_BRANCH_REL)) {
-			DRM_ERROR("relative branching required\n");
+			DRM_DEBUG("relative branching required\n");
 			return false;
 		}
 
@@ -682,13 +682,13 @@ vc4_validate_branches(struct vc4_shader_validation_state *validation_state)
 		 * end of the shader object.
 		 */
 		if (branch_imm % sizeof(inst) != 0) {
-			DRM_ERROR("branch target not aligned\n");
+			DRM_DEBUG("branch target not aligned\n");
 			return false;
 		}
 
 		branch_target_ip = after_delay_ip + (branch_imm >> 3);
 		if (branch_target_ip >= validation_state->max_ip) {
-			DRM_ERROR("Branch at %d outside of shader (ip %d/%d)\n",
+			DRM_DEBUG("Branch at %d outside of shader (ip %d/%d)\n",
 				  ip, branch_target_ip,
 				  validation_state->max_ip);
 			return false;
@@ -699,7 +699,7 @@ vc4_validate_branches(struct vc4_shader_validation_state *validation_state)
 		 * the shader.
 		 */
 		if (after_delay_ip >= validation_state->max_ip) {
-			DRM_ERROR("Branch at %d continues past shader end "
+			DRM_DEBUG("Branch at %d continues past shader end "
 				  "(%d/%d)\n",
 				  ip, after_delay_ip, validation_state->max_ip);
 			return false;
@@ -709,7 +709,7 @@ vc4_validate_branches(struct vc4_shader_validation_state *validation_state)
 	}
 
 	if (max_branch_target > validation_state->max_ip - 3) {
-		DRM_ERROR("Branch landed after QPU_SIG_PROG_END");
+		DRM_DEBUG("Branch landed after QPU_SIG_PROG_END");
 		return false;
 	}
 
@@ -750,7 +750,7 @@ vc4_handle_branch_target(struct vc4_shader_validation_state *validation_state)
 		return true;
 
 	if (texturing_in_progress(validation_state)) {
-		DRM_ERROR("Branch target landed during TMU setup\n");
+		DRM_DEBUG("Branch target landed during TMU setup\n");
 		return false;
 	}
 
@@ -837,7 +837,7 @@ vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
 		case QPU_SIG_LAST_THREAD_SWITCH:
 			if (!check_instruction_writes(validated_shader,
 						      &validation_state)) {
-				DRM_ERROR("Bad write at ip %d\n", ip);
+				DRM_DEBUG("Bad write at ip %d\n", ip);
 				goto fail;
 			}
 
@@ -855,7 +855,7 @@ vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
 				validated_shader->is_threaded = true;
 
 				if (ip < last_thread_switch_ip + 3) {
-					DRM_ERROR("Thread switch too soon after "
+					DRM_DEBUG("Thread switch too soon after "
 						  "last switch at ip %d\n", ip);
 					goto fail;
 				}
@@ -867,7 +867,7 @@ vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
 		case QPU_SIG_LOAD_IMM:
 			if (!check_instruction_writes(validated_shader,
 						      &validation_state)) {
-				DRM_ERROR("Bad LOAD_IMM write at ip %d\n", ip);
+				DRM_DEBUG("Bad LOAD_IMM write at ip %d\n", ip);
 				goto fail;
 			}
 			break;
@@ -878,14 +878,14 @@ vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
 				goto fail;
 
 			if (ip < last_thread_switch_ip + 3) {
-				DRM_ERROR("Branch in thread switch at ip %d",
+				DRM_DEBUG("Branch in thread switch at ip %d",
 					  ip);
 				goto fail;
 			}
 
 			break;
 		default:
-			DRM_ERROR("Unsupported QPU signal %d at "
+			DRM_DEBUG("Unsupported QPU signal %d at "
 				  "instruction %d\n", sig, ip);
 			goto fail;
 		}
@@ -898,7 +898,7 @@ vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
 	}
 
 	if (ip == validation_state.max_ip) {
-		DRM_ERROR("shader failed to terminate before "
+		DRM_DEBUG("shader failed to terminate before "
 			  "shader BO end at %zd\n",
 			  shader_obj->base.size);
 		goto fail;
@@ -907,7 +907,7 @@ vc4_validate_shader(struct drm_gem_cma_object *shader_obj)
 	/* Might corrupt other thread */
 	if (validated_shader->is_threaded &&
 	    validation_state.all_registers_used) {
-		DRM_ERROR("Shader uses threading, but uses the upper "
+		DRM_DEBUG("Shader uses threading, but uses the upper "
 			  "half of the registers, too\n");
 		goto fail;
 	}
diff --git a/drivers/gpu/drm/vc4/vc4_vec.c b/drivers/gpu/drm/vc4/vc4_vec.c
index 09c1e05..3a9a302 100644
--- a/drivers/gpu/drm/vc4/vc4_vec.c
+++ b/drivers/gpu/drm/vc4/vc4_vec.c
@@ -366,10 +366,8 @@ static int vc4_vec_connector_get_modes(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs vc4_vec_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = vc4_vec_connector_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
-	.set_property = drm_atomic_helper_connector_set_property,
 	.destroy = vc4_vec_connector_destroy,
 	.reset = drm_atomic_helper_connector_reset,
 	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
diff --git a/drivers/gpu/drm/vgem/vgem_drv.c b/drivers/gpu/drm/vgem/vgem_drv.c
index 18f401b..2524ff1 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.c
+++ b/drivers/gpu/drm/vgem/vgem_drv.c
@@ -52,6 +52,7 @@ static void vgem_gem_free_object(struct drm_gem_object *obj)
 	struct drm_vgem_gem_object *vgem_obj = to_vgem_bo(obj);
 
 	kvfree(vgem_obj->pages);
+	mutex_destroy(&vgem_obj->pages_lock);
 
 	if (obj->import_attach)
 		drm_prime_gem_destroy(obj, vgem_obj->table);
@@ -76,11 +77,15 @@ static int vgem_gem_fault(struct vm_fault *vmf)
 	if (page_offset > num_pages)
 		return VM_FAULT_SIGBUS;
 
+	ret = -ENOENT;
+	mutex_lock(&obj->pages_lock);
 	if (obj->pages) {
 		get_page(obj->pages[page_offset]);
 		vmf->page = obj->pages[page_offset];
 		ret = 0;
-	} else {
+	}
+	mutex_unlock(&obj->pages_lock);
+	if (ret) {
 		struct page *page;
 
 		page = shmem_read_mapping_page(
@@ -161,6 +166,8 @@ static struct drm_vgem_gem_object *__vgem_gem_create(struct drm_device *dev,
 		return ERR_PTR(ret);
 	}
 
+	mutex_init(&obj->pages_lock);
+
 	return obj;
 }
 
@@ -183,7 +190,7 @@ static struct drm_gem_object *vgem_gem_create(struct drm_device *dev,
 		return ERR_CAST(obj);
 
 	ret = drm_gem_handle_create(file, &obj->base, handle);
-	drm_gem_object_unreference_unlocked(&obj->base);
+	drm_gem_object_put_unlocked(&obj->base);
 	if (ret)
 		goto err;
 
@@ -238,7 +245,7 @@ static int vgem_gem_dumb_map(struct drm_file *file, struct drm_device *dev,
 
 	*offset = drm_vma_node_offset_addr(&obj->vma_node);
 unref:
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return ret;
 }
@@ -271,40 +278,70 @@ static const struct file_operations vgem_driver_fops = {
 	.poll		= drm_poll,
 	.read		= drm_read,
 	.unlocked_ioctl = drm_ioctl,
+	.compat_ioctl	= drm_compat_ioctl,
 	.release	= drm_release,
 };
 
+static struct page **vgem_pin_pages(struct drm_vgem_gem_object *bo)
+{
+	mutex_lock(&bo->pages_lock);
+	if (bo->pages_pin_count++ == 0) {
+		struct page **pages;
+
+		pages = drm_gem_get_pages(&bo->base);
+		if (IS_ERR(pages)) {
+			bo->pages_pin_count--;
+			mutex_unlock(&bo->pages_lock);
+			return pages;
+		}
+
+		bo->pages = pages;
+	}
+	mutex_unlock(&bo->pages_lock);
+
+	return bo->pages;
+}
+
+static void vgem_unpin_pages(struct drm_vgem_gem_object *bo)
+{
+	mutex_lock(&bo->pages_lock);
+	if (--bo->pages_pin_count == 0) {
+		drm_gem_put_pages(&bo->base, bo->pages, true, true);
+		bo->pages = NULL;
+	}
+	mutex_unlock(&bo->pages_lock);
+}
+
 static int vgem_prime_pin(struct drm_gem_object *obj)
 {
+	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
 	long n_pages = obj->size >> PAGE_SHIFT;
 	struct page **pages;
 
+	pages = vgem_pin_pages(bo);
+	if (IS_ERR(pages))
+		return PTR_ERR(pages);
+
 	/* Flush the object from the CPU cache so that importers can rely
 	 * on coherent indirect access via the exported dma-address.
 	 */
-	pages = drm_gem_get_pages(obj);
-	if (IS_ERR(pages))
-		return PTR_ERR(pages);
-
 	drm_clflush_pages(pages, n_pages);
-	drm_gem_put_pages(obj, pages, true, false);
 
 	return 0;
 }
 
+static void vgem_prime_unpin(struct drm_gem_object *obj)
+{
+	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
+
+	vgem_unpin_pages(bo);
+}
+
 static struct sg_table *vgem_prime_get_sg_table(struct drm_gem_object *obj)
 {
-	struct sg_table *st;
-	struct page **pages;
+	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
 
-	pages = drm_gem_get_pages(obj);
-	if (IS_ERR(pages))
-		return ERR_CAST(pages);
-
-	st = drm_prime_pages_to_sg(pages, obj->size >> PAGE_SHIFT);
-	drm_gem_put_pages(obj, pages, false, false);
-
-	return st;
+	return drm_prime_pages_to_sg(bo->pages, bo->base.size >> PAGE_SHIFT);
 }
 
 static struct drm_gem_object* vgem_prime_import(struct drm_device *dev,
@@ -333,6 +370,8 @@ static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
 		__vgem_gem_destroy(obj);
 		return ERR_PTR(-ENOMEM);
 	}
+
+	obj->pages_pin_count++; /* perma-pinned */
 	drm_prime_sg_to_page_addr_arrays(obj->table, obj->pages, NULL,
 					npages);
 	return &obj->base;
@@ -340,23 +379,23 @@ static struct drm_gem_object *vgem_prime_import_sg_table(struct drm_device *dev,
 
 static void *vgem_prime_vmap(struct drm_gem_object *obj)
 {
+	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
 	long n_pages = obj->size >> PAGE_SHIFT;
 	struct page **pages;
-	void *addr;
 
-	pages = drm_gem_get_pages(obj);
+	pages = vgem_pin_pages(bo);
 	if (IS_ERR(pages))
 		return NULL;
 
-	addr = vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
-	drm_gem_put_pages(obj, pages, false, false);
-
-	return addr;
+	return vmap(pages, n_pages, 0, pgprot_writecombine(PAGE_KERNEL));
 }
 
 static void vgem_prime_vunmap(struct drm_gem_object *obj, void *vaddr)
 {
+	struct drm_vgem_gem_object *bo = to_vgem_bo(obj);
+
 	vunmap(vaddr);
+	vgem_unpin_pages(bo);
 }
 
 static int vgem_prime_mmap(struct drm_gem_object *obj,
@@ -409,6 +448,7 @@ static struct drm_driver vgem_driver = {
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_pin = vgem_prime_pin,
+	.gem_prime_unpin = vgem_prime_unpin,
 	.gem_prime_import = vgem_prime_import,
 	.gem_prime_export = drm_gem_prime_export,
 	.gem_prime_import_sg_table = vgem_prime_import_sg_table,
diff --git a/drivers/gpu/drm/vgem/vgem_drv.h b/drivers/gpu/drm/vgem/vgem_drv.h
index 1aae014..5c8f6d6 100644
--- a/drivers/gpu/drm/vgem/vgem_drv.h
+++ b/drivers/gpu/drm/vgem/vgem_drv.h
@@ -43,7 +43,11 @@ struct vgem_file {
 #define to_vgem_bo(x) container_of(x, struct drm_vgem_gem_object, base)
 struct drm_vgem_gem_object {
 	struct drm_gem_object base;
+
 	struct page **pages;
+	unsigned int pages_pin_count;
+	struct mutex pages_lock;
+
 	struct sg_table *table;
 };
 
diff --git a/drivers/gpu/drm/vgem/vgem_fence.c b/drivers/gpu/drm/vgem/vgem_fence.c
index 3109c83..8fd52f2 100644
--- a/drivers/gpu/drm/vgem/vgem_fence.c
+++ b/drivers/gpu/drm/vgem/vgem_fence.c
@@ -213,7 +213,7 @@ int vgem_fence_attach_ioctl(struct drm_device *dev,
 		dma_fence_put(fence);
 	}
 err:
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/via/via_drv.c b/drivers/gpu/drm/via/via_drv.c
index 9e0e539..aaf766f 100644
--- a/drivers/gpu/drm/via/via_drv.c
+++ b/drivers/gpu/drm/via/via_drv.c
@@ -77,7 +77,6 @@ static struct drm_driver driver = {
 	.open = via_driver_open,
 	.preclose = via_reclaim_buffers_locked,
 	.postclose = via_driver_postclose,
-	.set_busid = drm_pci_set_busid,
 	.context_dtor = via_final_context,
 	.get_vblank_counter = via_get_vblank_counter,
 	.enable_vblank = via_enable_vblank,
@@ -107,12 +106,12 @@ static int __init via_init(void)
 {
 	driver.num_ioctls = via_max_ioctl;
 	via_init_command_verifier();
-	return drm_pci_init(&driver, &via_pci_driver);
+	return drm_legacy_pci_init(&driver, &via_pci_driver);
 }
 
 static void __exit via_exit(void)
 {
-	drm_pci_exit(&driver, &via_pci_driver);
+	drm_legacy_pci_exit(&driver, &via_pci_driver);
 }
 
 module_init(via_init);
diff --git a/drivers/gpu/drm/virtio/virtgpu_display.c b/drivers/gpu/drm/virtio/virtgpu_display.c
index d51bd45..b6d5205 100644
--- a/drivers/gpu/drm/virtio/virtgpu_display.c
+++ b/drivers/gpu/drm/virtio/virtgpu_display.c
@@ -113,11 +113,13 @@ static void virtio_gpu_crtc_mode_set_nofb(struct drm_crtc *crtc)
 				   crtc->mode.vdisplay, 0, 0);
 }
 
-static void virtio_gpu_crtc_enable(struct drm_crtc *crtc)
+static void virtio_gpu_crtc_atomic_enable(struct drm_crtc *crtc,
+					  struct drm_crtc_state *old_state)
 {
 }
 
-static void virtio_gpu_crtc_disable(struct drm_crtc *crtc)
+static void virtio_gpu_crtc_atomic_disable(struct drm_crtc *crtc,
+					   struct drm_crtc_state *old_state)
 {
 	struct drm_device *dev = crtc->dev;
 	struct virtio_gpu_device *vgdev = dev->dev_private;
@@ -145,11 +147,11 @@ static void virtio_gpu_crtc_atomic_flush(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs virtio_gpu_crtc_helper_funcs = {
-	.enable        = virtio_gpu_crtc_enable,
-	.disable       = virtio_gpu_crtc_disable,
 	.mode_set_nofb = virtio_gpu_crtc_mode_set_nofb,
 	.atomic_check  = virtio_gpu_crtc_atomic_check,
 	.atomic_flush  = virtio_gpu_crtc_atomic_flush,
+	.atomic_enable = virtio_gpu_crtc_atomic_enable,
+	.atomic_disable = virtio_gpu_crtc_atomic_disable,
 };
 
 static void virtio_gpu_enc_mode_set(struct drm_encoder *encoder,
@@ -250,7 +252,6 @@ static void virtio_gpu_conn_destroy(struct drm_connector *connector)
 }
 
 static const struct drm_connector_funcs virtio_gpu_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.detect = virtio_gpu_conn_detect,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = virtio_gpu_conn_destroy,
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.c b/drivers/gpu/drm/virtio/virtgpu_drv.c
index 63d35c7..49a3d8d 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.c
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.c
@@ -122,7 +122,6 @@ static struct drm_driver driver = {
 
 	.dumb_create = virtio_gpu_mode_dumb_create,
 	.dumb_map_offset = virtio_gpu_mode_dumb_mmap,
-	.dumb_destroy = virtio_gpu_mode_dumb_destroy,
 
 #if defined(CONFIG_DEBUG_FS)
 	.debugfs_init = virtio_gpu_debugfs_init,
diff --git a/drivers/gpu/drm/virtio/virtgpu_drv.h b/drivers/gpu/drm/virtio/virtgpu_drv.h
index 3a66abb..da2fb585 100644
--- a/drivers/gpu/drm/virtio/virtgpu_drv.h
+++ b/drivers/gpu/drm/virtio/virtgpu_drv.h
@@ -236,9 +236,6 @@ struct virtio_gpu_object *virtio_gpu_alloc_object(struct drm_device *dev,
 int virtio_gpu_mode_dumb_create(struct drm_file *file_priv,
 				struct drm_device *dev,
 				struct drm_mode_create_dumb *args);
-int virtio_gpu_mode_dumb_destroy(struct drm_file *file_priv,
-				 struct drm_device *dev,
-				 uint32_t handle);
 int virtio_gpu_mode_dumb_mmap(struct drm_file *file_priv,
 			      struct drm_device *dev,
 			      uint32_t handle, uint64_t *offset_p);
diff --git a/drivers/gpu/drm/virtio/virtgpu_fb.c b/drivers/gpu/drm/virtio/virtgpu_fb.c
index 33df067..15d18fd 100644
--- a/drivers/gpu/drm/virtio/virtgpu_fb.c
+++ b/drivers/gpu/drm/virtio/virtgpu_fb.c
@@ -273,7 +273,6 @@ static int virtio_gpufb_create(struct drm_fb_helper *helper,
 	vfbdev->helper.fb = fb;
 
 	strcpy(info->fix.id, "virtiodrmfb");
-	info->flags = FBINFO_DEFAULT;
 	info->fbops = &virtio_gpufb_ops;
 	info->pixmap.flags = FB_PIXMAP_SYSTEM;
 
@@ -309,7 +308,7 @@ static int virtio_gpu_fbdev_destroy(struct drm_device *dev,
 
 	return 0;
 }
-static struct drm_fb_helper_funcs virtio_gpu_fb_helper_funcs = {
+static const struct drm_fb_helper_funcs virtio_gpu_fb_helper_funcs = {
 	.fb_probe = virtio_gpufb_create,
 };
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_gem.c b/drivers/gpu/drm/virtio/virtgpu_gem.c
index cc025d8..72ad7b1 100644
--- a/drivers/gpu/drm/virtio/virtgpu_gem.c
+++ b/drivers/gpu/drm/virtio/virtgpu_gem.c
@@ -118,13 +118,6 @@ int virtio_gpu_mode_dumb_create(struct drm_file *file_priv,
 	return ret;
 }
 
-int virtio_gpu_mode_dumb_destroy(struct drm_file *file_priv,
-				 struct drm_device *dev,
-				 uint32_t handle)
-{
-	return drm_gem_handle_delete(file_priv, handle);
-}
-
 int virtio_gpu_mode_dumb_mmap(struct drm_file *file_priv,
 			      struct drm_device *dev,
 			      uint32_t handle, uint64_t *offset_p)
diff --git a/drivers/gpu/drm/virtio/virtgpu_plane.c b/drivers/gpu/drm/virtio/virtgpu_plane.c
index adcdbd0..71ba455 100644
--- a/drivers/gpu/drm/virtio/virtgpu_plane.c
+++ b/drivers/gpu/drm/virtio/virtgpu_plane.c
@@ -298,7 +298,7 @@ struct drm_plane *virtio_gpu_plane_init(struct virtio_gpu_device *vgdev,
 	ret = drm_universal_plane_init(dev, plane, 1 << index,
 				       &virtio_gpu_plane_funcs,
 				       formats, nformats,
-				       type, NULL);
+				       NULL, type, NULL);
 	if (ret)
 		goto err_plane_init;
 
diff --git a/drivers/gpu/drm/virtio/virtgpu_ttm.c b/drivers/gpu/drm/virtio/virtgpu_ttm.c
index c1f2af4..cd389c5 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ttm.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ttm.c
@@ -192,7 +192,7 @@ static int ttm_bo_man_takedown(struct ttm_mem_type_manager *man)
 }
 
 static void ttm_bo_man_debug(struct ttm_mem_type_manager *man,
-			     const char *prefix)
+			     struct drm_printer *printer)
 {
 }
 
@@ -234,7 +234,7 @@ static int virtio_gpu_init_mem_type(struct ttm_bo_device *bdev, uint32_t type,
 static void virtio_gpu_evict_flags(struct ttm_buffer_object *bo,
 				struct ttm_placement *placement)
 {
-	static struct ttm_place placements = {
+	static const struct ttm_place placements = {
 		.fpfn  = 0,
 		.lpfn  = 0,
 		.flags = TTM_PL_MASK_CACHING | TTM_PL_FLAG_SYSTEM,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
index 8617879..c706ad3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_cmdbuf.c
@@ -51,6 +51,7 @@ struct vmw_cmdbuf_context {
 	struct list_head hw_submitted;
 	struct list_head preempted;
 	unsigned num_hw_submitted;
+	bool block_submission;
 };
 
 /**
@@ -60,6 +61,9 @@ struct vmw_cmdbuf_context {
  * kernel command submissions, @cur.
  * @space_mutex: Mutex to protect against starvation when we allocate
  * main pool buffer space.
+ * @error_mutex: Mutex to serialize the work queue error handling.
+ * Note this is not needed if the same workqueue handler
+ * can't race with itself...
  * @work: A struct work_struct implementeing command buffer error handling.
  * Immutable.
  * @dev_priv: Pointer to the device private struct. Immutable.
@@ -85,7 +89,6 @@ struct vmw_cmdbuf_context {
  * Internal protection.
  * @dheaders: Pool of DMA memory for device command buffer headers with trailing
  * space for inline data. Internal protection.
- * @tasklet: Tasklet struct for irq processing. Immutable.
  * @alloc_queue: Wait queue for processes waiting to allocate command buffer
  * space.
  * @idle_queue: Wait queue for processes waiting for command buffer idle.
@@ -102,6 +105,7 @@ struct vmw_cmdbuf_context {
 struct vmw_cmdbuf_man {
 	struct mutex cur_mutex;
 	struct mutex space_mutex;
+	struct mutex error_mutex;
 	struct work_struct work;
 	struct vmw_private *dev_priv;
 	struct vmw_cmdbuf_context ctx[SVGA_CB_CONTEXT_MAX];
@@ -117,7 +121,6 @@ struct vmw_cmdbuf_man {
 	spinlock_t lock;
 	struct dma_pool *headers;
 	struct dma_pool *dheaders;
-	struct tasklet_struct tasklet;
 	wait_queue_head_t alloc_queue;
 	wait_queue_head_t idle_queue;
 	bool irq_on;
@@ -181,12 +184,13 @@ struct vmw_cmdbuf_alloc_info {
 };
 
 /* Loop over each context in the command buffer manager. */
-#define for_each_cmdbuf_ctx(_man, _i, _ctx) \
+#define for_each_cmdbuf_ctx(_man, _i, _ctx)				\
 	for (_i = 0, _ctx = &(_man)->ctx[0]; (_i) < SVGA_CB_CONTEXT_MAX; \
 	     ++(_i), ++(_ctx))
 
-static int vmw_cmdbuf_startstop(struct vmw_cmdbuf_man *man, bool enable);
-
+static int vmw_cmdbuf_startstop(struct vmw_cmdbuf_man *man, u32 context,
+				bool enable);
+static int vmw_cmdbuf_preempt(struct vmw_cmdbuf_man *man, u32 context);
 
 /**
  * vmw_cmdbuf_cur_lock - Helper to lock the cur_mutex.
@@ -278,9 +282,9 @@ void vmw_cmdbuf_header_free(struct vmw_cmdbuf_header *header)
 		vmw_cmdbuf_header_inline_free(header);
 		return;
 	}
-	spin_lock_bh(&man->lock);
+	spin_lock(&man->lock);
 	__vmw_cmdbuf_header_free(header);
-	spin_unlock_bh(&man->lock);
+	spin_unlock(&man->lock);
 }
 
 
@@ -331,7 +335,8 @@ static void vmw_cmdbuf_ctx_submit(struct vmw_cmdbuf_man *man,
 				  struct vmw_cmdbuf_context *ctx)
 {
 	while (ctx->num_hw_submitted < man->max_hw_submitted &&
-	      !list_empty(&ctx->submitted)) {
+	       !list_empty(&ctx->submitted) &&
+	       !ctx->block_submission) {
 		struct vmw_cmdbuf_header *entry;
 		SVGACBStatus status;
 
@@ -386,12 +391,17 @@ static void vmw_cmdbuf_ctx_process(struct vmw_cmdbuf_man *man,
 			__vmw_cmdbuf_header_free(entry);
 			break;
 		case SVGA_CB_STATUS_COMMAND_ERROR:
-		case SVGA_CB_STATUS_CB_HEADER_ERROR:
+			entry->cb_header->status = SVGA_CB_STATUS_NONE;
 			list_add_tail(&entry->list, &man->error);
 			schedule_work(&man->work);
 			break;
 		case SVGA_CB_STATUS_PREEMPTED:
-			list_add(&entry->list, &ctx->preempted);
+			entry->cb_header->status = SVGA_CB_STATUS_NONE;
+			list_add_tail(&entry->list, &ctx->preempted);
+			break;
+		case SVGA_CB_STATUS_CB_HEADER_ERROR:
+			WARN_ONCE(true, "Command buffer header error.\n");
+			__vmw_cmdbuf_header_free(entry);
 			break;
 		default:
 			WARN_ONCE(true, "Undefined command buffer status.\n");
@@ -468,20 +478,17 @@ static void vmw_cmdbuf_ctx_add(struct vmw_cmdbuf_man *man,
 }
 
 /**
- * vmw_cmdbuf_man_tasklet - The main part of the command buffer interrupt
- * handler implemented as a tasklet.
+ * vmw_cmdbuf_irqthread - The main part of the command buffer interrupt
+ * handler implemented as a threaded irq task.
  *
- * @data: Tasklet closure. A pointer to the command buffer manager cast to
- * an unsigned long.
+ * @man: Pointer to the command buffer manager.
  *
- * The bottom half (tasklet) of the interrupt handler simply calls into the
+ * The bottom half of the interrupt handler simply calls into the
  * command buffer processor to free finished buffers and submit any
  * queued buffers to hardware.
  */
-static void vmw_cmdbuf_man_tasklet(unsigned long data)
+void vmw_cmdbuf_irqthread(struct vmw_cmdbuf_man *man)
 {
-	struct vmw_cmdbuf_man *man = (struct vmw_cmdbuf_man *) data;
-
 	spin_lock(&man->lock);
 	vmw_cmdbuf_man_process(man);
 	spin_unlock(&man->lock);
@@ -502,24 +509,112 @@ static void vmw_cmdbuf_work_func(struct work_struct *work)
 		container_of(work, struct vmw_cmdbuf_man, work);
 	struct vmw_cmdbuf_header *entry, *next;
 	uint32_t dummy;
-	bool restart = false;
+	bool restart[SVGA_CB_CONTEXT_MAX];
+	bool send_fence = false;
+	struct list_head restart_head[SVGA_CB_CONTEXT_MAX];
+	int i;
+	struct vmw_cmdbuf_context *ctx;
 
-	spin_lock_bh(&man->lock);
-	list_for_each_entry_safe(entry, next, &man->error, list) {
-		restart = true;
-		DRM_ERROR("Command buffer error.\n");
-
-		list_del(&entry->list);
-		__vmw_cmdbuf_header_free(entry);
-		wake_up_all(&man->idle_queue);
+	for_each_cmdbuf_ctx(man, i, ctx) {
+		INIT_LIST_HEAD(&restart_head[i]);
+		restart[i] = false;
 	}
-	spin_unlock_bh(&man->lock);
 
-	if (restart && vmw_cmdbuf_startstop(man, true))
-		DRM_ERROR("Failed restarting command buffer context 0.\n");
+	mutex_lock(&man->error_mutex);
+	spin_lock(&man->lock);
+	list_for_each_entry_safe(entry, next, &man->error, list) {
+		SVGACBHeader *cb_hdr = entry->cb_header;
+		SVGA3dCmdHeader *header = (SVGA3dCmdHeader *)
+			(entry->cmd + cb_hdr->errorOffset);
+		u32 error_cmd_size, new_start_offset;
+		const char *cmd_name;
+
+		list_del_init(&entry->list);
+		restart[entry->cb_context] = true;
+
+		if (!vmw_cmd_describe(header, &error_cmd_size, &cmd_name)) {
+			DRM_ERROR("Unknown command causing device error.\n");
+			DRM_ERROR("Command buffer offset is %lu\n",
+				  (unsigned long) cb_hdr->errorOffset);
+			__vmw_cmdbuf_header_free(entry);
+			send_fence = true;
+			continue;
+		}
+
+		DRM_ERROR("Command \"%s\" causing device error.\n", cmd_name);
+		DRM_ERROR("Command buffer offset is %lu\n",
+			  (unsigned long) cb_hdr->errorOffset);
+		DRM_ERROR("Command size is %lu\n",
+			  (unsigned long) error_cmd_size);
+
+		new_start_offset = cb_hdr->errorOffset + error_cmd_size;
+
+		if (new_start_offset >= cb_hdr->length) {
+			__vmw_cmdbuf_header_free(entry);
+			send_fence = true;
+			continue;
+		}
+
+		if (man->using_mob)
+			cb_hdr->ptr.mob.mobOffset += new_start_offset;
+		else
+			cb_hdr->ptr.pa += (u64) new_start_offset;
+
+		entry->cmd += new_start_offset;
+		cb_hdr->length -= new_start_offset;
+		cb_hdr->errorOffset = 0;
+		cb_hdr->offset = 0;
+		list_add_tail(&entry->list, &restart_head[entry->cb_context]);
+		man->ctx[entry->cb_context].block_submission = true;
+	}
+	spin_unlock(&man->lock);
+
+	/* Preempt all contexts with errors */
+	for_each_cmdbuf_ctx(man, i, ctx) {
+		if (ctx->block_submission && vmw_cmdbuf_preempt(man, i))
+			DRM_ERROR("Failed preempting command buffer "
+				  "context %u.\n", i);
+	}
+
+	spin_lock(&man->lock);
+	for_each_cmdbuf_ctx(man, i, ctx) {
+		if (!ctx->block_submission)
+			continue;
+
+		/* Move preempted command buffers to the preempted queue. */
+		vmw_cmdbuf_ctx_process(man, ctx, &dummy);
+
+		/*
+		 * Add the preempted queue after the command buffer
+		 * that caused an error.
+		 */
+		list_splice_init(&ctx->preempted, restart_head[i].prev);
+
+		/*
+		 * Finally add all command buffers first in the submitted
+		 * queue, to rerun them.
+		 */
+		list_splice_init(&restart_head[i], &ctx->submitted);
+
+		ctx->block_submission = false;
+	}
+
+	vmw_cmdbuf_man_process(man);
+	spin_unlock(&man->lock);
+
+	for_each_cmdbuf_ctx(man, i, ctx) {
+		if (restart[i] && vmw_cmdbuf_startstop(man, i, true))
+			DRM_ERROR("Failed restarting command buffer "
+				  "context %u.\n", i);
+	}
 
 	/* Send a new fence in case one was removed */
-	vmw_fifo_send_fence(man->dev_priv, &dummy);
+	if (send_fence) {
+		vmw_fifo_send_fence(man->dev_priv, &dummy);
+		wake_up_all(&man->idle_queue);
+	}
+
+	mutex_unlock(&man->error_mutex);
 }
 
 /**
@@ -536,7 +631,7 @@ static bool vmw_cmdbuf_man_idle(struct vmw_cmdbuf_man *man,
 	bool idle = false;
 	int i;
 
-	spin_lock_bh(&man->lock);
+	spin_lock(&man->lock);
 	vmw_cmdbuf_man_process(man);
 	for_each_cmdbuf_ctx(man, i, ctx) {
 		if (!list_empty(&ctx->submitted) ||
@@ -548,7 +643,7 @@ static bool vmw_cmdbuf_man_idle(struct vmw_cmdbuf_man *man,
 	idle = list_empty(&man->error);
 
 out_unlock:
-	spin_unlock_bh(&man->lock);
+	spin_unlock(&man->lock);
 
 	return idle;
 }
@@ -571,7 +666,7 @@ static void __vmw_cmdbuf_cur_flush(struct vmw_cmdbuf_man *man)
 	if (!cur)
 		return;
 
-	spin_lock_bh(&man->lock);
+	spin_lock(&man->lock);
 	if (man->cur_pos == 0) {
 		__vmw_cmdbuf_header_free(cur);
 		goto out_unlock;
@@ -580,7 +675,7 @@ static void __vmw_cmdbuf_cur_flush(struct vmw_cmdbuf_man *man)
 	man->cur->cb_header->length = man->cur_pos;
 	vmw_cmdbuf_ctx_add(man, man->cur, SVGA_CB_CONTEXT_0);
 out_unlock:
-	spin_unlock_bh(&man->lock);
+	spin_unlock(&man->lock);
 	man->cur = NULL;
 	man->cur_pos = 0;
 }
@@ -673,14 +768,14 @@ static bool vmw_cmdbuf_try_alloc(struct vmw_cmdbuf_man *man,
 		return true;
  
 	memset(info->node, 0, sizeof(*info->node));
-	spin_lock_bh(&man->lock);
+	spin_lock(&man->lock);
 	ret = drm_mm_insert_node(&man->mm, info->node, info->page_size);
 	if (ret) {
 		vmw_cmdbuf_man_process(man);
 		ret = drm_mm_insert_node(&man->mm, info->node, info->page_size);
 	}
 
-	spin_unlock_bh(&man->lock);
+	spin_unlock(&man->lock);
 	info->done = !ret;
 
 	return info->done;
@@ -801,9 +896,9 @@ static int vmw_cmdbuf_space_pool(struct vmw_cmdbuf_man *man,
 	return 0;
 
 out_no_cb_header:
-	spin_lock_bh(&man->lock);
+	spin_lock(&man->lock);
 	drm_mm_remove_node(&header->node);
-	spin_unlock_bh(&man->lock);
+	spin_unlock(&man->lock);
 
 	return ret;
 }
@@ -1023,18 +1118,6 @@ void vmw_cmdbuf_commit(struct vmw_cmdbuf_man *man, size_t size,
 	vmw_cmdbuf_cur_unlock(man);
 }
 
-/**
- * vmw_cmdbuf_tasklet_schedule - Schedule the interrupt handler bottom half.
- *
- * @man: The command buffer manager.
- */
-void vmw_cmdbuf_tasklet_schedule(struct vmw_cmdbuf_man *man)
-{
-	if (!man)
-		return;
-
-	tasklet_schedule(&man->tasklet);
-}
 
 /**
  * vmw_cmdbuf_send_device_command - Send a command through the device context.
@@ -1059,9 +1142,9 @@ static int vmw_cmdbuf_send_device_command(struct vmw_cmdbuf_man *man,
 	memcpy(cmd, command, size);
 	header->cb_header->length = size;
 	header->cb_context = SVGA_CB_CONTEXT_DEVICE;
-	spin_lock_bh(&man->lock);
+	spin_lock(&man->lock);
 	status = vmw_cmdbuf_header_submit(header);
-	spin_unlock_bh(&man->lock);
+	spin_unlock(&man->lock);
 	vmw_cmdbuf_header_free(header);
 
 	if (status != SVGA_CB_STATUS_COMPLETED) {
@@ -1074,6 +1157,29 @@ static int vmw_cmdbuf_send_device_command(struct vmw_cmdbuf_man *man,
 }
 
 /**
+ * vmw_cmdbuf_preempt - Send a preempt command through the device
+ * context.
+ *
+ * @man: The command buffer manager.
+ *
+ * Synchronously sends a preempt command.
+ */
+static int vmw_cmdbuf_preempt(struct vmw_cmdbuf_man *man, u32 context)
+{
+	struct {
+		uint32 id;
+		SVGADCCmdPreempt body;
+	} __packed cmd;
+
+	cmd.id = SVGA_DC_CMD_PREEMPT;
+	cmd.body.context = SVGA_CB_CONTEXT_0 + context;
+	cmd.body.ignoreIDZero = 0;
+
+	return vmw_cmdbuf_send_device_command(man, &cmd, sizeof(cmd));
+}
+
+
+/**
  * vmw_cmdbuf_startstop - Send a start / stop command through the device
  * context.
  *
@@ -1082,7 +1188,7 @@ static int vmw_cmdbuf_send_device_command(struct vmw_cmdbuf_man *man,
  *
  * Synchronously sends a device start / stop context command.
  */
-static int vmw_cmdbuf_startstop(struct vmw_cmdbuf_man *man,
+static int vmw_cmdbuf_startstop(struct vmw_cmdbuf_man *man, u32 context,
 				bool enable)
 {
 	struct {
@@ -1092,7 +1198,7 @@ static int vmw_cmdbuf_startstop(struct vmw_cmdbuf_man *man,
 
 	cmd.id = SVGA_DC_CMD_START_STOP_CONTEXT;
 	cmd.body.enable = (enable) ? 1 : 0;
-	cmd.body.context = SVGA_CB_CONTEXT_0;
+	cmd.body.context = SVGA_CB_CONTEXT_0 + context;
 
 	return vmw_cmdbuf_send_device_command(man, &cmd, sizeof(cmd));
 }
@@ -1191,7 +1297,7 @@ struct vmw_cmdbuf_man *vmw_cmdbuf_man_create(struct vmw_private *dev_priv)
 {
 	struct vmw_cmdbuf_man *man;
 	struct vmw_cmdbuf_context *ctx;
-	int i;
+	unsigned int i;
 	int ret;
 
 	if (!(dev_priv->capabilities & SVGA_CAP_COMMAND_BUFFERS))
@@ -1226,8 +1332,7 @@ struct vmw_cmdbuf_man *vmw_cmdbuf_man_create(struct vmw_private *dev_priv)
 	spin_lock_init(&man->lock);
 	mutex_init(&man->cur_mutex);
 	mutex_init(&man->space_mutex);
-	tasklet_init(&man->tasklet, vmw_cmdbuf_man_tasklet,
-		     (unsigned long) man);
+	mutex_init(&man->error_mutex);
 	man->default_size = VMW_CMDBUF_INLINE_SIZE;
 	init_waitqueue_head(&man->alloc_queue);
 	init_waitqueue_head(&man->idle_queue);
@@ -1236,11 +1341,14 @@ struct vmw_cmdbuf_man *vmw_cmdbuf_man_create(struct vmw_private *dev_priv)
 	INIT_WORK(&man->work, &vmw_cmdbuf_work_func);
 	vmw_generic_waiter_add(dev_priv, SVGA_IRQFLAG_ERROR,
 			       &dev_priv->error_waiters);
-	ret = vmw_cmdbuf_startstop(man, true);
-	if (ret) {
-		DRM_ERROR("Failed starting command buffer context 0.\n");
-		vmw_cmdbuf_man_destroy(man);
-		return ERR_PTR(ret);
+	for_each_cmdbuf_ctx(man, i, ctx) {
+		ret = vmw_cmdbuf_startstop(man, i, true);
+		if (ret) {
+			DRM_ERROR("Failed starting command buffer "
+				  "context %u.\n", i);
+			vmw_cmdbuf_man_destroy(man);
+			return ERR_PTR(ret);
+		}
 	}
 
 	return man;
@@ -1290,18 +1398,24 @@ void vmw_cmdbuf_remove_pool(struct vmw_cmdbuf_man *man)
  */
 void vmw_cmdbuf_man_destroy(struct vmw_cmdbuf_man *man)
 {
+	struct vmw_cmdbuf_context *ctx;
+	unsigned int i;
+
 	WARN_ON_ONCE(man->has_pool);
 	(void) vmw_cmdbuf_idle(man, false, 10*HZ);
-	if (vmw_cmdbuf_startstop(man, false))
-		DRM_ERROR("Failed stopping command buffer context 0.\n");
+
+	for_each_cmdbuf_ctx(man, i, ctx)
+		if (vmw_cmdbuf_startstop(man, i, false))
+			DRM_ERROR("Failed stopping command buffer "
+				  "context %u.\n", i);
 
 	vmw_generic_waiter_remove(man->dev_priv, SVGA_IRQFLAG_ERROR,
 				  &man->dev_priv->error_waiters);
-	tasklet_kill(&man->tasklet);
 	(void) cancel_work_sync(&man->work);
 	dma_pool_destroy(man->dheaders);
 	dma_pool_destroy(man->headers);
 	mutex_destroy(&man->cur_mutex);
 	mutex_destroy(&man->space_mutex);
+	mutex_destroy(&man->error_mutex);
 	kfree(man);
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
index 4436d53..e84fee3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.c
@@ -36,7 +36,6 @@
 #include <drm/ttm/ttm_module.h>
 #include <linux/dma_remapping.h>
 
-#define VMWGFX_DRIVER_NAME "vmwgfx"
 #define VMWGFX_DRIVER_DESC "Linux drm driver for VMware graphics devices"
 #define VMWGFX_CHIP_SVGAII 0
 #define VMW_FB_RESERVATION 0
@@ -825,7 +824,7 @@ static int vmw_driver_load(struct drm_device *dev, unsigned long chipset)
 	}
 
 	if (dev_priv->capabilities & SVGA_CAP_IRQMASK) {
-		ret = drm_irq_install(dev, dev->pdev->irq);
+		ret = vmw_irq_install(dev, dev->pdev->irq);
 		if (ret != 0) {
 			DRM_ERROR("Failed installing irq: %d\n", ret);
 			goto out_no_irq;
@@ -937,7 +936,7 @@ static int vmw_driver_load(struct drm_device *dev, unsigned long chipset)
 	vmw_fence_manager_takedown(dev_priv->fman);
 out_no_fman:
 	if (dev_priv->capabilities & SVGA_CAP_IRQMASK)
-		drm_irq_uninstall(dev_priv->dev);
+		vmw_irq_uninstall(dev_priv->dev);
 out_no_irq:
 	if (dev_priv->stealth)
 		pci_release_region(dev->pdev, 2);
@@ -990,7 +989,7 @@ static void vmw_driver_unload(struct drm_device *dev)
 	vmw_release_device_late(dev_priv);
 	vmw_fence_manager_takedown(dev_priv->fman);
 	if (dev_priv->capabilities & SVGA_CAP_IRQMASK)
-		drm_irq_uninstall(dev_priv->dev);
+		vmw_irq_uninstall(dev_priv->dev);
 	if (dev_priv->stealth)
 		pci_release_region(dev->pdev, 2);
 	else
@@ -1516,10 +1515,6 @@ static struct drm_driver driver = {
 	.load = vmw_driver_load,
 	.unload = vmw_driver_unload,
 	.lastclose = vmw_lastclose,
-	.irq_preinstall = vmw_irq_preinstall,
-	.irq_postinstall = vmw_irq_postinstall,
-	.irq_uninstall = vmw_irq_uninstall,
-	.irq_handler = vmw_irq_handler,
 	.get_vblank_counter = vmw_get_vblank_counter,
 	.enable_vblank = vmw_enable_vblank,
 	.disable_vblank = vmw_disable_vblank,
@@ -1531,7 +1526,6 @@ static struct drm_driver driver = {
 	.master_drop = vmw_master_drop,
 	.open = vmw_driver_open,
 	.postclose = vmw_postclose,
-	.set_busid = drm_pci_set_busid,
 
 	.dumb_create = vmw_dumb_create,
 	.dumb_map_offset = vmw_dumb_map_offset,
@@ -1571,7 +1565,7 @@ static int __init vmwgfx_init(void)
 	if (vgacon_text_force())
 		return -EINVAL;
 
-	ret = drm_pci_init(&driver, &vmw_pci_driver);
+	ret = pci_register_driver(&vmw_pci_driver);
 	if (ret)
 		DRM_ERROR("Failed initializing DRM.\n");
 	return ret;
@@ -1579,7 +1573,7 @@ static int __init vmwgfx_init(void)
 
 static void __exit vmwgfx_exit(void)
 {
-	drm_pci_exit(&driver, &vmw_pci_driver);
+	pci_unregister_driver(&vmw_pci_driver);
 }
 
 module_init(vmwgfx_init);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
index 4b948fba..7e5f30e 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_drv.h
@@ -40,10 +40,12 @@
 #include <drm/ttm/ttm_execbuf_util.h>
 #include <drm/ttm/ttm_module.h>
 #include "vmwgfx_fence.h"
+#include <linux/sync_file.h>
 
-#define VMWGFX_DRIVER_DATE "20170607"
+#define VMWGFX_DRIVER_NAME "vmwgfx"
+#define VMWGFX_DRIVER_DATE "20170612"
 #define VMWGFX_DRIVER_MAJOR 2
-#define VMWGFX_DRIVER_MINOR 13
+#define VMWGFX_DRIVER_MINOR 14
 #define VMWGFX_DRIVER_PATCHLEVEL 0
 #define VMWGFX_FILE_PAGE_OFFSET 0x00100000
 #define VMWGFX_FIFO_STATIC_SIZE (1024*1024)
@@ -351,6 +353,12 @@ struct vmw_otable_batch {
 	struct ttm_buffer_object *otable_bo;
 };
 
+enum {
+	VMW_IRQTHREAD_FENCE,
+	VMW_IRQTHREAD_CMDBUF,
+	VMW_IRQTHREAD_MAX
+};
+
 struct vmw_private {
 	struct ttm_bo_device bdev;
 	struct ttm_bo_global_ref bo_global_ref;
@@ -529,6 +537,7 @@ struct vmw_private {
 	struct vmw_otable_batch otable_batch;
 
 	struct vmw_cmdbuf_man *cman;
+	DECLARE_BITMAP(irqthread_pending, VMW_IRQTHREAD_MAX);
 };
 
 static inline struct vmw_surface *vmw_res_to_srf(struct vmw_resource *res)
@@ -561,24 +570,21 @@ static inline struct vmw_master *vmw_master(struct drm_master *master)
 static inline void vmw_write(struct vmw_private *dev_priv,
 			     unsigned int offset, uint32_t value)
 {
-	unsigned long irq_flags;
-
-	spin_lock_irqsave(&dev_priv->hw_lock, irq_flags);
+	spin_lock(&dev_priv->hw_lock);
 	outl(offset, dev_priv->io_start + VMWGFX_INDEX_PORT);
 	outl(value, dev_priv->io_start + VMWGFX_VALUE_PORT);
-	spin_unlock_irqrestore(&dev_priv->hw_lock, irq_flags);
+	spin_unlock(&dev_priv->hw_lock);
 }
 
 static inline uint32_t vmw_read(struct vmw_private *dev_priv,
 				unsigned int offset)
 {
-	unsigned long irq_flags;
 	u32 val;
 
-	spin_lock_irqsave(&dev_priv->hw_lock, irq_flags);
+	spin_lock(&dev_priv->hw_lock);
 	outl(offset, dev_priv->io_start + VMWGFX_INDEX_PORT);
 	val = inl(dev_priv->io_start + VMWGFX_VALUE_PORT);
-	spin_unlock_irqrestore(&dev_priv->hw_lock, irq_flags);
+	spin_unlock(&dev_priv->hw_lock);
 
 	return val;
 }
@@ -821,7 +827,8 @@ extern int vmw_execbuf_process(struct drm_file *file_priv,
 			       uint32_t dx_context_handle,
 			       struct drm_vmw_fence_rep __user
 			       *user_fence_rep,
-			       struct vmw_fence_obj **out_fence);
+			       struct vmw_fence_obj **out_fence,
+			       uint32_t flags);
 extern void __vmw_execbuf_release_pinned_bo(struct vmw_private *dev_priv,
 					    struct vmw_fence_obj *fence);
 extern void vmw_execbuf_release_pinned_bo(struct vmw_private *dev_priv);
@@ -836,23 +843,23 @@ extern void vmw_execbuf_copy_fence_user(struct vmw_private *dev_priv,
 					struct drm_vmw_fence_rep __user
 					*user_fence_rep,
 					struct vmw_fence_obj *fence,
-					uint32_t fence_handle);
+					uint32_t fence_handle,
+					int32_t out_fence_fd,
+					struct sync_file *sync_file);
 extern int vmw_validate_single_buffer(struct vmw_private *dev_priv,
 				      struct ttm_buffer_object *bo,
 				      bool interruptible,
 				      bool validate_as_mob);
-
+bool vmw_cmd_describe(const void *buf, u32 *size, char const **cmd);
 
 /**
  * IRQs and wating - vmwgfx_irq.c
  */
 
-extern irqreturn_t vmw_irq_handler(int irq, void *arg);
 extern int vmw_wait_seqno(struct vmw_private *dev_priv, bool lazy,
 			  uint32_t seqno, bool interruptible,
 			  unsigned long timeout);
-extern void vmw_irq_preinstall(struct drm_device *dev);
-extern int vmw_irq_postinstall(struct drm_device *dev);
+extern int vmw_irq_install(struct drm_device *dev, int irq);
 extern void vmw_irq_uninstall(struct drm_device *dev);
 extern bool vmw_seqno_passed(struct vmw_private *dev_priv,
 				uint32_t seqno);
@@ -1150,13 +1157,13 @@ extern void *vmw_cmdbuf_reserve(struct vmw_cmdbuf_man *man, size_t size,
 extern void vmw_cmdbuf_commit(struct vmw_cmdbuf_man *man, size_t size,
 			      struct vmw_cmdbuf_header *header,
 			      bool flush);
-extern void vmw_cmdbuf_tasklet_schedule(struct vmw_cmdbuf_man *man);
 extern void *vmw_cmdbuf_alloc(struct vmw_cmdbuf_man *man,
 			      size_t size, bool interruptible,
 			      struct vmw_cmdbuf_header **p_header);
 extern void vmw_cmdbuf_header_free(struct vmw_cmdbuf_header *header);
 extern int vmw_cmdbuf_cur_flush(struct vmw_cmdbuf_man *man,
 				bool interruptible);
+extern void vmw_cmdbuf_irqthread(struct vmw_cmdbuf_man *man);
 
 
 /**
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index 2cfb3c9..21c62a3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -24,6 +24,7 @@
  * USE OR OTHER DEALINGS IN THE SOFTWARE.
  *
  **************************************************************************/
+#include <linux/sync_file.h>
 
 #include "vmwgfx_drv.h"
 #include "vmwgfx_reg.h"
@@ -112,11 +113,12 @@ struct vmw_cmd_entry {
 	bool user_allow;
 	bool gb_disable;
 	bool gb_enable;
+	const char *cmd_name;
 };
 
 #define VMW_CMD_DEF(_cmd, _func, _user_allow, _gb_disable, _gb_enable)	\
 	[(_cmd) - SVGA_3D_CMD_BASE] = {(_func), (_user_allow),\
-				       (_gb_disable), (_gb_enable)}
+				       (_gb_disable), (_gb_enable), #_cmd}
 
 static int vmw_resource_context_res_add(struct vmw_private *dev_priv,
 					struct vmw_sw_context *sw_context,
@@ -3302,6 +3304,8 @@ static const struct vmw_cmd_entry vmw_cmd_entries[SVGA_3D_CMD_MAX] = {
 		    true, false, true),
 	VMW_CMD_DEF(SVGA_3D_CMD_NOP, &vmw_cmd_ok,
 		    true, false, true),
+	VMW_CMD_DEF(SVGA_3D_CMD_NOP_ERROR, &vmw_cmd_ok,
+		    true, false, true),
 	VMW_CMD_DEF(SVGA_3D_CMD_ENABLE_GART, &vmw_cmd_invalid,
 		    false, false, true),
 	VMW_CMD_DEF(SVGA_3D_CMD_DISABLE_GART, &vmw_cmd_invalid,
@@ -3469,6 +3473,51 @@ static const struct vmw_cmd_entry vmw_cmd_entries[SVGA_3D_CMD_MAX] = {
 		    true, false, true),
 };
 
+bool vmw_cmd_describe(const void *buf, u32 *size, char const **cmd)
+{
+	u32 cmd_id = ((u32 *) buf)[0];
+
+	if (cmd_id >= SVGA_CMD_MAX) {
+		SVGA3dCmdHeader *header = (SVGA3dCmdHeader *) buf;
+		const struct vmw_cmd_entry *entry;
+
+		*size = header->size + sizeof(SVGA3dCmdHeader);
+		cmd_id = header->id;
+		if (cmd_id >= SVGA_3D_CMD_MAX)
+			return false;
+
+		cmd_id -= SVGA_3D_CMD_BASE;
+		entry = &vmw_cmd_entries[cmd_id];
+		*cmd = entry->cmd_name;
+		return true;
+	}
+
+	switch (cmd_id) {
+	case SVGA_CMD_UPDATE:
+		*cmd = "SVGA_CMD_UPDATE";
+		*size = sizeof(u32) + sizeof(SVGAFifoCmdUpdate);
+		break;
+	case SVGA_CMD_DEFINE_GMRFB:
+		*cmd = "SVGA_CMD_DEFINE_GMRFB";
+		*size = sizeof(u32) + sizeof(SVGAFifoCmdDefineGMRFB);
+		break;
+	case SVGA_CMD_BLIT_GMRFB_TO_SCREEN:
+		*cmd = "SVGA_CMD_BLIT_GMRFB_TO_SCREEN";
+		*size = sizeof(u32) + sizeof(SVGAFifoCmdBlitGMRFBToScreen);
+		break;
+	case SVGA_CMD_BLIT_SCREEN_TO_GMRFB:
+		*cmd = "SVGA_CMD_BLIT_SCREEN_TO_GMRFB";
+		*size = sizeof(u32) + sizeof(SVGAFifoCmdBlitGMRFBToScreen);
+		break;
+	default:
+		*cmd = "UNKNOWN";
+		*size = 0;
+		return false;
+	}
+
+	return true;
+}
+
 static int vmw_cmd_check(struct vmw_private *dev_priv,
 			 struct vmw_sw_context *sw_context,
 			 void *buf, uint32_t *size)
@@ -3781,6 +3830,8 @@ int vmw_execbuf_fence_commands(struct drm_file *file_priv,
  * which the information should be copied.
  * @fence: Pointer to the fenc object.
  * @fence_handle: User-space fence handle.
+ * @out_fence_fd: exported file descriptor for the fence.  -1 if not used
+ * @sync_file:  Only used to clean up in case of an error in this function.
  *
  * This function copies fence information to user-space. If copying fails,
  * The user-space struct drm_vmw_fence_rep::error member is hopefully
@@ -3796,7 +3847,9 @@ vmw_execbuf_copy_fence_user(struct vmw_private *dev_priv,
 			    int ret,
 			    struct drm_vmw_fence_rep __user *user_fence_rep,
 			    struct vmw_fence_obj *fence,
-			    uint32_t fence_handle)
+			    uint32_t fence_handle,
+			    int32_t out_fence_fd,
+			    struct sync_file *sync_file)
 {
 	struct drm_vmw_fence_rep fence_rep;
 
@@ -3806,6 +3859,7 @@ vmw_execbuf_copy_fence_user(struct vmw_private *dev_priv,
 	memset(&fence_rep, 0, sizeof(fence_rep));
 
 	fence_rep.error = ret;
+	fence_rep.fd = out_fence_fd;
 	if (ret == 0) {
 		BUG_ON(fence == NULL);
 
@@ -3828,6 +3882,14 @@ vmw_execbuf_copy_fence_user(struct vmw_private *dev_priv,
 	 * and unreference the handle.
 	 */
 	if (unlikely(ret != 0) && (fence_rep.error == 0)) {
+		if (sync_file)
+			fput(sync_file->file);
+
+		if (fence_rep.fd != -1) {
+			put_unused_fd(fence_rep.fd);
+			fence_rep.fd = -1;
+		}
+
 		ttm_ref_object_base_unref(vmw_fp->tfile,
 					  fence_handle, TTM_REF_USAGE);
 		DRM_ERROR("Fence copy error. Syncing.\n");
@@ -4003,7 +4065,8 @@ int vmw_execbuf_process(struct drm_file *file_priv,
 			uint64_t throttle_us,
 			uint32_t dx_context_handle,
 			struct drm_vmw_fence_rep __user *user_fence_rep,
-			struct vmw_fence_obj **out_fence)
+			struct vmw_fence_obj **out_fence,
+			uint32_t flags)
 {
 	struct vmw_sw_context *sw_context = &dev_priv->ctx;
 	struct vmw_fence_obj *fence = NULL;
@@ -4013,20 +4076,33 @@ int vmw_execbuf_process(struct drm_file *file_priv,
 	struct ww_acquire_ctx ticket;
 	uint32_t handle;
 	int ret;
+	int32_t out_fence_fd = -1;
+	struct sync_file *sync_file = NULL;
+
+
+	if (flags & DRM_VMW_EXECBUF_FLAG_EXPORT_FENCE_FD) {
+		out_fence_fd = get_unused_fd_flags(O_CLOEXEC);
+		if (out_fence_fd < 0) {
+			DRM_ERROR("Failed to get a fence file descriptor.\n");
+			return out_fence_fd;
+		}
+	}
 
 	if (throttle_us) {
 		ret = vmw_wait_lag(dev_priv, &dev_priv->fifo.marker_queue,
 				   throttle_us);
 
 		if (ret)
-			return ret;
+			goto out_free_fence_fd;
 	}
 
 	kernel_commands = vmw_execbuf_cmdbuf(dev_priv, user_commands,
 					     kernel_commands, command_size,
 					     &header);
-	if (IS_ERR(kernel_commands))
-		return PTR_ERR(kernel_commands);
+	if (IS_ERR(kernel_commands)) {
+		ret = PTR_ERR(kernel_commands);
+		goto out_free_fence_fd;
+	}
 
 	ret = mutex_lock_interruptible(&dev_priv->cmdbuf_mutex);
 	if (ret) {
@@ -4162,8 +4238,32 @@ int vmw_execbuf_process(struct drm_file *file_priv,
 		__vmw_execbuf_release_pinned_bo(dev_priv, fence);
 
 	vmw_clear_validations(sw_context);
+
+	/*
+	 * If anything fails here, give up trying to export the fence
+	 * and do a sync since the user mode will not be able to sync
+	 * the fence itself.  This ensures we are still functionally
+	 * correct.
+	 */
+	if (flags & DRM_VMW_EXECBUF_FLAG_EXPORT_FENCE_FD) {
+
+		sync_file = sync_file_create(&fence->base);
+		if (!sync_file) {
+			DRM_ERROR("Unable to create sync file for fence\n");
+			put_unused_fd(out_fence_fd);
+			out_fence_fd = -1;
+
+			(void) vmw_fence_obj_wait(fence, false, false,
+						  VMW_FENCE_WAIT_TIMEOUT);
+		} else {
+			/* Link the fence with the FD created earlier */
+			fd_install(out_fence_fd, sync_file->file);
+		}
+	}
+
 	vmw_execbuf_copy_fence_user(dev_priv, vmw_fpriv(file_priv), ret,
-				    user_fence_rep, fence, handle);
+				    user_fence_rep, fence, handle,
+				    out_fence_fd, sync_file);
 
 	/* Don't unreference when handing fence out */
 	if (unlikely(out_fence != NULL)) {
@@ -4214,6 +4314,9 @@ int vmw_execbuf_process(struct drm_file *file_priv,
 out_free_header:
 	if (header)
 		vmw_cmdbuf_header_free(header);
+out_free_fence_fd:
+	if (out_fence_fd >= 0)
+		put_unused_fd(out_fence_fd);
 
 	return ret;
 }
@@ -4366,6 +4469,7 @@ int vmw_execbuf_ioctl(struct drm_device *dev, unsigned long data,
 	static const size_t copy_offset[] = {
 		offsetof(struct drm_vmw_execbuf_arg, context_handle),
 		sizeof(struct drm_vmw_execbuf_arg)};
+	struct dma_fence *in_fence = NULL;
 
 	if (unlikely(size < copy_offset[0])) {
 		DRM_ERROR("Invalid command size, ioctl %d\n",
@@ -4401,15 +4505,25 @@ int vmw_execbuf_ioctl(struct drm_device *dev, unsigned long data,
 		arg.context_handle = (uint32_t) -1;
 		break;
 	case 2:
-		if (arg.pad64 != 0) {
-			DRM_ERROR("Unused IOCTL data not set to zero.\n");
-			return -EINVAL;
-		}
-		break;
 	default:
 		break;
 	}
 
+
+	/* If a fence FD was imported from elsewhere, wait on it */
+	if (arg.flags & DRM_VMW_EXECBUF_FLAG_IMPORT_FENCE_FD) {
+		in_fence = sync_file_get_fence(arg.imported_fence_fd);
+
+		if (!in_fence) {
+			DRM_ERROR("Cannot get imported fence\n");
+			return -EINVAL;
+		}
+
+		ret = vmw_wait_dma_fence(dev_priv->fman, in_fence);
+		if (ret)
+			goto out;
+	}
+
 	ret = ttm_read_lock(&dev_priv->reservation_sem, true);
 	if (unlikely(ret != 0))
 		return ret;
@@ -4419,12 +4533,16 @@ int vmw_execbuf_ioctl(struct drm_device *dev, unsigned long data,
 				  NULL, arg.command_size, arg.throttle_us,
 				  arg.context_handle,
 				  (void __user *)(unsigned long)arg.fence_rep,
-				  NULL);
+				  NULL,
+				  arg.flags);
 	ttm_read_unlock(&dev_priv->reservation_sem);
 	if (unlikely(ret != 0))
-		return ret;
+		goto out;
 
 	vmw_kms_cursor_post_execbuf(dev_priv);
 
-	return 0;
+out:
+	if (in_fence)
+		dma_fence_put(in_fence);
+	return ret;
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c
index 6f4cb46..d23a18a 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fb.c
@@ -779,7 +779,6 @@ int vmw_fb_init(struct vmw_private *vmw_priv)
 	info->screen_base = (char __iomem *)par->vmalloc;
 	info->screen_size = fb_size;
 
-	info->flags = FBINFO_DEFAULT;
 	info->fbops = &vmw_fb_ops;
 
 	/* 24 depth per default */
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
index b8bc5bc..3bbad22 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.c
@@ -114,12 +114,11 @@ static void vmw_fence_obj_destroy(struct dma_fence *f)
 		container_of(f, struct vmw_fence_obj, base);
 
 	struct vmw_fence_manager *fman = fman_from_fence(fence);
-	unsigned long irq_flags;
 
-	spin_lock_irqsave(&fman->lock, irq_flags);
+	spin_lock(&fman->lock);
 	list_del_init(&fence->head);
 	--fman->num_fence_objects;
-	spin_unlock_irqrestore(&fman->lock, irq_flags);
+	spin_unlock(&fman->lock);
 	fence->destroy(fence);
 }
 
@@ -252,10 +251,10 @@ static void vmw_fence_work_func(struct work_struct *work)
 		INIT_LIST_HEAD(&list);
 		mutex_lock(&fman->goal_irq_mutex);
 
-		spin_lock_irq(&fman->lock);
+		spin_lock(&fman->lock);
 		list_splice_init(&fman->cleanup_list, &list);
 		seqno_valid = fman->seqno_valid;
-		spin_unlock_irq(&fman->lock);
+		spin_unlock(&fman->lock);
 
 		if (!seqno_valid && fman->goal_irq_on) {
 			fman->goal_irq_on = false;
@@ -305,15 +304,14 @@ struct vmw_fence_manager *vmw_fence_manager_init(struct vmw_private *dev_priv)
 
 void vmw_fence_manager_takedown(struct vmw_fence_manager *fman)
 {
-	unsigned long irq_flags;
 	bool lists_empty;
 
 	(void) cancel_work_sync(&fman->work);
 
-	spin_lock_irqsave(&fman->lock, irq_flags);
+	spin_lock(&fman->lock);
 	lists_empty = list_empty(&fman->fence_list) &&
 		list_empty(&fman->cleanup_list);
-	spin_unlock_irqrestore(&fman->lock, irq_flags);
+	spin_unlock(&fman->lock);
 
 	BUG_ON(!lists_empty);
 	kfree(fman);
@@ -323,7 +321,6 @@ static int vmw_fence_obj_init(struct vmw_fence_manager *fman,
 			      struct vmw_fence_obj *fence, u32 seqno,
 			      void (*destroy) (struct vmw_fence_obj *fence))
 {
-	unsigned long irq_flags;
 	int ret = 0;
 
 	dma_fence_init(&fence->base, &vmw_fence_ops, &fman->lock,
@@ -331,7 +328,7 @@ static int vmw_fence_obj_init(struct vmw_fence_manager *fman,
 	INIT_LIST_HEAD(&fence->seq_passed_actions);
 	fence->destroy = destroy;
 
-	spin_lock_irqsave(&fman->lock, irq_flags);
+	spin_lock(&fman->lock);
 	if (unlikely(fman->fifo_down)) {
 		ret = -EBUSY;
 		goto out_unlock;
@@ -340,7 +337,7 @@ static int vmw_fence_obj_init(struct vmw_fence_manager *fman,
 	++fman->num_fence_objects;
 
 out_unlock:
-	spin_unlock_irqrestore(&fman->lock, irq_flags);
+	spin_unlock(&fman->lock);
 	return ret;
 
 }
@@ -489,11 +486,9 @@ static void __vmw_fences_update(struct vmw_fence_manager *fman)
 
 void vmw_fences_update(struct vmw_fence_manager *fman)
 {
-	unsigned long irq_flags;
-
-	spin_lock_irqsave(&fman->lock, irq_flags);
+	spin_lock(&fman->lock);
 	__vmw_fences_update(fman);
-	spin_unlock_irqrestore(&fman->lock, irq_flags);
+	spin_unlock(&fman->lock);
 }
 
 bool vmw_fence_obj_signaled(struct vmw_fence_obj *fence)
@@ -650,6 +645,51 @@ int vmw_user_fence_create(struct drm_file *file_priv,
 
 
 /**
+ * vmw_wait_dma_fence - Wait for a dma fence
+ *
+ * @fman: pointer to a fence manager
+ * @fence: DMA fence to wait on
+ *
+ * This function handles the case when the fence is actually a fence
+ * array.  If that's the case, it'll wait on each of the child fences.
+ */
+int vmw_wait_dma_fence(struct vmw_fence_manager *fman,
+		       struct dma_fence *fence)
+{
+	struct dma_fence_array *fence_array;
+	int ret = 0;
+	int i;
+
+
+	if (dma_fence_is_signaled(fence))
+		return 0;
+
+	if (!dma_fence_is_array(fence))
+		return dma_fence_wait(fence, true);
+
+	/* From i915: Note that if the fence-array was created in
+	 * signal-on-any mode, we should *not* decompose it into its individual
+	 * fences. However, we don't currently store which mode the fence-array
+	 * is operating in. Fortunately, the only user of signal-on-any is
+	 * private to amdgpu and we should not see any incoming fence-array
+	 * from sync-file being in signal-on-any mode.
+	 */
+
+	fence_array = to_dma_fence_array(fence);
+	for (i = 0; i < fence_array->num_fences; i++) {
+		struct dma_fence *child = fence_array->fences[i];
+
+		ret = dma_fence_wait(child, true);
+
+		if (ret < 0)
+			return ret;
+	}
+
+	return 0;
+}
+
+
+/**
  * vmw_fence_fifo_down - signal all unsignaled fence objects.
  */
 
@@ -663,14 +703,14 @@ void vmw_fence_fifo_down(struct vmw_fence_manager *fman)
 	 * restart when we've released the fman->lock.
 	 */
 
-	spin_lock_irq(&fman->lock);
+	spin_lock(&fman->lock);
 	fman->fifo_down = true;
 	while (!list_empty(&fman->fence_list)) {
 		struct vmw_fence_obj *fence =
 			list_entry(fman->fence_list.prev, struct vmw_fence_obj,
 				   head);
 		dma_fence_get(&fence->base);
-		spin_unlock_irq(&fman->lock);
+		spin_unlock(&fman->lock);
 
 		ret = vmw_fence_obj_wait(fence, false, false,
 					 VMW_FENCE_WAIT_TIMEOUT);
@@ -686,18 +726,16 @@ void vmw_fence_fifo_down(struct vmw_fence_manager *fman)
 
 		BUG_ON(!list_empty(&fence->head));
 		dma_fence_put(&fence->base);
-		spin_lock_irq(&fman->lock);
+		spin_lock(&fman->lock);
 	}
-	spin_unlock_irq(&fman->lock);
+	spin_unlock(&fman->lock);
 }
 
 void vmw_fence_fifo_up(struct vmw_fence_manager *fman)
 {
-	unsigned long irq_flags;
-
-	spin_lock_irqsave(&fman->lock, irq_flags);
+	spin_lock(&fman->lock);
 	fman->fifo_down = false;
-	spin_unlock_irqrestore(&fman->lock, irq_flags);
+	spin_unlock(&fman->lock);
 }
 
 
@@ -812,9 +850,9 @@ int vmw_fence_obj_signaled_ioctl(struct drm_device *dev, void *data,
 	arg->signaled = vmw_fence_obj_signaled(fence);
 
 	arg->signaled_flags = arg->flags;
-	spin_lock_irq(&fman->lock);
+	spin_lock(&fman->lock);
 	arg->passed_seqno = dev_priv->last_read_seqno;
-	spin_unlock_irq(&fman->lock);
+	spin_unlock(&fman->lock);
 
 	ttm_base_object_unref(&base);
 
@@ -841,8 +879,7 @@ int vmw_fence_obj_unref_ioctl(struct drm_device *dev, void *data,
  *
  * This function is called when the seqno of the fence where @action is
  * attached has passed. It queues the event on the submitter's event list.
- * This function is always called from atomic context, and may be called
- * from irq context.
+ * This function is always called from atomic context.
  */
 static void vmw_event_fence_action_seq_passed(struct vmw_fence_action *action)
 {
@@ -851,13 +888,13 @@ static void vmw_event_fence_action_seq_passed(struct vmw_fence_action *action)
 	struct drm_device *dev = eaction->dev;
 	struct drm_pending_event *event = eaction->event;
 	struct drm_file *file_priv;
-	unsigned long irq_flags;
+
 
 	if (unlikely(event == NULL))
 		return;
 
 	file_priv = event->file_priv;
-	spin_lock_irqsave(&dev->event_lock, irq_flags);
+	spin_lock_irq(&dev->event_lock);
 
 	if (likely(eaction->tv_sec != NULL)) {
 		struct timeval tv;
@@ -869,7 +906,7 @@ static void vmw_event_fence_action_seq_passed(struct vmw_fence_action *action)
 
 	drm_send_event_locked(dev, eaction->event);
 	eaction->event = NULL;
-	spin_unlock_irqrestore(&dev->event_lock, irq_flags);
+	spin_unlock_irq(&dev->event_lock);
 }
 
 /**
@@ -904,11 +941,10 @@ static void vmw_fence_obj_add_action(struct vmw_fence_obj *fence,
 			      struct vmw_fence_action *action)
 {
 	struct vmw_fence_manager *fman = fman_from_fence(fence);
-	unsigned long irq_flags;
 	bool run_update = false;
 
 	mutex_lock(&fman->goal_irq_mutex);
-	spin_lock_irqsave(&fman->lock, irq_flags);
+	spin_lock(&fman->lock);
 
 	fman->pending_actions[action->type]++;
 	if (dma_fence_is_signaled_locked(&fence->base)) {
@@ -927,7 +963,7 @@ static void vmw_fence_obj_add_action(struct vmw_fence_obj *fence,
 		run_update = vmw_fence_goal_check_locked(fence);
 	}
 
-	spin_unlock_irqrestore(&fman->lock, irq_flags);
+	spin_unlock(&fman->lock);
 
 	if (run_update) {
 		if (!fman->goal_irq_on) {
@@ -1114,7 +1150,7 @@ int vmw_fence_event_ioctl(struct drm_device *dev, void *data,
 	}
 
 	vmw_execbuf_copy_fence_user(dev_priv, vmw_fp, 0, user_fence_rep, fence,
-				    handle);
+				    handle, -1, NULL);
 	vmw_fence_obj_unreference(&fence);
 	return 0;
 out_no_create:
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.h b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.h
index d9d85aa..20224db 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_fence.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_fence.h
@@ -28,6 +28,7 @@
 #ifndef _VMWGFX_FENCE_H_
 
 #include <linux/dma-fence.h>
+#include <linux/dma-fence-array.h>
 
 #define VMW_FENCE_WAIT_TIMEOUT (5*HZ)
 
@@ -102,6 +103,9 @@ extern int vmw_user_fence_create(struct drm_file *file_priv,
 				 struct vmw_fence_obj **p_fence,
 				 uint32_t *p_handle);
 
+extern int vmw_wait_dma_fence(struct vmw_fence_manager *fman,
+			      struct dma_fence *fence);
+
 extern void vmw_fence_fifo_up(struct vmw_fence_manager *fman);
 
 extern void vmw_fence_fifo_down(struct vmw_fence_manager *fman);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
index d2b03d4..f2f9d88 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_gmrid_manager.c
@@ -157,9 +157,9 @@ static int vmw_gmrid_man_takedown(struct ttm_mem_type_manager *man)
 }
 
 static void vmw_gmrid_man_debug(struct ttm_mem_type_manager *man,
-				const char *prefix)
+				struct drm_printer *printer)
 {
-	pr_info("%s: No debug info available for the GMR id manager\n", prefix);
+	drm_printf(printer, "No debug info available for the GMR id manager\n");
 }
 
 const struct ttm_mem_type_manager_func vmw_gmrid_manager_func = {
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_irq.c b/drivers/gpu/drm/vmwgfx/vmwgfx_irq.c
index 0c7e172..b9239ba 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_irq.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_irq.c
@@ -30,11 +30,56 @@
 
 #define VMW_FENCE_WRAP (1 << 24)
 
-irqreturn_t vmw_irq_handler(int irq, void *arg)
+/**
+ * vmw_thread_fn - Deferred (process context) irq handler
+ *
+ * @irq: irq number
+ * @arg: Closure argument. Pointer to a struct drm_device cast to void *
+ *
+ * This function implements the deferred part of irq processing.
+ * The function is guaranteed to run at least once after the
+ * vmw_irq_handler has returned with IRQ_WAKE_THREAD.
+ *
+ */
+static irqreturn_t vmw_thread_fn(int irq, void *arg)
+{
+	struct drm_device *dev = (struct drm_device *)arg;
+	struct vmw_private *dev_priv = vmw_priv(dev);
+	irqreturn_t ret = IRQ_NONE;
+
+	if (test_and_clear_bit(VMW_IRQTHREAD_FENCE,
+			       dev_priv->irqthread_pending)) {
+		vmw_fences_update(dev_priv->fman);
+		wake_up_all(&dev_priv->fence_queue);
+		ret = IRQ_HANDLED;
+	}
+
+	if (test_and_clear_bit(VMW_IRQTHREAD_CMDBUF,
+			       dev_priv->irqthread_pending)) {
+		vmw_cmdbuf_irqthread(dev_priv->cman);
+		ret = IRQ_HANDLED;
+	}
+
+	return ret;
+}
+
+/**
+ * vmw_irq_handler - irq handler
+ *
+ * @irq: irq number
+ * @arg: Closure argument. Pointer to a struct drm_device cast to void *
+ *
+ * This function implements the quick part of irq processing.
+ * The function performs fast actions like clearing the device interrupt
+ * flags and also reasonably quick actions like waking processes waiting for
+ * FIFO space. Other IRQ actions are deferred to the IRQ thread.
+ */
+static irqreturn_t vmw_irq_handler(int irq, void *arg)
 {
 	struct drm_device *dev = (struct drm_device *)arg;
 	struct vmw_private *dev_priv = vmw_priv(dev);
 	uint32_t status, masked_status;
+	irqreturn_t ret = IRQ_HANDLED;
 
 	status = inl(dev_priv->io_start + VMWGFX_IRQSTATUS_PORT);
 	masked_status = status & READ_ONCE(dev_priv->irq_mask);
@@ -45,20 +90,21 @@ irqreturn_t vmw_irq_handler(int irq, void *arg)
 	if (!status)
 		return IRQ_NONE;
 
-	if (masked_status & (SVGA_IRQFLAG_ANY_FENCE |
-			     SVGA_IRQFLAG_FENCE_GOAL)) {
-		vmw_fences_update(dev_priv->fman);
-		wake_up_all(&dev_priv->fence_queue);
-	}
-
 	if (masked_status & SVGA_IRQFLAG_FIFO_PROGRESS)
 		wake_up_all(&dev_priv->fifo_queue);
 
-	if (masked_status & (SVGA_IRQFLAG_COMMAND_BUFFER |
-			     SVGA_IRQFLAG_ERROR))
-		vmw_cmdbuf_tasklet_schedule(dev_priv->cman);
+	if ((masked_status & (SVGA_IRQFLAG_ANY_FENCE |
+			      SVGA_IRQFLAG_FENCE_GOAL)) &&
+	    !test_and_set_bit(VMW_IRQTHREAD_FENCE, dev_priv->irqthread_pending))
+		ret = IRQ_WAKE_THREAD;
 
-	return IRQ_HANDLED;
+	if ((masked_status & (SVGA_IRQFLAG_COMMAND_BUFFER |
+			      SVGA_IRQFLAG_ERROR)) &&
+	    !test_and_set_bit(VMW_IRQTHREAD_CMDBUF,
+			      dev_priv->irqthread_pending))
+		ret = IRQ_WAKE_THREAD;
+
+	return ret;
 }
 
 static bool vmw_fifo_idle(struct vmw_private *dev_priv, uint32_t seqno)
@@ -281,23 +327,15 @@ int vmw_wait_seqno(struct vmw_private *dev_priv,
 	return ret;
 }
 
-void vmw_irq_preinstall(struct drm_device *dev)
+static void vmw_irq_preinstall(struct drm_device *dev)
 {
 	struct vmw_private *dev_priv = vmw_priv(dev);
 	uint32_t status;
 
-	if (!(dev_priv->capabilities & SVGA_CAP_IRQMASK))
-		return;
-
 	status = inl(dev_priv->io_start + VMWGFX_IRQSTATUS_PORT);
 	outl(status, dev_priv->io_start + VMWGFX_IRQSTATUS_PORT);
 }
 
-int vmw_irq_postinstall(struct drm_device *dev)
-{
-	return 0;
-}
-
 void vmw_irq_uninstall(struct drm_device *dev)
 {
 	struct vmw_private *dev_priv = vmw_priv(dev);
@@ -306,8 +344,41 @@ void vmw_irq_uninstall(struct drm_device *dev)
 	if (!(dev_priv->capabilities & SVGA_CAP_IRQMASK))
 		return;
 
+	if (!dev->irq_enabled)
+		return;
+
 	vmw_write(dev_priv, SVGA_REG_IRQMASK, 0);
 
 	status = inl(dev_priv->io_start + VMWGFX_IRQSTATUS_PORT);
 	outl(status, dev_priv->io_start + VMWGFX_IRQSTATUS_PORT);
+
+	dev->irq_enabled = false;
+	free_irq(dev->irq, dev);
+}
+
+/**
+ * vmw_irq_install - Install the irq handlers
+ *
+ * @dev:  Pointer to the drm device.
+ * @irq:  The irq number.
+ * Return:  Zero if successful. Negative number otherwise.
+ */
+int vmw_irq_install(struct drm_device *dev, int irq)
+{
+	int ret;
+
+	if (dev->irq_enabled)
+		return -EBUSY;
+
+	vmw_irq_preinstall(dev);
+
+	ret = request_threaded_irq(irq, vmw_irq_handler, vmw_thread_fn,
+				   IRQF_SHARED, VMWGFX_DRIVER_NAME, dev);
+	if (ret < 0)
+		return ret;
+
+	dev->irq_enabled = true;
+	dev->irq = irq;
+
+	return ret;
 }
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 61e06f0..b850562f 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -1536,7 +1536,7 @@ static struct drm_framebuffer *vmw_kms_fb_create(struct drm_device *dev,
  * RETURNS
  * Zero for success or -errno
  */
-int
+static int
 vmw_kms_atomic_check_modeset(struct drm_device *dev,
 			     struct drm_atomic_state *state)
 {
@@ -1545,8 +1545,7 @@ vmw_kms_atomic_check_modeset(struct drm_device *dev,
 	struct vmw_private *dev_priv = vmw_priv(dev);
 	int i;
 
-
-	for_each_crtc_in_state(state, crtc, crtc_state, i) {
+	for_each_new_crtc_in_state(state, crtc, crtc_state, i) {
 		unsigned long requested_bb_mem = 0;
 
 		if (dev_priv->active_display_unit == vmw_du_screen_target) {
@@ -1567,10 +1566,34 @@ vmw_kms_atomic_check_modeset(struct drm_device *dev,
 }
 
 
+/**
+ * vmw_kms_atomic_commit - Perform an atomic state commit
+ *
+ * @dev: DRM device
+ * @state: the driver state object
+ * @nonblock: Whether nonblocking behaviour is requested
+ *
+ * This is a simple wrapper around drm_atomic_helper_commit() for
+ * us to clear the nonblocking value.
+ *
+ * Nonblocking commits currently cause synchronization issues
+ * for vmwgfx.
+ *
+ * RETURNS
+ * Zero for success or negative error code on failure.
+ */
+int vmw_kms_atomic_commit(struct drm_device *dev,
+			  struct drm_atomic_state *state,
+			  bool nonblock)
+{
+	return drm_atomic_helper_commit(dev, state, false);
+}
+
+
 static const struct drm_mode_config_funcs vmw_kms_funcs = {
 	.fb_create = vmw_kms_fb_create,
 	.atomic_check = vmw_kms_atomic_check_modeset,
-	.atomic_commit = drm_atomic_helper_commit,
+	.atomic_commit = vmw_kms_atomic_commit,
 };
 
 static int vmw_kms_generic_present(struct vmw_private *dev_priv,
@@ -1667,7 +1690,7 @@ int vmw_kms_init(struct vmw_private *dev_priv)
 
 int vmw_kms_close(struct vmw_private *dev_priv)
 {
-	int ret;
+	int ret = 0;
 
 	/*
 	 * Docs says we should take the lock before calling this function
@@ -1675,11 +1698,7 @@ int vmw_kms_close(struct vmw_private *dev_priv)
 	 * drm_encoder_cleanup which takes the lock we deadlock.
 	 */
 	drm_mode_config_cleanup(dev_priv->dev);
-	if (dev_priv->active_display_unit == vmw_du_screen_object)
-		ret = vmw_kms_sou_close_display(dev_priv);
-	else if (dev_priv->active_display_unit == vmw_du_screen_target)
-		ret = vmw_kms_stdu_close_display(dev_priv);
-	else
+	if (dev_priv->active_display_unit == vmw_du_legacy)
 		ret = vmw_kms_ldu_close_display(dev_priv);
 
 	return ret;
@@ -2499,7 +2518,7 @@ void vmw_kms_helper_buffer_finish(struct vmw_private *dev_priv,
 	if (file_priv)
 		vmw_execbuf_copy_fence_user(dev_priv, vmw_fpriv(file_priv),
 					    ret, user_fence_rep, fence,
-					    handle);
+					    handle, -1, NULL);
 	if (out_fence)
 		*out_fence = fence;
 	else
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index 5f8d678..ff9c838 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -390,7 +390,6 @@ int vmw_kms_update_proxy(struct vmw_resource *res,
  * Screen Objects display functions - vmwgfx_scrn.c
  */
 int vmw_kms_sou_init_display(struct vmw_private *dev_priv);
-int vmw_kms_sou_close_display(struct vmw_private *dev_priv);
 int vmw_kms_sou_do_surface_dirty(struct vmw_private *dev_priv,
 				 struct vmw_framebuffer *framebuffer,
 				 struct drm_clip_rect *clips,
@@ -418,7 +417,6 @@ int vmw_kms_sou_readback(struct vmw_private *dev_priv,
  * Screen Target Display Unit functions - vmwgfx_stdu.c
  */
 int vmw_kms_stdu_init_display(struct vmw_private *dev_priv);
-int vmw_kms_stdu_close_display(struct vmw_private *dev_priv);
 int vmw_kms_stdu_surface_dirty(struct vmw_private *dev_priv,
 			       struct vmw_framebuffer *framebuffer,
 			       struct drm_clip_rect *clips,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
index d3987bc..b8a0980 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
@@ -203,19 +203,7 @@ static void vmw_ldu_crtc_mode_set_nofb(struct drm_crtc *crtc)
 }
 
 /**
- * vmw_ldu_crtc_helper_prepare - Noop
- *
- * @crtc: CRTC associated with the new screen
- *
- * Prepares the CRTC for a mode set, but we don't need to do anything here.
- *
- */
-static void vmw_ldu_crtc_helper_prepare(struct drm_crtc *crtc)
-{
-}
-
-/**
- * vmw_ldu_crtc_helper_commit - Noop
+ * vmw_ldu_crtc_atomic_enable - Noop
  *
  * @crtc: CRTC associated with the new screen
  *
@@ -224,16 +212,18 @@ static void vmw_ldu_crtc_helper_prepare(struct drm_crtc *crtc)
  * but since for LDU the display plane is closely tied to the
  * CRTC, it makes more sense to do those at plane update time.
  */
-static void vmw_ldu_crtc_helper_commit(struct drm_crtc *crtc)
+static void vmw_ldu_crtc_atomic_enable(struct drm_crtc *crtc,
+				       struct drm_crtc_state *old_state)
 {
 }
 
 /**
- * vmw_ldu_crtc_helper_disable - Turns off CRTC
+ * vmw_ldu_crtc_atomic_disable - Turns off CRTC
  *
  * @crtc: CRTC to be turned off
  */
-static void vmw_ldu_crtc_helper_disable(struct drm_crtc *crtc)
+static void vmw_ldu_crtc_atomic_disable(struct drm_crtc *crtc,
+					struct drm_crtc_state *old_state)
 {
 }
 
@@ -388,13 +378,12 @@ drm_plane_helper_funcs vmw_ldu_primary_plane_helper_funcs = {
 };
 
 static const struct drm_crtc_helper_funcs vmw_ldu_crtc_helper_funcs = {
-	.prepare = vmw_ldu_crtc_helper_prepare,
-	.commit = vmw_ldu_crtc_helper_commit,
-	.disable = vmw_ldu_crtc_helper_disable,
 	.mode_set_nofb = vmw_ldu_crtc_mode_set_nofb,
 	.atomic_check = vmw_du_crtc_atomic_check,
 	.atomic_begin = vmw_du_crtc_atomic_begin,
 	.atomic_flush = vmw_du_crtc_atomic_flush,
+	.atomic_enable = vmw_ldu_crtc_atomic_enable,
+	.atomic_disable = vmw_ldu_crtc_atomic_disable,
 };
 
 
@@ -439,7 +428,7 @@ static int vmw_ldu_init(struct vmw_private *dev_priv, unsigned unit)
 				       0, &vmw_ldu_plane_funcs,
 				       vmw_primary_plane_formats,
 				       ARRAY_SIZE(vmw_primary_plane_formats),
-				       DRM_PLANE_TYPE_PRIMARY, NULL);
+				       NULL, DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		DRM_ERROR("Failed to initialize primary plane");
 		goto err_free;
@@ -454,7 +443,7 @@ static int vmw_ldu_init(struct vmw_private *dev_priv, unsigned unit)
 			0, &vmw_ldu_cursor_funcs,
 			vmw_cursor_plane_formats,
 			ARRAY_SIZE(vmw_cursor_plane_formats),
-			DRM_PLANE_TYPE_CURSOR, NULL);
+			NULL, DRM_PLANE_TYPE_CURSOR, NULL);
 	if (ret) {
 		DRM_ERROR("Failed to initialize cursor plane");
 		drm_plane_cleanup(&ldu->base.primary);
@@ -582,13 +571,9 @@ int vmw_kms_ldu_init_display(struct vmw_private *dev_priv)
 
 int vmw_kms_ldu_close_display(struct vmw_private *dev_priv)
 {
-	struct drm_device *dev = dev_priv->dev;
-
 	if (!dev_priv->ldu_priv)
 		return -ENOSYS;
 
-	drm_vblank_cleanup(dev);
-
 	BUG_ON(!list_empty(&dev_priv->ldu_priv->active));
 
 	kfree(dev_priv->ldu_priv);
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
index 8d7dc9d..d1552d3 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
@@ -270,22 +270,24 @@ static void vmw_sou_crtc_helper_prepare(struct drm_crtc *crtc)
 }
 
 /**
- * vmw_sou_crtc_helper_commit - Noop
+ * vmw_sou_crtc_atomic_enable - Noop
  *
  * @crtc: CRTC associated with the new screen
  *
  * This is called after a mode set has been completed.
  */
-static void vmw_sou_crtc_helper_commit(struct drm_crtc *crtc)
+static void vmw_sou_crtc_atomic_enable(struct drm_crtc *crtc,
+				       struct drm_crtc_state *old_state)
 {
 }
 
 /**
- * vmw_sou_crtc_helper_disable - Turns off CRTC
+ * vmw_sou_crtc_atomic_disable - Turns off CRTC
  *
  * @crtc: CRTC to be turned off
  */
-static void vmw_sou_crtc_helper_disable(struct drm_crtc *crtc)
+static void vmw_sou_crtc_atomic_disable(struct drm_crtc *crtc,
+					struct drm_crtc_state *old_state)
 {
 	struct vmw_private *dev_priv;
 	struct vmw_screen_object_unit *sou;
@@ -573,12 +575,12 @@ drm_plane_helper_funcs vmw_sou_primary_plane_helper_funcs = {
 
 static const struct drm_crtc_helper_funcs vmw_sou_crtc_helper_funcs = {
 	.prepare = vmw_sou_crtc_helper_prepare,
-	.commit = vmw_sou_crtc_helper_commit,
-	.disable = vmw_sou_crtc_helper_disable,
 	.mode_set_nofb = vmw_sou_crtc_mode_set_nofb,
 	.atomic_check = vmw_du_crtc_atomic_check,
 	.atomic_begin = vmw_du_crtc_atomic_begin,
 	.atomic_flush = vmw_du_crtc_atomic_flush,
+	.atomic_enable = vmw_sou_crtc_atomic_enable,
+	.atomic_disable = vmw_sou_crtc_atomic_disable,
 };
 
 
@@ -622,7 +624,7 @@ static int vmw_sou_init(struct vmw_private *dev_priv, unsigned unit)
 				       0, &vmw_sou_plane_funcs,
 				       vmw_primary_plane_formats,
 				       ARRAY_SIZE(vmw_primary_plane_formats),
-				       DRM_PLANE_TYPE_PRIMARY, NULL);
+				       NULL, DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		DRM_ERROR("Failed to initialize primary plane");
 		goto err_free;
@@ -637,7 +639,7 @@ static int vmw_sou_init(struct vmw_private *dev_priv, unsigned unit)
 			0, &vmw_sou_cursor_funcs,
 			vmw_cursor_plane_formats,
 			ARRAY_SIZE(vmw_cursor_plane_formats),
-			DRM_PLANE_TYPE_CURSOR, NULL);
+			NULL, DRM_PLANE_TYPE_CURSOR, NULL);
 	if (ret) {
 		DRM_ERROR("Failed to initialize cursor plane");
 		drm_plane_cleanup(&sou->base.primary);
@@ -746,15 +748,6 @@ int vmw_kms_sou_init_display(struct vmw_private *dev_priv)
 	return 0;
 }
 
-int vmw_kms_sou_close_display(struct vmw_private *dev_priv)
-{
-	struct drm_device *dev = dev_priv->dev;
-
-	drm_vblank_cleanup(dev);
-
-	return 0;
-}
-
 static int do_dmabuf_define_gmrfb(struct vmw_private *dev_priv,
 				  struct vmw_framebuffer *framebuffer)
 {
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 5284e8d..ca3afae 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -412,7 +412,8 @@ static void vmw_stdu_crtc_helper_prepare(struct drm_crtc *crtc)
 }
 
 
-static void vmw_stdu_crtc_helper_commit(struct drm_crtc *crtc)
+static void vmw_stdu_crtc_atomic_enable(struct drm_crtc *crtc,
+					struct drm_crtc_state *old_state)
 {
 	struct vmw_private *dev_priv;
 	struct vmw_screen_target_display_unit *stdu;
@@ -432,7 +433,8 @@ static void vmw_stdu_crtc_helper_commit(struct drm_crtc *crtc)
 		vmw_kms_del_active(dev_priv, &stdu->base);
 }
 
-static void vmw_stdu_crtc_helper_disable(struct drm_crtc *crtc)
+static void vmw_stdu_crtc_atomic_disable(struct drm_crtc *crtc,
+					 struct drm_crtc_state *old_state)
 {
 	struct vmw_private *dev_priv;
 	struct vmw_screen_target_display_unit *stdu;
@@ -1415,12 +1417,12 @@ drm_plane_helper_funcs vmw_stdu_primary_plane_helper_funcs = {
 
 static const struct drm_crtc_helper_funcs vmw_stdu_crtc_helper_funcs = {
 	.prepare = vmw_stdu_crtc_helper_prepare,
-	.commit = vmw_stdu_crtc_helper_commit,
-	.disable = vmw_stdu_crtc_helper_disable,
 	.mode_set_nofb = vmw_stdu_crtc_mode_set_nofb,
 	.atomic_check = vmw_du_crtc_atomic_check,
 	.atomic_begin = vmw_du_crtc_atomic_begin,
 	.atomic_flush = vmw_du_crtc_atomic_flush,
+	.atomic_enable = vmw_stdu_crtc_atomic_enable,
+	.atomic_disable = vmw_stdu_crtc_atomic_disable,
 };
 
 
@@ -1473,7 +1475,7 @@ static int vmw_stdu_init(struct vmw_private *dev_priv, unsigned unit)
 				       0, &vmw_stdu_plane_funcs,
 				       vmw_primary_plane_formats,
 				       ARRAY_SIZE(vmw_primary_plane_formats),
-				       DRM_PLANE_TYPE_PRIMARY, NULL);
+				       NULL, DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		DRM_ERROR("Failed to initialize primary plane");
 		goto err_free;
@@ -1488,7 +1490,7 @@ static int vmw_stdu_init(struct vmw_private *dev_priv, unsigned unit)
 			0, &vmw_stdu_cursor_funcs,
 			vmw_cursor_plane_formats,
 			ARRAY_SIZE(vmw_cursor_plane_formats),
-			DRM_PLANE_TYPE_CURSOR, NULL);
+			NULL, DRM_PLANE_TYPE_CURSOR, NULL);
 	if (ret) {
 		DRM_ERROR("Failed to initialize cursor plane");
 		drm_plane_cleanup(&stdu->base.primary);
@@ -1651,36 +1653,11 @@ int vmw_kms_stdu_init_display(struct vmw_private *dev_priv)
 
 		if (unlikely(ret != 0)) {
 			DRM_ERROR("Failed to initialize STDU %d", i);
-			goto err_vblank_cleanup;
+			return ret;
 		}
 	}
 
 	DRM_INFO("Screen Target Display device initialized\n");
 
 	return 0;
-
-err_vblank_cleanup:
-	drm_vblank_cleanup(dev);
-	return ret;
-}
-
-
-
-/**
- * vmw_kms_stdu_close_display - Cleans up after vmw_kms_stdu_init_display
- *
- * @dev_priv: VMW DRM device
- *
- * Frees up any resources allocated by vmw_kms_stdu_init_display
- *
- * RETURNS:
- * 0 on success
- */
-int vmw_kms_stdu_close_display(struct vmw_private *dev_priv)
-{
-	struct drm_device *dev = dev_priv->dev;
-
-	drm_vblank_cleanup(dev);
-
-	return 0;
 }
diff --git a/drivers/gpu/drm/zte/zx_drm_drv.c b/drivers/gpu/drm/zte/zx_drm_drv.c
index f46c855..4524482 100644
--- a/drivers/gpu/drm/zte/zx_drm_drv.c
+++ b/drivers/gpu/drm/zte/zx_drm_drv.c
@@ -59,11 +59,9 @@ static struct drm_driver zx_drm_driver = {
 	.driver_features = DRIVER_GEM | DRIVER_MODESET | DRIVER_PRIME |
 			   DRIVER_ATOMIC,
 	.lastclose = zx_drm_lastclose,
-	.gem_free_object = drm_gem_cma_free_object,
+	.gem_free_object_unlocked = drm_gem_cma_free_object,
 	.gem_vm_ops = &drm_gem_cma_vm_ops,
 	.dumb_create = drm_gem_cma_dumb_create,
-	.dumb_map_offset = drm_gem_cma_dumb_map_offset,
-	.dumb_destroy = drm_gem_dumb_destroy,
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_export = drm_gem_prime_export,
@@ -149,7 +147,6 @@ static int zx_drm_bind(struct device *dev)
 out_poll_fini:
 	drm_kms_helper_poll_fini(drm);
 	drm_mode_config_cleanup(drm);
-	drm_vblank_cleanup(drm);
 out_unbind:
 	component_unbind_all(dev, drm);
 out_unregister:
@@ -171,7 +168,6 @@ static void zx_drm_unbind(struct device *dev)
 	}
 	drm_kms_helper_poll_fini(drm);
 	drm_mode_config_cleanup(drm);
-	drm_vblank_cleanup(drm);
 	component_unbind_all(dev, drm);
 	dev_set_drvdata(dev, NULL);
 	drm->dev_private = NULL;
diff --git a/drivers/gpu/drm/zte/zx_hdmi.c b/drivers/gpu/drm/zte/zx_hdmi.c
index 0df7366..b8abb1b 100644
--- a/drivers/gpu/drm/zte/zx_hdmi.c
+++ b/drivers/gpu/drm/zte/zx_hdmi.c
@@ -124,7 +124,7 @@ static int zx_hdmi_config_video_avi(struct zx_hdmi *hdmi,
 	union hdmi_infoframe frame;
 	int ret;
 
-	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode);
+	ret = drm_hdmi_avi_infoframe_from_display_mode(&frame.avi, mode, false);
 	if (ret) {
 		DRM_DEV_ERROR(hdmi->dev, "failed to get avi infoframe: %d\n",
 			      ret);
@@ -300,7 +300,6 @@ zx_hdmi_connector_detect(struct drm_connector *connector, bool force)
 }
 
 static const struct drm_connector_funcs zx_hdmi_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = zx_hdmi_connector_detect,
 	.destroy = drm_connector_cleanup,
diff --git a/drivers/gpu/drm/zte/zx_plane.c b/drivers/gpu/drm/zte/zx_plane.c
index 4a62527..18e7634 100644
--- a/drivers/gpu/drm/zte/zx_plane.c
+++ b/drivers/gpu/drm/zte/zx_plane.c
@@ -540,7 +540,7 @@ int zx_plane_init(struct drm_device *drm, struct zx_plane *zplane,
 
 	ret = drm_universal_plane_init(drm, plane, VOU_CRTC_MASK,
 				       &zx_plane_funcs, formats, format_count,
-				       type, NULL);
+				       NULL, type, NULL);
 	if (ret) {
 		DRM_DEV_ERROR(dev, "failed to init universal plane: %d\n", ret);
 		return ret;
diff --git a/drivers/gpu/drm/zte/zx_tvenc.c b/drivers/gpu/drm/zte/zx_tvenc.c
index b56dc69..0de1a71 100644
--- a/drivers/gpu/drm/zte/zx_tvenc.c
+++ b/drivers/gpu/drm/zte/zx_tvenc.c
@@ -269,7 +269,6 @@ static struct drm_connector_helper_funcs zx_tvenc_connector_helper_funcs = {
 };
 
 static const struct drm_connector_funcs zx_tvenc_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.destroy = drm_connector_cleanup,
 	.reset = drm_atomic_helper_connector_reset,
diff --git a/drivers/gpu/drm/zte/zx_vga.c b/drivers/gpu/drm/zte/zx_vga.c
index 1e0811f..3e7e33c 100644
--- a/drivers/gpu/drm/zte/zx_vga.c
+++ b/drivers/gpu/drm/zte/zx_vga.c
@@ -138,7 +138,6 @@ zx_vga_connector_detect(struct drm_connector *connector, bool force)
 }
 
 static const struct drm_connector_funcs zx_vga_connector_funcs = {
-	.dpms = drm_atomic_helper_connector_dpms,
 	.fill_modes = drm_helper_probe_single_connector_modes,
 	.detect = zx_vga_connector_detect,
 	.destroy = drm_connector_cleanup,
diff --git a/drivers/gpu/drm/zte/zx_vou.c b/drivers/gpu/drm/zte/zx_vou.c
index 5fbd10b..7491813 100644
--- a/drivers/gpu/drm/zte/zx_vou.c
+++ b/drivers/gpu/drm/zte/zx_vou.c
@@ -350,7 +350,8 @@ static inline void vou_chn_set_update(struct zx_crtc *zcrtc)
 	zx_writel(zcrtc->chnreg + CHN_UPDATE, 1);
 }
 
-static void zx_crtc_enable(struct drm_crtc *crtc)
+static void zx_crtc_atomic_enable(struct drm_crtc *crtc,
+				  struct drm_crtc_state *old_state)
 {
 	struct drm_display_mode *mode = &crtc->state->adjusted_mode;
 	bool interlaced = mode->flags & DRM_MODE_FLAG_INTERLACE;
@@ -454,7 +455,8 @@ static void zx_crtc_enable(struct drm_crtc *crtc)
 		DRM_DEV_ERROR(vou->dev, "failed to enable pixclk: %d\n", ret);
 }
 
-static void zx_crtc_disable(struct drm_crtc *crtc)
+static void zx_crtc_atomic_disable(struct drm_crtc *crtc,
+				   struct drm_crtc_state *old_state)
 {
 	struct zx_crtc *zcrtc = to_zx_crtc(crtc);
 	const struct zx_crtc_bits *bits = zcrtc->bits;
@@ -490,9 +492,9 @@ static void zx_crtc_atomic_flush(struct drm_crtc *crtc,
 }
 
 static const struct drm_crtc_helper_funcs zx_crtc_helper_funcs = {
-	.enable = zx_crtc_enable,
-	.disable = zx_crtc_disable,
 	.atomic_flush = zx_crtc_atomic_flush,
+	.atomic_enable = zx_crtc_atomic_enable,
+	.atomic_disable = zx_crtc_atomic_disable,
 };
 
 static int zx_vou_enable_vblank(struct drm_crtc *crtc)
diff --git a/drivers/gpu/host1x/bus.c b/drivers/gpu/host1x/bus.c
index a048e3a..f9cde03 100644
--- a/drivers/gpu/host1x/bus.c
+++ b/drivers/gpu/host1x/bus.c
@@ -41,13 +41,15 @@ struct host1x_subdev {
 /**
  * host1x_subdev_add() - add a new subdevice with an associated device node
  * @device: host1x device to add the subdevice to
- * @driver: host1x driver
  * @np: device node
  */
 static int host1x_subdev_add(struct host1x_device *device,
+			     struct host1x_driver *driver,
 			     struct device_node *np)
 {
 	struct host1x_subdev *subdev;
+	struct device_node *child;
+	int err;
 
 	subdev = kzalloc(sizeof(*subdev), GFP_KERNEL);
 	if (!subdev)
@@ -60,6 +62,19 @@ static int host1x_subdev_add(struct host1x_device *device,
 	list_add_tail(&subdev->list, &device->subdevs);
 	mutex_unlock(&device->subdevs_lock);
 
+	/* recursively add children */
+	for_each_child_of_node(np, child) {
+		if (of_match_node(driver->subdevs, child) &&
+		    of_device_is_available(child)) {
+			err = host1x_subdev_add(device, driver, child);
+			if (err < 0) {
+				/* XXX cleanup? */
+				of_node_put(child);
+				return err;
+			}
+		}
+	}
+
 	return 0;
 }
 
@@ -88,7 +103,7 @@ static int host1x_device_parse_dt(struct host1x_device *device,
 	for_each_child_of_node(device->dev.parent->of_node, np) {
 		if (of_match_node(driver->subdevs, np) &&
 		    of_device_is_available(np)) {
-			err = host1x_subdev_add(device, np);
+			err = host1x_subdev_add(device, driver, np);
 			if (err < 0) {
 				of_node_put(np);
 				return err;
diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c
index 7782725..7f22c5c 100644
--- a/drivers/gpu/host1x/dev.c
+++ b/drivers/gpu/host1x/dev.c
@@ -134,8 +134,8 @@ static int host1x_probe(struct platform_device *pdev)
 
 	syncpt_irq = platform_get_irq(pdev, 0);
 	if (syncpt_irq < 0) {
-		dev_err(&pdev->dev, "failed to get IRQ\n");
-		return -ENXIO;
+		dev_err(&pdev->dev, "failed to get IRQ: %d\n", syncpt_irq);
+		return syncpt_irq;
 	}
 
 	host = devm_kzalloc(&pdev->dev, sizeof(*host), GFP_KERNEL);
diff --git a/drivers/gpu/host1x/hw/intr_hw.c b/drivers/gpu/host1x/hw/intr_hw.c
index dacb800..37ebb51 100644
--- a/drivers/gpu/host1x/hw/intr_hw.c
+++ b/drivers/gpu/host1x/hw/intr_hw.c
@@ -33,10 +33,10 @@ static void host1x_intr_syncpt_handle(struct host1x_syncpt *syncpt)
 	unsigned int id = syncpt->id;
 	struct host1x *host = syncpt->host;
 
-	host1x_sync_writel(host, BIT_MASK(id),
-		HOST1X_SYNC_SYNCPT_THRESH_INT_DISABLE(BIT_WORD(id)));
-	host1x_sync_writel(host, BIT_MASK(id),
-		HOST1X_SYNC_SYNCPT_THRESH_CPU0_INT_STATUS(BIT_WORD(id)));
+	host1x_sync_writel(host, BIT(id % 32),
+		HOST1X_SYNC_SYNCPT_THRESH_INT_DISABLE(id / 32));
+	host1x_sync_writel(host, BIT(id % 32),
+		HOST1X_SYNC_SYNCPT_THRESH_CPU0_INT_STATUS(id / 32));
 
 	schedule_work(&syncpt->intr.work);
 }
@@ -50,9 +50,9 @@ static irqreturn_t syncpt_thresh_isr(int irq, void *dev_id)
 	for (i = 0; i < DIV_ROUND_UP(host->info->nb_pts, 32); i++) {
 		reg = host1x_sync_readl(host,
 			HOST1X_SYNC_SYNCPT_THRESH_CPU0_INT_STATUS(i));
-		for_each_set_bit(id, &reg, BITS_PER_LONG) {
+		for_each_set_bit(id, &reg, 32) {
 			struct host1x_syncpt *syncpt =
-				host->syncpt + (i * BITS_PER_LONG + id);
+				host->syncpt + (i * 32 + id);
 			host1x_intr_syncpt_handle(syncpt);
 		}
 	}
@@ -117,17 +117,17 @@ static void _host1x_intr_set_syncpt_threshold(struct host1x *host,
 static void _host1x_intr_enable_syncpt_intr(struct host1x *host,
 					    unsigned int id)
 {
-	host1x_sync_writel(host, BIT_MASK(id),
-		HOST1X_SYNC_SYNCPT_THRESH_INT_ENABLE_CPU0(BIT_WORD(id)));
+	host1x_sync_writel(host, BIT(id % 32),
+		HOST1X_SYNC_SYNCPT_THRESH_INT_ENABLE_CPU0(id / 32));
 }
 
 static void _host1x_intr_disable_syncpt_intr(struct host1x *host,
 					     unsigned int id)
 {
-	host1x_sync_writel(host, BIT_MASK(id),
-		HOST1X_SYNC_SYNCPT_THRESH_INT_DISABLE(BIT_WORD(id)));
-	host1x_sync_writel(host, BIT_MASK(id),
-		HOST1X_SYNC_SYNCPT_THRESH_CPU0_INT_STATUS(BIT_WORD(id)));
+	host1x_sync_writel(host, BIT(id % 32),
+		HOST1X_SYNC_SYNCPT_THRESH_INT_DISABLE(id / 32));
+	host1x_sync_writel(host, BIT(id % 32),
+		HOST1X_SYNC_SYNCPT_THRESH_CPU0_INT_STATUS(id / 32));
 }
 
 static int _host1x_free_syncpt_irq(struct host1x *host)
diff --git a/drivers/gpu/host1x/hw/syncpt_hw.c b/drivers/gpu/host1x/hw/syncpt_hw.c
index c93f74f..7b0270d 100644
--- a/drivers/gpu/host1x/hw/syncpt_hw.c
+++ b/drivers/gpu/host1x/hw/syncpt_hw.c
@@ -89,7 +89,7 @@ static int syncpt_cpu_incr(struct host1x_syncpt *sp)
 	    host1x_syncpt_idle(sp))
 		return -EINVAL;
 
-	host1x_sync_writel(host, BIT_MASK(sp->id),
+	host1x_sync_writel(host, BIT(sp->id % 32),
 			   HOST1X_SYNC_SYNCPT_CPU_INCR(reg_offset));
 	wmb();
 
diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c
index bee5044..db509ab 100644
--- a/drivers/gpu/host1x/job.c
+++ b/drivers/gpu/host1x/job.c
@@ -197,10 +197,6 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job)
 		}
 
 		phys_addr = host1x_bo_pin(reloc->target.bo, &sgt);
-		if (!phys_addr) {
-			err = -EINVAL;
-			goto unpin;
-		}
 
 		job->addr_phys[job->num_unpins] = phys_addr;
 		job->unpins[job->num_unpins].bo = reloc->target.bo;
@@ -225,10 +221,6 @@ static unsigned int pin_job(struct host1x *host, struct host1x_job *job)
 		}
 
 		phys_addr = host1x_bo_pin(g->bo, &sgt);
-		if (!phys_addr) {
-			err = -EINVAL;
-			goto unpin;
-		}
 
 		if (!IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) && host->domain) {
 			for_each_sg(sgt->sgl, sg, sgt->nents, j)
diff --git a/drivers/gpu/ipu-v3/ipu-common.c b/drivers/gpu/ipu-v3/ipu-common.c
index 960d816..6a573d2 100644
--- a/drivers/gpu/ipu-v3/ipu-common.c
+++ b/drivers/gpu/ipu-v3/ipu-common.c
@@ -1217,8 +1217,8 @@ static int ipu_add_client_devices(struct ipu_soc *ipu, unsigned long ipu_base)
 		of_node = of_graph_get_port_by_id(dev->of_node, i);
 		if (!of_node) {
 			dev_info(dev,
-				 "no port@%d node in %s, not using %s%d\n",
-				 i, dev->of_node->full_name,
+				 "no port@%d node in %pOF, not using %s%d\n",
+				 i, dev->of_node,
 				 (i / 2) ? "DI" : "CSI", i % 2);
 			continue;
 		}
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 5ef2814..d654314 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -343,6 +343,7 @@
 
 config SENSORS_ASPEED
 	tristate "ASPEED AST2400/AST2500 PWM and Fan tach driver"
+	depends on THERMAL || THERMAL=n
 	select REGMAP
 	help
 	  This driver provides support for ASPEED AST2400/AST2500 PWM
@@ -790,6 +791,13 @@
 	  This driver can also be built as a module. If so, the module will
 	  be called ltc4261.
 
+config SENSORS_LTQ_CPUTEMP
+	bool "Lantiq cpu temperature sensor driver"
+	depends on LANTIQ
+	help
+	  If you say yes here you get support for the temperature
+	  sensor inside your CPU.
+
 config SENSORS_MAX1111
 	tristate "Maxim MAX1111 Serial 8-bit ADC chip and compatibles"
 	depends on SPI_MASTER
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index d4641a9..c84d978 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -110,6 +110,7 @@
 obj-$(CONFIG_SENSORS_LTC4245)	+= ltc4245.o
 obj-$(CONFIG_SENSORS_LTC4260)	+= ltc4260.o
 obj-$(CONFIG_SENSORS_LTC4261)	+= ltc4261.o
+obj-$(CONFIG_SENSORS_LTQ_CPUTEMP) += ltq-cputemp.o
 obj-$(CONFIG_SENSORS_MAX1111)	+= max1111.o
 obj-$(CONFIG_SENSORS_MAX16065)	+= max16065.o
 obj-$(CONFIG_SENSORS_MAX1619)	+= max1619.o
diff --git a/drivers/hwmon/adc128d818.c b/drivers/hwmon/adc128d818.c
index a557b46..bd2ca31 100644
--- a/drivers/hwmon/adc128d818.c
+++ b/drivers/hwmon/adc128d818.c
@@ -384,7 +384,7 @@ static struct attribute *adc128_attrs[] = {
 	NULL
 };
 
-static struct attribute_group adc128_group = {
+static const struct attribute_group adc128_group = {
 	.attrs = adc128_attrs,
 	.is_visible = adc128_is_visible,
 };
diff --git a/drivers/hwmon/ads1015.c b/drivers/hwmon/ads1015.c
index 357b426..98c704d 100644
--- a/drivers/hwmon/ads1015.c
+++ b/drivers/hwmon/ads1015.c
@@ -191,24 +191,23 @@ static int ads1015_get_channels_config_of(struct i2c_client *client)
 		unsigned int data_rate = ADS1015_DEFAULT_DATA_RATE;
 
 		if (of_property_read_u32(node, "reg", &pval)) {
-			dev_err(&client->dev, "invalid reg on %s\n",
-				node->full_name);
+			dev_err(&client->dev, "invalid reg on %pOF\n", node);
 			continue;
 		}
 
 		channel = pval;
 		if (channel >= ADS1015_CHANNELS) {
 			dev_err(&client->dev,
-				"invalid channel index %d on %s\n",
-				channel, node->full_name);
+				"invalid channel index %d on %pOF\n",
+				channel, node);
 			continue;
 		}
 
 		if (!of_property_read_u32(node, "ti,gain", &pval)) {
 			pga = pval;
 			if (pga > 6) {
-				dev_err(&client->dev, "invalid gain on %s\n",
-					node->full_name);
+				dev_err(&client->dev, "invalid gain on %pOF\n",
+					node);
 				return -EINVAL;
 			}
 		}
@@ -217,8 +216,7 @@ static int ads1015_get_channels_config_of(struct i2c_client *client)
 			data_rate = pval;
 			if (data_rate > 7) {
 				dev_err(&client->dev,
-					"invalid data_rate on %s\n",
-					node->full_name);
+					"invalid data_rate on %pOF\n", node);
 				return -EINVAL;
 			}
 		}
diff --git a/drivers/hwmon/adt7475.c b/drivers/hwmon/adt7475.c
index 1baa213..9ef8499 100644
--- a/drivers/hwmon/adt7475.c
+++ b/drivers/hwmon/adt7475.c
@@ -1319,14 +1319,14 @@ static struct attribute *vid_attrs[] = {
 	NULL
 };
 
-static struct attribute_group adt7475_attr_group = { .attrs = adt7475_attrs };
-static struct attribute_group fan4_attr_group = { .attrs = fan4_attrs };
-static struct attribute_group pwm2_attr_group = { .attrs = pwm2_attrs };
-static struct attribute_group in0_attr_group = { .attrs = in0_attrs };
-static struct attribute_group in3_attr_group = { .attrs = in3_attrs };
-static struct attribute_group in4_attr_group = { .attrs = in4_attrs };
-static struct attribute_group in5_attr_group = { .attrs = in5_attrs };
-static struct attribute_group vid_attr_group = { .attrs = vid_attrs };
+static const struct attribute_group adt7475_attr_group = { .attrs = adt7475_attrs };
+static const struct attribute_group fan4_attr_group = { .attrs = fan4_attrs };
+static const struct attribute_group pwm2_attr_group = { .attrs = pwm2_attrs };
+static const struct attribute_group in0_attr_group = { .attrs = in0_attrs };
+static const struct attribute_group in3_attr_group = { .attrs = in3_attrs };
+static const struct attribute_group in4_attr_group = { .attrs = in4_attrs };
+static const struct attribute_group in5_attr_group = { .attrs = in5_attrs };
+static const struct attribute_group vid_attr_group = { .attrs = vid_attrs };
 
 static int adt7475_detect(struct i2c_client *client,
 			  struct i2c_board_info *info)
diff --git a/drivers/hwmon/asc7621.c b/drivers/hwmon/asc7621.c
index c77644d..4875e99 100644
--- a/drivers/hwmon/asc7621.c
+++ b/drivers/hwmon/asc7621.c
@@ -512,7 +512,7 @@ static ssize_t show_pwm_ac(struct device *dev,
 {
 	SETUP_SHOW_DATA_PARAM(dev, attr);
 	u8 config, altbit, regval;
-	const u8 map[] = {
+	static const u8 map[] = {
 		0x01, 0x02, 0x04, 0x1f, 0x00, 0x06, 0x07, 0x10,
 		0x08, 0x0f, 0x1f, 0x1f, 0x1f, 0x1f, 0x1f, 0x1f
 	};
@@ -533,7 +533,7 @@ static ssize_t store_pwm_ac(struct device *dev,
 	SETUP_STORE_DATA_PARAM(dev, attr);
 	unsigned long reqval;
 	u8 currval, config, altbit, newval;
-	const u16 map[] = {
+	static const u16 map[] = {
 		0x04, 0x00, 0x01, 0xff, 0x02, 0xff, 0x05, 0x06,
 		0x08, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0x0f,
 		0x07, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff,
diff --git a/drivers/hwmon/aspeed-pwm-tacho.c b/drivers/hwmon/aspeed-pwm-tacho.c
index ddfe66b..69b97d4 100644
--- a/drivers/hwmon/aspeed-pwm-tacho.c
+++ b/drivers/hwmon/aspeed-pwm-tacho.c
@@ -20,6 +20,7 @@
 #include <linux/platform_device.h>
 #include <linux/sysfs.h>
 #include <linux/regmap.h>
+#include <linux/thermal.h>
 
 /* ASPEED PWM & FAN Tach Register Definition */
 #define ASPEED_PTCR_CTRL		0x00
@@ -166,6 +167,18 @@
 /* How long we sleep in us while waiting for an RPM result. */
 #define ASPEED_RPM_STATUS_SLEEP_USEC	500
 
+#define MAX_CDEV_NAME_LEN 16
+
+struct aspeed_cooling_device {
+	char name[16];
+	struct aspeed_pwm_tacho_data *priv;
+	struct thermal_cooling_device *tcdev;
+	int pwm_port;
+	u8 *cooling_levels;
+	u8 max_state;
+	u8 cur_state;
+};
+
 struct aspeed_pwm_tacho_data {
 	struct regmap *regmap;
 	unsigned long clk_freq;
@@ -180,6 +193,7 @@ struct aspeed_pwm_tacho_data {
 	u8 pwm_port_type[8];
 	u8 pwm_port_fan_ctrl[8];
 	u8 fan_tach_ch_source[16];
+	struct aspeed_cooling_device *cdev[8];
 	const struct attribute_group *groups[3];
 };
 
@@ -765,6 +779,94 @@ static void aspeed_create_fan_tach_channel(struct aspeed_pwm_tacho_data *priv,
 	}
 }
 
+static int
+aspeed_pwm_cz_get_max_state(struct thermal_cooling_device *tcdev,
+			    unsigned long *state)
+{
+	struct aspeed_cooling_device *cdev = tcdev->devdata;
+
+	*state = cdev->max_state;
+
+	return 0;
+}
+
+static int
+aspeed_pwm_cz_get_cur_state(struct thermal_cooling_device *tcdev,
+			    unsigned long *state)
+{
+	struct aspeed_cooling_device *cdev = tcdev->devdata;
+
+	*state = cdev->cur_state;
+
+	return 0;
+}
+
+static int
+aspeed_pwm_cz_set_cur_state(struct thermal_cooling_device *tcdev,
+			    unsigned long state)
+{
+	struct aspeed_cooling_device *cdev = tcdev->devdata;
+
+	if (state > cdev->max_state)
+		return -EINVAL;
+
+	cdev->cur_state = state;
+	cdev->priv->pwm_port_fan_ctrl[cdev->pwm_port] =
+					cdev->cooling_levels[cdev->cur_state];
+	aspeed_set_pwm_port_fan_ctrl(cdev->priv, cdev->pwm_port,
+				     cdev->cooling_levels[cdev->cur_state]);
+
+	return 0;
+}
+
+static const struct thermal_cooling_device_ops aspeed_pwm_cool_ops = {
+	.get_max_state = aspeed_pwm_cz_get_max_state,
+	.get_cur_state = aspeed_pwm_cz_get_cur_state,
+	.set_cur_state = aspeed_pwm_cz_set_cur_state,
+};
+
+static int aspeed_create_pwm_cooling(struct device *dev,
+				     struct device_node *child,
+				     struct aspeed_pwm_tacho_data *priv,
+				     u32 pwm_port, u8 num_levels)
+{
+	int ret;
+	struct aspeed_cooling_device *cdev;
+
+	cdev = devm_kzalloc(dev, sizeof(*cdev), GFP_KERNEL);
+
+	if (!cdev)
+		return -ENOMEM;
+
+	cdev->cooling_levels = devm_kzalloc(dev, num_levels, GFP_KERNEL);
+	if (!cdev->cooling_levels)
+		return -ENOMEM;
+
+	cdev->max_state = num_levels - 1;
+	ret = of_property_read_u8_array(child, "cooling-levels",
+					cdev->cooling_levels,
+					num_levels);
+	if (ret) {
+		dev_err(dev, "Property 'cooling-levels' cannot be read.\n");
+		return ret;
+	}
+	snprintf(cdev->name, MAX_CDEV_NAME_LEN, "%s%d", child->name, pwm_port);
+
+	cdev->tcdev = thermal_of_cooling_device_register(child,
+							 cdev->name,
+							 cdev,
+							 &aspeed_pwm_cool_ops);
+	if (IS_ERR(cdev->tcdev))
+		return PTR_ERR(cdev->tcdev);
+
+	cdev->priv = priv;
+	cdev->pwm_port = pwm_port;
+
+	priv->cdev[pwm_port] = cdev;
+
+	return 0;
+}
+
 static int aspeed_create_fan(struct device *dev,
 			     struct device_node *child,
 			     struct aspeed_pwm_tacho_data *priv)
@@ -778,6 +880,15 @@ static int aspeed_create_fan(struct device *dev,
 		return ret;
 	aspeed_create_pwm_port(priv, (u8)pwm_port);
 
+	ret = of_property_count_u8_elems(child, "cooling-levels");
+
+	if (ret > 0) {
+		ret = aspeed_create_pwm_cooling(dev, child, priv, pwm_port,
+						ret);
+		if (ret)
+			return ret;
+	}
+
 	count = of_property_count_u8_elems(child, "aspeed,fan-tach-ch");
 	if (count < 1)
 		return -EINVAL;
@@ -834,9 +945,10 @@ static int aspeed_pwm_tacho_probe(struct platform_device *pdev)
 
 	for_each_child_of_node(np, child) {
 		ret = aspeed_create_fan(dev, child, priv);
-		of_node_put(child);
-		if (ret)
+		if (ret) {
+			of_node_put(child);
 			return ret;
+		}
 	}
 
 	priv->groups[0] = &pwm_dev_group;
diff --git a/drivers/hwmon/da9052-hwmon.c b/drivers/hwmon/da9052-hwmon.c
index c9832bf..97a62f5 100644
--- a/drivers/hwmon/da9052-hwmon.c
+++ b/drivers/hwmon/da9052-hwmon.c
@@ -20,13 +20,19 @@
 #include <linux/module.h>
 #include <linux/slab.h>
 #include <linux/platform_device.h>
+#include <linux/property.h>
 
 #include <linux/mfd/da9052/da9052.h>
 #include <linux/mfd/da9052/reg.h>
+#include <linux/regulator/consumer.h>
 
 struct da9052_hwmon {
-	struct da9052	*da9052;
-	struct mutex	hwmon_lock;
+	struct da9052		*da9052;
+	struct mutex		hwmon_lock;
+	bool			tsi_as_adc;
+	int			tsiref_mv;
+	struct regulator	*tsiref;
+	struct completion	tsidone;
 };
 
 static const char * const input_names[] = {
@@ -37,6 +43,10 @@ static const char * const input_names[] = {
 	[DA9052_ADC_IN4]	=	"ADC IN4",
 	[DA9052_ADC_IN5]	=	"ADC IN5",
 	[DA9052_ADC_IN6]	=	"ADC IN6",
+	[DA9052_ADC_TSI_XP]	=	"ADC TS X+",
+	[DA9052_ADC_TSI_YP]	=	"ADC TS Y+",
+	[DA9052_ADC_TSI_XN]	=	"ADC TS X-",
+	[DA9052_ADC_TSI_YN]	=	"ADC TS Y-",
 	[DA9052_ADC_TJUNC]	=	"BATTERY JUNCTION TEMP",
 	[DA9052_ADC_VBBAT]	=	"BACK-UP BATTERY VOLTAGE",
 };
@@ -59,6 +69,11 @@ static inline int vbbat_reg_to_mv(int value)
 	return DIV_ROUND_CLOSEST(value * 5000, 1023);
 }
 
+static inline int input_tsireg_to_mv(struct da9052_hwmon *hwmon, int value)
+{
+	return DIV_ROUND_CLOSEST(value * hwmon->tsiref_mv, 1023);
+}
+
 static inline int da9052_enable_vddout_channel(struct da9052 *da9052)
 {
 	return da9052_reg_update(da9052, DA9052_ADC_CONT_REG,
@@ -154,6 +169,97 @@ static ssize_t da9052_read_misc_channel(struct device *dev,
 	return sprintf(buf, "%d\n", input_reg_to_mv(ret));
 }
 
+static int da9052_request_tsi_read(struct da9052_hwmon *hwmon, int channel)
+{
+	u8 val = DA9052_TSICONTB_TSIMAN;
+
+	switch (channel) {
+	case DA9052_ADC_TSI_XP:
+		val |= DA9052_TSICONTB_TSIMUX_XP;
+		break;
+	case DA9052_ADC_TSI_YP:
+		val |= DA9052_TSICONTB_TSIMUX_YP;
+		break;
+	case DA9052_ADC_TSI_XN:
+		val |= DA9052_TSICONTB_TSIMUX_XN;
+		break;
+	case DA9052_ADC_TSI_YN:
+		val |= DA9052_TSICONTB_TSIMUX_YN;
+		break;
+	}
+
+	return da9052_reg_write(hwmon->da9052, DA9052_TSI_CONT_B_REG, val);
+}
+
+static int da9052_get_tsi_result(struct da9052_hwmon *hwmon, int channel)
+{
+	u8 regs[3];
+	int msb, lsb, err;
+
+	/* block read to avoid separation of MSB and LSB */
+	err = da9052_group_read(hwmon->da9052, DA9052_TSI_X_MSB_REG,
+				ARRAY_SIZE(regs), regs);
+	if (err)
+		return err;
+
+	switch (channel) {
+	case DA9052_ADC_TSI_XP:
+	case DA9052_ADC_TSI_XN:
+		msb = regs[0] << DA9052_TSILSB_TSIXL_BITS;
+		lsb = regs[2] & DA9052_TSILSB_TSIXL;
+		lsb >>= DA9052_TSILSB_TSIXL_SHIFT;
+		break;
+	case DA9052_ADC_TSI_YP:
+	case DA9052_ADC_TSI_YN:
+		msb = regs[1] << DA9052_TSILSB_TSIYL_BITS;
+		lsb = regs[2] & DA9052_TSILSB_TSIYL;
+		lsb >>= DA9052_TSILSB_TSIYL_SHIFT;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return msb | lsb;
+}
+
+
+static ssize_t __da9052_read_tsi(struct device *dev, int channel)
+{
+	struct da9052_hwmon *hwmon = dev_get_drvdata(dev);
+	int ret;
+
+	reinit_completion(&hwmon->tsidone);
+
+	ret = da9052_request_tsi_read(hwmon, channel);
+	if (ret < 0)
+		return ret;
+
+	/* Wait for a conversion done interrupt */
+	if (!wait_for_completion_timeout(&hwmon->tsidone,
+					 msecs_to_jiffies(500)))
+		return -ETIMEDOUT;
+
+	return da9052_get_tsi_result(hwmon, channel);
+}
+
+static ssize_t da9052_read_tsi(struct device *dev,
+			       struct device_attribute *devattr,
+			       char *buf)
+{
+	struct da9052_hwmon *hwmon = dev_get_drvdata(dev);
+	int channel = to_sensor_dev_attr(devattr)->index;
+	int ret;
+
+	mutex_lock(&hwmon->hwmon_lock);
+	ret = __da9052_read_tsi(dev, channel);
+	mutex_unlock(&hwmon->hwmon_lock);
+
+	if (ret < 0)
+		return ret;
+	else
+		return sprintf(buf, "%d\n", input_tsireg_to_mv(hwmon, ret));
+}
+
 static ssize_t da9052_read_tjunc(struct device *dev,
 				 struct device_attribute *devattr, char *buf)
 {
@@ -196,43 +302,82 @@ static ssize_t show_label(struct device *dev,
 		       input_names[to_sensor_dev_attr(devattr)->index]);
 }
 
-static SENSOR_DEVICE_ATTR(in0_input, S_IRUGO, da9052_read_vddout, NULL,
+static umode_t da9052_channel_is_visible(struct kobject *kobj,
+					 struct attribute *attr, int index)
+{
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct da9052_hwmon *hwmon = dev_get_drvdata(dev);
+	struct device_attribute *dattr = container_of(attr,
+				struct device_attribute, attr);
+	struct sensor_device_attribute *sattr = to_sensor_dev_attr(dattr);
+
+	if (!hwmon->tsi_as_adc) {
+		switch (sattr->index) {
+		case DA9052_ADC_TSI_XP:
+		case DA9052_ADC_TSI_YP:
+		case DA9052_ADC_TSI_XN:
+		case DA9052_ADC_TSI_YN:
+			return 0;
+		}
+	}
+
+	return attr->mode;
+}
+
+static SENSOR_DEVICE_ATTR(in0_input, 0444, da9052_read_vddout, NULL,
 			  DA9052_ADC_VDDOUT);
-static SENSOR_DEVICE_ATTR(in0_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(in0_label, 0444, show_label, NULL,
 			  DA9052_ADC_VDDOUT);
-static SENSOR_DEVICE_ATTR(in3_input, S_IRUGO, da9052_read_vbat, NULL,
+static SENSOR_DEVICE_ATTR(in3_input, 0444, da9052_read_vbat, NULL,
 			  DA9052_ADC_VBAT);
-static SENSOR_DEVICE_ATTR(in3_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(in3_label, 0444, show_label, NULL,
 			  DA9052_ADC_VBAT);
-static SENSOR_DEVICE_ATTR(in4_input, S_IRUGO, da9052_read_misc_channel, NULL,
+static SENSOR_DEVICE_ATTR(in4_input, 0444, da9052_read_misc_channel, NULL,
 			  DA9052_ADC_IN4);
-static SENSOR_DEVICE_ATTR(in4_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(in4_label, 0444, show_label, NULL,
 			  DA9052_ADC_IN4);
-static SENSOR_DEVICE_ATTR(in5_input, S_IRUGO, da9052_read_misc_channel, NULL,
+static SENSOR_DEVICE_ATTR(in5_input, 0444, da9052_read_misc_channel, NULL,
 			  DA9052_ADC_IN5);
-static SENSOR_DEVICE_ATTR(in5_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(in5_label, 0444, show_label, NULL,
 			  DA9052_ADC_IN5);
-static SENSOR_DEVICE_ATTR(in6_input, S_IRUGO, da9052_read_misc_channel, NULL,
+static SENSOR_DEVICE_ATTR(in6_input, 0444, da9052_read_misc_channel, NULL,
 			  DA9052_ADC_IN6);
-static SENSOR_DEVICE_ATTR(in6_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(in6_label, 0444, show_label, NULL,
 			  DA9052_ADC_IN6);
-static SENSOR_DEVICE_ATTR(in9_input, S_IRUGO, da9052_read_vbbat, NULL,
+static SENSOR_DEVICE_ATTR(in9_input, 0444, da9052_read_vbbat, NULL,
 			  DA9052_ADC_VBBAT);
-static SENSOR_DEVICE_ATTR(in9_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(in9_label, 0444, show_label, NULL,
 			  DA9052_ADC_VBBAT);
 
-static SENSOR_DEVICE_ATTR(curr1_input, S_IRUGO, da9052_read_ich, NULL,
+static SENSOR_DEVICE_ATTR(in70_input, 0444, da9052_read_tsi, NULL,
+			  DA9052_ADC_TSI_XP);
+static SENSOR_DEVICE_ATTR(in70_label, 0444, show_label, NULL,
+			  DA9052_ADC_TSI_XP);
+static SENSOR_DEVICE_ATTR(in71_input, 0444, da9052_read_tsi, NULL,
+			  DA9052_ADC_TSI_XN);
+static SENSOR_DEVICE_ATTR(in71_label, 0444, show_label, NULL,
+			  DA9052_ADC_TSI_XN);
+static SENSOR_DEVICE_ATTR(in72_input, 0444, da9052_read_tsi, NULL,
+			  DA9052_ADC_TSI_YP);
+static SENSOR_DEVICE_ATTR(in72_label, 0444, show_label, NULL,
+			  DA9052_ADC_TSI_YP);
+static SENSOR_DEVICE_ATTR(in73_input, 0444, da9052_read_tsi, NULL,
+			  DA9052_ADC_TSI_YN);
+static SENSOR_DEVICE_ATTR(in73_label, 0444, show_label, NULL,
+			  DA9052_ADC_TSI_YN);
+
+static SENSOR_DEVICE_ATTR(curr1_input, 0444, da9052_read_ich, NULL,
 			  DA9052_ADC_ICH);
-static SENSOR_DEVICE_ATTR(curr1_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(curr1_label, 0444, show_label, NULL,
 			  DA9052_ADC_ICH);
 
-static SENSOR_DEVICE_ATTR(temp2_input, S_IRUGO, da9052_read_tbat, NULL,
+static SENSOR_DEVICE_ATTR(temp2_input, 0444, da9052_read_tbat, NULL,
 			  DA9052_ADC_TBAT);
-static SENSOR_DEVICE_ATTR(temp2_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(temp2_label, 0444, show_label, NULL,
 			  DA9052_ADC_TBAT);
-static SENSOR_DEVICE_ATTR(temp8_input, S_IRUGO, da9052_read_tjunc, NULL,
+static SENSOR_DEVICE_ATTR(temp8_input, 0444, da9052_read_tjunc, NULL,
 			  DA9052_ADC_TJUNC);
-static SENSOR_DEVICE_ATTR(temp8_label, S_IRUGO, show_label, NULL,
+static SENSOR_DEVICE_ATTR(temp8_label, 0444, show_label, NULL,
 			  DA9052_ADC_TJUNC);
 
 static struct attribute *da9052_attrs[] = {
@@ -246,6 +391,14 @@ static struct attribute *da9052_attrs[] = {
 	&sensor_dev_attr_in5_label.dev_attr.attr,
 	&sensor_dev_attr_in6_input.dev_attr.attr,
 	&sensor_dev_attr_in6_label.dev_attr.attr,
+	&sensor_dev_attr_in70_input.dev_attr.attr,
+	&sensor_dev_attr_in70_label.dev_attr.attr,
+	&sensor_dev_attr_in71_input.dev_attr.attr,
+	&sensor_dev_attr_in71_label.dev_attr.attr,
+	&sensor_dev_attr_in72_input.dev_attr.attr,
+	&sensor_dev_attr_in72_label.dev_attr.attr,
+	&sensor_dev_attr_in73_input.dev_attr.attr,
+	&sensor_dev_attr_in73_label.dev_attr.attr,
 	&sensor_dev_attr_in9_input.dev_attr.attr,
 	&sensor_dev_attr_in9_label.dev_attr.attr,
 	&sensor_dev_attr_curr1_input.dev_attr.attr,
@@ -257,29 +410,117 @@ static struct attribute *da9052_attrs[] = {
 	NULL
 };
 
-ATTRIBUTE_GROUPS(da9052);
+static const struct attribute_group da9052_group = {
+	.attrs = da9052_attrs,
+	.is_visible = da9052_channel_is_visible,
+};
+__ATTRIBUTE_GROUPS(da9052);
+
+static irqreturn_t da9052_tsi_datardy_irq(int irq, void *data)
+{
+	struct da9052_hwmon *hwmon = data;
+
+	complete(&hwmon->tsidone);
+	return IRQ_HANDLED;
+}
 
 static int da9052_hwmon_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
 	struct da9052_hwmon *hwmon;
 	struct device *hwmon_dev;
+	int err;
 
 	hwmon = devm_kzalloc(dev, sizeof(struct da9052_hwmon), GFP_KERNEL);
 	if (!hwmon)
 		return -ENOMEM;
 
+	platform_set_drvdata(pdev, hwmon);
+
 	mutex_init(&hwmon->hwmon_lock);
 	hwmon->da9052 = dev_get_drvdata(pdev->dev.parent);
 
+	init_completion(&hwmon->tsidone);
+
+	hwmon->tsi_as_adc =
+		device_property_read_bool(pdev->dev.parent, "dlg,tsi-as-adc");
+
+	if (hwmon->tsi_as_adc) {
+		hwmon->tsiref = devm_regulator_get(pdev->dev.parent, "tsiref");
+		if (IS_ERR(hwmon->tsiref)) {
+			err = PTR_ERR(hwmon->tsiref);
+			dev_err(&pdev->dev, "failed to get tsiref: %d", err);
+			return err;
+		}
+
+		err = regulator_enable(hwmon->tsiref);
+		if (err)
+			return err;
+
+		hwmon->tsiref_mv = regulator_get_voltage(hwmon->tsiref);
+		if (hwmon->tsiref_mv < 0) {
+			err = hwmon->tsiref_mv;
+			goto exit_regulator;
+		}
+
+		/* convert from microvolt (DT) to millivolt (hwmon) */
+		hwmon->tsiref_mv /= 1000;
+
+		/* TSIREF limits from datasheet */
+		if (hwmon->tsiref_mv < 1800 || hwmon->tsiref_mv > 2600) {
+			dev_err(hwmon->da9052->dev, "invalid TSIREF voltage: %d",
+				hwmon->tsiref_mv);
+			err = -ENXIO;
+			goto exit_regulator;
+		}
+
+		/* disable touchscreen features */
+		da9052_reg_write(hwmon->da9052, DA9052_TSI_CONT_A_REG, 0x00);
+
+		err = da9052_request_irq(hwmon->da9052, DA9052_IRQ_TSIREADY,
+					 "tsiready-irq", da9052_tsi_datardy_irq,
+					 hwmon);
+		if (err) {
+			dev_err(&pdev->dev, "Failed to register TSIRDY IRQ: %d",
+				err);
+			goto exit_regulator;
+		}
+	}
+
 	hwmon_dev = devm_hwmon_device_register_with_groups(dev, "da9052",
 							   hwmon,
 							   da9052_groups);
-	return PTR_ERR_OR_ZERO(hwmon_dev);
+	err = PTR_ERR_OR_ZERO(hwmon_dev);
+	if (err)
+		goto exit_irq;
+
+	return 0;
+
+exit_irq:
+	if (hwmon->tsi_as_adc)
+		da9052_free_irq(hwmon->da9052, DA9052_IRQ_TSIREADY, hwmon);
+exit_regulator:
+	if (hwmon->tsiref)
+		regulator_disable(hwmon->tsiref);
+
+	return err;
+}
+
+static int da9052_hwmon_remove(struct platform_device *pdev)
+{
+	struct da9052_hwmon *hwmon = platform_get_drvdata(pdev);
+
+	if (hwmon->tsi_as_adc) {
+		da9052_free_irq(hwmon->da9052, DA9052_IRQ_TSIREADY, hwmon);
+		regulator_disable(hwmon->tsiref);
+	}
+
+	return 0;
 }
 
 static struct platform_driver da9052_hwmon_driver = {
 	.probe = da9052_hwmon_probe,
+	.remove = da9052_hwmon_remove,
 	.driver = {
 		.name = "da9052-hwmon",
 	},
diff --git a/drivers/hwmon/ftsteutates.c b/drivers/hwmon/ftsteutates.c
index 0f0277e..0801f48 100644
--- a/drivers/hwmon/ftsteutates.c
+++ b/drivers/hwmon/ftsteutates.c
@@ -60,7 +60,7 @@
 
 static const unsigned short normal_i2c[] = { 0x73, I2C_CLIENT_END };
 
-static struct i2c_device_id fts_id[] = {
+static const struct i2c_device_id fts_id[] = {
 	{ "ftsteutates", 0 },
 	{ }
 };
@@ -435,6 +435,7 @@ clear_temp_alarm(struct device *dev, struct device_attribute *devattr,
 		goto error;
 
 	data->valid = false;
+	ret = count;
 error:
 	mutex_unlock(&data->update_lock);
 	return ret;
@@ -508,6 +509,7 @@ clear_fan_alarm(struct device *dev, struct device_attribute *devattr,
 		goto error;
 
 	data->valid = false;
+	ret = count;
 error:
 	mutex_unlock(&data->update_lock);
 	return ret;
diff --git a/drivers/hwmon/hwmon.c b/drivers/hwmon/hwmon.c
index dd6e17c..c9790e2 100644
--- a/drivers/hwmon/hwmon.c
+++ b/drivers/hwmon/hwmon.c
@@ -85,7 +85,7 @@ static umode_t hwmon_dev_name_is_visible(struct kobject *kobj,
 	return attr->mode;
 }
 
-static struct attribute_group hwmon_dev_attr_group = {
+static const struct attribute_group hwmon_dev_attr_group = {
 	.attrs		= hwmon_dev_attrs,
 	.is_visible	= hwmon_dev_name_is_visible,
 };
@@ -135,7 +135,7 @@ static int hwmon_thermal_get_temp(void *data, int *temp)
 	return 0;
 }
 
-static struct thermal_zone_of_device_ops hwmon_thermal_ops = {
+static const struct thermal_zone_of_device_ops hwmon_thermal_ops = {
 	.get_temp = hwmon_thermal_get_temp,
 };
 
diff --git a/drivers/hwmon/i5k_amb.c b/drivers/hwmon/i5k_amb.c
index a5a9f45..9397d2f 100644
--- a/drivers/hwmon/i5k_amb.c
+++ b/drivers/hwmon/i5k_amb.c
@@ -495,7 +495,7 @@ static struct {
 };
 
 #ifdef MODULE
-static struct pci_device_id i5k_amb_ids[] = {
+static const struct pci_device_id i5k_amb_ids[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_5000_ERR) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_5400_ERR) },
 	{ 0, }
diff --git a/drivers/hwmon/it87.c b/drivers/hwmon/it87.c
index 4dfc723..f8499cb 100644
--- a/drivers/hwmon/it87.c
+++ b/drivers/hwmon/it87.c
@@ -497,12 +497,14 @@ static const struct it87_devices it87_devices[] = {
 #define has_vin3_5v(data)	((data)->features & FEAT_VIN3_5V)
 
 struct it87_sio_data {
+	int sioaddr;
 	enum chips type;
 	/* Values read from Super-I/O config space */
 	u8 revision;
 	u8 vid_value;
 	u8 beep_pin;
 	u8 internal;	/* Internal sensors can be labeled */
+	bool need_in7_reroute;
 	/* Features skipped based on config or DMI */
 	u16 skip_in;
 	u8 skip_vid;
@@ -517,6 +519,7 @@ struct it87_sio_data {
  */
 struct it87_data {
 	const struct attribute_group *groups[7];
+	int sioaddr;
 	enum chips type;
 	u32 features;
 	u8 peci_mask;
@@ -532,6 +535,7 @@ struct it87_data {
 	u16 in_internal;	/* Bitfield, internal sensors (for labels) */
 	u16 has_in;		/* Bitfield, voltage sensors enabled */
 	u8 in[NUM_VIN][3];		/* [nr][0]=in, [1]=min, [2]=max */
+	bool need_in7_reroute;
 	u8 has_fan;		/* Bitfield, fans enabled */
 	u16 fan[NUM_FAN][2];	/* Register values, [nr][0]=fan, [1]=min */
 	u8 has_temp;		/* Bitfield, temp sensors enabled */
@@ -2487,6 +2491,7 @@ static int __init it87_find(int sioaddr, unsigned short *address,
 	}
 
 	err = 0;
+	sio_data->sioaddr = sioaddr;
 	sio_data->revision = superio_inb(sioaddr, DEVREV) & 0x0f;
 	pr_info("Found IT%04x%s chip at 0x%x, revision %d\n", chip_type,
 		it87_devices[sio_data->type].suffix,
@@ -2575,6 +2580,7 @@ static int __init it87_find(int sioaddr, unsigned short *address,
 				reg2c |= BIT(1);
 				superio_outb(sioaddr, IT87_SIO_PINX2_REG,
 					     reg2c);
+				sio_data->need_in7_reroute = true;
 				pr_notice("Routing internal VCCH5V to in7.\n");
 			}
 			pr_notice("in7 routed to internal voltage divider, with external pin disabled.\n");
@@ -2761,13 +2767,13 @@ static int __init it87_find(int sioaddr, unsigned short *address,
 		uart6 = sio_data->type == it8782 && (reg & BIT(2));
 
 		/*
-		 * The IT8720F has no VIN7 pin, so VCCH should always be
+		 * The IT8720F has no VIN7 pin, so VCCH5V should always be
 		 * routed internally to VIN7 with an internal divider.
 		 * Curiously, there still is a configuration bit to control
 		 * this, which means it can be set incorrectly. And even
 		 * more curiously, many boards out there are improperly
 		 * configured, even though the IT8720F datasheet claims
-		 * that the internal routing of VCCH to VIN7 is the default
+		 * that the internal routing of VCCH5V to VIN7 is the default
 		 * setting. So we force the internal routing in this case.
 		 *
 		 * On IT8782F, VIN7 is multiplexed with one of the UART6 pins.
@@ -2777,7 +2783,8 @@ static int __init it87_find(int sioaddr, unsigned short *address,
 		if ((sio_data->type == it8720 || uart6) && !(reg & BIT(1))) {
 			reg |= BIT(1);
 			superio_outb(sioaddr, IT87_SIO_PINX2_REG, reg);
-			pr_notice("Routing internal VCCH to in7\n");
+			sio_data->need_in7_reroute = true;
+			pr_notice("Routing internal VCCH5V to in7\n");
 		}
 		if (reg & BIT(0))
 			sio_data->internal |= BIT(0);
@@ -2828,13 +2835,89 @@ static int __init it87_find(int sioaddr, unsigned short *address,
 	return err;
 }
 
+/*
+ * Some chips seem to have default value 0xff for all limit
+ * registers. For low voltage limits it makes no sense and triggers
+ * alarms, so change to 0 instead. For high temperature limits, it
+ * means -1 degree C, which surprisingly doesn't trigger an alarm,
+ * but is still confusing, so change to 127 degrees C.
+ */
+static void it87_check_limit_regs(struct it87_data *data)
+{
+	int i, reg;
+
+	for (i = 0; i < NUM_VIN_LIMIT; i++) {
+		reg = it87_read_value(data, IT87_REG_VIN_MIN(i));
+		if (reg == 0xff)
+			it87_write_value(data, IT87_REG_VIN_MIN(i), 0);
+	}
+	for (i = 0; i < NUM_TEMP_LIMIT; i++) {
+		reg = it87_read_value(data, IT87_REG_TEMP_HIGH(i));
+		if (reg == 0xff)
+			it87_write_value(data, IT87_REG_TEMP_HIGH(i), 127);
+	}
+}
+
+/* Check if voltage monitors were reset manually or for some other reason */
+static void it87_check_voltage_monitors_reset(struct it87_data *data)
+{
+	int reg;
+
+	reg = it87_read_value(data, IT87_REG_VIN_ENABLE);
+	if ((reg & 0xff) == 0) {
+		/* Enable all voltage monitors */
+		it87_write_value(data, IT87_REG_VIN_ENABLE, 0xff);
+	}
+}
+
+/* Check if tachometers were reset manually or for some other reason */
+static void it87_check_tachometers_reset(struct platform_device *pdev)
+{
+	struct it87_sio_data *sio_data = dev_get_platdata(&pdev->dev);
+	struct it87_data *data = platform_get_drvdata(pdev);
+	u8 mask, fan_main_ctrl;
+
+	mask = 0x70 & ~(sio_data->skip_fan << 4);
+	fan_main_ctrl = it87_read_value(data, IT87_REG_FAN_MAIN_CTRL);
+	if ((fan_main_ctrl & mask) == 0) {
+		/* Enable all fan tachometers */
+		fan_main_ctrl |= mask;
+		it87_write_value(data, IT87_REG_FAN_MAIN_CTRL,
+				 fan_main_ctrl);
+	}
+}
+
+/* Set tachometers to 16-bit mode if needed */
+static void it87_check_tachometers_16bit_mode(struct platform_device *pdev)
+{
+	struct it87_data *data = platform_get_drvdata(pdev);
+	int reg;
+
+	if (!has_fan16_config(data))
+		return;
+
+	reg = it87_read_value(data, IT87_REG_FAN_16BIT);
+	if (~reg & 0x07 & data->has_fan) {
+		dev_dbg(&pdev->dev,
+			"Setting fan1-3 to 16-bit mode\n");
+		it87_write_value(data, IT87_REG_FAN_16BIT,
+				 reg | 0x07);
+	}
+}
+
+static void it87_start_monitoring(struct it87_data *data)
+{
+	it87_write_value(data, IT87_REG_CONFIG,
+			 (it87_read_value(data, IT87_REG_CONFIG) & 0x3e)
+			 | (update_vbat ? 0x41 : 0x01));
+}
+
 /* Called when we have found a new IT87. */
 static void it87_init_device(struct platform_device *pdev)
 {
 	struct it87_sio_data *sio_data = dev_get_platdata(&pdev->dev);
 	struct it87_data *data = platform_get_drvdata(pdev);
 	int tmp, i;
-	u8 mask;
 
 	/*
 	 * For each PWM channel:
@@ -2855,23 +2938,7 @@ static void it87_init_device(struct platform_device *pdev)
 		data->auto_pwm[i][3] = 0x7f;	/* Full speed, hard-coded */
 	}
 
-	/*
-	 * Some chips seem to have default value 0xff for all limit
-	 * registers. For low voltage limits it makes no sense and triggers
-	 * alarms, so change to 0 instead. For high temperature limits, it
-	 * means -1 degree C, which surprisingly doesn't trigger an alarm,
-	 * but is still confusing, so change to 127 degrees C.
-	 */
-	for (i = 0; i < NUM_VIN_LIMIT; i++) {
-		tmp = it87_read_value(data, IT87_REG_VIN_MIN(i));
-		if (tmp == 0xff)
-			it87_write_value(data, IT87_REG_VIN_MIN(i), 0);
-	}
-	for (i = 0; i < NUM_TEMP_LIMIT; i++) {
-		tmp = it87_read_value(data, IT87_REG_TEMP_HIGH(i));
-		if (tmp == 0xff)
-			it87_write_value(data, IT87_REG_TEMP_HIGH(i), 127);
-	}
+	it87_check_limit_regs(data);
 
 	/*
 	 * Temperature channels are not forcibly enabled, as they can be
@@ -2880,38 +2947,19 @@ static void it87_init_device(struct platform_device *pdev)
 	 * run-time through the temp{1-3}_type sysfs accessors if needed.
 	 */
 
-	/* Check if voltage monitors are reset manually or by some reason */
-	tmp = it87_read_value(data, IT87_REG_VIN_ENABLE);
-	if ((tmp & 0xff) == 0) {
-		/* Enable all voltage monitors */
-		it87_write_value(data, IT87_REG_VIN_ENABLE, 0xff);
-	}
+	it87_check_voltage_monitors_reset(data);
 
-	/* Check if tachometers are reset manually or by some reason */
-	mask = 0x70 & ~(sio_data->skip_fan << 4);
+	it87_check_tachometers_reset(pdev);
+
 	data->fan_main_ctrl = it87_read_value(data, IT87_REG_FAN_MAIN_CTRL);
-	if ((data->fan_main_ctrl & mask) == 0) {
-		/* Enable all fan tachometers */
-		data->fan_main_ctrl |= mask;
-		it87_write_value(data, IT87_REG_FAN_MAIN_CTRL,
-				 data->fan_main_ctrl);
-	}
 	data->has_fan = (data->fan_main_ctrl >> 4) & 0x07;
 
-	tmp = it87_read_value(data, IT87_REG_FAN_16BIT);
-
-	/* Set tachometers to 16-bit mode if needed */
-	if (has_fan16_config(data)) {
-		if (~tmp & 0x07 & data->has_fan) {
-			dev_dbg(&pdev->dev,
-				"Setting fan1-3 to 16-bit mode\n");
-			it87_write_value(data, IT87_REG_FAN_16BIT,
-					 tmp | 0x07);
-		}
-	}
+	it87_check_tachometers_16bit_mode(pdev);
 
 	/* Check for additional fans */
 	if (has_five_fans(data)) {
+		tmp = it87_read_value(data, IT87_REG_FAN_16BIT);
+
 		if (tmp & BIT(4))
 			data->has_fan |= BIT(3); /* fan4 enabled */
 		if (tmp & BIT(5))
@@ -2933,10 +2981,7 @@ static void it87_init_device(struct platform_device *pdev)
 			sio_data->skip_pwm |= BIT(5);
 	}
 
-	/* Start monitoring */
-	it87_write_value(data, IT87_REG_CONFIG,
-			 (it87_read_value(data, IT87_REG_CONFIG) & 0x3e)
-			 | (update_vbat ? 0x41 : 0x01));
+	it87_start_monitoring(data);
 }
 
 /* Return 1 if and only if the PWM interface is safe to use */
@@ -2986,8 +3031,6 @@ static int it87_check_pwm(struct device *dev)
 				 "PWM configuration is too broken to be fixed\n");
 		}
 
-		dev_info(dev,
-			 "Detected broken BIOS defaults, disabling PWM interface\n");
 		return 0;
 	} else if (fix_pwm_polarity) {
 		dev_info(dev,
@@ -3020,6 +3063,7 @@ static int it87_probe(struct platform_device *pdev)
 		return -ENOMEM;
 
 	data->addr = res->start;
+	data->sioaddr = sio_data->sioaddr;
 	data->type = sio_data->type;
 	data->features = it87_devices[sio_data->type].features;
 	data->peci_mask = it87_devices[sio_data->type].peci_mask;
@@ -3058,6 +3102,9 @@ static int it87_probe(struct platform_device *pdev)
 
 	/* Check PWM configuration */
 	enable_pwm_interface = it87_check_pwm(dev);
+	if (!enable_pwm_interface)
+		dev_info(dev,
+			 "Detected broken BIOS defaults, disabling PWM interface\n");
 
 	/* Starting with IT8721F, we handle scaling of internal voltages */
 	if (has_12mv_adc(data)) {
@@ -3085,6 +3132,7 @@ static int it87_probe(struct platform_device *pdev)
 	}
 
 	data->in_internal = sio_data->internal;
+	data->need_in7_reroute = sio_data->need_in7_reroute;
 	data->has_in = 0x3ff & ~sio_data->skip_in;
 
 	if (has_six_temp(data)) {
@@ -3140,9 +3188,71 @@ static int it87_probe(struct platform_device *pdev)
 	return PTR_ERR_OR_ZERO(hwmon_dev);
 }
 
+static void __maybe_unused it87_resume_sio(struct platform_device *pdev)
+{
+	struct it87_data *data = dev_get_drvdata(&pdev->dev);
+	int err;
+	int reg2c;
+
+	if (!data->need_in7_reroute)
+		return;
+
+	err = superio_enter(data->sioaddr);
+	if (err) {
+		dev_warn(&pdev->dev,
+			 "Unable to enter Super I/O to reroute in7 (%d)\n",
+			 err);
+		return;
+	}
+
+	superio_select(data->sioaddr, GPIO);
+
+	reg2c = superio_inb(data->sioaddr, IT87_SIO_PINX2_REG);
+	if (!(reg2c & BIT(1))) {
+		dev_dbg(&pdev->dev,
+			"Routing internal VCCH5V to in7 again\n");
+
+		reg2c |= BIT(1);
+		superio_outb(data->sioaddr, IT87_SIO_PINX2_REG,
+			     reg2c);
+	}
+
+	superio_exit(data->sioaddr);
+}
+
+static int __maybe_unused it87_resume(struct device *dev)
+{
+	struct platform_device *pdev = to_platform_device(dev);
+	struct it87_data *data = dev_get_drvdata(dev);
+
+	it87_resume_sio(pdev);
+
+	mutex_lock(&data->update_lock);
+
+	it87_check_pwm(dev);
+	it87_check_limit_regs(data);
+	it87_check_voltage_monitors_reset(data);
+	it87_check_tachometers_reset(pdev);
+	it87_check_tachometers_16bit_mode(pdev);
+
+	it87_start_monitoring(data);
+
+	/* force update */
+	data->valid = 0;
+
+	mutex_unlock(&data->update_lock);
+
+	it87_update_device(dev);
+
+	return 0;
+}
+
+static SIMPLE_DEV_PM_OPS(it87_dev_pm_ops, NULL, it87_resume);
+
 static struct platform_driver it87_driver = {
 	.driver = {
 		.name	= DRVNAME,
+		.pm	= &it87_dev_pm_ops,
 	},
 	.probe	= it87_probe,
 };
diff --git a/drivers/hwmon/jc42.c b/drivers/hwmon/jc42.c
index 1bf22ef..5f11dc0 100644
--- a/drivers/hwmon/jc42.c
+++ b/drivers/hwmon/jc42.c
@@ -72,6 +72,8 @@ static const unsigned short normal_i2c[] = {
 #define NXP_MANID		0x1131  /* NXP Semiconductors */
 #define ONS_MANID		0x1b09  /* ON Semiconductor */
 #define STM_MANID		0x104a  /* ST Microelectronics */
+#define GT_MANID		0x1c68	/* Giantec */
+#define GT_MANID2		0x132d	/* Giantec, 2nd mfg ID */
 
 /* Supported chips */
 
@@ -86,6 +88,13 @@ static const unsigned short normal_i2c[] = {
 #define AT30TSE004_DEVID	0x2200
 #define AT30TSE004_DEVID_MASK	0xffff
 
+/* Giantec */
+#define GT30TS00_DEVID		0x2200
+#define GT30TS00_DEVID_MASK	0xff00
+
+#define GT34TS02_DEVID		0x3300
+#define GT34TS02_DEVID_MASK	0xff00
+
 /* IDT */
 #define TSE2004_DEVID		0x2200
 #define TSE2004_DEVID_MASK	0xff00
@@ -130,6 +139,12 @@ static const unsigned short normal_i2c[] = {
 #define CAT6095_DEVID		0x0800	/* Also matches CAT34TS02 */
 #define CAT6095_DEVID_MASK	0xffe0
 
+#define CAT34TS02C_DEVID	0x0a00
+#define CAT34TS02C_DEVID_MASK	0xfff0
+
+#define CAT34TS04_DEVID		0x2200
+#define CAT34TS04_DEVID_MASK	0xfff0
+
 /* ST Microelectronics */
 #define STTS424_DEVID		0x0101
 #define STTS424_DEVID_MASK	0xffff
@@ -158,6 +173,8 @@ static struct jc42_chips jc42_chips[] = {
 	{ ADT_MANID, ADT7408_DEVID, ADT7408_DEVID_MASK },
 	{ ATMEL_MANID, AT30TS00_DEVID, AT30TS00_DEVID_MASK },
 	{ ATMEL_MANID2, AT30TSE004_DEVID, AT30TSE004_DEVID_MASK },
+	{ GT_MANID, GT30TS00_DEVID, GT30TS00_DEVID_MASK },
+	{ GT_MANID2, GT34TS02_DEVID, GT34TS02_DEVID_MASK },
 	{ IDT_MANID, TSE2004_DEVID, TSE2004_DEVID_MASK },
 	{ IDT_MANID, TS3000_DEVID, TS3000_DEVID_MASK },
 	{ IDT_MANID, TS3001_DEVID, TS3001_DEVID_MASK },
@@ -170,6 +187,8 @@ static struct jc42_chips jc42_chips[] = {
 	{ MCP_MANID, MCP9843_DEVID, MCP9843_DEVID_MASK },
 	{ NXP_MANID, SE97_DEVID, SE97_DEVID_MASK },
 	{ ONS_MANID, CAT6095_DEVID, CAT6095_DEVID_MASK },
+	{ ONS_MANID, CAT34TS02C_DEVID, CAT34TS02C_DEVID_MASK },
+	{ ONS_MANID, CAT34TS04_DEVID, CAT34TS04_DEVID_MASK },
 	{ NXP_MANID, SE98_DEVID, SE98_DEVID_MASK },
 	{ STM_MANID, STTS424_DEVID, STTS424_DEVID_MASK },
 	{ STM_MANID, STTS424E_DEVID, STTS424E_DEVID_MASK },
diff --git a/drivers/hwmon/ltq-cputemp.c b/drivers/hwmon/ltq-cputemp.c
new file mode 100644
index 0000000..1d33f94
--- /dev/null
+++ b/drivers/hwmon/ltq-cputemp.c
@@ -0,0 +1,163 @@
+/* Lantiq cpu temperature sensor driver
+ *
+ * Copyright (C) 2017 Florian Eckert <fe@dev.tdt.de>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version
+ *
+ * This program is distributed in the hope that it will be useful
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>
+ */
+
+#include <linux/bitops.h>
+#include <linux/delay.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+
+#include <lantiq_soc.h>
+
+/* gphy1 configuration register contains cpu temperature */
+#define CGU_GPHY1_CR   0x0040
+#define CGU_TEMP_PD    BIT(19)
+
+static void ltq_cputemp_enable(void)
+{
+	ltq_cgu_w32(ltq_cgu_r32(CGU_GPHY1_CR) | CGU_TEMP_PD, CGU_GPHY1_CR);
+}
+
+static void ltq_cputemp_disable(void *data)
+{
+	ltq_cgu_w32(ltq_cgu_r32(CGU_GPHY1_CR) & ~CGU_TEMP_PD, CGU_GPHY1_CR);
+}
+
+static int ltq_read(struct device *dev, enum hwmon_sensor_types type,
+		    u32 attr, int channel, long *temp)
+{
+	int value;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		/* get the temperature including one decimal place */
+		value = (ltq_cgu_r32(CGU_GPHY1_CR) >> 9) & 0x01FF;
+		value = value * 5;
+		/* range -38 to +154 °C, register value zero is -38.0 °C */
+		value -= 380;
+		/* scale temp to millidegree */
+		value = value * 100;
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	*temp = value;
+	return 0;
+}
+
+static umode_t ltq_is_visible(const void *_data, enum hwmon_sensor_types type,
+			      u32 attr, int channel)
+{
+	if (type != hwmon_temp)
+		return 0;
+
+	switch (attr) {
+	case hwmon_temp_input:
+		return 0444;
+	default:
+		return 0;
+	}
+}
+
+static const u32 ltq_chip_config[] = {
+	HWMON_C_REGISTER_TZ,
+	0
+};
+
+static const struct hwmon_channel_info ltq_chip = {
+	.type = hwmon_chip,
+	.config = ltq_chip_config,
+};
+
+static const u32 ltq_temp_config[] = {
+	HWMON_T_INPUT,
+	0
+};
+
+static const struct hwmon_channel_info ltq_temp = {
+	.type = hwmon_temp,
+	.config = ltq_temp_config,
+};
+
+static const struct hwmon_channel_info *ltq_info[] = {
+	&ltq_chip,
+	&ltq_temp,
+	NULL
+};
+
+static const struct hwmon_ops ltq_hwmon_ops = {
+	.is_visible = ltq_is_visible,
+	.read = ltq_read,
+};
+
+static const struct hwmon_chip_info ltq_chip_info = {
+	.ops = &ltq_hwmon_ops,
+	.info = ltq_info,
+};
+
+static int ltq_cputemp_probe(struct platform_device *pdev)
+{
+	struct device *hwmon_dev;
+	int err = 0;
+
+	/* available on vr9 v1.2 SoCs only */
+	if (ltq_soc_type() != SOC_TYPE_VR9_2)
+		return -ENODEV;
+
+	err = devm_add_action(&pdev->dev, ltq_cputemp_disable, NULL);
+	if (err)
+		return err;
+
+	ltq_cputemp_enable();
+
+	hwmon_dev = devm_hwmon_device_register_with_info(&pdev->dev,
+							 "ltq_cputemp",
+							 NULL,
+							 &ltq_chip_info,
+							 NULL);
+
+	if (IS_ERR(hwmon_dev)) {
+		dev_err(&pdev->dev, "Failed to register as hwmon device\n");
+		return PTR_ERR(hwmon_dev);
+	}
+
+	return 0;
+}
+
+static const struct of_device_id ltq_cputemp_match[] = {
+	{ .compatible = "lantiq,cputemp" },
+	{},
+};
+MODULE_DEVICE_TABLE(of, ltq_cputemp_match);
+
+static struct platform_driver ltq_cputemp_driver = {
+	.probe = ltq_cputemp_probe,
+	.driver = {
+		.name = "ltq-cputemp",
+		.of_match_table = ltq_cputemp_match,
+	},
+};
+
+module_platform_driver(ltq_cputemp_driver);
+
+MODULE_AUTHOR("Florian Eckert <fe@dev.tdt.de>");
+MODULE_DESCRIPTION("Lantiq cpu temperature sensor driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/hwmon/nct7802.c b/drivers/hwmon/nct7802.c
index 12b94b0..2876c18 100644
--- a/drivers/hwmon/nct7802.c
+++ b/drivers/hwmon/nct7802.c
@@ -704,7 +704,7 @@ static umode_t nct7802_temp_is_visible(struct kobject *kobj,
 	return attr->mode;
 }
 
-static struct attribute_group nct7802_temp_group = {
+static const struct attribute_group nct7802_temp_group = {
 	.attrs = nct7802_temp_attrs,
 	.is_visible = nct7802_temp_is_visible,
 };
@@ -802,7 +802,7 @@ static umode_t nct7802_in_is_visible(struct kobject *kobj,
 	return attr->mode;
 }
 
-static struct attribute_group nct7802_in_group = {
+static const struct attribute_group nct7802_in_group = {
 	.attrs = nct7802_in_attrs,
 	.is_visible = nct7802_in_is_visible,
 };
@@ -880,7 +880,7 @@ static umode_t nct7802_fan_is_visible(struct kobject *kobj,
 	return attr->mode;
 }
 
-static struct attribute_group nct7802_fan_group = {
+static const struct attribute_group nct7802_fan_group = {
 	.attrs = nct7802_fan_attrs,
 	.is_visible = nct7802_fan_is_visible,
 };
@@ -898,7 +898,7 @@ static struct attribute *nct7802_pwm_attrs[] = {
 	NULL
 };
 
-static struct attribute_group nct7802_pwm_group = {
+static const struct attribute_group nct7802_pwm_group = {
 	.attrs = nct7802_pwm_attrs,
 };
 
@@ -1011,7 +1011,7 @@ static struct attribute *nct7802_auto_point_attrs[] = {
 	NULL
 };
 
-static struct attribute_group nct7802_auto_point_group = {
+static const struct attribute_group nct7802_auto_point_group = {
 	.attrs = nct7802_auto_point_attrs,
 };
 
diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
index 68d717a..4001932 100644
--- a/drivers/hwmon/pmbus/Kconfig
+++ b/drivers/hwmon/pmbus/Kconfig
@@ -37,6 +37,15 @@
 	  This driver can also be built as a module. If so, the module will
 	  be called adm1275.
 
+config SENSORS_IBM_CFFPS
+	tristate "IBM Common Form Factor Power Supply"
+	help
+	  If you say yes here you get hardware monitoring support for the IBM
+	  Common Form Factor power supply.
+
+	  This driver can also be built as a module. If so, the module will
+	  be called ibm-cffps.
+
 config SENSORS_IR35221
 	tristate "Infineon IR35221"
 	default n
@@ -135,6 +144,15 @@
 	  This driver can also be built as a module. If so, the module will
 	  be called tps40422.
 
+config SENSORS_TPS53679
+	tristate "TI TPS53679"
+	help
+	  If you say yes here you get hardware monitoring support for TI
+	  TPS53679.
+
+	  This driver can also be built as a module. If so, the module will
+	  be called tps53679.
+
 config SENSORS_UCD9000
 	tristate "TI UCD90120, UCD90124, UCD90160, UCD9090, UCD90910"
 	default n
diff --git a/drivers/hwmon/pmbus/Makefile b/drivers/hwmon/pmbus/Makefile
index 75bb7ca..459a6be 100644
--- a/drivers/hwmon/pmbus/Makefile
+++ b/drivers/hwmon/pmbus/Makefile
@@ -5,6 +5,7 @@
 obj-$(CONFIG_PMBUS)		+= pmbus_core.o
 obj-$(CONFIG_SENSORS_PMBUS)	+= pmbus.o
 obj-$(CONFIG_SENSORS_ADM1275)	+= adm1275.o
+obj-$(CONFIG_SENSORS_IBM_CFFPS)	+= ibm-cffps.o
 obj-$(CONFIG_SENSORS_IR35221)	+= ir35221.o
 obj-$(CONFIG_SENSORS_LM25066)	+= lm25066.o
 obj-$(CONFIG_SENSORS_LTC2978)	+= ltc2978.o
@@ -14,6 +15,7 @@
 obj-$(CONFIG_SENSORS_MAX34440)	+= max34440.o
 obj-$(CONFIG_SENSORS_MAX8688)	+= max8688.o
 obj-$(CONFIG_SENSORS_TPS40422)	+= tps40422.o
+obj-$(CONFIG_SENSORS_TPS53679)	+= tps53679.o
 obj-$(CONFIG_SENSORS_UCD9000)	+= ucd9000.o
 obj-$(CONFIG_SENSORS_UCD9200)	+= ucd9200.o
 obj-$(CONFIG_SENSORS_ZL6100)	+= zl6100.o
diff --git a/drivers/hwmon/pmbus/ibm-cffps.c b/drivers/hwmon/pmbus/ibm-cffps.c
new file mode 100644
index 0000000..cb56da6
--- /dev/null
+++ b/drivers/hwmon/pmbus/ibm-cffps.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright 2017 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include <linux/bitops.h>
+#include <linux/device.h>
+#include <linux/i2c.h>
+#include <linux/module.h>
+
+#include "pmbus.h"
+
+/* STATUS_MFR_SPECIFIC bits */
+#define CFFPS_MFR_FAN_FAULT			BIT(0)
+#define CFFPS_MFR_THERMAL_FAULT			BIT(1)
+#define CFFPS_MFR_OV_FAULT			BIT(2)
+#define CFFPS_MFR_UV_FAULT			BIT(3)
+#define CFFPS_MFR_PS_KILL			BIT(4)
+#define CFFPS_MFR_OC_FAULT			BIT(5)
+#define CFFPS_MFR_VAUX_FAULT			BIT(6)
+#define CFFPS_MFR_CURRENT_SHARE_WARNING		BIT(7)
+
+static int ibm_cffps_read_byte_data(struct i2c_client *client, int page,
+				    int reg)
+{
+	int rc, mfr;
+
+	switch (reg) {
+	case PMBUS_STATUS_VOUT:
+	case PMBUS_STATUS_IOUT:
+	case PMBUS_STATUS_TEMPERATURE:
+	case PMBUS_STATUS_FAN_12:
+		rc = pmbus_read_byte_data(client, page, reg);
+		if (rc < 0)
+			return rc;
+
+		mfr = pmbus_read_byte_data(client, page,
+					   PMBUS_STATUS_MFR_SPECIFIC);
+		if (mfr < 0)
+			/*
+			 * Return the status register instead of an error,
+			 * since we successfully read status.
+			 */
+			return rc;
+
+		/* Add MFR_SPECIFIC bits to the standard pmbus status regs. */
+		if (reg == PMBUS_STATUS_FAN_12) {
+			if (mfr & CFFPS_MFR_FAN_FAULT)
+				rc |= PB_FAN_FAN1_FAULT;
+		} else if (reg == PMBUS_STATUS_TEMPERATURE) {
+			if (mfr & CFFPS_MFR_THERMAL_FAULT)
+				rc |= PB_TEMP_OT_FAULT;
+		} else if (reg == PMBUS_STATUS_VOUT) {
+			if (mfr & (CFFPS_MFR_OV_FAULT | CFFPS_MFR_VAUX_FAULT))
+				rc |= PB_VOLTAGE_OV_FAULT;
+			if (mfr & CFFPS_MFR_UV_FAULT)
+				rc |= PB_VOLTAGE_UV_FAULT;
+		} else if (reg == PMBUS_STATUS_IOUT) {
+			if (mfr & CFFPS_MFR_OC_FAULT)
+				rc |= PB_IOUT_OC_FAULT;
+			if (mfr & CFFPS_MFR_CURRENT_SHARE_WARNING)
+				rc |= PB_CURRENT_SHARE_FAULT;
+		}
+		break;
+	default:
+		rc = -ENODATA;
+		break;
+	}
+
+	return rc;
+}
+
+static int ibm_cffps_read_word_data(struct i2c_client *client, int page,
+				    int reg)
+{
+	int rc, mfr;
+
+	switch (reg) {
+	case PMBUS_STATUS_WORD:
+		rc = pmbus_read_word_data(client, page, reg);
+		if (rc < 0)
+			return rc;
+
+		mfr = pmbus_read_byte_data(client, page,
+					   PMBUS_STATUS_MFR_SPECIFIC);
+		if (mfr < 0)
+			/*
+			 * Return the status register instead of an error,
+			 * since we successfully read status.
+			 */
+			return rc;
+
+		if (mfr & CFFPS_MFR_PS_KILL)
+			rc |= PB_STATUS_OFF;
+		break;
+	default:
+		rc = -ENODATA;
+		break;
+	}
+
+	return rc;
+}
+
+static struct pmbus_driver_info ibm_cffps_info = {
+	.pages = 1,
+	.func[0] = PMBUS_HAVE_VIN | PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
+		PMBUS_HAVE_PIN | PMBUS_HAVE_FAN12 | PMBUS_HAVE_TEMP |
+		PMBUS_HAVE_TEMP2 | PMBUS_HAVE_TEMP3 | PMBUS_HAVE_STATUS_VOUT |
+		PMBUS_HAVE_STATUS_IOUT | PMBUS_HAVE_STATUS_INPUT |
+		PMBUS_HAVE_STATUS_TEMP | PMBUS_HAVE_STATUS_FAN12,
+	.read_byte_data = ibm_cffps_read_byte_data,
+	.read_word_data = ibm_cffps_read_word_data,
+};
+
+static int ibm_cffps_probe(struct i2c_client *client,
+			   const struct i2c_device_id *id)
+{
+	return pmbus_do_probe(client, id, &ibm_cffps_info);
+}
+
+static const struct i2c_device_id ibm_cffps_id[] = {
+	{ "ibm_cffps1", 1 },
+	{}
+};
+MODULE_DEVICE_TABLE(i2c, ibm_cffps_id);
+
+static const struct of_device_id ibm_cffps_of_match[] = {
+	{ .compatible = "ibm,cffps1" },
+	{}
+};
+MODULE_DEVICE_TABLE(of, ibm_cffps_of_match);
+
+static struct i2c_driver ibm_cffps_driver = {
+	.driver = {
+		.name = "ibm-cffps",
+		.of_match_table = ibm_cffps_of_match,
+	},
+	.probe = ibm_cffps_probe,
+	.remove = pmbus_do_remove,
+	.id_table = ibm_cffps_id,
+};
+
+module_i2c_driver(ibm_cffps_driver);
+
+MODULE_AUTHOR("Eddie James");
+MODULE_DESCRIPTION("PMBus driver for IBM Common Form Factor power supplies");
+MODULE_LICENSE("GPL");
diff --git a/drivers/hwmon/pmbus/lm25066.c b/drivers/hwmon/pmbus/lm25066.c
index a3d912cd..10d17fb 100644
--- a/drivers/hwmon/pmbus/lm25066.c
+++ b/drivers/hwmon/pmbus/lm25066.c
@@ -28,7 +28,7 @@
 #include <linux/i2c.h>
 #include "pmbus.h"
 
-enum chips { lm25056, lm25063, lm25066, lm5064, lm5066 };
+enum chips { lm25056, lm25063, lm25066, lm5064, lm5066, lm5066i };
 
 #define LM25066_READ_VAUX		0xd0
 #define LM25066_MFR_READ_IIN		0xd1
@@ -65,7 +65,7 @@ struct __coeff {
 #define PSC_CURRENT_IN_L	(PSC_NUM_CLASSES)
 #define PSC_POWER_L		(PSC_NUM_CLASSES + 1)
 
-static struct __coeff lm25066_coeff[5][PSC_NUM_CLASSES + 2] = {
+static struct __coeff lm25066_coeff[6][PSC_NUM_CLASSES + 2] = {
 	[lm25056] = {
 		[PSC_VOLTAGE_IN] = {
 			.m = 16296,
@@ -210,6 +210,41 @@ static struct __coeff lm25066_coeff[5][PSC_NUM_CLASSES + 2] = {
 			.m = 16,
 		},
 	},
+	[lm5066i] = {
+		[PSC_VOLTAGE_IN] = {
+			.m = 4617,
+			.b = -140,
+			.R = -2,
+		},
+		[PSC_VOLTAGE_OUT] = {
+			.m = 4602,
+			.b = 500,
+			.R = -2,
+		},
+		[PSC_CURRENT_IN] = {
+			.m = 15076,
+			.b = -504,
+			.R = -2,
+		},
+		[PSC_CURRENT_IN_L] = {
+			.m = 7645,
+			.b = 100,
+			.R = -2,
+		},
+		[PSC_POWER] = {
+			.m = 1701,
+			.b = -4000,
+			.R = -3,
+		},
+		[PSC_POWER_L] = {
+			.m = 861,
+			.b = -965,
+			.R = -3,
+		},
+		[PSC_TEMPERATURE] = {
+			.m = 16,
+		},
+	},
 };
 
 struct lm25066_data {
@@ -250,6 +285,7 @@ static int lm25066_read_word_data(struct i2c_client *client, int page, int reg)
 			ret = DIV_ROUND_CLOSEST(ret * 70, 453);
 			break;
 		case lm5066:
+		case lm5066i:
 			/* VIN: 2.18 mV VAUX: 725 uV LSB */
 			ret = DIV_ROUND_CLOSEST(ret * 725, 2180);
 			break;
@@ -488,16 +524,18 @@ static int lm25066_probe(struct i2c_client *client,
 	info->m[PSC_VOLTAGE_OUT] = coeff[PSC_VOLTAGE_OUT].m;
 	info->b[PSC_VOLTAGE_OUT] = coeff[PSC_VOLTAGE_OUT].b;
 	info->R[PSC_VOLTAGE_OUT] = coeff[PSC_VOLTAGE_OUT].R;
-	info->b[PSC_CURRENT_IN] = coeff[PSC_CURRENT_IN].b;
 	info->R[PSC_CURRENT_IN] = coeff[PSC_CURRENT_IN].R;
-	info->b[PSC_POWER] = coeff[PSC_POWER].b;
 	info->R[PSC_POWER] = coeff[PSC_POWER].R;
 	if (config & LM25066_DEV_SETUP_CL) {
 		info->m[PSC_CURRENT_IN] = coeff[PSC_CURRENT_IN_L].m;
+		info->b[PSC_CURRENT_IN] = coeff[PSC_CURRENT_IN_L].b;
 		info->m[PSC_POWER] = coeff[PSC_POWER_L].m;
+		info->b[PSC_POWER] = coeff[PSC_POWER_L].b;
 	} else {
 		info->m[PSC_CURRENT_IN] = coeff[PSC_CURRENT_IN].m;
+		info->b[PSC_CURRENT_IN] = coeff[PSC_CURRENT_IN].b;
 		info->m[PSC_POWER] = coeff[PSC_POWER].m;
+		info->b[PSC_POWER] = coeff[PSC_POWER].b;
 	}
 
 	return pmbus_do_probe(client, id, info);
@@ -509,6 +547,7 @@ static const struct i2c_device_id lm25066_id[] = {
 	{"lm25066", lm25066},
 	{"lm5064", lm5064},
 	{"lm5066", lm5066},
+	{"lm5066i", lm5066i},
 	{ }
 };
 
diff --git a/drivers/hwmon/pmbus/pmbus.h b/drivers/hwmon/pmbus/pmbus.h
index bfcb13b..4efa2bd 100644
--- a/drivers/hwmon/pmbus/pmbus.h
+++ b/drivers/hwmon/pmbus/pmbus.h
@@ -341,7 +341,7 @@ enum pmbus_sensor_classes {
 #define PMBUS_HAVE_STATUS_VMON	BIT(19)
 
 enum pmbus_data_format { linear = 0, direct, vid };
-enum vrm_version { vr11 = 0, vr12 };
+enum vrm_version { vr11 = 0, vr12, vr13 };
 
 struct pmbus_driver_info {
 	int pages;		/* Total number of pages */
diff --git a/drivers/hwmon/pmbus/pmbus_core.c b/drivers/hwmon/pmbus/pmbus_core.c
index f1eff6b..302f0ae 100644
--- a/drivers/hwmon/pmbus/pmbus_core.c
+++ b/drivers/hwmon/pmbus/pmbus_core.c
@@ -19,6 +19,7 @@
  * Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
  */
 
+#include <linux/debugfs.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/init.h>
@@ -101,6 +102,7 @@ struct pmbus_data {
 	int num_attributes;
 	struct attribute_group group;
 	const struct attribute_group *groups[2];
+	struct dentry *debugfs;		/* debugfs device directory */
 
 	struct pmbus_sensor *sensors;
 
@@ -112,12 +114,20 @@ struct pmbus_data {
 	 * A single status register covers multiple attributes,
 	 * so we keep them all together.
 	 */
-	u8 status[PB_NUM_STATUS_REG];
-	u8 status_register;
+	u16 status[PB_NUM_STATUS_REG];
+
+	bool has_status_word;		/* device uses STATUS_WORD register */
+	int (*read_status)(struct i2c_client *client, int page);
 
 	u8 currpage;
 };
 
+struct pmbus_debugfs_entry {
+	struct i2c_client *client;
+	u8 page;
+	u8 reg;
+};
+
 void pmbus_clear_cache(struct i2c_client *client)
 {
 	struct pmbus_data *data = i2c_get_clientdata(client);
@@ -324,7 +334,7 @@ static int pmbus_check_status_cml(struct i2c_client *client)
 	struct pmbus_data *data = i2c_get_clientdata(client);
 	int status, status2;
 
-	status = _pmbus_read_byte_data(client, -1, data->status_register);
+	status = data->read_status(client, -1);
 	if (status < 0 || (status & PB_STATUS_CML)) {
 		status2 = _pmbus_read_byte_data(client, -1, PMBUS_STATUS_CML);
 		if (status2 < 0 || (status2 & PB_CML_FAULT_INVALID_COMMAND))
@@ -348,6 +358,23 @@ static bool pmbus_check_register(struct i2c_client *client,
 	return rv >= 0;
 }
 
+static bool pmbus_check_status_register(struct i2c_client *client, int page)
+{
+	int status;
+	struct pmbus_data *data = i2c_get_clientdata(client);
+
+	status = data->read_status(client, page);
+	if (status >= 0 && !(data->flags & PMBUS_SKIP_STATUS_CHECK) &&
+	    (status & PB_STATUS_CML)) {
+		status = _pmbus_read_byte_data(client, -1, PMBUS_STATUS_CML);
+		if (status < 0 || (status & PB_CML_FAULT_INVALID_COMMAND))
+			status = -EIO;
+	}
+
+	pmbus_clear_fault_page(client, -1);
+	return status >= 0;
+}
+
 bool pmbus_check_byte_register(struct i2c_client *client, int page, int reg)
 {
 	return pmbus_check_register(client, _pmbus_read_byte_data, page, reg);
@@ -394,8 +421,7 @@ static struct pmbus_data *pmbus_update_device(struct device *dev)
 
 		for (i = 0; i < info->pages; i++) {
 			data->status[PB_STATUS_BASE + i]
-			    = _pmbus_read_byte_data(client, i,
-						    data->status_register);
+			    = data->read_status(client, i);
 			for (j = 0; j < ARRAY_SIZE(pmbus_status); j++) {
 				struct _pmbus_status *s = &pmbus_status[j];
 
@@ -531,6 +557,10 @@ static long pmbus_reg2data_vid(struct pmbus_data *data,
 		if (val >= 0x01)
 			rv = 250 + (val - 1) * 5;
 		break;
+	case vr13:
+		if (val >= 0x01)
+			rv = 500 + (val - 1) * 10;
+		break;
 	}
 	return rv;
 }
@@ -716,10 +746,10 @@ static int pmbus_get_boolean(struct pmbus_data *data, struct pmbus_boolean *b,
 {
 	struct pmbus_sensor *s1 = b->s1;
 	struct pmbus_sensor *s2 = b->s2;
-	u16 reg = (index >> 8) & 0xffff;
-	u8 mask = index & 0xff;
+	u16 reg = (index >> 16) & 0xffff;
+	u16 mask = index & 0xffff;
 	int ret, status;
-	u8 regval;
+	u16 regval;
 
 	status = data->status[reg];
 	if (status < 0)
@@ -860,7 +890,7 @@ static int pmbus_add_boolean(struct pmbus_data *data,
 			     const char *name, const char *type, int seq,
 			     struct pmbus_sensor *s1,
 			     struct pmbus_sensor *s2,
-			     u16 reg, u8 mask)
+			     u16 reg, u16 mask)
 {
 	struct pmbus_boolean *boolean;
 	struct sensor_device_attribute *a;
@@ -876,7 +906,7 @@ static int pmbus_add_boolean(struct pmbus_data *data,
 	boolean->s1 = s1;
 	boolean->s2 = s2;
 	pmbus_attr_init(a, boolean->name, S_IRUGO, pmbus_show_boolean, NULL,
-			(reg << 8) | mask);
+			(reg << 16) | mask);
 
 	return pmbus_add_attribute(data, &a->dev_attr.attr);
 }
@@ -962,7 +992,7 @@ struct pmbus_limit_attr {
  */
 struct pmbus_sensor_attr {
 	u16 reg;			/* sensor register */
-	u8 gbit;			/* generic status bit */
+	u16 gbit;			/* generic status bit */
 	u8 nlimit;			/* # of limit registers */
 	enum pmbus_sensor_classes class;/* sensor class */
 	const char *label;		/* sensor label */
@@ -1028,6 +1058,7 @@ static int pmbus_add_sensor_attrs_one(struct i2c_client *client,
 				      const struct pmbus_sensor_attr *attr)
 {
 	struct pmbus_sensor *base;
+	bool upper = !!(attr->gbit & 0xff00);	/* need to check STATUS_WORD */
 	int ret;
 
 	if (attr->label) {
@@ -1048,11 +1079,12 @@ static int pmbus_add_sensor_attrs_one(struct i2c_client *client,
 		/*
 		 * Add generic alarm attribute only if there are no individual
 		 * alarm attributes, if there is a global alarm bit, and if
-		 * the generic status register for this page is accessible.
+		 * the generic status register (word or byte, depending on
+		 * which global bit is set) for this page is accessible.
 		 */
 		if (!ret && attr->gbit &&
-		    pmbus_check_byte_register(client, page,
-					      data->status_register)) {
+		    (!upper || (upper && data->has_status_word)) &&
+		    pmbus_check_status_register(client, page)) {
 			ret = pmbus_add_boolean(data, name, "alarm", index,
 						NULL, NULL,
 						PB_STATUS_BASE + page,
@@ -1308,6 +1340,7 @@ static const struct pmbus_sensor_attr current_attributes[] = {
 		.func = PMBUS_HAVE_IIN,
 		.sfunc = PMBUS_HAVE_STATUS_INPUT,
 		.sbase = PB_STATUS_INPUT_BASE,
+		.gbit = PB_STATUS_INPUT,
 		.limit = iin_limit_attrs,
 		.nlimit = ARRAY_SIZE(iin_limit_attrs),
 	}, {
@@ -1392,6 +1425,7 @@ static const struct pmbus_sensor_attr power_attributes[] = {
 		.func = PMBUS_HAVE_PIN,
 		.sfunc = PMBUS_HAVE_STATUS_INPUT,
 		.sbase = PB_STATUS_INPUT_BASE,
+		.gbit = PB_STATUS_INPUT,
 		.limit = pin_limit_attrs,
 		.nlimit = ARRAY_SIZE(pin_limit_attrs),
 	}, {
@@ -1729,6 +1763,16 @@ static int pmbus_identify_common(struct i2c_client *client,
 	return 0;
 }
 
+static int pmbus_read_status_byte(struct i2c_client *client, int page)
+{
+	return _pmbus_read_byte_data(client, page, PMBUS_STATUS_BYTE);
+}
+
+static int pmbus_read_status_word(struct i2c_client *client, int page)
+{
+	return _pmbus_read_word_data(client, page, PMBUS_STATUS_WORD);
+}
+
 static int pmbus_init_common(struct i2c_client *client, struct pmbus_data *data,
 			     struct pmbus_driver_info *info)
 {
@@ -1736,19 +1780,21 @@ static int pmbus_init_common(struct i2c_client *client, struct pmbus_data *data,
 	int page, ret;
 
 	/*
-	 * Some PMBus chips don't support PMBUS_STATUS_BYTE, so try
-	 * to use PMBUS_STATUS_WORD instead if that is the case.
+	 * Some PMBus chips don't support PMBUS_STATUS_WORD, so try
+	 * to use PMBUS_STATUS_BYTE instead if that is the case.
 	 * Bail out if both registers are not supported.
 	 */
-	data->status_register = PMBUS_STATUS_BYTE;
-	ret = i2c_smbus_read_byte_data(client, PMBUS_STATUS_BYTE);
-	if (ret < 0 || ret == 0xff) {
-		data->status_register = PMBUS_STATUS_WORD;
-		ret = i2c_smbus_read_word_data(client, PMBUS_STATUS_WORD);
-		if (ret < 0 || ret == 0xffff) {
+	data->read_status = pmbus_read_status_word;
+	ret = i2c_smbus_read_word_data(client, PMBUS_STATUS_WORD);
+	if (ret < 0 || ret == 0xffff) {
+		data->read_status = pmbus_read_status_byte;
+		ret = i2c_smbus_read_byte_data(client, PMBUS_STATUS_BYTE);
+		if (ret < 0 || ret == 0xff) {
 			dev_err(dev, "PMBus status register not found\n");
 			return -ENODEV;
 		}
+	} else {
+		data->has_status_word = true;
 	}
 
 	/* Enable PEC if the controller supports it */
@@ -1859,6 +1905,184 @@ static int pmbus_regulator_register(struct pmbus_data *data)
 }
 #endif
 
+static struct dentry *pmbus_debugfs_dir;	/* pmbus debugfs directory */
+
+#if IS_ENABLED(CONFIG_DEBUG_FS)
+static int pmbus_debugfs_get(void *data, u64 *val)
+{
+	int rc;
+	struct pmbus_debugfs_entry *entry = data;
+
+	rc = _pmbus_read_byte_data(entry->client, entry->page, entry->reg);
+	if (rc < 0)
+		return rc;
+
+	*val = rc;
+
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(pmbus_debugfs_ops, pmbus_debugfs_get, NULL,
+			 "0x%02llx\n");
+
+static int pmbus_debugfs_get_status(void *data, u64 *val)
+{
+	int rc;
+	struct pmbus_debugfs_entry *entry = data;
+	struct pmbus_data *pdata = i2c_get_clientdata(entry->client);
+
+	rc = pdata->read_status(entry->client, entry->page);
+	if (rc < 0)
+		return rc;
+
+	*val = rc;
+
+	return 0;
+}
+DEFINE_DEBUGFS_ATTRIBUTE(pmbus_debugfs_ops_status, pmbus_debugfs_get_status,
+			 NULL, "0x%04llx\n");
+
+static int pmbus_init_debugfs(struct i2c_client *client,
+			      struct pmbus_data *data)
+{
+	int i, idx = 0;
+	char name[PMBUS_NAME_SIZE];
+	struct pmbus_debugfs_entry *entries;
+
+	if (!pmbus_debugfs_dir)
+		return -ENODEV;
+
+	/*
+	 * Create the debugfs directory for this device. Use the hwmon device
+	 * name to avoid conflicts (hwmon numbers are globally unique).
+	 */
+	data->debugfs = debugfs_create_dir(dev_name(data->hwmon_dev),
+					   pmbus_debugfs_dir);
+	if (IS_ERR_OR_NULL(data->debugfs)) {
+		data->debugfs = NULL;
+		return -ENODEV;
+	}
+
+	/* Allocate the max possible entries we need. */
+	entries = devm_kzalloc(data->dev,
+			       sizeof(*entries) * (data->info->pages * 10),
+			       GFP_KERNEL);
+	if (!entries)
+		return -ENOMEM;
+
+	for (i = 0; i < data->info->pages; ++i) {
+		/* Check accessibility of status register if it's not page 0 */
+		if (!i || pmbus_check_status_register(client, i)) {
+			/* No need to set reg as we have special read op. */
+			entries[idx].client = client;
+			entries[idx].page = i;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops_status);
+		}
+
+		if (data->info->func[i] & PMBUS_HAVE_STATUS_VOUT) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_VOUT;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_vout", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+
+		if (data->info->func[i] & PMBUS_HAVE_STATUS_IOUT) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_IOUT;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_iout", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+
+		if (data->info->func[i] & PMBUS_HAVE_STATUS_INPUT) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_INPUT;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_input", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+
+		if (data->info->func[i] & PMBUS_HAVE_STATUS_TEMP) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_TEMPERATURE;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_temp", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+
+		if (pmbus_check_byte_register(client, i, PMBUS_STATUS_CML)) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_CML;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_cml", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+
+		if (pmbus_check_byte_register(client, i, PMBUS_STATUS_OTHER)) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_OTHER;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_other", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+
+		if (pmbus_check_byte_register(client, i,
+					      PMBUS_STATUS_MFR_SPECIFIC)) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_MFR_SPECIFIC;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_mfr", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+
+		if (data->info->func[i] & PMBUS_HAVE_STATUS_FAN12) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_FAN_12;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_fan12", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+
+		if (data->info->func[i] & PMBUS_HAVE_STATUS_FAN34) {
+			entries[idx].client = client;
+			entries[idx].page = i;
+			entries[idx].reg = PMBUS_STATUS_FAN_34;
+			scnprintf(name, PMBUS_NAME_SIZE, "status%d_fan34", i);
+			debugfs_create_file(name, 0444, data->debugfs,
+					    &entries[idx++],
+					    &pmbus_debugfs_ops);
+		}
+	}
+
+	return 0;
+}
+#else
+static int pmbus_init_debugfs(struct i2c_client *client,
+			      struct pmbus_data *data)
+{
+	return 0;
+}
+#endif	/* IS_ENABLED(CONFIG_DEBUG_FS) */
+
 int pmbus_do_probe(struct i2c_client *client, const struct i2c_device_id *id,
 		   struct pmbus_driver_info *info)
 {
@@ -1918,6 +2142,10 @@ int pmbus_do_probe(struct i2c_client *client, const struct i2c_device_id *id,
 	if (ret)
 		goto out_unregister;
 
+	ret = pmbus_init_debugfs(client, data);
+	if (ret)
+		dev_warn(dev, "Failed to register debugfs\n");
+
 	return 0;
 
 out_unregister:
@@ -1931,12 +2159,32 @@ EXPORT_SYMBOL_GPL(pmbus_do_probe);
 int pmbus_do_remove(struct i2c_client *client)
 {
 	struct pmbus_data *data = i2c_get_clientdata(client);
+
+	debugfs_remove_recursive(data->debugfs);
+
 	hwmon_device_unregister(data->hwmon_dev);
 	kfree(data->group.attrs);
 	return 0;
 }
 EXPORT_SYMBOL_GPL(pmbus_do_remove);
 
+static int __init pmbus_core_init(void)
+{
+	pmbus_debugfs_dir = debugfs_create_dir("pmbus", NULL);
+	if (IS_ERR(pmbus_debugfs_dir))
+		pmbus_debugfs_dir = NULL;
+
+	return 0;
+}
+
+static void __exit pmbus_core_exit(void)
+{
+	debugfs_remove_recursive(pmbus_debugfs_dir);
+}
+
+module_init(pmbus_core_init);
+module_exit(pmbus_core_exit);
+
 MODULE_AUTHOR("Guenter Roeck");
 MODULE_DESCRIPTION("PMBus core driver");
 MODULE_LICENSE("GPL");
diff --git a/drivers/hwmon/pmbus/tps53679.c b/drivers/hwmon/pmbus/tps53679.c
new file mode 100644
index 0000000..85b515c
--- /dev/null
+++ b/drivers/hwmon/pmbus/tps53679.c
@@ -0,0 +1,113 @@
+/*
+ * Hardware monitoring driver for Texas Instruments TPS53679
+ *
+ * Copyright (c) 2017 Mellanox Technologies. All rights reserved.
+ * Copyright (c) 2017 Vadim Pasternak <vadimp@mellanox.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/err.h>
+#include <linux/i2c.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include "pmbus.h"
+
+#define TPS53679_PROT_VR12_5MV		0x01 /* VR12.0 mode, 5-mV DAC */
+#define TPS53679_PROT_VR12_5_10MV	0x02 /* VR12.5 mode, 10-mV DAC */
+#define TPS53679_PROT_VR13_10MV		0x04 /* VR13.0 mode, 10-mV DAC */
+#define TPS53679_PROT_IMVP8_5MV		0x05 /* IMVP8 mode, 5-mV DAC */
+#define TPS53679_PROT_VR13_5MV		0x07 /* VR13.0 mode, 5-mV DAC */
+#define TPS53679_PAGE_NUM		2
+
+static int tps53679_identify(struct i2c_client *client,
+			     struct pmbus_driver_info *info)
+{
+	u8 vout_params;
+	int ret;
+
+	/* Read the register with VOUT scaling value. */
+	ret = pmbus_read_byte_data(client, 0, PMBUS_VOUT_MODE);
+	if (ret < 0)
+		return ret;
+
+	vout_params = ret & GENMASK(4, 0);
+
+	switch (vout_params) {
+	case TPS53679_PROT_VR13_10MV:
+	case TPS53679_PROT_VR12_5_10MV:
+		info->vrm_version = vr13;
+		break;
+	case TPS53679_PROT_VR13_5MV:
+	case TPS53679_PROT_VR12_5MV:
+	case TPS53679_PROT_IMVP8_5MV:
+		info->vrm_version = vr12;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static struct pmbus_driver_info tps53679_info = {
+	.pages = TPS53679_PAGE_NUM,
+	.format[PSC_VOLTAGE_IN] = linear,
+	.format[PSC_VOLTAGE_OUT] = vid,
+	.format[PSC_TEMPERATURE] = linear,
+	.format[PSC_CURRENT_OUT] = linear,
+	.format[PSC_POWER] = linear,
+	.func[0] = PMBUS_HAVE_VIN | PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT |
+		PMBUS_HAVE_IOUT | PMBUS_HAVE_STATUS_IOUT |
+		PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP |
+		PMBUS_HAVE_POUT,
+	.func[1] = PMBUS_HAVE_VIN | PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT |
+		PMBUS_HAVE_IOUT | PMBUS_HAVE_STATUS_IOUT |
+		PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP |
+		PMBUS_HAVE_POUT,
+	.identify = tps53679_identify,
+};
+
+static int tps53679_probe(struct i2c_client *client,
+			  const struct i2c_device_id *id)
+{
+	return pmbus_do_probe(client, id, &tps53679_info);
+}
+
+static const struct i2c_device_id tps53679_id[] = {
+	{"tps53679", 0},
+	{}
+};
+
+MODULE_DEVICE_TABLE(i2c, tps53679_id);
+
+static const struct of_device_id tps53679_of_match[] = {
+	{.compatible = "ti,tps53679"},
+	{}
+};
+MODULE_DEVICE_TABLE(of, tps53679_of_match);
+
+static struct i2c_driver tps53679_driver = {
+	.driver = {
+		.name = "tps53679",
+		.of_match_table = of_match_ptr(tps53679_of_match),
+	},
+	.probe = tps53679_probe,
+	.remove = pmbus_do_remove,
+	.id_table = tps53679_id,
+};
+
+module_i2c_driver(tps53679_driver);
+
+MODULE_AUTHOR("Vadim Pasternak <vadimp@mellanox.com>");
+MODULE_DESCRIPTION("PMBus driver for Texas Instruments TPS53679");
+MODULE_LICENSE("GPL");
diff --git a/drivers/hwmon/scpi-hwmon.c b/drivers/hwmon/scpi-hwmon.c
index a586480..7e49da5 100644
--- a/drivers/hwmon/scpi-hwmon.c
+++ b/drivers/hwmon/scpi-hwmon.c
@@ -120,7 +120,7 @@ scpi_show_label(struct device *dev, struct device_attribute *attr, char *buf)
 	return sprintf(buf, "%s\n", sensor->info.name);
 }
 
-static struct thermal_zone_of_device_ops scpi_sensor_ops = {
+static const struct thermal_zone_of_device_ops scpi_sensor_ops = {
 	.get_temp = scpi_read_temp,
 };
 
diff --git a/drivers/hwmon/stts751.c b/drivers/hwmon/stts751.c
index d56251d6..3f940fb 100644
--- a/drivers/hwmon/stts751.c
+++ b/drivers/hwmon/stts751.c
@@ -718,6 +718,10 @@ static int stts751_read_chip_config(struct stts751_priv *priv)
 	ret = i2c_smbus_read_byte_data(priv->client, STTS751_REG_RATE);
 	if (ret < 0)
 		return ret;
+	if (ret >= ARRAY_SIZE(stts751_intervals)) {
+		dev_err(priv->dev, "Unrecognized conversion rate 0x%x\n", ret);
+		return -ENODEV;
+	}
 	priv->interval = ret;
 
 	ret = stts751_read_reg16(priv, &priv->event_max,
diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
index 57248bc..2b98a17 100644
--- a/drivers/i2c/busses/i2c-designware-platdrv.c
+++ b/drivers/i2c/busses/i2c-designware-platdrv.c
@@ -256,7 +256,8 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
 	struct dw_i2c_dev *dev;
 	u32 acpi_speed, ht = 0;
 	struct resource *mem;
-	int irq, ret;
+	int i, irq, ret;
+	const int supported_speeds[] = { 0, 100000, 400000, 1000000, 3400000 };
 
 	irq = platform_get_irq(pdev, 0);
 	if (irq < 0)
@@ -297,9 +298,16 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
 	}
 
 	acpi_speed = i2c_acpi_find_bus_speed(&pdev->dev);
-	/* Some broken DSTDs use 1MiHz instead of 1MHz */
-	if (acpi_speed == 1048576)
-		acpi_speed = 1000000;
+	/*
+	 * Some DSDTs use a non-standard speed; round down to the nearest
+	 * standard speed.
+	 */
+	for (i = 1; i < ARRAY_SIZE(supported_speeds); i++) {
+		if (acpi_speed < supported_speeds[i])
+			break;
+	}
+	acpi_speed = supported_speeds[i - 1];
+
 	/*
 	 * Find bus speed from the "clock-frequency" device property, ACPI
 	 * or by using fast mode if neither is set.
diff --git a/drivers/i2c/busses/i2c-ismt.c b/drivers/i2c/busses/i2c-ismt.c
index e98e44e..22ffcb7 100644
--- a/drivers/i2c/busses/i2c-ismt.c
+++ b/drivers/i2c/busses/i2c-ismt.c
@@ -341,8 +341,10 @@ static int ismt_process_desc(const struct ismt_desc *desc,
 			break;
 		case I2C_SMBUS_BLOCK_DATA:
 		case I2C_SMBUS_I2C_BLOCK_DATA:
-			memcpy(&data->block[1], dma_buffer, desc->rxbytes);
-			data->block[0] = desc->rxbytes;
+			if (desc->rxbytes != dma_buffer[0] + 1)
+				return -EMSGSIZE;
+
+			memcpy(data->block, dma_buffer, desc->rxbytes);
 			break;
 		}
 		return 0;
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index 234fe01..3726205 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -34,6 +34,15 @@
 	  libibverbs, libibcm and a hardware driver library from
 	  <http://www.openfabrics.org/git/>.
 
+config INFINIBAND_EXP_USER_ACCESS
+	bool "Allow experimental support for Infiniband ABI"
+	depends on INFINIBAND_USER_ACCESS
+	---help---
+	  IOCTL based ABI support for Infiniband. This allows userspace
+	  to invoke the experimental IOCTL based ABI.
+	  These commands are parsed via a per-device parsing tree and
+	  enable per-device features.
+
 config INFINIBAND_USER_MEM
 	bool
 	depends on INFINIBAND_USER_ACCESS != n
diff --git a/drivers/infiniband/core/Makefile b/drivers/infiniband/core/Makefile
index e3cdaff..b4df164 100644
--- a/drivers/infiniband/core/Makefile
+++ b/drivers/infiniband/core/Makefile
@@ -11,7 +11,8 @@
 				device.o fmr_pool.o cache.o netlink.o \
 				roce_gid_mgmt.o mr_pool.o addr.o sa_query.o \
 				multicast.o mad.o smi.o agent.o mad_rmpp.o \
-				security.o
+				security.o nldev.o
+
 ib_core-$(CONFIG_INFINIBAND_USER_MEM) += umem.o
 ib_core-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += umem_odp.o umem_rbtree.o
 ib_core-$(CONFIG_CGROUP_RDMA) += cgroup.o
@@ -31,4 +32,5 @@
 ib_ucm-y :=			ucm.o
 
 ib_uverbs-y :=			uverbs_main.o uverbs_cmd.o uverbs_marshall.o \
-				rdma_core.o uverbs_std_types.o
+				rdma_core.o uverbs_std_types.o uverbs_ioctl.o \
+				uverbs_ioctl_merge.o
diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 437522c..12523f6 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -130,13 +130,11 @@ static void ib_nl_process_good_ip_rsep(const struct nlmsghdr *nlh)
 }
 
 int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
-			     struct netlink_callback *cb)
+			     struct nlmsghdr *nlh,
+			     struct netlink_ext_ack *extack)
 {
-	const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
-
 	if ((nlh->nlmsg_flags & NLM_F_REQUEST) ||
-	    !(NETLINK_CB(skb).sk) ||
-	    !netlink_capable(skb, CAP_NET_ADMIN))
+	    !(NETLINK_CB(skb).sk))
 		return -EPERM;
 
 	if (ib_nl_is_good_ip_resp(nlh))
@@ -186,7 +184,7 @@ static int ib_nl_ip_send_msg(struct rdma_dev_addr *dev_addr,
 
 	/* Repair the nlmsg header length */
 	nlmsg_end(skb, nlh);
-	ibnl_multicast(skb, nlh, RDMA_NL_GROUP_LS, GFP_KERNEL);
+	rdma_nl_multicast(skb, RDMA_NL_GROUP_LS, GFP_KERNEL);
 
 	/* Make the request retry, so when we get the response from userspace
 	 * we will have something.
@@ -326,7 +324,7 @@ static void queue_req(struct addr_req *req)
 static int ib_nl_fetch_ha(struct dst_entry *dst, struct rdma_dev_addr *dev_addr,
 			  const void *daddr, u32 seq, u16 family)
 {
-	if (ibnl_chk_listeners(RDMA_NL_GROUP_LS))
+	if (rdma_nl_chk_listeners(RDMA_NL_GROUP_LS))
 		return -EADDRNOTAVAIL;
 
 	/* We fill in what we can, the response will fill the rest */
diff --git a/drivers/infiniband/core/cache.c b/drivers/infiniband/core/cache.c
index efc9430..7751563 100644
--- a/drivers/infiniband/core/cache.c
+++ b/drivers/infiniband/core/cache.c
@@ -1199,30 +1199,23 @@ int ib_cache_setup_one(struct ib_device *device)
 	device->cache.ports =
 		kzalloc(sizeof(*device->cache.ports) *
 			(rdma_end_port(device) - rdma_start_port(device) + 1), GFP_KERNEL);
-	if (!device->cache.ports) {
-		err = -ENOMEM;
-		goto out;
-	}
+	if (!device->cache.ports)
+		return -ENOMEM;
 
 	err = gid_table_setup_one(device);
-	if (err)
-		goto out;
+	if (err) {
+		kfree(device->cache.ports);
+		device->cache.ports = NULL;
+		return err;
+	}
 
 	for (p = 0; p <= rdma_end_port(device) - rdma_start_port(device); ++p)
 		ib_cache_update(device, p + rdma_start_port(device), true);
 
 	INIT_IB_EVENT_HANDLER(&device->cache.event_handler,
 			      device, ib_cache_event);
-	err = ib_register_event_handler(&device->cache.event_handler);
-	if (err)
-		goto err;
-
+	ib_register_event_handler(&device->cache.event_handler);
 	return 0;
-
-err:
-	gid_table_cleanup_one(device);
-out:
-	return err;
 }
 
 void ib_cache_release_one(struct ib_device *device)
diff --git a/drivers/infiniband/core/cm.c b/drivers/infiniband/core/cm.c
index 2b4d613a..4c4b465 100644
--- a/drivers/infiniband/core/cm.c
+++ b/drivers/infiniband/core/cm.c
@@ -373,11 +373,19 @@ static int cm_alloc_msg(struct cm_id_private *cm_id_priv,
 	return ret;
 }
 
-static int cm_alloc_response_msg(struct cm_port *port,
-				 struct ib_mad_recv_wc *mad_recv_wc,
-				 struct ib_mad_send_buf **msg)
+static struct ib_mad_send_buf *cm_alloc_response_msg_no_ah(struct cm_port *port,
+							   struct ib_mad_recv_wc *mad_recv_wc)
 {
-	struct ib_mad_send_buf *m;
+	return ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
+				  0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
+				  GFP_ATOMIC,
+				  IB_MGMT_BASE_VERSION);
+}
+
+static int cm_create_response_msg_ah(struct cm_port *port,
+				     struct ib_mad_recv_wc *mad_recv_wc,
+				     struct ib_mad_send_buf *msg)
+{
 	struct ib_ah *ah;
 
 	ah = ib_create_ah_from_wc(port->mad_agent->qp->pd, mad_recv_wc->wc,
@@ -385,27 +393,40 @@ static int cm_alloc_response_msg(struct cm_port *port,
 	if (IS_ERR(ah))
 		return PTR_ERR(ah);
 
-	m = ib_create_send_mad(port->mad_agent, 1, mad_recv_wc->wc->pkey_index,
-			       0, IB_MGMT_MAD_HDR, IB_MGMT_MAD_DATA,
-			       GFP_ATOMIC,
-			       IB_MGMT_BASE_VERSION);
-	if (IS_ERR(m)) {
-		rdma_destroy_ah(ah);
-		return PTR_ERR(m);
-	}
-	m->ah = ah;
-	*msg = m;
+	msg->ah = ah;
 	return 0;
 }
 
 static void cm_free_msg(struct ib_mad_send_buf *msg)
 {
-	rdma_destroy_ah(msg->ah);
+	if (msg->ah)
+		rdma_destroy_ah(msg->ah);
 	if (msg->context[0])
 		cm_deref_id(msg->context[0]);
 	ib_free_send_mad(msg);
 }
 
+static int cm_alloc_response_msg(struct cm_port *port,
+				 struct ib_mad_recv_wc *mad_recv_wc,
+				 struct ib_mad_send_buf **msg)
+{
+	struct ib_mad_send_buf *m;
+	int ret;
+
+	m = cm_alloc_response_msg_no_ah(port, mad_recv_wc);
+	if (IS_ERR(m))
+		return PTR_ERR(m);
+
+	ret = cm_create_response_msg_ah(port, mad_recv_wc, m);
+	if (ret) {
+		cm_free_msg(m);
+		return ret;
+	}
+
+	*msg = m;
+	return 0;
+}
+
 static void * cm_copy_private_data(const void *private_data,
 				   u8 private_data_len)
 {
@@ -1175,6 +1196,11 @@ static void cm_format_req(struct cm_req_msg *req_msg,
 {
 	struct sa_path_rec *pri_path = param->primary_path;
 	struct sa_path_rec *alt_path = param->alternate_path;
+	bool pri_ext = false;
+
+	if (pri_path->rec_type == SA_PATH_REC_TYPE_OPA)
+		pri_ext = opa_is_extended_lid(pri_path->opa.dlid,
+					      pri_path->opa.slid);
 
 	cm_format_mad_hdr(&req_msg->hdr, CM_REQ_ATTR_ID,
 			  cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_REQ));
@@ -1202,18 +1228,24 @@ static void cm_format_req(struct cm_req_msg *req_msg,
 		cm_req_set_srq(req_msg, param->srq);
 	}
 
+	req_msg->primary_local_gid = pri_path->sgid;
+	req_msg->primary_remote_gid = pri_path->dgid;
+	if (pri_ext) {
+		req_msg->primary_local_gid.global.interface_id
+			= OPA_MAKE_ID(be32_to_cpu(pri_path->opa.slid));
+		req_msg->primary_remote_gid.global.interface_id
+			= OPA_MAKE_ID(be32_to_cpu(pri_path->opa.dlid));
+	}
 	if (pri_path->hop_limit <= 1) {
-		req_msg->primary_local_lid =
+		req_msg->primary_local_lid = pri_ext ? 0 :
 			htons(ntohl(sa_path_get_slid(pri_path)));
-		req_msg->primary_remote_lid =
+		req_msg->primary_remote_lid = pri_ext ? 0 :
 			htons(ntohl(sa_path_get_dlid(pri_path)));
 	} else {
 		/* Work-around until there's a way to obtain remote LID info */
 		req_msg->primary_local_lid = IB_LID_PERMISSIVE;
 		req_msg->primary_remote_lid = IB_LID_PERMISSIVE;
 	}
-	req_msg->primary_local_gid = pri_path->sgid;
-	req_msg->primary_remote_gid = pri_path->dgid;
 	cm_req_set_primary_flow_label(req_msg, pri_path->flow_label);
 	cm_req_set_primary_packet_rate(req_msg, pri_path->rate);
 	req_msg->primary_traffic_class = pri_path->traffic_class;
@@ -1225,17 +1257,29 @@ static void cm_format_req(struct cm_req_msg *req_msg,
 			       pri_path->packet_life_time));
 
 	if (alt_path) {
+		bool alt_ext = false;
+
+		if (alt_path->rec_type == SA_PATH_REC_TYPE_OPA)
+			alt_ext = opa_is_extended_lid(alt_path->opa.dlid,
+						      alt_path->opa.slid);
+
+		req_msg->alt_local_gid = alt_path->sgid;
+		req_msg->alt_remote_gid = alt_path->dgid;
+		if (alt_ext) {
+			req_msg->alt_local_gid.global.interface_id
+				= OPA_MAKE_ID(be32_to_cpu(alt_path->opa.slid));
+			req_msg->alt_remote_gid.global.interface_id
+				= OPA_MAKE_ID(be32_to_cpu(alt_path->opa.dlid));
+		}
 		if (alt_path->hop_limit <= 1) {
-			req_msg->alt_local_lid =
+			req_msg->alt_local_lid = alt_ext ? 0 :
 				htons(ntohl(sa_path_get_slid(alt_path)));
-			req_msg->alt_remote_lid =
+			req_msg->alt_remote_lid = alt_ext ? 0 :
 				htons(ntohl(sa_path_get_dlid(alt_path)));
 		} else {
 			req_msg->alt_local_lid = IB_LID_PERMISSIVE;
 			req_msg->alt_remote_lid = IB_LID_PERMISSIVE;
 		}
-		req_msg->alt_local_gid = alt_path->sgid;
-		req_msg->alt_remote_gid = alt_path->dgid;
 		cm_req_set_alt_flow_label(req_msg,
 					  alt_path->flow_label);
 		cm_req_set_alt_packet_rate(req_msg, alt_path->rate);
@@ -1405,16 +1449,63 @@ static inline int cm_is_active_peer(__be64 local_ca_guid, __be64 remote_ca_guid,
 		 (be32_to_cpu(local_qpn) > be32_to_cpu(remote_qpn))));
 }
 
+static bool cm_req_has_alt_path(struct cm_req_msg *req_msg)
+{
+	return ((req_msg->alt_local_lid) ||
+		(ib_is_opa_gid(&req_msg->alt_local_gid)));
+}
+
+static void cm_path_set_rec_type(struct ib_device *ib_device, u8 port_num,
+				 struct sa_path_rec *path, union ib_gid *gid)
+{
+	if (ib_is_opa_gid(gid) && rdma_cap_opa_ah(ib_device, port_num))
+		path->rec_type = SA_PATH_REC_TYPE_OPA;
+	else
+		path->rec_type = SA_PATH_REC_TYPE_IB;
+}
+
+static void cm_format_path_lid_from_req(struct cm_req_msg *req_msg,
+					struct sa_path_rec *primary_path,
+					struct sa_path_rec *alt_path)
+{
+	u32 lid;
+
+	if (primary_path->rec_type != SA_PATH_REC_TYPE_OPA) {
+		sa_path_set_dlid(primary_path,
+				 htonl(ntohs(req_msg->primary_local_lid)));
+		sa_path_set_slid(primary_path,
+				 htonl(ntohs(req_msg->primary_remote_lid)));
+	} else {
+		lid = opa_get_lid_from_gid(&req_msg->primary_local_gid);
+		sa_path_set_dlid(primary_path, cpu_to_be32(lid));
+
+		lid = opa_get_lid_from_gid(&req_msg->primary_remote_gid);
+		sa_path_set_slid(primary_path, cpu_to_be32(lid));
+	}
+
+	if (!cm_req_has_alt_path(req_msg))
+		return;
+
+	if (alt_path->rec_type != SA_PATH_REC_TYPE_OPA) {
+		sa_path_set_dlid(alt_path,
+				 htonl(ntohs(req_msg->alt_local_lid)));
+		sa_path_set_slid(alt_path,
+				 htonl(ntohs(req_msg->alt_remote_lid)));
+	} else {
+		lid = opa_get_lid_from_gid(&req_msg->alt_local_gid);
+		sa_path_set_dlid(alt_path, cpu_to_be32(lid));
+
+		lid = opa_get_lid_from_gid(&req_msg->alt_remote_gid);
+		sa_path_set_slid(alt_path, cpu_to_be32(lid));
+	}
+}
+
 static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
 				     struct sa_path_rec *primary_path,
 				     struct sa_path_rec *alt_path)
 {
 	primary_path->dgid = req_msg->primary_local_gid;
 	primary_path->sgid = req_msg->primary_remote_gid;
-	sa_path_set_dlid(primary_path,
-			 htonl(ntohs(req_msg->primary_local_lid)));
-	sa_path_set_slid(primary_path,
-			 htonl(ntohs(req_msg->primary_remote_lid)));
 	primary_path->flow_label = cm_req_get_primary_flow_label(req_msg);
 	primary_path->hop_limit = req_msg->primary_hop_limit;
 	primary_path->traffic_class = req_msg->primary_traffic_class;
@@ -1431,13 +1522,9 @@ static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
 	primary_path->packet_life_time -= (primary_path->packet_life_time > 0);
 	primary_path->service_id = req_msg->service_id;
 
-	if (req_msg->alt_local_lid) {
+	if (cm_req_has_alt_path(req_msg)) {
 		alt_path->dgid = req_msg->alt_local_gid;
 		alt_path->sgid = req_msg->alt_remote_gid;
-		sa_path_set_dlid(alt_path,
-				 htonl(ntohs(req_msg->alt_local_lid)));
-		sa_path_set_slid(alt_path,
-				 htonl(ntohs(req_msg->alt_remote_lid)));
 		alt_path->flow_label = cm_req_get_alt_flow_label(req_msg);
 		alt_path->hop_limit = req_msg->alt_hop_limit;
 		alt_path->traffic_class = req_msg->alt_traffic_class;
@@ -1454,6 +1541,7 @@ static void cm_format_paths_from_req(struct cm_req_msg *req_msg,
 		alt_path->packet_life_time -= (alt_path->packet_life_time > 0);
 		alt_path->service_id = req_msg->service_id;
 	}
+	cm_format_path_lid_from_req(req_msg, primary_path, alt_path);
 }
 
 static u16 cm_get_bth_pkey(struct cm_work *work)
@@ -1703,7 +1791,7 @@ static void cm_process_routed_req(struct cm_req_msg *req_msg, struct ib_wc *wc)
 {
 	if (!cm_req_get_primary_subnet_local(req_msg)) {
 		if (req_msg->primary_local_lid == IB_LID_PERMISSIVE) {
-			req_msg->primary_local_lid = cpu_to_be16(wc->slid);
+			req_msg->primary_local_lid = ib_lid_be16(wc->slid);
 			cm_req_set_primary_sl(req_msg, wc->sl);
 		}
 
@@ -1713,7 +1801,7 @@ static void cm_process_routed_req(struct cm_req_msg *req_msg, struct ib_wc *wc)
 
 	if (!cm_req_get_alt_subnet_local(req_msg)) {
 		if (req_msg->alt_local_lid == IB_LID_PERMISSIVE) {
-			req_msg->alt_local_lid = cpu_to_be16(wc->slid);
+			req_msg->alt_local_lid = ib_lid_be16(wc->slid);
 			cm_req_set_alt_sl(req_msg, wc->sl);
 		}
 
@@ -1784,9 +1872,12 @@ static int cm_req_handler(struct cm_work *work)
 					 dev_net(gid_attr.ndev));
 			dev_put(gid_attr.ndev);
 		} else {
-			work->path[0].rec_type = SA_PATH_REC_TYPE_IB;
+			cm_path_set_rec_type(work->port->cm_dev->ib_device,
+					     work->port->port_num,
+					     &work->path[0],
+					     &req_msg->primary_local_gid);
 		}
-		if (req_msg->alt_local_lid)
+		if (cm_req_has_alt_path(req_msg))
 			work->path[1].rec_type = work->path[0].rec_type;
 		cm_format_paths_from_req(req_msg, &work->path[0],
 					 &work->path[1]);
@@ -1811,16 +1902,19 @@ static int cm_req_handler(struct cm_work *work)
 					 dev_net(gid_attr.ndev));
 			dev_put(gid_attr.ndev);
 		} else {
-			work->path[0].rec_type = SA_PATH_REC_TYPE_IB;
+			cm_path_set_rec_type(work->port->cm_dev->ib_device,
+					     work->port->port_num,
+					     &work->path[0],
+					     &req_msg->primary_local_gid);
 		}
-		if (req_msg->alt_local_lid)
+		if (cm_req_has_alt_path(req_msg))
 			work->path[1].rec_type = work->path[0].rec_type;
 		ib_send_cm_rej(cm_id, IB_CM_REJ_INVALID_GID,
 			       &work->path[0].sgid, sizeof work->path[0].sgid,
 			       NULL, 0);
 		goto rejected;
 	}
-	if (req_msg->alt_local_lid) {
+	if (cm_req_has_alt_path(req_msg)) {
 		ret = cm_init_av_by_path(&work->path[1], &cm_id_priv->alt_av,
 					 cm_id_priv);
 		if (ret) {
@@ -2424,7 +2518,8 @@ static int cm_dreq_handler(struct cm_work *work)
 	case IB_CM_TIMEWAIT:
 		atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
 				counter[CM_DREQ_COUNTER]);
-		if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
+		msg = cm_alloc_response_msg_no_ah(work->port, work->mad_recv_wc);
+		if (IS_ERR(msg))
 			goto unlock;
 
 		cm_format_drep((struct cm_drep_msg *) msg->mad, cm_id_priv,
@@ -2432,7 +2527,8 @@ static int cm_dreq_handler(struct cm_work *work)
 			       cm_id_priv->private_data_len);
 		spin_unlock_irq(&cm_id_priv->lock);
 
-		if (ib_post_send_mad(msg, NULL))
+		if (cm_create_response_msg_ah(work->port, work->mad_recv_wc, msg) ||
+		    ib_post_send_mad(msg, NULL))
 			cm_free_msg(msg);
 		goto deref;
 	case IB_CM_DREQ_RCVD:
@@ -2843,6 +2939,11 @@ static void cm_format_lap(struct cm_lap_msg *lap_msg,
 			  const void *private_data,
 			  u8 private_data_len)
 {
+	bool alt_ext = false;
+
+	if (alternate_path->rec_type == SA_PATH_REC_TYPE_OPA)
+		alt_ext = opa_is_extended_lid(alternate_path->opa.dlid,
+					      alternate_path->opa.slid);
 	cm_format_mad_hdr(&lap_msg->hdr, CM_LAP_ATTR_ID,
 			  cm_form_tid(cm_id_priv, CM_MSG_SEQUENCE_LAP));
 	lap_msg->local_comm_id = cm_id_priv->id.local_id;
@@ -2856,6 +2957,12 @@ static void cm_format_lap(struct cm_lap_msg *lap_msg,
 		htons(ntohl(sa_path_get_dlid(alternate_path)));
 	lap_msg->alt_local_gid = alternate_path->sgid;
 	lap_msg->alt_remote_gid = alternate_path->dgid;
+	if (alt_ext) {
+		lap_msg->alt_local_gid.global.interface_id
+			= OPA_MAKE_ID(be32_to_cpu(alternate_path->opa.slid));
+		lap_msg->alt_remote_gid.global.interface_id
+			= OPA_MAKE_ID(be32_to_cpu(alternate_path->opa.dlid));
+	}
 	cm_lap_set_flow_label(lap_msg, alternate_path->flow_label);
 	cm_lap_set_traffic_class(lap_msg, alternate_path->traffic_class);
 	lap_msg->alt_hop_limit = alternate_path->hop_limit;
@@ -2924,16 +3031,29 @@ out:	spin_unlock_irqrestore(&cm_id_priv->lock, flags);
 }
 EXPORT_SYMBOL(ib_send_cm_lap);
 
+static void cm_format_path_lid_from_lap(struct cm_lap_msg *lap_msg,
+					struct sa_path_rec *path)
+{
+	u32 lid;
+
+	if (path->rec_type != SA_PATH_REC_TYPE_OPA) {
+		sa_path_set_dlid(path, htonl(ntohs(lap_msg->alt_local_lid)));
+		sa_path_set_slid(path, htonl(ntohs(lap_msg->alt_remote_lid)));
+	} else {
+		lid = opa_get_lid_from_gid(&lap_msg->alt_local_gid);
+		sa_path_set_dlid(path, cpu_to_be32(lid));
+
+		lid = opa_get_lid_from_gid(&lap_msg->alt_remote_gid);
+		sa_path_set_slid(path, cpu_to_be32(lid));
+	}
+}
+
 static void cm_format_path_from_lap(struct cm_id_private *cm_id_priv,
 				    struct sa_path_rec *path,
 				    struct cm_lap_msg *lap_msg)
 {
-	memset(path, 0, sizeof *path);
-	path->rec_type = SA_PATH_REC_TYPE_IB;
 	path->dgid = lap_msg->alt_local_gid;
 	path->sgid = lap_msg->alt_remote_gid;
-	sa_path_set_dlid(path, htonl(ntohs(lap_msg->alt_local_lid)));
-	sa_path_set_slid(path, htonl(ntohs(lap_msg->alt_remote_lid)));
 	path->flow_label = cm_lap_get_flow_label(lap_msg);
 	path->hop_limit = lap_msg->alt_hop_limit;
 	path->traffic_class = cm_lap_get_traffic_class(lap_msg);
@@ -2947,6 +3067,7 @@ static void cm_format_path_from_lap(struct cm_id_private *cm_id_priv,
 	path->packet_life_time_selector = IB_SA_EQ;
 	path->packet_life_time = cm_lap_get_local_ack_timeout(lap_msg);
 	path->packet_life_time -= (path->packet_life_time > 0);
+	cm_format_path_lid_from_lap(lap_msg, path);
 }
 
 static int cm_lap_handler(struct cm_work *work)
@@ -2965,6 +3086,11 @@ static int cm_lap_handler(struct cm_work *work)
 		return -EINVAL;
 
 	param = &work->cm_event.param.lap_rcvd;
+	memset(&work->path[0], 0, sizeof(work->path[1]));
+	cm_path_set_rec_type(work->port->cm_dev->ib_device,
+			     work->port->port_num,
+			     &work->path[0],
+			     &lap_msg->alt_local_gid);
 	param->alternate_path = &work->path[0];
 	cm_format_path_from_lap(cm_id_priv, param->alternate_path, lap_msg);
 	work->cm_event.private_data = &lap_msg->private_data;
@@ -2980,7 +3106,8 @@ static int cm_lap_handler(struct cm_work *work)
 	case IB_CM_MRA_LAP_SENT:
 		atomic_long_inc(&work->port->counter_group[CM_RECV_DUPLICATES].
 				counter[CM_LAP_COUNTER]);
-		if (cm_alloc_response_msg(work->port, work->mad_recv_wc, &msg))
+		msg = cm_alloc_response_msg_no_ah(work->port, work->mad_recv_wc);
+		if (IS_ERR(msg))
 			goto unlock;
 
 		cm_format_mra((struct cm_mra_msg *) msg->mad, cm_id_priv,
@@ -2990,7 +3117,8 @@ static int cm_lap_handler(struct cm_work *work)
 			      cm_id_priv->private_data_len);
 		spin_unlock_irq(&cm_id_priv->lock);
 
-		if (ib_post_send_mad(msg, NULL))
+		if (cm_create_response_msg_ah(work->port, work->mad_recv_wc, msg) ||
+		    ib_post_send_mad(msg, NULL))
 			cm_free_msg(msg);
 		goto deref;
 	case IB_CM_LAP_RCVD:
@@ -4201,7 +4329,7 @@ static int __init ib_cm_init(void)
 		goto error1;
 	}
 
-	cm.wq = create_workqueue("ib_cm");
+	cm.wq = alloc_workqueue("ib_cm", 0, 1);
 	if (!cm.wq) {
 		ret = -ENOMEM;
 		goto error2;
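
Note on the cm.c hunks above: for OPA paths with extended LIDs, cm_format_lap() stores the 32-bit LID in the GID's interface_id, and cm_format_path_lid_from_lap() reads it back. The round trip can be sketched roughly as below. This is a host-byte-order illustration only. The names sketch_make_id() and sketch_lid_from_id() are hypothetical, and the 24-bit prefix value and the 40-bit shift are assumptions, not taken from this patch. The kernel helpers (OPA_MAKE_ID, opa_get_lid_from_gid) additionally handle big-endian conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 24-bit prefix placed in the top bits of the interface ID. */
#define SKETCH_OUI 0x00066AULL

/* Pack a 32-bit LID into the low bits of a 64-bit interface ID. */
static uint64_t sketch_make_id(uint32_t lid)
{
	/* The prefix must fit in 24 bits so the shift cannot overflow. */
	assert((SKETCH_OUI >> 24) == 0);
	return (SKETCH_OUI << 40) | (uint64_t)lid;
}

/* Recover the LID: keep only the low 32 bits of the interface ID. */
static uint32_t sketch_lid_from_id(uint64_t interface_id)
{
	return (uint32_t)(interface_id & 0xFFFFFFFFULL);
}
```

Packing and then unpacking returns the original LID for any 32-bit value, since the prefix occupies only bits 40 and above.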
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 0eb3932..852c8fe 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -72,6 +72,7 @@ MODULE_LICENSE("Dual BSD/GPL");
 #define CMA_MAX_CM_RETRIES 15
 #define CMA_CM_MRA_SETTING (IB_CM_MRA_FLAG_DELAY | 24)
 #define CMA_IBOE_PACKET_LIFETIME 18
+#define CMA_PREFERRED_ROCE_GID_TYPE IB_GID_TYPE_ROCE_UDP_ENCAP
 
 static const char * const cma_events[] = {
 	[RDMA_CM_EVENT_ADDR_RESOLVED]	 = "address resolved",
@@ -3998,7 +3999,8 @@ static void iboe_mcast_work_handler(struct work_struct *work)
 	kfree(mw);
 }
 
-static void cma_iboe_set_mgid(struct sockaddr *addr, union ib_gid *mgid)
+static void cma_iboe_set_mgid(struct sockaddr *addr, union ib_gid *mgid,
+			      enum ib_gid_type gid_type)
 {
 	struct sockaddr_in *sin = (struct sockaddr_in *)addr;
 	struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)addr;
@@ -4008,8 +4010,8 @@ static void cma_iboe_set_mgid(struct sockaddr *addr, union ib_gid *mgid)
 	} else if (addr->sa_family == AF_INET6) {
 		memcpy(mgid, &sin6->sin6_addr, sizeof *mgid);
 	} else {
-		mgid->raw[0] = 0xff;
-		mgid->raw[1] = 0x0e;
+		mgid->raw[0] = (gid_type == IB_GID_TYPE_IB) ? 0xff : 0;
+		mgid->raw[1] = (gid_type == IB_GID_TYPE_IB) ? 0x0e : 0;
 		mgid->raw[2] = 0;
 		mgid->raw[3] = 0;
 		mgid->raw[4] = 0;
@@ -4050,7 +4052,9 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
 		goto out1;
 	}
 
-	cma_iboe_set_mgid(addr, &mc->multicast.ib->rec.mgid);
+	gid_type = id_priv->cma_dev->default_gid_type[id_priv->id.port_num -
+		   rdma_start_port(id_priv->cma_dev->device)];
+	cma_iboe_set_mgid(addr, &mc->multicast.ib->rec.mgid, gid_type);
 
 	mc->multicast.ib->rec.pkey = cpu_to_be16(0xffff);
 	if (id_priv->id.ps == RDMA_PS_UDP)
@@ -4066,8 +4070,6 @@ static int cma_iboe_join_multicast(struct rdma_id_private *id_priv,
 	mc->multicast.ib->rec.hop_limit = 1;
 	mc->multicast.ib->rec.mtu = iboe_get_mtu(ndev->mtu);
 
-	gid_type = id_priv->cma_dev->default_gid_type[id_priv->id.port_num -
-		   rdma_start_port(id_priv->cma_dev->device)];
 	if (addr->sa_family == AF_INET) {
 		if (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) {
 			mc->multicast.ib->rec.hop_limit = IPV6_DEFAULT_HOPLIMIT;
@@ -4280,8 +4282,12 @@ static void cma_add_one(struct ib_device *device)
 	for (i = rdma_start_port(device); i <= rdma_end_port(device); i++) {
 		supported_gids = roce_gid_type_mask_support(device, i);
 		WARN_ON(!supported_gids);
-		cma_dev->default_gid_type[i - rdma_start_port(device)] =
-			find_first_bit(&supported_gids, BITS_PER_LONG);
+		if (supported_gids & (1 << CMA_PREFERRED_ROCE_GID_TYPE))
+			cma_dev->default_gid_type[i - rdma_start_port(device)] =
+				CMA_PREFERRED_ROCE_GID_TYPE;
+		else
+			cma_dev->default_gid_type[i - rdma_start_port(device)] =
+				find_first_bit(&supported_gids, BITS_PER_LONG);
 		cma_dev->default_roce_tos[i - rdma_start_port(device)] = 0;
 	}
 
@@ -4452,9 +4458,8 @@ static int cma_get_id_stats(struct sk_buff *skb, struct netlink_callback *cb)
 	return skb->len;
 }
 
-static const struct ibnl_client_cbs cma_cb_table[] = {
-	[RDMA_NL_RDMA_CM_ID_STATS] = { .dump = cma_get_id_stats,
-				       .module = THIS_MODULE },
+static const struct rdma_nl_cbs cma_cb_table[] = {
+	[RDMA_NL_RDMA_CM_ID_STATS] = { .dump = cma_get_id_stats},
 };
 
 static int cma_init_net(struct net *net)
@@ -4506,9 +4511,7 @@ static int __init cma_init(void)
 	if (ret)
 		goto err;
 
-	if (ibnl_add_client(RDMA_NL_RDMA_CM, ARRAY_SIZE(cma_cb_table),
-			    cma_cb_table))
-		pr_warn("RDMA CMA: failed to add netlink callback\n");
+	rdma_nl_register(RDMA_NL_RDMA_CM, cma_cb_table);
 	cma_configfs_init();
 
 	return 0;
@@ -4525,7 +4528,7 @@ static int __init cma_init(void)
 static void __exit cma_cleanup(void)
 {
 	cma_configfs_exit();
-	ibnl_remove_client(RDMA_NL_RDMA_CM);
+	rdma_nl_unregister(RDMA_NL_RDMA_CM);
 	ib_unregister_client(&cma_client);
 	unregister_netdevice_notifier(&cma_nb);
 	rdma_addr_unregister_client(&addr_client);
@@ -4534,5 +4537,7 @@ static void __exit cma_cleanup(void)
 	destroy_workqueue(cma_wq);
 }
 
+MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_RDMA_CM, 1);
+
 module_init(cma_init);
 module_exit(cma_cleanup);
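
Note on the cma_add_one() hunk above: the default GID type now prefers one type when the port supports it and otherwise falls back to the lowest supported bit. A minimal user-space sketch of that selection follows. The sketch_ names are hypothetical stand-ins for the kernel's enum ib_gid_type, CMA_PREFERRED_ROCE_GID_TYPE and find_first_bit(), and the enum numbering is an assumption.

```c
#include <assert.h>
#include <limits.h>

/* Stand-ins for the kernel's enum ib_gid_type (assumed numbering). */
enum sketch_gid_type {
	SKETCH_GID_TYPE_IB = 0,
	SKETCH_GID_TYPE_ROCE_UDP_ENCAP = 1,
};

#define SKETCH_PREFERRED_GID_TYPE SKETCH_GID_TYPE_ROCE_UDP_ENCAP
#define SKETCH_BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Index of the lowest set bit, or SKETCH_BITS_PER_LONG if mask is 0. */
static unsigned int sketch_first_bit(unsigned long mask)
{
	unsigned int bit;

	for (bit = 0; bit < SKETCH_BITS_PER_LONG; bit++)
		if (mask & (1UL << bit))
			return bit;
	return SKETCH_BITS_PER_LONG;
}

/* Prefer the preferred type if supported, else the first supported one. */
static unsigned int sketch_default_gid_type(unsigned long supported)
{
	/* The preferred type must be a valid bit position for the shift. */
	assert(SKETCH_PREFERRED_GID_TYPE < SKETCH_BITS_PER_LONG);

	if (supported & (1UL << SKETCH_PREFERRED_GID_TYPE))
		return SKETCH_PREFERRED_GID_TYPE;
	return sketch_first_bit(supported);
}
```

With both bits set the preferred type wins, whereas the pre-patch logic would have picked bit 0.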
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index 11ae675..a1d687a 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -38,6 +38,7 @@
 #include <linux/cgroup_rdma.h>
 
 #include <rdma/ib_verbs.h>
+#include <rdma/opa_addr.h>
 #include <rdma/ib_mad.h>
 #include "mad_priv.h"
 
@@ -102,6 +103,14 @@ void ib_enum_all_roce_netdevs(roce_netdev_filter filter,
 			      roce_netdev_callback cb,
 			      void *cookie);
 
+typedef int (*nldev_callback)(struct ib_device *device,
+			      struct sk_buff *skb,
+			      struct netlink_callback *cb,
+			      unsigned int idx);
+
+int ib_enum_all_devs(nldev_callback nldev_cb, struct sk_buff *skb,
+		     struct netlink_callback *cb);
+
 enum ib_cache_gid_default_mode {
 	IB_CACHE_GID_DEFAULT_MODE_SET,
 	IB_CACHE_GID_DEFAULT_MODE_DELETE
@@ -179,8 +188,8 @@ void ib_mad_cleanup(void);
 int ib_sa_init(void);
 void ib_sa_cleanup(void);
 
-int ibnl_init(void);
-void ibnl_cleanup(void);
+int rdma_nl_init(void);
+void rdma_nl_exit(void);
 
 /**
  * Check if there are any listeners to the netlink group
@@ -190,11 +199,14 @@ void ibnl_cleanup(void);
 int ibnl_chk_listeners(unsigned int group);
 
 int ib_nl_handle_resolve_resp(struct sk_buff *skb,
-			      struct netlink_callback *cb);
+			      struct nlmsghdr *nlh,
+			      struct netlink_ext_ack *extack);
 int ib_nl_handle_set_timeout(struct sk_buff *skb,
-			     struct netlink_callback *cb);
+			     struct nlmsghdr *nlh,
+			     struct netlink_ext_ack *extack);
 int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
-			     struct netlink_callback *cb);
+			     struct nlmsghdr *nlh,
+			     struct netlink_ext_ack *extack);
 
 int ib_get_cached_subnet_prefix(struct ib_device *device,
 				u8                port_num,
@@ -301,4 +313,9 @@ static inline int ib_mad_enforce_security(struct ib_mad_agent_private *map,
 	return 0;
 }
 #endif
+
+struct ib_device *__ib_device_get_by_index(u32 ifindex);
+/* RDMA device netlink */
+void nldev_init(void);
+void nldev_exit(void);
 #endif /* _CORE_PRIV_H */
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 221468f..84fc32a 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -134,6 +134,17 @@ static int ib_device_check_mandatory(struct ib_device *device)
 	return 0;
 }
 
+struct ib_device *__ib_device_get_by_index(u32 index)
+{
+	struct ib_device *device;
+
+	list_for_each_entry(device, &device_list, core_list)
+		if (device->index == index)
+			return device;
+
+	return NULL;
+}
+
 static struct ib_device *__ib_device_get_by_name(const char *name)
 {
 	struct ib_device *device;
@@ -145,7 +156,6 @@ static struct ib_device *__ib_device_get_by_name(const char *name)
 	return NULL;
 }
 
-
 static int alloc_name(char *name)
 {
 	unsigned long *inuse;
@@ -326,10 +336,10 @@ static int read_port_immutable(struct ib_device *device)
 	return 0;
 }
 
-void ib_get_device_fw_str(struct ib_device *dev, char *str, size_t str_len)
+void ib_get_device_fw_str(struct ib_device *dev, char *str)
 {
 	if (dev->get_dev_fw_str)
-		dev->get_dev_fw_str(dev, str, str_len);
+		dev->get_dev_fw_str(dev, str);
 	else
 		str[0] = '\0';
 }
@@ -395,6 +405,30 @@ static int ib_security_change(struct notifier_block *nb, unsigned long event,
 }
 
 /**
+ *	__dev_new_index	-	allocate a device index
+ *
+ *	Returns a suitable unique value for a new device interface
+ *	number.  It assumes that fewer than 2^32-1 ib devices
+ *	will be present in the system.
+ */
+static u32 __dev_new_index(void)
+{
+	/*
+	 * The device index to allow stable naming.
+	 * Similar to struct net -> ifindex.
+	 */
+	static u32 index;
+
+	for (;;) {
+		if (!(++index))
+			index = 1;
+
+		if (!__ib_device_get_by_index(index))
+			return index;
+	}
+}
+
+/**
  * ib_register_device - Register an IB device with IB core
  * @device:Device to register
  *
@@ -489,9 +523,10 @@ int ib_register_device(struct ib_device *device,
 	device->reg_state = IB_DEV_REGISTERED;
 
 	list_for_each_entry(client, &client_list, list)
-		if (client->add && !add_client_context(device, client))
+		if (!add_client_context(device, client) && client->add)
 			client->add(device);
 
+	device->index = __dev_new_index();
 	down_write(&lists_rwsem);
 	list_add_tail(&device->core_list, &device_list);
 	up_write(&lists_rwsem);
@@ -578,7 +613,7 @@ int ib_register_client(struct ib_client *client)
 	mutex_lock(&device_mutex);
 
 	list_for_each_entry(device, &device_list, core_list)
-		if (client->add && !add_client_context(device, client))
+		if (!add_client_context(device, client) && client->add)
 			client->add(device);
 
 	down_write(&lists_rwsem);
@@ -712,7 +747,7 @@ EXPORT_SYMBOL(ib_set_client_data);
  * chapter 11 of the InfiniBand Architecture Specification).  This
  * callback may occur in interrupt context.
  */
-int ib_register_event_handler  (struct ib_event_handler *event_handler)
+void ib_register_event_handler(struct ib_event_handler *event_handler)
 {
 	unsigned long flags;
 
@@ -720,8 +755,6 @@ int ib_register_event_handler  (struct ib_event_handler *event_handler)
 	list_add_tail(&event_handler->list,
 		      &event_handler->device->event_handler_list);
 	spin_unlock_irqrestore(&event_handler->device->event_handler_lock, flags);
-
-	return 0;
 }
 EXPORT_SYMBOL(ib_register_event_handler);
 
@@ -732,15 +765,13 @@ EXPORT_SYMBOL(ib_register_event_handler);
  * Unregister an event handler registered with
  * ib_register_event_handler().
  */
-int ib_unregister_event_handler(struct ib_event_handler *event_handler)
+void ib_unregister_event_handler(struct ib_event_handler *event_handler)
 {
 	unsigned long flags;
 
 	spin_lock_irqsave(&event_handler->device->event_handler_lock, flags);
 	list_del(&event_handler->list);
 	spin_unlock_irqrestore(&event_handler->device->event_handler_lock, flags);
-
-	return 0;
 }
 EXPORT_SYMBOL(ib_unregister_event_handler);
 
@@ -894,6 +925,31 @@ void ib_enum_all_roce_netdevs(roce_netdev_filter filter,
 }
 
 /**
+ * ib_enum_all_devs - enumerate all ib_devices
+ * @nldev_cb: Callback to call for each found ib_device
+ *
+ * Enumerates all ib_devices and calls callback() on each device.
+ */
+int ib_enum_all_devs(nldev_callback nldev_cb, struct sk_buff *skb,
+		     struct netlink_callback *cb)
+{
+	struct ib_device *dev;
+	unsigned int idx = 0;
+	int ret = 0;
+
+	down_read(&lists_rwsem);
+	list_for_each_entry(dev, &device_list, core_list) {
+		ret = nldev_cb(dev, skb, cb, idx);
+		if (ret)
+			break;
+		idx++;
+	}
+
+	up_read(&lists_rwsem);
+	return ret;
+}
+
+/**
  * ib_query_pkey - Get P_Key table entry
  * @device:Device to query
  * @port_num:Port number to query
@@ -945,14 +1001,17 @@ int ib_modify_port(struct ib_device *device,
 		   u8 port_num, int port_modify_mask,
 		   struct ib_port_modify *port_modify)
 {
-	if (!device->modify_port)
-		return -ENOSYS;
+	int rc;
 
 	if (!rdma_is_port_valid(device, port_num))
 		return -EINVAL;
 
-	return device->modify_port(device, port_num, port_modify_mask,
-				   port_modify);
+	if (device->modify_port)
+		rc = device->modify_port(device, port_num, port_modify_mask,
+					   port_modify);
+	else
+		rc = rdma_protocol_roce(device, port_num) ? 0 : -ENOSYS;
+	return rc;
 }
 EXPORT_SYMBOL(ib_modify_port);
 
@@ -1087,29 +1146,21 @@ struct net_device *ib_get_net_dev_by_params(struct ib_device *dev,
 }
 EXPORT_SYMBOL(ib_get_net_dev_by_params);
 
-static struct ibnl_client_cbs ibnl_ls_cb_table[] = {
+static const struct rdma_nl_cbs ibnl_ls_cb_table[] = {
 	[RDMA_NL_LS_OP_RESOLVE] = {
-		.dump = ib_nl_handle_resolve_resp,
-		.module = THIS_MODULE },
+		.doit = ib_nl_handle_resolve_resp,
+		.flags = RDMA_NL_ADMIN_PERM,
+	},
 	[RDMA_NL_LS_OP_SET_TIMEOUT] = {
-		.dump = ib_nl_handle_set_timeout,
-		.module = THIS_MODULE },
+		.doit = ib_nl_handle_set_timeout,
+		.flags = RDMA_NL_ADMIN_PERM,
+	},
 	[RDMA_NL_LS_OP_IP_RESOLVE] = {
-		.dump = ib_nl_handle_ip_res_resp,
-		.module = THIS_MODULE },
+		.doit = ib_nl_handle_ip_res_resp,
+		.flags = RDMA_NL_ADMIN_PERM,
+	},
 };
 
-static int ib_add_ibnl_clients(void)
-{
-	return ibnl_add_client(RDMA_NL_LS, ARRAY_SIZE(ibnl_ls_cb_table),
-			       ibnl_ls_cb_table);
-}
-
-static void ib_remove_ibnl_clients(void)
-{
-	ibnl_remove_client(RDMA_NL_LS);
-}
-
 static int __init ib_core_init(void)
 {
 	int ret;
@@ -1131,9 +1182,9 @@ static int __init ib_core_init(void)
 		goto err_comp;
 	}
 
-	ret = ibnl_init();
+	ret = rdma_nl_init();
 	if (ret) {
-		pr_warn("Couldn't init IB netlink interface\n");
+		pr_warn("Couldn't init IB netlink interface: err %d\n", ret);
 		goto err_sysfs;
 	}
 
@@ -1155,24 +1206,18 @@ static int __init ib_core_init(void)
 		goto err_mad;
 	}
 
-	ret = ib_add_ibnl_clients();
-	if (ret) {
-		pr_warn("Couldn't register ibnl clients\n");
-		goto err_sa;
-	}
-
 	ret = register_lsm_notifier(&ibdev_lsm_nb);
 	if (ret) {
 		pr_warn("Couldn't register LSM notifier. ret %d\n", ret);
-		goto err_ibnl_clients;
+		goto err_sa;
 	}
 
+	nldev_init();
+	rdma_nl_register(RDMA_NL_LS, ibnl_ls_cb_table);
 	ib_cache_setup();
 
 	return 0;
 
-err_ibnl_clients:
-	ib_remove_ibnl_clients();
 err_sa:
 	ib_sa_cleanup();
 err_mad:
@@ -1180,7 +1225,7 @@ static int __init ib_core_init(void)
 err_addr:
 	addr_cleanup();
 err_ibnl:
-	ibnl_cleanup();
+	rdma_nl_exit();
 err_sysfs:
 	class_unregister(&ib_class);
 err_comp:
@@ -1192,18 +1237,21 @@ static int __init ib_core_init(void)
 
 static void __exit ib_core_cleanup(void)
 {
-	unregister_lsm_notifier(&ibdev_lsm_nb);
 	ib_cache_cleanup();
-	ib_remove_ibnl_clients();
+	nldev_exit();
+	rdma_nl_unregister(RDMA_NL_LS);
+	unregister_lsm_notifier(&ibdev_lsm_nb);
 	ib_sa_cleanup();
 	ib_mad_cleanup();
 	addr_cleanup();
-	ibnl_cleanup();
+	rdma_nl_exit();
 	class_unregister(&ib_class);
 	destroy_workqueue(ib_comp_wq);
 	/* Make sure that any pending umem accounting work is done. */
 	destroy_workqueue(ib_wq);
 }
 
+MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_LS, 4);
+
 module_init(ib_core_init);
 module_exit(ib_core_cleanup);
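
Note on __dev_new_index() above: the allocator increments a persistent counter, skips 0 on wraparound, and retries until it finds an index no registered device uses. A self-contained sketch of the same loop follows. The struct sketch_index_table and its array of used indices are hypothetical stand-ins for the kernel's device_list and the function-local static counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the device list plus the static counter. */
struct sketch_index_table {
	const uint32_t *used;	/* indices already taken */
	size_t n_used;
	uint32_t last;		/* last index handed out, 0 initially */
};

static bool sketch_index_in_use(const struct sketch_index_table *t,
				uint32_t index)
{
	size_t i;

	for (i = 0; i < t->n_used; i++)
		if (t->used[i] == index)
			return true;
	return false;
}

/* Return the next free non-zero index, wrapping from UINT32_MAX to 1. */
static uint32_t sketch_new_index(struct sketch_index_table *t)
{
	/* The loop only terminates if some non-zero index is free. */
	assert(t->n_used < UINT32_MAX);

	for (;;) {
		if (!(++t->last))
			t->last = 1;	/* 0 is never a valid index */

		if (!sketch_index_in_use(t, t->last))
			return t->last;
	}
}
```

The sketch does not add the returned index to the used set. In the patch that happens when the device is linked into device_list after the index is assigned.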
diff --git a/drivers/infiniband/core/iwcm.c b/drivers/infiniband/core/iwcm.c
index 31661b5..fcf42f6 100644
--- a/drivers/infiniband/core/iwcm.c
+++ b/drivers/infiniband/core/iwcm.c
@@ -80,7 +80,7 @@ const char *__attribute_const__ iwcm_reject_msg(int reason)
 }
 EXPORT_SYMBOL(iwcm_reject_msg);
 
-static struct ibnl_client_cbs iwcm_nl_cb_table[] = {
+static struct rdma_nl_cbs iwcm_nl_cb_table[] = {
 	[RDMA_NL_IWPM_REG_PID] = {.dump = iwpm_register_pid_cb},
 	[RDMA_NL_IWPM_ADD_MAPPING] = {.dump = iwpm_add_mapping_cb},
 	[RDMA_NL_IWPM_QUERY_MAPPING] = {.dump = iwpm_add_and_query_mapping_cb},
@@ -1175,13 +1175,9 @@ static int __init iw_cm_init(void)
 	ret = iwpm_init(RDMA_NL_IWCM);
 	if (ret)
 		pr_err("iw_cm: couldn't init iwpm\n");
-
-	ret = ibnl_add_client(RDMA_NL_IWCM, ARRAY_SIZE(iwcm_nl_cb_table),
-			      iwcm_nl_cb_table);
-	if (ret)
-		pr_err("iw_cm: couldn't register netlink callbacks\n");
-
-	iwcm_wq = alloc_ordered_workqueue("iw_cm_wq", WQ_MEM_RECLAIM);
+	else
+		rdma_nl_register(RDMA_NL_IWCM, iwcm_nl_cb_table);
+	iwcm_wq = alloc_ordered_workqueue("iw_cm_wq", 0);
 	if (!iwcm_wq)
 		return -ENOMEM;
 
@@ -1200,9 +1196,11 @@ static void __exit iw_cm_cleanup(void)
 {
 	unregister_net_sysctl_table(iwcm_ctl_table_hdr);
 	destroy_workqueue(iwcm_wq);
-	ibnl_remove_client(RDMA_NL_IWCM);
+	rdma_nl_unregister(RDMA_NL_IWCM);
 	iwpm_exit(RDMA_NL_IWCM);
 }
 
+MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_IWCM, 2);
+
 module_init(iw_cm_init);
 module_exit(iw_cm_cleanup);
diff --git a/drivers/infiniband/core/iwpm_msg.c b/drivers/infiniband/core/iwpm_msg.c
index a0e7c16..30825bb 100644
--- a/drivers/infiniband/core/iwpm_msg.c
+++ b/drivers/infiniband/core/iwpm_msg.c
@@ -42,7 +42,6 @@ int iwpm_valid_pid(void)
 {
 	return iwpm_user_pid > 0;
 }
-EXPORT_SYMBOL(iwpm_valid_pid);
 
 /*
  * iwpm_register_pid - Send a netlink query to user space
@@ -104,7 +103,7 @@ int iwpm_register_pid(struct iwpm_dev_data *pm_msg, u8 nl_client)
 	pr_debug("%s: Multicasting a nlmsg (dev = %s ifname = %s iwpm = %s)\n",
 		__func__, pm_msg->dev_name, pm_msg->if_name, iwpm_ulib_name);
 
-	ret = ibnl_multicast(skb, nlh, RDMA_NL_GROUP_IWPM, GFP_KERNEL);
+	ret = rdma_nl_multicast(skb, RDMA_NL_GROUP_IWPM, GFP_KERNEL);
 	if (ret) {
 		skb = NULL; /* skb is freed in the netlink send-op handling */
 		iwpm_user_pid = IWPM_PID_UNAVAILABLE;
@@ -122,7 +121,6 @@ int iwpm_register_pid(struct iwpm_dev_data *pm_msg, u8 nl_client)
 		iwpm_free_nlmsg_request(&nlmsg_request->kref);
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_register_pid);
 
 /*
  * iwpm_add_mapping - Send a netlink add mapping message
@@ -174,7 +172,7 @@ int iwpm_add_mapping(struct iwpm_sa_data *pm_msg, u8 nl_client)
 		goto add_mapping_error;
 	nlmsg_request->req_buffer = pm_msg;
 
-	ret = ibnl_unicast(skb, nlh, iwpm_user_pid);
+	ret = rdma_nl_unicast_wait(skb, iwpm_user_pid);
 	if (ret) {
 		skb = NULL; /* skb is freed in the netlink send-op handling */
 		iwpm_user_pid = IWPM_PID_UNDEFINED;
@@ -191,7 +189,6 @@ int iwpm_add_mapping(struct iwpm_sa_data *pm_msg, u8 nl_client)
 		iwpm_free_nlmsg_request(&nlmsg_request->kref);
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_add_mapping);
 
 /*
  * iwpm_add_and_query_mapping - Send a netlink add and query
@@ -251,7 +248,7 @@ int iwpm_add_and_query_mapping(struct iwpm_sa_data *pm_msg, u8 nl_client)
 		goto query_mapping_error;
 	nlmsg_request->req_buffer = pm_msg;
 
-	ret = ibnl_unicast(skb, nlh, iwpm_user_pid);
+	ret = rdma_nl_unicast_wait(skb, iwpm_user_pid);
 	if (ret) {
 		skb = NULL; /* skb is freed in the netlink send-op handling */
 		err_str = "Unable to send a nlmsg";
@@ -267,7 +264,6 @@ int iwpm_add_and_query_mapping(struct iwpm_sa_data *pm_msg, u8 nl_client)
 		iwpm_free_nlmsg_request(&nlmsg_request->kref);
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_add_and_query_mapping);
 
 /*
  * iwpm_remove_mapping - Send a netlink remove mapping message
@@ -312,7 +308,7 @@ int iwpm_remove_mapping(struct sockaddr_storage *local_addr, u8 nl_client)
 	if (ret)
 		goto remove_mapping_error;
 
-	ret = ibnl_unicast(skb, nlh, iwpm_user_pid);
+	ret = rdma_nl_unicast_wait(skb, iwpm_user_pid);
 	if (ret) {
 		skb = NULL; /* skb is freed in the netlink send-op handling */
 		iwpm_user_pid = IWPM_PID_UNDEFINED;
@@ -328,7 +324,6 @@ int iwpm_remove_mapping(struct sockaddr_storage *local_addr, u8 nl_client)
 		dev_kfree_skb_any(skb);
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_remove_mapping);
 
 /* netlink attribute policy for the received response to register pid request */
 static const struct nla_policy resp_reg_policy[IWPM_NLA_RREG_PID_MAX] = {
@@ -397,7 +392,6 @@ int iwpm_register_pid_cb(struct sk_buff *skb, struct netlink_callback *cb)
 	up(&nlmsg_request->sem);
 	return 0;
 }
-EXPORT_SYMBOL(iwpm_register_pid_cb);
 
 /* netlink attribute policy for the received response to add mapping request */
 static const struct nla_policy resp_add_policy[IWPM_NLA_RMANAGE_MAPPING_MAX] = {
@@ -466,7 +460,6 @@ int iwpm_add_mapping_cb(struct sk_buff *skb, struct netlink_callback *cb)
 	up(&nlmsg_request->sem);
 	return 0;
 }
-EXPORT_SYMBOL(iwpm_add_mapping_cb);
 
 /* netlink attribute policy for the response to add and query mapping request
  * and response with remote address info */
@@ -558,7 +551,6 @@ int iwpm_add_and_query_mapping_cb(struct sk_buff *skb,
 	up(&nlmsg_request->sem);
 	return 0;
 }
-EXPORT_SYMBOL(iwpm_add_and_query_mapping_cb);
 
 /*
  * iwpm_remote_info_cb - Process a port mapper message, containing
@@ -627,7 +619,6 @@ int iwpm_remote_info_cb(struct sk_buff *skb, struct netlink_callback *cb)
 			"remote_info: Mapped remote sockaddr:");
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_remote_info_cb);
 
 /* netlink attribute policy for the received request for mapping info */
 static const struct nla_policy resp_mapinfo_policy[IWPM_NLA_MAPINFO_REQ_MAX] = {
@@ -677,7 +668,6 @@ int iwpm_mapping_info_cb(struct sk_buff *skb, struct netlink_callback *cb)
 	ret = iwpm_send_mapinfo(nl_client, iwpm_user_pid);
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_mapping_info_cb);
 
 /* netlink attribute policy for the received mapping info ack */
 static const struct nla_policy ack_mapinfo_policy[IWPM_NLA_MAPINFO_NUM_MAX] = {
@@ -707,7 +697,6 @@ int iwpm_ack_mapping_info_cb(struct sk_buff *skb, struct netlink_callback *cb)
 	atomic_set(&echo_nlmsg_seq, cb->nlh->nlmsg_seq);
 	return 0;
 }
-EXPORT_SYMBOL(iwpm_ack_mapping_info_cb);
 
 /* netlink attribute policy for the received port mapper error message */
 static const struct nla_policy map_error_policy[IWPM_NLA_ERR_MAX] = {
@@ -751,4 +740,3 @@ int iwpm_mapping_error_cb(struct sk_buff *skb, struct netlink_callback *cb)
 	up(&nlmsg_request->sem);
 	return 0;
 }
-EXPORT_SYMBOL(iwpm_mapping_error_cb);
diff --git a/drivers/infiniband/core/iwpm_util.c b/drivers/infiniband/core/iwpm_util.c
index f13870e..c81c559 100644
--- a/drivers/infiniband/core/iwpm_util.c
+++ b/drivers/infiniband/core/iwpm_util.c
@@ -54,8 +54,6 @@ static struct iwpm_admin_data iwpm_admin;
 int iwpm_init(u8 nl_client)
 {
 	int ret = 0;
-	if (iwpm_valid_client(nl_client))
-		return -EINVAL;
 	mutex_lock(&iwpm_admin_lock);
 	if (atomic_read(&iwpm_admin.refcount) == 0) {
 		iwpm_hash_bucket = kzalloc(IWPM_MAPINFO_HASH_SIZE *
@@ -83,7 +81,6 @@ int iwpm_init(u8 nl_client)
 	}
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_init);
 
 static void free_hash_bucket(void);
 static void free_reminfo_bucket(void);
@@ -109,7 +106,6 @@ int iwpm_exit(u8 nl_client)
 	iwpm_set_registration(nl_client, IWPM_REG_UNDEF);
 	return 0;
 }
-EXPORT_SYMBOL(iwpm_exit);
 
 static struct hlist_head *get_mapinfo_hash_bucket(struct sockaddr_storage *,
 					       struct sockaddr_storage *);
@@ -148,7 +144,6 @@ int iwpm_create_mapinfo(struct sockaddr_storage *local_sockaddr,
 	spin_unlock_irqrestore(&iwpm_mapinfo_lock, flags);
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_create_mapinfo);
 
 int iwpm_remove_mapinfo(struct sockaddr_storage *local_sockaddr,
 			struct sockaddr_storage *mapped_local_addr)
@@ -184,7 +179,6 @@ int iwpm_remove_mapinfo(struct sockaddr_storage *local_sockaddr,
 	spin_unlock_irqrestore(&iwpm_mapinfo_lock, flags);
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_remove_mapinfo);
 
 static void free_hash_bucket(void)
 {
@@ -297,7 +291,6 @@ int iwpm_get_remote_info(struct sockaddr_storage *mapped_loc_addr,
 	spin_unlock_irqrestore(&iwpm_reminfo_lock, flags);
 	return ret;
 }
-EXPORT_SYMBOL(iwpm_get_remote_info);
 
 struct iwpm_nlmsg_request *iwpm_get_nlmsg_request(__u32 nlmsg_seq,
 					u8 nl_client, gfp_t gfp)
@@ -383,15 +376,11 @@ int iwpm_get_nlmsg_seq(void)
 
 int iwpm_valid_client(u8 nl_client)
 {
-	if (nl_client >= RDMA_NL_NUM_CLIENTS)
-		return 0;
 	return iwpm_admin.client_list[nl_client];
 }
 
 void iwpm_set_valid(u8 nl_client, int valid)
 {
-	if (nl_client >= RDMA_NL_NUM_CLIENTS)
-		return;
 	iwpm_admin.client_list[nl_client] = valid;
 }
 
@@ -608,7 +597,7 @@ static int send_mapinfo_num(u32 mapping_num, u8 nl_client, int iwpm_pid)
 				&mapping_num, IWPM_NLA_MAPINFO_SEND_NUM);
 	if (ret)
 		goto mapinfo_num_error;
-	ret = ibnl_unicast(skb, nlh, iwpm_pid);
+	ret = rdma_nl_unicast(skb, iwpm_pid);
 	if (ret) {
 		skb = NULL;
 		err_str = "Unable to send a nlmsg";
@@ -637,7 +626,7 @@ static int send_nlmsg_done(struct sk_buff *skb, u8 nl_client, int iwpm_pid)
 		return -ENOMEM;
 	}
 	nlh->nlmsg_type = NLMSG_DONE;
-	ret = ibnl_unicast(skb, (struct nlmsghdr *)skb->data, iwpm_pid);
+	ret = rdma_nl_unicast(skb, iwpm_pid);
 	if (ret)
 		pr_warn("%s Unable to send a nlmsg\n", __func__);
 	return ret;
diff --git a/drivers/infiniband/core/mad_rmpp.c b/drivers/infiniband/core/mad_rmpp.c
index 0d3cca0..e5cf09c 100644
--- a/drivers/infiniband/core/mad_rmpp.c
+++ b/drivers/infiniband/core/mad_rmpp.c
@@ -64,7 +64,7 @@ struct mad_rmpp_recv {
 
 	__be64 tid;
 	u32 src_qp;
-	u16 slid;
+	u32 slid;
 	u8 mgmt_class;
 	u8 class_version;
 	u8 method;
diff --git a/drivers/infiniband/core/netlink.c b/drivers/infiniband/core/netlink.c
index 94931c4..e685148 100644
--- a/drivers/infiniband/core/netlink.c
+++ b/drivers/infiniband/core/netlink.c
@@ -1,4 +1,5 @@
 /*
+ * Copyright (c) 2017 Mellanox Technologies Inc.  All rights reserved.
  * Copyright (c) 2010 Voltaire Inc.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
@@ -37,239 +38,267 @@
 #include <net/net_namespace.h>
 #include <net/sock.h>
 #include <rdma/rdma_netlink.h>
+#include <linux/module.h>
 #include "core_priv.h"
 
-struct ibnl_client {
-	struct list_head		list;
-	int				index;
-	int				nops;
-	const struct ibnl_client_cbs   *cb_table;
-};
+#include "core_priv.h"
 
-static DEFINE_MUTEX(ibnl_mutex);
+static DEFINE_MUTEX(rdma_nl_mutex);
 static struct sock *nls;
-static LIST_HEAD(client_list);
+static struct {
+	const struct rdma_nl_cbs   *cb_table;
+} rdma_nl_types[RDMA_NL_NUM_CLIENTS];
 
-int ibnl_chk_listeners(unsigned int group)
+int rdma_nl_chk_listeners(unsigned int group)
 {
-	if (netlink_has_listeners(nls, group) == 0)
-		return -1;
-	return 0;
+	return (netlink_has_listeners(nls, group)) ? 0 : -1;
+}
+EXPORT_SYMBOL(rdma_nl_chk_listeners);
+
+static bool is_nl_msg_valid(unsigned int type, unsigned int op)
+{
+	static const unsigned int max_num_ops[RDMA_NL_NUM_CLIENTS - 1] = {
+				  RDMA_NL_RDMA_CM_NUM_OPS,
+				  RDMA_NL_IWPM_NUM_OPS,
+				  0,
+				  RDMA_NL_LS_NUM_OPS,
+				  RDMA_NLDEV_NUM_OPS };
+
+	/*
+	 * This BUILD_BUG_ON is intended to catch addition of new
+	 * RDMA netlink protocol without updating the array above.
+	 */
+	BUILD_BUG_ON(RDMA_NL_NUM_CLIENTS != 6);
+
+	if (type > RDMA_NL_NUM_CLIENTS - 1)
+		return false;
+
+	return (op < max_num_ops[type - 1]) ? true : false;
 }
 
-int ibnl_add_client(int index, int nops,
-		    const struct ibnl_client_cbs cb_table[])
+static bool is_nl_valid(unsigned int type, unsigned int op)
 {
-	struct ibnl_client *cur;
-	struct ibnl_client *nl_client;
+	const struct rdma_nl_cbs *cb_table;
 
-	nl_client = kmalloc(sizeof *nl_client, GFP_KERNEL);
-	if (!nl_client)
-		return -ENOMEM;
+	if (!is_nl_msg_valid(type, op))
+		return false;
 
-	nl_client->index	= index;
-	nl_client->nops		= nops;
-	nl_client->cb_table	= cb_table;
+	cb_table = rdma_nl_types[type].cb_table;
+#ifdef CONFIG_MODULES
+	if (!cb_table) {
+		mutex_unlock(&rdma_nl_mutex);
+		request_module("rdma-netlink-subsys-%d", type);
+		mutex_lock(&rdma_nl_mutex);
+		cb_table = rdma_nl_types[type].cb_table;
+	}
+#endif
 
-	mutex_lock(&ibnl_mutex);
+	if (!cb_table || (!cb_table[op].dump && !cb_table[op].doit))
+		return false;
+	return true;
+}
 
-	list_for_each_entry(cur, &client_list, list) {
-		if (cur->index == index) {
-			pr_warn("Client for %d already exists\n", index);
-			mutex_unlock(&ibnl_mutex);
-			kfree(nl_client);
-			return -EINVAL;
-		}
+void rdma_nl_register(unsigned int index,
+		      const struct rdma_nl_cbs cb_table[])
+{
+	mutex_lock(&rdma_nl_mutex);
+	if (!is_nl_msg_valid(index, 0)) {
+		/*
+		 * No client is interested in the success or failure of
+		 * this call. Clients want the error to be logged and to
+		 * continue their initialization. Print a warning for them,
+		 * because it is a programmer's error to be here.
+		 */
+		mutex_unlock(&rdma_nl_mutex);
+		WARN(true,
+		     "The not-valid %u index was supplied to RDMA netlink\n",
+		     index);
+		return;
 	}
 
-	list_add_tail(&nl_client->list, &client_list);
-
-	mutex_unlock(&ibnl_mutex);
-
-	return 0;
-}
-EXPORT_SYMBOL(ibnl_add_client);
-
-int ibnl_remove_client(int index)
-{
-	struct ibnl_client *cur, *next;
-
-	mutex_lock(&ibnl_mutex);
-	list_for_each_entry_safe(cur, next, &client_list, list) {
-		if (cur->index == index) {
-			list_del(&(cur->list));
-			mutex_unlock(&ibnl_mutex);
-			kfree(cur);
-			return 0;
-		}
+	if (rdma_nl_types[index].cb_table) {
+		mutex_unlock(&rdma_nl_mutex);
+		WARN(true,
+		     "The %u index is already registered in RDMA netlink\n",
+		     index);
+		return;
 	}
-	pr_warn("Can't remove callback for client idx %d. Not found\n", index);
-	mutex_unlock(&ibnl_mutex);
 
-	return -EINVAL;
+	rdma_nl_types[index].cb_table = cb_table;
+	mutex_unlock(&rdma_nl_mutex);
 }
-EXPORT_SYMBOL(ibnl_remove_client);
+EXPORT_SYMBOL(rdma_nl_register);
+
+void rdma_nl_unregister(unsigned int index)
+{
+	mutex_lock(&rdma_nl_mutex);
+	rdma_nl_types[index].cb_table = NULL;
+	mutex_unlock(&rdma_nl_mutex);
+}
+EXPORT_SYMBOL(rdma_nl_unregister);
 
 void *ibnl_put_msg(struct sk_buff *skb, struct nlmsghdr **nlh, int seq,
 		   int len, int client, int op, int flags)
 {
-	unsigned char *prev_tail;
-
-	prev_tail = skb_tail_pointer(skb);
-	*nlh = nlmsg_put(skb, 0, seq, RDMA_NL_GET_TYPE(client, op),
-			 len, flags);
+	*nlh = nlmsg_put(skb, 0, seq, RDMA_NL_GET_TYPE(client, op), len, flags);
 	if (!*nlh)
-		goto out_nlmsg_trim;
-	(*nlh)->nlmsg_len = skb_tail_pointer(skb) - prev_tail;
+		return NULL;
 	return nlmsg_data(*nlh);
-
-out_nlmsg_trim:
-	nlmsg_trim(skb, prev_tail);
-	return NULL;
 }
 EXPORT_SYMBOL(ibnl_put_msg);
 
 int ibnl_put_attr(struct sk_buff *skb, struct nlmsghdr *nlh,
 		  int len, void *data, int type)
 {
-	unsigned char *prev_tail;
-
-	prev_tail = skb_tail_pointer(skb);
-	if (nla_put(skb, type, len, data))
-		goto nla_put_failure;
-	nlh->nlmsg_len += skb_tail_pointer(skb) - prev_tail;
+	if (nla_put(skb, type, len, data)) {
+		nlmsg_cancel(skb, nlh);
+		return -EMSGSIZE;
+	}
 	return 0;
-
-nla_put_failure:
-	nlmsg_trim(skb, prev_tail - nlh->nlmsg_len);
-	return -EMSGSIZE;
 }
 EXPORT_SYMBOL(ibnl_put_attr);
 
-static int ibnl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
-			struct netlink_ext_ack *extack)
+static int rdma_nl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
+			   struct netlink_ext_ack *extack)
 {
-	struct ibnl_client *client;
 	int type = nlh->nlmsg_type;
-	int index = RDMA_NL_GET_CLIENT(type);
+	unsigned int index = RDMA_NL_GET_CLIENT(type);
 	unsigned int op = RDMA_NL_GET_OP(type);
+	const struct rdma_nl_cbs *cb_table;
 
-	list_for_each_entry(client, &client_list, list) {
-		if (client->index == index) {
-			if (op >= client->nops || !client->cb_table[op].dump)
-				return -EINVAL;
+	if (!is_nl_valid(index, op))
+		return -EINVAL;
 
-			/*
-			 * For response or local service set_timeout request,
-			 * there is no need to use netlink_dump_start.
-			 */
-			if (!(nlh->nlmsg_flags & NLM_F_REQUEST) ||
-			    (index == RDMA_NL_LS &&
-			     op == RDMA_NL_LS_OP_SET_TIMEOUT)) {
-				struct netlink_callback cb = {
-					.skb = skb,
-					.nlh = nlh,
-					.dump = client->cb_table[op].dump,
-					.module = client->cb_table[op].module,
-				};
+	cb_table = rdma_nl_types[index].cb_table;
 
-				return cb.dump(skb, &cb);
-			}
+	if ((cb_table[op].flags & RDMA_NL_ADMIN_PERM) &&
+	    !netlink_capable(skb, CAP_NET_ADMIN))
+		return -EPERM;
 
-			{
-				struct netlink_dump_control c = {
-					.dump = client->cb_table[op].dump,
-					.module = client->cb_table[op].module,
-				};
-				return netlink_dump_start(nls, skb, nlh, &c);
-			}
-		}
+	/* FIXME: Convert IWCM to properly handle doit callbacks */
+	if ((nlh->nlmsg_flags & NLM_F_DUMP) || index == RDMA_NL_RDMA_CM ||
+	    index == RDMA_NL_IWCM) {
+		struct netlink_dump_control c = {
+			.dump = cb_table[op].dump,
+		};
+		return netlink_dump_start(nls, skb, nlh, &c);
 	}
 
-	pr_info("Index %d wasn't found in client list\n", index);
-	return -EINVAL;
+	if (cb_table[op].doit)
+		return cb_table[op].doit(skb, nlh, extack);
+
+	return 0;
 }
 
-static void ibnl_rcv_reply_skb(struct sk_buff *skb)
+/*
+ * This function is similar to netlink_rcv_skb() with one exception:
+ * it also invokes the callback for netlink messages that lack the
+ * NLM_F_REQUEST flag. Such messages are intended for the RDMA_NL_LS
+ * consumer, so this is allowed for that consumer only.
+ */
+static int rdma_nl_rcv_skb(struct sk_buff *skb, int (*cb)(struct sk_buff *,
+						   struct nlmsghdr *,
+						   struct netlink_ext_ack *))
 {
+	struct netlink_ext_ack extack = {};
 	struct nlmsghdr *nlh;
-	int msglen;
+	int err;
 
-	/*
-	 * Process responses until there is no more message or the first
-	 * request. Generally speaking, it is not recommended to mix responses
-	 * with requests.
-	 */
 	while (skb->len >= nlmsg_total_size(0)) {
+		int msglen;
+
 		nlh = nlmsg_hdr(skb);
+		err = 0;
 
 		if (nlh->nlmsg_len < NLMSG_HDRLEN || skb->len < nlh->nlmsg_len)
-			return;
+			return 0;
 
-		/* Handle response only */
-		if (nlh->nlmsg_flags & NLM_F_REQUEST)
-			return;
+		/*
+		 * Generally speaking, only requests are handled by the
+		 * kernel, but RDMA_NL_LS is different because it runs the
+		 * netlink scheme backwards: the kernel initiates messages
+		 * and waits for a reply with data to keep the pathrecord
+		 * cache in sync.
+		 */
+		if (!(nlh->nlmsg_flags & NLM_F_REQUEST) &&
+		    (RDMA_NL_GET_CLIENT(nlh->nlmsg_type) != RDMA_NL_LS))
+			goto ack;
 
-		ibnl_rcv_msg(skb, nlh, NULL);
+		/* Skip control messages */
+		if (nlh->nlmsg_type < NLMSG_MIN_TYPE)
+			goto ack;
 
+		err = cb(skb, nlh, &extack);
+		if (err == -EINTR)
+			goto skip;
+
+ack:
+		if (nlh->nlmsg_flags & NLM_F_ACK || err)
+			netlink_ack(skb, nlh, err, &extack);
+
+skip:
 		msglen = NLMSG_ALIGN(nlh->nlmsg_len);
 		if (msglen > skb->len)
 			msglen = skb->len;
 		skb_pull(skb, msglen);
 	}
+
+	return 0;
 }
 
-static void ibnl_rcv(struct sk_buff *skb)
+static void rdma_nl_rcv(struct sk_buff *skb)
 {
-	mutex_lock(&ibnl_mutex);
-	ibnl_rcv_reply_skb(skb);
-	netlink_rcv_skb(skb, &ibnl_rcv_msg);
-	mutex_unlock(&ibnl_mutex);
+	mutex_lock(&rdma_nl_mutex);
+	rdma_nl_rcv_skb(skb, &rdma_nl_rcv_msg);
+	mutex_unlock(&rdma_nl_mutex);
 }
 
-int ibnl_unicast(struct sk_buff *skb, struct nlmsghdr *nlh,
-			__u32 pid)
+int rdma_nl_unicast(struct sk_buff *skb, u32 pid)
+{
+	int err;
+
+	err = netlink_unicast(nls, skb, pid, MSG_DONTWAIT);
+	return (err < 0) ? err : 0;
+}
+EXPORT_SYMBOL(rdma_nl_unicast);
+
+int rdma_nl_unicast_wait(struct sk_buff *skb, __u32 pid)
 {
 	int err;
 
 	err = netlink_unicast(nls, skb, pid, 0);
 	return (err < 0) ? err : 0;
 }
-EXPORT_SYMBOL(ibnl_unicast);
+EXPORT_SYMBOL(rdma_nl_unicast_wait);
 
-int ibnl_multicast(struct sk_buff *skb, struct nlmsghdr *nlh,
-			unsigned int group, gfp_t flags)
+int rdma_nl_multicast(struct sk_buff *skb, unsigned int group, gfp_t flags)
 {
 	return nlmsg_multicast(nls, skb, 0, group, flags);
 }
-EXPORT_SYMBOL(ibnl_multicast);
+EXPORT_SYMBOL(rdma_nl_multicast);
 
-int __init ibnl_init(void)
+int __init rdma_nl_init(void)
 {
 	struct netlink_kernel_cfg cfg = {
-		.input	= ibnl_rcv,
+		.input	= rdma_nl_rcv,
 	};
 
 	nls = netlink_kernel_create(&init_net, NETLINK_RDMA, &cfg);
-	if (!nls) {
-		pr_warn("Failed to create netlink socket\n");
+	if (!nls)
 		return -ENOMEM;
-	}
 
 	nls->sk_sndtimeo = 10 * HZ;
 	return 0;
 }
 
-void ibnl_cleanup(void)
+void rdma_nl_exit(void)
 {
-	struct ibnl_client *cur, *next;
+	int idx;
 
-	mutex_lock(&ibnl_mutex);
-	list_for_each_entry_safe(cur, next, &client_list, list) {
-		list_del(&(cur->list));
-		kfree(cur);
-	}
-	mutex_unlock(&ibnl_mutex);
+	for (idx = 0; idx < RDMA_NL_NUM_CLIENTS; idx++)
+		rdma_nl_unregister(idx);
 
 	netlink_kernel_release(nls);
 }
+
+MODULE_ALIAS_NET_PF_PROTO(PF_NETLINK, NETLINK_RDMA);
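Note on the hunk above: the list of kmalloc'ed `ibnl_client` entries is replaced by a fixed array of callback-table pointers indexed by client number, so registering and unregistering reduce to setting or clearing one slot. The following is a minimal userspace sketch of that idea, not the kernel code. All `demo_*` names and the client count are hypothetical, locking is omitted, and the register function returns a status where the kernel version returns void and issues a WARN.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the names and the count are not the kernel's. */
#define DEMO_NUM_CLIENTS 6

struct demo_cbs {
	int (*doit)(int op);
	int (*dump)(int op);
};

/* One slot per client index; NULL means "not registered". */
static const struct demo_cbs *demo_tables[DEMO_NUM_CLIENTS];

/* Returns 0 on success, -1 if index is out of range or already taken. */
int demo_register(unsigned int index, const struct demo_cbs *cbs)
{
	assert(cbs != NULL);

	if (index >= DEMO_NUM_CLIENTS || demo_tables[index])
		return -1;

	demo_tables[index] = cbs;
	return 0;
}

/* Clearing the slot is all that unregistering needs: nothing to free. */
void demo_unregister(unsigned int index)
{
	if (index < DEMO_NUM_CLIENTS)
		demo_tables[index] = NULL;
}

/* Returns 1 if a handler table is installed for index, else 0. */
int demo_is_registered(unsigned int index)
{
	return index < DEMO_NUM_CLIENTS && demo_tables[index] != NULL;
}
```

Because the table only stores a pointer to caller-owned constant data, there is no allocation that can fail, which is one plausible reason the kernel functions in the hunk no longer return an error code.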
diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
new file mode 100644
index 0000000..3ba24c4
--- /dev/null
+++ b/drivers/infiniband/core/nldev.c
@@ -0,0 +1,325 @@
+/*
+ * Copyright (c) 2017 Mellanox Technologies. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in the
+ *    documentation and/or other materials provided with the distribution.
+ * 3. Neither the names of the copyright holders nor the names of its
+ *    contributors may be used to endorse or promote products derived from
+ *    this software without specific prior written permission.
+ *
+ * Alternatively, this software may be distributed under the terms of the
+ * GNU General Public License ("GPL") version 2 as published by the Free
+ * Software Foundation.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
+ * ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
+ * INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
+ * CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <linux/module.h>
+#include <net/netlink.h>
+#include <rdma/rdma_netlink.h>
+
+#include "core_priv.h"
+
+static const struct nla_policy nldev_policy[RDMA_NLDEV_ATTR_MAX] = {
+	[RDMA_NLDEV_ATTR_DEV_INDEX]     = { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_DEV_NAME]	= { .type = NLA_NUL_STRING,
+					    .len = IB_DEVICE_NAME_MAX - 1},
+	[RDMA_NLDEV_ATTR_PORT_INDEX]	= { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_FW_VERSION]	= { .type = NLA_NUL_STRING,
+					    .len = IB_FW_VERSION_NAME_MAX - 1},
+	[RDMA_NLDEV_ATTR_NODE_GUID]	= { .type = NLA_U64 },
+	[RDMA_NLDEV_ATTR_SYS_IMAGE_GUID] = { .type = NLA_U64 },
+	[RDMA_NLDEV_ATTR_SUBNET_PREFIX]	= { .type = NLA_U64 },
+	[RDMA_NLDEV_ATTR_LID]		= { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_SM_LID]	= { .type = NLA_U32 },
+	[RDMA_NLDEV_ATTR_LMC]		= { .type = NLA_U8 },
+	[RDMA_NLDEV_ATTR_PORT_STATE]	= { .type = NLA_U8 },
+	[RDMA_NLDEV_ATTR_PORT_PHYS_STATE] = { .type = NLA_U8 },
+	[RDMA_NLDEV_ATTR_DEV_NODE_TYPE] = { .type = NLA_U8 },
+};
+
+static int fill_dev_info(struct sk_buff *msg, struct ib_device *device)
+{
+	char fw[IB_FW_VERSION_NAME_MAX];
+
+	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index))
+		return -EMSGSIZE;
+	if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name))
+		return -EMSGSIZE;
+	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, rdma_end_port(device)))
+		return -EMSGSIZE;
+
+	BUILD_BUG_ON(sizeof(device->attrs.device_cap_flags) != sizeof(u64));
+	if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_CAP_FLAGS,
+			      device->attrs.device_cap_flags, 0))
+		return -EMSGSIZE;
+
+	ib_get_device_fw_str(device, fw);
+	/* A device without FW has strlen(fw) == 0 */
+	if (strlen(fw) && nla_put_string(msg, RDMA_NLDEV_ATTR_FW_VERSION, fw))
+		return -EMSGSIZE;
+
+	if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_NODE_GUID,
+			      be64_to_cpu(device->node_guid), 0))
+		return -EMSGSIZE;
+	if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_SYS_IMAGE_GUID,
+			      be64_to_cpu(device->attrs.sys_image_guid), 0))
+		return -EMSGSIZE;
+	if (nla_put_u8(msg, RDMA_NLDEV_ATTR_DEV_NODE_TYPE, device->node_type))
+		return -EMSGSIZE;
+	return 0;
+}
+
+static int fill_port_info(struct sk_buff *msg,
+			  struct ib_device *device, u32 port)
+{
+	struct ib_port_attr attr;
+	int ret;
+
+	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_DEV_INDEX, device->index))
+		return -EMSGSIZE;
+	if (nla_put_string(msg, RDMA_NLDEV_ATTR_DEV_NAME, device->name))
+		return -EMSGSIZE;
+	if (nla_put_u32(msg, RDMA_NLDEV_ATTR_PORT_INDEX, port))
+		return -EMSGSIZE;
+
+	ret = ib_query_port(device, port, &attr);
+	if (ret)
+		return ret;
+
+	BUILD_BUG_ON(sizeof(attr.port_cap_flags) > sizeof(u64));
+	if (nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_CAP_FLAGS,
+			      (u64)attr.port_cap_flags, 0))
+		return -EMSGSIZE;
+	if (rdma_protocol_ib(device, port) &&
+	    nla_put_u64_64bit(msg, RDMA_NLDEV_ATTR_SUBNET_PREFIX,
+			      attr.subnet_prefix, 0))
+		return -EMSGSIZE;
+	if (rdma_protocol_ib(device, port)) {
+		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_LID, attr.lid))
+			return -EMSGSIZE;
+		if (nla_put_u32(msg, RDMA_NLDEV_ATTR_SM_LID, attr.sm_lid))
+			return -EMSGSIZE;
+		if (nla_put_u8(msg, RDMA_NLDEV_ATTR_LMC, attr.lmc))
+			return -EMSGSIZE;
+	}
+	if (nla_put_u8(msg, RDMA_NLDEV_ATTR_PORT_STATE, attr.state))
+		return -EMSGSIZE;
+	if (nla_put_u8(msg, RDMA_NLDEV_ATTR_PORT_PHYS_STATE, attr.phys_state))
+		return -EMSGSIZE;
+	return 0;
+}
+
+static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
+			  struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
+	struct ib_device *device;
+	struct sk_buff *msg;
+	u32 index;
+	int err;
+
+	err = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
+			  nldev_policy, extack);
+	if (err || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
+		return -EINVAL;
+
+	index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
+
+	device = __ib_device_get_by_index(index);
+	if (!device)
+		return -EINVAL;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
+			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
+			0, 0);
+
+	err = fill_dev_info(msg, device);
+	if (err) {
+		nlmsg_free(msg);
+		return err;
+	}
+
+	nlmsg_end(msg, nlh);
+
+	return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
+}
+
+static int _nldev_get_dumpit(struct ib_device *device,
+			     struct sk_buff *skb,
+			     struct netlink_callback *cb,
+			     unsigned int idx)
+{
+	int start = cb->args[0];
+	struct nlmsghdr *nlh;
+
+	if (idx < start)
+		return 0;
+
+	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
+			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
+			0, NLM_F_MULTI);
+
+	if (fill_dev_info(skb, device)) {
+		nlmsg_cancel(skb, nlh);
+		goto out;
+	}
+
+	nlmsg_end(skb, nlh);
+
+	idx++;
+
+out:	cb->args[0] = idx;
+	return skb->len;
+}
+
+static int nldev_get_dumpit(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	/*
+	 * There is no need to take lock, because
+	 * we are relying on ib_core's lists_rwsem
+	 */
+	return ib_enum_all_devs(_nldev_get_dumpit, skb, cb);
+}
+
+static int nldev_port_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
+			       struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
+	struct ib_device *device;
+	struct sk_buff *msg;
+	u32 index;
+	u32 port;
+	int err;
+
+	err = nlmsg_parse(nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
+			  nldev_policy, extack);
+	if (err || !tb[RDMA_NLDEV_ATTR_PORT_INDEX])
+		return -EINVAL;
+
+	index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
+	device = __ib_device_get_by_index(index);
+	if (!device)
+		return -EINVAL;
+
+	port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
+	if (!rdma_is_port_valid(device, port))
+		return -EINVAL;
+
+	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!msg)
+		return -ENOMEM;
+
+	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
+			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
+			0, 0);
+
+	err = fill_port_info(msg, device, port);
+	if (err) {
+		nlmsg_free(msg);
+		return err;
+	}
+
+	nlmsg_end(msg, nlh);
+
+	return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
+}
+
+static int nldev_port_get_dumpit(struct sk_buff *skb,
+				 struct netlink_callback *cb)
+{
+	struct nlattr *tb[RDMA_NLDEV_ATTR_MAX];
+	struct ib_device *device;
+	int start = cb->args[0];
+	struct nlmsghdr *nlh;
+	u32 idx = 0;
+	u32 ifindex;
+	int err;
+	u32 p;
+
+	err = nlmsg_parse(cb->nlh, 0, tb, RDMA_NLDEV_ATTR_MAX - 1,
+			  nldev_policy, NULL);
+	if (err || !tb[RDMA_NLDEV_ATTR_DEV_INDEX])
+		return -EINVAL;
+
+	ifindex = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
+	device = __ib_device_get_by_index(ifindex);
+	if (!device)
+		return -EINVAL;
+
+	for (p = rdma_start_port(device); p <= rdma_end_port(device); ++p) {
+		/*
+		 * The dumpit function returns all information starting from
+		 * a specific index. This index is taken from the netlink
+		 * message request sent by the user and is available
+		 * in cb->args[0].
+		 *
+		 * Usually, the user doesn't fill this field, which causes
+		 * everything to be returned.
+		 *
+		 */
+		if (idx < start) {
+			idx++;
+			continue;
+		}
+
+		nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid,
+				cb->nlh->nlmsg_seq,
+				RDMA_NL_GET_TYPE(RDMA_NL_NLDEV,
+						 RDMA_NLDEV_CMD_PORT_GET),
+				0, NLM_F_MULTI);
+
+		if (fill_port_info(skb, device, p)) {
+			nlmsg_cancel(skb, nlh);
+			goto out;
+		}
+		idx++;
+		nlmsg_end(skb, nlh);
+	}
+
+out:	cb->args[0] = idx;
+	return skb->len;
+}
+
+static const struct rdma_nl_cbs nldev_cb_table[] = {
+	[RDMA_NLDEV_CMD_GET] = {
+		.doit = nldev_get_doit,
+		.dump = nldev_get_dumpit,
+	},
+	[RDMA_NLDEV_CMD_PORT_GET] = {
+		.doit = nldev_port_get_doit,
+		.dump = nldev_port_get_dumpit,
+	},
+};
+
+void __init nldev_init(void)
+{
+	rdma_nl_register(RDMA_NL_NLDEV, nldev_cb_table);
+}
+
+void __exit nldev_exit(void)
+{
+	rdma_nl_unregister(RDMA_NL_NLDEV);
+}
+
+MODULE_ALIAS_RDMA_NETLINK(RDMA_NL_NLDEV, 5);
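Note on the dumpit callbacks above: both keep their position in `cb->args[0]`, skip entries below that index on re-entry, and stop when the next message no longer fits. The sketch below imitates that resume pattern in plain userspace C under stated assumptions: `struct dump_state` and `dump_chunk` are hypothetical names, an `int` array stands in for the device or port list, and a fixed output capacity stands in for a full skb.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct netlink_callback: only the resume slot. */
struct dump_state {
	long args0;	/* index of the first item not yet emitted */
};

/*
 * Copy items starting at state->args0 into out, at most out_cap per call.
 * Returns the number of items written; 0 means the dump is complete.
 */
size_t dump_chunk(const int *items, size_t n_items,
		  int *out, size_t out_cap, struct dump_state *state)
{
	size_t start = (size_t)state->args0;
	size_t idx = 0, written = 0;
	size_t i;

	for (i = 0; i < n_items; i++) {
		if (idx < start) {	/* already sent in an earlier call */
			idx++;
			continue;
		}
		if (written == out_cap)	/* "skb full": stop, resume later */
			break;
		out[written++] = items[i];
		idx++;
	}

	state->args0 = (long)idx;	/* remember where to resume */
	return written;
}
```

A caller would invoke `dump_chunk` repeatedly with the same `state` until it returns 0, much as the netlink core keeps calling a dump callback until it returns a zero length.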
diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
index 41c31a2..85b5ee4 100644
--- a/drivers/infiniband/core/rdma_core.c
+++ b/drivers/infiniband/core/rdma_core.c
@@ -35,10 +35,57 @@
 #include <rdma/ib_verbs.h>
 #include <rdma/uverbs_types.h>
 #include <linux/rcupdate.h>
+#include <rdma/uverbs_ioctl.h>
+#include <rdma/rdma_user_ioctl.h>
 #include "uverbs.h"
 #include "core_priv.h"
 #include "rdma_core.h"
 
+int uverbs_ns_idx(u16 *id, unsigned int ns_count)
+{
+	int ret = (*id & UVERBS_ID_NS_MASK) >> UVERBS_ID_NS_SHIFT;
+
+	if (ret >= ns_count)
+		return -EINVAL;
+
+	*id &= ~UVERBS_ID_NS_MASK;
+	return ret;
+}
+
+const struct uverbs_object_spec *uverbs_get_object(const struct ib_device *ibdev,
+						   uint16_t object)
+{
+	const struct uverbs_root_spec *object_hash = ibdev->specs_root;
+	const struct uverbs_object_spec_hash *objects;
+	int ret = uverbs_ns_idx(&object, object_hash->num_buckets);
+
+	if (ret < 0)
+		return NULL;
+
+	objects = object_hash->object_buckets[ret];
+
+	if (object >= objects->num_objects)
+		return NULL;
+
+	return objects->objects[object];
+}
+
+const struct uverbs_method_spec *uverbs_get_method(const struct uverbs_object_spec *object,
+						   uint16_t method)
+{
+	const struct uverbs_method_spec_hash *methods;
+	int ret = uverbs_ns_idx(&method, object->num_buckets);
+
+	if (ret < 0)
+		return NULL;
+
+	methods = object->method_buckets[ret];
+	if (method >= methods->num_methods)
+		return NULL;
+
+	return methods->methods[method];
+}
+
 void uverbs_uobject_get(struct ib_uobject *uobject)
 {
 	kref_get(&uobject->ref);
@@ -404,6 +451,41 @@ int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj)
 	return ret;
 }
 
+static int null_obj_type_class_remove_commit(struct ib_uobject *uobj,
+					     enum rdma_remove_reason why)
+{
+	return 0;
+}
+
+static const struct uverbs_obj_type null_obj_type = {
+	.type_class = &((const struct uverbs_obj_type_class){
+			.remove_commit = null_obj_type_class_remove_commit,
+			/* be cautious */
+			.needs_kfree_rcu = true}),
+};
+
+int rdma_explicit_destroy(struct ib_uobject *uobject)
+{
+	int ret;
+	struct ib_ucontext *ucontext = uobject->context;
+
+	/* Cleanup is running. Calling this should have been impossible */
+	if (!down_read_trylock(&ucontext->cleanup_rwsem)) {
+		WARN(true, "ib_uverbs: Cleanup is running while removing an uobject\n");
+		return 0;
+	}
+	lockdep_check(uobject, true);
+	ret = uobject->type->type_class->remove_commit(uobject,
+						       RDMA_REMOVE_DESTROY);
+	if (ret)
+		return ret;
+
+	uobject->type = &null_obj_type;
+
+	up_read(&ucontext->cleanup_rwsem);
+	return 0;
+}
+
 static void alloc_commit_idr_uobject(struct ib_uobject *uobj)
 {
 	uverbs_uobject_add(uobj);
@@ -625,3 +707,100 @@ const struct uverbs_obj_type_class uverbs_fd_class = {
 	.needs_kfree_rcu = false,
 };
 
+struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type *type_attrs,
+						   struct ib_ucontext *ucontext,
+						   enum uverbs_obj_access access,
+						   int id)
+{
+	switch (access) {
+	case UVERBS_ACCESS_READ:
+		return rdma_lookup_get_uobject(type_attrs, ucontext, id, false);
+	case UVERBS_ACCESS_DESTROY:
+	case UVERBS_ACCESS_WRITE:
+		return rdma_lookup_get_uobject(type_attrs, ucontext, id, true);
+	case UVERBS_ACCESS_NEW:
+		return rdma_alloc_begin_uobject(type_attrs, ucontext);
+	default:
+		WARN_ON(true);
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+}
+
+int uverbs_finalize_object(struct ib_uobject *uobj,
+			   enum uverbs_obj_access access,
+			   bool commit)
+{
+	int ret = 0;
+
+	/*
+	 * refcounts should be handled at the object level and not at the
+	 * uobject level. Refcounts of the objects themselves are done in
+	 * handlers.
+	 */
+
+	switch (access) {
+	case UVERBS_ACCESS_READ:
+		rdma_lookup_put_uobject(uobj, false);
+		break;
+	case UVERBS_ACCESS_WRITE:
+		rdma_lookup_put_uobject(uobj, true);
+		break;
+	case UVERBS_ACCESS_DESTROY:
+		if (commit)
+			ret = rdma_remove_commit_uobject(uobj);
+		else
+			rdma_lookup_put_uobject(uobj, true);
+		break;
+	case UVERBS_ACCESS_NEW:
+		if (commit)
+			ret = rdma_alloc_commit_uobject(uobj);
+		else
+			rdma_alloc_abort_uobject(uobj);
+		break;
+	default:
+		WARN_ON(true);
+		ret = -EOPNOTSUPP;
+	}
+
+	return ret;
+}
+
+int uverbs_finalize_objects(struct uverbs_attr_bundle *attrs_bundle,
+			    struct uverbs_attr_spec_hash * const *spec_hash,
+			    size_t num,
+			    bool commit)
+{
+	unsigned int i;
+	int ret = 0;
+
+	for (i = 0; i < num; i++) {
+		struct uverbs_attr_bundle_hash *curr_bundle =
+			&attrs_bundle->hash[i];
+		const struct uverbs_attr_spec_hash *curr_spec_bucket =
+			spec_hash[i];
+		unsigned int j;
+
+		for (j = 0; j < curr_bundle->num_attrs; j++) {
+			struct uverbs_attr *attr;
+			const struct uverbs_attr_spec *spec;
+
+			if (!uverbs_attr_is_valid_in_hash(curr_bundle, j))
+				continue;
+
+			attr = &curr_bundle->attrs[j];
+			spec = &curr_spec_bucket->attrs[j];
+
+			if (spec->type == UVERBS_ATTR_TYPE_IDR ||
+			    spec->type == UVERBS_ATTR_TYPE_FD) {
+				int current_ret;
+
+				current_ret = uverbs_finalize_object(attr->obj_attr.uobject,
+								     spec->obj.access,
+								     commit);
+				if (!ret)
+					ret = current_ret;
+			}
+		}
+	}
+	return ret;
+}
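Note on `uverbs_ns_idx()` in the rdma_core.c hunks above: it splits a 16-bit id into a namespace index (the masked high bits) and an in-namespace id (what remains after the mask is cleared), and rejects namespaces beyond the bucket count. A small userspace sketch of that split follows. The mask and shift values are assumptions chosen for illustration, since the real `UVERBS_ID_NS_MASK` and `UVERBS_ID_NS_SHIFT` definitions are not part of this diff, and the sketch returns -1 where the kernel returns -EINVAL.

```c
#include <stdint.h>

/* Assumed values for illustration; the real ones live in the uverbs headers. */
#define DEMO_ID_NS_MASK  0xF000u
#define DEMO_ID_NS_SHIFT 12

/*
 * Returns the namespace index encoded in *id and strips it from *id,
 * or returns -1 and leaves *id untouched if the index is out of range.
 */
int demo_ns_idx(uint16_t *id, unsigned int ns_count)
{
	unsigned int ns = (*id & DEMO_ID_NS_MASK) >> DEMO_ID_NS_SHIFT;

	if (ns >= ns_count)
		return -1;

	*id &= (uint16_t)~DEMO_ID_NS_MASK;
	return (int)ns;
}
```

With these assumed constants, an id of 0x1005 and two namespaces yields namespace 1 and a stripped id of 0x0005, which the caller can then bounds-check against the chosen bucket as `uverbs_get_object()` and `uverbs_get_method()` do.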
diff --git a/drivers/infiniband/core/rdma_core.h b/drivers/infiniband/core/rdma_core.h
index 1b82e7f..1efcf93 100644
--- a/drivers/infiniband/core/rdma_core.h
+++ b/drivers/infiniband/core/rdma_core.h
@@ -39,9 +39,15 @@
 
 #include <linux/idr.h>
 #include <rdma/uverbs_types.h>
+#include <rdma/uverbs_ioctl.h>
 #include <rdma/ib_verbs.h>
 #include <linux/mutex.h>
 
+int uverbs_ns_idx(u16 *id, unsigned int ns_count);
+const struct uverbs_object_spec *uverbs_get_object(const struct ib_device *ibdev,
+						   uint16_t object);
+const struct uverbs_method_spec *uverbs_get_method(const struct uverbs_object_spec *object,
+						   uint16_t method);
 /*
  * These functions initialize the context and cleanups its uobjects.
  * The context has a list of objects which is protected by a mutex
@@ -75,4 +81,40 @@ void uverbs_uobject_put(struct ib_uobject *uobject);
  */
 void uverbs_close_fd(struct file *f);
 
+/*
+ * Get an ib_uobject that corresponds to the given id from ucontext, assuming
+ * the object is from the given type. Lock it to the required access when
+ * applicable.
+ * This function could create (access == NEW), destroy (access == DESTROY)
+ * or unlock (access == READ || access == WRITE) objects if required.
+ * The action will be finalized only when uverbs_finalize_object or
+ * uverbs_finalize_objects are called.
+ */
+struct ib_uobject *uverbs_get_uobject_from_context(const struct uverbs_obj_type *type_attrs,
+						   struct ib_ucontext *ucontext,
+						   enum uverbs_obj_access access,
+						   int id);
+int uverbs_finalize_object(struct ib_uobject *uobj,
+			   enum uverbs_obj_access access,
+			   bool commit);
+/*
+ * Note that certain finalize stages could return a status:
+ *   (a) alloc_commit could return a failure if the object is committed at the
+ *       same time when the context is destroyed.
+ *   (b) remove_commit could fail if the object wasn't destroyed successfully.
+ * Since multiple objects could be finalized in one transaction, it is strongly
+ * NOT recommended to have several finalize actions which have side effects.
+ * For example, it's NOT recommended to have a certain action which has both
+ * a commit action and a destroy action or two destroy objects in the same
+ * action. The rule of thumb is to have one destroy or commit action with
+ * multiple lookups.
+ * The first non zero return value of finalize_object is returned from this
+ * function. For example, this could happen when we couldn't destroy an
+ * object.
+ */
+int uverbs_finalize_objects(struct uverbs_attr_bundle *attrs_bundle,
+			    struct uverbs_attr_spec_hash * const *spec_hash,
+			    size_t num,
+			    bool commit);
+
 #endif /* RDMA_CORE_H */
diff --git a/drivers/infiniband/core/roce_gid_mgmt.c b/drivers/infiniband/core/roce_gid_mgmt.c
index 94a9eef..90e3889 100644
--- a/drivers/infiniband/core/roce_gid_mgmt.c
+++ b/drivers/infiniband/core/roce_gid_mgmt.c
@@ -44,6 +44,8 @@
 
 static struct workqueue_struct *gid_cache_wq;
 
+static struct workqueue_struct *gid_cache_wq;
+
 enum gid_op_type {
 	GID_DEL = 0,
 	GID_ADD
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 70fa4ca..ab5e102 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -50,6 +50,7 @@
 #include <uapi/rdma/ib_user_sa.h>
 #include <rdma/ib_marshall.h>
 #include <rdma/ib_addr.h>
+#include <rdma/opa_addr.h>
 #include "sa.h"
 #include "core_priv.h"
 
@@ -861,7 +862,7 @@ static int ib_nl_send_msg(struct ib_sa_query *query, gfp_t gfp_mask)
 	/* Repair the nlmsg header length */
 	nlmsg_end(skb, nlh);
 
-	ret = ibnl_multicast(skb, nlh, RDMA_NL_GROUP_LS, gfp_mask);
+	ret = rdma_nl_multicast(skb, RDMA_NL_GROUP_LS, gfp_mask);
 	if (!ret)
 		ret = len;
 	else
@@ -1021,9 +1022,9 @@ static void ib_nl_request_timeout(struct work_struct *work)
 }
 
 int ib_nl_handle_set_timeout(struct sk_buff *skb,
-			     struct netlink_callback *cb)
+			     struct nlmsghdr *nlh,
+			     struct netlink_ext_ack *extack)
 {
-	const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
 	int timeout, delta, abs_delta;
 	const struct nlattr *attr;
 	unsigned long flags;
@@ -1033,8 +1034,7 @@ int ib_nl_handle_set_timeout(struct sk_buff *skb,
 	int ret;
 
 	if (!(nlh->nlmsg_flags & NLM_F_REQUEST) ||
-	    !(NETLINK_CB(skb).sk) ||
-	    !netlink_capable(skb, CAP_NET_ADMIN))
+	    !(NETLINK_CB(skb).sk))
 		return -EPERM;
 
 	ret = nla_parse(tb, LS_NLA_TYPE_MAX - 1, nlmsg_data(nlh),
@@ -1098,9 +1098,9 @@ static inline int ib_nl_is_good_resolve_resp(const struct nlmsghdr *nlh)
 }
 
 int ib_nl_handle_resolve_resp(struct sk_buff *skb,
-			      struct netlink_callback *cb)
+			      struct nlmsghdr *nlh,
+			      struct netlink_ext_ack *extack)
 {
-	const struct nlmsghdr *nlh = (struct nlmsghdr *)cb->nlh;
 	unsigned long flags;
 	struct ib_sa_query *query;
 	struct ib_mad_send_buf *send_buf;
@@ -1109,8 +1109,7 @@ int ib_nl_handle_resolve_resp(struct sk_buff *skb,
 	int ret;
 
 	if ((nlh->nlmsg_flags & NLM_F_REQUEST) ||
-	    !(NETLINK_CB(skb).sk) ||
-	    !netlink_capable(skb, CAP_NET_ADMIN))
+	    !(NETLINK_CB(skb).sk))
 		return -EPERM;
 
 	spin_lock_irqsave(&ib_nl_request_lock, flags);
@@ -1241,6 +1240,11 @@ int ib_init_ah_from_path(struct ib_device *device, u8 port_num,
 	ah_attr->type = rdma_ah_find_type(device, port_num);
 
 	rdma_ah_set_dlid(ah_attr, be32_to_cpu(sa_path_get_dlid(rec)));
+
+	if ((ah_attr->type == RDMA_AH_ATTR_TYPE_OPA) &&
+	    (rdma_ah_get_dlid(ah_attr) == be16_to_cpu(IB_LID_PERMISSIVE)))
+		rdma_ah_set_make_grd(ah_attr, true);
+
 	rdma_ah_set_sl(ah_attr, rec->sl);
 	rdma_ah_set_path_bits(ah_attr, be32_to_cpu(sa_path_get_slid(rec)) &
 			      get_src_path_mask(device, port_num));
@@ -1420,7 +1424,7 @@ static int send_mad(struct ib_sa_query *query, int timeout_ms, gfp_t gfp_mask)
 
 	if ((query->flags & IB_SA_ENABLE_LOCAL_SERVICE) &&
 	    (!(query->flags & IB_SA_QUERY_OPA))) {
-		if (!ibnl_chk_listeners(RDMA_NL_GROUP_LS)) {
+		if (!rdma_nl_chk_listeners(RDMA_NL_GROUP_LS)) {
 			if (!ib_nl_make_request(query, gfp_mask))
 				return id;
 		}
@@ -2290,12 +2294,15 @@ static void update_sm_ah(struct work_struct *work)
 	rdma_ah_set_sl(&ah_attr, port_attr.sm_sl);
 	rdma_ah_set_port_num(&ah_attr, port->port_num);
 	if (port_attr.grh_required) {
-		rdma_ah_set_ah_flags(&ah_attr, IB_AH_GRH);
-
-		rdma_ah_set_subnet_prefix(&ah_attr,
-					  cpu_to_be64(port_attr.subnet_prefix));
-		rdma_ah_set_interface_id(&ah_attr,
-					 cpu_to_be64(IB_SA_WELL_KNOWN_GUID));
+		if (ah_attr.type == RDMA_AH_ATTR_TYPE_OPA) {
+			rdma_ah_set_make_grd(&ah_attr, true);
+		} else {
+			rdma_ah_set_ah_flags(&ah_attr, IB_AH_GRH);
+			rdma_ah_set_subnet_prefix(&ah_attr,
+						  cpu_to_be64(port_attr.subnet_prefix));
+			rdma_ah_set_interface_id(&ah_attr,
+						 cpu_to_be64(IB_SA_WELL_KNOWN_GUID));
+		}
 	}
 
 	new_ah->ah = rdma_create_ah(port->agent->qp->pd, &ah_attr);
@@ -2410,8 +2417,7 @@ static void ib_sa_add_one(struct ib_device *device)
 	 */
 
 	INIT_IB_EVENT_HANDLER(&sa_dev->event_handler, device, ib_sa_event);
-	if (ib_register_event_handler(&sa_dev->event_handler))
-		goto err;
+	ib_register_event_handler(&sa_dev->event_handler);
 
 	for (i = 0; i <= e - s; ++i) {
 		if (rdma_cap_ib_sa(device, i + 1))
diff --git a/drivers/infiniband/core/sysfs.c b/drivers/infiniband/core/sysfs.c
index 7ebe1ef..abc5ab5 100644
--- a/drivers/infiniband/core/sysfs.c
+++ b/drivers/infiniband/core/sysfs.c
@@ -1210,8 +1210,8 @@ static ssize_t show_fw_ver(struct device *device, struct device_attribute *attr,
 {
 	struct ib_device *dev = container_of(device, struct ib_device, dev);
 
-	ib_get_device_fw_str(dev, buf, PAGE_SIZE);
-	strlcat(buf, "\n", PAGE_SIZE);
+	ib_get_device_fw_str(dev, buf);
+	strlcat(buf, "\n", IB_FW_VERSION_NAME_MAX);
 	return strlen(buf);
 }
 
diff --git a/drivers/infiniband/core/ucm.c b/drivers/infiniband/core/ucm.c
index 112099c..f2a7f62 100644
--- a/drivers/infiniband/core/ucm.c
+++ b/drivers/infiniband/core/ucm.c
@@ -618,7 +618,7 @@ static ssize_t ib_ucm_init_qp_attr(struct ib_ucm_file *file,
 	if (result)
 		goto out;
 
-	ib_copy_qp_attr_to_user(&resp, &qp_attr);
+	ib_copy_qp_attr_to_user(ctx->cm_id->device, &resp, &qp_attr);
 
 	if (copy_to_user((void __user *)(unsigned long)cmd.response,
 			 &resp, sizeof(resp)))
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index 276f0ef..eb85b54 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -248,14 +248,15 @@ static void ucma_copy_conn_event(struct rdma_ucm_conn_param *dst,
 	dst->qp_num = src->qp_num;
 }
 
-static void ucma_copy_ud_event(struct rdma_ucm_ud_param *dst,
+static void ucma_copy_ud_event(struct ib_device *device,
+			       struct rdma_ucm_ud_param *dst,
 			       struct rdma_ud_param *src)
 {
 	if (src->private_data_len)
 		memcpy(dst->private_data, src->private_data,
 		       src->private_data_len);
 	dst->private_data_len = src->private_data_len;
-	ib_copy_ah_attr_to_user(&dst->ah_attr, &src->ah_attr);
+	ib_copy_ah_attr_to_user(device, &dst->ah_attr, &src->ah_attr);
 	dst->qp_num = src->qp_num;
 	dst->qkey = src->qkey;
 }
@@ -335,7 +336,8 @@ static int ucma_event_handler(struct rdma_cm_id *cm_id,
 	uevent->resp.event = event->event;
 	uevent->resp.status = event->status;
 	if (cm_id->qp_type == IB_QPT_UD)
-		ucma_copy_ud_event(&uevent->resp.param.ud, &event->param.ud);
+		ucma_copy_ud_event(cm_id->device, &uevent->resp.param.ud,
+				   &event->param.ud);
 	else
 		ucma_copy_conn_event(&uevent->resp.param.conn,
 				     &event->param.conn);
@@ -1157,7 +1159,7 @@ static ssize_t ucma_init_qp_attr(struct ucma_file *file,
 	if (ret)
 		goto out;
 
-	ib_copy_qp_attr_to_user(&resp, &qp_attr);
+	ib_copy_qp_attr_to_user(ctx->cm_id->device, &resp, &qp_attr);
 	if (copy_to_user((void __user *)(unsigned long)cmd.response,
 			 &resp, sizeof(resp)))
 		ret = -EFAULT;
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index 8c4ec56..55e8f5e 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -166,24 +166,6 @@ static int invalidate_page_trampoline(struct ib_umem *item, u64 start,
 	return 0;
 }
 
-static void ib_umem_notifier_invalidate_page(struct mmu_notifier *mn,
-					     struct mm_struct *mm,
-					     unsigned long address)
-{
-	struct ib_ucontext *context = container_of(mn, struct ib_ucontext, mn);
-
-	if (!context->invalidate_range)
-		return;
-
-	ib_ucontext_notifier_start_account(context);
-	down_read(&context->umem_rwsem);
-	rbt_ib_umem_for_each_in_range(&context->umem_tree, address,
-				      address + PAGE_SIZE,
-				      invalidate_page_trampoline, NULL);
-	up_read(&context->umem_rwsem);
-	ib_ucontext_notifier_end_account(context);
-}
-
 static int invalidate_range_start_trampoline(struct ib_umem *item, u64 start,
 					     u64 end, void *cookie)
 {
@@ -237,7 +219,6 @@ static void ib_umem_notifier_invalidate_range_end(struct mmu_notifier *mn,
 
 static const struct mmu_notifier_ops ib_umem_notifiers = {
 	.release                    = ib_umem_notifier_release,
-	.invalidate_page            = ib_umem_notifier_invalidate_page,
 	.invalidate_range_start     = ib_umem_notifier_invalidate_range_start,
 	.invalidate_range_end       = ib_umem_notifier_invalidate_range_end,
 };
diff --git a/drivers/infiniband/core/user_mad.c b/drivers/infiniband/core/user_mad.c
index 36a6f5c..c1696e6 100644
--- a/drivers/infiniband/core/user_mad.c
+++ b/drivers/infiniband/core/user_mad.c
@@ -229,7 +229,7 @@ static void recv_handler(struct ib_mad_agent *agent,
 	packet->mad.hdr.status	   = 0;
 	packet->mad.hdr.length	   = hdr_size(file) + mad_recv_wc->mad_len;
 	packet->mad.hdr.qpn	   = cpu_to_be32(mad_recv_wc->wc->src_qp);
-	packet->mad.hdr.lid	   = cpu_to_be16(mad_recv_wc->wc->slid);
+	packet->mad.hdr.lid	   = ib_lid_be16(mad_recv_wc->wc->slid);
 	packet->mad.hdr.sl	   = mad_recv_wc->wc->sl;
 	packet->mad.hdr.path_bits  = mad_recv_wc->wc->dlid_path_bits;
 	packet->mad.hdr.pkey_index = mad_recv_wc->wc->pkey_index;
diff --git a/drivers/infiniband/core/uverbs.h b/drivers/infiniband/core/uverbs.h
index 64d494a..37c8903 100644
--- a/drivers/infiniband/core/uverbs.h
+++ b/drivers/infiniband/core/uverbs.h
@@ -100,6 +100,7 @@ struct ib_uverbs_device {
 	struct mutex				lists_mutex; /* protect lists */
 	struct list_head			uverbs_file_list;
 	struct list_head			uverbs_events_file_list;
+	struct uverbs_root_spec			*specs_root;
 };
 
 struct ib_uverbs_event_queue {
@@ -218,6 +219,8 @@ int uverbs_dealloc_mw(struct ib_mw *mw);
 void ib_uverbs_detach_umcast(struct ib_qp *qp,
 			     struct ib_uqp_object *uobj);
 
+long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg);
+
 struct ib_uverbs_flow_spec {
 	union {
 		union {
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 739bd69..e0cb9986 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -91,9 +91,10 @@ ssize_t ib_uverbs_get_context(struct ib_uverbs_file *file,
 		goto err;
 	}
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd, out_len - sizeof resp);
+	INIT_UDATA(&udata, buf + sizeof(cmd),
+		   (unsigned long) cmd.response + sizeof(resp),
+		   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+		   out_len - sizeof(resp));
 
 	ret = ib_rdmacg_try_charge(&cg_obj, ib_dev, RDMACG_RESOURCE_HCA_HANDLE);
 	if (ret)
@@ -275,8 +276,14 @@ ssize_t ib_uverbs_query_port(struct ib_uverbs_file *file,
 	resp.bad_pkey_cntr   = attr.bad_pkey_cntr;
 	resp.qkey_viol_cntr  = attr.qkey_viol_cntr;
 	resp.pkey_tbl_len    = attr.pkey_tbl_len;
-	resp.lid 	     = attr.lid;
-	resp.sm_lid 	     = attr.sm_lid;
+
+	if (rdma_cap_opa_ah(ib_dev, cmd.port_num)) {
+		resp.lid     = OPA_TO_IB_UCAST_LID(attr.lid);
+		resp.sm_lid  = OPA_TO_IB_UCAST_LID(attr.sm_lid);
+	} else {
+		resp.lid     = ib_lid_cpu16(attr.lid);
+		resp.sm_lid  = ib_lid_cpu16(attr.sm_lid);
+	}
 	resp.lmc 	     = attr.lmc;
 	resp.max_vl_num      = attr.max_vl_num;
 	resp.sm_sl 	     = attr.sm_sl;
@@ -313,9 +320,10 @@ ssize_t ib_uverbs_alloc_pd(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd, out_len - sizeof resp);
+	INIT_UDATA(&udata, buf + sizeof(cmd),
+		   (unsigned long) cmd.response + sizeof(resp),
+                   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+                   out_len - sizeof(resp));
 
 	uobj  = uobj_alloc(uobj_get_type(pd), file->ucontext);
 	if (IS_ERR(uobj))
@@ -482,9 +490,10 @@ ssize_t ib_uverbs_open_xrcd(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd, out_len - sizeof  resp);
+	INIT_UDATA(&udata, buf + sizeof(cmd),
+		   (unsigned long) cmd.response + sizeof(resp),
+                   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+                   out_len - sizeof(resp));
 
 	mutex_lock(&file->device->xrcd_tree_mutex);
 
@@ -646,9 +655,10 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd, out_len - sizeof resp);
+	INIT_UDATA(&udata, buf + sizeof(cmd),
+		   (unsigned long) cmd.response + sizeof(resp),
+                   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+                   out_len - sizeof(resp));
 
 	if ((cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK))
 		return -EINVAL;
@@ -740,7 +750,8 @@ ssize_t ib_uverbs_rereg_mr(struct ib_uverbs_file *file,
 
 	INIT_UDATA(&udata, buf + sizeof(cmd),
 		   (unsigned long) cmd.response + sizeof(resp),
-		   in_len - sizeof(cmd), out_len - sizeof(resp));
+                   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+                   out_len - sizeof(resp));
 
 	if (cmd.flags & ~IB_MR_REREG_SUPPORTED || !cmd.flags)
 		return -EINVAL;
@@ -1080,7 +1091,8 @@ ssize_t ib_uverbs_create_cq(struct ib_uverbs_file *file,
 
 	INIT_UDATA(&uhw, buf + sizeof(cmd),
 		   (unsigned long)cmd.response + sizeof(resp),
-		   in_len - sizeof(cmd), out_len - sizeof(resp));
+		   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+		   out_len - sizeof(resp));
 
 	memset(&cmd_ex, 0, sizeof(cmd_ex));
 	cmd_ex.user_handle = cmd.user_handle;
@@ -1161,9 +1173,10 @@ ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd, out_len - sizeof resp);
+	INIT_UDATA(&udata, buf + sizeof(cmd),
+		   (unsigned long) cmd.response + sizeof(resp),
+		   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+		   out_len - sizeof(resp));
 
 	cq = uobj_get_obj_read(cq, cmd.cq_handle, file->ucontext);
 	if (!cq)
@@ -1185,7 +1198,8 @@ ssize_t ib_uverbs_resize_cq(struct ib_uverbs_file *file,
 	return ret ? ret : in_len;
 }
 
-static int copy_wc_to_user(void __user *dest, struct ib_wc *wc)
+static int copy_wc_to_user(struct ib_device *ib_dev, void __user *dest,
+			   struct ib_wc *wc)
 {
 	struct ib_uverbs_wc tmp;
 
@@ -1199,7 +1213,10 @@ static int copy_wc_to_user(void __user *dest, struct ib_wc *wc)
 	tmp.src_qp		= wc->src_qp;
 	tmp.wc_flags		= wc->wc_flags;
 	tmp.pkey_index		= wc->pkey_index;
-	tmp.slid		= wc->slid;
+	if (rdma_cap_opa_ah(ib_dev, wc->port_num))
+		tmp.slid	= OPA_TO_IB_UCAST_LID(wc->slid);
+	else
+		tmp.slid	= ib_lid_cpu16(wc->slid);
 	tmp.sl			= wc->sl;
 	tmp.dlid_path_bits	= wc->dlid_path_bits;
 	tmp.port_num		= wc->port_num;
@@ -1243,7 +1260,7 @@ ssize_t ib_uverbs_poll_cq(struct ib_uverbs_file *file,
 		if (!ret)
 			break;
 
-		ret = copy_wc_to_user(data_ptr, &wc);
+		ret = copy_wc_to_user(ib_dev, data_ptr, &wc);
 		if (ret)
 			goto out_put;
 
@@ -1383,8 +1400,9 @@ static int create_qp(struct ib_uverbs_file *file,
 		attr.rwq_ind_tbl = ind_tbl;
 	}
 
-	if ((cmd_sz >= offsetof(typeof(*cmd), reserved1) +
-		       sizeof(cmd->reserved1)) && cmd->reserved1) {
+	if (cmd_sz > sizeof(*cmd) &&
+	    !ib_is_udata_cleared(ucore, sizeof(*cmd),
+				 cmd_sz - sizeof(*cmd))) {
 		ret = -EOPNOTSUPP;
 		goto err_put;
 	}
@@ -1420,7 +1438,7 @@ static int create_qp(struct ib_uverbs_file *file,
 			if (cmd->is_srq) {
 				srq = uobj_get_obj_read(srq, cmd->srq_handle,
 							file->ucontext);
-				if (!srq || srq->srq_type != IB_SRQT_BASIC) {
+				if (!srq || srq->srq_type == IB_SRQT_XRC) {
 					ret = -EINVAL;
 					goto err_put;
 				}
@@ -1482,11 +1500,21 @@ static int create_qp(struct ib_uverbs_file *file,
 				IB_QP_CREATE_MANAGED_SEND |
 				IB_QP_CREATE_MANAGED_RECV |
 				IB_QP_CREATE_SCATTER_FCS |
-				IB_QP_CREATE_CVLAN_STRIPPING)) {
+				IB_QP_CREATE_CVLAN_STRIPPING |
+				IB_QP_CREATE_SOURCE_QPN)) {
 		ret = -EINVAL;
 		goto err_put;
 	}
 
+	if (attr.create_flags & IB_QP_CREATE_SOURCE_QPN) {
+		if (!capable(CAP_NET_RAW)) {
+			ret = -EPERM;
+			goto err_put;
+		}
+
+		attr.source_qpn = cmd->source_qpn;
+	}
+
 	buf = (void *)cmd + sizeof(*cmd);
 	if (cmd_sz > sizeof(*cmd))
 		if (!(buf[0] == 0 && !memcmp(buf, buf + 1,
@@ -1722,9 +1750,10 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd, out_len - sizeof resp);
+	INIT_UDATA(&udata, buf + sizeof(cmd),
+		   (unsigned long) cmd.response + sizeof(resp),
+		   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+		   out_len - sizeof(resp));
 
 	obj  = (struct ib_uqp_object *)uobj_alloc(uobj_get_type(qp),
 						  file->ucontext);
@@ -1791,6 +1820,28 @@ ssize_t ib_uverbs_open_qp(struct ib_uverbs_file *file,
 	return ret;
 }
 
+static void copy_ah_attr_to_uverbs(struct ib_uverbs_qp_dest *uverb_attr,
+				   struct rdma_ah_attr *rdma_attr)
+{
+	const struct ib_global_route   *grh;
+
+	uverb_attr->dlid              = rdma_ah_get_dlid(rdma_attr);
+	uverb_attr->sl                = rdma_ah_get_sl(rdma_attr);
+	uverb_attr->src_path_bits     = rdma_ah_get_path_bits(rdma_attr);
+	uverb_attr->static_rate       = rdma_ah_get_static_rate(rdma_attr);
+	uverb_attr->is_global         = !!(rdma_ah_get_ah_flags(rdma_attr) &
+					 IB_AH_GRH);
+	if (uverb_attr->is_global) {
+		grh = rdma_ah_read_grh(rdma_attr);
+		memcpy(uverb_attr->dgid, grh->dgid.raw, 16);
+		uverb_attr->flow_label        = grh->flow_label;
+		uverb_attr->sgid_index        = grh->sgid_index;
+		uverb_attr->hop_limit         = grh->hop_limit;
+		uverb_attr->traffic_class     = grh->traffic_class;
+	}
+	uverb_attr->port_num          = rdma_ah_get_port_num(rdma_attr);
+}
+
 ssize_t ib_uverbs_query_qp(struct ib_uverbs_file *file,
 			   struct ib_device *ib_dev,
 			   const char __user *buf, int in_len,
@@ -1801,7 +1852,6 @@ ssize_t ib_uverbs_query_qp(struct ib_uverbs_file *file,
 	struct ib_qp                   *qp;
 	struct ib_qp_attr              *attr;
 	struct ib_qp_init_attr         *init_attr;
-	const struct ib_global_route   *grh;
 	int                            ret;
 
 	if (copy_from_user(&cmd, buf, sizeof cmd))
@@ -1851,39 +1901,8 @@ ssize_t ib_uverbs_query_qp(struct ib_uverbs_file *file,
 	resp.alt_port_num           = attr->alt_port_num;
 	resp.alt_timeout            = attr->alt_timeout;
 
-	resp.dest.dlid              = rdma_ah_get_dlid(&attr->ah_attr);
-	resp.dest.sl                = rdma_ah_get_sl(&attr->ah_attr);
-	resp.dest.src_path_bits     = rdma_ah_get_path_bits(&attr->ah_attr);
-	resp.dest.static_rate       = rdma_ah_get_static_rate(&attr->ah_attr);
-	resp.dest.is_global         = !!(rdma_ah_get_ah_flags(&attr->ah_attr) &
-					 IB_AH_GRH);
-	if (resp.dest.is_global) {
-		grh = rdma_ah_read_grh(&attr->ah_attr);
-		memcpy(resp.dest.dgid, grh->dgid.raw, 16);
-		resp.dest.flow_label        = grh->flow_label;
-		resp.dest.sgid_index        = grh->sgid_index;
-		resp.dest.hop_limit         = grh->hop_limit;
-		resp.dest.traffic_class     = grh->traffic_class;
-	}
-	resp.dest.port_num          = rdma_ah_get_port_num(&attr->ah_attr);
-
-	resp.alt_dest.dlid          = rdma_ah_get_dlid(&attr->alt_ah_attr);
-	resp.alt_dest.sl            = rdma_ah_get_sl(&attr->alt_ah_attr);
-	resp.alt_dest.src_path_bits = rdma_ah_get_path_bits(&attr->alt_ah_attr);
-	resp.alt_dest.static_rate
-			= rdma_ah_get_static_rate(&attr->alt_ah_attr);
-	resp.alt_dest.is_global
-			= !!(rdma_ah_get_ah_flags(&attr->alt_ah_attr) &
-						  IB_AH_GRH);
-	if (resp.alt_dest.is_global) {
-		grh = rdma_ah_read_grh(&attr->alt_ah_attr);
-		memcpy(resp.alt_dest.dgid, grh->dgid.raw, 16);
-		resp.alt_dest.flow_label    = grh->flow_label;
-		resp.alt_dest.sgid_index    = grh->sgid_index;
-		resp.alt_dest.hop_limit     = grh->hop_limit;
-		resp.alt_dest.traffic_class = grh->traffic_class;
-	}
-	resp.alt_dest.port_num      = rdma_ah_get_port_num(&attr->alt_ah_attr);
+	copy_ah_attr_to_uverbs(&resp.dest, &attr->ah_attr);
+	copy_ah_attr_to_uverbs(&resp.alt_dest, &attr->alt_ah_attr);
 
 	resp.max_send_wr            = init_attr->cap.max_send_wr;
 	resp.max_recv_wr            = init_attr->cap.max_recv_wr;
@@ -1917,6 +1936,29 @@ static int modify_qp_mask(enum ib_qp_type qp_type, int mask)
 	}
 }
 
+static void copy_ah_attr_from_uverbs(struct ib_device *dev,
+				     struct rdma_ah_attr *rdma_attr,
+				     struct ib_uverbs_qp_dest *uverb_attr)
+{
+	rdma_attr->type = rdma_ah_find_type(dev, uverb_attr->port_num);
+	if (uverb_attr->is_global) {
+		rdma_ah_set_grh(rdma_attr, NULL,
+				uverb_attr->flow_label,
+				uverb_attr->sgid_index,
+				uverb_attr->hop_limit,
+				uverb_attr->traffic_class);
+		rdma_ah_set_dgid_raw(rdma_attr, uverb_attr->dgid);
+	} else {
+		rdma_ah_set_ah_flags(rdma_attr, 0);
+	}
+	rdma_ah_set_dlid(rdma_attr, uverb_attr->dlid);
+	rdma_ah_set_sl(rdma_attr, uverb_attr->sl);
+	rdma_ah_set_path_bits(rdma_attr, uverb_attr->src_path_bits);
+	rdma_ah_set_static_rate(rdma_attr, uverb_attr->static_rate);
+	rdma_ah_set_port_num(rdma_attr, uverb_attr->port_num);
+	rdma_ah_set_make_grd(rdma_attr, false);
+}
+
 static int modify_qp(struct ib_uverbs_file *file,
 		     struct ib_uverbs_ex_modify_qp *cmd, struct ib_udata *udata)
 {
@@ -1964,48 +2006,12 @@ static int modify_qp(struct ib_uverbs_file *file,
 	attr->rate_limit	  = cmd->rate_limit;
 
 	if (cmd->base.attr_mask & IB_QP_AV)
-		attr->ah_attr.type = rdma_ah_find_type(qp->device,
-						       cmd->base.dest.port_num);
-	if (cmd->base.dest.is_global) {
-		rdma_ah_set_grh(&attr->ah_attr, NULL,
-				cmd->base.dest.flow_label,
-				cmd->base.dest.sgid_index,
-				cmd->base.dest.hop_limit,
-				cmd->base.dest.traffic_class);
-		rdma_ah_set_dgid_raw(&attr->ah_attr, cmd->base.dest.dgid);
-	} else {
-		rdma_ah_set_ah_flags(&attr->ah_attr, 0);
-	}
-	rdma_ah_set_dlid(&attr->ah_attr, cmd->base.dest.dlid);
-	rdma_ah_set_sl(&attr->ah_attr, cmd->base.dest.sl);
-	rdma_ah_set_path_bits(&attr->ah_attr, cmd->base.dest.src_path_bits);
-	rdma_ah_set_static_rate(&attr->ah_attr, cmd->base.dest.static_rate);
-	rdma_ah_set_port_num(&attr->ah_attr,
-			     cmd->base.dest.port_num);
+		copy_ah_attr_from_uverbs(qp->device, &attr->ah_attr,
+					 &cmd->base.dest);
 
 	if (cmd->base.attr_mask & IB_QP_ALT_PATH)
-		attr->alt_ah_attr.type =
-			rdma_ah_find_type(qp->device, cmd->base.dest.port_num);
-	if (cmd->base.alt_dest.is_global) {
-		rdma_ah_set_grh(&attr->alt_ah_attr, NULL,
-				cmd->base.alt_dest.flow_label,
-				cmd->base.alt_dest.sgid_index,
-				cmd->base.alt_dest.hop_limit,
-				cmd->base.alt_dest.traffic_class);
-		rdma_ah_set_dgid_raw(&attr->alt_ah_attr,
-				     cmd->base.alt_dest.dgid);
-	} else {
-		rdma_ah_set_ah_flags(&attr->alt_ah_attr, 0);
-	}
-
-	rdma_ah_set_dlid(&attr->alt_ah_attr, cmd->base.alt_dest.dlid);
-	rdma_ah_set_sl(&attr->alt_ah_attr, cmd->base.alt_dest.sl);
-	rdma_ah_set_path_bits(&attr->alt_ah_attr,
-			      cmd->base.alt_dest.src_path_bits);
-	rdma_ah_set_static_rate(&attr->alt_ah_attr,
-				cmd->base.alt_dest.static_rate);
-	rdma_ah_set_port_num(&attr->alt_ah_attr,
-			     cmd->base.alt_dest.port_num);
+		copy_ah_attr_from_uverbs(qp->device, &attr->alt_ah_attr,
+					 &cmd->base.alt_dest);
 
 	ret = ib_modify_qp_with_udata(qp, attr,
 				      modify_qp_mask(qp->qp_type,
@@ -2037,7 +2043,8 @@ ssize_t ib_uverbs_modify_qp(struct ib_uverbs_file *file,
 		return -EOPNOTSUPP;
 
 	INIT_UDATA(&udata, buf + sizeof(cmd.base), NULL,
-		   in_len - sizeof(cmd.base), out_len);
+		   in_len - sizeof(cmd.base) - sizeof(struct ib_uverbs_cmd_hdr),
+		   out_len);
 
 	ret = modify_qp(file, &cmd, &udata);
 	if (ret)
@@ -2543,7 +2550,8 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 
 	INIT_UDATA(&udata, buf + sizeof(cmd),
 		   (unsigned long)cmd.response + sizeof(resp),
-		   in_len - sizeof(cmd), out_len - sizeof(resp));
+		   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+		   out_len - sizeof(resp));
 
 	uobj  = uobj_alloc(uobj_get_type(ah), file->ucontext);
 	if (IS_ERR(uobj))
@@ -2556,6 +2564,7 @@ ssize_t ib_uverbs_create_ah(struct ib_uverbs_file *file,
 	}
 
 	attr.type = rdma_ah_find_type(ib_dev, cmd.attr.port_num);
+	rdma_ah_set_make_grd(&attr, false);
 	rdma_ah_set_dlid(&attr, cmd.attr.dlid);
 	rdma_ah_set_sl(&attr, cmd.attr.sl);
 	rdma_ah_set_path_bits(&attr, cmd.attr.src_path_bits);
@@ -3472,6 +3481,9 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	if (IS_ERR(obj))
 		return PTR_ERR(obj);
 
+	if (cmd->srq_type == IB_SRQT_TM)
+		attr.ext.tag_matching.max_num_tags = cmd->max_num_tags;
+
 	if (cmd->srq_type == IB_SRQT_XRC) {
 		xrcd_uobj = uobj_get_read(uobj_get_type(xrcd), cmd->xrcd_handle,
 					  file->ucontext);
@@ -3488,10 +3500,12 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 
 		obj->uxrcd = container_of(xrcd_uobj, struct ib_uxrcd_object, uobject);
 		atomic_inc(&obj->uxrcd->refcnt);
+	}
 
-		attr.ext.xrc.cq  = uobj_get_obj_read(cq, cmd->cq_handle,
-						     file->ucontext);
-		if (!attr.ext.xrc.cq) {
+	if (ib_srq_has_cq(cmd->srq_type)) {
+		attr.ext.cq  = uobj_get_obj_read(cq, cmd->cq_handle,
+						 file->ucontext);
+		if (!attr.ext.cq) {
 			ret = -EINVAL;
 			goto err_put_xrcd;
 		}
@@ -3526,10 +3540,13 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	srq->event_handler = attr.event_handler;
 	srq->srq_context   = attr.srq_context;
 
+	if (ib_srq_has_cq(cmd->srq_type)) {
+		srq->ext.cq       = attr.ext.cq;
+		atomic_inc(&attr.ext.cq->usecnt);
+	}
+
 	if (cmd->srq_type == IB_SRQT_XRC) {
-		srq->ext.xrc.cq   = attr.ext.xrc.cq;
 		srq->ext.xrc.xrcd = attr.ext.xrc.xrcd;
-		atomic_inc(&attr.ext.xrc.cq->usecnt);
 		atomic_inc(&attr.ext.xrc.xrcd->usecnt);
 	}
 
@@ -3552,10 +3569,12 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 		goto err_copy;
 	}
 
-	if (cmd->srq_type == IB_SRQT_XRC) {
+	if (cmd->srq_type == IB_SRQT_XRC)
 		uobj_put_read(xrcd_uobj);
-		uobj_put_obj_read(attr.ext.xrc.cq);
-	}
+
+	if (ib_srq_has_cq(cmd->srq_type))
+		uobj_put_obj_read(attr.ext.cq);
+
 	uobj_put_obj_read(pd);
 	uobj_alloc_commit(&obj->uevent.uobject);
 
@@ -3568,8 +3587,8 @@ static int __uverbs_create_xsrq(struct ib_uverbs_file *file,
 	uobj_put_obj_read(pd);
 
 err_put_cq:
-	if (cmd->srq_type == IB_SRQT_XRC)
-		uobj_put_obj_read(attr.ext.xrc.cq);
+	if (ib_srq_has_cq(cmd->srq_type))
+		uobj_put_obj_read(attr.ext.cq);
 
 err_put_xrcd:
 	if (cmd->srq_type == IB_SRQT_XRC) {
@@ -3599,6 +3618,7 @@ ssize_t ib_uverbs_create_srq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
+	memset(&xcmd, 0, sizeof(xcmd));
 	xcmd.response	 = cmd.response;
 	xcmd.user_handle = cmd.user_handle;
 	xcmd.srq_type	 = IB_SRQT_BASIC;
@@ -3607,10 +3627,10 @@ ssize_t ib_uverbs_create_srq(struct ib_uverbs_file *file,
 	xcmd.max_sge	 = cmd.max_sge;
 	xcmd.srq_limit	 = cmd.srq_limit;
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd - sizeof(struct ib_uverbs_cmd_hdr),
-		   out_len - sizeof resp);
+	INIT_UDATA(&udata, buf + sizeof(cmd),
+		   (unsigned long) cmd.response + sizeof(resp),
+		   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+		   out_len - sizeof(resp));
 
 	ret = __uverbs_create_xsrq(file, ib_dev, &xcmd, &udata);
 	if (ret)
@@ -3634,10 +3654,10 @@ ssize_t ib_uverbs_create_xsrq(struct ib_uverbs_file *file,
 	if (copy_from_user(&cmd, buf, sizeof cmd))
 		return -EFAULT;
 
-	INIT_UDATA(&udata, buf + sizeof cmd,
-		   (unsigned long) cmd.response + sizeof resp,
-		   in_len - sizeof cmd - sizeof(struct ib_uverbs_cmd_hdr),
-		   out_len - sizeof resp);
+	INIT_UDATA(&udata, buf + sizeof(cmd),
+		   (unsigned long) cmd.response + sizeof(resp),
+		   in_len - sizeof(cmd) - sizeof(struct ib_uverbs_cmd_hdr),
+		   out_len - sizeof(resp));
 
 	ret = __uverbs_create_xsrq(file, ib_dev, &cmd, &udata);
 	if (ret)
@@ -3848,6 +3868,16 @@ int ib_uverbs_ex_query_device(struct ib_uverbs_file *file,
 
 	resp.raw_packet_caps = attr.raw_packet_caps;
 	resp.response_length += sizeof(resp.raw_packet_caps);
+
+	if (ucore->outlen < resp.response_length + sizeof(resp.xrq_caps))
+		goto end;
+
+	resp.xrq_caps.max_rndv_hdr_size = attr.xrq_caps.max_rndv_hdr_size;
+	resp.xrq_caps.max_num_tags      = attr.xrq_caps.max_num_tags;
+	resp.xrq_caps.max_ops		= attr.xrq_caps.max_ops;
+	resp.xrq_caps.max_sge		= attr.xrq_caps.max_sge;
+	resp.xrq_caps.flags		= attr.xrq_caps.flags;
+	resp.response_length += sizeof(resp.xrq_caps);
 end:
 	err = ib_copy_to_udata(ucore, &resp, resp.response_length);
 	return err;
diff --git a/drivers/infiniband/core/uverbs_ioctl.c b/drivers/infiniband/core/uverbs_ioctl.c
new file mode 100644
index 0000000..5286ad5
--- /dev/null
+++ b/drivers/infiniband/core/uverbs_ioctl.c
@@ -0,0 +1,364 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <rdma/rdma_user_ioctl.h>
+#include <rdma/uverbs_ioctl.h>
+#include "rdma_core.h"
+#include "uverbs.h"
+
+static int uverbs_process_attr(struct ib_device *ibdev,
+			       struct ib_ucontext *ucontext,
+			       const struct ib_uverbs_attr *uattr,
+			       u16 attr_id,
+			       const struct uverbs_attr_spec_hash *attr_spec_bucket,
+			       struct uverbs_attr_bundle_hash *attr_bundle_h,
+			       struct ib_uverbs_attr __user *uattr_ptr)
+{
+	const struct uverbs_attr_spec *spec;
+	struct uverbs_attr *e;
+	const struct uverbs_object_spec *object;
+	struct uverbs_obj_attr *o_attr;
+	struct uverbs_attr *elements = attr_bundle_h->attrs;
+
+	if (uattr->reserved)
+		return -EINVAL;
+
+	if (attr_id >= attr_spec_bucket->num_attrs) {
+		if (uattr->flags & UVERBS_ATTR_F_MANDATORY)
+			return -EINVAL;
+		else
+			return 0;
+	}
+
+	spec = &attr_spec_bucket->attrs[attr_id];
+	e = &elements[attr_id];
+	e->uattr = uattr_ptr;
+
+	switch (spec->type) {
+	case UVERBS_ATTR_TYPE_PTR_IN:
+	case UVERBS_ATTR_TYPE_PTR_OUT:
+		if (uattr->len < spec->len ||
+		    (!(spec->flags & UVERBS_ATTR_SPEC_F_MIN_SZ) &&
+		     uattr->len > spec->len))
+			return -EINVAL;
+
+		e->ptr_attr.data = uattr->data;
+		e->ptr_attr.len = uattr->len;
+		e->ptr_attr.flags = uattr->flags;
+		break;
+
+	case UVERBS_ATTR_TYPE_IDR:
+		if (uattr->data >> 32)
+			return -EINVAL;
+	/* fall through */
+	case UVERBS_ATTR_TYPE_FD:
+		if (uattr->len != 0 || !ucontext || uattr->data > INT_MAX)
+			return -EINVAL;
+
+		o_attr = &e->obj_attr;
+		object = uverbs_get_object(ibdev, spec->obj.obj_type);
+		if (!object)
+			return -EINVAL;
+		o_attr->type = object->type_attrs;
+
+		o_attr->id = (int)uattr->data;
+		o_attr->uobject = uverbs_get_uobject_from_context(
+					o_attr->type,
+					ucontext,
+					spec->obj.access,
+					o_attr->id);
+
+		if (IS_ERR(o_attr->uobject))
+			return PTR_ERR(o_attr->uobject);
+
+		if (spec->obj.access == UVERBS_ACCESS_NEW) {
+			u64 id = o_attr->uobject->id;
+
+			/* Copy the allocated id to user space */
+			if (put_user(id, &e->uattr->data)) {
+				uverbs_finalize_object(o_attr->uobject,
+						       UVERBS_ACCESS_NEW,
+						       false);
+				return -EFAULT;
+			}
+		}
+
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	set_bit(attr_id, attr_bundle_h->valid_bitmap);
+	return 0;
+}
+
+static int uverbs_uattrs_process(struct ib_device *ibdev,
+				 struct ib_ucontext *ucontext,
+				 const struct ib_uverbs_attr *uattrs,
+				 size_t num_uattrs,
+				 const struct uverbs_method_spec *method,
+				 struct uverbs_attr_bundle *attr_bundle,
+				 struct ib_uverbs_attr __user *uattr_ptr)
+{
+	size_t i;
+	int ret = 0;
+	int num_given_buckets = 0;
+
+	for (i = 0; i < num_uattrs; i++) {
+		const struct ib_uverbs_attr *uattr = &uattrs[i];
+		u16 attr_id = uattr->attr_id;
+		struct uverbs_attr_spec_hash *attr_spec_bucket;
+
+		ret = uverbs_ns_idx(&attr_id, method->num_buckets);
+		if (ret < 0) {
+			if (uattr->flags & UVERBS_ATTR_F_MANDATORY) {
+				uverbs_finalize_objects(attr_bundle,
+							method->attr_buckets,
+							num_given_buckets,
+							false);
+				return ret;
+			}
+			continue;
+		}
+
+		/*
+		 * ret is the found ns, so increase num_given_buckets if
+		 * necessary.
+		 */
+		if (ret >= num_given_buckets)
+			num_given_buckets = ret + 1;
+
+		attr_spec_bucket = method->attr_buckets[ret];
+		ret = uverbs_process_attr(ibdev, ucontext, uattr, attr_id,
+					  attr_spec_bucket, &attr_bundle->hash[ret],
+					  uattr_ptr++);
+		if (ret) {
+			uverbs_finalize_objects(attr_bundle,
+						method->attr_buckets,
+						num_given_buckets,
+						false);
+			return ret;
+		}
+	}
+
+	return num_given_buckets;
+}
+
+static int uverbs_validate_kernel_mandatory(const struct uverbs_method_spec *method_spec,
+					    struct uverbs_attr_bundle *attr_bundle)
+{
+	unsigned int i;
+
+	for (i = 0; i < attr_bundle->num_buckets; i++) {
+		struct uverbs_attr_spec_hash *attr_spec_bucket =
+			method_spec->attr_buckets[i];
+
+		if (!bitmap_subset(attr_spec_bucket->mandatory_attrs_bitmask,
+				   attr_bundle->hash[i].valid_bitmap,
+				   attr_spec_bucket->num_attrs))
+			return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int uverbs_handle_method(struct ib_uverbs_attr __user *uattr_ptr,
+				const struct ib_uverbs_attr *uattrs,
+				size_t num_uattrs,
+				struct ib_device *ibdev,
+				struct ib_uverbs_file *ufile,
+				const struct uverbs_method_spec *method_spec,
+				struct uverbs_attr_bundle *attr_bundle)
+{
+	int ret;
+	int finalize_ret;
+	int num_given_buckets;
+
+	num_given_buckets = uverbs_uattrs_process(ibdev, ufile->ucontext, uattrs,
+						  num_uattrs, method_spec,
+						  attr_bundle, uattr_ptr);
+	if (num_given_buckets <= 0)
+		return -EINVAL;
+
+	attr_bundle->num_buckets = num_given_buckets;
+	ret = uverbs_validate_kernel_mandatory(method_spec, attr_bundle);
+	if (ret)
+		goto cleanup;
+
+	ret = method_spec->handler(ibdev, ufile, attr_bundle);
+cleanup:
+	finalize_ret = uverbs_finalize_objects(attr_bundle,
+					       method_spec->attr_buckets,
+					       attr_bundle->num_buckets,
+					       !ret);
+
+	return ret ? ret : finalize_ret;
+}
+
+#define UVERBS_OPTIMIZE_USING_STACK_SZ  256
+static long ib_uverbs_cmd_verbs(struct ib_device *ib_dev,
+				struct ib_uverbs_file *file,
+				struct ib_uverbs_ioctl_hdr *hdr,
+				void __user *buf)
+{
+	const struct uverbs_object_spec *object_spec;
+	const struct uverbs_method_spec *method_spec;
+	long err = 0;
+	unsigned int i;
+	struct {
+		struct ib_uverbs_attr		*uattrs;
+		struct uverbs_attr_bundle	*uverbs_attr_bundle;
+	} *ctx = NULL;
+	struct uverbs_attr *curr_attr;
+	unsigned long *curr_bitmap;
+	size_t ctx_size;
+#ifdef UVERBS_OPTIMIZE_USING_STACK_SZ
+	uintptr_t data[UVERBS_OPTIMIZE_USING_STACK_SZ / sizeof(uintptr_t)];
+#endif
+
+	if (hdr->reserved)
+		return -EINVAL;
+
+	object_spec = uverbs_get_object(ib_dev, hdr->object_id);
+	if (!object_spec)
+		return -EOPNOTSUPP;
+
+	method_spec = uverbs_get_method(object_spec, hdr->method_id);
+	if (!method_spec)
+		return -EOPNOTSUPP;
+
+	if ((method_spec->flags & UVERBS_ACTION_FLAG_CREATE_ROOT) ^ !file->ucontext)
+		return -EINVAL;
+
+	ctx_size = sizeof(*ctx) +
+		   sizeof(struct uverbs_attr_bundle) +
+		   sizeof(struct uverbs_attr_bundle_hash) * method_spec->num_buckets +
+		   sizeof(*ctx->uattrs) * hdr->num_attrs +
+		   sizeof(*ctx->uverbs_attr_bundle->hash[0].attrs) *
+		   method_spec->num_child_attrs +
+		   sizeof(*ctx->uverbs_attr_bundle->hash[0].valid_bitmap) *
+			(method_spec->num_child_attrs / BITS_PER_LONG +
+			 method_spec->num_buckets);
+
+#ifdef UVERBS_OPTIMIZE_USING_STACK_SZ
+	if (ctx_size <= UVERBS_OPTIMIZE_USING_STACK_SZ)
+		ctx = (void *)data;
+
+	if (!ctx)
+#endif
+	ctx = kmalloc(ctx_size, GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	ctx->uverbs_attr_bundle = (void *)ctx + sizeof(*ctx);
+	ctx->uattrs = (void *)(ctx->uverbs_attr_bundle + 1) +
+			      (sizeof(ctx->uverbs_attr_bundle->hash[0]) *
+			       method_spec->num_buckets);
+	curr_attr = (void *)(ctx->uattrs + hdr->num_attrs);
+	curr_bitmap = (void *)(curr_attr + method_spec->num_child_attrs);
+
+	/*
+	 * We just fill the pointers and num_attrs here. The data itself will be
+	 * filled at a later stage (uverbs_process_attr)
+	 */
+	for (i = 0; i < method_spec->num_buckets; i++) {
+		unsigned int curr_num_attrs = method_spec->attr_buckets[i]->num_attrs;
+
+		ctx->uverbs_attr_bundle->hash[i].attrs = curr_attr;
+		curr_attr += curr_num_attrs;
+		ctx->uverbs_attr_bundle->hash[i].num_attrs = curr_num_attrs;
+		ctx->uverbs_attr_bundle->hash[i].valid_bitmap = curr_bitmap;
+		bitmap_zero(curr_bitmap, curr_num_attrs);
+		curr_bitmap += BITS_TO_LONGS(curr_num_attrs);
+	}
+
+	err = copy_from_user(ctx->uattrs, buf,
+			     sizeof(*ctx->uattrs) * hdr->num_attrs);
+	if (err) {
+		err = -EFAULT;
+		goto out;
+	}
+
+	err = uverbs_handle_method(buf, ctx->uattrs, hdr->num_attrs, ib_dev,
+				   file, method_spec, ctx->uverbs_attr_bundle);
+out:
+#ifdef UVERBS_OPTIMIZE_USING_STACK_SZ
+	if (ctx_size > UVERBS_OPTIMIZE_USING_STACK_SZ)
+#endif
+	kfree(ctx);
+	return err;
+}
+
+#define IB_UVERBS_MAX_CMD_SZ 4096
+
+long ib_uverbs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
+{
+	struct ib_uverbs_file *file = filp->private_data;
+	struct ib_uverbs_ioctl_hdr __user *user_hdr =
+		(struct ib_uverbs_ioctl_hdr __user *)arg;
+	struct ib_uverbs_ioctl_hdr hdr;
+	struct ib_device *ib_dev;
+	int srcu_key;
+	long err;
+
+	srcu_key = srcu_read_lock(&file->device->disassociate_srcu);
+	ib_dev = srcu_dereference(file->device->ib_dev,
+				  &file->device->disassociate_srcu);
+	if (!ib_dev) {
+		err = -EIO;
+		goto out;
+	}
+
+	if (cmd == RDMA_VERBS_IOCTL) {
+		err = copy_from_user(&hdr, user_hdr, sizeof(hdr));
+
+		if (err || hdr.length > IB_UVERBS_MAX_CMD_SZ ||
+		    hdr.length != sizeof(hdr) + hdr.num_attrs * sizeof(struct ib_uverbs_attr)) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		if (hdr.reserved) {
+			err = -EOPNOTSUPP;
+			goto out;
+		}
+
+		err = ib_uverbs_cmd_verbs(ib_dev, file, &hdr,
+					  (__user void *)arg + sizeof(hdr));
+	} else {
+		err = -ENOIOCTLCMD;
+	}
+out:
+	srcu_read_unlock(&file->device->disassociate_srcu, srcu_key);
+
+	return err;
+}
diff --git a/drivers/infiniband/core/uverbs_ioctl_merge.c b/drivers/infiniband/core/uverbs_ioctl_merge.c
new file mode 100644
index 0000000..76ddb65
--- /dev/null
+++ b/drivers/infiniband/core/uverbs_ioctl_merge.c
@@ -0,0 +1,665 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <rdma/uverbs_ioctl.h>
+#include <rdma/rdma_user_ioctl.h>
+#include <linux/bitops.h>
+#include "uverbs.h"
+
+#define UVERBS_NUM_NS (UVERBS_ID_NS_MASK >> UVERBS_ID_NS_SHIFT)
+#define GET_NS_ID(idx) (((idx) & UVERBS_ID_NS_MASK) >> UVERBS_ID_NS_SHIFT)
+#define GET_ID(idx) ((idx) & ~UVERBS_ID_NS_MASK)
+
+#define _for_each_element(elem, tmpi, tmpj, hashes, num_buckets_offset,	       \
+			  buckets_offset)				       \
+	for (tmpj = 0,							       \
+	     elem = (*(const void ***)((hashes)[tmpi] +			       \
+				       (buckets_offset)))[0];	               \
+	     tmpj < *(size_t *)((hashes)[tmpi] + (num_buckets_offset));        \
+	     tmpj++)						               \
+		if ((elem = ((*(const void ***)(hashes[tmpi] +		       \
+						(buckets_offset)))[tmpj])))
+
+/*
+ * Iterate all elements of a few @hashes. The number of given hashes is
+ * indicated by @num_hashes. The offset of the number of buckets in the hash is
+ * represented by @num_buckets_offset, while the offset of the buckets array in
+ * the hash structure is represented by @buckets_offset. tmpi and tmpj are two
+ * short (or int) based indices that are given by the user. tmpi iterates over
+ * the different hashes. @elem points to the current element in the hashes[tmpi]
+ * bucket we are looping on. To be honest, @hashes representation isn't exactly
+ * a hash, but more a collection of elements. These elements' ids are treated
+ * in a hash-like manner, where the first upper bits are the bucket number.
+ * These elements are later mapped into a perfect-hash.
+ */
+#define for_each_element(elem, tmpi, tmpj, hashes, num_hashes,                 \
+			 num_buckets_offset, buckets_offset)		       \
+	for (tmpi = 0; tmpi < (num_hashes); tmpi++)		               \
+		_for_each_element(elem, tmpi, tmpj, hashes, num_buckets_offset,\
+				  buckets_offset)
+
+#define get_elements_iterators_entry_above(iters, num_elements, elements,     \
+					  num_objects_fld, objects_fld, bucket,\
+					  min_id)			       \
+	get_elements_above_id((const void **)iters, num_elements,       \
+				     (const void **)(elements),		       \
+				     offsetof(typeof(**elements),	       \
+					      num_objects_fld),		       \
+				     offsetof(typeof(**elements), objects_fld),\
+				     offsetof(typeof(***(*elements)->objects_fld), id),\
+				     bucket, min_id)
+
+#define get_objects_above_id(iters, num_trees, trees, bucket, min_id)	       \
+	get_elements_iterators_entry_above(iters, num_trees, trees,	       \
+					   num_objects, objects, bucket, min_id)
+
+#define get_methods_above_id(method_iters, num_iters, iters, bucket, min_id)\
+	get_elements_iterators_entry_above(method_iters, num_iters, iters,     \
+					   num_methods, methods, bucket, min_id)
+
+#define get_attrs_above_id(attrs_iters, num_iters, iters, bucket, min_id)\
+	get_elements_iterators_entry_above(attrs_iters, num_iters, iters,      \
+					   num_attrs, attrs, bucket, min_id)
+
+/*
+ * get_elements_above_id gets a few hashes represented by @elements and
+ * @num_elements. The hash fields are described by @num_offset, @data_offset
+ * and @id_offset in the same way as required by for_each_element. The function
+ * fills @iters with the elements from the hashes' buckets whose ids are the
+ * smallest among all hashes but are not smaller than the id given by @min_id.
+ * Elements are only added to the iters array if their id belongs to the bucket
+ * @bucket. The number of elements stored in @iters is returned by the
+ * function. @min_id is also updated to reflect the new min_id of all elements
+ * in iters.
+ */
+static size_t get_elements_above_id(const void **iters,
+				    unsigned int num_elements,
+				    const void **elements,
+				    size_t num_offset,
+				    size_t data_offset,
+				    size_t id_offset,
+				    u16 bucket,
+				    short *min_id)
+{
+	size_t num_iters = 0;
+	short min = SHRT_MAX;
+	const void *elem;
+	int i, j, last_stored = -1;
+
+	for_each_element(elem, i, j, elements, num_elements, num_offset,
+			 data_offset) {
+		u16 id = *(u16 *)(elem + id_offset);
+
+		if (GET_NS_ID(id) != bucket)
+			continue;
+
+		if (GET_ID(id) < *min_id ||
+		    (min != SHRT_MAX && GET_ID(id) > min))
+			continue;
+
+		/*
+		 * We first iterate all hashes represented by @elements. When
+		 * we do, we try to find an element @elem in the bucket @bucket
+		 * whose id is min. Since we can't ensure the user sorted the
+		 * elements in increasing order, we override the minimal-id
+		 * element already stored for this hash if a new element with
+		 * a smaller id was just found.
+		 */
+		iters[last_stored == i ? num_iters - 1 : num_iters++] = elem;
+		last_stored = i;
+		min = GET_ID(id);
+	}
+
+	/*
+	 * We only insert an element into our iters array if its id is not
+	 * larger than all previous ids. Therefore, the final iters array is
+	 * sorted so that smaller ids are at the end of the array.
+	 * Hence, we need to clean the beginning of the array to make sure
+	 * all ids of final elements are equal to min.
+	 */
+	for (i = num_iters - 1; i >= 0 &&
+	     GET_ID(*(u16 *)(iters[i] + id_offset)) == min; i--)
+		;
+
+	num_iters -= i + 1;
+	memmove(iters, iters + i + 1, sizeof(*iters) * num_iters);
+
+	*min_id = min;
+	return num_iters;
+}
+
+#define find_max_element_entry_id(num_elements, elements, num_objects_fld, \
+				  objects_fld, bucket)			   \
+	find_max_element_id(num_elements, (const void **)(elements),	   \
+			    offsetof(typeof(**elements), num_objects_fld),    \
+			    offsetof(typeof(**elements), objects_fld),	      \
+			    offsetof(typeof(***(*elements)->objects_fld), id),\
+			    bucket)
+
+static short find_max_element_ns_id(unsigned int num_elements,
+				    const void **elements,
+				    size_t num_offset,
+				    size_t data_offset,
+				    size_t id_offset)
+{
+	short max_ns = SHRT_MIN;
+	const void *elem;
+	int i, j;
+
+	for_each_element(elem, i, j, elements, num_elements, num_offset,
+			 data_offset) {
+		u16 id = *(u16 *)(elem + id_offset);
+
+		if (GET_NS_ID(id) > max_ns)
+			max_ns = GET_NS_ID(id);
+	}
+
+	return max_ns;
+}
+
+static short find_max_element_id(unsigned int num_elements,
+				 const void **elements,
+				 size_t num_offset,
+				 size_t data_offset,
+				 size_t id_offset,
+				 u16 bucket)
+{
+	short max_id = SHRT_MIN;
+	const void *elem;
+	int i, j;
+
+	for_each_element(elem, i, j, elements, num_elements, num_offset,
+			 data_offset) {
+		u16 id = *(u16 *)(elem + id_offset);
+
+		if (GET_NS_ID(id) == bucket &&
+		    GET_ID(id) > max_id)
+			max_id = GET_ID(id);
+	}
+	return max_id;
+}
+
+#define find_max_element_entry_id(num_elements, elements, num_objects_fld,   \
+				  objects_fld, bucket)			      \
+	find_max_element_id(num_elements, (const void **)(elements),	      \
+			    offsetof(typeof(**elements), num_objects_fld),    \
+			    offsetof(typeof(**elements), objects_fld),	      \
+			    offsetof(typeof(***(*elements)->objects_fld), id),\
+			    bucket)
+
+#define find_max_element_ns_entry_id(num_elements, elements,		    \
+				     num_objects_fld, objects_fld)	    \
+	find_max_element_ns_id(num_elements, (const void **)(elements),	    \
+			      offsetof(typeof(**elements), num_objects_fld),\
+			      offsetof(typeof(**elements), objects_fld),    \
+			      offsetof(typeof(***(*elements)->objects_fld), id))
+
+/*
+ * find_max_xxxx_ns_id gets a few elements. Each element is described by an id
+ * whose upper bits represent a namespace. It finds the max namespace. This
+ * could be used in order to know how many buckets we need to allocate. If no
+ * elements exist, SHRT_MIN is returned. Each namespace here corresponds to a
+ * different bucket. The common example is "common bucket" and "driver bucket".
+ *
+ * find_max_xxxx_id gets a few elements and a bucket. Each element is described
+ * by an id whose upper bits represent a namespace. It returns the max id
+ * which is contained in the same namespace defined in @bucket. This could be
+ * used in order to know how many elements we need to allocate in the bucket.
+ * If no elements exist, SHRT_MIN is returned.
+ */
+
+#define find_max_object_id(num_trees, trees, bucket)			\
+		find_max_element_entry_id(num_trees, trees, num_objects,\
+					  objects, bucket)
+#define find_max_object_ns_id(num_trees, trees)			\
+		find_max_element_ns_entry_id(num_trees, trees,		\
+					     num_objects, objects)
+
+#define find_max_method_id(num_iters, iters, bucket)			\
+		find_max_element_entry_id(num_iters, iters, num_methods,\
+					  methods, bucket)
+#define find_max_method_ns_id(num_iters, iters)			\
+		find_max_element_ns_entry_id(num_iters, iters,		\
+					     num_methods, methods)
+
+#define find_max_attr_id(num_iters, iters, bucket)			\
+		find_max_element_entry_id(num_iters, iters, num_attrs,  \
+					  attrs, bucket)
+#define find_max_attr_ns_id(num_iters, iters)				\
+		find_max_element_ns_entry_id(num_iters, iters,		\
+					     num_attrs, attrs)
+
+static void free_method(struct uverbs_method_spec *method)
+{
+	unsigned int i;
+
+	if (!method)
+		return;
+
+	for (i = 0; i < method->num_buckets; i++)
+		kfree(method->attr_buckets[i]);
+
+	kfree(method);
+}
+
+#define IS_ATTR_OBJECT(attr) ((attr)->type == UVERBS_ATTR_TYPE_IDR || \
+			      (attr)->type == UVERBS_ATTR_TYPE_FD)
+
+/*
+ * This function gets an array of size @num_method_defs which contains pointers
+ * to method definitions @method_defs. The function allocates a
+ * uverbs_method_spec structure and initializes its number of buckets and the
+ * elements in buckets to the correct attributes. While doing that, it
+ * validates that there aren't conflicts between attributes of different
+ * method_defs.
+ */
+static struct uverbs_method_spec *build_method_with_attrs(const struct uverbs_method_def **method_defs,
+							  size_t num_method_defs)
+{
+	int bucket_idx;
+	int max_attr_buckets = 0;
+	size_t num_attr_buckets = 0;
+	int res = 0;
+	struct uverbs_method_spec *method = NULL;
+	const struct uverbs_attr_def **attr_defs;
+	unsigned int num_of_singularities = 0;
+
+	max_attr_buckets = find_max_attr_ns_id(num_method_defs, method_defs);
+	if (max_attr_buckets >= 0)
+		num_attr_buckets = max_attr_buckets + 1;
+
+	method = kzalloc(sizeof(*method) +
+			 num_attr_buckets * sizeof(*method->attr_buckets),
+			 GFP_KERNEL);
+	if (!method)
+		return ERR_PTR(-ENOMEM);
+
+	method->num_buckets = num_attr_buckets;
+	attr_defs = kcalloc(num_method_defs, sizeof(*attr_defs), GFP_KERNEL);
+	if (!attr_defs) {
+		res = -ENOMEM;
+		goto free_method;
+	}
+	for (bucket_idx = 0; bucket_idx < method->num_buckets; bucket_idx++) {
+		short min_id = SHRT_MIN;
+		int attr_max_bucket = 0;
+		struct uverbs_attr_spec_hash *hash = NULL;
+
+		attr_max_bucket = find_max_attr_id(num_method_defs, method_defs,
+						   bucket_idx);
+		if (attr_max_bucket < 0)
+			continue;
+
+		hash = kzalloc(sizeof(*hash) +
+			       ALIGN(sizeof(*hash->attrs) * (attr_max_bucket + 1),
+				     sizeof(long)) +
+			       BITS_TO_LONGS(attr_max_bucket) * sizeof(long),
+			       GFP_KERNEL);
+		if (!hash) {
+			res = -ENOMEM;
+			goto free;
+		}
+		hash->num_attrs = attr_max_bucket + 1;
+		method->num_child_attrs += hash->num_attrs;
+		hash->mandatory_attrs_bitmask = (void *)(hash + 1) +
+						 ALIGN(sizeof(*hash->attrs) *
+						       (attr_max_bucket + 1),
+						       sizeof(long));
+
+		method->attr_buckets[bucket_idx] = hash;
+
+		do {
+			size_t			 num_attr_defs;
+			struct uverbs_attr_spec	*attr;
+			bool attr_obj_with_special_access;
+
+			num_attr_defs =
+				get_attrs_above_id(attr_defs,
+						   num_method_defs,
+						   method_defs,
+						   bucket_idx,
+						   &min_id);
+			/* Last attr in bucket */
+			if (!num_attr_defs)
+				break;
+
+			if (num_attr_defs > 1) {
+				/*
+				 * We don't allow two attribute definitions for
+				 * the same attribute. This is usually a
+				 * programmer error. If required, it's better to
+				 * just add a new attribute to capture the new
+				 * semantics.
+				 */
+				res = -EEXIST;
+				goto free;
+			}
+
+			attr = &hash->attrs[min_id];
+			memcpy(attr, &attr_defs[0]->attr, sizeof(*attr));
+
+			attr_obj_with_special_access = IS_ATTR_OBJECT(attr) &&
+				   (attr->obj.access == UVERBS_ACCESS_NEW ||
+				    attr->obj.access == UVERBS_ACCESS_DESTROY);
+			num_of_singularities +=  !!attr_obj_with_special_access;
+			if (WARN(num_of_singularities > 1,
+				 "ib_uverbs: Method contains more than one object attr (%d) with new/destroy access\n",
+				 min_id) ||
+			    WARN(attr_obj_with_special_access &&
+				 !(attr->flags & UVERBS_ATTR_SPEC_F_MANDATORY),
+				 "ib_uverbs: Tried to merge attr (%d) but it's an object with new/destroy access but isn't mandatory\n",
+				 min_id) ||
+			    WARN(IS_ATTR_OBJECT(attr) &&
+				 attr->flags & UVERBS_ATTR_SPEC_F_MIN_SZ,
+				 "ib_uverbs: Tried to merge attr (%d) but it's an object with min_sz flag\n",
+				 min_id)) {
+				res = -EINVAL;
+				goto free;
+			}
+
+			if (attr->flags & UVERBS_ATTR_SPEC_F_MANDATORY)
+				set_bit(min_id, hash->mandatory_attrs_bitmask);
+			min_id++;
+
+		} while (1);
+	}
+	kfree(attr_defs);
+	return method;
+
+free:
+	kfree(attr_defs);
+free_method:
+	free_method(method);
+	return ERR_PTR(res);
+}
+
+static void free_object(struct uverbs_object_spec *object)
+{
+	unsigned int i, j;
+
+	if (!object)
+		return;
+
+	for (i = 0; i < object->num_buckets; i++) {
+		struct uverbs_method_spec_hash	*method_buckets =
+			object->method_buckets[i];
+
+		if (!method_buckets)
+			continue;
+
+		for (j = 0; j < method_buckets->num_methods; j++)
+			free_method(method_buckets->methods[j]);
+
+		kfree(method_buckets);
+	}
+
+	kfree(object);
+}
+
+/*
+ * This function gets an array of size @num_object_defs which contains pointers
+ * to object definitions @object_defs. The function allocates a
+ * uverbs_object_spec structure and initializes its number of buckets and the
+ * elements in buckets to the correct methods. While doing that, it
+ * resolves conflicts between definitions of the same method.
+ */
+static struct uverbs_object_spec *build_object_with_methods(const struct uverbs_object_def **object_defs,
+							    size_t num_object_defs)
+{
+	u16 bucket_idx;
+	int max_method_buckets = 0;
+	u16 num_method_buckets = 0;
+	int res = 0;
+	struct uverbs_object_spec *object = NULL;
+	const struct uverbs_method_def **method_defs;
+
+	max_method_buckets = find_max_method_ns_id(num_object_defs, object_defs);
+	if (max_method_buckets >= 0)
+		num_method_buckets = max_method_buckets + 1;
+
+	object = kzalloc(sizeof(*object) +
+			 num_method_buckets *
+			 sizeof(*object->method_buckets), GFP_KERNEL);
+	if (!object)
+		return ERR_PTR(-ENOMEM);
+
+	object->num_buckets = num_method_buckets;
+	method_defs = kcalloc(num_object_defs, sizeof(*method_defs), GFP_KERNEL);
+	if (!method_defs) {
+		res = -ENOMEM;
+		goto free_object;
+	}
+
+	for (bucket_idx = 0; bucket_idx < object->num_buckets; bucket_idx++) {
+		short min_id = SHRT_MIN;
+		int methods_max_bucket = 0;
+		struct uverbs_method_spec_hash *hash = NULL;
+
+		methods_max_bucket = find_max_method_id(num_object_defs, object_defs,
+							bucket_idx);
+		if (methods_max_bucket < 0)
+			continue;
+
+		hash = kzalloc(sizeof(*hash) +
+			       sizeof(*hash->methods) * (methods_max_bucket + 1),
+			       GFP_KERNEL);
+		if (!hash) {
+			res = -ENOMEM;
+			goto free;
+		}
+
+		hash->num_methods = methods_max_bucket + 1;
+		object->method_buckets[bucket_idx] = hash;
+
+		do {
+			size_t				num_method_defs;
+			struct uverbs_method_spec	*method;
+			int i;
+
+			num_method_defs =
+				get_methods_above_id(method_defs,
+						     num_object_defs,
+						     object_defs,
+						     bucket_idx,
+						     &min_id);
+			/* Last method in bucket */
+			if (!num_method_defs)
+				break;
+
+			method = build_method_with_attrs(method_defs,
+							 num_method_defs);
+			if (IS_ERR(method)) {
+				res = PTR_ERR(method);
+				goto free;
+			}
+
+			/*
+			 * The last tree which is given as an argument to the
+			 * merge overrides the previous method handler.
+			 * Therefore, we iterate backwards and search for the
+			 * first handler that isn't NULL. This also defines the
+			 * set of flags used for this handler.
+			 */
+			for (i = num_object_defs - 1;
+			     i >= 0 && !method_defs[i]->handler; i--)
+				;
+			hash->methods[min_id++] = method;
+			/* NULL handler isn't allowed */
+			if (WARN(i < 0,
+				 "ib_uverbs: tried to merge function id %d, but all handlers are NULL\n",
+				 min_id)) {
+				res = -EINVAL;
+				goto free;
+			}
+			method->handler = method_defs[i]->handler;
+			method->flags = method_defs[i]->flags;
+
+		} while (1);
+	}
+	kfree(method_defs);
+	return object;
+
+free:
+	kfree(method_defs);
+free_object:
+	free_object(object);
+	return ERR_PTR(res);
+}
+
+void uverbs_free_spec_tree(struct uverbs_root_spec *root)
+{
+	unsigned int i, j;
+
+	if (!root)
+		return;
+
+	for (i = 0; i < root->num_buckets; i++) {
+		struct uverbs_object_spec_hash *object_hash =
+			root->object_buckets[i];
+
+		if (!object_hash)
+			continue;
+
+		for (j = 0; j < object_hash->num_objects; j++)
+			free_object(object_hash->objects[j]);
+
+		kfree(object_hash);
+	}
+
+	kfree(root);
+}
+EXPORT_SYMBOL(uverbs_free_spec_tree);
+
+struct uverbs_root_spec *uverbs_alloc_spec_tree(unsigned int num_trees,
+						const struct uverbs_object_tree_def **trees)
+{
+	u16 bucket_idx;
+	short max_object_buckets = 0;
+	size_t num_objects_buckets = 0;
+	struct uverbs_root_spec *root_spec = NULL;
+	const struct uverbs_object_def **object_defs;
+	int i;
+	int res = 0;
+
+	max_object_buckets = find_max_object_ns_id(num_trees, trees);
+	/*
+	 * Devices which don't want to support ib_uverbs should just allocate
+	 * an empty parsing tree. No user-space command will hit a valid
+	 * entry in the parsing tree, and thus every command will fail.
+	 */
+	if (max_object_buckets >= 0)
+		num_objects_buckets = max_object_buckets + 1;
+
+	root_spec = kzalloc(sizeof(*root_spec) +
+			    num_objects_buckets * sizeof(*root_spec->object_buckets),
+			    GFP_KERNEL);
+	if (!root_spec)
+		return ERR_PTR(-ENOMEM);
+	root_spec->num_buckets = num_objects_buckets;
+
+	object_defs = kcalloc(num_trees, sizeof(*object_defs),
+			      GFP_KERNEL);
+	if (!object_defs) {
+		res = -ENOMEM;
+		goto free_root;
+	}
+
+	for (bucket_idx = 0; bucket_idx < root_spec->num_buckets; bucket_idx++) {
+		short min_id = SHRT_MIN;
+		short objects_max_bucket;
+		struct uverbs_object_spec_hash *hash = NULL;
+
+		objects_max_bucket = find_max_object_id(num_trees, trees,
+							bucket_idx);
+		if (objects_max_bucket < 0)
+			continue;
+
+		hash = kzalloc(sizeof(*hash) +
+			       sizeof(*hash->objects) * (objects_max_bucket + 1),
+			       GFP_KERNEL);
+		if (!hash) {
+			res = -ENOMEM;
+			goto free;
+		}
+		hash->num_objects = objects_max_bucket + 1;
+		root_spec->object_buckets[bucket_idx] = hash;
+
+		do {
+			size_t				num_object_defs;
+			struct uverbs_object_spec	*object;
+
+			num_object_defs = get_objects_above_id(object_defs,
+							       num_trees,
+							       trees,
+							       bucket_idx,
+							       &min_id);
+			/* Last object in bucket */
+			if (!num_object_defs)
+				break;
+
+			object = build_object_with_methods(object_defs,
+							   num_object_defs);
+			if (IS_ERR(object)) {
+				res = PTR_ERR(object);
+				goto free;
+			}
+
+			/*
+			 * The last tree which is given as an argument to the
+			 * merge overrides previous object's type_attrs.
+			 * Therefore, we iterate backwards and search for the
+			 * first type_attrs which != NULL.
+			 */
+			for (i = num_object_defs - 1;
+			     i >= 0 && !object_defs[i]->type_attrs; i--)
+				;
+			/*
+			 * NULL is a valid type_attrs. It means an object we
+			 * can't instantiate (like DEVICE).
+			 */
+			object->type_attrs = i < 0 ? NULL :
+				object_defs[i]->type_attrs;
+
+			hash->objects[min_id++] = object;
+		} while (1);
+	}
+
+	kfree(object_defs);
+	return root_spec;
+
+free:
+	kfree(object_defs);
+free_root:
+	uverbs_free_spec_tree(root_spec);
+	return ERR_PTR(res);
+}
+EXPORT_SYMBOL(uverbs_alloc_spec_tree);
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 5e530d2..dc2aed6 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -49,6 +49,7 @@
 #include <linux/uaccess.h>
 
 #include <rdma/ib.h>
+#include <rdma/uverbs_std_types.h>
 
 #include "uverbs.h"
 #include "core_priv.h"
@@ -595,7 +596,6 @@ struct file *ib_uverbs_alloc_async_event_file(struct ib_uverbs_file *uverbs_file
 {
 	struct ib_uverbs_async_event_file *ev_file;
 	struct file *filp;
-	int ret;
 
 	ev_file = kzalloc(sizeof(*ev_file), GFP_KERNEL);
 	if (!ev_file)
@@ -621,21 +621,11 @@ struct file *ib_uverbs_alloc_async_event_file(struct ib_uverbs_file *uverbs_file
 	INIT_IB_EVENT_HANDLER(&uverbs_file->event_handler,
 			      ib_dev,
 			      ib_uverbs_event_handler);
-	ret = ib_register_event_handler(&uverbs_file->event_handler);
-	if (ret)
-		goto err_put_file;
-
+	ib_register_event_handler(&uverbs_file->event_handler);
 	/* At that point async file stuff was fully set */
 
 	return filp;
 
-err_put_file:
-	fput(filp);
-	kref_put(&uverbs_file->async_file->ref,
-		 ib_uverbs_release_async_event_file);
-	uverbs_file->async_file = NULL;
-	return ERR_PTR(ret);
-
 err_put_refs:
 	kref_put(&ev_file->uverbs_file->ref, ib_uverbs_release_file);
 	kref_put(&ev_file->ref, ib_uverbs_release_async_event_file);
@@ -949,6 +939,9 @@ static const struct file_operations uverbs_fops = {
 	.open	 = ib_uverbs_open,
 	.release = ib_uverbs_close,
 	.llseek	 = no_llseek,
+#if IS_ENABLED(CONFIG_INFINIBAND_EXP_USER_ACCESS)
+	.unlocked_ioctl = ib_uverbs_ioctl,
+#endif
 };
 
 static const struct file_operations uverbs_mmap_fops = {
@@ -958,6 +951,9 @@ static const struct file_operations uverbs_mmap_fops = {
 	.open	 = ib_uverbs_open,
 	.release = ib_uverbs_close,
 	.llseek	 = no_llseek,
+#if IS_ENABLED(CONFIG_INFINIBAND_EXP_USER_ACCESS)
+	.unlocked_ioctl = ib_uverbs_ioctl,
+#endif
 };
 
 static struct ib_client uverbs_client = {
@@ -1108,6 +1104,18 @@ static void ib_uverbs_add_one(struct ib_device *device)
 	if (device_create_file(uverbs_dev->dev, &dev_attr_abi_version))
 		goto err_class;
 
+	if (!device->specs_root) {
+		const struct uverbs_object_tree_def *default_root[] = {
+			uverbs_default_get_objects()};
+
+		uverbs_dev->specs_root = uverbs_alloc_spec_tree(1,
+								default_root);
+		if (IS_ERR(uverbs_dev->specs_root))
+			goto err_class;
+
+		device->specs_root = uverbs_dev->specs_root;
+	}
+
 	ib_set_client_data(device, &uverbs_client, uverbs_dev);
 
 	return;
@@ -1239,6 +1247,11 @@ static void ib_uverbs_remove_one(struct ib_device *device, void *client_data)
 		ib_uverbs_comp_dev(uverbs_dev);
 	if (wait_clients)
 		wait_for_completion(&uverbs_dev->comp);
+	if (uverbs_dev->specs_root) {
+		uverbs_free_spec_tree(uverbs_dev->specs_root);
+		device->specs_root = NULL;
+	}
+
 	kobject_put(&uverbs_dev->kobj);
 }
 
diff --git a/drivers/infiniband/core/uverbs_marshall.c b/drivers/infiniband/core/uverbs_marshall.c
index 94fd989..bd0acf3 100644
--- a/drivers/infiniband/core/uverbs_marshall.c
+++ b/drivers/infiniband/core/uverbs_marshall.c
@@ -33,10 +33,47 @@
 #include <linux/export.h>
 #include <rdma/ib_marshall.h>
 
-void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
-			     struct rdma_ah_attr *src)
+#define OPA_DEFAULT_GID_PREFIX cpu_to_be64(0xfe80000000000000ULL)
+static int rdma_ah_conv_opa_to_ib(struct ib_device *dev,
+				  struct rdma_ah_attr *ib,
+				  struct rdma_ah_attr *opa)
 {
+	struct ib_port_attr port_attr;
+	int ret = 0;
+
+	/* Do a structure copy and then overwrite fields */
+	*ib = *opa;
+
+	ib->type = RDMA_AH_ATTR_TYPE_IB;
+	rdma_ah_set_grh(ib, NULL, 0, 0, 1, 0);
+
+	if (ib_query_port(dev, opa->port_num, &port_attr)) {
+		/* Set to default subnet to indicate error */
+		rdma_ah_set_subnet_prefix(ib, OPA_DEFAULT_GID_PREFIX);
+		ret = -EINVAL;
+	} else {
+		rdma_ah_set_subnet_prefix(ib,
+					  cpu_to_be64(port_attr.subnet_prefix));
+	}
+	rdma_ah_set_interface_id(ib, OPA_MAKE_ID(rdma_ah_get_dlid(opa)));
+	return ret;
+}
+
+void ib_copy_ah_attr_to_user(struct ib_device *device,
+			     struct ib_uverbs_ah_attr *dst,
+			     struct rdma_ah_attr *ah_attr)
+{
+	struct rdma_ah_attr *src = ah_attr;
+	struct rdma_ah_attr conv_ah;
+
 	memset(&dst->grh.reserved, 0, sizeof(dst->grh.reserved));
+
+	if ((ah_attr->type == RDMA_AH_ATTR_TYPE_OPA) &&
+	    (rdma_ah_get_dlid(ah_attr) >=
+	     be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
+	    (!rdma_ah_conv_opa_to_ib(device, &conv_ah, ah_attr)))
+		src = &conv_ah;
+
 	dst->dlid		   = rdma_ah_get_dlid(src);
 	dst->sl			   = rdma_ah_get_sl(src);
 	dst->src_path_bits	   = rdma_ah_get_path_bits(src);
@@ -57,7 +94,8 @@ void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
 }
 EXPORT_SYMBOL(ib_copy_ah_attr_to_user);
 
-void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
+void ib_copy_qp_attr_to_user(struct ib_device *device,
+			     struct ib_uverbs_qp_attr *dst,
 			     struct ib_qp_attr *src)
 {
 	dst->qp_state	        = src->qp_state;
@@ -76,8 +114,8 @@ void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
 	dst->max_recv_sge	= src->cap.max_recv_sge;
 	dst->max_inline_data	= src->cap.max_inline_data;
 
-	ib_copy_ah_attr_to_user(&dst->ah_attr, &src->ah_attr);
-	ib_copy_ah_attr_to_user(&dst->alt_ah_attr, &src->alt_ah_attr);
+	ib_copy_ah_attr_to_user(device, &dst->ah_attr, &src->ah_attr);
+	ib_copy_ah_attr_to_user(device, &dst->alt_ah_attr, &src->alt_ah_attr);
 
 	dst->pkey_index		= src->pkey_index;
 	dst->alt_pkey_index	= src->alt_pkey_index;
diff --git a/drivers/infiniband/core/uverbs_std_types.c b/drivers/infiniband/core/uverbs_std_types.c
index ef29337..0a98579 100644
--- a/drivers/infiniband/core/uverbs_std_types.c
+++ b/drivers/infiniband/core/uverbs_std_types.c
@@ -209,67 +209,244 @@ static int uverbs_hot_unplug_completion_event_file(struct ib_uobject_file *uobj_
 	return 0;
 };
 
-const struct uverbs_obj_fd_type uverbs_type_attrs_comp_channel = {
-	.type = UVERBS_TYPE_ALLOC_FD(sizeof(struct ib_uverbs_completion_event_file), 0),
-	.context_closed = uverbs_hot_unplug_completion_event_file,
-	.fops = &uverbs_event_fops,
-	.name = "[infinibandevent]",
-	.flags = O_RDONLY,
+/*
+ * This spec is used in order to pass information to the hardware driver in a
+ * legacy way. Every verb that could get driver specific data should get this
+ * spec.
+ */
+static const struct uverbs_attr_def uverbs_uhw_compat_in =
+	UVERBS_ATTR_PTR_IN_SZ(UVERBS_UHW_IN, 0, UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ));
+static const struct uverbs_attr_def uverbs_uhw_compat_out =
+	UVERBS_ATTR_PTR_OUT_SZ(UVERBS_UHW_OUT, 0, UA_FLAGS(UVERBS_ATTR_SPEC_F_MIN_SZ));
+
+static void create_udata(struct uverbs_attr_bundle *ctx,
+			 struct ib_udata *udata)
+{
+	/*
+	 * This is for ease of conversion. The purpose is to convert all drivers
+	 * to use uverbs_attr_bundle instead of ib_udata.
+	 * Assume attr == 0 is input and attr == 1 is output.
+	 */
+	void __user *inbuf;
+	size_t inbuf_len = 0;
+	void __user *outbuf;
+	size_t outbuf_len = 0;
+	const struct uverbs_attr *uhw_in =
+		uverbs_attr_get(ctx, UVERBS_UHW_IN);
+	const struct uverbs_attr *uhw_out =
+		uverbs_attr_get(ctx, UVERBS_UHW_OUT);
+
+	if (!IS_ERR(uhw_in)) {
+		inbuf = uhw_in->ptr_attr.ptr;
+		inbuf_len = uhw_in->ptr_attr.len;
+	}
+
+	if (!IS_ERR(uhw_out)) {
+		outbuf = uhw_out->ptr_attr.ptr;
+		outbuf_len = uhw_out->ptr_attr.len;
+	}
+
+	INIT_UDATA_BUF_OR_NULL(udata, inbuf, outbuf, inbuf_len, outbuf_len);
+}
+
+static int uverbs_create_cq_handler(struct ib_device *ib_dev,
+				    struct ib_uverbs_file *file,
+				    struct uverbs_attr_bundle *attrs)
+{
+	struct ib_ucontext *ucontext = file->ucontext;
+	struct ib_ucq_object           *obj;
+	struct ib_udata uhw;
+	int ret;
+	u64 user_handle;
+	struct ib_cq_init_attr attr = {};
+	struct ib_cq                   *cq;
+	struct ib_uverbs_completion_event_file    *ev_file = NULL;
+	const struct uverbs_attr *ev_file_attr;
+	struct ib_uobject *ev_file_uobj;
+
+	if (!(ib_dev->uverbs_cmd_mask & 1ULL << IB_USER_VERBS_CMD_CREATE_CQ))
+		return -EOPNOTSUPP;
+
+	ret = uverbs_copy_from(&attr.comp_vector, attrs, CREATE_CQ_COMP_VECTOR);
+	if (!ret)
+		ret = uverbs_copy_from(&attr.cqe, attrs, CREATE_CQ_CQE);
+	if (!ret)
+		ret = uverbs_copy_from(&user_handle, attrs, CREATE_CQ_USER_HANDLE);
+	if (ret)
+		return ret;
+
+	/* Optional param, if it doesn't exist, we get -ENOENT and skip it */
+	if (uverbs_copy_from(&attr.flags, attrs, CREATE_CQ_FLAGS) == -EFAULT)
+		return -EFAULT;
+
+	ev_file_attr = uverbs_attr_get(attrs, CREATE_CQ_COMP_CHANNEL);
+	if (!IS_ERR(ev_file_attr)) {
+		ev_file_uobj = ev_file_attr->obj_attr.uobject;
+
+		ev_file = container_of(ev_file_uobj,
+				       struct ib_uverbs_completion_event_file,
+				       uobj_file.uobj);
+		uverbs_uobject_get(ev_file_uobj);
+	}
+
+	if (attr.comp_vector >= ucontext->ufile->device->num_comp_vectors) {
+		ret = -EINVAL;
+		goto err_event_file;
+	}
+
+	obj = container_of(uverbs_attr_get(attrs, CREATE_CQ_HANDLE)->obj_attr.uobject,
+			   typeof(*obj), uobject);
+	obj->uverbs_file	   = ucontext->ufile;
+	obj->comp_events_reported  = 0;
+	obj->async_events_reported = 0;
+	INIT_LIST_HEAD(&obj->comp_list);
+	INIT_LIST_HEAD(&obj->async_list);
+
+	/* Temporary, only until drivers get the new uverbs_attr_bundle */
+	create_udata(attrs, &uhw);
+
+	cq = ib_dev->create_cq(ib_dev, &attr, ucontext, &uhw);
+	if (IS_ERR(cq)) {
+		ret = PTR_ERR(cq);
+		goto err_event_file;
+	}
+
+	cq->device        = ib_dev;
+	cq->uobject       = &obj->uobject;
+	cq->comp_handler  = ib_uverbs_comp_handler;
+	cq->event_handler = ib_uverbs_cq_event_handler;
+	cq->cq_context    = &ev_file->ev_queue;
+	obj->uobject.object = cq;
+	obj->uobject.user_handle = user_handle;
+	atomic_set(&cq->usecnt, 0);
+
+	ret = uverbs_copy_to(attrs, CREATE_CQ_RESP_CQE, &cq->cqe);
+	if (ret)
+		goto err_cq;
+
+	return 0;
+err_cq:
+	ib_destroy_cq(cq);
+
+err_event_file:
+	if (ev_file)
+		uverbs_uobject_put(ev_file_uobj);
+	return ret;
 };
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_cq = {
-	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object), 0),
-	.destroy_object = uverbs_free_cq,
-};
+static DECLARE_UVERBS_METHOD(
+	uverbs_method_cq_create, UVERBS_CQ_CREATE, uverbs_create_cq_handler,
+	&UVERBS_ATTR_IDR(CREATE_CQ_HANDLE, UVERBS_OBJECT_CQ, UVERBS_ACCESS_NEW,
+			 UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	&UVERBS_ATTR_PTR_IN(CREATE_CQ_CQE, u32,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	&UVERBS_ATTR_PTR_IN(CREATE_CQ_USER_HANDLE, u64,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	&UVERBS_ATTR_FD(CREATE_CQ_COMP_CHANNEL, UVERBS_OBJECT_COMP_CHANNEL,
+			UVERBS_ACCESS_READ),
+	&UVERBS_ATTR_PTR_IN(CREATE_CQ_COMP_VECTOR, u32,
+			    UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	&UVERBS_ATTR_PTR_IN(CREATE_CQ_FLAGS, u32),
+	&UVERBS_ATTR_PTR_OUT(CREATE_CQ_RESP_CQE, u32,
+			     UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	&uverbs_uhw_compat_in, &uverbs_uhw_compat_out);
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_qp = {
-	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), 0),
-	.destroy_object = uverbs_free_qp,
-};
+static int uverbs_destroy_cq_handler(struct ib_device *ib_dev,
+				     struct ib_uverbs_file *file,
+				     struct uverbs_attr_bundle *attrs)
+{
+	struct ib_uverbs_destroy_cq_resp resp;
+	struct ib_uobject *uobj =
+		uverbs_attr_get(attrs, DESTROY_CQ_HANDLE)->obj_attr.uobject;
+	struct ib_ucq_object *obj = container_of(uobj, struct ib_ucq_object,
+						 uobject);
+	int ret;
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_mw = {
-	.type = UVERBS_TYPE_ALLOC_IDR(0),
-	.destroy_object = uverbs_free_mw,
-};
+	if (!(ib_dev->uverbs_cmd_mask & 1ULL << IB_USER_VERBS_CMD_DESTROY_CQ))
+		return -EOPNOTSUPP;
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_mr = {
-	/* 1 is used in order to free the MR after all the MWs */
-	.type = UVERBS_TYPE_ALLOC_IDR(1),
-	.destroy_object = uverbs_free_mr,
-};
+	ret = rdma_explicit_destroy(uobj);
+	if (ret)
+		return ret;
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_srq = {
-	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object), 0),
-	.destroy_object = uverbs_free_srq,
-};
+	resp.comp_events_reported  = obj->comp_events_reported;
+	resp.async_events_reported = obj->async_events_reported;
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_ah = {
-	.type = UVERBS_TYPE_ALLOC_IDR(0),
-	.destroy_object = uverbs_free_ah,
-};
+	return uverbs_copy_to(attrs, DESTROY_CQ_RESP, &resp);
+}
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_flow = {
-	.type = UVERBS_TYPE_ALLOC_IDR(0),
-	.destroy_object = uverbs_free_flow,
-};
+static DECLARE_UVERBS_METHOD(
+	uverbs_method_cq_destroy, UVERBS_CQ_DESTROY, uverbs_destroy_cq_handler,
+	&UVERBS_ATTR_IDR(DESTROY_CQ_HANDLE, UVERBS_OBJECT_CQ,
+			 UVERBS_ACCESS_DESTROY,
+			 UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)),
+	&UVERBS_ATTR_PTR_OUT(DESTROY_CQ_RESP, struct ib_uverbs_destroy_cq_resp,
+			     UA_FLAGS(UVERBS_ATTR_SPEC_F_MANDATORY)));
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_wq = {
-	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object), 0),
-	.destroy_object = uverbs_free_wq,
-};
+DECLARE_UVERBS_OBJECT(uverbs_object_comp_channel,
+		      UVERBS_OBJECT_COMP_CHANNEL,
+		      &UVERBS_TYPE_ALLOC_FD(0,
+					      sizeof(struct ib_uverbs_completion_event_file),
+					      uverbs_hot_unplug_completion_event_file,
+					      &uverbs_event_fops,
+					      "[infinibandevent]", O_RDONLY));
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_rwq_ind_table = {
-	.type = UVERBS_TYPE_ALLOC_IDR(0),
-	.destroy_object = uverbs_free_rwq_ind_tbl,
-};
+DECLARE_UVERBS_OBJECT(uverbs_object_cq, UVERBS_OBJECT_CQ,
+		      &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_ucq_object), 0,
+						  uverbs_free_cq),
+		      &uverbs_method_cq_create,
+		      &uverbs_method_cq_destroy);
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_xrcd = {
-	.type = UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uxrcd_object), 0),
-	.destroy_object = uverbs_free_xrcd,
-};
+DECLARE_UVERBS_OBJECT(uverbs_object_qp, UVERBS_OBJECT_QP,
+		      &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uqp_object), 0,
+						  uverbs_free_qp));
 
-const struct uverbs_obj_idr_type uverbs_type_attrs_pd = {
-	/* 2 is used in order to free the PD after MRs */
-	.type = UVERBS_TYPE_ALLOC_IDR(2),
-	.destroy_object = uverbs_free_pd,
-};
+DECLARE_UVERBS_OBJECT(uverbs_object_mw, UVERBS_OBJECT_MW,
+		      &UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_mw));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_mr, UVERBS_OBJECT_MR,
+		      /* 1 is used in order to free the MR after all the MWs */
+		      &UVERBS_TYPE_ALLOC_IDR(1, uverbs_free_mr));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_srq, UVERBS_OBJECT_SRQ,
+		      &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_usrq_object), 0,
+						  uverbs_free_srq));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_ah, UVERBS_OBJECT_AH,
+		      &UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_ah));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_flow, UVERBS_OBJECT_FLOW,
+		      &UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_flow));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_wq, UVERBS_OBJECT_WQ,
+		      &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uwq_object), 0,
+						  uverbs_free_wq));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_rwq_ind_table,
+		      UVERBS_OBJECT_RWQ_IND_TBL,
+		      &UVERBS_TYPE_ALLOC_IDR(0, uverbs_free_rwq_ind_tbl));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_xrcd, UVERBS_OBJECT_XRCD,
+		      &UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uxrcd_object), 0,
+						  uverbs_free_xrcd));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_pd, UVERBS_OBJECT_PD,
+		      /* 2 is used in order to free the PD after MRs */
+		      &UVERBS_TYPE_ALLOC_IDR(2, uverbs_free_pd));
+
+DECLARE_UVERBS_OBJECT(uverbs_object_device, UVERBS_OBJECT_DEVICE, NULL);
+
+DECLARE_UVERBS_OBJECT_TREE(uverbs_default_objects,
+			   &uverbs_object_device,
+			   &uverbs_object_pd,
+			   &uverbs_object_mr,
+			   &uverbs_object_comp_channel,
+			   &uverbs_object_cq,
+			   &uverbs_object_qp,
+			   &uverbs_object_ah,
+			   &uverbs_object_mw,
+			   &uverbs_object_srq,
+			   &uverbs_object_flow,
+			   &uverbs_object_wq,
+			   &uverbs_object_rwq_ind_table,
+			   &uverbs_object_xrcd);
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index b456e3c..ee9e27d 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -180,39 +180,29 @@ EXPORT_SYMBOL(ib_rate_to_mbps);
 __attribute_const__ enum rdma_transport_type
 rdma_node_get_transport(enum rdma_node_type node_type)
 {
-	switch (node_type) {
-	case RDMA_NODE_IB_CA:
-	case RDMA_NODE_IB_SWITCH:
-	case RDMA_NODE_IB_ROUTER:
-		return RDMA_TRANSPORT_IB;
-	case RDMA_NODE_RNIC:
-		return RDMA_TRANSPORT_IWARP;
-	case RDMA_NODE_USNIC:
+
+	if (node_type == RDMA_NODE_USNIC)
 		return RDMA_TRANSPORT_USNIC;
-	case RDMA_NODE_USNIC_UDP:
+	if (node_type == RDMA_NODE_USNIC_UDP)
 		return RDMA_TRANSPORT_USNIC_UDP;
-	default:
-		BUG();
-		return 0;
-	}
+	if (node_type == RDMA_NODE_RNIC)
+		return RDMA_TRANSPORT_IWARP;
+
+	return RDMA_TRANSPORT_IB;
 }
 EXPORT_SYMBOL(rdma_node_get_transport);
 
 enum rdma_link_layer rdma_port_get_link_layer(struct ib_device *device, u8 port_num)
 {
+	enum rdma_transport_type lt;
 	if (device->get_link_layer)
 		return device->get_link_layer(device, port_num);
 
-	switch (rdma_node_get_transport(device->node_type)) {
-	case RDMA_TRANSPORT_IB:
+	lt = rdma_node_get_transport(device->node_type);
+	if (lt == RDMA_TRANSPORT_IB)
 		return IB_LINK_LAYER_INFINIBAND;
-	case RDMA_TRANSPORT_IWARP:
-	case RDMA_TRANSPORT_USNIC:
-	case RDMA_TRANSPORT_USNIC_UDP:
-		return IB_LINK_LAYER_ETHERNET;
-	default:
-		return IB_LINK_LAYER_UNSPECIFIED;
-	}
+
+	return IB_LINK_LAYER_ETHERNET;
 }
 EXPORT_SYMBOL(rdma_port_get_link_layer);
 
@@ -478,6 +468,8 @@ int ib_init_ah_from_wc(struct ib_device *device, u8 port_num,
 	union ib_gid dgid;
 	union ib_gid sgid;
 
+	might_sleep();
+
 	memset(ah_attr, 0, sizeof *ah_attr);
 	ah_attr->type = rdma_ah_find_type(device, port_num);
 	if (rdma_cap_eth_ah(device, port_num)) {
@@ -632,11 +624,13 @@ struct ib_srq *ib_create_srq(struct ib_pd *pd,
 		srq->event_handler = srq_init_attr->event_handler;
 		srq->srq_context   = srq_init_attr->srq_context;
 		srq->srq_type      = srq_init_attr->srq_type;
+		if (ib_srq_has_cq(srq->srq_type)) {
+			srq->ext.cq   = srq_init_attr->ext.cq;
+			atomic_inc(&srq->ext.cq->usecnt);
+		}
 		if (srq->srq_type == IB_SRQT_XRC) {
 			srq->ext.xrc.xrcd = srq_init_attr->ext.xrc.xrcd;
-			srq->ext.xrc.cq   = srq_init_attr->ext.xrc.cq;
 			atomic_inc(&srq->ext.xrc.xrcd->usecnt);
-			atomic_inc(&srq->ext.xrc.cq->usecnt);
 		}
 		atomic_inc(&pd->usecnt);
 		atomic_set(&srq->usecnt, 0);
@@ -677,18 +671,18 @@ int ib_destroy_srq(struct ib_srq *srq)
 
 	pd = srq->pd;
 	srq_type = srq->srq_type;
-	if (srq_type == IB_SRQT_XRC) {
+	if (ib_srq_has_cq(srq_type))
+		cq = srq->ext.cq;
+	if (srq_type == IB_SRQT_XRC)
 		xrcd = srq->ext.xrc.xrcd;
-		cq = srq->ext.xrc.cq;
-	}
 
 	ret = srq->device->destroy_srq(srq);
 	if (!ret) {
 		atomic_dec(&pd->usecnt);
-		if (srq_type == IB_SRQT_XRC) {
+		if (srq_type == IB_SRQT_XRC)
 			atomic_dec(&xrcd->usecnt);
+		if (ib_srq_has_cq(srq_type))
 			atomic_dec(&cq->usecnt);
-		}
 	}
 
 	return ret;
@@ -1244,6 +1238,18 @@ int ib_resolve_eth_dmac(struct ib_device *device,
 	if (rdma_link_local_addr((struct in6_addr *)grh->dgid.raw)) {
 		rdma_get_ll_mac((struct in6_addr *)grh->dgid.raw,
 				ah_attr->roce.dmac);
+		return 0;
+	}
+	if (rdma_is_multicast_addr((struct in6_addr *)ah_attr->grh.dgid.raw)) {
+		if (ipv6_addr_v4mapped((struct in6_addr *)ah_attr->grh.dgid.raw)) {
+			__be32 addr = 0;
+
+			memcpy(&addr, ah_attr->grh.dgid.raw + 12, 4);
+			ip_eth_mc_map(addr, (char *)ah_attr->roce.dmac);
+		} else {
+			ipv6_eth_mc_map((struct in6_addr *)ah_attr->grh.dgid.raw,
+					(char *)ah_attr->roce.dmac);
+		}
 	} else {
 		union ib_gid		sgid;
 		struct ib_gid_attr	sgid_attr;
@@ -1306,6 +1312,61 @@ int ib_modify_qp_with_udata(struct ib_qp *qp, struct ib_qp_attr *attr,
 }
 EXPORT_SYMBOL(ib_modify_qp_with_udata);
 
+int ib_get_eth_speed(struct ib_device *dev, u8 port_num, u8 *speed, u8 *width)
+{
+	int rc;
+	u32 netdev_speed;
+	struct net_device *netdev;
+	struct ethtool_link_ksettings lksettings;
+
+	if (rdma_port_get_link_layer(dev, port_num) != IB_LINK_LAYER_ETHERNET)
+		return -EINVAL;
+
+	if (!dev->get_netdev)
+		return -EOPNOTSUPP;
+
+	netdev = dev->get_netdev(dev, port_num);
+	if (!netdev)
+		return -ENODEV;
+
+	rtnl_lock();
+	rc = __ethtool_get_link_ksettings(netdev, &lksettings);
+	rtnl_unlock();
+
+	dev_put(netdev);
+
+	if (!rc) {
+		netdev_speed = lksettings.base.speed;
+	} else {
+		netdev_speed = SPEED_1000;
+		pr_warn("%s speed is unknown, defaulting to %d\n", netdev->name,
+			netdev_speed);
+	}
+
+	if (netdev_speed <= SPEED_1000) {
+		*width = IB_WIDTH_1X;
+		*speed = IB_SPEED_SDR;
+	} else if (netdev_speed <= SPEED_10000) {
+		*width = IB_WIDTH_1X;
+		*speed = IB_SPEED_FDR10;
+	} else if (netdev_speed <= SPEED_20000) {
+		*width = IB_WIDTH_4X;
+		*speed = IB_SPEED_DDR;
+	} else if (netdev_speed <= SPEED_25000) {
+		*width = IB_WIDTH_1X;
+		*speed = IB_SPEED_EDR;
+	} else if (netdev_speed <= SPEED_40000) {
+		*width = IB_WIDTH_4X;
+		*speed = IB_SPEED_FDR10;
+	} else {
+		*width = IB_WIDTH_4X;
+		*speed = IB_SPEED_EDR;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(ib_get_eth_speed);
+
 int ib_modify_qp(struct ib_qp *qp,
 		 struct ib_qp_attr *qp_attr,
 		 int qp_attr_mask)
@@ -1573,15 +1634,53 @@ EXPORT_SYMBOL(ib_dealloc_fmr);
 
 /* Multicast groups */
 
+static bool is_valid_mcast_lid(struct ib_qp *qp, u16 lid)
+{
+	struct ib_qp_init_attr init_attr = {};
+	struct ib_qp_attr attr = {};
+	int num_eth_ports = 0;
+	int port;
+
+	/* If QP state >= init, it is assigned to a port and we can check this
+	 * port only.
+	 */
+	if (!ib_query_qp(qp, &attr, IB_QP_STATE | IB_QP_PORT, &init_attr)) {
+		if (attr.qp_state >= IB_QPS_INIT) {
+			if (qp->device->get_link_layer(qp->device, attr.port_num) !=
+			    IB_LINK_LAYER_INFINIBAND)
+				return true;
+			goto lid_check;
+		}
+	}
+
+	/* Can't get a quick answer, iterate over all ports */
+	for (port = 0; port < qp->device->phys_port_cnt; port++)
+		if (qp->device->get_link_layer(qp->device, port) !=
+		    IB_LINK_LAYER_INFINIBAND)
+			num_eth_ports++;
+
+	/* If we have at least one Ethernet port, RoCE annex declares that
+	 * multicast LID should be ignored. We can't tell at this step if the
+	 * QP belongs to an IB or Ethernet port.
+	 */
+	if (num_eth_ports)
+		return true;
+
+	/* If all the ports are IB, we can check according to IB spec. */
+lid_check:
+	return !(lid < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
+		 lid == be16_to_cpu(IB_LID_PERMISSIVE));
+}
+
 int ib_attach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid)
 {
 	int ret;
 
 	if (!qp->device->attach_mcast)
 		return -ENOSYS;
-	if (gid->raw[0] != 0xff || qp->qp_type != IB_QPT_UD ||
-	    lid < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
-	    lid == be16_to_cpu(IB_LID_PERMISSIVE))
+
+	if (!rdma_is_multicast_addr((struct in6_addr *)gid->raw) ||
+	    qp->qp_type != IB_QPT_UD || !is_valid_mcast_lid(qp, lid))
 		return -EINVAL;
 
 	ret = qp->device->attach_mcast(qp, gid, lid);
@@ -1597,9 +1696,9 @@ int ib_detach_mcast(struct ib_qp *qp, union ib_gid *gid, u16 lid)
 
 	if (!qp->device->detach_mcast)
 		return -ENOSYS;
-	if (gid->raw[0] != 0xff || qp->qp_type != IB_QPT_UD ||
-	    lid < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
-	    lid == be16_to_cpu(IB_LID_PERMISSIVE))
+
+	if (!rdma_is_multicast_addr((struct in6_addr *)gid->raw) ||
+	    qp->qp_type != IB_QPT_UD || !is_valid_mcast_lid(qp, lid))
 		return -EINVAL;
 
 	ret = qp->device->detach_mcast(qp, gid, lid);
diff --git a/drivers/infiniband/hw/bnxt_re/Makefile b/drivers/infiniband/hw/bnxt_re/Makefile
index 036f84e..afbaa0e 100644
--- a/drivers/infiniband/hw/bnxt_re/Makefile
+++ b/drivers/infiniband/hw/bnxt_re/Makefile
@@ -3,4 +3,4 @@
 obj-$(CONFIG_INFINIBAND_BNXT_RE) += bnxt_re.o
 bnxt_re-y := main.o ib_verbs.o \
 	     qplib_res.o qplib_rcfw.o	\
-	     qplib_sp.o qplib_fp.o
+	     qplib_sp.o qplib_fp.o  hw_counters.o
diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index 8552753..b3ad37f 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -85,7 +85,7 @@ struct bnxt_re_sqp_entries {
 };
 
 #define BNXT_RE_MIN_MSIX		2
-#define BNXT_RE_MAX_MSIX		16
+#define BNXT_RE_MAX_MSIX		9
 #define BNXT_RE_AEQ_IDX			0
 #define BNXT_RE_NQ_IDX			1
 
@@ -116,7 +116,7 @@ struct bnxt_re_dev {
 	struct bnxt_qplib_rcfw		rcfw;
 
 	/* NQ */
-	struct bnxt_qplib_nq		nq;
+	struct bnxt_qplib_nq		nq[BNXT_RE_MAX_MSIX];
 
 	/* Device Resources */
 	struct bnxt_qplib_dev_attr	dev_attr;
@@ -140,6 +140,7 @@ struct bnxt_re_dev {
 	struct bnxt_re_qp		*qp1_sqp;
 	struct bnxt_re_ah		*sqp_ah;
 	struct bnxt_re_sqp_entries sqp_tbl[1024];
+	atomic_t nq_alloc_cnt;
 };
 
 #define to_bnxt_re_dev(ptr, member)	\
diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.c b/drivers/infiniband/hw/bnxt_re/hw_counters.c
new file mode 100644
index 0000000..7b28219
--- /dev/null
+++ b/drivers/infiniband/hw/bnxt_re/hw_counters.c
@@ -0,0 +1,114 @@
+/*
+ * Broadcom NetXtreme-E RoCE driver.
+ *
+ * Copyright (c) 2016 - 2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: Statistics
+ *
+ */
+
+#include <linux/interrupt.h>
+#include <linux/types.h>
+#include <linux/spinlock.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/pci.h>
+#include <linux/prefetch.h>
+#include <linux/delay.h>
+
+#include <rdma/ib_addr.h>
+
+#include "bnxt_ulp.h"
+#include "roce_hsi.h"
+#include "qplib_res.h"
+#include "qplib_sp.h"
+#include "qplib_fp.h"
+#include "qplib_rcfw.h"
+#include "bnxt_re.h"
+#include "hw_counters.h"
+
+static const char * const bnxt_re_stat_name[] = {
+	[BNXT_RE_ACTIVE_QP]           =  "active_qps",
+	[BNXT_RE_ACTIVE_SRQ]          =  "active_srqs",
+	[BNXT_RE_ACTIVE_CQ]           =  "active_cqs",
+	[BNXT_RE_ACTIVE_MR]           =  "active_mrs",
+	[BNXT_RE_ACTIVE_MW]           =  "active_mws",
+	[BNXT_RE_RX_PKTS]             =  "rx_pkts",
+	[BNXT_RE_RX_BYTES]            =  "rx_bytes",
+	[BNXT_RE_TX_PKTS]             =  "tx_pkts",
+	[BNXT_RE_TX_BYTES]            =  "tx_bytes",
+	[BNXT_RE_RECOVERABLE_ERRORS]  =  "recoverable_errors"
+};
+
+int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev,
+			    struct rdma_hw_stats *stats,
+			    u8 port, int index)
+{
+	struct bnxt_re_dev *rdev = to_bnxt_re_dev(ibdev, ibdev);
+	struct ctx_hw_stats *bnxt_re_stats = rdev->qplib_ctx.stats.dma;
+
+	if (!port || !stats)
+		return -EINVAL;
+
+	stats->value[BNXT_RE_ACTIVE_QP] = atomic_read(&rdev->qp_count);
+	stats->value[BNXT_RE_ACTIVE_SRQ] = atomic_read(&rdev->srq_count);
+	stats->value[BNXT_RE_ACTIVE_CQ] = atomic_read(&rdev->cq_count);
+	stats->value[BNXT_RE_ACTIVE_MR] = atomic_read(&rdev->mr_count);
+	stats->value[BNXT_RE_ACTIVE_MW] = atomic_read(&rdev->mw_count);
+	if (bnxt_re_stats) {
+		stats->value[BNXT_RE_RECOVERABLE_ERRORS] =
+			le64_to_cpu(bnxt_re_stats->tx_bcast_pkts);
+		stats->value[BNXT_RE_RX_PKTS] =
+			le64_to_cpu(bnxt_re_stats->rx_ucast_pkts);
+		stats->value[BNXT_RE_RX_BYTES] =
+			le64_to_cpu(bnxt_re_stats->rx_ucast_bytes);
+		stats->value[BNXT_RE_TX_PKTS] =
+			le64_to_cpu(bnxt_re_stats->tx_ucast_pkts);
+		stats->value[BNXT_RE_TX_BYTES] =
+			le64_to_cpu(bnxt_re_stats->tx_ucast_bytes);
+	}
+	return ARRAY_SIZE(bnxt_re_stat_name);
+}
+
+struct rdma_hw_stats *bnxt_re_ib_alloc_hw_stats(struct ib_device *ibdev,
+						u8 port_num)
+{
+	BUILD_BUG_ON(ARRAY_SIZE(bnxt_re_stat_name) != BNXT_RE_NUM_COUNTERS);
+	/* We support only per-port stats */
+	if (!port_num)
+		return NULL;
+
+	return rdma_alloc_hw_stats_struct(bnxt_re_stat_name,
+					  ARRAY_SIZE(bnxt_re_stat_name),
+					  RDMA_HW_STATS_DEFAULT_LIFESPAN);
+}
diff --git a/drivers/infiniband/hw/bnxt_re/hw_counters.h b/drivers/infiniband/hw/bnxt_re/hw_counters.h
new file mode 100644
index 0000000..be0dc00
--- /dev/null
+++ b/drivers/infiniband/hw/bnxt_re/hw_counters.h
@@ -0,0 +1,62 @@
+/*
+ * Broadcom NetXtreme-E RoCE driver.
+ *
+ * Copyright (c) 2016 - 2017, Broadcom. All rights reserved.  The term
+ * Broadcom refers to Broadcom Limited and/or its subsidiaries.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS''
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS
+ * BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN
+ * IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ * Description: Statistics (header)
+ *
+ */
+
+#ifndef __BNXT_RE_HW_STATS_H__
+#define __BNXT_RE_HW_STATS_H__
+
+enum bnxt_re_hw_stats {
+	BNXT_RE_ACTIVE_QP,
+	BNXT_RE_ACTIVE_SRQ,
+	BNXT_RE_ACTIVE_CQ,
+	BNXT_RE_ACTIVE_MR,
+	BNXT_RE_ACTIVE_MW,
+	BNXT_RE_RX_PKTS,
+	BNXT_RE_RX_BYTES,
+	BNXT_RE_TX_PKTS,
+	BNXT_RE_TX_BYTES,
+	BNXT_RE_RECOVERABLE_ERRORS,
+	BNXT_RE_NUM_COUNTERS
+};
+
+struct rdma_hw_stats *bnxt_re_ib_alloc_hw_stats(struct ib_device *ibdev,
+						u8 port_num);
+int bnxt_re_ib_get_hw_stats(struct ib_device *ibdev,
+			    struct rdma_hw_stats *stats,
+			    u8 port, int index);
+#endif /* __BNXT_RE_HW_STATS_H__ */
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index f0e01b3..01eee15 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -223,50 +223,6 @@ int bnxt_re_modify_device(struct ib_device *ibdev,
 	return 0;
 }
 
-static void __to_ib_speed_width(struct net_device *netdev, u8 *speed, u8 *width)
-{
-	struct ethtool_link_ksettings lksettings;
-	u32 espeed;
-
-	if (netdev->ethtool_ops && netdev->ethtool_ops->get_link_ksettings) {
-		memset(&lksettings, 0, sizeof(lksettings));
-		rtnl_lock();
-		netdev->ethtool_ops->get_link_ksettings(netdev, &lksettings);
-		rtnl_unlock();
-		espeed = lksettings.base.speed;
-	} else {
-		espeed = SPEED_UNKNOWN;
-	}
-	switch (espeed) {
-	case SPEED_1000:
-		*speed = IB_SPEED_SDR;
-		*width = IB_WIDTH_1X;
-		break;
-	case SPEED_10000:
-		*speed = IB_SPEED_QDR;
-		*width = IB_WIDTH_1X;
-		break;
-	case SPEED_20000:
-		*speed = IB_SPEED_DDR;
-		*width = IB_WIDTH_4X;
-		break;
-	case SPEED_25000:
-		*speed = IB_SPEED_EDR;
-		*width = IB_WIDTH_1X;
-		break;
-	case SPEED_40000:
-		*speed = IB_SPEED_QDR;
-		*width = IB_WIDTH_4X;
-		break;
-	case SPEED_50000:
-		break;
-	default:
-		*speed = IB_SPEED_SDR;
-		*width = IB_WIDTH_1X;
-		break;
-	}
-}
-
 /* Port */
 int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
 		       struct ib_port_attr *port_attr)
@@ -308,25 +264,9 @@ int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
 	 * IB stack to avoid race in the NETDEV_UNREG path
 	 */
 	if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
-		__to_ib_speed_width(rdev->netdev, &port_attr->active_speed,
-				    &port_attr->active_width);
-	return 0;
-}
-
-int bnxt_re_modify_port(struct ib_device *ibdev, u8 port_num,
-			int port_modify_mask,
-			struct ib_port_modify *port_modify)
-{
-	switch (port_modify_mask) {
-	case IB_PORT_SHUTDOWN:
-		break;
-	case IB_PORT_INIT_TYPE:
-		break;
-	case IB_PORT_RESET_QKEY_CNTR:
-		break;
-	default:
-		break;
-	}
+		if (ib_get_eth_speed(ibdev, port_num, &port_attr->active_speed,
+				     &port_attr->active_width))
+			return -EINVAL;
 	return 0;
 }
 
@@ -846,6 +786,7 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp)
 	struct bnxt_re_dev *rdev = qp->rdev;
 	int rc;
 
+	bnxt_qplib_del_flush_qp(&qp->qplib_qp);
 	rc = bnxt_qplib_destroy_qp(&rdev->qplib_res, &qp->qplib_qp);
 	if (rc) {
 		dev_err(rdev_to_dev(rdev), "Failed to destroy HW QP");
@@ -860,6 +801,7 @@ int bnxt_re_destroy_qp(struct ib_qp *ib_qp)
 			return rc;
 		}
 
+		bnxt_qplib_del_flush_qp(&qp->qplib_qp);
 		rc = bnxt_qplib_destroy_qp(&rdev->qplib_res,
 					   &rdev->qp1_sqp->qplib_qp);
 		if (rc) {
@@ -969,7 +911,6 @@ static struct bnxt_re_ah *bnxt_re_create_shadow_qp_ah
 	if (!ah)
 		return NULL;
 
-	memset(ah, 0, sizeof(*ah));
 	ah->rdev = rdev;
 	ah->qplib_ah.pd = &pd->qplib_pd;
 
@@ -1016,7 +957,6 @@ static struct bnxt_re_qp *bnxt_re_create_shadow_qp
 	if (!qp)
 		return NULL;
 
-	memset(qp, 0, sizeof(*qp));
 	qp->rdev = rdev;
 
 	/* Initialize the shadow QP structure from the QP1 values */
@@ -1404,6 +1344,21 @@ int bnxt_re_modify_qp(struct ib_qp *ib_qp, struct ib_qp_attr *qp_attr,
 		}
 		qp->qplib_qp.modify_flags |= CMDQ_MODIFY_QP_MODIFY_MASK_STATE;
 		qp->qplib_qp.state = __from_ib_qp_state(qp_attr->qp_state);
+
+		if (!qp->sumem &&
+		    qp->qplib_qp.state == CMDQ_MODIFY_QP_NEW_STATE_ERR) {
+			dev_dbg(rdev_to_dev(rdev),
+				"Move QP = %p to flush list\n",
+				qp);
+			bnxt_qplib_add_flush_qp(&qp->qplib_qp);
+		}
+		if (!qp->sumem &&
+		    qp->qplib_qp.state == CMDQ_MODIFY_QP_NEW_STATE_RESET) {
+			dev_dbg(rdev_to_dev(rdev),
+				"Move QP = %p out of flush list\n",
+				qp);
+			bnxt_qplib_del_flush_qp(&qp->qplib_qp);
+		}
 	}
 	if (qp_attr_mask & IB_QP_EN_SQD_ASYNC_NOTIFY) {
 		qp->qplib_qp.modify_flags |=
@@ -2333,6 +2288,7 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
 	struct bnxt_re_cq *cq = container_of(ib_cq, struct bnxt_re_cq, ib_cq);
 	struct bnxt_re_dev *rdev = cq->rdev;
 	int rc;
+	struct bnxt_qplib_nq *nq = cq->qplib_cq.nq;
 
 	rc = bnxt_qplib_destroy_cq(&rdev->qplib_res, &cq->qplib_cq);
 	if (rc) {
@@ -2347,7 +2303,7 @@ int bnxt_re_destroy_cq(struct ib_cq *ib_cq)
 		kfree(cq);
 	}
 	atomic_dec(&rdev->cq_count);
-	rdev->nq.budget--;
+	nq->budget--;
 	return 0;
 }
 
@@ -2361,6 +2317,8 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 	struct bnxt_re_cq *cq = NULL;
 	int rc, entries;
 	int cqe = attr->cqe;
+	struct bnxt_qplib_nq *nq = NULL;
+	unsigned int nq_alloc_cnt;
 
 	/* Validate CQ fields */
 	if (cqe < 1 || cqe > dev_attr->max_cq_wqes) {
@@ -2412,8 +2370,15 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 		cq->qplib_cq.sghead = NULL;
 		cq->qplib_cq.nmap = 0;
 	}
+	/*
+	 * Allocate the NQ in a round-robin fashion. nq_alloc_cnt is
+	 * used for getting the NQ index.
+	 */
+	nq_alloc_cnt = atomic_inc_return(&rdev->nq_alloc_cnt);
+	nq = &rdev->nq[nq_alloc_cnt % (rdev->num_msix - 1)];
 	cq->qplib_cq.max_wqe = entries;
-	cq->qplib_cq.cnq_hw_ring_id = rdev->nq.ring_id;
+	cq->qplib_cq.cnq_hw_ring_id = nq->ring_id;
+	cq->qplib_cq.nq	= nq;
 
 	rc = bnxt_qplib_create_cq(&rdev->qplib_res, &cq->qplib_cq);
 	if (rc) {
@@ -2423,7 +2388,7 @@ struct ib_cq *bnxt_re_create_cq(struct ib_device *ibdev,
 
 	cq->ib_cq.cqe = entries;
 	cq->cq_period = cq->qplib_cq.period;
-	rdev->nq.budget++;
+	nq->budget++;
 
 	atomic_inc(&rdev->cq_count);
 
@@ -2921,6 +2886,10 @@ int bnxt_re_poll_cq(struct ib_cq *ib_cq, int num_entries, struct ib_wc *wc)
 					sq->send_phantom = false;
 			}
 		}
+		if (ncqe < budget)
+			ncqe += bnxt_qplib_process_flush_list(&cq->qplib_cq,
+							      cqe + ncqe,
+							      budget - ncqe);
 
 		if (!ncqe)
 			break;
@@ -3410,7 +3379,7 @@ int bnxt_re_dealloc_ucontext(struct ib_ucontext *ib_uctx)
 					    &rdev->qplib_res.dpi_tbl,
 					    &uctx->dpi);
 		if (rc)
-			dev_err(rdev_to_dev(rdev), "Deallocte HW DPI failed!");
+			dev_err(rdev_to_dev(rdev), "Deallocate HW DPI failed!");
 			/* Don't fail, continue*/
 		uctx->dpi.dbr = NULL;
 	}
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index a0bb7e3..1df11ed2 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -141,9 +141,6 @@ int bnxt_re_modify_device(struct ib_device *ibdev,
 			  struct ib_device_modify *device_modify);
 int bnxt_re_query_port(struct ib_device *ibdev, u8 port_num,
 		       struct ib_port_attr *port_attr);
-int bnxt_re_modify_port(struct ib_device *ibdev, u8 port_num,
-			int port_modify_mask,
-			struct ib_port_modify *port_modify);
 int bnxt_re_get_port_immutable(struct ib_device *ibdev, u8 port_num,
 			       struct ib_port_immutable *immutable);
 int bnxt_re_query_pkey(struct ib_device *ibdev, u8 port_num,
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index ceae2d9..82d1cbc 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -64,13 +64,14 @@
 #include "ib_verbs.h"
 #include <rdma/bnxt_re-abi.h>
 #include "bnxt.h"
+#include "hw_counters.h"
+
 static char version[] =
 		BNXT_RE_DESC " v" ROCE_DRV_MODULE_VERSION "\n";
 
 MODULE_AUTHOR("Eddie Wai <eddie.wai@broadcom.com>");
 MODULE_DESCRIPTION(BNXT_RE_DESC " Driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(ROCE_DRV_MODULE_VERSION);
 
 /* globals */
 static struct list_head bnxt_re_dev_list = LIST_HEAD_INIT(bnxt_re_dev_list);
@@ -162,7 +163,7 @@ static int bnxt_re_free_msix(struct bnxt_re_dev *rdev, bool lock_wait)
 
 static int bnxt_re_request_msix(struct bnxt_re_dev *rdev)
 {
-	int rc = 0, num_msix_want = BNXT_RE_MIN_MSIX, num_msix_got;
+	int rc = 0, num_msix_want = BNXT_RE_MAX_MSIX, num_msix_got;
 	struct bnxt_en_dev *en_dev;
 
 	if (!rdev)
@@ -170,6 +171,8 @@ static int bnxt_re_request_msix(struct bnxt_re_dev *rdev)
 
 	en_dev = rdev->en_dev;
 
+	num_msix_want = min_t(u32, BNXT_RE_MAX_MSIX, num_online_cpus());
+
 	rtnl_lock();
 	num_msix_got = en_dev->en_ops->bnxt_request_msix(en_dev, BNXT_ROCE_ULP,
 							 rdev->msix_entries,
@@ -474,7 +477,6 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
 	ibdev->modify_device		= bnxt_re_modify_device;
 
 	ibdev->query_port		= bnxt_re_query_port;
-	ibdev->modify_port		= bnxt_re_modify_port;
 	ibdev->get_port_immutable	= bnxt_re_get_port_immutable;
 	ibdev->query_pkey		= bnxt_re_query_pkey;
 	ibdev->query_gid		= bnxt_re_query_gid;
@@ -513,6 +515,8 @@ static int bnxt_re_register_ib(struct bnxt_re_dev *rdev)
 	ibdev->alloc_ucontext		= bnxt_re_alloc_ucontext;
 	ibdev->dealloc_ucontext		= bnxt_re_dealloc_ucontext;
 	ibdev->mmap			= bnxt_re_mmap;
+	ibdev->get_hw_stats             = bnxt_re_ib_get_hw_stats;
+	ibdev->alloc_hw_stats           = bnxt_re_ib_alloc_hw_stats;
 
 	return ib_register_device(ibdev, NULL);
 }
@@ -653,8 +657,12 @@ static int bnxt_re_cqn_handler(struct bnxt_qplib_nq *nq,
 
 static void bnxt_re_cleanup_res(struct bnxt_re_dev *rdev)
 {
-	if (rdev->nq.hwq.max_elements)
-		bnxt_qplib_disable_nq(&rdev->nq);
+	int i;
+
+	if (rdev->nq[0].hwq.max_elements) {
+		for (i = 1; i < rdev->num_msix; i++)
+			bnxt_qplib_disable_nq(&rdev->nq[i - 1]);
+	}
 
 	if (rdev->qplib_res.rcfw)
 		bnxt_qplib_cleanup_res(&rdev->qplib_res);
@@ -662,31 +670,41 @@ static void bnxt_re_cleanup_res(struct bnxt_re_dev *rdev)
 
 static int bnxt_re_init_res(struct bnxt_re_dev *rdev)
 {
-	int rc = 0;
+	int rc = 0, i;
 
 	bnxt_qplib_init_res(&rdev->qplib_res);
 
-	if (rdev->msix_entries[BNXT_RE_NQ_IDX].vector <= 0)
-		return -EINVAL;
+	for (i = 1; i < rdev->num_msix ; i++) {
+		rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nq[i - 1],
+					  i - 1, rdev->msix_entries[i].vector,
+					  rdev->msix_entries[i].db_offset,
+					  &bnxt_re_cqn_handler, NULL);
 
-	rc = bnxt_qplib_enable_nq(rdev->en_dev->pdev, &rdev->nq,
-				  rdev->msix_entries[BNXT_RE_NQ_IDX].vector,
-				  rdev->msix_entries[BNXT_RE_NQ_IDX].db_offset,
-				  &bnxt_re_cqn_handler,
-				  NULL);
-
-	if (rc)
-		dev_err(rdev_to_dev(rdev), "Failed to enable NQ: %#x", rc);
-
+		if (rc) {
+			dev_err(rdev_to_dev(rdev),
+				"Failed to enable NQ with rc = 0x%x", rc);
+			goto fail;
+		}
+	}
+	return 0;
+fail:
 	return rc;
 }
 
+static void bnxt_re_free_nq_res(struct bnxt_re_dev *rdev, bool lock_wait)
+{
+	int i;
+
+	for (i = 0; i < rdev->num_msix - 1; i++) {
+		bnxt_re_net_ring_free(rdev, rdev->nq[i].ring_id, lock_wait);
+		bnxt_qplib_free_nq(&rdev->nq[i]);
+	}
+}
+
 static void bnxt_re_free_res(struct bnxt_re_dev *rdev, bool lock_wait)
 {
-	if (rdev->nq.hwq.max_elements) {
-		bnxt_re_net_ring_free(rdev, rdev->nq.ring_id, lock_wait);
-		bnxt_qplib_free_nq(&rdev->nq);
-	}
+	bnxt_re_free_nq_res(rdev, lock_wait);
+
 	if (rdev->qplib_res.dpi_tbl.max) {
 		bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
 				       &rdev->qplib_res.dpi_tbl,
@@ -700,7 +718,7 @@ static void bnxt_re_free_res(struct bnxt_re_dev *rdev, bool lock_wait)
 
 static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 {
-	int rc = 0;
+	int rc = 0, i;
 
 	/* Configure and allocate resources for qplib */
 	rdev->qplib_res.rcfw = &rdev->rcfw;
@@ -717,30 +735,42 @@ static int bnxt_re_alloc_res(struct bnxt_re_dev *rdev)
 				  &rdev->dpi_privileged,
 				  rdev);
 	if (rc)
-		goto fail;
+		goto dealloc_res;
 
-	rdev->nq.hwq.max_elements = BNXT_RE_MAX_CQ_COUNT +
-				    BNXT_RE_MAX_SRQC_COUNT + 2;
-	rc = bnxt_qplib_alloc_nq(rdev->en_dev->pdev, &rdev->nq);
-	if (rc) {
-		dev_err(rdev_to_dev(rdev),
-			"Failed to allocate NQ memory: %#x", rc);
-		goto fail;
-	}
-	rc = bnxt_re_net_ring_alloc
-			(rdev, rdev->nq.hwq.pbl[PBL_LVL_0].pg_map_arr,
-			 rdev->nq.hwq.pbl[rdev->nq.hwq.level].pg_count,
-			 HWRM_RING_ALLOC_CMPL, BNXT_QPLIB_NQE_MAX_CNT - 1,
-			 rdev->msix_entries[BNXT_RE_NQ_IDX].ring_idx,
-			 &rdev->nq.ring_id);
-	if (rc) {
-		dev_err(rdev_to_dev(rdev),
-			"Failed to allocate NQ ring: %#x", rc);
-		goto free_nq;
+	for (i = 0; i < rdev->num_msix - 1; i++) {
+		rdev->nq[i].hwq.max_elements = BNXT_RE_MAX_CQ_COUNT +
+			BNXT_RE_MAX_SRQC_COUNT + 2;
+		rc = bnxt_qplib_alloc_nq(rdev->en_dev->pdev, &rdev->nq[i]);
+		if (rc) {
+			dev_err(rdev_to_dev(rdev), "Alloc Failed NQ%d rc:%#x",
+				i, rc);
+			goto dealloc_dpi;
+		}
+		rc = bnxt_re_net_ring_alloc
+			(rdev, rdev->nq[i].hwq.pbl[PBL_LVL_0].pg_map_arr,
+			 rdev->nq[i].hwq.pbl[rdev->nq[i].hwq.level].pg_count,
+			 HWRM_RING_ALLOC_CMPL,
+			 BNXT_QPLIB_NQE_MAX_CNT - 1,
+			 rdev->msix_entries[i + 1].ring_idx,
+			 &rdev->nq[i].ring_id);
+		if (rc) {
+			dev_err(rdev_to_dev(rdev),
+				"Failed to allocate NQ fw id with rc = 0x%x",
+				rc);
+			goto free_nq;
+		}
 	}
 	return 0;
 free_nq:
-	bnxt_qplib_free_nq(&rdev->nq);
+	for (i = 0; i < rdev->num_msix - 1; i++)
+		bnxt_qplib_free_nq(&rdev->nq[i]);
+dealloc_dpi:
+	bnxt_qplib_dealloc_dpi(&rdev->qplib_res,
+			       &rdev->qplib_res.dpi_tbl,
+			       &rdev->dpi_privileged);
+dealloc_res:
+	bnxt_qplib_free_res(&rdev->qplib_res);
+
 fail:
 	rdev->qplib_res.rcfw = NULL;
 	return rc;
@@ -835,6 +865,42 @@ static void bnxt_re_dev_stop(struct bnxt_re_dev *rdev)
 	mutex_unlock(&rdev->qp_lock);
 }
 
+static int bnxt_re_update_gid(struct bnxt_re_dev *rdev)
+{
+	struct bnxt_qplib_sgid_tbl *sgid_tbl = &rdev->qplib_res.sgid_tbl;
+	struct bnxt_qplib_gid gid;
+	u16 gid_idx, index;
+	int rc = 0;
+
+	if (!test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
+		return 0;
+
+	if (!sgid_tbl) {
+		dev_err(rdev_to_dev(rdev), "QPLIB: SGID table not allocated");
+		return -EINVAL;
+	}
+
+	for (index = 0; index < sgid_tbl->active; index++) {
+		gid_idx = sgid_tbl->hw_id[index];
+
+		if (!memcmp(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero,
+			    sizeof(bnxt_qplib_gid_zero)))
+			continue;
+		/* Only non-VLAN GIDs need their VLAN enable setting updated
+		 * here; VLAN GIDs are already configured when the GID is added.
+		 */
+		if (sgid_tbl->vlan[index])
+			continue;
+
+		memcpy(&gid, &sgid_tbl->tbl[index], sizeof(gid));
+
+		rc = bnxt_qplib_update_sgid(sgid_tbl, &gid, gid_idx,
+					    rdev->qplib_res.netdev->dev_addr);
+	}
+
+	return rc;
+}
+
 static u32 bnxt_re_get_priority_mask(struct bnxt_re_dev *rdev)
 {
 	u32 prio_map = 0, tmp_map = 0;
@@ -854,8 +920,6 @@ static u32 bnxt_re_get_priority_mask(struct bnxt_re_dev *rdev)
 	tmp_map = dcb_ieee_getapp_mask(netdev, &app);
 	prio_map |= tmp_map;
 
-	if (!prio_map)
-		prio_map = -EFAULT;
 	return prio_map;
 }
 
@@ -881,10 +945,7 @@ static int bnxt_re_setup_qos(struct bnxt_re_dev *rdev)
 	int rc;
 
 	/* Get priority for roce */
-	rc = bnxt_re_get_priority_mask(rdev);
-	if (rc < 0)
-		return rc;
-	prio_map = (u8)rc;
+	prio_map = bnxt_re_get_priority_mask(rdev);
 
 	if (prio_map == rdev->cur_prio_map)
 		return 0;
@@ -906,6 +967,16 @@ static int bnxt_re_setup_qos(struct bnxt_re_dev *rdev)
 		return rc;
 	}
 
+	/* Actual priorities are not programmed as they are already
+	 * done by L2 driver; just enable or disable priority vlan tagging
+	 */
+	if ((prio_map == 0 && rdev->qplib_res.prio) ||
+	    (prio_map != 0 && !rdev->qplib_res.prio)) {
+		rdev->qplib_res.prio = prio_map ? true : false;
+
+		bnxt_re_update_gid(rdev);
+	}
+
 	return 0;
 }
 
@@ -998,7 +1069,8 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 	/* Establish RCFW Communication Channel to initialize the context
 	 * memory for the function and all child VFs
 	 */
-	rc = bnxt_qplib_alloc_rcfw_channel(rdev->en_dev->pdev, &rdev->rcfw);
+	rc = bnxt_qplib_alloc_rcfw_channel(rdev->en_dev->pdev, &rdev->rcfw,
+					   BNXT_RE_MAX_QPC_COUNT);
 	if (rc)
 		goto fail;
 
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.c b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
index 9af1514..e8afc47 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.c
@@ -51,6 +51,168 @@
 #include "qplib_fp.h"
 
 static void bnxt_qplib_arm_cq_enable(struct bnxt_qplib_cq *cq);
+static void __clean_cq(struct bnxt_qplib_cq *cq, u64 qp);
+
+static void bnxt_qplib_cancel_phantom_processing(struct bnxt_qplib_qp *qp)
+{
+	qp->sq.condition = false;
+	qp->sq.send_phantom = false;
+	qp->sq.single = false;
+}
+
+/* Flush list */
+static void __bnxt_qplib_add_flush_qp(struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_cq *scq, *rcq;
+
+	scq = qp->scq;
+	rcq = qp->rcq;
+
+	if (!qp->sq.flushed) {
+		dev_dbg(&scq->hwq.pdev->dev,
+			"QPLIB: FP: Adding to SQ Flush list = %p",
+			qp);
+		bnxt_qplib_cancel_phantom_processing(qp);
+		list_add_tail(&qp->sq_flush, &scq->sqf_head);
+		qp->sq.flushed = true;
+	}
+	if (!qp->srq) {
+		if (!qp->rq.flushed) {
+			dev_dbg(&rcq->hwq.pdev->dev,
+				"QPLIB: FP: Adding to RQ Flush list = %p",
+				qp);
+			list_add_tail(&qp->rq_flush, &rcq->rqf_head);
+			qp->rq.flushed = true;
+		}
+	}
+}
+
+void bnxt_qplib_acquire_cq_locks(struct bnxt_qplib_qp *qp,
+				 unsigned long *flags)
+	__acquires(&qp->scq->hwq.lock) __acquires(&qp->rcq->hwq.lock)
+{
+	spin_lock_irqsave(&qp->scq->hwq.lock, *flags);
+	if (qp->scq == qp->rcq)
+		__acquire(&qp->rcq->hwq.lock);
+	else
+		spin_lock(&qp->rcq->hwq.lock);
+}
+
+void bnxt_qplib_release_cq_locks(struct bnxt_qplib_qp *qp,
+				 unsigned long *flags)
+	__releases(&qp->scq->hwq.lock) __releases(&qp->rcq->hwq.lock)
+{
+	if (qp->scq == qp->rcq)
+		__release(&qp->rcq->hwq.lock);
+	else
+		spin_unlock(&qp->rcq->hwq.lock);
+	spin_unlock_irqrestore(&qp->scq->hwq.lock, *flags);
+}
+
+static struct bnxt_qplib_cq *bnxt_qplib_find_buddy_cq(struct bnxt_qplib_qp *qp,
+						      struct bnxt_qplib_cq *cq)
+{
+	struct bnxt_qplib_cq *buddy_cq = NULL;
+
+	if (qp->scq == qp->rcq)
+		buddy_cq = NULL;
+	else if (qp->scq == cq)
+		buddy_cq = qp->rcq;
+	else
+		buddy_cq = qp->scq;
+	return buddy_cq;
+}
+
+static void bnxt_qplib_lock_buddy_cq(struct bnxt_qplib_qp *qp,
+				     struct bnxt_qplib_cq *cq)
+	__acquires(&buddy_cq->hwq.lock)
+{
+	struct bnxt_qplib_cq *buddy_cq = NULL;
+
+	buddy_cq = bnxt_qplib_find_buddy_cq(qp, cq);
+	if (!buddy_cq)
+		__acquire(&cq->hwq.lock);
+	else
+		spin_lock(&buddy_cq->hwq.lock);
+}
+
+static void bnxt_qplib_unlock_buddy_cq(struct bnxt_qplib_qp *qp,
+				       struct bnxt_qplib_cq *cq)
+	__releases(&buddy_cq->hwq.lock)
+{
+	struct bnxt_qplib_cq *buddy_cq = NULL;
+
+	buddy_cq = bnxt_qplib_find_buddy_cq(qp, cq);
+	if (!buddy_cq)
+		__release(&cq->hwq.lock);
+	else
+		spin_unlock(&buddy_cq->hwq.lock);
+}
+
+void bnxt_qplib_add_flush_qp(struct bnxt_qplib_qp *qp)
+{
+	unsigned long flags;
+
+	bnxt_qplib_acquire_cq_locks(qp, &flags);
+	__bnxt_qplib_add_flush_qp(qp);
+	bnxt_qplib_release_cq_locks(qp, &flags);
+}
+
+static void __bnxt_qplib_del_flush_qp(struct bnxt_qplib_qp *qp)
+{
+	struct bnxt_qplib_cq *scq, *rcq;
+
+	scq = qp->scq;
+	rcq = qp->rcq;
+
+	if (qp->sq.flushed) {
+		qp->sq.flushed = false;
+		list_del(&qp->sq_flush);
+	}
+	if (!qp->srq) {
+		if (qp->rq.flushed) {
+			qp->rq.flushed = false;
+			list_del(&qp->rq_flush);
+		}
+	}
+}
+
+void bnxt_qplib_del_flush_qp(struct bnxt_qplib_qp *qp)
+{
+	unsigned long flags;
+
+	bnxt_qplib_acquire_cq_locks(qp, &flags);
+	__clean_cq(qp->scq, (u64)(unsigned long)qp);
+	qp->sq.hwq.prod = 0;
+	qp->sq.hwq.cons = 0;
+	__clean_cq(qp->rcq, (u64)(unsigned long)qp);
+	qp->rq.hwq.prod = 0;
+	qp->rq.hwq.cons = 0;
+
+	__bnxt_qplib_del_flush_qp(qp);
+	bnxt_qplib_release_cq_locks(qp, &flags);
+}
+
+static void bnxt_qpn_cqn_sched_task(struct work_struct *work)
+{
+	struct bnxt_qplib_nq_work *nq_work =
+			container_of(work, struct bnxt_qplib_nq_work, work);
+
+	struct bnxt_qplib_cq *cq = nq_work->cq;
+	struct bnxt_qplib_nq *nq = nq_work->nq;
+
+	if (cq && nq) {
+		spin_lock_bh(&cq->compl_lock);
+		if (atomic_read(&cq->arm_state) && nq->cqn_handler) {
+			dev_dbg(&nq->pdev->dev,
+				"%s:Trigger cq  = %p event nq = %p\n",
+				__func__, cq, nq);
+			nq->cqn_handler(nq, cq);
+		}
+		spin_unlock_bh(&cq->compl_lock);
+	}
+	kfree(nq_work);
+}
 
 static void bnxt_qplib_free_qp_hdr_buf(struct bnxt_qplib_res *res,
 				       struct bnxt_qplib_qp *qp)
@@ -119,6 +281,7 @@ static void bnxt_qplib_service_nq(unsigned long data)
 	struct bnxt_qplib_nq *nq = (struct bnxt_qplib_nq *)data;
 	struct bnxt_qplib_hwq *hwq = &nq->hwq;
 	struct nq_base *nqe, **nq_ptr;
+	struct bnxt_qplib_cq *cq;
 	int num_cqne_processed = 0;
 	u32 sw_cons, raw_cons;
 	u16 type;
@@ -143,15 +306,17 @@ static void bnxt_qplib_service_nq(unsigned long data)
 			q_handle = le32_to_cpu(nqcne->cq_handle_low);
 			q_handle |= (u64)le32_to_cpu(nqcne->cq_handle_high)
 						     << 32;
-			bnxt_qplib_arm_cq_enable((struct bnxt_qplib_cq *)
-						 ((unsigned long)q_handle));
-			if (!nq->cqn_handler(nq, (struct bnxt_qplib_cq *)
-						 ((unsigned long)q_handle)))
+			cq = (struct bnxt_qplib_cq *)(unsigned long)q_handle;
+			bnxt_qplib_arm_cq_enable(cq);
+			spin_lock_bh(&cq->compl_lock);
+			atomic_set(&cq->arm_state, 0);
+			if (!nq->cqn_handler(nq, (cq)))
 				num_cqne_processed++;
 			else
 				dev_warn(&nq->pdev->dev,
 					 "QPLIB: cqn - type 0x%x not handled",
 					 type);
+			spin_unlock_bh(&cq->compl_lock);
 			break;
 		}
 		case NQ_BASE_TYPE_DBQ_EVENT:
@@ -190,12 +355,17 @@ static irqreturn_t bnxt_qplib_nq_irq(int irq, void *dev_instance)
 
 void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq)
 {
+	if (nq->cqn_wq) {
+		destroy_workqueue(nq->cqn_wq);
+		nq->cqn_wq = NULL;
+	}
 	/* Make sure the HW is stopped! */
 	synchronize_irq(nq->vector);
 	tasklet_disable(&nq->worker);
 	tasklet_kill(&nq->worker);
 
 	if (nq->requested) {
+		irq_set_affinity_hint(nq->vector, NULL);
 		free_irq(nq->vector, nq);
 		nq->requested = false;
 	}
@@ -209,14 +379,14 @@ void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq)
 }
 
 int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
-			 int msix_vector, int bar_reg_offset,
+			 int nq_idx, int msix_vector, int bar_reg_offset,
 			 int (*cqn_handler)(struct bnxt_qplib_nq *nq,
 					    struct bnxt_qplib_cq *),
 			 int (*srqn_handler)(struct bnxt_qplib_nq *nq,
 					     void *, u8 event))
 {
 	resource_size_t nq_base;
-	int rc;
+	int rc = -1;
 
 	nq->pdev = pdev;
 	nq->vector = msix_vector;
@@ -227,14 +397,31 @@ int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
 
 	tasklet_init(&nq->worker, bnxt_qplib_service_nq, (unsigned long)nq);
 
+	/* Have a task to schedule CQ notifiers in post send case */
+	nq->cqn_wq  = create_singlethread_workqueue("bnxt_qplib_nq");
+	if (!nq->cqn_wq)
+		goto fail;
+
 	nq->requested = false;
-	rc = request_irq(nq->vector, bnxt_qplib_nq_irq, 0, "bnxt_qplib_nq", nq);
+	memset(nq->name, 0, 32);
+	sprintf(nq->name, "bnxt_qplib_nq-%d", nq_idx);
+	rc = request_irq(nq->vector, bnxt_qplib_nq_irq, 0, nq->name, nq);
 	if (rc) {
 		dev_err(&nq->pdev->dev,
 			"Failed to request IRQ for NQ: %#x", rc);
 		bnxt_qplib_disable_nq(nq);
 		goto fail;
 	}
+
+	cpumask_clear(&nq->mask);
+	cpumask_set_cpu(nq_idx, &nq->mask);
+	rc = irq_set_affinity_hint(nq->vector, &nq->mask);
+	if (rc) {
+		dev_warn(&nq->pdev->dev,
+			 "QPLIB: set affinity failed; vector: %d nq_idx: %d\n",
+			 nq->vector, nq_idx);
+	}
+
 	nq->requested = true;
 	nq->bar_reg = NQ_CONS_PCI_BAR_REGION;
 	nq->bar_reg_off = bar_reg_offset;
@@ -258,8 +445,10 @@ int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
 
 void bnxt_qplib_free_nq(struct bnxt_qplib_nq *nq)
 {
-	if (nq->hwq.max_elements)
+	if (nq->hwq.max_elements) {
 		bnxt_qplib_free_hwq(nq->pdev, &nq->hwq);
+		nq->hwq.max_elements = 0;
+	}
 }
 
 int bnxt_qplib_alloc_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq)
@@ -401,8 +590,8 @@ int bnxt_qplib_create_qp1(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 
 	qp->id = le32_to_cpu(resp.xid);
 	qp->cur_qp_state = CMDQ_MODIFY_QP_NEW_STATE_RESET;
-	sq->flush_in_progress = false;
-	rq->flush_in_progress = false;
+	rcfw->qp_tbl[qp->id].qp_id = qp->id;
+	rcfw->qp_tbl[qp->id].qp_handle = (void *)qp;
 
 	return 0;
 
@@ -615,8 +804,10 @@ int bnxt_qplib_create_qp(struct bnxt_qplib_res *res, struct bnxt_qplib_qp *qp)
 
 	qp->id = le32_to_cpu(resp.xid);
 	qp->cur_qp_state = CMDQ_MODIFY_QP_NEW_STATE_RESET;
-	sq->flush_in_progress = false;
-	rq->flush_in_progress = false;
+	INIT_LIST_HEAD(&qp->sq_flush);
+	INIT_LIST_HEAD(&qp->rq_flush);
+	rcfw->qp_tbl[qp->id].qp_id = qp->id;
+	rcfw->qp_tbl[qp->id].qp_handle = (void *)qp;
 
 	return 0;
 
@@ -963,13 +1154,19 @@ int bnxt_qplib_destroy_qp(struct bnxt_qplib_res *res,
 	u16 cmd_flags = 0;
 	int rc;
 
+	rcfw->qp_tbl[qp->id].qp_id = BNXT_QPLIB_QP_ID_INVALID;
+	rcfw->qp_tbl[qp->id].qp_handle = NULL;
+
 	RCFW_CMD_PREP(req, DESTROY_QP, cmd_flags);
 
 	req.qp_cid = cpu_to_le32(qp->id);
 	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
 					  (void *)&resp, NULL, 0);
-	if (rc)
+	if (rc) {
+		rcfw->qp_tbl[qp->id].qp_id = qp->id;
+		rcfw->qp_tbl[qp->id].qp_handle = qp;
 		return rc;
+	}
 
 	/* Must walk the associated CQs to nullified the QP ptr */
 	spin_lock_irqsave(&qp->scq->hwq.lock, flags);
@@ -1074,14 +1271,21 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
 	struct bnxt_qplib_swq *swq;
 	struct sq_send *hw_sq_send_hdr, **hw_sq_send_ptr;
 	struct sq_sge *hw_sge;
+	struct bnxt_qplib_nq_work *nq_work = NULL;
+	bool sch_handler = false;
 	u32 sw_prod;
 	u8 wqe_size16;
 	int i, rc = 0, data_len = 0, pkt_num = 0;
 	__le32 temp32;
 
 	if (qp->state != CMDQ_MODIFY_QP_NEW_STATE_RTS) {
-		rc = -EINVAL;
-		goto done;
+		if (qp->state == CMDQ_MODIFY_QP_NEW_STATE_ERR) {
+			sch_handler = true;
+			dev_dbg(&sq->hwq.pdev->dev,
+				"%s Error QP. Scheduling for poll_cq\n",
+				__func__);
+			goto queue_err;
+		}
 	}
 
 	if (bnxt_qplib_queue_full(sq)) {
@@ -1301,12 +1505,35 @@ int bnxt_qplib_post_send(struct bnxt_qplib_qp *qp,
 			((swq->next_psn << SQ_PSN_SEARCH_NEXT_PSN_SFT) &
 			 SQ_PSN_SEARCH_NEXT_PSN_MASK));
 	}
-
+queue_err:
+	if (sch_handler) {
+		/* Store the ULP info in the software structures */
+		sw_prod = HWQ_CMP(sq->hwq.prod, &sq->hwq);
+		swq = &sq->swq[sw_prod];
+		swq->wr_id = wqe->wr_id;
+		swq->type = wqe->type;
+		swq->flags = wqe->flags;
+		if (qp->sig_type)
+			swq->flags |= SQ_SEND_FLAGS_SIGNAL_COMP;
+		swq->start_psn = sq->psn & BTH_PSN_MASK;
+	}
 	sq->hwq.prod++;
-
 	qp->wqe_cnt++;
 
 done:
+	if (sch_handler) {
+		nq_work = kzalloc(sizeof(*nq_work), GFP_ATOMIC);
+		if (nq_work) {
+			nq_work->cq = qp->scq;
+			nq_work->nq = qp->scq->nq;
+			INIT_WORK(&nq_work->work, bnxt_qpn_cqn_sched_task);
+			queue_work(qp->scq->nq->cqn_wq, &nq_work->work);
+		} else {
+			dev_err(&sq->hwq.pdev->dev,
+				"QPLIB: FP: Failed to allocate SQ nq_work!");
+			rc = -ENOMEM;
+		}
+	}
 	return rc;
 }
 
@@ -1334,15 +1561,17 @@ int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
 	struct bnxt_qplib_q *rq = &qp->rq;
 	struct rq_wqe *rqe, **rqe_ptr;
 	struct sq_sge *hw_sge;
+	struct bnxt_qplib_nq_work *nq_work = NULL;
+	bool sch_handler = false;
 	u32 sw_prod;
 	int i, rc = 0;
 
 	if (qp->state == CMDQ_MODIFY_QP_NEW_STATE_ERR) {
-		dev_err(&rq->hwq.pdev->dev,
-			"QPLIB: FP: QP (0x%x) is in the 0x%x state",
-			qp->id, qp->state);
-		rc = -EINVAL;
-		goto done;
+		sch_handler = true;
+		dev_dbg(&rq->hwq.pdev->dev,
+			"%s Error QP. Scheduling for poll_cq\n",
+			__func__);
+		goto queue_err;
 	}
 	if (bnxt_qplib_queue_full(rq)) {
 		dev_err(&rq->hwq.pdev->dev,
@@ -1378,7 +1607,27 @@ int bnxt_qplib_post_recv(struct bnxt_qplib_qp *qp,
 	/* Supply the rqe->wr_id index to the wr_id_tbl for now */
 	rqe->wr_id[0] = cpu_to_le32(sw_prod);
 
+queue_err:
+	if (sch_handler) {
+		/* Store the ULP info in the software structures */
+		sw_prod = HWQ_CMP(rq->hwq.prod, &rq->hwq);
+		rq->swq[sw_prod].wr_id = wqe->wr_id;
+	}
+
 	rq->hwq.prod++;
+	if (sch_handler) {
+		nq_work = kzalloc(sizeof(*nq_work), GFP_ATOMIC);
+		if (nq_work) {
+			nq_work->cq = qp->rcq;
+			nq_work->nq = qp->rcq->nq;
+			INIT_WORK(&nq_work->work, bnxt_qpn_cqn_sched_task);
+			queue_work(qp->rcq->nq->cqn_wq, &nq_work->work);
+		} else {
+			dev_err(&rq->hwq.pdev->dev,
+				"QPLIB: FP: Failed to allocate RQ nq_work!");
+			rc = -ENOMEM;
+		}
+	}
 done:
 	return rc;
 }
@@ -1471,6 +1720,9 @@ int bnxt_qplib_create_cq(struct bnxt_qplib_res *res, struct bnxt_qplib_cq *cq)
 	cq->dbr_base = res->dpi_tbl.dbr_bar_reg_iomem;
 	cq->period = BNXT_QPLIB_QUEUE_START_PERIOD;
 	init_waitqueue_head(&cq->waitq);
+	INIT_LIST_HEAD(&cq->sqf_head);
+	INIT_LIST_HEAD(&cq->rqf_head);
+	spin_lock_init(&cq->compl_lock);
 
 	bnxt_qplib_arm_cq_enable(cq);
 	return 0;
@@ -1513,9 +1765,13 @@ static int __flush_sq(struct bnxt_qplib_q *sq, struct bnxt_qplib_qp *qp,
 	while (*budget) {
 		sw_cons = HWQ_CMP(sq->hwq.cons, &sq->hwq);
 		if (sw_cons == sw_prod) {
-			sq->flush_in_progress = false;
 			break;
 		}
+		/* Skip the FENCE WQE completions */
+		if (sq->swq[sw_cons].wr_id == BNXT_QPLIB_FENCE_WRID) {
+			bnxt_qplib_cancel_phantom_processing(qp);
+			goto skip_compl;
+		}
 		memset(cqe, 0, sizeof(*cqe));
 		cqe->status = CQ_REQ_STATUS_WORK_REQUEST_FLUSHED_ERR;
 		cqe->opcode = CQ_BASE_CQE_TYPE_REQ;
@@ -1525,6 +1781,7 @@ static int __flush_sq(struct bnxt_qplib_q *sq, struct bnxt_qplib_qp *qp,
 		cqe->type = sq->swq[sw_cons].type;
 		cqe++;
 		(*budget)--;
+skip_compl:
 		sq->hwq.cons++;
 	}
 	*pcqe = cqe;
@@ -1536,11 +1793,24 @@ static int __flush_sq(struct bnxt_qplib_q *sq, struct bnxt_qplib_qp *qp,
 }
 
 static int __flush_rq(struct bnxt_qplib_q *rq, struct bnxt_qplib_qp *qp,
-		      int opcode, struct bnxt_qplib_cqe **pcqe, int *budget)
+		      struct bnxt_qplib_cqe **pcqe, int *budget)
 {
 	struct bnxt_qplib_cqe *cqe;
 	u32 sw_prod, sw_cons;
 	int rc = 0;
+	int opcode = 0;
+
+	switch (qp->type) {
+	case CMDQ_CREATE_QP1_TYPE_GSI:
+		opcode = CQ_BASE_CQE_TYPE_RES_RAWETH_QP1;
+		break;
+	case CMDQ_CREATE_QP_TYPE_RC:
+		opcode = CQ_BASE_CQE_TYPE_RES_RC;
+		break;
+	case CMDQ_CREATE_QP_TYPE_UD:
+		opcode = CQ_BASE_CQE_TYPE_RES_UD;
+		break;
+	}
 
 	/* Flush the rest of the RQ */
 	sw_prod = HWQ_CMP(rq->hwq.prod, &rq->hwq);
@@ -1567,6 +1837,21 @@ static int __flush_rq(struct bnxt_qplib_q *rq, struct bnxt_qplib_qp *qp,
 	return rc;
 }
 
+void bnxt_qplib_mark_qp_error(void *qp_handle)
+{
+	struct bnxt_qplib_qp *qp = qp_handle;
+
+	if (!qp)
+		return;
+
+	/* Must block new posting of SQ and RQ */
+	qp->state = CMDQ_MODIFY_QP_NEW_STATE_ERR;
+	bnxt_qplib_cancel_phantom_processing(qp);
+
+	/* Add qp to flush list of the CQ */
+	__bnxt_qplib_add_flush_qp(qp);
+}
+
 /* Note: SQE is valid from sw_sq_cons up to cqe_sq_cons (exclusive)
  *       CQE is track from sw_cq_cons to max_element but valid only if VALID=1
  */
@@ -1694,10 +1979,12 @@ static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq,
 			cqe_sq_cons, sq->hwq.max_elements);
 		return -EINVAL;
 	}
-	/* If we were in the middle of flushing the SQ, continue */
-	if (sq->flush_in_progress)
-		goto flush;
 
+	if (qp->sq.flushed) {
+		dev_dbg(&cq->hwq.pdev->dev,
+			"%s: QPLIB: QP in Flush QP = %p\n", __func__, qp);
+		goto done;
+	}
 	/* Require to walk the sq's swq to fabricate CQEs for all previously
 	 * signaled SWQEs due to CQE aggregation from the current sq cons
 	 * to the cqe_sq_cons
@@ -1733,11 +2020,9 @@ static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq,
 				sw_sq_cons, cqe->wr_id, cqe->status);
 			cqe++;
 			(*budget)--;
-			sq->flush_in_progress = true;
-			/* Must block new posting of SQ and RQ */
-			qp->state = CMDQ_MODIFY_QP_NEW_STATE_ERR;
-			sq->condition = false;
-			sq->single = false;
+			bnxt_qplib_lock_buddy_cq(qp, cq);
+			bnxt_qplib_mark_qp_error(qp);
+			bnxt_qplib_unlock_buddy_cq(qp, cq);
 		} else {
 			if (swq->flags & SQ_SEND_FLAGS_SIGNAL_COMP) {
 				/* Before we complete, do WA 9060 */
@@ -1768,15 +2053,6 @@ static int bnxt_qplib_cq_process_req(struct bnxt_qplib_cq *cq,
 	 * the WC for this CQE
 	 */
 	sq->single = false;
-	if (!sq->flush_in_progress)
-		goto done;
-flush:
-	/* Require to walk the sq's swq to fabricate CQEs for all
-	 * previously posted SWQEs due to the error CQE received
-	 */
-	rc = __flush_sq(sq, qp, pcqe, budget);
-	if (!rc)
-		sq->flush_in_progress = false;
 done:
 	return rc;
 }
@@ -1798,6 +2074,12 @@ static int bnxt_qplib_cq_process_res_rc(struct bnxt_qplib_cq *cq,
 		dev_err(&cq->hwq.pdev->dev, "QPLIB: process_cq RC qp is NULL");
 		return -EINVAL;
 	}
+	if (qp->rq.flushed) {
+		dev_dbg(&cq->hwq.pdev->dev,
+			"%s: QPLIB: QP in Flush QP = %p\n", __func__, qp);
+		goto done;
+	}
+
 	cqe = *pcqe;
 	cqe->opcode = hwcqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK;
 	cqe->length = le32_to_cpu(hwcqe->length);
@@ -1817,8 +2099,6 @@ static int bnxt_qplib_cq_process_res_rc(struct bnxt_qplib_cq *cq,
 			wr_id_idx, rq->hwq.max_elements);
 		return -EINVAL;
 	}
-	if (rq->flush_in_progress)
-		goto flush_rq;
 
 	cqe->wr_id = rq->swq[wr_id_idx].wr_id;
 	cqe++;
@@ -1827,12 +2107,13 @@ static int bnxt_qplib_cq_process_res_rc(struct bnxt_qplib_cq *cq,
 	*pcqe = cqe;
 
 	if (hwcqe->status != CQ_RES_RC_STATUS_OK) {
-		rq->flush_in_progress = true;
-flush_rq:
-		rc = __flush_rq(rq, qp, CQ_BASE_CQE_TYPE_RES_RC, pcqe, budget);
-		if (!rc)
-			rq->flush_in_progress = false;
+		/* Add qp to flush list of the CQ */
+		bnxt_qplib_lock_buddy_cq(qp, cq);
+		__bnxt_qplib_add_flush_qp(qp);
+		bnxt_qplib_unlock_buddy_cq(qp, cq);
 	}
+
+done:
 	return rc;
 }
 
@@ -1853,6 +2134,11 @@ static int bnxt_qplib_cq_process_res_ud(struct bnxt_qplib_cq *cq,
 		dev_err(&cq->hwq.pdev->dev, "QPLIB: process_cq UD qp is NULL");
 		return -EINVAL;
 	}
+	if (qp->rq.flushed) {
+		dev_dbg(&cq->hwq.pdev->dev,
+			"%s: QPLIB: QP in Flush QP = %p\n", __func__, qp);
+		goto done;
+	}
 	cqe = *pcqe;
 	cqe->opcode = hwcqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK;
 	cqe->length = le32_to_cpu(hwcqe->length);
@@ -1876,8 +2162,6 @@ static int bnxt_qplib_cq_process_res_ud(struct bnxt_qplib_cq *cq,
 			wr_id_idx, rq->hwq.max_elements);
 		return -EINVAL;
 	}
-	if (rq->flush_in_progress)
-		goto flush_rq;
 
 	cqe->wr_id = rq->swq[wr_id_idx].wr_id;
 	cqe++;
@@ -1886,12 +2170,12 @@ static int bnxt_qplib_cq_process_res_ud(struct bnxt_qplib_cq *cq,
 	*pcqe = cqe;
 
 	if (hwcqe->status != CQ_RES_RC_STATUS_OK) {
-		rq->flush_in_progress = true;
-flush_rq:
-		rc = __flush_rq(rq, qp, CQ_BASE_CQE_TYPE_RES_UD, pcqe, budget);
-		if (!rc)
-			rq->flush_in_progress = false;
+		/* Add qp to flush list of the CQ */
+		bnxt_qplib_lock_buddy_cq(qp, cq);
+		__bnxt_qplib_add_flush_qp(qp);
+		bnxt_qplib_unlock_buddy_cq(qp, cq);
 	}
+done:
 	return rc;
 }
 
@@ -1932,6 +2216,11 @@ static int bnxt_qplib_cq_process_res_raweth_qp1(struct bnxt_qplib_cq *cq,
 			"QPLIB: process_cq Raw/QP1 qp is NULL");
 		return -EINVAL;
 	}
+	if (qp->rq.flushed) {
+		dev_dbg(&cq->hwq.pdev->dev,
+			"%s: QPLIB: QP in Flush QP = %p\n", __func__, qp);
+		goto done;
+	}
 	cqe = *pcqe;
 	cqe->opcode = hwcqe->cqe_type_toggle & CQ_BASE_CQE_TYPE_MASK;
 	cqe->flags = le16_to_cpu(hwcqe->flags);
@@ -1960,8 +2249,6 @@ static int bnxt_qplib_cq_process_res_raweth_qp1(struct bnxt_qplib_cq *cq,
 			wr_id_idx, rq->hwq.max_elements);
 		return -EINVAL;
 	}
-	if (rq->flush_in_progress)
-		goto flush_rq;
 
 	cqe->wr_id = rq->swq[wr_id_idx].wr_id;
 	cqe++;
@@ -1970,13 +2257,13 @@ static int bnxt_qplib_cq_process_res_raweth_qp1(struct bnxt_qplib_cq *cq,
 	*pcqe = cqe;
 
 	if (hwcqe->status != CQ_RES_RC_STATUS_OK) {
-		rq->flush_in_progress = true;
-flush_rq:
-		rc = __flush_rq(rq, qp, CQ_BASE_CQE_TYPE_RES_RAWETH_QP1, pcqe,
-				budget);
-		if (!rc)
-			rq->flush_in_progress = false;
+		/* Add qp to flush list of the CQ */
+		bnxt_qplib_lock_buddy_cq(qp, cq);
+		__bnxt_qplib_add_flush_qp(qp);
+		bnxt_qplib_unlock_buddy_cq(qp, cq);
 	}
+
+done:
 	return rc;
 }
 
@@ -1990,7 +2277,6 @@ static int bnxt_qplib_cq_process_terminal(struct bnxt_qplib_cq *cq,
 	struct bnxt_qplib_cqe *cqe;
 	u32 sw_cons = 0, cqe_cons;
 	int rc = 0;
-	u8 opcode = 0;
 
 	/* Check the Status */
 	if (hwcqe->status != CQ_TERMINAL_STATUS_OK)
@@ -2005,6 +2291,7 @@ static int bnxt_qplib_cq_process_terminal(struct bnxt_qplib_cq *cq,
 			"QPLIB: FP: CQ Process terminal qp is NULL");
 		return -EINVAL;
 	}
+
 	/* Must block new posting of SQ and RQ */
 	qp->state = CMDQ_MODIFY_QP_NEW_STATE_ERR;
 
@@ -2023,9 +2310,12 @@ static int bnxt_qplib_cq_process_terminal(struct bnxt_qplib_cq *cq,
 			cqe_cons, sq->hwq.max_elements);
 		goto do_rq;
 	}
-	/* If we were in the middle of flushing, continue */
-	if (sq->flush_in_progress)
-		goto flush_sq;
+
+	if (qp->sq.flushed) {
+		dev_dbg(&cq->hwq.pdev->dev,
+			"%s: QPLIB: QP in Flush QP = %p\n", __func__, qp);
+		goto sq_done;
+	}
 
 	/* Terminal CQE can also include aggregated successful CQEs prior.
 	 * So we must complete all CQEs from the current sq's cons to the
@@ -2055,11 +2345,6 @@ static int bnxt_qplib_cq_process_terminal(struct bnxt_qplib_cq *cq,
 		rc = -EAGAIN;
 		goto sq_done;
 	}
-	sq->flush_in_progress = true;
-flush_sq:
-	rc = __flush_sq(sq, qp, pcqe, budget);
-	if (!rc)
-		sq->flush_in_progress = false;
 sq_done:
 	if (rc)
 		return rc;
@@ -2075,26 +2360,23 @@ static int bnxt_qplib_cq_process_terminal(struct bnxt_qplib_cq *cq,
 			cqe_cons, rq->hwq.max_elements);
 		goto done;
 	}
+
+	if (qp->rq.flushed) {
+		dev_dbg(&cq->hwq.pdev->dev,
+			"%s: QPLIB: QP in Flush QP = %p\n", __func__, qp);
+		rc = 0;
+		goto done;
+	}
+
 	/* Terminal CQE requires all posted RQEs to complete with FLUSHED_ERR
 	 * from the current rq->cons to the rq->prod regardless what the
 	 * rq->cons the terminal CQE indicates
 	 */
-	rq->flush_in_progress = true;
-	switch (qp->type) {
-	case CMDQ_CREATE_QP1_TYPE_GSI:
-		opcode = CQ_BASE_CQE_TYPE_RES_RAWETH_QP1;
-		break;
-	case CMDQ_CREATE_QP_TYPE_RC:
-		opcode = CQ_BASE_CQE_TYPE_RES_RC;
-		break;
-	case CMDQ_CREATE_QP_TYPE_UD:
-		opcode = CQ_BASE_CQE_TYPE_RES_UD;
-		break;
-	}
 
-	rc = __flush_rq(rq, qp, opcode, pcqe, budget);
-	if (!rc)
-		rq->flush_in_progress = false;
+	/* Add qp to flush list of the CQ */
+	bnxt_qplib_lock_buddy_cq(qp, cq);
+	__bnxt_qplib_add_flush_qp(qp);
+	bnxt_qplib_unlock_buddy_cq(qp, cq);
 done:
 	return rc;
 }
@@ -2115,6 +2397,33 @@ static int bnxt_qplib_cq_process_cutoff(struct bnxt_qplib_cq *cq,
 	return 0;
 }
 
+int bnxt_qplib_process_flush_list(struct bnxt_qplib_cq *cq,
+				  struct bnxt_qplib_cqe *cqe,
+				  int num_cqes)
+{
+	struct bnxt_qplib_qp *qp = NULL;
+	u32 budget = num_cqes;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cq->hwq.lock, flags);
+	list_for_each_entry(qp, &cq->sqf_head, sq_flush) {
+		dev_dbg(&cq->hwq.pdev->dev,
+			"QPLIB: FP: Flushing SQ QP= %p",
+			qp);
+		__flush_sq(&qp->sq, qp, &cqe, &budget);
+	}
+
+	list_for_each_entry(qp, &cq->rqf_head, rq_flush) {
+		dev_dbg(&cq->hwq.pdev->dev,
+			"QPLIB: FP: Flushing RQ QP= %p",
+			qp);
+		__flush_rq(&qp->rq, qp, &cqe, &budget);
+	}
+	spin_unlock_irqrestore(&cq->hwq.lock, flags);
+
+	return num_cqes - budget;
+}
+
 int bnxt_qplib_poll_cq(struct bnxt_qplib_cq *cq, struct bnxt_qplib_cqe *cqe,
 		       int num_cqes, struct bnxt_qplib_qp **lib_qp)
 {
@@ -2205,6 +2514,7 @@ void bnxt_qplib_req_notify_cq(struct bnxt_qplib_cq *cq, u32 arm_type)
 	spin_lock_irqsave(&cq->hwq.lock, flags);
 	if (arm_type)
 		bnxt_qplib_arm_cq(cq, arm_type);
-
+	/* Using cq->arm_state variable to track whether to issue cq handler */
+	atomic_set(&cq->arm_state, 1);
 	spin_unlock_irqrestore(&cq->hwq.lock, flags);
 }
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_fp.h b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
index 19176e0..8ead70c 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_fp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_fp.h
@@ -220,19 +220,20 @@ struct bnxt_qplib_q {
 	u16				q_full_delta;
 	u16				max_sge;
 	u32				psn;
-	bool				flush_in_progress;
 	bool				condition;
 	bool				single;
 	bool				send_phantom;
 	u32				phantom_wqe_cnt;
 	u32				phantom_cqe_cnt;
 	u32				next_cq_cons;
+	bool				flushed;
 };
 
 struct bnxt_qplib_qp {
 	struct bnxt_qplib_pd		*pd;
 	struct bnxt_qplib_dpi		*dpi;
 	u64				qp_handle;
+#define        BNXT_QPLIB_QP_ID_INVALID        0xFFFFFFFF
 	u32				id;
 	u8				type;
 	u8				sig_type;
@@ -296,6 +297,8 @@ struct bnxt_qplib_qp {
 	dma_addr_t			sq_hdr_buf_map;
 	void				*rq_hdr_buf;
 	dma_addr_t			rq_hdr_buf_map;
+	struct list_head		sq_flush;
+	struct list_head		rq_flush;
 };
 
 #define BNXT_QPLIB_MAX_CQE_ENTRY_SIZE	sizeof(struct cq_base)
@@ -351,6 +354,7 @@ struct bnxt_qplib_cq {
 	u16				period;
 	struct bnxt_qplib_hwq		hwq;
 	u32				cnq_hw_ring_id;
+	struct bnxt_qplib_nq		*nq;
 	bool				resize_in_progress;
 	struct scatterlist		*sghead;
 	u32				nmap;
@@ -360,6 +364,9 @@ struct bnxt_qplib_cq {
 	unsigned long			flags;
 #define CQ_FLAGS_RESIZE_IN_PROG		1
 	wait_queue_head_t		waitq;
+	struct list_head		sqf_head, rqf_head;
+	atomic_t			arm_state;
+	spinlock_t			compl_lock; /* synch CQ handlers */
 };
 
 #define BNXT_QPLIB_MAX_IRRQE_ENTRY_SIZE	sizeof(struct xrrq_irrq)
@@ -400,6 +407,7 @@ struct bnxt_qplib_nq {
 	struct pci_dev			*pdev;
 
 	int				vector;
+	cpumask_t			mask;
 	int				budget;
 	bool				requested;
 	struct tasklet_struct		worker;
@@ -417,11 +425,19 @@ struct bnxt_qplib_nq {
 						(struct bnxt_qplib_nq *nq,
 						 void *srq,
 						 u8 event);
+	struct workqueue_struct         *cqn_wq;
+	char                            name[32];
+};
+
+struct bnxt_qplib_nq_work {
+	struct work_struct      work;
+	struct bnxt_qplib_nq    *nq;
+	struct bnxt_qplib_cq    *cq;
 };
 
 void bnxt_qplib_disable_nq(struct bnxt_qplib_nq *nq);
 int bnxt_qplib_enable_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq,
-			 int msix_vector, int bar_reg_offset,
+			 int nq_idx, int msix_vector, int bar_reg_offset,
 			 int (*cqn_handler)(struct bnxt_qplib_nq *nq,
 					    struct bnxt_qplib_cq *cq),
 			 int (*srqn_handler)(struct bnxt_qplib_nq *nq,
@@ -453,4 +469,13 @@ bool bnxt_qplib_is_cq_empty(struct bnxt_qplib_cq *cq);
 void bnxt_qplib_req_notify_cq(struct bnxt_qplib_cq *cq, u32 arm_type);
 void bnxt_qplib_free_nq(struct bnxt_qplib_nq *nq);
 int bnxt_qplib_alloc_nq(struct pci_dev *pdev, struct bnxt_qplib_nq *nq);
+void bnxt_qplib_add_flush_qp(struct bnxt_qplib_qp *qp);
+void bnxt_qplib_del_flush_qp(struct bnxt_qplib_qp *qp);
+void bnxt_qplib_acquire_cq_locks(struct bnxt_qplib_qp *qp,
+				 unsigned long *flags);
+void bnxt_qplib_release_cq_locks(struct bnxt_qplib_qp *qp,
+				 unsigned long *flags);
+int bnxt_qplib_process_flush_list(struct bnxt_qplib_cq *cq,
+				  struct bnxt_qplib_cqe *cqe,
+				  int num_cqes);
 #endif /* __BNXT_QPLIB_FP_H__ */
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
index 16e4275..391bb70 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.c
@@ -44,6 +44,9 @@
 #include "roce_hsi.h"
 #include "qplib_res.h"
 #include "qplib_rcfw.h"
+#include "qplib_sp.h"
+#include "qplib_fp.h"
+
 static void bnxt_qplib_service_creq(unsigned long data);
 
 /* Hardware communication channel */
@@ -279,16 +282,29 @@ static int bnxt_qplib_process_qp_event(struct bnxt_qplib_rcfw *rcfw,
 				       struct creq_qp_event *qp_event)
 {
 	struct bnxt_qplib_hwq *cmdq = &rcfw->cmdq;
+	struct creq_qp_error_notification *err_event;
 	struct bnxt_qplib_crsq *crsqe;
 	unsigned long flags;
+	struct bnxt_qplib_qp *qp;
 	u16 cbit, blocked = 0;
 	u16 cookie;
 	__le16  mcookie;
+	u32 qp_id;
 
 	switch (qp_event->event) {
 	case CREQ_QP_EVENT_EVENT_QP_ERROR_NOTIFICATION:
+		err_event = (struct creq_qp_error_notification *)qp_event;
+		qp_id = le32_to_cpu(err_event->xid);
+		qp = rcfw->qp_tbl[qp_id].qp_handle;
 		dev_dbg(&rcfw->pdev->dev,
 			"QPLIB: Received QP error notification");
+		dev_dbg(&rcfw->pdev->dev,
+			"QPLIB: qpid 0x%x, req_err=0x%x, resp_err=0x%x\n",
+			qp_id, err_event->req_err_state_reason,
+			err_event->res_err_state_reason);
+		bnxt_qplib_acquire_cq_locks(qp, &flags);
+		bnxt_qplib_mark_qp_error(qp);
+		bnxt_qplib_release_cq_locks(qp, &flags);
 		break;
 	default:
 		/* Command Response */
@@ -507,6 +523,7 @@ int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw,
 
 void bnxt_qplib_free_rcfw_channel(struct bnxt_qplib_rcfw *rcfw)
 {
+	kfree(rcfw->qp_tbl);
 	kfree(rcfw->crsqe_tbl);
 	bnxt_qplib_free_hwq(rcfw->pdev, &rcfw->cmdq);
 	bnxt_qplib_free_hwq(rcfw->pdev, &rcfw->creq);
@@ -514,7 +531,8 @@ void bnxt_qplib_free_rcfw_channel(struct bnxt_qplib_rcfw *rcfw)
 }
 
 int bnxt_qplib_alloc_rcfw_channel(struct pci_dev *pdev,
-				  struct bnxt_qplib_rcfw *rcfw)
+				  struct bnxt_qplib_rcfw *rcfw,
+				  int qp_tbl_sz)
 {
 	rcfw->pdev = pdev;
 	rcfw->creq.max_elements = BNXT_QPLIB_CREQE_MAX_CNT;
@@ -541,6 +559,12 @@ int bnxt_qplib_alloc_rcfw_channel(struct pci_dev *pdev,
 	if (!rcfw->crsqe_tbl)
 		goto fail;
 
+	rcfw->qp_tbl_size = qp_tbl_sz;
+	rcfw->qp_tbl = kcalloc(qp_tbl_sz, sizeof(struct bnxt_qplib_qp_node),
+			       GFP_KERNEL);
+	if (!rcfw->qp_tbl)
+		goto fail;
+
 	return 0;
 
 fail:
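Note on the QP table added in this file: the error-notification path indexes `qp_tbl` with the id taken from the event and reads back the stored handle. A minimal userspace sketch of such an id-indexed table follows. The names `qp_tbl_alloc`, `qp_tbl_insert`, `qp_tbl_lookup` and `qp_tbl_free`, and the bounds check in the lookup, are hypothetical additions for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Mirrors the shape of struct bnxt_qplib_qp_node from the patch. */
struct qp_node {
	uint32_t qp_id;   /* QP id */
	void *qp_handle;  /* opaque pointer to the owning QP object */
};

struct qp_table {
	int size;
	struct qp_node *tbl;
};

/* Hypothetical helper: allocate a zeroed table, as kcalloc() would. */
static int qp_tbl_alloc(struct qp_table *t, int size)
{
	assert(t != NULL && size > 0);
	t->tbl = calloc((size_t)size, sizeof(*t->tbl));
	if (!t->tbl)
		return -1;
	t->size = size;
	return 0;
}

/* Hypothetical helper: record a handle under its id. */
static int qp_tbl_insert(struct qp_table *t, uint32_t id, void *handle)
{
	if (id >= (uint32_t)t->size)
		return -1;
	t->tbl[id].qp_id = id;
	t->tbl[id].qp_handle = handle;
	return 0;
}

/* Hypothetical helper: NULL when the id is out of range or unset. */
static void *qp_tbl_lookup(const struct qp_table *t, uint32_t id)
{
	if (id >= (uint32_t)t->size)
		return NULL;
	return t->tbl[id].qp_handle;
}

/* Hypothetical helper: release the table and reset the bookkeeping. */
static void qp_tbl_free(struct qp_table *t)
{
	free(t->tbl);
	t->tbl = NULL;
	t->size = 0;
}
```

A zeroed table means an id that was never inserted reads back as a NULL handle, so a caller of this sketch can tell "unknown QP" apart from a valid entry.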
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
index 09ce121..0ed312f 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_rcfw.h
@@ -148,6 +148,11 @@ struct bnxt_qplib_rcfw_sbuf {
 	u32 size;
 };
 
+struct bnxt_qplib_qp_node {
+	u32 qp_id;              /* QP id */
+	void *qp_handle;        /* ptr to qplib_qp */
+};
+
 /* RCFW Communication Channels */
 struct bnxt_qplib_rcfw {
 	struct pci_dev		*pdev;
@@ -181,11 +186,13 @@ struct bnxt_qplib_rcfw {
 	/* Actual Cmd and Resp Queues */
 	struct bnxt_qplib_hwq	cmdq;
 	struct bnxt_qplib_crsq	*crsqe_tbl;
+	int qp_tbl_size;
+	struct bnxt_qplib_qp_node *qp_tbl;
 };
 
 void bnxt_qplib_free_rcfw_channel(struct bnxt_qplib_rcfw *rcfw);
 int bnxt_qplib_alloc_rcfw_channel(struct pci_dev *pdev,
-				  struct bnxt_qplib_rcfw *rcfw);
+				  struct bnxt_qplib_rcfw *rcfw, int qp_tbl_sz);
 void bnxt_qplib_disable_rcfw_channel(struct bnxt_qplib_rcfw *rcfw);
 int bnxt_qplib_enable_rcfw_channel(struct pci_dev *pdev,
 				   struct bnxt_qplib_rcfw *rcfw,
@@ -207,4 +214,5 @@ int bnxt_qplib_rcfw_send_message(struct bnxt_qplib_rcfw *rcfw,
 int bnxt_qplib_deinit_rcfw(struct bnxt_qplib_rcfw *rcfw);
 int bnxt_qplib_init_rcfw(struct bnxt_qplib_rcfw *rcfw,
 			 struct bnxt_qplib_ctx *ctx, int is_virtfn);
+void bnxt_qplib_mark_qp_error(void *qp_handle);
 #endif /* __BNXT_QPLIB_RCFW_H__ */
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.c b/drivers/infiniband/hw/bnxt_re/qplib_res.c
index 62447b3..4e10170 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.c
@@ -468,9 +468,11 @@ static void bnxt_qplib_free_sgid_tbl(struct bnxt_qplib_res *res,
 	kfree(sgid_tbl->tbl);
 	kfree(sgid_tbl->hw_id);
 	kfree(sgid_tbl->ctx);
+	kfree(sgid_tbl->vlan);
 	sgid_tbl->tbl = NULL;
 	sgid_tbl->hw_id = NULL;
 	sgid_tbl->ctx = NULL;
+	sgid_tbl->vlan = NULL;
 	sgid_tbl->max = 0;
 	sgid_tbl->active = 0;
 }
@@ -491,8 +493,15 @@ static int bnxt_qplib_alloc_sgid_tbl(struct bnxt_qplib_res *res,
 	if (!sgid_tbl->ctx)
 		goto out_free2;
 
+	sgid_tbl->vlan = kcalloc(max, sizeof(u8), GFP_KERNEL);
+	if (!sgid_tbl->vlan)
+		goto out_free3;
+
 	sgid_tbl->max = max;
 	return 0;
+out_free3:
+	kfree(sgid_tbl->ctx);
+	sgid_tbl->ctx = NULL;
 out_free2:
 	kfree(sgid_tbl->hw_id);
 	sgid_tbl->hw_id = NULL;
@@ -514,6 +523,7 @@ static void bnxt_qplib_cleanup_sgid_tbl(struct bnxt_qplib_res *res,
 	}
 	memset(sgid_tbl->tbl, 0, sizeof(struct bnxt_qplib_gid) * sgid_tbl->max);
 	memset(sgid_tbl->hw_id, -1, sizeof(u16) * sgid_tbl->max);
+	memset(sgid_tbl->vlan, 0, sizeof(u8) * sgid_tbl->max);
 	sgid_tbl->active = 0;
 }
 
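Note on the allocation path changed in this file: each new array gets its own unwind label, so a failure frees only what was already allocated, in reverse order. The following is a minimal userspace sketch of that pattern, assuming `calloc`/`free` in place of `kcalloc`/`kfree`. `struct sgid_tbl_sketch` and both functions are hypothetical stand-ins, not the driver's code.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the SGID table: three parallel arrays. */
struct sgid_tbl_sketch {
	uint16_t *hw_id;
	void **ctx;
	uint8_t *vlan;
	uint16_t max;
};

static int sgid_tbl_alloc(struct sgid_tbl_sketch *t, uint16_t max)
{
	assert(t != NULL && max > 0);

	t->hw_id = calloc(max, sizeof(*t->hw_id));
	if (!t->hw_id)
		goto out;

	t->ctx = calloc(max, sizeof(*t->ctx));
	if (!t->ctx)
		goto out_free_hw_id;

	t->vlan = calloc(max, sizeof(*t->vlan));
	if (!t->vlan)
		goto out_free_ctx;

	t->max = max;
	return 0;

	/* Labels fall through, releasing resources in reverse order. */
out_free_ctx:
	free(t->ctx);
	t->ctx = NULL;
out_free_hw_id:
	free(t->hw_id);
	t->hw_id = NULL;
out:
	return -1;
}

static void sgid_tbl_free(struct sgid_tbl_sketch *t)
{
	free(t->hw_id);
	free(t->ctx);
	free(t->vlan);
	t->hw_id = NULL;
	t->ctx = NULL;
	t->vlan = NULL;
	t->max = 0;
}
```

Resetting each pointer to NULL after freeing keeps a later call to the free routine harmless, which appears to be the intent of the matching NULL assignments in the patch.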
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_res.h b/drivers/infiniband/hw/bnxt_re/qplib_res.h
index 2e485550..e872075 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_res.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_res.h
@@ -116,6 +116,7 @@ struct bnxt_qplib_sgid_tbl {
 	u16				max;
 	u16				active;
 	void				*ctx;
+	u8				*vlan;
 };
 
 struct bnxt_qplib_pkey_tbl {
@@ -188,6 +189,7 @@ struct bnxt_qplib_res {
 	struct bnxt_qplib_sgid_tbl	sgid_tbl;
 	struct bnxt_qplib_pkey_tbl	pkey_tbl;
 	struct bnxt_qplib_dpi_tbl	dpi_tbl;
+	bool				prio;
 };
 
 #define to_bnxt_qplib(ptr, type, member)	\
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.c b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
index ef91ab7..e277e54 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.c
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.c
@@ -213,6 +213,7 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	}
 	memcpy(&sgid_tbl->tbl[index], &bnxt_qplib_gid_zero,
 	       sizeof(bnxt_qplib_gid_zero));
+	sgid_tbl->vlan[index] = 0;
 	sgid_tbl->active--;
 	dev_dbg(&res->pdev->dev,
 		"QPLIB: SGID deleted hw_id[0x%x] = 0x%x active = 0x%x",
@@ -265,28 +266,32 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 		struct cmdq_add_gid req;
 		struct creq_add_gid_resp resp;
 		u16 cmd_flags = 0;
-		u32 temp32[4];
-		u16 temp16[3];
 		int rc;
 
 		RCFW_CMD_PREP(req, ADD_GID, cmd_flags);
 
-		memcpy(temp32, gid->data, sizeof(struct bnxt_qplib_gid));
-		req.gid[0] = cpu_to_be32(temp32[3]);
-		req.gid[1] = cpu_to_be32(temp32[2]);
-		req.gid[2] = cpu_to_be32(temp32[1]);
-		req.gid[3] = cpu_to_be32(temp32[0]);
-		if (vlan_id != 0xFFFF)
-			req.vlan = cpu_to_le16((vlan_id &
-					CMDQ_ADD_GID_VLAN_VLAN_ID_MASK) |
-					CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
-					CMDQ_ADD_GID_VLAN_VLAN_EN);
+		req.gid[0] = cpu_to_be32(((u32 *)gid->data)[3]);
+		req.gid[1] = cpu_to_be32(((u32 *)gid->data)[2]);
+		req.gid[2] = cpu_to_be32(((u32 *)gid->data)[1]);
+		req.gid[3] = cpu_to_be32(((u32 *)gid->data)[0]);
+		/*
+		 * driver should ensure that all RoCE traffic is always VLAN
+		 * tagged if RoCE traffic is running on non-zero VLAN ID or
+		 * RoCE traffic is running on non-zero Priority.
+		 */
+		if ((vlan_id != 0xFFFF) || res->prio) {
+			if (vlan_id != 0xFFFF)
+				req.vlan = cpu_to_le16
+				(vlan_id & CMDQ_ADD_GID_VLAN_VLAN_ID_MASK);
+			req.vlan |= cpu_to_le16
+					(CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
+					 CMDQ_ADD_GID_VLAN_VLAN_EN);
+		}
 
 		/* MAC in network format */
-		memcpy(temp16, smac, 6);
-		req.src_mac[0] = cpu_to_be16(temp16[0]);
-		req.src_mac[1] = cpu_to_be16(temp16[1]);
-		req.src_mac[2] = cpu_to_be16(temp16[2]);
+		req.src_mac[0] = cpu_to_be16(((u16 *)smac)[0]);
+		req.src_mac[1] = cpu_to_be16(((u16 *)smac)[1]);
+		req.src_mac[2] = cpu_to_be16(((u16 *)smac)[2]);
 
 		rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
 						  (void *)&resp, NULL, 0);
@@ -297,6 +302,9 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	/* Add GID to the sgid_tbl */
 	memcpy(&sgid_tbl->tbl[free_idx], gid, sizeof(*gid));
 	sgid_tbl->active++;
+	if (vlan_id != 0xFFFF)
+		sgid_tbl->vlan[free_idx] = 1;
+
 	dev_dbg(&res->pdev->dev,
 		"QPLIB: SGID added hw_id[0x%x] = 0x%x active = 0x%x",
 		 free_idx, sgid_tbl->hw_id[free_idx], sgid_tbl->active);
@@ -306,6 +314,43 @@ int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 	return 0;
 }
 
+int bnxt_qplib_update_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
+			   struct bnxt_qplib_gid *gid, u16 gid_idx,
+			   u8 *smac)
+{
+	struct bnxt_qplib_res *res = to_bnxt_qplib(sgid_tbl,
+						   struct bnxt_qplib_res,
+						   sgid_tbl);
+	struct bnxt_qplib_rcfw *rcfw = res->rcfw;
+	struct creq_modify_gid_resp resp;
+	struct cmdq_modify_gid req;
+	int rc;
+	u16 cmd_flags = 0;
+
+	RCFW_CMD_PREP(req, MODIFY_GID, cmd_flags);
+
+	req.gid[0] = cpu_to_be32(((u32 *)gid->data)[3]);
+	req.gid[1] = cpu_to_be32(((u32 *)gid->data)[2]);
+	req.gid[2] = cpu_to_be32(((u32 *)gid->data)[1]);
+	req.gid[3] = cpu_to_be32(((u32 *)gid->data)[0]);
+	if (res->prio) {
+		req.vlan |= cpu_to_le16
+			(CMDQ_ADD_GID_VLAN_TPID_TPID_8100 |
+			 CMDQ_ADD_GID_VLAN_VLAN_EN);
+	}
+
+	/* MAC in network format */
+	req.src_mac[0] = cpu_to_be16(((u16 *)smac)[0]);
+	req.src_mac[1] = cpu_to_be16(((u16 *)smac)[1]);
+	req.src_mac[2] = cpu_to_be16(((u16 *)smac)[2]);
+
+	req.gid_index = cpu_to_le16(gid_idx);
+
+	rc = bnxt_qplib_rcfw_send_message(rcfw, (void *)&req,
+					  (void *)&resp, NULL, 0);
+	return rc;
+}
+
 /* pkeys */
 int bnxt_qplib_get_pkey(struct bnxt_qplib_res *res,
 			struct bnxt_qplib_pkey_tbl *pkey_tbl, u16 index,
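Note on the VLAN handling in bnxt_qplib_add_sgid() above: the tag is enabled when either a real VLAN ID is supplied or the priority flag is set, and the ID bits are filled in only in the first case. The decision can be sketched in userspace as below. The `SKETCH_*` constants are illustrative values assumed for this example (the authoritative definitions are the `CMDQ_ADD_GID_VLAN_*` macros in roce_hsi.h), `build_vlan_field` is a hypothetical name, and the result is in host byte order, whereas the driver converts with cpu_to_le16().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed, illustrative values; see roce_hsi.h for the real macros. */
#define SKETCH_NO_VLAN         0xFFFFu
#define SKETCH_VLAN_ID_MASK    0x0FFFu
#define SKETCH_VLAN_TPID_8100  (0x1u << 12)
#define SKETCH_VLAN_EN         0x8000u

/* Hypothetical helper: compute the VLAN field in host byte order. */
static uint16_t build_vlan_field(uint16_t vlan_id, bool prio)
{
	uint16_t vlan = 0;

	/* Sanity check: the flag bits must not overlap the ID bits. */
	assert((SKETCH_VLAN_ID_MASK &
		(SKETCH_VLAN_TPID_8100 | SKETCH_VLAN_EN)) == 0);

	if (vlan_id != SKETCH_NO_VLAN || prio) {
		/* Only a real VLAN ID contributes ID bits. */
		if (vlan_id != SKETCH_NO_VLAN)
			vlan = vlan_id & SKETCH_VLAN_ID_MASK;
		/* Tagging is enabled in both cases. */
		vlan |= SKETCH_VLAN_TPID_8100 | SKETCH_VLAN_EN;
	}
	return vlan;
}
```

With these assumed values, a priority-only entry yields a tag with VLAN ID 0, which matches the patch comment about keeping RoCE traffic tagged whenever a non-zero priority is in use.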
diff --git a/drivers/infiniband/hw/bnxt_re/qplib_sp.h b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
index 2ce7e2a..1132258 100644
--- a/drivers/infiniband/hw/bnxt_re/qplib_sp.h
+++ b/drivers/infiniband/hw/bnxt_re/qplib_sp.h
@@ -135,6 +135,8 @@ int bnxt_qplib_del_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 int bnxt_qplib_add_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
 			struct bnxt_qplib_gid *gid, u8 *mac, u16 vlan_id,
 			bool update, u32 *index);
+int bnxt_qplib_update_sgid(struct bnxt_qplib_sgid_tbl *sgid_tbl,
+			   struct bnxt_qplib_gid *gid, u16 gid_idx, u8 *smac);
 int bnxt_qplib_get_pkey(struct bnxt_qplib_res *res,
 			struct bnxt_qplib_pkey_tbl *pkey_tbl, u16 index,
 			u16 *pkey);
diff --git a/drivers/infiniband/hw/bnxt_re/roce_hsi.h b/drivers/infiniband/hw/bnxt_re/roce_hsi.h
index fc23477..eeb55b2 100644
--- a/drivers/infiniband/hw/bnxt_re/roce_hsi.h
+++ b/drivers/infiniband/hw/bnxt_re/roce_hsi.h
@@ -1473,8 +1473,8 @@ struct cmdq_modify_gid {
 	u8 resp_size;
 	u8 reserved8;
 	__le64 resp_addr;
-	__le32 gid[4];
-	__le16 src_mac[3];
+	__be32 gid[4];
+	__be16 src_mac[3];
 	__le16 vlan;
 	#define CMDQ_MODIFY_GID_VLAN_VLAN_ID_MASK		    0xfffUL
 	#define CMDQ_MODIFY_GID_VLAN_VLAN_ID_SFT		    0
diff --git a/drivers/infiniband/hw/cxgb3/iwch.c b/drivers/infiniband/hw/cxgb3/iwch.c
index 47b2ce2..591de31 100644
--- a/drivers/infiniband/hw/cxgb3/iwch.c
+++ b/drivers/infiniband/hw/cxgb3/iwch.c
@@ -45,7 +45,6 @@
 MODULE_AUTHOR("Boyd Faulkner, Steve Wise");
 MODULE_DESCRIPTION("Chelsio T3 RDMA Driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 
 static void open_rnic_dev(struct t3cdev *);
 static void close_rnic_dev(struct t3cdev *);
diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 0cd0c1f..099e76f 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -1336,8 +1336,7 @@ static int iwch_port_immutable(struct ib_device *ibdev, u8 port_num,
 	return 0;
 }
 
-static void get_dev_fw_ver_str(struct ib_device *ibdev, char *str,
-			       size_t str_len)
+static void get_dev_fw_ver_str(struct ib_device *ibdev, char *str)
 {
 	struct iwch_dev *iwch_dev = to_iwch_dev(ibdev);
 	struct ethtool_drvinfo info;
@@ -1345,7 +1344,7 @@ static void get_dev_fw_ver_str(struct ib_device *ibdev, char *str,
 
 	pr_debug("%s dev 0x%p\n", __func__, iwch_dev);
 	lldev->ethtool_ops->get_drvinfo(lldev, &info);
-	snprintf(str, str_len, "%s", info.fw_version);
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%s", info.fw_version);
 }
 
 int iwch_register_device(struct iwch_dev *dev)
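Note on the signature change above: the callback no longer receives a length and instead bounds its write with a shared constant, so the caller is expected to supply a buffer of at least that size. A minimal userspace sketch of the same idea follows. `FW_VERSION_NAME_MAX_SKETCH` and its value of 32 are assumptions standing in for `IB_FW_VERSION_NAME_MAX`, and `format_fw_ver` with its truncation check is a hypothetical addition.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed size; stands in for IB_FW_VERSION_NAME_MAX. */
#define FW_VERSION_NAME_MAX_SKETCH 32

/*
 * Hypothetical helper: format a four-part version into a buffer that
 * the caller guarantees is FW_VERSION_NAME_MAX_SKETCH bytes long.
 * Returns 0 on success, -1 if the text was truncated or on error.
 */
static int format_fw_ver(char *str, unsigned int major, unsigned int minor,
			 unsigned int micro, unsigned int build)
{
	int n;

	assert(str != NULL);
	n = snprintf(str, FW_VERSION_NAME_MAX_SKETCH, "%u.%u.%u.%u",
		     major, minor, micro, build);
	/* snprintf() returns the length the full text would have had. */
	return (n < 0 || n >= FW_VERSION_NAME_MAX_SKETCH) ? -1 : 0;
}
```

The output is always NUL-terminated within the fixed size, so the worst case for an oversized version string is truncation rather than an overflow.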
diff --git a/drivers/infiniband/hw/cxgb4/cm.c b/drivers/infiniband/hw/cxgb4/cm.c
index e49b34c..ceaa2fa 100644
--- a/drivers/infiniband/hw/cxgb4/cm.c
+++ b/drivers/infiniband/hw/cxgb4/cm.c
@@ -2871,7 +2871,6 @@ static int close_con_rpl(struct c4iw_dev *dev, struct sk_buff *skb)
 		return 0;
 
 	pr_debug("%s ep %p tid %u\n", __func__, ep, ep->hwtid);
-	BUG_ON(!ep);
 
 	/* The cm_id may be null if we failed to connect */
 	mutex_lock(&ep->com.mutex);
diff --git a/drivers/infiniband/hw/cxgb4/device.c b/drivers/infiniband/hw/cxgb4/device.c
index ae0b79ae..fc886f8 100644
--- a/drivers/infiniband/hw/cxgb4/device.c
+++ b/drivers/infiniband/hw/cxgb4/device.c
@@ -44,7 +44,6 @@
 MODULE_AUTHOR("Steve Wise");
 MODULE_DESCRIPTION("Chelsio T4/T5 RDMA Driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 
 static int allow_db_fc_on_t5;
 module_param(allow_db_fc_on_t5, int, 0644);
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index 0771e9a..346e833 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -517,14 +517,13 @@ static int c4iw_port_immutable(struct ib_device *ibdev, u8 port_num,
 	return 0;
 }
 
-static void get_dev_fw_str(struct ib_device *dev, char *str,
-			   size_t str_len)
+static void get_dev_fw_str(struct ib_device *dev, char *str)
 {
 	struct c4iw_dev *c4iw_dev = container_of(dev, struct c4iw_dev,
 						 ibdev);
 	pr_debug("%s dev 0x%p\n", __func__, dev);
 
-	snprintf(str, str_len, "%u.%u.%u.%u",
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%u.%u.%u.%u",
 		 FW_HDR_FW_VER_MAJOR_G(c4iw_dev->rdev.lldi.fw_vers),
 		 FW_HDR_FW_VER_MINOR_G(c4iw_dev->rdev.lldi.fw_vers),
 		 FW_HDR_FW_VER_MICRO_G(c4iw_dev->rdev.lldi.fw_vers),
diff --git a/drivers/infiniband/hw/hfi1/Kconfig b/drivers/infiniband/hw/hfi1/Kconfig
index f6ea088..7b146b6 100644
--- a/drivers/infiniband/hw/hfi1/Kconfig
+++ b/drivers/infiniband/hw/hfi1/Kconfig
@@ -13,13 +13,6 @@
 	---help---
 	This is a debug flag to test for out of order
 	sdma completions for unit testing
-config HFI1_VERBS_31BIT_PSN
-	bool "HFI1 enable 31 bit PSN"
-	depends on INFINIBAND_HFI1
-	default y
-	---help---
-	Setting this enables 31 BIT PSN
-	For verbs RC/UC
 config SDMA_VERBOSITY
 	bool "Config SDMA Verbosity"
 	depends on INFINIBAND_HFI1
diff --git a/drivers/infiniband/hw/hfi1/Makefile b/drivers/infiniband/hw/hfi1/Makefile
index 88085f6..66d538c 100644
--- a/drivers/infiniband/hw/hfi1/Makefile
+++ b/drivers/infiniband/hw/hfi1/Makefile
@@ -8,7 +8,7 @@
 obj-$(CONFIG_INFINIBAND_HFI1) += hfi1.o
 
 hfi1-y := affinity.o chip.o device.o driver.o efivar.o \
-	eprom.o file_ops.o firmware.o \
+	eprom.o exp_rcv.o file_ops.o firmware.o \
 	init.o intr.o mad.o mmu_rb.o pcie.o pio.o pio_copy.o platform.o \
 	qp.o qsfp.o rc.o ruc.o sdma.o sysfs.o trace.o \
 	uc.o ud.o user_exp_rcv.o user_pages.o user_sdma.o verbs.o \
diff --git a/drivers/infiniband/hw/hfi1/affinity.c b/drivers/infiniband/hw/hfi1/affinity.c
index e2cd2cd..a97055d 100644
--- a/drivers/infiniband/hw/hfi1/affinity.c
+++ b/drivers/infiniband/hw/hfi1/affinity.c
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -335,10 +335,10 @@ static void hfi1_update_sdma_affinity(struct hfi1_msix_entry *msix, int cpu)
 	sde->cpu = cpu;
 	cpumask_clear(&msix->mask);
 	cpumask_set_cpu(cpu, &msix->mask);
-	dd_dev_dbg(dd, "IRQ vector: %u, type %s engine %u -> cpu: %d\n",
-		   msix->msix.vector, irq_type_names[msix->type],
+	dd_dev_dbg(dd, "IRQ: %u, type %s engine %u -> cpu: %d\n",
+		   msix->irq, irq_type_names[msix->type],
 		   sde->this_idx, cpu);
-	irq_set_affinity_hint(msix->msix.vector, &msix->mask);
+	irq_set_affinity_hint(msix->irq, &msix->mask);
 
 	/*
 	 * Set the new cpu in the hfi1_affinity_node and clean
@@ -387,7 +387,7 @@ static void hfi1_setup_sdma_notifier(struct hfi1_msix_entry *msix)
 {
 	struct irq_affinity_notify *notify = &msix->notify;
 
-	notify->irq = msix->msix.vector;
+	notify->irq = msix->irq;
 	notify->notify = hfi1_irq_notifier_notify;
 	notify->release = hfi1_irq_notifier_release;
 
@@ -472,10 +472,10 @@ static int get_irq_affinity(struct hfi1_devdata *dd,
 	}
 
 	cpumask_set_cpu(cpu, &msix->mask);
-	dd_dev_info(dd, "IRQ vector: %u, type %s %s -> cpu: %d\n",
-		    msix->msix.vector, irq_type_names[msix->type],
+	dd_dev_info(dd, "IRQ: %u, type %s %s -> cpu: %d\n",
+		    msix->irq, irq_type_names[msix->type],
 		    extra, cpu);
-	irq_set_affinity_hint(msix->msix.vector, &msix->mask);
+	irq_set_affinity_hint(msix->irq, &msix->mask);
 
 	if (msix->type == IRQ_SDMA) {
 		sde->cpu = cpu;
@@ -533,7 +533,7 @@ void hfi1_put_irq_affinity(struct hfi1_devdata *dd,
 		}
 	}
 
-	irq_set_affinity_hint(msix->msix.vector, NULL);
+	irq_set_affinity_hint(msix->irq, NULL);
 	cpumask_clear(&msix->mask);
 	mutex_unlock(&node_affinity.lock);
 }
diff --git a/drivers/infiniband/hw/hfi1/affinity.h b/drivers/infiniband/hw/hfi1/affinity.h
index e78c7aa..2a1e374 100644
--- a/drivers/infiniband/hw/hfi1/affinity.h
+++ b/drivers/infiniband/hw/hfi1/affinity.h
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -75,24 +75,26 @@ struct hfi1_msix_entry;
 /* Initialize non-HT cpu cores mask */
 void init_real_cpu_mask(void);
 /* Initialize driver affinity data */
-int hfi1_dev_affinity_init(struct hfi1_devdata *);
+int hfi1_dev_affinity_init(struct hfi1_devdata *dd);
 /*
  * Set IRQ affinity to a CPU. The function will determine the
  * CPU and set the affinity to it.
  */
-int hfi1_get_irq_affinity(struct hfi1_devdata *, struct hfi1_msix_entry *);
+int hfi1_get_irq_affinity(struct hfi1_devdata *dd,
+			  struct hfi1_msix_entry *msix);
 /*
  * Remove the IRQ's CPU affinity. This function also updates
  * any internal CPU tracking data
  */
-void hfi1_put_irq_affinity(struct hfi1_devdata *, struct hfi1_msix_entry *);
+void hfi1_put_irq_affinity(struct hfi1_devdata *dd,
+			   struct hfi1_msix_entry *msix);
 /*
  * Determine a CPU affinity for a user process, if the process does not
  * have an affinity set yet.
  */
-int hfi1_get_proc_affinity(int);
+int hfi1_get_proc_affinity(int node);
 /* Release a CPU used by a user process. */
-void hfi1_put_proc_affinity(int);
+void hfi1_put_proc_affinity(int cpu);
 
 struct hfi1_affinity_node {
 	int node;
diff --git a/drivers/infiniband/hw/hfi1/aspm.h b/drivers/infiniband/hw/hfi1/aspm.h
index 794e681..522b40e 100644
--- a/drivers/infiniband/hw/hfi1/aspm.h
+++ b/drivers/infiniband/hw/hfi1/aspm.h
@@ -237,14 +237,17 @@ static inline void aspm_disable_all(struct hfi1_devdata *dd)
 {
 	struct hfi1_ctxtdata *rcd;
 	unsigned long flags;
-	unsigned i;
+	u16 i;
 
 	for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
-		rcd = dd->rcd[i];
-		del_timer_sync(&rcd->aspm_timer);
-		spin_lock_irqsave(&rcd->aspm_lock, flags);
-		rcd->aspm_intr_enable = false;
-		spin_unlock_irqrestore(&rcd->aspm_lock, flags);
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		if (rcd) {
+			del_timer_sync(&rcd->aspm_timer);
+			spin_lock_irqsave(&rcd->aspm_lock, flags);
+			rcd->aspm_intr_enable = false;
+			spin_unlock_irqrestore(&rcd->aspm_lock, flags);
+			hfi1_rcd_put(rcd);
+		}
 	}
 
 	aspm_disable(dd);
@@ -256,7 +259,7 @@ static inline void aspm_enable_all(struct hfi1_devdata *dd)
 {
 	struct hfi1_ctxtdata *rcd;
 	unsigned long flags;
-	unsigned i;
+	u16 i;
 
 	aspm_enable(dd);
 
@@ -264,11 +267,14 @@ static inline void aspm_enable_all(struct hfi1_devdata *dd)
 		return;
 
 	for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
-		rcd = dd->rcd[i];
-		spin_lock_irqsave(&rcd->aspm_lock, flags);
-		rcd->aspm_intr_enable = true;
-		rcd->aspm_enabled = true;
-		spin_unlock_irqrestore(&rcd->aspm_lock, flags);
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		if (rcd) {
+			spin_lock_irqsave(&rcd->aspm_lock, flags);
+			rcd->aspm_intr_enable = true;
+			rcd->aspm_enabled = true;
+			spin_unlock_irqrestore(&rcd->aspm_lock, flags);
+			hfi1_rcd_put(rcd);
+		}
 	}
 }
 
@@ -284,13 +290,18 @@ static inline void aspm_ctx_init(struct hfi1_ctxtdata *rcd)
 
 static inline void aspm_init(struct hfi1_devdata *dd)
 {
-	unsigned i;
+	struct hfi1_ctxtdata *rcd;
+	u16 i;
 
 	spin_lock_init(&dd->aspm_lock);
 	dd->aspm_supported = aspm_hw_l1_supported(dd);
 
-	for (i = 0; i < dd->first_dyn_alloc_ctxt; i++)
-		aspm_ctx_init(dd->rcd[i]);
+	for (i = 0; i < dd->first_dyn_alloc_ctxt; i++) {
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		if (rcd)
+			aspm_ctx_init(rcd);
+		hfi1_rcd_put(rcd);
+	}
 
 	/* Start with ASPM disabled */
 	aspm_hw_set_l1_ent_latency(dd);
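Note on the accessor pattern introduced in this file: direct `dd->rcd[i]` reads are replaced by a get call that may return NULL, paired with a put call that is also invoked on the NULL path, which suggests the put tolerates NULL. The single-threaded userspace sketch below illustrates that pairing. All names are hypothetical, a plain integer stands in for the kernel's reference counting, and locking is omitted entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a receive context. */
struct ctx_sketch {
	int refcount;
	int aspm_enabled;
};

/* Hypothetical stand-in for the device with a sparse context array. */
struct dev_sketch {
	struct ctx_sketch *ctx[4];
	size_t nctx;
};

/* Take a reference on slot i, or return NULL if it is empty/out of range. */
static struct ctx_sketch *ctx_get_by_index(struct dev_sketch *dd, size_t i)
{
	struct ctx_sketch *c;

	assert(dd != NULL);
	if (i >= dd->nctx)
		return NULL;
	c = dd->ctx[i];
	if (c)
		c->refcount++;
	return c;
}

/* Drop a reference; a NULL argument is accepted and ignored. */
static void ctx_put(struct ctx_sketch *c)
{
	if (c)
		c->refcount--;
}

/* Enable every populated context; returns how many were touched. */
static int enable_all(struct dev_sketch *dd)
{
	int n = 0;

	for (size_t i = 0; i < dd->nctx; i++) {
		struct ctx_sketch *c = ctx_get_by_index(dd, i);

		if (c) {
			c->aspm_enabled = 1;
			n++;
		}
		ctx_put(c);	/* safe for NULL, balances the get */
	}
	return n;
}
```

Because every get is balanced by a put inside the loop body, the reference counts after the walk equal the counts before it, and empty slots are skipped without a crash.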
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 94b54850..b2ed4b9 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -1012,14 +1012,15 @@ static struct flag_table dc8051_info_err_flags[] = {
  */
 static struct flag_table dc8051_info_host_msg_flags[] = {
 	FLAG_ENTRY0("Host request done", 0x0001),
-	FLAG_ENTRY0("BC SMA message", 0x0002),
-	FLAG_ENTRY0("BC PWR_MGM message", 0x0004),
+	FLAG_ENTRY0("BC PWR_MGM message", 0x0002),
+	FLAG_ENTRY0("BC SMA message", 0x0004),
 	FLAG_ENTRY0("BC Unknown message (BCC)", 0x0008),
 	FLAG_ENTRY0("BC Unknown message (LCB)", 0x0010),
 	FLAG_ENTRY0("External device config request", 0x0020),
 	FLAG_ENTRY0("VerifyCap all frames received", 0x0040),
 	FLAG_ENTRY0("LinkUp achieved", 0x0080),
 	FLAG_ENTRY0("Link going down", 0x0100),
+	FLAG_ENTRY0("Link width downgraded", 0x0200),
 };
 
 static u32 encoded_size(u32 size);
@@ -1064,8 +1065,13 @@ static int do_8051_command(struct hfi1_devdata *dd, u32 type, u64 in_data,
 static int read_idle_sma(struct hfi1_devdata *dd, u64 *data);
 static int thermal_init(struct hfi1_devdata *dd);
 
+static void update_statusp(struct hfi1_pportdata *ppd, u32 state);
 static int wait_logical_linkstate(struct hfi1_pportdata *ppd, u32 state,
 				  int msecs);
+static void log_state_transition(struct hfi1_pportdata *ppd, u32 state);
+static void log_physical_state(struct hfi1_pportdata *ppd, u32 state);
+static int wait_physical_linkstate(struct hfi1_pportdata *ppd, u32 state,
+				   int msecs);
 static void read_planned_down_reason_code(struct hfi1_devdata *dd, u8 *pdrrc);
 static void read_link_down_reason(struct hfi1_devdata *dd, u8 *ldr);
 static void handle_temp_err(struct hfi1_devdata *dd);
@@ -1294,25 +1300,71 @@ CNTR_ELEM(#name, \
 	  CNTR_SYNTH, \
 	  access_ibp_##cntr)
 
+/**
+ * hfi1_addr_from_offset - return addr for readq/writeq
+ * @dd - the dd device
+ * @offset - the offset of the CSR within bar0
+ *
+ * This routine selects the appropriate base address
+ * based on the indicated offset.
+ */
+static inline void __iomem *hfi1_addr_from_offset(
+	const struct hfi1_devdata *dd,
+	u32 offset)
+{
+	if (offset >= dd->base2_start)
+		return dd->kregbase2 + (offset - dd->base2_start);
+	return dd->kregbase1 + offset;
+}
+
+/**
+ * read_csr - read CSR at the indicated offset
+ * @dd - the dd device
+ * @offset - the offset of the CSR within bar0
+ *
+ * Return: the value read or all FF's if there
+ * is no mapping
+ */
 u64 read_csr(const struct hfi1_devdata *dd, u32 offset)
 {
-	if (dd->flags & HFI1_PRESENT) {
-		return readq((void __iomem *)dd->kregbase + offset);
-	}
+	if (dd->flags & HFI1_PRESENT)
+		return readq(hfi1_addr_from_offset(dd, offset));
 	return -1;
 }
 
+/**
+ * write_csr - write CSR at the indicated offset
+ * @dd - the dd device
+ * @offset - the offset of the CSR within bar0
+ * @value - value to write
+ */
 void write_csr(const struct hfi1_devdata *dd, u32 offset, u64 value)
 {
-	if (dd->flags & HFI1_PRESENT)
-		writeq(value, (void __iomem *)dd->kregbase + offset);
+	if (dd->flags & HFI1_PRESENT) {
+		void __iomem *base = hfi1_addr_from_offset(dd, offset);
+
+		/* avoid write to RcvArray */
+		if (WARN_ON(offset >= RCV_ARRAY && offset < dd->base2_start))
+			return;
+		writeq(value, base);
+	}
 }
 
+/**
+ * get_csr_addr - return the iomem address for offset
+ * @dd - the dd device
+ * @offset - the offset of the CSR within bar0
+ *
+ * Return: The iomem address to use in subsequent
+ * writeq/readq operations.
+ */
 void __iomem *get_csr_addr(
-	struct hfi1_devdata *dd,
+	const struct hfi1_devdata *dd,
 	u32 offset)
 {
-	return (void __iomem *)dd->kregbase + offset;
+	if (dd->flags & HFI1_PRESENT)
+		return hfi1_addr_from_offset(dd, offset);
+	return NULL;
 }
 
 static inline u64 read_write_csr(const struct hfi1_devdata *dd, u32 csr,
@@ -5496,7 +5548,7 @@ static void update_rcverr_timer(unsigned long opaque)
 		set_link_down_reason(
 		ppd, OPA_LINKDOWN_REASON_EXCESSIVE_BUFFER_OVERRUN, 0,
 		OPA_LINKDOWN_REASON_EXCESSIVE_BUFFER_OVERRUN);
-		queue_work(ppd->hfi1_wq, &ppd->link_bounce_work);
+		queue_work(ppd->link_wq, &ppd->link_bounce_work);
 	}
 	dd->rcv_ovfl_cnt = (u32)cur_ovfl_cnt;
 
@@ -6051,7 +6103,7 @@ static void handle_qsfp_int(struct hfi1_devdata *dd, u32 src_ctx, u64 reg)
 				 * will not happen. We have to do it here
 				 * before turning the DC off.
 				 */
-				queue_work(ppd->hfi1_wq, &ppd->link_down_work);
+				queue_work(ppd->link_wq, &ppd->link_down_work);
 			}
 		} else {
 			dd_dev_info(dd, "%s: QSFP module inserted\n",
@@ -6086,7 +6138,7 @@ static void handle_qsfp_int(struct hfi1_devdata *dd, u32 src_ctx, u64 reg)
 
 	/* Schedule the QSFP work only if there is a cable attached. */
 	if (qsfp_mod_present(ppd))
-		queue_work(ppd->hfi1_wq, &ppd->qsfp_info.qsfp_work);
+		queue_work(ppd->link_wq, &ppd->qsfp_info.qsfp_work);
 }
 
 static int request_host_lcb_access(struct hfi1_devdata *dd)
@@ -6735,13 +6787,17 @@ static void wait_for_freeze_status(struct hfi1_devdata *dd, int freeze)
 static void rxe_freeze(struct hfi1_devdata *dd)
 {
 	int i;
+	struct hfi1_ctxtdata *rcd;
 
 	/* disable port */
 	clear_rcvctrl(dd, RCV_CTRL_RCV_PORT_ENABLE_SMASK);
 
 	/* disable all receive contexts */
-	for (i = 0; i < dd->num_rcv_contexts; i++)
-		hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_DIS, i);
+	for (i = 0; i < dd->num_rcv_contexts; i++) {
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_DIS, rcd);
+		hfi1_rcd_put(rcd);
+	}
 }
 
 /*
@@ -6753,21 +6809,24 @@ static void rxe_freeze(struct hfi1_devdata *dd)
 static void rxe_kernel_unfreeze(struct hfi1_devdata *dd)
 {
 	u32 rcvmask;
-	int i;
+	u16 i;
+	struct hfi1_ctxtdata *rcd;
 
 	/* enable all kernel contexts */
 	for (i = 0; i < dd->num_rcv_contexts; i++) {
-		struct hfi1_ctxtdata *rcd = dd->rcd[i];
+		rcd = hfi1_rcd_get_by_index(dd, i);
 
 		/* Ensure all non-user contexts(including vnic) are enabled */
-		if (!rcd || !rcd->sc || (rcd->sc->type == SC_USER))
+		if (!rcd || !rcd->sc || (rcd->sc->type == SC_USER)) {
+			hfi1_rcd_put(rcd);
 			continue;
-
+		}
 		rcvmask = HFI1_RCVCTRL_CTXT_ENB;
 		/* HFI1_RCVCTRL_TAILUPD_[ENB|DIS] needs to be set explicitly */
-		rcvmask |= HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, DMA_RTAIL) ?
+		rcvmask |= HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL) ?
 			HFI1_RCVCTRL_TAILUPD_ENB : HFI1_RCVCTRL_TAILUPD_DIS;
-		hfi1_rcvctrl(dd, rcvmask, i);
+		hfi1_rcvctrl(dd, rcvmask, rcd);
+		hfi1_rcd_put(rcd);
 	}
 
 	/* enable port */
@@ -6906,7 +6965,7 @@ static void reset_neighbor_info(struct hfi1_pportdata *ppd)
 
 static const char * const link_down_reason_strs[] = {
 	[OPA_LINKDOWN_REASON_NONE] = "None",
-	[OPA_LINKDOWN_REASON_RCV_ERROR_0] = "Recive error 0",
+	[OPA_LINKDOWN_REASON_RCV_ERROR_0] = "Receive error 0",
 	[OPA_LINKDOWN_REASON_BAD_PKT_LEN] = "Bad packet length",
 	[OPA_LINKDOWN_REASON_PKT_TOO_LONG] = "Packet too long",
 	[OPA_LINKDOWN_REASON_PKT_TOO_SHORT] = "Packet too short",
@@ -6996,6 +7055,7 @@ void handle_link_down(struct work_struct *work)
 	/* Go offline first, then deal with reading/writing through 8051 */
 	was_up = !!(ppd->host_link_state & HLS_UP);
 	set_link_state(ppd, HLS_DN_OFFLINE);
+	xchg(&ppd->is_link_down_queued, 0);
 
 	if (was_up) {
 		lcl_reason = 0;
@@ -7330,7 +7390,7 @@ void handle_verify_cap(struct work_struct *work)
 	struct hfi1_devdata *dd = ppd->dd;
 	u64 reg;
 	u8 power_management;
-	u8 continious;
+	u8 continuous;
 	u8 vcu;
 	u8 vau;
 	u8 z;
@@ -7349,7 +7409,7 @@ void handle_verify_cap(struct work_struct *work)
 	lcb_shutdown(dd, 0);
 	adjust_lcb_for_fpga_serdes(dd);
 
-	read_vc_remote_phy(dd, &power_management, &continious);
+	read_vc_remote_phy(dd, &power_management, &continuous);
 	read_vc_remote_fabric(dd, &vau, &z, &vcu, &vl15buf,
 			      &partner_supported_crc);
 	read_vc_remote_link_width(dd, &remote_tx_rate, &link_widths);
@@ -7363,7 +7423,7 @@ void handle_verify_cap(struct work_struct *work)
 	get_link_widths(dd, &active_tx, &active_rx);
 	dd_dev_info(dd,
 		    "Peer PHY: power management 0x%x, continuous updates 0x%x\n",
-		    (int)power_management, (int)continious);
+		    (int)power_management, (int)continuous);
 	dd_dev_info(dd,
 		    "Peer Fabric: vAU %d, Z %d, vCU %d, vl15 credits 0x%x, CRC sizes 0x%x\n",
 		    (int)vau, (int)z, (int)vcu, (int)vl15buf,
@@ -7689,12 +7749,12 @@ static void handle_8051_interrupt(struct hfi1_devdata *dd, u32 unused, u64 reg)
 			host_msg &= ~(u64)HOST_REQ_DONE;
 		}
 		if (host_msg & BC_SMA_MSG) {
-			queue_work(ppd->hfi1_wq, &ppd->sma_message_work);
+			queue_work(ppd->link_wq, &ppd->sma_message_work);
 			host_msg &= ~(u64)BC_SMA_MSG;
 		}
 		if (host_msg & LINKUP_ACHIEVED) {
 			dd_dev_info(dd, "8051: Link up\n");
-			queue_work(ppd->hfi1_wq, &ppd->link_up_work);
+			queue_work(ppd->link_wq, &ppd->link_up_work);
 			host_msg &= ~(u64)LINKUP_ACHIEVED;
 		}
 		if (host_msg & EXT_DEVICE_CFG_REQ) {
@@ -7702,7 +7762,7 @@ static void handle_8051_interrupt(struct hfi1_devdata *dd, u32 unused, u64 reg)
 			host_msg &= ~(u64)EXT_DEVICE_CFG_REQ;
 		}
 		if (host_msg & VERIFY_CAP_FRAME) {
-			queue_work(ppd->hfi1_wq, &ppd->link_vc_work);
+			queue_work(ppd->link_wq, &ppd->link_vc_work);
 			host_msg &= ~(u64)VERIFY_CAP_FRAME;
 		}
 		if (host_msg & LINK_GOING_DOWN) {
@@ -7717,7 +7777,7 @@ static void handle_8051_interrupt(struct hfi1_devdata *dd, u32 unused, u64 reg)
 			host_msg &= ~(u64)LINK_GOING_DOWN;
 		}
 		if (host_msg & LINK_WIDTH_DOWNGRADED) {
-			queue_work(ppd->hfi1_wq, &ppd->link_downgrade_work);
+			queue_work(ppd->link_wq, &ppd->link_downgrade_work);
 			host_msg &= ~(u64)LINK_WIDTH_DOWNGRADED;
 		}
 		if (host_msg) {
@@ -7752,15 +7812,22 @@ static void handle_8051_interrupt(struct hfi1_devdata *dd, u32 unused, u64 reg)
 	if (queue_link_down) {
 		/*
 		 * if the link is already going down or disabled, do not
-		 * queue another
+		 * queue another. If there's a link down entry already
+		 * queued, don't queue another one.
 		 */
 		if ((ppd->host_link_state &
 		    (HLS_GOING_OFFLINE | HLS_LINK_COOLDOWN)) ||
 		    ppd->link_enabled == 0) {
-			dd_dev_info(dd, "%s: not queuing link down\n",
-				    __func__);
+			dd_dev_info(dd, "%s: not queuing link down. host_link_state %x, link_enabled %x\n",
+				    __func__, ppd->host_link_state,
+				    ppd->link_enabled);
 		} else {
-			queue_work(ppd->hfi1_wq, &ppd->link_down_work);
+			if (xchg(&ppd->is_link_down_queued, 1) == 1)
+				dd_dev_info(dd,
+					    "%s: link down request already queued\n",
+					    __func__);
+			else
+				queue_work(ppd->link_wq, &ppd->link_down_work);
 		}
 	}
 }
@@ -7968,7 +8035,7 @@ static void handle_dcc_err(struct hfi1_devdata *dd, u32 unused, u64 reg)
 		dd_dev_info_ratelimited(dd, "%s: PortErrorAction bounce\n",
 					__func__);
 		set_link_down_reason(ppd, lcl_reason, 0, lcl_reason);
-		queue_work(ppd->hfi1_wq, &ppd->link_bounce_work);
+		queue_work(ppd->link_wq, &ppd->link_bounce_work);
 	}
 }
 
@@ -8052,7 +8119,7 @@ static void is_rcv_avail_int(struct hfi1_devdata *dd, unsigned int source)
 	char *err_detail;
 
 	if (likely(source < dd->num_rcv_contexts)) {
-		rcd = dd->rcd[source];
+		rcd = hfi1_rcd_get_by_index(dd, source);
 		if (rcd) {
 			/* Check for non-user contexts, including vnic */
 			if ((source < dd->first_dyn_alloc_ctxt) ||
@@ -8060,6 +8127,8 @@ static void is_rcv_avail_int(struct hfi1_devdata *dd, unsigned int source)
 				rcd->do_interrupt(rcd, 0);
 			else
 				handle_user_interrupt(rcd);
+
+			hfi1_rcd_put(rcd);
 			return;	/* OK */
 		}
 		/* received an interrupt, but no rcd */
@@ -8081,12 +8150,14 @@ static void is_rcv_urgent_int(struct hfi1_devdata *dd, unsigned int source)
 	char *err_detail;
 
 	if (likely(source < dd->num_rcv_contexts)) {
-		rcd = dd->rcd[source];
+		rcd = hfi1_rcd_get_by_index(dd, source);
 		if (rcd) {
 			/* only pay attention to user urgent interrupts */
 			if ((source >= dd->first_dyn_alloc_ctxt) &&
 			    (!rcd->sc || (rcd->sc->type == SC_USER)))
 				handle_user_interrupt(rcd);
+
+			hfi1_rcd_put(rcd);
 			return;	/* OK */
 		}
 		/* received an interrupt, but no rcd */
@@ -8219,8 +8290,8 @@ static irqreturn_t sdma_interrupt(int irq, void *data)
 		/* handle the interrupt(s) */
 		sdma_engine_interrupt(sde, status);
 	} else {
-		dd_dev_err(dd, "SDMA engine %u interrupt, but no status bits set\n",
-			   sde->this_idx);
+		dd_dev_err_ratelimited(dd, "SDMA engine %u interrupt, but no status bits set\n",
+				       sde->this_idx);
 	}
 	return IRQ_HANDLED;
 }
@@ -8291,7 +8362,7 @@ static irqreturn_t receive_context_interrupt(int irq, void *data)
 	int disposition;
 	int present;
 
-	trace_hfi1_receive_interrupt(dd, rcd->ctxt);
+	trace_hfi1_receive_interrupt(dd, rcd);
 	this_cpu_inc(*dd->int_counter);
 	aspm_ctx_disable(rcd);
 
@@ -8781,6 +8852,20 @@ static void read_remote_device_id(struct hfi1_devdata *dd, u16 *device_id,
 			& REMOTE_DEVICE_REV_MASK;
 }
 
+int write_host_interface_version(struct hfi1_devdata *dd, u8 version)
+{
+	u32 frame;
+	u32 mask;
+
+	mask = (HOST_INTERFACE_VERSION_MASK << HOST_INTERFACE_VERSION_SHIFT);
+	read_8051_config(dd, RESERVED_REGISTERS, GENERAL_CONFIG, &frame);
+	/* Clear, then set field */
+	frame &= ~mask;
+	frame |= ((u32)version << HOST_INTERFACE_VERSION_SHIFT);
+	return load_8051_config(dd, RESERVED_REGISTERS, GENERAL_CONFIG,
+				frame);
+}
+
 void read_misc_status(struct hfi1_devdata *dd, u8 *ver_major, u8 *ver_minor,
 		      u8 *ver_patch)
 {
@@ -9257,12 +9342,6 @@ int start_link(struct hfi1_pportdata *ppd)
 	 */
 	tune_serdes(ppd);
 
-	if (!ppd->link_enabled) {
-		dd_dev_info(ppd->dd,
-			    "%s: stopping link start because link is disabled\n",
-			    __func__);
-		return 0;
-	}
 	if (!ppd->driver_link_ready) {
 		dd_dev_info(ppd->dd,
 			    "%s: stopping link start because driver is not ready\n",
@@ -9373,13 +9452,13 @@ static int handle_qsfp_error_conditions(struct hfi1_pportdata *ppd,
 
 	if ((qsfp_interrupt_status[0] & QSFP_HIGH_TEMP_ALARM) ||
 	    (qsfp_interrupt_status[0] & QSFP_HIGH_TEMP_WARNING))
-		dd_dev_info(dd, "%s: QSFP cable temperature too high\n",
-			    __func__);
+		dd_dev_err(dd, "%s: QSFP cable temperature too high\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[0] & QSFP_LOW_TEMP_ALARM) ||
 	    (qsfp_interrupt_status[0] & QSFP_LOW_TEMP_WARNING))
-		dd_dev_info(dd, "%s: QSFP cable temperature too low\n",
-			    __func__);
+		dd_dev_err(dd, "%s: QSFP cable temperature too low\n",
+			   __func__);
 
 	/*
 	 * The remaining alarms/warnings don't matter if the link is down.
@@ -9389,75 +9468,75 @@ static int handle_qsfp_error_conditions(struct hfi1_pportdata *ppd,
 
 	if ((qsfp_interrupt_status[1] & QSFP_HIGH_VCC_ALARM) ||
 	    (qsfp_interrupt_status[1] & QSFP_HIGH_VCC_WARNING))
-		dd_dev_info(dd, "%s: QSFP supply voltage too high\n",
-			    __func__);
+		dd_dev_err(dd, "%s: QSFP supply voltage too high\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[1] & QSFP_LOW_VCC_ALARM) ||
 	    (qsfp_interrupt_status[1] & QSFP_LOW_VCC_WARNING))
-		dd_dev_info(dd, "%s: QSFP supply voltage too low\n",
-			    __func__);
+		dd_dev_err(dd, "%s: QSFP supply voltage too low\n",
+			   __func__);
 
 	/* Byte 2 is vendor specific */
 
 	if ((qsfp_interrupt_status[3] & QSFP_HIGH_POWER_ALARM) ||
 	    (qsfp_interrupt_status[3] & QSFP_HIGH_POWER_WARNING))
-		dd_dev_info(dd, "%s: Cable RX channel 1/2 power too high\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable RX channel 1/2 power too high\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[3] & QSFP_LOW_POWER_ALARM) ||
 	    (qsfp_interrupt_status[3] & QSFP_LOW_POWER_WARNING))
-		dd_dev_info(dd, "%s: Cable RX channel 1/2 power too low\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable RX channel 1/2 power too low\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[4] & QSFP_HIGH_POWER_ALARM) ||
 	    (qsfp_interrupt_status[4] & QSFP_HIGH_POWER_WARNING))
-		dd_dev_info(dd, "%s: Cable RX channel 3/4 power too high\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable RX channel 3/4 power too high\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[4] & QSFP_LOW_POWER_ALARM) ||
 	    (qsfp_interrupt_status[4] & QSFP_LOW_POWER_WARNING))
-		dd_dev_info(dd, "%s: Cable RX channel 3/4 power too low\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable RX channel 3/4 power too low\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[5] & QSFP_HIGH_BIAS_ALARM) ||
 	    (qsfp_interrupt_status[5] & QSFP_HIGH_BIAS_WARNING))
-		dd_dev_info(dd, "%s: Cable TX channel 1/2 bias too high\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable TX channel 1/2 bias too high\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[5] & QSFP_LOW_BIAS_ALARM) ||
 	    (qsfp_interrupt_status[5] & QSFP_LOW_BIAS_WARNING))
-		dd_dev_info(dd, "%s: Cable TX channel 1/2 bias too low\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable TX channel 1/2 bias too low\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[6] & QSFP_HIGH_BIAS_ALARM) ||
 	    (qsfp_interrupt_status[6] & QSFP_HIGH_BIAS_WARNING))
-		dd_dev_info(dd, "%s: Cable TX channel 3/4 bias too high\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable TX channel 3/4 bias too high\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[6] & QSFP_LOW_BIAS_ALARM) ||
 	    (qsfp_interrupt_status[6] & QSFP_LOW_BIAS_WARNING))
-		dd_dev_info(dd, "%s: Cable TX channel 3/4 bias too low\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable TX channel 3/4 bias too low\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[7] & QSFP_HIGH_POWER_ALARM) ||
 	    (qsfp_interrupt_status[7] & QSFP_HIGH_POWER_WARNING))
-		dd_dev_info(dd, "%s: Cable TX channel 1/2 power too high\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable TX channel 1/2 power too high\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[7] & QSFP_LOW_POWER_ALARM) ||
 	    (qsfp_interrupt_status[7] & QSFP_LOW_POWER_WARNING))
-		dd_dev_info(dd, "%s: Cable TX channel 1/2 power too low\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable TX channel 1/2 power too low\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[8] & QSFP_HIGH_POWER_ALARM) ||
 	    (qsfp_interrupt_status[8] & QSFP_HIGH_POWER_WARNING))
-		dd_dev_info(dd, "%s: Cable TX channel 3/4 power too high\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable TX channel 3/4 power too high\n",
+			   __func__);
 
 	if ((qsfp_interrupt_status[8] & QSFP_LOW_POWER_ALARM) ||
 	    (qsfp_interrupt_status[8] & QSFP_LOW_POWER_WARNING))
-		dd_dev_info(dd, "%s: Cable TX channel 3/4 power too low\n",
-			    __func__);
+		dd_dev_err(dd, "%s: Cable TX channel 3/4 power too low\n",
+			   __func__);
 
 	/* Bytes 9-10 and 11-12 are reserved */
 	/* Bytes 13-15 are vendor specific */
@@ -9480,6 +9559,13 @@ void qsfp_event(struct work_struct *work)
 	if (!qsfp_mod_present(ppd))
 		return;
 
+	if (ppd->host_link_state == HLS_DN_DISABLE) {
+		dd_dev_info(ppd->dd,
+			    "%s: stopping link start because link is disabled\n",
+			    __func__);
+		return;
+	}
+
 	/*
 	 * Turn DC back on after cable has been re-inserted. Up until
 	 * now, the DC has been in reset to save power.
@@ -9635,7 +9721,7 @@ static void try_start_link(struct hfi1_pportdata *ppd)
 			    "QSFP not responding, waiting and retrying %d\n",
 			    (int)ppd->qsfp_retry_count);
 		ppd->qsfp_retry_count++;
-		queue_delayed_work(ppd->hfi1_wq, &ppd->start_link_work,
+		queue_delayed_work(ppd->link_wq, &ppd->start_link_work,
 				   msecs_to_jiffies(QSFP_RETRY_WAIT));
 		return;
 	}
@@ -9742,17 +9828,6 @@ static inline int init_cpu_counters(struct hfi1_devdata *dd)
 	return 0;
 }
 
-static const char * const pt_names[] = {
-	"expected",
-	"eager",
-	"invalid"
-};
-
-static const char *pt_name(u32 type)
-{
-	return type >= ARRAY_SIZE(pt_names) ? "unknown" : pt_names[type];
-}
-
 /*
  * index is the index into the receive array
  */
@@ -9760,35 +9835,34 @@ void hfi1_put_tid(struct hfi1_devdata *dd, u32 index,
 		  u32 type, unsigned long pa, u16 order)
 {
 	u64 reg;
-	void __iomem *base = (dd->rcvarray_wc ? dd->rcvarray_wc :
-			      (dd->kregbase + RCV_ARRAY));
 
 	if (!(dd->flags & HFI1_PRESENT))
 		goto done;
 
-	if (type == PT_INVALID) {
+	if (type == PT_INVALID || type == PT_INVALID_FLUSH) {
 		pa = 0;
+		order = 0;
 	} else if (type > PT_INVALID) {
 		dd_dev_err(dd,
 			   "unexpected receive array type %u for index %u, not handled\n",
 			   type, index);
 		goto done;
 	}
-
-	hfi1_cdbg(TID, "type %s, index 0x%x, pa 0x%lx, bsize 0x%lx",
-		  pt_name(type), index, pa, (unsigned long)order);
+	trace_hfi1_put_tid(dd, index, type, pa, order);
 
 #define RT_ADDR_SHIFT 12	/* 4KB kernel address boundary */
 	reg = RCV_ARRAY_RT_WRITE_ENABLE_SMASK
 		| (u64)order << RCV_ARRAY_RT_BUF_SIZE_SHIFT
 		| ((pa >> RT_ADDR_SHIFT) & RCV_ARRAY_RT_ADDR_MASK)
 					<< RCV_ARRAY_RT_ADDR_SHIFT;
-	writeq(reg, base + (index * 8));
+	trace_hfi1_write_rcvarray(dd->rcvarray_wc + (index * 8), reg);
+	writeq(reg, dd->rcvarray_wc + (index * 8));
 
-	if (type == PT_EAGER)
+	if (type == PT_EAGER || type == PT_INVALID_FLUSH || (index & 3) == 3)
 		/*
-		 * Eager entries are written one-by-one so we have to push them
-		 * after we write the entry.
+		 * Eager entries are written and flushed
+		 *
+		 * Expected entries are flushed every 4 writes
 		 */
 		flush_wc();
 done:
@@ -9810,15 +9884,6 @@ void hfi1_clear_tids(struct hfi1_ctxtdata *rcd)
 		hfi1_put_tid(dd, i, PT_INVALID, 0, 0);
 }
 
-struct ib_header *hfi1_get_msgheader(
-	struct hfi1_devdata *dd, __le32 *rhf_addr)
-{
-	u32 offset = rhf_hdrq_offset(rhf_to_cpu(rhf_addr));
-
-	return (struct ib_header *)
-		(rhf_addr - dd->rhf_offset + offset);
-}
-
 static const char * const ib_cfg_name_strings[] = {
 	"HFI1_IB_CFG_LIDLMC",
 	"HFI1_IB_CFG_LWID_DG_ENB",
@@ -10010,10 +10075,16 @@ static void set_lidlmc(struct hfi1_pportdata *ppd)
 	struct hfi1_devdata *dd = ppd->dd;
 	u32 mask = ~((1U << ppd->lmc) - 1);
 	u64 c1 = read_csr(ppd->dd, DCC_CFG_PORT_CONFIG1);
+	u32 lid;
 
+	/*
+	 * Program 0 in CSR if port lid is extended. This prevents
+	 * 9B packets being sent out for large lids.
+	 */
+	lid = (ppd->lid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) ? 0 : ppd->lid;
 	c1 &= ~(DCC_CFG_PORT_CONFIG1_TARGET_DLID_SMASK
 		| DCC_CFG_PORT_CONFIG1_DLID_MASK_SMASK);
-	c1 |= ((ppd->lid & DCC_CFG_PORT_CONFIG1_TARGET_DLID_MASK)
+	c1 |= ((lid & DCC_CFG_PORT_CONFIG1_TARGET_DLID_MASK)
 			<< DCC_CFG_PORT_CONFIG1_TARGET_DLID_SHIFT) |
 	      ((mask & DCC_CFG_PORT_CONFIG1_DLID_MASK_MASK)
 			<< DCC_CFG_PORT_CONFIG1_DLID_MASK_SHIFT);
@@ -10024,7 +10095,7 @@ static void set_lidlmc(struct hfi1_pportdata *ppd)
 	 */
 	sreg = ((mask & SEND_CTXT_CHECK_SLID_MASK_MASK) <<
 			SEND_CTXT_CHECK_SLID_MASK_SHIFT) |
-	       (((ppd->lid & mask) & SEND_CTXT_CHECK_SLID_VALUE_MASK) <<
+	       (((lid & mask) & SEND_CTXT_CHECK_SLID_VALUE_MASK) <<
 			SEND_CTXT_CHECK_SLID_VALUE_SHIFT);
 
 	for (i = 0; i < dd->chip_send_contexts; i++) {
@@ -10034,29 +10105,7 @@ static void set_lidlmc(struct hfi1_pportdata *ppd)
 	}
 
 	/* Now we have to do the same thing for the sdma engines */
-	sdma_update_lmc(dd, mask, ppd->lid);
-}
-
-static int wait_phy_linkstate(struct hfi1_devdata *dd, u32 state, u32 msecs)
-{
-	unsigned long timeout;
-	u32 curr_state;
-
-	timeout = jiffies + msecs_to_jiffies(msecs);
-	while (1) {
-		curr_state = read_physical_state(dd);
-		if (curr_state == state)
-			break;
-		if (time_after(jiffies, timeout)) {
-			dd_dev_err(dd,
-				   "timeout waiting for phy link state 0x%x, current state is 0x%x\n",
-				   state, curr_state);
-			return -ETIMEDOUT;
-		}
-		usleep_range(1950, 2050); /* sleep 2ms-ish */
-	}
-
-	return 0;
+	sdma_update_lmc(dd, mask, lid);
 }
 
 static const char *state_completed_string(u32 completed)
@@ -10238,8 +10287,10 @@ static void force_logical_link_state_down(struct hfi1_pportdata *ppd)
 	write_csr(dd, DC_LCB_CFG_ALLOW_LINK_UP, 0);
 	write_csr(dd, DC_LCB_CFG_IGNORE_LOST_RCLK, 0);
 
-	/* call again to adjust ppd->statusp, if needed */
-	get_logical_state(ppd);
+	/* adjust ppd->statusp, if needed */
+	update_statusp(ppd, IB_PORT_DOWN);
+
+	dd_dev_info(ppd->dd, "logical state forced to LINK_DOWN\n");
 }
 
 /*
@@ -10253,49 +10304,35 @@ static void force_logical_link_state_down(struct hfi1_pportdata *ppd)
 static int goto_offline(struct hfi1_pportdata *ppd, u8 rem_reason)
 {
 	struct hfi1_devdata *dd = ppd->dd;
-	u32 pstate, previous_state;
+	u32 previous_state;
 	int ret;
-	int do_transition;
-	int do_wait;
 
 	update_lcb_cache(dd);
 
 	previous_state = ppd->host_link_state;
 	ppd->host_link_state = HLS_GOING_OFFLINE;
-	pstate = read_physical_state(dd);
-	if (pstate == PLS_OFFLINE) {
-		do_transition = 0;	/* in right state */
-		do_wait = 0;		/* ...no need to wait */
-	} else if ((pstate & 0xf0) == PLS_OFFLINE) {
-		do_transition = 0;	/* in an offline transient state */
-		do_wait = 1;		/* ...wait for it to settle */
-	} else {
-		do_transition = 1;	/* need to move to offline */
-		do_wait = 1;		/* ...will need to wait */
-	}
 
-	if (do_transition) {
-		ret = set_physical_link_state(dd,
-					      (rem_reason << 8) | PLS_OFFLINE);
+	/* start offline transition */
+	ret = set_physical_link_state(dd, (rem_reason << 8) | PLS_OFFLINE);
 
-		if (ret != HCMD_SUCCESS) {
-			dd_dev_err(dd,
-				   "Failed to transition to Offline link state, return %d\n",
-				   ret);
-			return -EINVAL;
-		}
-		if (ppd->offline_disabled_reason ==
-				HFI1_ODR_MASK(OPA_LINKDOWN_REASON_NONE))
-			ppd->offline_disabled_reason =
-			HFI1_ODR_MASK(OPA_LINKDOWN_REASON_TRANSIENT);
+	if (ret != HCMD_SUCCESS) {
+		dd_dev_err(dd,
+			   "Failed to transition to Offline link state, return %d\n",
+			   ret);
+		return -EINVAL;
 	}
+	if (ppd->offline_disabled_reason ==
+			HFI1_ODR_MASK(OPA_LINKDOWN_REASON_NONE))
+		ppd->offline_disabled_reason =
+		HFI1_ODR_MASK(OPA_LINKDOWN_REASON_TRANSIENT);
 
-	if (do_wait) {
-		/* it can take a while for the link to go down */
-		ret = wait_phy_linkstate(dd, PLS_OFFLINE, 10000);
-		if (ret < 0)
-			return ret;
-	}
+	/*
+	 * Wait for offline transition. It can take a while for
+	 * the link to go down.
+	 */
+	ret = wait_physical_linkstate(ppd, PLS_OFFLINE, 10000);
+	if (ret < 0)
+		return ret;
 
 	/*
 	 * Now in charge of LCB - must be after the physical state is
@@ -10415,11 +10452,11 @@ static const char *link_state_reason_name(struct hfi1_pportdata *ppd, u32 state)
 }
 
 /*
- * driver_physical_state - convert the driver's notion of a port's
+ * driver_pstate - convert the driver's notion of a port's
  * state (an HLS_*) into a physical state (a {IB,OPA}_PORTPHYSSTATE_*).
  * Return -1 (converted to a u32) to indicate error.
  */
-u32 driver_physical_state(struct hfi1_pportdata *ppd)
+u32 driver_pstate(struct hfi1_pportdata *ppd)
 {
 	switch (ppd->host_link_state) {
 	case HLS_UP_INIT:
@@ -10449,11 +10486,11 @@ u32 driver_physical_state(struct hfi1_pportdata *ppd)
 }
 
 /*
- * driver_logical_state - convert the driver's notion of a port's
+ * driver_lstate - convert the driver's notion of a port's
  * state (an HLS_*) into a logical state (a IB_PORT_*). Return -1
  * (converted to a u32) to indicate error.
  */
-u32 driver_logical_state(struct hfi1_pportdata *ppd)
+u32 driver_lstate(struct hfi1_pportdata *ppd)
 {
 	if (ppd->host_link_state && (ppd->host_link_state & HLS_DOWN))
 		return IB_PORT_DOWN;
@@ -10484,6 +10521,14 @@ void set_link_down_reason(struct hfi1_pportdata *ppd, u8 lcl_reason,
 }
 
 /*
+ * Verify if BCT for data VLs is non-zero.
+ */
+static inline bool data_vls_operational(struct hfi1_pportdata *ppd)
+{
+	return !!ppd->actual_vls_operational;
+}
+
+/*
  * Change the physical and/or logical link state.
  *
  * Do not call this routine while inside an interrupt.  It contains
@@ -10545,38 +10590,58 @@ int set_link_state(struct hfi1_pportdata *ppd, u32 state)
 			goto unexpected;
 		}
 
+		/*
+		 * Wait for Link_Up physical state.
+		 * Physical and Logical states should already be
+		 * transitioned to LinkUp and LinkInit respectively.
+		 */
+		ret = wait_physical_linkstate(ppd, PLS_LINKUP, 1000);
+		if (ret) {
+			dd_dev_err(dd,
+				   "%s: physical state did not change to LINK-UP\n",
+				   __func__);
+			break;
+		}
+
 		ret = wait_logical_linkstate(ppd, IB_PORT_INIT, 1000);
 		if (ret) {
 			dd_dev_err(dd,
 				   "%s: logical state did not change to INIT\n",
 				   __func__);
-		} else {
-			/* clear old transient LINKINIT_REASON code */
-			if (ppd->linkinit_reason >= OPA_LINKINIT_REASON_CLEAR)
-				ppd->linkinit_reason =
-					OPA_LINKINIT_REASON_LINKUP;
-
-			/* enable the port */
-			add_rcvctrl(dd, RCV_CTRL_RCV_PORT_ENABLE_SMASK);
-
-			handle_linkup_change(dd, 1);
-			ppd->host_link_state = HLS_UP_INIT;
+			break;
 		}
+
+		/* clear old transient LINKINIT_REASON code */
+		if (ppd->linkinit_reason >= OPA_LINKINIT_REASON_CLEAR)
+			ppd->linkinit_reason =
+				OPA_LINKINIT_REASON_LINKUP;
+
+		/* enable the port */
+		add_rcvctrl(dd, RCV_CTRL_RCV_PORT_ENABLE_SMASK);
+
+		handle_linkup_change(dd, 1);
+		ppd->host_link_state = HLS_UP_INIT;
 		break;
 	case HLS_UP_ARMED:
 		if (ppd->host_link_state != HLS_UP_INIT)
 			goto unexpected;
 
-		ppd->host_link_state = HLS_UP_ARMED;
+		if (!data_vls_operational(ppd)) {
+			dd_dev_err(dd,
+				   "%s: data VLs not operational\n", __func__);
+			ret = -EINVAL;
+			break;
+		}
+
 		set_logical_state(dd, LSTATE_ARMED);
 		ret = wait_logical_linkstate(ppd, IB_PORT_ARMED, 1000);
 		if (ret) {
-			/* logical state didn't change, stay at init */
-			ppd->host_link_state = HLS_UP_INIT;
 			dd_dev_err(dd,
 				   "%s: logical state did not change to ARMED\n",
 				   __func__);
+			break;
 		}
+		ppd->host_link_state = HLS_UP_ARMED;
 		/*
 		 * The simulator does not currently implement SMA messages,
 		 * so neighbor_normal is not set.  Set it here when we first
@@ -10589,18 +10654,16 @@ int set_link_state(struct hfi1_pportdata *ppd, u32 state)
 		if (ppd->host_link_state != HLS_UP_ARMED)
 			goto unexpected;
 
-		ppd->host_link_state = HLS_UP_ACTIVE;
 		set_logical_state(dd, LSTATE_ACTIVE);
 		ret = wait_logical_linkstate(ppd, IB_PORT_ACTIVE, 1000);
 		if (ret) {
-			/* logical state didn't change, stay at armed */
-			ppd->host_link_state = HLS_UP_ARMED;
 			dd_dev_err(dd,
 				   "%s: logical state did not change to ACTIVE\n",
 				   __func__);
 		} else {
 			/* tell all engines to go running */
 			sdma_all_running(dd);
+			ppd->host_link_state = HLS_UP_ACTIVE;
 
 			/* Signal the IB layer that the port has went active */
 			event.device = &dd->verbs_dev.rdi.ibdev;
@@ -10658,6 +10721,8 @@ int set_link_state(struct hfi1_pportdata *ppd, u32 state)
 		 */
 		if (ret)
 			goto_offline(ppd, 0);
+		else
+			log_physical_state(ppd, PLS_POLLING);
 		break;
 	case HLS_DN_DISABLE:
 		/* link is disabled */
@@ -10682,6 +10747,13 @@ int set_link_state(struct hfi1_pportdata *ppd, u32 state)
 				ret = -EINVAL;
 				break;
 			}
+			ret = wait_physical_linkstate(ppd, PLS_DISABLED, 10000);
+			if (ret) {
+				dd_dev_err(dd,
+					   "%s: physical state did not change to DISABLED\n",
+					   __func__);
+				break;
+			}
 			dc_shutdown(dd);
 		}
 		ppd->host_link_state = HLS_DN_DISABLE;
@@ -10699,6 +10771,7 @@ int set_link_state(struct hfi1_pportdata *ppd, u32 state)
 		if (ppd->host_link_state != HLS_DN_POLL)
 			goto unexpected;
 		ppd->host_link_state = HLS_VERIFY_CAP;
+		log_physical_state(ppd, PLS_CONFIGPHY_VERIFYCAP);
 		break;
 	case HLS_GOING_UP:
 		if (ppd->host_link_state != HLS_VERIFY_CAP)
@@ -11693,16 +11766,18 @@ static u32 encoded_size(u32 size)
 	return 0x1;	/* if invalid, go with the minimum size */
 }
 
-void hfi1_rcvctrl(struct hfi1_devdata *dd, unsigned int op, int ctxt)
+void hfi1_rcvctrl(struct hfi1_devdata *dd, unsigned int op,
+		  struct hfi1_ctxtdata *rcd)
 {
-	struct hfi1_ctxtdata *rcd;
 	u64 rcvctrl, reg;
 	int did_enable = 0;
+	u16 ctxt;
 
-	rcd = dd->rcd[ctxt];
 	if (!rcd)
 		return;
 
+	ctxt = rcd->ctxt;
+
 	hfi1_cdbg(RCVCTRL, "ctxt %d op 0x%x", ctxt, op);
 
 	rcvctrl = read_kctxt_csr(dd, ctxt, RCV_CTXT_CTRL);
@@ -12604,20 +12679,8 @@ const char *opa_pstate_name(u32 pstate)
 	return "unknown";
 }
 
-/*
- * Read the hardware link state and set the driver's cached value of it.
- * Return the (new) current value.
- */
-u32 get_logical_state(struct hfi1_pportdata *ppd)
+static void update_statusp(struct hfi1_pportdata *ppd, u32 state)
 {
-	u32 new_state;
-
-	new_state = chip_to_opa_lstate(ppd->dd, read_logical_state(ppd->dd));
-	if (new_state != ppd->lstate) {
-		dd_dev_info(ppd->dd, "logical state changed to %s (0x%x)\n",
-			    opa_lstate_name(new_state), new_state);
-		ppd->lstate = new_state;
-	}
 	/*
 	 * Set port status flags in the page mapped into userspace
 	 * memory. Do it here to ensure a reliable state - this is
@@ -12627,7 +12690,7 @@ u32 get_logical_state(struct hfi1_pportdata *ppd)
 	 * function.
 	 */
 	if (ppd->statusp) {
-		switch (ppd->lstate) {
+		switch (state) {
 		case IB_PORT_DOWN:
 		case IB_PORT_INIT:
 			*ppd->statusp &= ~(HFI1_STATUS_IB_CONF |
@@ -12641,10 +12704,9 @@ u32 get_logical_state(struct hfi1_pportdata *ppd)
 			break;
 		}
 	}
-	return ppd->lstate;
 }
 
-/**
+/*
  * wait_logical_linkstate - wait for an IB link state change to occur
  * @ppd: port device
  * @state: the state to wait for
@@ -12658,35 +12720,88 @@ static int wait_logical_linkstate(struct hfi1_pportdata *ppd, u32 state,
 				  int msecs)
 {
 	unsigned long timeout;
+	u32 new_state;
 
 	timeout = jiffies + msecs_to_jiffies(msecs);
 	while (1) {
-		if (get_logical_state(ppd) == state)
-			return 0;
-		if (time_after(jiffies, timeout))
+		new_state = chip_to_opa_lstate(ppd->dd,
+					       read_logical_state(ppd->dd));
+		if (new_state == state)
 			break;
+		if (time_after(jiffies, timeout)) {
+			dd_dev_err(ppd->dd,
+				   "timeout waiting for link state 0x%x\n",
+				   state);
+			return -ETIMEDOUT;
+		}
 		msleep(20);
 	}
-	dd_dev_err(ppd->dd, "timeout waiting for link state 0x%x\n", state);
 
-	return -ETIMEDOUT;
+	update_statusp(ppd, state);
+	dd_dev_info(ppd->dd,
+		    "logical state changed to %s (0x%x)\n",
+		    opa_lstate_name(state),
+		    state);
+	return 0;
 }
 
-u8 hfi1_ibphys_portstate(struct hfi1_pportdata *ppd)
+static void log_state_transition(struct hfi1_pportdata *ppd, u32 state)
 {
-	u32 pstate;
-	u32 ib_pstate;
+	u32 ib_pstate = chip_to_opa_pstate(ppd->dd, state);
 
-	pstate = read_physical_state(ppd->dd);
-	ib_pstate = chip_to_opa_pstate(ppd->dd, pstate);
-	if (ppd->last_pstate != ib_pstate) {
-		dd_dev_info(ppd->dd,
-			    "%s: physical state changed to %s (0x%x), phy 0x%x\n",
-			    __func__, opa_pstate_name(ib_pstate), ib_pstate,
-			    pstate);
-		ppd->last_pstate = ib_pstate;
+	dd_dev_info(ppd->dd,
+		    "physical state changed to %s (0x%x), phy 0x%x\n",
+		    opa_pstate_name(ib_pstate), ib_pstate, state);
+}
+
+/*
+ * Read the physical hardware link state and check if it matches the host
+ * driver's anticipated state.
+ */
+static void log_physical_state(struct hfi1_pportdata *ppd, u32 state)
+{
+	u32 read_state = read_physical_state(ppd->dd);
+
+	if (read_state == state) {
+		log_state_transition(ppd, state);
+	} else {
+		dd_dev_err(ppd->dd,
+			   "anticipated phy link state 0x%x, read 0x%x\n",
+			   state, read_state);
 	}
-	return ib_pstate;
+}
+
+/*
+ * wait_physical_linkstate - wait for a physical link state change to occur
+ * @ppd: port device
+ * @state: the state to wait for
+ * @msecs: the number of milliseconds to wait
+ *
+ * Wait up to msecs milliseconds for physical link state change to occur.
+ * Returns 0 if state reached, otherwise -ETIMEDOUT.
+ */
+static int wait_physical_linkstate(struct hfi1_pportdata *ppd, u32 state,
+				   int msecs)
+{
+	u32 read_state;
+	unsigned long timeout;
+
+	timeout = jiffies + msecs_to_jiffies(msecs);
+	while (1) {
+		read_state = read_physical_state(ppd->dd);
+		if (read_state == state)
+			break;
+		if (time_after(jiffies, timeout)) {
+			dd_dev_err(ppd->dd,
+				   "timeout waiting for phy link state 0x%x\n",
+				   state);
+			return -ETIMEDOUT;
+		}
+		usleep_range(1950, 2050); /* sleep 2ms-ish */
+	}
+
+	log_state_transition(ppd, state);
+	return 0;
 }
 
 #define CLEAR_STATIC_RATE_CONTROL_SMASK(r) \
@@ -12809,30 +12924,24 @@ static void clean_up_interrupts(struct hfi1_devdata *dd)
 		for (i = 0; i < dd->num_msix_entries; i++, me++) {
 			if (!me->arg) /* => no irq, no affinity */
 				continue;
-			hfi1_put_irq_affinity(dd, &dd->msix_entries[i]);
-			free_irq(me->msix.vector, me->arg);
+			hfi1_put_irq_affinity(dd, me);
+			free_irq(me->irq, me->arg);
 		}
+
+		/* clean structures */
+		kfree(dd->msix_entries);
+		dd->msix_entries = NULL;
+		dd->num_msix_entries = 0;
 	} else {
 		/* INTx */
 		if (dd->requested_intx_irq) {
 			free_irq(dd->pcidev->irq, dd);
 			dd->requested_intx_irq = 0;
 		}
-	}
-
-	/* turn off interrupts */
-	if (dd->num_msix_entries) {
-		/* MSI-X */
-		pci_disable_msix(dd->pcidev);
-	} else {
-		/* INTx */
 		disable_intx(dd->pcidev);
 	}
 
-	/* clean structures */
-	kfree(dd->msix_entries);
-	dd->msix_entries = NULL;
-	dd->num_msix_entries = 0;
+	pci_free_irq_vectors(dd->pcidev);
 }
 
 /*
@@ -12953,7 +13062,7 @@ static int request_msix_irqs(struct hfi1_devdata *dd)
 			me->type = IRQ_SDMA;
 		} else if (first_rx <= i && i < last_rx) {
 			idx = i - first_rx;
-			rcd = dd->rcd[idx];
+			rcd = hfi1_rcd_get_by_index(dd, idx);
 			if (rcd) {
 				/*
 				 * Set the interrupt register and mask for this
@@ -12972,6 +13081,7 @@ static int request_msix_irqs(struct hfi1_devdata *dd)
 				remap_intr(dd, IS_RCVAVAIL_START + idx, i);
 				me->type = IRQ_RCVCTXT;
 				rcd->msix_intr = i;
+				hfi1_rcd_put(rcd);
 			}
 		} else {
 			/* not in our expected range - complain, then
@@ -12986,13 +13096,21 @@ static int request_msix_irqs(struct hfi1_devdata *dd)
 			continue;
 		/* make sure the name is terminated */
 		me->name[sizeof(me->name) - 1] = 0;
+		me->irq = pci_irq_vector(dd->pcidev, i);
+		/*
+		 * On err return me->irq.  Don't need to clear this
+		 * because 'arg' has not been set, and cleanup will
+		 * do the right thing.
+		 */
+		if (me->irq < 0)
+			return me->irq;
 
-		ret = request_threaded_irq(me->msix.vector, handler, thread, 0,
+		ret = request_threaded_irq(me->irq, handler, thread, 0,
 					   me->name, arg);
 		if (ret) {
 			dd_dev_err(dd,
-				   "unable to allocate %s interrupt, vector %d, index %d, err %d\n",
-				   err_info, me->msix.vector, idx, ret);
+				   "unable to allocate %s interrupt, irq %d, index %d, err %d\n",
+				   err_info, me->irq, idx, ret);
 			return ret;
 		}
 		/*
@@ -13003,8 +13121,7 @@ static int request_msix_irqs(struct hfi1_devdata *dd)
 
 		ret = hfi1_get_irq_affinity(dd, me);
 		if (ret)
-			dd_dev_err(dd,
-				   "unable to pin IRQ %d\n", ret);
+			dd_dev_err(dd, "unable to pin IRQ %d\n", ret);
 	}
 
 	return ret;
@@ -13023,7 +13140,7 @@ void hfi1_vnic_synchronize_irq(struct hfi1_devdata *dd)
 		struct hfi1_ctxtdata *rcd = dd->vnic.ctxt[i];
 		struct hfi1_msix_entry *me = &dd->msix_entries[rcd->msix_intr];
 
-		synchronize_irq(me->msix.vector);
+		synchronize_irq(me->irq);
 	}
 }
 
@@ -13036,7 +13153,7 @@ void hfi1_reset_vnic_msix_info(struct hfi1_ctxtdata *rcd)
 		return;
 
 	hfi1_put_irq_affinity(dd, me);
-	free_irq(me->msix.vector, me->arg);
+	free_irq(me->irq, me->arg);
 
 	me->arg = NULL;
 }
@@ -13064,14 +13181,19 @@ void hfi1_set_vnic_msix_info(struct hfi1_ctxtdata *rcd)
 		 DRIVER_NAME "_%d kctxt%d", dd->unit, idx);
 	me->name[sizeof(me->name) - 1] = 0;
 	me->type = IRQ_RCVCTXT;
-
+	me->irq = pci_irq_vector(dd->pcidev, rcd->msix_intr);
+	if (me->irq < 0) {
+		dd_dev_err(dd, "vnic irq vector request (idx %d) fail %d\n",
+			   idx, me->irq);
+		return;
+	}
 	remap_intr(dd, IS_RCVAVAIL_START + idx, rcd->msix_intr);
 
-	ret = request_threaded_irq(me->msix.vector, receive_context_interrupt,
+	ret = request_threaded_irq(me->irq, receive_context_interrupt,
 				   receive_context_thread, 0, me->name, arg);
 	if (ret) {
-		dd_dev_err(dd, "vnic irq request (vector %d, idx %d) fail %d\n",
-			   me->msix.vector, idx, ret);
+		dd_dev_err(dd, "vnic irq request (irq %d, idx %d) fail %d\n",
+			   me->irq, idx, ret);
 		return;
 	}
 	/*
@@ -13084,7 +13206,7 @@ void hfi1_set_vnic_msix_info(struct hfi1_ctxtdata *rcd)
 	if (ret) {
 		dd_dev_err(dd,
 			   "unable to pin IRQ %d\n", ret);
-		free_irq(me->msix.vector, me->arg);
+		free_irq(me->irq, me->arg);
 	}
 }
 
@@ -13107,9 +13229,8 @@ static void reset_interrupts(struct hfi1_devdata *dd)
 
 static int set_up_interrupts(struct hfi1_devdata *dd)
 {
-	struct hfi1_msix_entry *entries;
-	u32 total, request;
-	int i, ret;
+	u32 total;
+	int ret, request;
 	int single_interrupt = 0; /* we expect to have all the interrupts */
 
 	/*
@@ -13121,39 +13242,31 @@ static int set_up_interrupts(struct hfi1_devdata *dd)
 	 */
 	total = 1 + dd->num_sdma + dd->n_krcv_queues + HFI1_NUM_VNIC_CTXT;
 
-	entries = kcalloc(total, sizeof(*entries), GFP_KERNEL);
-	if (!entries) {
-		ret = -ENOMEM;
-		goto fail;
-	}
-	/* 1-1 MSI-X entry assignment */
-	for (i = 0; i < total; i++)
-		entries[i].msix.entry = i;
-
 	/* ask for MSI-X interrupts */
-	request = total;
-	request_msix(dd, &request, entries);
-
-	if (request == 0) {
+	request = request_msix(dd, total);
+	if (request < 0) {
+		ret = request;
+		goto fail;
+	} else if (request == 0) {
 		/* using INTx */
 		/* dd->num_msix_entries already zero */
-		kfree(entries);
 		single_interrupt = 1;
 		dd_dev_err(dd, "MSI-X failed, using INTx interrupts\n");
+	} else if (request < total) {
+		/* using MSI-X, with reduced interrupts */
+		dd_dev_err(dd, "reduced interrupt found, wanted %u, got %u\n",
+			   total, request);
+		ret = -EINVAL;
+		goto fail;
 	} else {
-		/* using MSI-X */
-		dd->num_msix_entries = request;
-		dd->msix_entries = entries;
-
-		if (request != total) {
-			/* using MSI-X, with reduced interrupts */
-			dd_dev_err(
-				dd,
-				"cannot handle reduced interrupt case, want %u, got %u\n",
-				total, request);
-			ret = -EINVAL;
+		dd->msix_entries = kcalloc(total, sizeof(*dd->msix_entries),
+					   GFP_KERNEL);
+		if (!dd->msix_entries) {
+			ret = -ENOMEM;
 			goto fail;
 		}
+		/* using MSI-X */
+		dd->num_msix_entries = total;
 		dd_dev_info(dd, "%u MSI-X interrupts allocated\n", total);
 	}
 
@@ -13396,8 +13509,7 @@ static void write_uninitialized_csrs_and_memories(struct hfi1_devdata *dd)
 
 	/* RcvArray */
 	for (i = 0; i < dd->chip_rcv_array_count; i++)
-		write_csr(dd, RCV_ARRAY + (8 * i),
-			  RCV_ARRAY_RT_WRITE_ENABLE_SMASK);
+		hfi1_put_tid(dd, i, PT_INVALID_FLUSH, 0, 0);
 
 	/* RcvQPMapTable */
 	for (i = 0; i < 32; i++)
@@ -13831,9 +13943,10 @@ static void init_sc2vl_tables(struct hfi1_devdata *dd)
  * a reset following the (possible) FLR in this routine.
  *
  */
-static void init_chip(struct hfi1_devdata *dd)
+static int init_chip(struct hfi1_devdata *dd)
 {
 	int i;
+	int ret = 0;
 
 	/*
 	 * Put the HFI CSRs in a known state.
@@ -13881,12 +13994,22 @@ static void init_chip(struct hfi1_devdata *dd)
 		pcie_flr(dd->pcidev);
 
 		/* restore command and BARs */
-		restore_pci_variables(dd);
+		ret = restore_pci_variables(dd);
+		if (ret) {
+			dd_dev_err(dd, "%s: Could not restore PCI variables\n",
+				   __func__);
+			return ret;
+		}
 
 		if (is_ax(dd)) {
 			dd_dev_info(dd, "Resetting CSRs with FLR\n");
 			pcie_flr(dd->pcidev);
-			restore_pci_variables(dd);
+			ret = restore_pci_variables(dd);
+			if (ret) {
+				dd_dev_err(dd, "%s: Could not restore PCI variables\n",
+					   __func__);
+				return ret;
+			}
 		}
 	} else {
 		dd_dev_info(dd, "Resetting CSRs with writes\n");
@@ -13914,6 +14037,7 @@ static void init_chip(struct hfi1_devdata *dd)
 	write_csr(dd, ASIC_QSFP1_OUT, 0x1f);
 	write_csr(dd, ASIC_QSFP2_OUT, 0x1f);
 	init_chip_resources(dd);
+	return ret;
 }
 
 static void init_early_variables(struct hfi1_devdata *dd)
@@ -14365,6 +14489,7 @@ void hfi1_deinit_vnic_rsm(struct hfi1_devdata *dd)
 static void init_rxe(struct hfi1_devdata *dd)
 {
 	struct rsm_map_table *rmt;
+	u64 val;
 
 	/* enable all receive errors */
 	write_csr(dd, RCV_ERR_MASK, ~0ull);
@@ -14389,6 +14514,11 @@ static void init_rxe(struct hfi1_devdata *dd)
 	 * (64 bytes).  Max_Payload_Size is possibly modified upward in
 	 * tune_pcie_caps() which is called after this routine.
 	 */
+
+	/* Have 16 bytes (4DW) of bypass header available in header queue */
+	val = read_csr(dd, RCV_BYPASS);
+	val |= (4ull << 16);
+	write_csr(dd, RCV_BYPASS, val);
 }
 
 static void init_other(struct hfi1_devdata *dd)
@@ -14470,99 +14600,86 @@ static void init_txe(struct hfi1_devdata *dd)
 		write_csr(dd, SEND_CM_TIMER_CTRL, HFI1_CREDIT_RETURN_RATE);
 }
 
-int hfi1_set_ctxt_jkey(struct hfi1_devdata *dd, unsigned ctxt, u16 jkey)
+int hfi1_set_ctxt_jkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd,
+		       u16 jkey)
 {
-	struct hfi1_ctxtdata *rcd = dd->rcd[ctxt];
-	unsigned sctxt;
-	int ret = 0;
+	u8 hw_ctxt;
 	u64 reg;
 
-	if (!rcd || !rcd->sc) {
-		ret = -EINVAL;
-		goto done;
-	}
-	sctxt = rcd->sc->hw_context;
+	if (!rcd || !rcd->sc)
+		return -EINVAL;
+
+	hw_ctxt = rcd->sc->hw_context;
 	reg = SEND_CTXT_CHECK_JOB_KEY_MASK_SMASK | /* mask is always 1's */
 		((jkey & SEND_CTXT_CHECK_JOB_KEY_VALUE_MASK) <<
 		 SEND_CTXT_CHECK_JOB_KEY_VALUE_SHIFT);
 	/* JOB_KEY_ALLOW_PERMISSIVE is not allowed by default */
 	if (HFI1_CAP_KGET_MASK(rcd->flags, ALLOW_PERM_JKEY))
 		reg |= SEND_CTXT_CHECK_JOB_KEY_ALLOW_PERMISSIVE_SMASK;
-	write_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_JOB_KEY, reg);
+	write_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_JOB_KEY, reg);
 	/*
 	 * Enable send-side J_KEY integrity check, unless this is A0 h/w
 	 */
 	if (!is_ax(dd)) {
-		reg = read_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_ENABLE);
+		reg = read_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_ENABLE);
 		reg |= SEND_CTXT_CHECK_ENABLE_CHECK_JOB_KEY_SMASK;
-		write_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_ENABLE, reg);
+		write_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_ENABLE, reg);
 	}
 
 	/* Enable J_KEY check on receive context. */
 	reg = RCV_KEY_CTRL_JOB_KEY_ENABLE_SMASK |
 		((jkey & RCV_KEY_CTRL_JOB_KEY_VALUE_MASK) <<
 		 RCV_KEY_CTRL_JOB_KEY_VALUE_SHIFT);
-	write_kctxt_csr(dd, ctxt, RCV_KEY_CTRL, reg);
-done:
-	return ret;
+	write_kctxt_csr(dd, rcd->ctxt, RCV_KEY_CTRL, reg);
+
+	return 0;
 }
 
-int hfi1_clear_ctxt_jkey(struct hfi1_devdata *dd, unsigned ctxt)
+int hfi1_clear_ctxt_jkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
 {
-	struct hfi1_ctxtdata *rcd = dd->rcd[ctxt];
-	unsigned sctxt;
-	int ret = 0;
+	u8 hw_ctxt;
 	u64 reg;
 
-	if (!rcd || !rcd->sc) {
-		ret = -EINVAL;
-		goto done;
-	}
-	sctxt = rcd->sc->hw_context;
-	write_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_JOB_KEY, 0);
+	if (!rcd || !rcd->sc)
+		return -EINVAL;
+
+	hw_ctxt = rcd->sc->hw_context;
+	write_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_JOB_KEY, 0);
 	/*
 	 * Disable send-side J_KEY integrity check, unless this is A0 h/w.
 	 * This check would not have been enabled for A0 h/w, see
 	 * set_ctxt_jkey().
 	 */
 	if (!is_ax(dd)) {
-		reg = read_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_ENABLE);
+		reg = read_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_ENABLE);
 		reg &= ~SEND_CTXT_CHECK_ENABLE_CHECK_JOB_KEY_SMASK;
-		write_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_ENABLE, reg);
+		write_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_ENABLE, reg);
 	}
 	/* Turn off the J_KEY on the receive side */
-	write_kctxt_csr(dd, ctxt, RCV_KEY_CTRL, 0);
-done:
-	return ret;
+	write_kctxt_csr(dd, rcd->ctxt, RCV_KEY_CTRL, 0);
+
+	return 0;
 }
 
-int hfi1_set_ctxt_pkey(struct hfi1_devdata *dd, unsigned ctxt, u16 pkey)
+int hfi1_set_ctxt_pkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd,
+		       u16 pkey)
 {
-	struct hfi1_ctxtdata *rcd;
-	unsigned sctxt;
-	int ret = 0;
+	u8 hw_ctxt;
 	u64 reg;
 
-	if (ctxt < dd->num_rcv_contexts) {
-		rcd = dd->rcd[ctxt];
-	} else {
-		ret = -EINVAL;
-		goto done;
-	}
-	if (!rcd || !rcd->sc) {
-		ret = -EINVAL;
-		goto done;
-	}
-	sctxt = rcd->sc->hw_context;
+	if (!rcd || !rcd->sc)
+		return -EINVAL;
+
+	hw_ctxt = rcd->sc->hw_context;
 	reg = ((u64)pkey & SEND_CTXT_CHECK_PARTITION_KEY_VALUE_MASK) <<
 		SEND_CTXT_CHECK_PARTITION_KEY_VALUE_SHIFT;
-	write_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_PARTITION_KEY, reg);
-	reg = read_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_ENABLE);
+	write_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_PARTITION_KEY, reg);
+	reg = read_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_ENABLE);
 	reg |= SEND_CTXT_CHECK_ENABLE_CHECK_PARTITION_KEY_SMASK;
 	reg &= ~SEND_CTXT_CHECK_ENABLE_DISALLOW_KDETH_PACKETS_SMASK;
-	write_kctxt_csr(dd, sctxt, SEND_CTXT_CHECK_ENABLE, reg);
-done:
-	return ret;
+	write_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_ENABLE, reg);
+
+	return 0;
 }
 
 int hfi1_clear_ctxt_pkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *ctxt)
@@ -14573,9 +14690,6 @@ int hfi1_clear_ctxt_pkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *ctxt)
 	if (!ctxt || !ctxt->sc)
 		return -EINVAL;
 
-	if (ctxt->ctxt >= dd->num_rcv_contexts)
-		return -EINVAL;
-
 	hw_ctxt = ctxt->sc->hw_context;
 	reg = read_kctxt_csr(dd, hw_ctxt, SEND_CTXT_CHECK_ENABLE);
 	reg &= ~SEND_CTXT_CHECK_ENABLE_CHECK_PARTITION_KEY_SMASK;
@@ -14773,7 +14887,6 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 		}
 		ppd->vls_supported = num_vls;
 		ppd->vls_operational = ppd->vls_supported;
-		ppd->actual_vls_operational = ppd->vls_supported;
 		/* Set the default MTU. */
 		for (vl = 0; vl < num_vls; vl++)
 			dd->vld[vl].mtu = hfi1_max_mtu;
@@ -14782,7 +14895,6 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 		 * Set the initial values to reasonable default, will be set
 		 * for real when link is up.
 		 */
-		ppd->lstate = IB_PORT_DOWN;
 		ppd->overrun_threshold = 0x4;
 		ppd->phy_error_threshold = 0xf;
 		ppd->port_crc_mode_enabled = link_crc_mask;
@@ -14793,7 +14905,6 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 		/* start in offline */
 		ppd->host_link_state = HLS_DN_OFFLINE;
 		init_vl_arb_caches(ppd);
-		ppd->last_pstate = 0xff; /* invalid value */
 	}
 
 	dd->link_default = HLS_DN_POLL;
@@ -14807,6 +14918,11 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 	if (ret < 0)
 		goto bail_free;
 
+	/* Save PCI space registers to rewrite after device reset */
+	ret = save_pci_variables(dd);
+	if (ret < 0)
+		goto bail_cleanup;
+
 	/* verify that reads actually work, save revision for reset check */
 	dd->revision = read_csr(dd, CCE_REVISION);
 	if (dd->revision == ~(u64)0) {
@@ -14899,7 +15015,9 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 		goto bail_cleanup;
 
 	/* obtain chip sizes, reset chip CSRs */
-	init_chip(dd);
+	ret = init_chip(dd);
+	if (ret)
+		goto bail_cleanup;
 
 	/* read in the PCIe link speed information */
 	ret = pcie_speeds(dd);
@@ -14974,10 +15092,16 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 	if (ret)
 		goto bail_cleanup;
 
-	ret = hfi1_create_ctxts(dd);
+	ret = hfi1_create_kctxts(dd);
 	if (ret)
 		goto bail_cleanup;
 
+	/*
+	 * Initialize aspm, to be done after gen3 transition and setting up
+	 * contexts and before enabling interrupts
+	 */
+	aspm_init(dd);
+
 	dd->rcvhdrsize = DEFAULT_RCVHDRSIZE;
 	/*
 	 * rcd[0] is guaranteed to be valid by this point. Also, all
@@ -14996,7 +15120,7 @@ struct hfi1_devdata *hfi1_init_dd(struct pci_dev *pdev,
 			goto bail_cleanup;
 	}
 
-	/* use contexts created by hfi1_create_ctxts */
+	/* use contexts created by hfi1_create_kctxts */
 	ret = set_up_interrupts(dd);
 	if (ret)
 		goto bail_cleanup;
diff --git a/drivers/infiniband/hw/hfi1/chip.h b/drivers/infiniband/hw/hfi1/chip.h
index cbe455d..b8345a6 100644
--- a/drivers/infiniband/hw/hfi1/chip.h
+++ b/drivers/infiniband/hw/hfi1/chip.h
@@ -384,6 +384,7 @@
 #define VERIFY_CAP_LOCAL_FABRIC	     0x08
 #define VERIFY_CAP_LOCAL_LINK_WIDTH  0x09
 #define LOCAL_DEVICE_ID		     0x0a
+#define RESERVED_REGISTERS	     0x0b
 #define LOCAL_LNI_INFO		     0x0c
 #define REMOTE_LNI_INFO              0x0d
 #define MISC_STATUS		     0x0e
@@ -506,6 +507,9 @@
 #define DOWN_REMOTE_REASON_SHIFT 16
 #define DOWN_REMOTE_REASON_MASK  0xff
 
+#define HOST_INTERFACE_VERSION_SHIFT 16
+#define HOST_INTERFACE_VERSION_MASK  0xff
+
 /* verify capability PHY power management bits */
 #define PWRM_BER_CONTROL	0x1
 #define PWRM_BANDWIDTH_CONTROL	0x2
@@ -605,11 +609,11 @@ int read_lcb_csr(struct hfi1_devdata *dd, u32 offset, u64 *data);
 int write_lcb_csr(struct hfi1_devdata *dd, u32 offset, u64 data);
 
 void __iomem *get_csr_addr(
-	struct hfi1_devdata *dd,
+	const struct hfi1_devdata *dd,
 	u32 offset);
 
 static inline void __iomem *get_kctxt_csr_addr(
-	struct hfi1_devdata *dd,
+	const struct hfi1_devdata *dd,
 	int ctxt,
 	u32 offset0)
 {
@@ -644,7 +648,6 @@ u64 create_pbc(struct hfi1_pportdata *ppd, u64 flags, int srate_mbs, u32 vl,
 #define NUM_PCIE_SERDES 16	/* number of PCIe serdes on the SBus */
 extern const u8 pcie_serdes_broadcast[];
 extern const u8 pcie_pcs_addrs[2][NUM_PCIE_SERDES];
-extern uint platform_config_load;
 
 /* SBus commands */
 #define RESET_SBUS_RECEIVER 0x20
@@ -704,6 +707,7 @@ int read_8051_data(struct hfi1_devdata *dd, u32 addr, u32 len, u64 *result);
 /* chip.c */
 void read_misc_status(struct hfi1_devdata *dd, u8 *ver_major, u8 *ver_minor,
 		      u8 *ver_patch);
+int write_host_interface_version(struct hfi1_devdata *dd, u8 version);
 void read_guid(struct hfi1_devdata *dd);
 int wait_fm_ready(struct hfi1_devdata *dd, u32 mstimeout);
 void set_link_down_reason(struct hfi1_pportdata *ppd, u8 lcl_reason,
@@ -743,11 +747,10 @@ int is_ax(struct hfi1_devdata *dd);
 int is_bx(struct hfi1_devdata *dd);
 u32 read_physical_state(struct hfi1_devdata *dd);
 u32 chip_to_opa_pstate(struct hfi1_devdata *dd, u32 chip_pstate);
-u32 get_logical_state(struct hfi1_pportdata *ppd);
 const char *opa_lstate_name(u32 lstate);
 const char *opa_pstate_name(u32 pstate);
-u32 driver_physical_state(struct hfi1_pportdata *ppd);
-u32 driver_logical_state(struct hfi1_pportdata *ppd);
+u32 driver_pstate(struct hfi1_pportdata *ppd);
+u32 driver_lstate(struct hfi1_pportdata *ppd);
 
 int acquire_lcb_access(struct hfi1_devdata *dd, int sleep_ok);
 int release_lcb_access(struct hfi1_devdata *dd, int sleep_ok);
@@ -1347,21 +1350,21 @@ enum {
 u64 get_all_cpu_total(u64 __percpu *cntr);
 void hfi1_start_cleanup(struct hfi1_devdata *dd);
 void hfi1_clear_tids(struct hfi1_ctxtdata *rcd);
-struct ib_header *hfi1_get_msgheader(
-				struct hfi1_devdata *dd, __le32 *rhf_addr);
 void hfi1_init_ctxt(struct send_context *sc);
 void hfi1_put_tid(struct hfi1_devdata *dd, u32 index,
 		  u32 type, unsigned long pa, u16 order);
 void hfi1_quiet_serdes(struct hfi1_pportdata *ppd);
-void hfi1_rcvctrl(struct hfi1_devdata *dd, unsigned int op, int ctxt);
+void hfi1_rcvctrl(struct hfi1_devdata *dd, unsigned int op,
+		  struct hfi1_ctxtdata *rcd);
 u32 hfi1_read_cntrs(struct hfi1_devdata *dd, char **namep, u64 **cntrp);
 u32 hfi1_read_portcntrs(struct hfi1_pportdata *ppd, char **namep, u64 **cntrp);
-u8 hfi1_ibphys_portstate(struct hfi1_pportdata *ppd);
 int hfi1_get_ib_cfg(struct hfi1_pportdata *ppd, int which);
 int hfi1_set_ib_cfg(struct hfi1_pportdata *ppd, int which, u32 val);
-int hfi1_set_ctxt_jkey(struct hfi1_devdata *dd, unsigned ctxt, u16 jkey);
-int hfi1_clear_ctxt_jkey(struct hfi1_devdata *dd, unsigned ctxt);
-int hfi1_set_ctxt_pkey(struct hfi1_devdata *dd, unsigned ctxt, u16 pkey);
+int hfi1_set_ctxt_jkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd,
+		       u16 jkey);
+int hfi1_clear_ctxt_jkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *ctxt);
+int hfi1_set_ctxt_pkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *ctxt,
+		       u16 pkey);
 int hfi1_clear_ctxt_pkey(struct hfi1_devdata *dd, struct hfi1_ctxtdata *ctxt);
 void hfi1_read_link_quality(struct hfi1_devdata *dd, u8 *link_quality);
 void hfi1_init_vnic_rsm(struct hfi1_devdata *dd);
diff --git a/drivers/infiniband/hw/hfi1/common.h b/drivers/infiniband/hw/hfi1/common.h
index 995d62c..3e27794e 100644
--- a/drivers/infiniband/hw/hfi1/common.h
+++ b/drivers/infiniband/hw/hfi1/common.h
@@ -325,22 +325,15 @@ struct diag_pkt {
 #define HFI1_LRH_BTH 0x0002      /* 1. word of IB LRH - next header: BTH */
 
 /* misc. */
+#define SC15_PACKET 0xF
 #define SIZE_OF_CRC 1
+#define SIZE_OF_LT 1
 
 #define LIM_MGMT_P_KEY       0x7FFF
 #define FULL_MGMT_P_KEY      0xFFFF
 
 #define DEFAULT_P_KEY LIM_MGMT_P_KEY
 
-/**
- * 0xF8 - 4 bits of multicast range and 1 bit for collective range
- * Example: For 24 bit LID space,
- * Multicast range: 0xF00000 to 0xF7FFFF
- * Collective range: 0xF80000 to 0xFFFFFE
- */
-#define HFI1_MCAST_NR 0x4 /* Number of top bits set */
-#define HFI1_COLLECTIVE_NR 0x1 /* Number of bits after MCAST_NR */
-
 #define HFI1_PSM_IOC_BASE_SEQ 0x0
 
 static inline __u64 rhf_to_cpu(const __le32 *rbuf)
diff --git a/drivers/infiniband/hw/hfi1/debugfs.c b/drivers/infiniband/hw/hfi1/debugfs.c
index e9fa3c2..36ae1fd 100644
--- a/drivers/infiniband/hw/hfi1/debugfs.c
+++ b/drivers/infiniband/hw/hfi1/debugfs.c
@@ -1,4 +1,3 @@
-#ifdef CONFIG_DEBUG_FS
 /*
  * Copyright(c) 2015-2017 Intel Corporation.
  *
@@ -173,12 +172,15 @@ static int _opcode_stats_seq_show(struct seq_file *s, void *v)
 	u64 n_packets = 0, n_bytes = 0;
 	struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
 	struct hfi1_devdata *dd = dd_from_dev(ibd);
+	struct hfi1_ctxtdata *rcd;
 
 	for (j = 0; j < dd->first_dyn_alloc_ctxt; j++) {
-		if (!dd->rcd[j])
-			continue;
-		n_packets += dd->rcd[j]->opstats->stats[i].n_packets;
-		n_bytes += dd->rcd[j]->opstats->stats[i].n_bytes;
+		rcd = hfi1_rcd_get_by_index(dd, j);
+		if (rcd) {
+			n_packets += rcd->opstats->stats[i].n_packets;
+			n_bytes += rcd->opstats->stats[i].n_bytes;
+		}
+		hfi1_rcd_put(rcd);
 	}
 	if (!n_packets && !n_bytes)
 		return SEQ_SKIP;
@@ -231,6 +233,7 @@ static int _ctx_stats_seq_show(struct seq_file *s, void *v)
 	u64 n_packets = 0;
 	struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
 	struct hfi1_devdata *dd = dd_from_dev(ibd);
+	struct hfi1_ctxtdata *rcd;
 
 	if (v == SEQ_START_TOKEN) {
 		seq_puts(s, "Ctx:npkts\n");
@@ -240,11 +243,14 @@ static int _ctx_stats_seq_show(struct seq_file *s, void *v)
 	spos = v;
 	i = *spos;
 
-	if (!dd->rcd[i])
+	rcd = hfi1_rcd_get_by_index(dd, i);
+	if (!rcd)
 		return SEQ_SKIP;
 
-	for (j = 0; j < ARRAY_SIZE(dd->rcd[i]->opstats->stats); j++)
-		n_packets += dd->rcd[i]->opstats->stats[j].n_packets;
+	for (j = 0; j < ARRAY_SIZE(rcd->opstats->stats); j++)
+		n_packets += rcd->opstats->stats[j].n_packets;
+
+	hfi1_rcd_put(rcd);
 
 	if (!n_packets)
 		return SEQ_SKIP;
@@ -260,10 +266,10 @@ DEBUGFS_FILE_OPS(ctx_stats);
 static void *_qp_stats_seq_start(struct seq_file *s, loff_t *pos)
 	__acquires(RCU)
 {
-	struct qp_iter *iter;
+	struct rvt_qp_iter *iter;
 	loff_t n = *pos;
 
-	iter = qp_iter_init(s->private);
+	iter = rvt_qp_iter_init(s->private, 0, NULL);
 
 	/* stop calls rcu_read_unlock */
 	rcu_read_lock();
@@ -272,7 +278,7 @@ static void *_qp_stats_seq_start(struct seq_file *s, loff_t *pos)
 		return NULL;
 
 	do {
-		if (qp_iter_next(iter)) {
+		if (rvt_qp_iter_next(iter)) {
 			kfree(iter);
 			return NULL;
 		}
@@ -285,11 +291,11 @@ static void *_qp_stats_seq_next(struct seq_file *s, void *iter_ptr,
 				loff_t *pos)
 	__must_hold(RCU)
 {
-	struct qp_iter *iter = iter_ptr;
+	struct rvt_qp_iter *iter = iter_ptr;
 
 	(*pos)++;
 
-	if (qp_iter_next(iter)) {
+	if (rvt_qp_iter_next(iter)) {
 		kfree(iter);
 		return NULL;
 	}
@@ -305,7 +311,7 @@ static void _qp_stats_seq_stop(struct seq_file *s, void *iter_ptr)
 
 static int _qp_stats_seq_show(struct seq_file *s, void *iter_ptr)
 {
-	struct qp_iter *iter = iter_ptr;
+	struct rvt_qp_iter *iter = iter_ptr;
 
 	if (!iter)
 		return 0;
@@ -361,6 +367,52 @@ DEBUGFS_SEQ_FILE_OPS(sdes);
 DEBUGFS_SEQ_FILE_OPEN(sdes)
 DEBUGFS_FILE_OPS(sdes);
 
+static void *_rcds_seq_start(struct seq_file *s, loff_t *pos)
+{
+	struct hfi1_ibdev *ibd;
+	struct hfi1_devdata *dd;
+
+	ibd = (struct hfi1_ibdev *)s->private;
+	dd = dd_from_dev(ibd);
+	if (!dd->rcd || *pos >= dd->n_krcv_queues)
+		return NULL;
+	return pos;
+}
+
+static void *_rcds_seq_next(struct seq_file *s, void *v, loff_t *pos)
+{
+	struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
+	struct hfi1_devdata *dd = dd_from_dev(ibd);
+
+	++*pos;
+	if (!dd->rcd || *pos >= dd->n_krcv_queues)
+		return NULL;
+	return pos;
+}
+
+static void _rcds_seq_stop(struct seq_file *s, void *v)
+{
+}
+
+static int _rcds_seq_show(struct seq_file *s, void *v)
+{
+	struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
+	struct hfi1_devdata *dd = dd_from_dev(ibd);
+	struct hfi1_ctxtdata *rcd;
+	loff_t *spos = v;
+	loff_t i = *spos;
+
+	rcd = hfi1_rcd_get_by_index(dd, i);
+	if (rcd)
+		seqfile_dump_rcd(s, rcd);
+	hfi1_rcd_put(rcd);
+	return 0;
+}
+
+DEBUGFS_SEQ_FILE_OPS(rcds);
+DEBUGFS_SEQ_FILE_OPEN(rcds)
+DEBUGFS_FILE_OPS(rcds);
+
 /* read the per-device counters */
 static ssize_t dev_counters_read(struct file *file, char __user *buf,
 				 size_t count, loff_t *ppos)
@@ -1098,12 +1150,15 @@ static int _fault_stats_seq_show(struct seq_file *s, void *v)
 	u64 n_packets = 0, n_bytes = 0;
 	struct hfi1_ibdev *ibd = (struct hfi1_ibdev *)s->private;
 	struct hfi1_devdata *dd = dd_from_dev(ibd);
+	struct hfi1_ctxtdata *rcd;
 
 	for (j = 0; j < dd->first_dyn_alloc_ctxt; j++) {
-		if (!dd->rcd[j])
-			continue;
-		n_packets += dd->rcd[j]->opstats->stats[i].n_packets;
-		n_bytes += dd->rcd[j]->opstats->stats[i].n_bytes;
+		rcd = hfi1_rcd_get_by_index(dd, j);
+		if (rcd) {
+			n_packets += rcd->opstats->stats[i].n_packets;
+			n_bytes += rcd->opstats->stats[i].n_bytes;
+		}
+		hfi1_rcd_put(rcd);
 	}
 	if (!n_packets && !n_bytes)
 		return SEQ_SKIP;
@@ -1311,6 +1366,7 @@ void hfi1_dbg_ibdev_init(struct hfi1_ibdev *ibd)
 	DEBUGFS_SEQ_FILE_CREATE(ctx_stats, ibd->hfi1_ibdev_dbg, ibd);
 	DEBUGFS_SEQ_FILE_CREATE(qp_stats, ibd->hfi1_ibdev_dbg, ibd);
 	DEBUGFS_SEQ_FILE_CREATE(sdes, ibd->hfi1_ibdev_dbg, ibd);
+	DEBUGFS_SEQ_FILE_CREATE(rcds, ibd->hfi1_ibdev_dbg, ibd);
 	DEBUGFS_SEQ_FILE_CREATE(sdma_cpu_list, ibd->hfi1_ibdev_dbg, ibd);
 	/* dev counter files */
 	for (i = 0; i < ARRAY_SIZE(cntr_ops); i++)
@@ -1478,5 +1534,3 @@ void hfi1_dbg_exit(void)
 	debugfs_remove_recursive(hfi1_dbg_root);
 	hfi1_dbg_root = NULL;
 }
-
-#endif
diff --git a/drivers/infiniband/hw/hfi1/driver.c b/drivers/infiniband/hw/hfi1/driver.c
index a50870e..7372cc0 100644
--- a/drivers/infiniband/hw/hfi1/driver.c
+++ b/drivers/infiniband/hw/hfi1/driver.c
@@ -96,7 +96,6 @@ MODULE_PARM_DESC(cap_mask, "Bit mask of enabled/disabled HW features");
 
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_DESCRIPTION("Intel Omni-Path Architecture driver");
-MODULE_VERSION(HFI1_DRIVER_VERSION);
 
 /*
  * MAX_PKT_RCV is the max # if packets processed per receive interrupt.
@@ -196,7 +195,7 @@ int hfi1_count_active_units(void)
 
 	spin_lock_irqsave(&hfi1_devs_lock, flags);
 	list_for_each_entry(dd, &hfi1_dev_list, list) {
-		if (!(dd->flags & HFI1_PRESENT) || !dd->kregbase)
+		if (!(dd->flags & HFI1_PRESENT) || !dd->kregbase1)
 			continue;
 		for (pidx = 0; pidx < dd->num_pports; ++pidx) {
 			ppd = dd->pport + pidx;
@@ -224,6 +223,27 @@ static inline void *get_egrbuf(const struct hfi1_ctxtdata *rcd, u64 rhf,
 			(offset * RCV_BUF_BLOCK_SIZE));
 }
 
+static inline void *hfi1_get_header(struct hfi1_devdata *dd,
+				    __le32 *rhf_addr)
+{
+	u32 offset = rhf_hdrq_offset(rhf_to_cpu(rhf_addr));
+
+	return (void *)(rhf_addr - dd->rhf_offset + offset);
+}
+
+static inline struct ib_header *hfi1_get_msgheader(struct hfi1_devdata *dd,
+						   __le32 *rhf_addr)
+{
+	return (struct ib_header *)hfi1_get_header(dd, rhf_addr);
+}
+
+static inline struct hfi1_16b_header
+		*hfi1_get_16B_header(struct hfi1_devdata *dd,
+				     __le32 *rhf_addr)
+{
+	return (struct hfi1_16b_header *)hfi1_get_header(dd, rhf_addr);
+}
+
 /*
  * Validate and encode the a given RcvArray Buffer size.
  * The function will check whether the given size falls within
@@ -249,7 +269,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
 {
 	struct ib_header *rhdr = packet->hdr;
 	u32 rte = rhf_rcv_type_err(packet->rhf);
-	int lnh = ib_get_lnh(rhdr);
+	u32 mlid_base;
 	struct hfi1_ibport *ibp = rcd_to_iport(rcd);
 	struct hfi1_devdata *dd = ppd->dd;
 	struct rvt_dev_info *rdi = &dd->verbs_dev.rdi;
@@ -257,37 +277,47 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
 	if (packet->rhf & (RHF_VCRC_ERR | RHF_ICRC_ERR))
 		return;
 
+	if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+		goto drop;
+	} else {
+		u8 lnh = ib_get_lnh(rhdr);
+
+		mlid_base = be16_to_cpu(IB_MULTICAST_LID_BASE);
+		if (lnh == HFI1_LRH_BTH) {
+			packet->ohdr = &rhdr->u.oth;
+		} else if (lnh == HFI1_LRH_GRH) {
+			packet->ohdr = &rhdr->u.l.oth;
+			packet->grh = &rhdr->u.l.grh;
+		} else {
+			goto drop;
+		}
+	}
+
 	if (packet->rhf & RHF_TID_ERR) {
 		/* For TIDERR and RC QPs preemptively schedule a NAK */
-		struct ib_other_headers *ohdr = NULL;
 		u32 tlen = rhf_pkt_len(packet->rhf); /* in bytes */
-		u16 lid  = ib_get_dlid(rhdr);
+		u32 dlid = ib_get_dlid(rhdr);
 		u32 qp_num;
-		u32 rcv_flags = 0;
 
 		/* Sanity check packet */
 		if (tlen < 24)
 			goto drop;
 
 		/* Check for GRH */
-		if (lnh == HFI1_LRH_BTH) {
-			ohdr = &rhdr->u.oth;
-		} else if (lnh == HFI1_LRH_GRH) {
+		if (packet->grh) {
 			u32 vtf;
+			struct ib_grh *grh = packet->grh;
 
-			ohdr = &rhdr->u.l.oth;
-			if (rhdr->u.l.grh.next_hdr != IB_GRH_NEXT_HDR)
+			if (grh->next_hdr != IB_GRH_NEXT_HDR)
 				goto drop;
-			vtf = be32_to_cpu(rhdr->u.l.grh.version_tclass_flow);
+			vtf = be32_to_cpu(grh->version_tclass_flow);
 			if ((vtf >> IB_GRH_VERSION_SHIFT) != IB_GRH_VERSION)
 				goto drop;
-			rcv_flags |= HFI1_HAS_GRH;
-		} else {
-			goto drop;
 		}
+
 		/* Get the destination QP number. */
-		qp_num = be32_to_cpu(ohdr->bth[1]) & RVT_QPN_MASK;
-		if (lid < be16_to_cpu(IB_MULTICAST_LID_BASE)) {
+		qp_num = ib_bth_get_qpn(packet->ohdr);
+		if (dlid < mlid_base) {
 			struct rvt_qp *qp;
 			unsigned long flags;
 
@@ -312,11 +342,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
 
 			switch (qp->ibqp.qp_type) {
 			case IB_QPT_RC:
-				hfi1_rc_hdrerr(
-					rcd,
-					rhdr,
-					rcv_flags,
-					qp);
+				hfi1_rc_hdrerr(rcd, packet, qp);
 				break;
 			default:
 				/* For now don't handle any other QP types */
@@ -332,9 +358,8 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
 	switch (rte) {
 	case RHF_RTE_ERROR_OP_CODE_ERR:
 	{
-		u32 opcode;
 		void *ebuf = NULL;
-		__be32 *bth = NULL;
+		u8 opcode;
 
 		if (rhf_use_egr_bfr(packet->rhf))
 			ebuf = packet->ebuf;
@@ -342,16 +367,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
 		if (!ebuf)
 			goto drop; /* this should never happen */
 
-		if (lnh == HFI1_LRH_BTH)
-			bth = (__be32 *)ebuf;
-		else if (lnh == HFI1_LRH_GRH)
-			bth = (__be32 *)((char *)ebuf + sizeof(struct ib_grh));
-		else
-			goto drop;
-
-		opcode = be32_to_cpu(bth[0]) >> 24;
-		opcode &= 0xff;
-
+		opcode = ib_bth_get_opcode(packet->ohdr);
 		if (opcode == IB_OPCODE_CNP) {
 			/*
 			 * Only in pre-B0 h/w is the CNP_OPCODE handled
@@ -365,7 +381,7 @@ static void rcv_hdrerr(struct hfi1_ctxtdata *rcd, struct hfi1_pportdata *ppd,
 			sc5 = hfi1_9B_get_sc5(rhdr, packet->rhf);
 			sl = ibp->sc_to_sl[sc5];
 
-			lqpn = be32_to_cpu(bth[1]) & RVT_QPN_MASK;
+			lqpn = ib_bth_get_qpn(packet->ohdr);
 			rcu_read_lock();
 			qp = rvt_lookup_qpn(rdi, &ibp->rvp, lqpn);
 			if (!qp) {
@@ -415,33 +431,39 @@ static inline void init_packet(struct hfi1_ctxtdata *rcd,
 	packet->rhf = rhf_to_cpu(packet->rhf_addr);
 	packet->rhqoff = rcd->head;
 	packet->numpkt = 0;
-	packet->rcv_flags = 0;
 }
 
 void hfi1_process_ecn_slowpath(struct rvt_qp *qp, struct hfi1_packet *pkt,
 			       bool do_cnp)
 {
 	struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
-	struct ib_header *hdr = pkt->hdr;
 	struct ib_other_headers *ohdr = pkt->ohdr;
-	struct ib_grh *grh = NULL;
+	struct ib_grh *grh = pkt->grh;
 	u32 rqpn = 0, bth1;
-	u16 rlid, dlid = ib_get_dlid(hdr);
-	u8 sc, svc_type;
+	u16 pkey, rlid, dlid = ib_get_dlid(pkt->hdr);
+	u8 hdr_type, sc, svc_type;
 	bool is_mcast = false;
 
-	if (pkt->rcv_flags & HFI1_HAS_GRH)
-		grh = &hdr->u.l.grh;
+	if (pkt->etype == RHF_RCV_TYPE_BYPASS) {
+		is_mcast = hfi1_is_16B_mcast(dlid);
+		pkey = hfi1_16B_get_pkey(pkt->hdr);
+		sc = hfi1_16B_get_sc(pkt->hdr);
+		hdr_type = HFI1_PKT_TYPE_16B;
+	} else {
+		is_mcast = (dlid > be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
+			   (dlid != be16_to_cpu(IB_LID_PERMISSIVE));
+		pkey = ib_bth_get_pkey(ohdr);
+		sc = hfi1_9B_get_sc5(pkt->hdr, pkt->rhf);
+		hdr_type = HFI1_PKT_TYPE_9B;
+	}
 
 	switch (qp->ibqp.qp_type) {
 	case IB_QPT_SMI:
 	case IB_QPT_GSI:
 	case IB_QPT_UD:
-		rlid = ib_get_slid(hdr);
-		rqpn = be32_to_cpu(ohdr->u.ud.deth[1]) & RVT_QPN_MASK;
+		rlid = ib_get_slid(pkt->hdr);
+		rqpn = ib_get_sqpn(pkt->ohdr);
 		svc_type = IB_CC_SVCTYPE_UD;
-		is_mcast = (dlid > be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
-			(dlid != be16_to_cpu(IB_LID_PERMISSIVE));
 		break;
 	case IB_QPT_UC:
 		rlid = rdma_ah_get_dlid(&qp->remote_ah_attr);
@@ -457,14 +479,11 @@ void hfi1_process_ecn_slowpath(struct rvt_qp *qp, struct hfi1_packet *pkt,
 		return;
 	}
 
-	sc = hfi1_9B_get_sc5(hdr, pkt->rhf);
-
 	bth1 = be32_to_cpu(ohdr->bth[1]);
-	if (do_cnp && (bth1 & IB_FECN_SMASK)) {
-		u16 pkey = (u16)be32_to_cpu(ohdr->bth[0]);
-
-		return_cnp(ibp, qp, rqpn, pkey, dlid, rlid, sc, grh);
-	}
+	/* Call appropriate CNP handler */
+	if (do_cnp && (bth1 & IB_FECN_SMASK))
+		hfi1_handle_cnp_tbl[hdr_type](ibp, qp, rqpn, pkey,
+					      dlid, rlid, sc, grh);
 
 	if (!is_mcast && (bth1 & IB_BECN_SMASK)) {
 		struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
@@ -591,9 +610,10 @@ static void __prescan_rxq(struct hfi1_packet *packet)
 
 		if (lnh == HFI1_LRH_BTH) {
 			packet->ohdr = &hdr->u.oth;
+			packet->grh = NULL;
 		} else if (lnh == HFI1_LRH_GRH) {
 			packet->ohdr = &hdr->u.l.oth;
-			packet->rcv_flags |= HFI1_HAS_GRH;
+			packet->grh = &hdr->u.l.grh;
 		} else {
 			goto next; /* just in case */
 		}
@@ -698,10 +718,8 @@ static inline int process_rcv_packet(struct hfi1_packet *packet, int thread)
 {
 	int ret;
 
-	packet->hdr = hfi1_get_msgheader(packet->rcd->dd,
-					 packet->rhf_addr);
-	packet->hlen = (u8 *)packet->rhf_addr - (u8 *)packet->hdr;
 	packet->etype = rhf_rcv_type(packet->rhf);
+
 	/* total length */
 	packet->tlen = rhf_pkt_len(packet->rhf); /* in bytes */
 	/* retrieve eager buffer details */
@@ -759,7 +777,7 @@ static inline void process_rcv_update(int last, struct hfi1_packet *packet)
 			       packet->etail, 0, 0);
 		packet->updegr = 0;
 	}
-	packet->rcv_flags = 0;
+	packet->grh = NULL;
 }
 
 static inline void finish_packet(struct hfi1_packet *packet)
@@ -837,9 +855,10 @@ int handle_receive_interrupt_dma_rtail(struct hfi1_ctxtdata *rcd, int thread)
 	return last;
 }
 
-static inline void set_nodma_rtail(struct hfi1_devdata *dd, u8 ctxt)
+static inline void set_nodma_rtail(struct hfi1_devdata *dd, u16 ctxt)
 {
-	int i;
+	struct hfi1_ctxtdata *rcd;
+	u16 i;
 
 	/*
 	 * For dynamically allocated kernel contexts (like vnic) switch
@@ -847,19 +866,28 @@ static inline void set_nodma_rtail(struct hfi1_devdata *dd, u8 ctxt)
 	 * interrupt handler for all statically allocated kernel contexts.
 	 */
 	if (ctxt >= dd->first_dyn_alloc_ctxt) {
-		dd->rcd[ctxt]->do_interrupt =
-			&handle_receive_interrupt_nodma_rtail;
+		rcd = hfi1_rcd_get_by_index(dd, ctxt);
+		if (rcd) {
+			rcd->do_interrupt =
+				&handle_receive_interrupt_nodma_rtail;
+			hfi1_rcd_put(rcd);
+		}
 		return;
 	}
 
-	for (i = HFI1_CTRL_CTXT + 1; i < dd->first_dyn_alloc_ctxt; i++)
-		dd->rcd[i]->do_interrupt =
-			&handle_receive_interrupt_nodma_rtail;
+	for (i = HFI1_CTRL_CTXT + 1; i < dd->first_dyn_alloc_ctxt; i++) {
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		if (rcd)
+			rcd->do_interrupt =
+				&handle_receive_interrupt_nodma_rtail;
+		hfi1_rcd_put(rcd);
+	}
 }
 
-static inline void set_dma_rtail(struct hfi1_devdata *dd, u8 ctxt)
+static inline void set_dma_rtail(struct hfi1_devdata *dd, u16 ctxt)
 {
-	int i;
+	struct hfi1_ctxtdata *rcd;
+	u16 i;
 
 	/*
 	 * For dynamically allocated kernel contexts (like vnic) switch
@@ -867,27 +895,39 @@ static inline void set_dma_rtail(struct hfi1_devdata *dd, u8 ctxt)
 	 * interrupt handler for all statically allocated kernel contexts.
 	 */
 	if (ctxt >= dd->first_dyn_alloc_ctxt) {
-		dd->rcd[ctxt]->do_interrupt =
-			&handle_receive_interrupt_dma_rtail;
+		rcd = hfi1_rcd_get_by_index(dd, ctxt);
+		if (rcd) {
+			rcd->do_interrupt =
+				&handle_receive_interrupt_dma_rtail;
+			hfi1_rcd_put(rcd);
+		}
 		return;
 	}
 
-	for (i = HFI1_CTRL_CTXT + 1; i < dd->first_dyn_alloc_ctxt; i++)
-		dd->rcd[i]->do_interrupt =
-			&handle_receive_interrupt_dma_rtail;
+	for (i = HFI1_CTRL_CTXT + 1; i < dd->first_dyn_alloc_ctxt; i++) {
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		if (rcd)
+			rcd->do_interrupt =
+				&handle_receive_interrupt_dma_rtail;
+		hfi1_rcd_put(rcd);
+	}
 }
 
 void set_all_slowpath(struct hfi1_devdata *dd)
 {
-	int i;
+	struct hfi1_ctxtdata *rcd;
+	u16 i;
 
 	/* HFI1_CTRL_CTXT must always use the slow path interrupt handler */
 	for (i = HFI1_CTRL_CTXT + 1; i < dd->num_rcv_contexts; i++) {
-		struct hfi1_ctxtdata *rcd = dd->rcd[i];
-
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		if (!rcd)
+			continue;
 		if ((i < dd->first_dyn_alloc_ctxt) ||
-		    (rcd && rcd->sc && (rcd->sc->type == SC_KERNEL)))
+		    (rcd->sc && (rcd->sc->type == SC_KERNEL))) {
 			rcd->do_interrupt = &handle_receive_interrupt;
+		}
+		hfi1_rcd_put(rcd);
 	}
 }
 
@@ -896,20 +936,30 @@ static inline int set_armed_to_active(struct hfi1_ctxtdata *rcd,
 				      struct hfi1_devdata *dd)
 {
 	struct work_struct *lsaw = &rcd->ppd->linkstate_active_work;
-	struct ib_header *hdr = hfi1_get_msgheader(packet->rcd->dd,
-						   packet->rhf_addr);
 	u8 etype = rhf_rcv_type(packet->rhf);
+	u8 sc = SC15_PACKET;
 
-	if (etype == RHF_RCV_TYPE_IB &&
-	    hfi1_9B_get_sc5(hdr, packet->rhf) != 0xf) {
-		int hwstate = read_logical_state(dd);
+	if (etype == RHF_RCV_TYPE_IB) {
+		struct ib_header *hdr = hfi1_get_msgheader(packet->rcd->dd,
+							   packet->rhf_addr);
+		sc = hfi1_9B_get_sc5(hdr, packet->rhf);
+	} else if (etype == RHF_RCV_TYPE_BYPASS) {
+		struct hfi1_16b_header *hdr = hfi1_get_16B_header(
+						packet->rcd->dd,
+						packet->rhf_addr);
+		sc = hfi1_16B_get_sc(hdr);
+	}
+	if (sc != SC15_PACKET) {
+		int hwstate = driver_lstate(rcd->ppd);
 
-		if (hwstate != LSTATE_ACTIVE) {
-			dd_dev_info(dd, "Unexpected link state %d\n", hwstate);
+		if (hwstate != IB_PORT_ACTIVE) {
+			dd_dev_info(dd,
+				    "Unexpected link state %s\n",
+				    opa_lstate_name(hwstate));
 			return 0;
 		}
 
-		queue_work(rcd->ppd->hfi1_wq, lsaw);
+		queue_work(rcd->ppd->link_wq, lsaw);
 		return 1;
 	}
 	return 0;
@@ -1063,7 +1113,8 @@ void receive_interrupt_work(struct work_struct *work)
 	struct hfi1_pportdata *ppd = container_of(work, struct hfi1_pportdata,
 						  linkstate_active_work);
 	struct hfi1_devdata *dd = ppd->dd;
-	int i;
+	struct hfi1_ctxtdata *rcd;
+	u16 i;
 
 	/* Received non-SC15 packet implies neighbor_normal */
 	ppd->neighbor_normal = 1;
@@ -1073,8 +1124,12 @@ void receive_interrupt_work(struct work_struct *work)
 	 * Interrupt all statically allocated kernel contexts that could
 	 * have had an interrupt during auto activation.
 	 */
-	for (i = HFI1_CTRL_CTXT; i < dd->first_dyn_alloc_ctxt; i++)
-		force_recv_intr(dd->rcd[i]);
+	for (i = HFI1_CTRL_CTXT; i < dd->first_dyn_alloc_ctxt; i++) {
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		if (rcd)
+			force_recv_intr(rcd);
+		hfi1_rcd_put(rcd);
+	}
 }
 
 /*
@@ -1264,10 +1319,9 @@ void hfi1_start_led_override(struct hfi1_pportdata *ppd, unsigned int timeon,
  */
 int hfi1_reset_device(int unit)
 {
-	int ret, i;
+	int ret;
 	struct hfi1_devdata *dd = hfi1_lookup(unit);
 	struct hfi1_pportdata *ppd;
-	unsigned long flags;
 	int pidx;
 
 	if (!dd) {
@@ -1277,7 +1331,7 @@ int hfi1_reset_device(int unit)
 
 	dd_dev_info(dd, "Reset on unit %u requested\n", unit);
 
-	if (!dd->kregbase || !(dd->flags & HFI1_PRESENT)) {
+	if (!dd->kregbase1 || !(dd->flags & HFI1_PRESENT)) {
 		dd_dev_info(dd,
 			    "Invalid unit number %u or not initialized or not present\n",
 			    unit);
@@ -1285,17 +1339,15 @@ int hfi1_reset_device(int unit)
 		goto bail;
 	}
 
-	spin_lock_irqsave(&dd->uctxt_lock, flags);
+	/* If there are any user/vnic contexts, we cannot reset */
+	mutex_lock(&hfi1_mutex);
 	if (dd->rcd)
-		for (i = dd->first_dyn_alloc_ctxt;
-		     i < dd->num_rcv_contexts; i++) {
-			if (!dd->rcd[i])
-				continue;
-			spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+		if (hfi1_stats.sps_ctxts) {
+			mutex_unlock(&hfi1_mutex);
 			ret = -EBUSY;
 			goto bail;
 		}
-	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+	mutex_unlock(&hfi1_mutex);
 
 	for (pidx = 0; pidx < dd->num_pports; ++pidx) {
 		ppd = dd->pport + pidx;
@@ -1321,6 +1373,162 @@ int hfi1_reset_device(int unit)
 	return ret;
 }
 
+static inline void hfi1_setup_ib_header(struct hfi1_packet *packet)
+{
+	packet->hdr = (struct hfi1_ib_message_header *)
+			hfi1_get_msgheader(packet->rcd->dd,
+					   packet->rhf_addr);
+	packet->hlen = (u8 *)packet->rhf_addr - (u8 *)packet->hdr;
+}
+
+static int hfi1_bypass_ingress_pkt_check(struct hfi1_packet *packet)
+{
+	struct hfi1_pportdata *ppd = packet->rcd->ppd;
+
+	/* slid and dlid cannot be 0 */
+	if ((!packet->slid) || (!packet->dlid))
+		return -EINVAL;
+
+	/* Compare port lid with incoming packet dlid */
+	if ((!(hfi1_is_16B_mcast(packet->dlid))) &&
+	    (packet->dlid !=
+		opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B))) {
+		if (packet->dlid != ppd->lid)
+			return -EINVAL;
+	}
+
+	/* No multicast packets with SC15 */
+	if ((hfi1_is_16B_mcast(packet->dlid)) && (packet->sc == 0xF))
+		return -EINVAL;
+
+	/* Packets with permissive DLID always on SC15 */
+	if ((packet->dlid == opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE),
+					 16B)) &&
+	    (packet->sc != 0xF))
+		return -EINVAL;
+
+	return 0;
+}
+
+static int hfi1_setup_9B_packet(struct hfi1_packet *packet)
+{
+	struct hfi1_ibport *ibp = rcd_to_iport(packet->rcd);
+	struct ib_header *hdr;
+	u8 lnh;
+
+	hfi1_setup_ib_header(packet);
+	hdr = packet->hdr;
+
+	lnh = ib_get_lnh(hdr);
+	if (lnh == HFI1_LRH_BTH) {
+		packet->ohdr = &hdr->u.oth;
+		packet->grh = NULL;
+	} else if (lnh == HFI1_LRH_GRH) {
+		u32 vtf;
+
+		packet->ohdr = &hdr->u.l.oth;
+		packet->grh = &hdr->u.l.grh;
+		if (packet->grh->next_hdr != IB_GRH_NEXT_HDR)
+			goto drop;
+		vtf = be32_to_cpu(packet->grh->version_tclass_flow);
+		if ((vtf >> IB_GRH_VERSION_SHIFT) != IB_GRH_VERSION)
+			goto drop;
+	} else {
+		goto drop;
+	}
+
+	/* Query commonly used fields from packet header */
+	packet->payload = packet->ebuf;
+	packet->opcode = ib_bth_get_opcode(packet->ohdr);
+	packet->slid = ib_get_slid(hdr);
+	packet->dlid = ib_get_dlid(hdr);
+	if (unlikely((packet->dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
+		     (packet->dlid != be16_to_cpu(IB_LID_PERMISSIVE))))
+		packet->dlid += opa_get_mcast_base(OPA_MCAST_NR) -
+				be16_to_cpu(IB_MULTICAST_LID_BASE);
+	packet->sl = ib_get_sl(hdr);
+	packet->sc = hfi1_9B_get_sc5(hdr, packet->rhf);
+	packet->pad = ib_bth_get_pad(packet->ohdr);
+	packet->extra_byte = 0;
+	packet->fecn = ib_bth_get_fecn(packet->ohdr);
+	packet->becn = ib_bth_get_becn(packet->ohdr);
+
+	return 0;
+drop:
+	ibp->rvp.n_pkt_drops++;
+	return -EINVAL;
+}
+
+static int hfi1_setup_bypass_packet(struct hfi1_packet *packet)
+{
+	/*
+	 * Bypass packets have a different header/payload split
+	 * compared to an IB packet.
+	 * Current split is set such that 16 bytes of the actual
+	 * header is in the header buffer and the remaining is in
+	 * the eager buffer. We chose 16 since hfi1 driver only
+	 * supports 16B bypass packets and we will be able to
+	 * receive the entire LRH with such a split.
+	 */
+
+	struct hfi1_ctxtdata *rcd = packet->rcd;
+	struct hfi1_pportdata *ppd = rcd->ppd;
+	struct hfi1_ibport *ibp = &ppd->ibport_data;
+	u8 l4;
+	u8 grh_len;
+
+	packet->hdr = (struct hfi1_16b_header *)
+			hfi1_get_16B_header(packet->rcd->dd,
+					    packet->rhf_addr);
+	packet->hlen = (u8 *)packet->rhf_addr - (u8 *)packet->hdr;
+
+	l4 = hfi1_16B_get_l4(packet->hdr);
+	if (l4 == OPA_16B_L4_IB_LOCAL) {
+		grh_len = 0;
+		packet->ohdr = packet->ebuf;
+		packet->grh = NULL;
+	} else if (l4 == OPA_16B_L4_IB_GLOBAL) {
+		u32 vtf;
+
+		grh_len = sizeof(struct ib_grh);
+		packet->ohdr = packet->ebuf + grh_len;
+		packet->grh = packet->ebuf;
+		if (packet->grh->next_hdr != IB_GRH_NEXT_HDR)
+			goto drop;
+		vtf = be32_to_cpu(packet->grh->version_tclass_flow);
+		if ((vtf >> IB_GRH_VERSION_SHIFT) != IB_GRH_VERSION)
+			goto drop;
+	} else {
+		goto drop;
+	}
+
+	/* Query commonly used fields from packet header */
+	packet->opcode = ib_bth_get_opcode(packet->ohdr);
+	packet->hlen = hdr_len_by_opcode[packet->opcode] + 8 + grh_len;
+	packet->payload = packet->ebuf + packet->hlen - (4 * sizeof(u32));
+	packet->slid = hfi1_16B_get_slid(packet->hdr);
+	packet->dlid = hfi1_16B_get_dlid(packet->hdr);
+	if (unlikely(hfi1_is_16B_mcast(packet->dlid)))
+		packet->dlid += opa_get_mcast_base(OPA_MCAST_NR) -
+				opa_get_lid(opa_get_mcast_base(OPA_MCAST_NR),
+					    16B);
+	packet->sc = hfi1_16B_get_sc(packet->hdr);
+	packet->sl = ibp->sc_to_sl[packet->sc];
+	packet->pad = hfi1_16B_bth_get_pad(packet->ohdr);
+	packet->extra_byte = SIZE_OF_LT;
+	packet->fecn = hfi1_16B_get_fecn(packet->hdr);
+	packet->becn = hfi1_16B_get_becn(packet->hdr);
+
+	if (hfi1_bypass_ingress_pkt_check(packet))
+		goto drop;
+
+	return 0;
+drop:
+	hfi1_cdbg(PKT, "%s: packet dropped\n", __func__);
+	ibp->rvp.n_pkt_drops++;
+	return -EINVAL;
+}
+
 void handle_eflags(struct hfi1_packet *packet)
 {
 	struct hfi1_ctxtdata *rcd = packet->rcd;
@@ -1351,6 +1559,9 @@ int process_receive_ib(struct hfi1_packet *packet)
 	if (unlikely(hfi1_dbg_fault_packet(packet)))
 		return RHF_RCV_CONTINUE;
 
+	if (hfi1_setup_9B_packet(packet))
+		return RHF_RCV_CONTINUE;
+
 	trace_hfi1_rcvhdr(packet->rcd->ppd->dd,
 			  packet->rcd->ctxt,
 			  rhf_err_flags(packet->rhf),
@@ -1380,8 +1591,8 @@ static inline bool hfi1_is_vnic_packet(struct hfi1_packet *packet)
 	if (packet->rcd->is_vnic)
 		return true;
 
-	if ((HFI1_GET_L2_TYPE(packet->ebuf) == OPA_VNIC_L2_TYPE) &&
-	    (HFI1_GET_L4_TYPE(packet->ebuf) == OPA_VNIC_L4_ETHR))
+	if ((hfi1_16B_get_l2(packet->ebuf) == OPA_16B_L2_TYPE) &&
+	    (hfi1_16B_get_l4(packet->ebuf) == OPA_16B_L4_ETHR))
 		return true;
 
 	return false;
@@ -1391,25 +1602,38 @@ int process_receive_bypass(struct hfi1_packet *packet)
 {
 	struct hfi1_devdata *dd = packet->rcd->dd;
 
-	if (unlikely(rhf_err_flags(packet->rhf))) {
-		handle_eflags(packet);
-	} else if (hfi1_is_vnic_packet(packet)) {
+	if (hfi1_is_vnic_packet(packet)) {
 		hfi1_vnic_bypass_rcv(packet);
 		return RHF_RCV_CONTINUE;
 	}
 
-	dd_dev_err(dd, "Unsupported bypass packet. Dropping\n");
-	incr_cntr64(&dd->sw_rcv_bypass_packet_errors);
-	if (!(dd->err_info_rcvport.status_and_code & OPA_EI_STATUS_SMASK)) {
-		u64 *flits = packet->ebuf;
+	if (hfi1_setup_bypass_packet(packet))
+		return RHF_RCV_CONTINUE;
 
-		if (flits && !(packet->rhf & RHF_LEN_ERR)) {
-			dd->err_info_rcvport.packet_flit1 = flits[0];
-			dd->err_info_rcvport.packet_flit2 =
-				packet->tlen > sizeof(flits[0]) ? flits[1] : 0;
+	if (unlikely(rhf_err_flags(packet->rhf))) {
+		handle_eflags(packet);
+		return RHF_RCV_CONTINUE;
+	}
+
+	if (hfi1_16B_get_l2(packet->hdr) == 0x2) {
+		hfi1_16B_rcv(packet);
+	} else {
+		dd_dev_err(dd,
+			   "Bypass packets other than 16B are not supported in normal operation. Dropping\n");
+		incr_cntr64(&dd->sw_rcv_bypass_packet_errors);
+		if (!(dd->err_info_rcvport.status_and_code &
+		      OPA_EI_STATUS_SMASK)) {
+			u64 *flits = packet->ebuf;
+
+			if (flits && !(packet->rhf & RHF_LEN_ERR)) {
+				dd->err_info_rcvport.packet_flit1 = flits[0];
+				dd->err_info_rcvport.packet_flit2 =
+					packet->tlen > sizeof(flits[0]) ?
+					flits[1] : 0;
+			}
+			dd->err_info_rcvport.status_and_code |=
+				(OPA_EI_STATUS_SMASK | BAD_L2_ERR);
 		}
-		dd->err_info_rcvport.status_and_code |=
-			(OPA_EI_STATUS_SMASK | BAD_L2_ERR);
 	}
 	return RHF_RCV_CONTINUE;
 }
@@ -1422,6 +1646,7 @@ int process_receive_error(struct hfi1_packet *packet)
 		 rhf_rcv_type_err(packet->rhf) == 3))
 		return RHF_RCV_CONTINUE;
 
+	hfi1_setup_ib_header(packet);
 	handle_eflags(packet);
 
 	if (unlikely(rhf_err_flags(packet->rhf)))
@@ -1435,6 +1660,8 @@ int kdeth_process_expected(struct hfi1_packet *packet)
 {
 	if (unlikely(hfi1_dbg_fault_packet(packet)))
 		return RHF_RCV_CONTINUE;
+
+	hfi1_setup_ib_header(packet);
 	if (unlikely(rhf_err_flags(packet->rhf)))
 		handle_eflags(packet);
 
@@ -1445,6 +1672,7 @@ int kdeth_process_expected(struct hfi1_packet *packet)
 
 int kdeth_process_eager(struct hfi1_packet *packet)
 {
+	hfi1_setup_ib_header(packet);
 	if (unlikely(rhf_err_flags(packet->rhf)))
 		handle_eflags(packet);
 	if (unlikely(hfi1_dbg_fault_packet(packet)))
@@ -1461,3 +1689,62 @@ int process_receive_invalid(struct hfi1_packet *packet)
 		   rhf_rcv_type(packet->rhf));
 	return RHF_RCV_CONTINUE;
 }
+
+void seqfile_dump_rcd(struct seq_file *s, struct hfi1_ctxtdata *rcd)
+{
+	struct hfi1_packet packet;
+	struct ps_mdata mdata;
+
+	seq_printf(s, "Rcd %u: RcvHdr cnt %u entsize %u %s head %llu tail %llu\n",
+		   rcd->ctxt, rcd->rcvhdrq_cnt, rcd->rcvhdrqentsize,
+		   HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL) ?
+		   "dma_rtail" : "nodma_rtail",
+		   read_uctxt_csr(rcd->dd, rcd->ctxt, RCV_HDR_HEAD) &
+		   RCV_HDR_HEAD_HEAD_MASK,
+		   read_uctxt_csr(rcd->dd, rcd->ctxt, RCV_HDR_TAIL));
+
+	init_packet(rcd, &packet);
+	init_ps_mdata(&mdata, &packet);
+
+	while (1) {
+		struct hfi1_devdata *dd = rcd->dd;
+		__le32 *rhf_addr = (__le32 *)rcd->rcvhdrq + mdata.ps_head +
+					 dd->rhf_offset;
+		struct ib_header *hdr;
+		u64 rhf = rhf_to_cpu(rhf_addr);
+		u32 etype = rhf_rcv_type(rhf), qpn;
+		u8 opcode;
+		u32 psn;
+		u8 lnh;
+
+		if (ps_done(&mdata, rhf, rcd))
+			break;
+
+		if (ps_skip(&mdata, rhf, rcd))
+			goto next;
+
+		if (etype > RHF_RCV_TYPE_IB)
+			goto next;
+
+		packet.hdr = hfi1_get_msgheader(dd, rhf_addr);
+		hdr = packet.hdr;
+
+		lnh = be16_to_cpu(hdr->lrh[0]) & 3;
+
+		if (lnh == HFI1_LRH_BTH)
+			packet.ohdr = &hdr->u.oth;
+		else if (lnh == HFI1_LRH_GRH)
+			packet.ohdr = &hdr->u.l.oth;
+		else
+			goto next; /* just in case */
+
+		opcode = (be32_to_cpu(packet.ohdr->bth[0]) >> 24);
+		qpn = be32_to_cpu(packet.ohdr->bth[1]) & RVT_QPN_MASK;
+		psn = mask_psn(be32_to_cpu(packet.ohdr->bth[2]));
+
+		seq_printf(s, "\tEnt %u: opcode 0x%x, qpn 0x%x, psn 0x%x\n",
+			   mdata.ps_head, opcode, qpn, psn);
+next:
+		update_ps_mdata(&mdata, rcd);
+	}
+}
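The driver.c hunks above replace direct dd->rcd[i] accesses with a get/use/put sequence. A minimal userspace sketch of that pattern follows. It makes two assumptions: the put helper accepts NULL, as the unconditional hfi1_rcd_put() calls in the loops suggest, and a plain integer can stand in for whatever reference counting the driver really uses. The names struct ctxt, ctxt_table, ctxt_get_by_index(), ctxt_put() and set_all_handlers() are hypothetical and are not part of hfi1.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_CTXTS 4

/* Hypothetical stand-in for struct hfi1_ctxtdata. */
struct ctxt {
	int refs;	/* plain counter; not atomic, illustration only */
	int handler;	/* stands in for rcd->do_interrupt */
};

/* Hypothetical stand-in for the per-device context array. */
static struct ctxt *ctxt_table[NUM_CTXTS];

/* Return the context at index i with a reference held, or NULL. */
static struct ctxt *ctxt_get_by_index(size_t i)
{
	struct ctxt *c = i < NUM_CTXTS ? ctxt_table[i] : NULL;

	if (c)
		c->refs++;
	return c;
}

/* Drop a reference; a NULL argument is tolerated (assumption). */
static void ctxt_put(struct ctxt *c)
{
	if (c)
		c->refs--;
}

/* Visit every slot, skip empty ones, and always balance get with put. */
static int set_all_handlers(int handler)
{
	int updated = 0;
	size_t i;

	for (i = 0; i < NUM_CTXTS; i++) {
		struct ctxt *c = ctxt_get_by_index(i);

		if (!c)
			continue;
		c->handler = handler;
		updated++;
		ctxt_put(c);
	}
	return updated;
}
```

The sketch has no locking and no atomics, so it only shows the control flow: every successful lookup is paired with exactly one put, and empty slots are skipped rather than dereferenced.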
diff --git a/drivers/infiniband/hw/hfi1/eprom.c b/drivers/infiniband/hw/hfi1/eprom.c
index 26da124..d46b171 100644
--- a/drivers/infiniband/hw/hfi1/eprom.c
+++ b/drivers/infiniband/hw/hfi1/eprom.c
@@ -250,7 +250,6 @@ static int read_partition_platform_config(struct hfi1_devdata *dd, void **data,
 {
 	void *buffer;
 	void *p;
-	u32 length;
 	int ret;
 
 	buffer = kmalloc(P1_SIZE, GFP_KERNEL);
@@ -265,13 +264,13 @@ static int read_partition_platform_config(struct hfi1_devdata *dd, void **data,
 
 	/* scan for image magic that may trail the actual data */
 	p = strnstr(buffer, IMAGE_TRAIL_MAGIC, P1_SIZE);
-	if (p)
-		length = p - buffer;
-	else
-		length = P1_SIZE;
+	if (!p) {
+		kfree(buffer);
+		return -ENOENT;
+	}
 
 	*data = buffer;
-	*size = length;
+	*size = p - buffer;
 	return 0;
 }
 
diff --git a/drivers/infiniband/hw/hfi1/exp_rcv.c b/drivers/infiniband/hw/hfi1/exp_rcv.c
new file mode 100644
index 0000000..0af9167
--- /dev/null
+++ b/drivers/infiniband/hw/hfi1/exp_rcv.c
@@ -0,0 +1,114 @@
+/*
+ * Copyright(c) 2017 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *  - Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  - Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *  - Neither the name of Intel Corporation nor the names of its
+ *    contributors may be used to endorse or promote products derived
+ *    from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#include "exp_rcv.h"
+#include "trace.h"
+
+/**
+ * hfi1_exp_tid_group_init - initialize exp_tid_set
+ * @set: the set
+ */
+void hfi1_exp_tid_group_init(struct exp_tid_set *set)
+{
+	INIT_LIST_HEAD(&set->list);
+	set->count = 0;
+}
+
+/**
+ * hfi1_alloc_ctxt_rcv_groups - initialize expected receive groups
+ * @rcd: the context to add the groupings to
+ */
+int hfi1_alloc_ctxt_rcv_groups(struct hfi1_ctxtdata *rcd)
+{
+	struct hfi1_devdata *dd = rcd->dd;
+	u32 tidbase;
+	struct tid_group *grp;
+	int i;
+
+	tidbase = rcd->expected_base;
+	for (i = 0; i < rcd->expected_count /
+		     dd->rcv_entries.group_size; i++) {
+		grp = kzalloc(sizeof(*grp), GFP_KERNEL);
+		if (!grp)
+			goto bail;
+		grp->size = dd->rcv_entries.group_size;
+		grp->base = tidbase;
+		tid_group_add_tail(grp, &rcd->tid_group_list);
+		tidbase += dd->rcv_entries.group_size;
+	}
+
+	return 0;
+bail:
+	hfi1_free_ctxt_rcv_groups(rcd);
+	return -ENOMEM;
+}
+
+/**
+ * hfi1_free_ctxt_rcv_groups - free expected receive groups
+ * @rcd: the context to free
+ *
+ * The routine dismantles the expected receive linked
+ * list and clears any tids associated with the receive
+ * context.
+ *
+ * This should only be called for kernel contexts and the
+ * base user context.
+ */
+void hfi1_free_ctxt_rcv_groups(struct hfi1_ctxtdata *rcd)
+{
+	struct tid_group *grp, *gptr;
+
+	WARN_ON(!EXP_TID_SET_EMPTY(rcd->tid_full_list));
+	WARN_ON(!EXP_TID_SET_EMPTY(rcd->tid_used_list));
+
+	list_for_each_entry_safe(grp, gptr, &rcd->tid_group_list.list, list) {
+		tid_group_remove(grp, &rcd->tid_group_list);
+		kfree(grp);
+	}
+
+	hfi1_clear_tids(rcd);
+}
diff --git a/drivers/infiniband/hw/hfi1/exp_rcv.h b/drivers/infiniband/hw/hfi1/exp_rcv.h
new file mode 100644
index 0000000..0871904
--- /dev/null
+++ b/drivers/infiniband/hw/hfi1/exp_rcv.h
@@ -0,0 +1,190 @@
+#ifndef _HFI1_EXP_RCV_H
+#define _HFI1_EXP_RCV_H
+/*
+ * Copyright(c) 2017 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *  - Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  - Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *  - Neither the name of Intel Corporation nor the names of its
+ *    contributors may be used to endorse or promote products derived
+ *    from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+
+#include "hfi.h"
+
+#define EXP_TID_SET_EMPTY(set) (set.count == 0 && list_empty(&set.list))
+
+#define EXP_TID_TIDLEN_MASK   0x7FFULL
+#define EXP_TID_TIDLEN_SHIFT  0
+#define EXP_TID_TIDCTRL_MASK  0x3ULL
+#define EXP_TID_TIDCTRL_SHIFT 20
+#define EXP_TID_TIDIDX_MASK   0x3FFULL
+#define EXP_TID_TIDIDX_SHIFT  22
+#define EXP_TID_GET(tid, field)	\
+	(((tid) >> EXP_TID_TID##field##_SHIFT) & EXP_TID_TID##field##_MASK)
+
+#define EXP_TID_SET(field, value)			\
+	(((value) & EXP_TID_TID##field##_MASK) <<	\
+	 EXP_TID_TID##field##_SHIFT)
+#define EXP_TID_CLEAR(tid, field) ({					\
+		(tid) &= ~(EXP_TID_TID##field##_MASK <<			\
+			   EXP_TID_TID##field##_SHIFT);			\
+		})
+#define EXP_TID_RESET(tid, field, value) do {				\
+		EXP_TID_CLEAR(tid, field);				\
+		(tid) |= EXP_TID_SET(field, (value));			\
+	} while (0)
+
+/*
+ * Define fields in the KDETH header so we can update the header
+ * template.
+ */
+#define KDETH_OFFSET_SHIFT        0
+#define KDETH_OFFSET_MASK         0x7fff
+#define KDETH_OM_SHIFT            15
+#define KDETH_OM_MASK             0x1
+#define KDETH_TID_SHIFT           16
+#define KDETH_TID_MASK            0x3ff
+#define KDETH_TIDCTRL_SHIFT       26
+#define KDETH_TIDCTRL_MASK        0x3
+#define KDETH_INTR_SHIFT          28
+#define KDETH_INTR_MASK           0x1
+#define KDETH_SH_SHIFT            29
+#define KDETH_SH_MASK             0x1
+#define KDETH_KVER_SHIFT          30
+#define KDETH_KVER_MASK           0x3
+#define KDETH_JKEY_SHIFT          0x0
+#define KDETH_JKEY_MASK           0xff
+#define KDETH_HCRC_UPPER_SHIFT    16
+#define KDETH_HCRC_UPPER_MASK     0xff
+#define KDETH_HCRC_LOWER_SHIFT    24
+#define KDETH_HCRC_LOWER_MASK     0xff
+
+#define KDETH_GET(val, field)						\
+	(((le32_to_cpu((val))) >> KDETH_##field##_SHIFT) & KDETH_##field##_MASK)
+#define KDETH_SET(dw, field, val) do {					\
+		u32 dwval = le32_to_cpu(dw);				\
+		dwval &= ~(KDETH_##field##_MASK << KDETH_##field##_SHIFT); \
+		dwval |= (((val) & KDETH_##field##_MASK) << \
+			  KDETH_##field##_SHIFT);			\
+		dw = cpu_to_le32(dwval);				\
+	} while (0)
+
+#define KDETH_RESET(dw, field, val) ({ dw = 0; KDETH_SET(dw, field, val); })
+
+/* KDETH OM multipliers and switch over point */
+#define KDETH_OM_SMALL     4
+#define KDETH_OM_SMALL_SHIFT     2
+#define KDETH_OM_LARGE     64
+#define KDETH_OM_LARGE_SHIFT     6
+#define KDETH_OM_MAX_SIZE  (1 << ((KDETH_OM_LARGE / KDETH_OM_SMALL) + 1))
+
+struct tid_group {
+	struct list_head list;
+	u32 base;
+	u8 size;
+	u8 used;
+	u8 map;
+};
+
+/*
+ * Write an "empty" RcvArray entry.
+ * This function exists so the TID registration code can use it
+ * to write to unused/unneeded entries and still take advantage
+ * of the WC performance improvements. The HFI will ignore this
+ * write to the RcvArray entry.
+ */
+static inline void rcv_array_wc_fill(struct hfi1_devdata *dd, u32 index)
+{
+	/*
+	 * Doing the WC fill writes only makes sense if the device is
+	 * present and the RcvArray has been mapped as WC memory.
+	 */
+	if ((dd->flags & HFI1_PRESENT) && dd->rcvarray_wc) {
+		writeq(0, dd->rcvarray_wc + (index * 8));
+		if ((index & 3) == 3)
+			flush_wc();
+	}
+}
+
+static inline void tid_group_add_tail(struct tid_group *grp,
+				      struct exp_tid_set *set)
+{
+	list_add_tail(&grp->list, &set->list);
+	set->count++;
+}
+
+static inline void tid_group_remove(struct tid_group *grp,
+				    struct exp_tid_set *set)
+{
+	list_del_init(&grp->list);
+	set->count--;
+}
+
+static inline void tid_group_move(struct tid_group *group,
+				  struct exp_tid_set *s1,
+				  struct exp_tid_set *s2)
+{
+	tid_group_remove(group, s1);
+	tid_group_add_tail(group, s2);
+}
+
+static inline struct tid_group *tid_group_pop(struct exp_tid_set *set)
+{
+	struct tid_group *grp =
+		list_first_entry(&set->list, struct tid_group, list);
+	list_del_init(&grp->list);
+	set->count--;
+	return grp;
+}
+
+static inline u32 rcventry2tidinfo(u32 rcventry)
+{
+	u32 pair = rcventry & ~0x1;
+
+	return EXP_TID_SET(IDX, pair >> 1) |
+		EXP_TID_SET(CTRL, 1 << (rcventry - pair));
+}
+
+int hfi1_alloc_ctxt_rcv_groups(struct hfi1_ctxtdata *rcd);
+void hfi1_free_ctxt_rcv_groups(struct hfi1_ctxtdata *rcd);
+void hfi1_exp_tid_group_init(struct exp_tid_set *set);
+
+#endif /* _HFI1_EXP_RCV_H */
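The EXP_TID_* macros in exp_rcv.h above pack a RcvArray entry index into a TID info word: the entry pair number goes in the IDX field and a one-hot selector for the even or odd member of the pair goes in CTRL. The userspace sketch below copies the macros and rcventry2tidinfo() from the header, with uint32_t standing in for the kernel's u32. The helper tidinfo_roundtrip_ok() is hypothetical and exists only to show the fields being read back.

```c
#include <assert.h>
#include <stdint.h>

/* Field layout copied from exp_rcv.h in the patch above. */
#define EXP_TID_TIDLEN_MASK   0x7FFULL
#define EXP_TID_TIDLEN_SHIFT  0
#define EXP_TID_TIDCTRL_MASK  0x3ULL
#define EXP_TID_TIDCTRL_SHIFT 20
#define EXP_TID_TIDIDX_MASK   0x3FFULL
#define EXP_TID_TIDIDX_SHIFT  22
#define EXP_TID_GET(tid, field)	\
	(((tid) >> EXP_TID_TID##field##_SHIFT) & EXP_TID_TID##field##_MASK)
#define EXP_TID_SET(field, value)			\
	(((value) & EXP_TID_TID##field##_MASK) <<	\
	 EXP_TID_TID##field##_SHIFT)

/* Userspace copy of rcventry2tidinfo(); uint32_t replaces u32. */
static uint32_t rcventry2tidinfo(uint32_t rcventry)
{
	uint32_t pair = rcventry & ~0x1u;

	return (uint32_t)(EXP_TID_SET(IDX, pair >> 1) |
			  EXP_TID_SET(CTRL, 1 << (rcventry - pair)));
}

/* Hypothetical helper: unpack the word and compare with the input. */
static int tidinfo_roundtrip_ok(uint32_t rcventry)
{
	uint32_t tid = rcventry2tidinfo(rcventry);
	uint32_t idx = (uint32_t)EXP_TID_GET(tid, IDX);
	uint32_t ctrl = (uint32_t)EXP_TID_GET(tid, CTRL);

	/* CTRL is 1 for the even entry of a pair and 2 for the odd one. */
	return idx == (rcventry >> 1) && ctrl == (1u << (rcventry & 1));
}
```

For example, entry 5 belongs to pair 2 and is the odd member, so IDX is 2 and CTRL is 2, which gives (2 << 22) | (2 << 20) = 0xA00000. The LEN field is left at zero here because rcventry2tidinfo() does not set it.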
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 3158128..2bc8926 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -58,10 +58,10 @@
 #include "device.h"
 #include "common.h"
 #include "trace.h"
+#include "mmu_rb.h"
 #include "user_sdma.h"
 #include "user_exp_rcv.h"
 #include "aspm.h"
-#include "mmu_rb.h"
 
 #undef pr_fmt
 #define pr_fmt(fmt) DRIVER_NAME ": " fmt
@@ -79,21 +79,25 @@ static int hfi1_file_mmap(struct file *fp, struct vm_area_struct *vma);
 
 static u64 kvirt_to_phys(void *addr);
 static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo);
-static int init_subctxts(struct hfi1_ctxtdata *uctxt,
-			 const struct hfi1_user_info *uinfo);
-static int init_user_ctxt(struct hfi1_filedata *fd);
+static void init_subctxts(struct hfi1_ctxtdata *uctxt,
+			  const struct hfi1_user_info *uinfo);
+static int init_user_ctxt(struct hfi1_filedata *fd,
+			  struct hfi1_ctxtdata *uctxt);
 static void user_init(struct hfi1_ctxtdata *uctxt);
 static int get_ctxt_info(struct hfi1_filedata *fd, void __user *ubase,
 			 __u32 len);
 static int get_base_info(struct hfi1_filedata *fd, void __user *ubase,
 			 __u32 len);
-static int setup_base_ctxt(struct hfi1_filedata *fd);
+static int setup_base_ctxt(struct hfi1_filedata *fd,
+			   struct hfi1_ctxtdata *uctxt);
 static int setup_subctxt(struct hfi1_ctxtdata *uctxt);
 
 static int find_sub_ctxt(struct hfi1_filedata *fd,
 			 const struct hfi1_user_info *uinfo);
 static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
-			 struct hfi1_user_info *uinfo);
+			 struct hfi1_user_info *uinfo,
+			 struct hfi1_ctxtdata **cd);
+static void deallocate_ctxt(struct hfi1_ctxtdata *uctxt);
 static unsigned int poll_urgent(struct file *fp, struct poll_table_struct *pt);
 static unsigned int poll_next(struct file *fp, struct poll_table_struct *pt);
 static int user_event_ack(struct hfi1_ctxtdata *uctxt, u16 subctxt,
@@ -116,7 +120,7 @@ static const struct file_operations hfi1_file_ops = {
 	.llseek = noop_llseek,
 };
 
-static struct vm_operations_struct vm_ops = {
+static const struct vm_operations_struct vm_ops = {
 	.fault = vma_fault,
 };
 
@@ -181,7 +185,7 @@ static int hfi1_file_open(struct inode *inode, struct file *fp)
 					       struct hfi1_devdata,
 					       user_cdev);
 
-	if (!((dd->flags & HFI1_PRESENT) && dd->kregbase))
+	if (!((dd->flags & HFI1_PRESENT) && dd->kregbase1))
 		return -EINVAL;
 
 	if (!atomic_inc_not_zero(&dd->user_refcount))
@@ -267,12 +271,14 @@ static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
 			/*
 			 * Copy the number of tidlist entries we used
 			 * and the length of the buffer we registered.
-			 * These fields are adjacent in the structure so
-			 * we can copy them at the same time.
 			 */
 			addr = arg + offsetof(struct hfi1_tid_info, tidcnt);
 			if (copy_to_user((void __user *)addr, &tinfo.tidcnt,
-					 sizeof(tinfo.tidcnt) +
+					 sizeof(tinfo.tidcnt)))
+				return -EFAULT;
+
+			addr = arg + offsetof(struct hfi1_tid_info, length);
+			if (copy_to_user((void __user *)addr, &tinfo.length,
 					 sizeof(tinfo.length)))
 				ret = -EFAULT;
 		}
@@ -388,8 +394,7 @@ static long hfi1_file_ioctl(struct file *fp, unsigned int cmd,
 
 			sc_disable(sc);
 			ret = sc_enable(sc);
-			hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_ENB,
-				     uctxt->ctxt);
+			hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_ENB, uctxt);
 		} else {
 			ret = sc_restart(sc);
 		}
@@ -425,8 +430,7 @@ static ssize_t hfi1_write_iter(struct kiocb *kiocb, struct iov_iter *from)
 	if (!iter_is_iovec(from) || !dim)
 		return -EINVAL;
 
-	hfi1_cdbg(SDMA, "SDMA request from %u:%u (%lu)",
-		  fd->uctxt->ctxt, fd->subctxt, dim);
+	trace_hfi1_sdma_request(fd->dd, fd->uctxt->ctxt, fd->subctxt, dim);
 
 	if (atomic_read(&pq->n_reqs) == pq->n_max_reqs)
 		return -ENOSPC;
@@ -752,12 +756,11 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 	if (!uctxt)
 		goto done;
 
-	hfi1_cdbg(PROC, "freeing ctxt %u:%u", uctxt->ctxt, fdata->subctxt);
-	mutex_lock(&hfi1_mutex);
+	hfi1_cdbg(PROC, "closing ctxt %u:%u", uctxt->ctxt, fdata->subctxt);
 
 	flush_wc();
 	/* drain user sdma queue */
-	hfi1_user_sdma_free_queues(fdata);
+	hfi1_user_sdma_free_queues(fdata, uctxt);
 
 	/* release the cpu */
 	hfi1_put_proc_affinity(fdata->rec_cpu_num);
@@ -766,6 +769,13 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 	hfi1_user_exp_rcv_free(fdata);
 
 	/*
+	 * fdata->uctxt is used in the above cleanup.  It is not ready to be
+	 * removed until here.
+	 */
+	fdata->uctxt = NULL;
+	hfi1_rcd_put(uctxt);
+
+	/*
 	 * Clear any left over, unhandled events so the next process that
 	 * gets this context doesn't get confused.
 	 */
@@ -773,13 +783,14 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 			   HFI1_MAX_SHARED_CTXTS) + fdata->subctxt;
 	*ev = 0;
 
+	spin_lock_irqsave(&dd->uctxt_lock, flags);
 	__clear_bit(fdata->subctxt, uctxt->in_use_ctxts);
 	if (!bitmap_empty(uctxt->in_use_ctxts, HFI1_MAX_SHARED_CTXTS)) {
-		mutex_unlock(&hfi1_mutex);
+		spin_unlock_irqrestore(&dd->uctxt_lock, flags);
 		goto done;
 	}
+	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
 
-	spin_lock_irqsave(&dd->uctxt_lock, flags);
 	/*
 	 * Disable receive context and interrupt available, reset all
 	 * RcvCtxtCtrl bits to default values.
@@ -790,34 +801,24 @@ static int hfi1_file_close(struct inode *inode, struct file *fp)
 		     HFI1_RCVCTRL_TAILUPD_DIS |
 		     HFI1_RCVCTRL_ONE_PKT_EGR_DIS |
 		     HFI1_RCVCTRL_NO_RHQ_DROP_DIS |
-		     HFI1_RCVCTRL_NO_EGR_DROP_DIS, uctxt->ctxt);
+		     HFI1_RCVCTRL_NO_EGR_DROP_DIS, uctxt);
 	/* Clear the context's J_KEY */
-	hfi1_clear_ctxt_jkey(dd, uctxt->ctxt);
+	hfi1_clear_ctxt_jkey(dd, uctxt);
 	/*
-	 * Reset context integrity checks to default.
-	 * (writes to CSRs probably belong in chip.c)
+	 * If a send context is allocated, reset context integrity
+	 * checks to default and disable the send context.
 	 */
-	write_kctxt_csr(dd, uctxt->sc->hw_context, SEND_CTXT_CHECK_ENABLE,
-			hfi1_pkt_default_send_ctxt_mask(dd, uctxt->sc->type));
-	sc_disable(uctxt->sc);
-	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+	if (uctxt->sc) {
+		set_pio_integrity(uctxt->sc);
+		sc_disable(uctxt->sc);
+	}
 
-	dd->rcd[uctxt->ctxt] = NULL;
-
-	hfi1_user_exp_rcv_grp_free(uctxt);
+	hfi1_free_ctxt_rcv_groups(uctxt);
 	hfi1_clear_ctxt_pkey(dd, uctxt);
 
-	uctxt->rcvwait_to = 0;
-	uctxt->piowait_to = 0;
-	uctxt->rcvnowait = 0;
-	uctxt->pionowait = 0;
 	uctxt->event_flags = 0;
 
-	hfi1_stats.sps_ctxts--;
-	if (++dd->freectxts == dd->num_user_contexts)
-		aspm_enable_all(dd);
-	mutex_unlock(&hfi1_mutex);
-	hfi1_free_ctxtdata(dd, uctxt);
+	deallocate_ctxt(uctxt);
 done:
 	mmdrop(fdata->mm);
 	kobject_put(&dd->kobj);
@@ -845,135 +846,211 @@ static u64 kvirt_to_phys(void *addr)
 	return paddr;
 }
 
-static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
+/**
+ * complete_subctxt
+ * @fd: valid filedata pointer
+ *
+ * Sub-context info can only be set up after the base context
+ * has been completed.  This is indicated by the clearing of the
+ * HFI1_CTXT_BASE_UNINIT bit.
+ *
+ * Wait for the bit to be cleared, and then complete the subcontext
+ * initialization.
+ *
+ */
+static int complete_subctxt(struct hfi1_filedata *fd)
 {
 	int ret;
-	unsigned int swmajor, swminor;
+	unsigned long flags;
 
-	swmajor = uinfo->userversion >> 16;
-	if (swmajor != HFI1_USER_SWMAJOR)
-		return -ENODEV;
-
-	swminor = uinfo->userversion & 0xffff;
-
-	mutex_lock(&hfi1_mutex);
 	/*
-	 * Get a sub context if necessary.
-	 * ret < 0 error, 0 no context, 1 sub-context found
+	 * sub-context info can only be set up after the base context
+	 * has been completed.
 	 */
-	ret = 0;
-	if (uinfo->subctxt_cnt) {
-		ret = find_sub_ctxt(fd, uinfo);
-		if (ret > 0)
-			fd->rec_cpu_num =
-				hfi1_get_proc_affinity(fd->uctxt->numa_id);
+	ret = wait_event_interruptible(
+		fd->uctxt->wait,
+		!test_bit(HFI1_CTXT_BASE_UNINIT, &fd->uctxt->event_flags));
+
+	if (test_bit(HFI1_CTXT_BASE_FAILED, &fd->uctxt->event_flags))
+		ret = -ENOMEM;
+
+	/* Finish the sub-context init */
+	if (!ret) {
+		fd->rec_cpu_num = hfi1_get_proc_affinity(fd->uctxt->numa_id);
+		ret = init_user_ctxt(fd, fd->uctxt);
 	}
 
-	/*
-	 * Allocate a base context if context sharing is not required or we
-	 * couldn't find a sub context.
-	 */
-	if (!ret)
-		ret = allocate_ctxt(fd, fd->dd, uinfo);
-
-	mutex_unlock(&hfi1_mutex);
-
-	/* Depending on the context type, do the appropriate init */
-	if (ret > 0) {
-		/*
-		 * sub-context info can only be set up after the base
-		 * context has been completed.
-		 */
-		ret = wait_event_interruptible(fd->uctxt->wait, !test_bit(
-					       HFI1_CTXT_BASE_UNINIT,
-					       &fd->uctxt->event_flags));
-		if (test_bit(HFI1_CTXT_BASE_FAILED, &fd->uctxt->event_flags)) {
-			clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
-			return -ENOMEM;
-		}
-		/* The only thing a sub context needs is the user_xxx stuff */
-		if (!ret)
-			ret = init_user_ctxt(fd);
-
-		if (ret)
-			clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
-	} else if (!ret) {
-		ret = setup_base_ctxt(fd);
-		if (fd->uctxt->subctxt_cnt) {
-			/* If there is an error, set the failed bit. */
-			if (ret)
-				set_bit(HFI1_CTXT_BASE_FAILED,
-					&fd->uctxt->event_flags);
-			/*
-			 * Base context is done, notify anybody using a
-			 * sub-context that is waiting for this completion
-			 */
-			clear_bit(HFI1_CTXT_BASE_UNINIT,
-				  &fd->uctxt->event_flags);
-			wake_up(&fd->uctxt->wait);
-		}
+	if (ret) {
+		spin_lock_irqsave(&fd->dd->uctxt_lock, flags);
+		__clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
+		spin_unlock_irqrestore(&fd->dd->uctxt_lock, flags);
+		hfi1_rcd_put(fd->uctxt);
+		fd->uctxt = NULL;
 	}
 
 	return ret;
 }
 
-/*
+static int assign_ctxt(struct hfi1_filedata *fd, struct hfi1_user_info *uinfo)
+{
+	int ret;
+	unsigned int swmajor, swminor;
+	struct hfi1_ctxtdata *uctxt = NULL;
+
+	swmajor = uinfo->userversion >> 16;
+	if (swmajor != HFI1_USER_SWMAJOR)
+		return -ENODEV;
+
+	if (uinfo->subctxt_cnt > HFI1_MAX_SHARED_CTXTS)
+		return -EINVAL;
+
+	swminor = uinfo->userversion & 0xffff;
+
+	/*
+	 * Acquire the mutex to protect against multiple creations of what
+	 * could be a shared base context.
+	 */
+	mutex_lock(&hfi1_mutex);
+	/*
+	 * Get a sub context if available (fd->uctxt will be set).
+	 * ret < 0 error, 0 no context, 1 sub-context found
+	 */
+	ret = find_sub_ctxt(fd, uinfo);
+
+	/*
+	 * Allocate a base context if context sharing is not required or a
+	 * sub context wasn't found.
+	 */
+	if (!ret)
+		ret = allocate_ctxt(fd, fd->dd, uinfo, &uctxt);
+
+	mutex_unlock(&hfi1_mutex);
+
+	/* Depending on the context type, finish the appropriate init */
+	switch (ret) {
+	case 0:
+		ret = setup_base_ctxt(fd, uctxt);
+		if (uctxt->subctxt_cnt) {
+			/*
+			 * Base context is done (successfully or not), notify
+			 * anybody using a sub-context that is waiting for
+			 * this completion.
+			 */
+			clear_bit(HFI1_CTXT_BASE_UNINIT, &uctxt->event_flags);
+			wake_up(&uctxt->wait);
+		}
+		break;
+	case 1:
+		ret = complete_subctxt(fd);
+		break;
+	default:
+		break;
+	}
+
+	return ret;
+}
+
+/**
+ * match_ctxt - check a context against a sub-context request
+ * @fd: valid filedata pointer
+ * @uinfo: user info to compare base context with
+ * @uctxt: context to compare uinfo to.
+ *
+ * Compare the given context with the given information to see if it
+ * can be used for a sub context.
+ */
+static int match_ctxt(struct hfi1_filedata *fd,
+		      const struct hfi1_user_info *uinfo,
+		      struct hfi1_ctxtdata *uctxt)
+{
+	struct hfi1_devdata *dd = fd->dd;
+	unsigned long flags;
+	u16 subctxt;
+
+	/* Skip dynamically allocated kernel contexts */
+	if (uctxt->sc && (uctxt->sc->type == SC_KERNEL))
+		return 0;
+
+	/* Skip ctxt if it doesn't match the requested one */
+	if (memcmp(uctxt->uuid, uinfo->uuid, sizeof(uctxt->uuid)) ||
+	    uctxt->jkey != generate_jkey(current_uid()) ||
+	    uctxt->subctxt_id != uinfo->subctxt_id ||
+	    uctxt->subctxt_cnt != uinfo->subctxt_cnt)
+		return 0;
+
+	/* Verify the sharing process matches the base */
+	if (uctxt->userversion != uinfo->userversion)
+		return -EINVAL;
+
+	/* Find an unused sub context */
+	spin_lock_irqsave(&dd->uctxt_lock, flags);
+	if (bitmap_empty(uctxt->in_use_ctxts, HFI1_MAX_SHARED_CTXTS)) {
+		/* context is being closed, do not use */
+		spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+		return 0;
+	}
+
+	subctxt = find_first_zero_bit(uctxt->in_use_ctxts,
+				      HFI1_MAX_SHARED_CTXTS);
+	if (subctxt >= uctxt->subctxt_cnt) {
+		spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+		return -EBUSY;
+	}
+
+	fd->subctxt = subctxt;
+	__set_bit(fd->subctxt, uctxt->in_use_ctxts);
+	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+
+	fd->uctxt = uctxt;
+	hfi1_rcd_get(uctxt);
+
+	return 1;
+}
+
+/**
+ * find_sub_ctxt - look for a base context to share
+ * @fd: valid filedata pointer
+ * @uinfo: matching info to use to find a possible context to share.
+ *
  * The hfi1_mutex must be held when this function is called.  It is
- * necessary to ensure serialized access to the bitmask in_use_ctxts.
+ * necessary to ensure serialized creation of shared contexts.
+ *
+ * Return:
+ *    0      No sub-context found
+ *    1      Subcontext found and allocated
+ *    errno  EINVAL (incorrect parameters)
+ *           EBUSY (all sub contexts in use)
  */
 static int find_sub_ctxt(struct hfi1_filedata *fd,
 			 const struct hfi1_user_info *uinfo)
 {
-	int i;
+	struct hfi1_ctxtdata *uctxt;
 	struct hfi1_devdata *dd = fd->dd;
-	u16 subctxt;
+	u16 i;
+	int ret;
+
+	if (!uinfo->subctxt_cnt)
+		return 0;
 
 	for (i = dd->first_dyn_alloc_ctxt; i < dd->num_rcv_contexts; i++) {
-		struct hfi1_ctxtdata *uctxt = dd->rcd[i];
-
-		/* Skip ctxts which are not yet open */
-		if (!uctxt ||
-		    bitmap_empty(uctxt->in_use_ctxts,
-				 HFI1_MAX_SHARED_CTXTS))
-			continue;
-
-		/* Skip dynamically allocted kernel contexts */
-		if (uctxt->sc && (uctxt->sc->type == SC_KERNEL))
-			continue;
-
-		/* Skip ctxt if it doesn't match the requested one */
-		if (memcmp(uctxt->uuid, uinfo->uuid,
-			   sizeof(uctxt->uuid)) ||
-		    uctxt->jkey != generate_jkey(current_uid()) ||
-		    uctxt->subctxt_id != uinfo->subctxt_id ||
-		    uctxt->subctxt_cnt != uinfo->subctxt_cnt)
-			continue;
-
-		/* Verify the sharing process matches the master */
-		if (uctxt->userversion != uinfo->userversion)
-			return -EINVAL;
-
-		/* Find an unused context */
-		subctxt = find_first_zero_bit(uctxt->in_use_ctxts,
-					      HFI1_MAX_SHARED_CTXTS);
-		if (subctxt >= uctxt->subctxt_cnt)
-			return -EBUSY;
-
-		fd->uctxt = uctxt;
-		fd->subctxt = subctxt;
-		__set_bit(fd->subctxt, uctxt->in_use_ctxts);
-
-		return 1;
+		uctxt = hfi1_rcd_get_by_index(dd, i);
+		if (uctxt) {
+			ret = match_ctxt(fd, uinfo, uctxt);
+			hfi1_rcd_put(uctxt);
+			/* a non-zero result (match or error) ends the search */
+			if (ret)
+				return ret;
+		}
 	}
 
 	return 0;
 }
 
 static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
-			 struct hfi1_user_info *uinfo)
+			 struct hfi1_user_info *uinfo,
+			 struct hfi1_ctxtdata **rcd)
 {
 	struct hfi1_ctxtdata *uctxt;
-	unsigned int ctxt;
 	int ret, numa;
 
 	if (dd->flags & HFI1_FROZEN) {
@@ -987,22 +1064,9 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
 		return -EIO;
 	}
 
-	/*
-	 * This check is sort of redundant to the next EBUSY error. It would
-	 * also indicate an inconsistancy in the driver if this value was
-	 * zero, but there were still contexts available.
-	 */
 	if (!dd->freectxts)
 		return -EBUSY;
 
-	for (ctxt = dd->first_dyn_alloc_ctxt;
-	     ctxt < dd->num_rcv_contexts; ctxt++)
-		if (!dd->rcd[ctxt])
-			break;
-
-	if (ctxt == dd->num_rcv_contexts)
-		return -EBUSY;
-
 	/*
 	 * If we don't have a NUMA node requested, preference is towards
 	 * device NUMA node.
@@ -1012,11 +1076,10 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
 		numa = cpu_to_node(fd->rec_cpu_num);
 	else
 		numa = numa_node_id();
-	uctxt = hfi1_create_ctxtdata(dd->pport, ctxt, numa);
-	if (!uctxt) {
-		dd_dev_err(dd,
-			   "Unable to allocate ctxtdata memory, failing open\n");
-		return -ENOMEM;
+	ret = hfi1_create_ctxtdata(dd->pport, numa, &uctxt);
+	if (ret < 0) {
+		dd_dev_err(dd, "user ctxtdata allocation failed\n");
+		return ret;
 	}
 	hfi1_cdbg(PROC, "[%u:%u] pid %u assigned to CPU %d (NUMA %u)",
 		  uctxt->ctxt, fd->subctxt, current->pid, fd->rec_cpu_num,
@@ -1025,8 +1088,7 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
 	/*
 	 * Allocate and enable a PIO send context.
 	 */
-	uctxt->sc = sc_alloc(dd, SC_USER, uctxt->rcvhdrqentsize,
-			     uctxt->dd->node);
+	uctxt->sc = sc_alloc(dd, SC_USER, uctxt->rcvhdrqentsize, dd->node);
 	if (!uctxt->sc) {
 		ret = -ENOMEM;
 		goto ctxdata_free;
@@ -1038,28 +1100,19 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
 		goto ctxdata_free;
 
 	/*
-	 * Setup sub context resources if the user-level has requested
+	 * Setup sub context information if the user-level has requested
 	 * sub contexts.
 	 * This has to be done here so the rest of the sub-contexts find the
-	 * proper master.
+	 * proper base context.
 	 */
-	if (uinfo->subctxt_cnt) {
-		ret = init_subctxts(uctxt, uinfo);
-		/*
-		 * On error, we don't need to disable and de-allocate the
-		 * send context because it will be done during file close
-		 */
-		if (ret)
-			goto ctxdata_free;
-	}
+	if (uinfo->subctxt_cnt)
+		init_subctxts(uctxt, uinfo);
 	uctxt->userversion = uinfo->userversion;
 	uctxt->flags = hfi1_cap_mask; /* save current flag state */
 	init_waitqueue_head(&uctxt->wait);
 	strlcpy(uctxt->comm, current->comm, sizeof(uctxt->comm));
 	memcpy(uctxt->uuid, uinfo->uuid, sizeof(uctxt->uuid));
 	uctxt->jkey = generate_jkey(current_uid());
-	INIT_LIST_HEAD(&uctxt->sdma_queues);
-	spin_lock_init(&uctxt->sdma_qlock);
 	hfi1_stats.sps_ctxts++;
 	/*
 	 * Disable ASPM when there are open user/PSM contexts to avoid
@@ -1067,31 +1120,33 @@ static int allocate_ctxt(struct hfi1_filedata *fd, struct hfi1_devdata *dd,
 	 */
 	if (dd->freectxts-- == dd->num_user_contexts)
 		aspm_disable_all(dd);
-	fd->uctxt = uctxt;
+
+	*rcd = uctxt;
 
 	return 0;
 
 ctxdata_free:
-	dd->rcd[ctxt] = NULL;
-	hfi1_free_ctxtdata(dd, uctxt);
+	hfi1_free_ctxt(uctxt);
 	return ret;
 }
 
-static int init_subctxts(struct hfi1_ctxtdata *uctxt,
-			 const struct hfi1_user_info *uinfo)
+static void deallocate_ctxt(struct hfi1_ctxtdata *uctxt)
 {
-	u16 num_subctxts;
+	mutex_lock(&hfi1_mutex);
+	hfi1_stats.sps_ctxts--;
+	if (++uctxt->dd->freectxts == uctxt->dd->num_user_contexts)
+		aspm_enable_all(uctxt->dd);
+	mutex_unlock(&hfi1_mutex);
 
-	num_subctxts = uinfo->subctxt_cnt;
-	if (num_subctxts > HFI1_MAX_SHARED_CTXTS)
-		return -EINVAL;
+	hfi1_free_ctxt(uctxt);
+}
 
+static void init_subctxts(struct hfi1_ctxtdata *uctxt,
+			  const struct hfi1_user_info *uinfo)
+{
 	uctxt->subctxt_cnt = uinfo->subctxt_cnt;
 	uctxt->subctxt_id = uinfo->subctxt_id;
-	uctxt->redirect_seq_cnt = 1;
 	set_bit(HFI1_CTXT_BASE_UNINIT, &uctxt->event_flags);
-
-	return 0;
 }
 
 static int setup_subctxt(struct hfi1_ctxtdata *uctxt)
@@ -1153,7 +1208,7 @@ static void user_init(struct hfi1_ctxtdata *uctxt)
 		clear_rcvhdrtail(uctxt);
 
 	/* Setup J_KEY before enabling the context */
-	hfi1_set_ctxt_jkey(uctxt->dd, uctxt->ctxt, uctxt->jkey);
+	hfi1_set_ctxt_jkey(uctxt->dd, uctxt, uctxt->jkey);
 
 	rcvctrl_ops = HFI1_RCVCTRL_CTXT_ENB;
 	if (HFI1_CAP_UGET_MASK(uctxt->flags, HDRSUPP))
@@ -1179,7 +1234,7 @@ static void user_init(struct hfi1_ctxtdata *uctxt)
 		rcvctrl_ops |= HFI1_RCVCTRL_TAILUPD_ENB;
 	else
 		rcvctrl_ops |= HFI1_RCVCTRL_TAILUPD_DIS;
-	hfi1_rcvctrl(uctxt->dd, rcvctrl_ops, uctxt->ctxt);
+	hfi1_rcvctrl(uctxt->dd, rcvctrl_ops, uctxt);
 }
 
 static int get_ctxt_info(struct hfi1_filedata *fd, void __user *ubase,
@@ -1223,23 +1278,25 @@ static int get_ctxt_info(struct hfi1_filedata *fd, void __user *ubase,
 	return ret;
 }
 
-static int init_user_ctxt(struct hfi1_filedata *fd)
+static int init_user_ctxt(struct hfi1_filedata *fd,
+			  struct hfi1_ctxtdata *uctxt)
 {
-	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 	int ret;
 
 	ret = hfi1_user_sdma_alloc_queues(uctxt, fd);
 	if (ret)
 		return ret;
 
-	ret = hfi1_user_exp_rcv_init(fd);
+	ret = hfi1_user_exp_rcv_init(fd, uctxt);
+	if (ret)
+		hfi1_user_sdma_free_queues(fd, uctxt);
 
 	return ret;
 }
 
-static int setup_base_ctxt(struct hfi1_filedata *fd)
+static int setup_base_ctxt(struct hfi1_filedata *fd,
+			   struct hfi1_ctxtdata *uctxt)
 {
-	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 	struct hfi1_devdata *dd = uctxt->dd;
 	int ret = 0;
 
@@ -1260,20 +1317,27 @@ static int setup_base_ctxt(struct hfi1_filedata *fd)
 	if (ret)
 		goto setup_failed;
 
-	ret = hfi1_user_exp_rcv_grp_init(fd);
+	ret = hfi1_alloc_ctxt_rcv_groups(uctxt);
 	if (ret)
 		goto setup_failed;
 
-	ret = init_user_ctxt(fd);
+	ret = init_user_ctxt(fd, uctxt);
 	if (ret)
 		goto setup_failed;
 
 	user_init(uctxt);
 
+	/* Now that the context is set up, the fd can get a reference. */
+	fd->uctxt = uctxt;
+	hfi1_rcd_get(uctxt);
+
 	return 0;
 
 setup_failed:
-	hfi1_free_ctxtdata(dd, uctxt);
+	/* Set the failed bit so sub-context init can do the right thing */
+	set_bit(HFI1_CTXT_BASE_FAILED, &uctxt->event_flags);
+	deallocate_ctxt(uctxt);
+
 	return ret;
 }
 
@@ -1390,7 +1454,7 @@ static unsigned int poll_next(struct file *fp,
 	spin_lock_irq(&dd->uctxt_lock);
 	if (hdrqempty(uctxt)) {
 		set_bit(HFI1_CTXT_WAITING_RCV, &uctxt->event_flags);
-		hfi1_rcvctrl(dd, HFI1_RCVCTRL_INTRAVAIL_ENB, uctxt->ctxt);
+		hfi1_rcvctrl(dd, HFI1_RCVCTRL_INTRAVAIL_ENB, uctxt);
 		pollflag = 0;
 	} else {
 		pollflag = POLLIN | POLLRDNORM;
@@ -1409,19 +1473,14 @@ int hfi1_set_uevent_bits(struct hfi1_pportdata *ppd, const int evtbit)
 {
 	struct hfi1_ctxtdata *uctxt;
 	struct hfi1_devdata *dd = ppd->dd;
-	unsigned ctxt;
-	int ret = 0;
-	unsigned long flags;
+	u16 ctxt;
 
-	if (!dd->events) {
-		ret = -EINVAL;
-		goto done;
-	}
+	if (!dd->events)
+		return -EINVAL;
 
-	spin_lock_irqsave(&dd->uctxt_lock, flags);
 	for (ctxt = dd->first_dyn_alloc_ctxt; ctxt < dd->num_rcv_contexts;
 	     ctxt++) {
-		uctxt = dd->rcd[ctxt];
+		uctxt = hfi1_rcd_get_by_index(dd, ctxt);
 		if (uctxt) {
 			unsigned long *evs = dd->events +
 				(uctxt->ctxt - dd->first_dyn_alloc_ctxt) *
@@ -1434,11 +1493,11 @@ int hfi1_set_uevent_bits(struct hfi1_pportdata *ppd, const int evtbit)
 			set_bit(evtbit, evs);
 			for (i = 1; i < uctxt->subctxt_cnt; i++)
 				set_bit(evtbit, evs + i);
+			hfi1_rcd_put(uctxt);
 		}
 	}
-	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
-done:
-	return ret;
+
+	return 0;
 }
 
 /**
@@ -1475,7 +1534,7 @@ static int manage_rcvq(struct hfi1_ctxtdata *uctxt, u16 subctxt,
 	} else {
 		rcvctrl_op = HFI1_RCVCTRL_CTXT_DIS;
 	}
-	hfi1_rcvctrl(dd, rcvctrl_op, uctxt->ctxt);
+	hfi1_rcvctrl(dd, rcvctrl_op, uctxt);
 	/* always; new head should be equal to new tail; see above */
 bail:
 	return 0;
@@ -1525,7 +1584,7 @@ static int set_ctxt_pkey(struct hfi1_ctxtdata *uctxt, u16 subctxt, u16 pkey)
 		}
 
 	if (intable)
-		ret = hfi1_set_ctxt_pkey(dd, uctxt->ctxt, pkey);
+		ret = hfi1_set_ctxt_pkey(dd, uctxt, pkey);
 done:
 	return ret;
 }
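The sub-context slot handling in match_ctxt() above can be sketched in user space as follows. This is a minimal illustration, not the driver code: struct base_ctx, claim_subctxt() and release_subctxt() are hypothetical names, a plain counter stands in for the kref, locking is left out, and it assumes the base context itself occupies slot 0. The return convention mirrors the one documented for find_sub_ctxt(): 0 for "not usable", 1 for "slot claimed", -EBUSY for "all slots in use".

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define MAX_SHARED 8	/* stands in for HFI1_MAX_SHARED_CTXTS (assumption) */

/* Hypothetical stand-in for the shared parts of struct hfi1_ctxtdata. */
struct base_ctx {
	uint8_t in_use;		/* one bit per sub-context slot */
	unsigned int cnt;	/* slots requested when the base was opened */
	unsigned int refs;	/* plain counter standing in for the kref */
};

/*
 * Claim the first free slot.
 * Returns 1 and stores the slot on success, 0 if the context has no
 * users left (it is being closed), -EBUSY if every slot is taken.
 */
static int claim_subctxt(struct base_ctx *b, unsigned int *slot)
{
	unsigned int i;

	if (b->in_use == 0)
		return 0;

	/* equivalent of find_first_zero_bit() over MAX_SHARED bits */
	for (i = 0; i < MAX_SHARED; i++)
		if (!(b->in_use & (1u << i)))
			break;

	if (i >= b->cnt)
		return -EBUSY;

	b->in_use |= (uint8_t)(1u << i);
	b->refs++;		/* the new user holds a reference */
	*slot = i;
	return 1;
}

/* Give the slot back first, then drop the reference. */
static void release_subctxt(struct base_ctx *b, unsigned int slot)
{
	assert(slot < MAX_SHARED);
	b->in_use &= (uint8_t)~(1u << slot);
	b->refs--;
}
```

With a base context opened for three sub-contexts (in_use = 0x1, cnt = 3), two calls to claim_subctxt() hand out slots 1 and 2 and a third call returns -EBUSY. The order in release_subctxt() matters in the real driver as well: the slot bit has to be cleared while the context pointer is still valid.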
diff --git a/drivers/infiniband/hw/hfi1/firmware.c b/drivers/infiniband/hw/hfi1/firmware.c
index 4042c11..5aea8f4 100644
--- a/drivers/infiniband/hw/hfi1/firmware.c
+++ b/drivers/infiniband/hw/hfi1/firmware.c
@@ -64,30 +64,22 @@
 #define DEFAULT_FW_FABRIC_NAME "hfi1_fabric.fw"
 #define DEFAULT_FW_SBUS_NAME "hfi1_sbus.fw"
 #define DEFAULT_FW_PCIE_NAME "hfi1_pcie.fw"
-#define DEFAULT_PLATFORM_CONFIG_NAME "hfi1_platform.dat"
 #define ALT_FW_8051_NAME_ASIC "hfi1_dc8051_d.fw"
 #define ALT_FW_FABRIC_NAME "hfi1_fabric_d.fw"
 #define ALT_FW_SBUS_NAME "hfi1_sbus_d.fw"
 #define ALT_FW_PCIE_NAME "hfi1_pcie_d.fw"
+#define HOST_INTERFACE_VERSION 1
 
 static uint fw_8051_load = 1;
 static uint fw_fabric_serdes_load = 1;
 static uint fw_pcie_serdes_load = 1;
 static uint fw_sbus_load = 1;
 
-/*
- * Access required in platform.c
- * Maintains state of whether the platform config was fetched via the
- * fallback option
- */
-uint platform_config_load;
-
 /* Firmware file names get set in hfi1_firmware_init() based on the above */
 static char *fw_8051_name;
 static char *fw_fabric_serdes_name;
 static char *fw_sbus_name;
 static char *fw_pcie_serdes_name;
-static char *platform_config_name;
 
 #define SBUS_MAX_POLL_COUNT 100
 #define SBUS_COUNTER(reg, name) \
@@ -177,7 +169,6 @@ static struct firmware_details fw_8051;
 static struct firmware_details fw_fabric;
 static struct firmware_details fw_pcie;
 static struct firmware_details fw_sbus;
-static const struct firmware *platform_config;
 
 /* flags for turn_off_spicos() */
 #define SPICO_SBUS   0x1
@@ -615,6 +606,14 @@ static void __obtain_firmware(struct hfi1_devdata *dd)
 		fw_fabric_serdes_name = ALT_FW_FABRIC_NAME;
 		fw_sbus_name = ALT_FW_SBUS_NAME;
 		fw_pcie_serdes_name = ALT_FW_PCIE_NAME;
+
+		/*
+		 * Add a delay before obtaining and loading debug firmware.
+		 * Authorization will fail if the delay between firmware
+		 * authorization events is shorter than 50us. Sleep for at
+		 * least 100us to leave a safe margin.
+		 */
+		usleep_range(100, 120);
 	}
 
 	if (fw_sbus_load) {
@@ -675,7 +674,6 @@ static void __obtain_firmware(struct hfi1_devdata *dd)
 static int obtain_firmware(struct hfi1_devdata *dd)
 {
 	unsigned long timeout;
-	int err = 0;
 
 	mutex_lock(&fw_mutex);
 
@@ -699,38 +697,11 @@ static int obtain_firmware(struct hfi1_devdata *dd)
 	}
 	/* not in FW_TRY state */
 
-	if (fw_state == FW_FINAL) {
-		if (platform_config) {
-			dd->platform_config.data = platform_config->data;
-			dd->platform_config.size = platform_config->size;
-		}
-		goto done;	/* already acquired */
-	} else if (fw_state == FW_ERR) {
-		goto done;	/* already tried and failed */
-	}
-	/* fw_state is FW_EMPTY */
-
 	/* set fw_state to FW_TRY, FW_FINAL, or FW_ERR, and fw_err */
-	__obtain_firmware(dd);
+	if (fw_state == FW_EMPTY)
+		__obtain_firmware(dd);
 
-	if (platform_config_load) {
-		platform_config = NULL;
-		err = request_firmware(&platform_config, platform_config_name,
-				       &dd->pcidev->dev);
-		if (err) {
-			platform_config = NULL;
-			dd_dev_err(dd,
-				   "%s: No default platform config file found\n",
-				   __func__);
-			goto done;
-		}
-		dd->platform_config.data = platform_config->data;
-		dd->platform_config.size = platform_config->size;
-	}
-
-done:
 	mutex_unlock(&fw_mutex);
-
 	return fw_err;
 }
 
@@ -752,9 +723,6 @@ void dispose_firmware(void)
 	dispose_one_firmware(&fw_pcie);
 	dispose_one_firmware(&fw_sbus);
 
-	release_firmware(platform_config);
-	platform_config = NULL;
-
 	/* retain the error state, otherwise revert to empty */
 	if (fw_state != FW_ERR)
 		fw_state = FW_EMPTY;
@@ -1079,6 +1047,13 @@ static int load_8051_firmware(struct hfi1_devdata *dd,
 	dd_dev_info(dd, "8051 firmware version %d.%d.%d\n",
 		    (int)ver_major, (int)ver_minor, (int)ver_patch);
 	dd->dc8051_ver = dc8051_ver(ver_major, ver_minor, ver_patch);
+	ret = write_host_interface_version(dd, HOST_INTERFACE_VERSION);
+	if (ret != HCMD_SUCCESS) {
+		dd_dev_err(dd,
+			   "Failed to set host interface version, return 0x%x\n",
+			   ret);
+		return -EIO;
+	}
 
 	return 0;
 }
@@ -1709,10 +1684,8 @@ int hfi1_firmware_init(struct hfi1_devdata *dd)
 	}
 
 	/* no 8051 or QSFP on simulator */
-	if (dd->icode == ICODE_FUNCTIONAL_SIMULATOR) {
+	if (dd->icode == ICODE_FUNCTIONAL_SIMULATOR)
 		fw_8051_load = 0;
-		platform_config_load = 0;
-	}
 
 	if (!fw_8051_name) {
 		if (dd->icode == ICODE_RTL_SILICON)
@@ -1726,8 +1699,6 @@ int hfi1_firmware_init(struct hfi1_devdata *dd)
 		fw_sbus_name = DEFAULT_FW_SBUS_NAME;
 	if (!fw_pcie_serdes_name)
 		fw_pcie_serdes_name = DEFAULT_FW_PCIE_NAME;
-	if (!platform_config_name)
-		platform_config_name = DEFAULT_PLATFORM_CONFIG_NAME;
 
 	return obtain_firmware(dd);
 }
@@ -1773,6 +1744,7 @@ static int check_meta_version(struct hfi1_devdata *dd, u32 *system_table)
 int parse_platform_config(struct hfi1_devdata *dd)
 {
 	struct platform_config_cache *pcfgcache = &dd->pcfg_cache;
+	struct hfi1_pportdata *ppd = dd->pport;
 	u32 *ptr = NULL;
 	u32 header1 = 0, header2 = 0, magic_num = 0, crc = 0, file_length = 0;
 	u32 record_idx = 0, table_type = 0, table_length_dwords = 0;
@@ -1784,7 +1756,7 @@ int parse_platform_config(struct hfi1_devdata *dd)
 	 * scratch register bitmap, thus there is no platform config to parse.
 	 * Skip parsing in these situations.
 	 */
-	if (is_integrated(dd) && !platform_config_load)
+	if (ppd->config_from_scratch)
 		return 0;
 
 	if (!dd->platform_config.data) {
@@ -2073,13 +2045,14 @@ int get_platform_config_field(struct hfi1_devdata *dd,
 	int ret = 0, wlen = 0, seek = 0;
 	u32 field_len_bits = 0, field_start_bits = 0, *src_ptr = NULL;
 	struct platform_config_cache *pcfgcache = &dd->pcfg_cache;
+	struct hfi1_pportdata *ppd = dd->pport;
 
 	if (data)
 		memset(data, 0, len);
 	else
 		return -EINVAL;
 
-	if (is_integrated(dd) && !platform_config_load) {
+	if (ppd->config_from_scratch) {
 		/*
 		 * Use saved configuration from ppd for integrated platforms
 		 */
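The hfi.h changes that follow add accessors for the OPA 16B header, where the 24-bit DLID and SLID are each split across two header words. The user-space sketch below shows that bit arithmetic. The masks and shifts are copied from the hfi.h hunk; struct hdr16b (four 32-bit words) is an assumed stand-in for struct hfi1_16b_header, and set_dlid()/set_slid() are hypothetical helpers added only to build test values.

```c
#include <assert.h>
#include <stdint.h>

/* Masks and shifts copied from the hfi.h hunk. */
#define OPA_16B_L4_MASK          0xFFull
#define OPA_16B_SC_MASK          0x1F00000ull
#define OPA_16B_SC_SHIFT         20
#define OPA_16B_LID_MASK         0xFFFFFull
#define OPA_16B_DLID_MASK        0xF000ull
#define OPA_16B_DLID_SHIFT       20
#define OPA_16B_DLID_HIGH_SHIFT  12
#define OPA_16B_SLID_MASK        0xF00ull
#define OPA_16B_SLID_SHIFT       20
#define OPA_16B_SLID_HIGH_SHIFT  8
#define OPA_16B_PKEY_MASK        0xFFFF0000ull
#define OPA_16B_PKEY_SHIFT       16

/* Assumed layout: four 32-bit link header words. */
struct hdr16b {
	uint32_t lrh[4];
};

/* DLID: low 20 bits in lrh[1], high 4 bits in lrh[2] bits 12..15. */
static uint32_t get_dlid(const struct hdr16b *h)
{
	return (uint32_t)((h->lrh[1] & OPA_16B_LID_MASK) |
			  (((h->lrh[2] & OPA_16B_DLID_MASK) >>
			    OPA_16B_DLID_HIGH_SHIFT) << OPA_16B_DLID_SHIFT));
}

/* SLID: low 20 bits in lrh[0], high 4 bits in lrh[2] bits 8..11. */
static uint32_t get_slid(const struct hdr16b *h)
{
	return (uint32_t)((h->lrh[0] & OPA_16B_LID_MASK) |
			  (((h->lrh[2] & OPA_16B_SLID_MASK) >>
			    OPA_16B_SLID_HIGH_SHIFT) << OPA_16B_SLID_SHIFT));
}

static uint8_t get_sc(const struct hdr16b *h)
{
	return (uint8_t)((h->lrh[1] & OPA_16B_SC_MASK) >> OPA_16B_SC_SHIFT);
}

static uint16_t get_pkey(const struct hdr16b *h)
{
	return (uint16_t)((h->lrh[2] & OPA_16B_PKEY_MASK) >> OPA_16B_PKEY_SHIFT);
}

static uint8_t get_l4(const struct hdr16b *h)
{
	return (uint8_t)(h->lrh[2] & OPA_16B_L4_MASK);
}

/* Hypothetical packers (not in the patch): inverse of the getters. */
static void set_dlid(struct hdr16b *h, uint32_t dlid)
{
	assert(dlid <= 0xFFFFFFu);	/* a 16B LID is 24 bits wide */
	h->lrh[1] = (h->lrh[1] & ~(uint32_t)OPA_16B_LID_MASK) |
		    (dlid & (uint32_t)OPA_16B_LID_MASK);
	h->lrh[2] = (h->lrh[2] & ~(uint32_t)OPA_16B_DLID_MASK) |
		    ((dlid >> OPA_16B_DLID_SHIFT) << OPA_16B_DLID_HIGH_SHIFT);
}

static void set_slid(struct hdr16b *h, uint32_t slid)
{
	assert(slid <= 0xFFFFFFu);
	h->lrh[0] = (h->lrh[0] & ~(uint32_t)OPA_16B_LID_MASK) |
		    (slid & (uint32_t)OPA_16B_LID_MASK);
	h->lrh[2] = (h->lrh[2] & ~(uint32_t)OPA_16B_SLID_MASK) |
		    ((slid >> OPA_16B_SLID_SHIFT) << OPA_16B_SLID_HIGH_SHIFT);
}
```

Starting from a zeroed header, set_dlid(&h, 0xABCDEF) leaves 0xBCDEF in lrh[1] and 0xA000 in lrh[2], and get_dlid() reassembles 0xABCDEF. The LID values here are made up for illustration. The split also explains why the patch widens the lid field in struct hfi1_pportdata from u16 to u32: a 24-bit LID no longer fits in 16 bits.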
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 414a04a..3ac9c30 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -66,9 +66,11 @@
 #include <linux/i2c.h>
 #include <linux/i2c-algo-bit.h>
 #include <rdma/ib_hdrs.h>
+#include <rdma/opa_addr.h>
 #include <linux/rhashtable.h>
 #include <linux/netdevice.h>
 #include <rdma/rdma_vt.h>
+#include <rdma/opa_addr.h>
 
 #include "chip_registers.h"
 #include "common.h"
@@ -213,13 +215,11 @@ struct hfi1_ctxtdata {
 
 	/* dynamic receive available interrupt timeout */
 	u32 rcvavail_timeout;
-	/*
-	 * number of opens (including slave sub-contexts) on this instance
-	 * (ignoring forks, dup, etc. for now)
-	 */
-	int cnt;
+	/* Reference count the base context usage */
+	struct kref kref;
+
 	/* Device context index */
-	unsigned ctxt;
+	u16 ctxt;
 	/*
 	 * non-zero if ctxt can be shared, and defines the maximum number of
 	 * sub-contexts for this device context.
@@ -245,24 +245,10 @@ struct hfi1_ctxtdata {
 
 	/* lock protecting all Expected TID data */
 	struct mutex exp_lock;
-	/* number of pio bufs for this ctxt (all procs, if shared) */
-	u32 piocnt;
-	/* first pio buffer for this ctxt */
-	u32 pio_base;
-	/* chip offset of PIO buffers for this ctxt */
-	u32 piobufs;
 	/* per-context configuration flags */
 	unsigned long flags;
 	/* per-context event flags for fileops/intr communication */
 	unsigned long event_flags;
-	/* WAIT_RCV that timed out, no interrupt */
-	u32 rcvwait_to;
-	/* WAIT_PIO that timed out, no interrupt */
-	u32 piowait_to;
-	/* WAIT_RCV already happened, no wait */
-	u32 rcvnowait;
-	/* WAIT_PIO already happened, no wait */
-	u32 pionowait;
 	/* total number of polled urgent packets */
 	u32 urgent;
 	/* saved total number of polled urgent packets for poll edge trigger */
@@ -289,10 +275,8 @@ struct hfi1_ctxtdata {
 	u16 poll_type;
 	/* receive packet sequence counter */
 	u8 seq_cnt;
-	u8 redirect_seq_cnt;
 	/* ctxt rcvhdrq head offset */
 	u32 head;
-	u32 pkt_count;
 	/* QPs waiting for context processing */
 	struct list_head qp_wait_list;
 	/* interrupt handling */
@@ -301,15 +285,6 @@ struct hfi1_ctxtdata {
 	unsigned numa_id; /* numa node of this context */
 	/* verbs stats per CTX */
 	struct hfi1_opcode_stats_perctx *opstats;
-	/*
-	 * This is the kernel thread that will keep making
-	 * progress on the user sdma requests behind the scenes.
-	 * There is one per context (shared contexts use the master's).
-	 */
-	struct task_struct *progress;
-	struct list_head sdma_queues;
-	/* protect sdma queues */
-	spinlock_t sdma_qlock;
 
 	/* Is ASPM interrupt supported for this context */
 	bool aspm_intr_supported;
@@ -352,23 +327,150 @@ struct hfi1_ctxtdata {
 struct hfi1_packet {
 	void *ebuf;
 	void *hdr;
+	void *payload;
 	struct hfi1_ctxtdata *rcd;
 	__le32 *rhf_addr;
 	struct rvt_qp *qp;
 	struct ib_other_headers *ohdr;
+	struct ib_grh *grh;
 	u64 rhf;
 	u32 maxcnt;
 	u32 rhqoff;
+	u32 dlid;
+	u32 slid;
 	u16 tlen;
 	s16 etail;
 	u8 hlen;
 	u8 numpkt;
 	u8 rsize;
 	u8 updegr;
-	u8 rcv_flags;
 	u8 etype;
+	u8 extra_byte;
+	u8 pad;
+	u8 sc;
+	u8 sl;
+	u8 opcode;
+	bool becn;
+	bool fecn;
 };
 
+/* Packet types */
+#define HFI1_PKT_TYPE_9B  0
+#define HFI1_PKT_TYPE_16B 1
+
+/*
+ * OPA 16B Header
+ */
+#define OPA_16B_L4_MASK		0xFFull
+#define OPA_16B_SC_MASK		0x1F00000ull
+#define OPA_16B_SC_SHIFT	20
+#define OPA_16B_LID_MASK	0xFFFFFull
+#define OPA_16B_DLID_MASK	0xF000ull
+#define OPA_16B_DLID_SHIFT	20
+#define OPA_16B_DLID_HIGH_SHIFT	12
+#define OPA_16B_SLID_MASK	0xF00ull
+#define OPA_16B_SLID_SHIFT	20
+#define OPA_16B_SLID_HIGH_SHIFT	8
+#define OPA_16B_BECN_MASK       0x80000000ull
+#define OPA_16B_BECN_SHIFT      31
+#define OPA_16B_FECN_MASK       0x10000000ull
+#define OPA_16B_FECN_SHIFT      28
+#define OPA_16B_L2_MASK		0x60000000ull
+#define OPA_16B_L2_SHIFT	29
+#define OPA_16B_PKEY_MASK	0xFFFF0000ull
+#define OPA_16B_PKEY_SHIFT	16
+#define OPA_16B_LEN_MASK	0x7FF00000ull
+#define OPA_16B_LEN_SHIFT	20
+#define OPA_16B_RC_MASK		0xE000000ull
+#define OPA_16B_RC_SHIFT	25
+#define OPA_16B_AGE_MASK	0xFF0000ull
+#define OPA_16B_AGE_SHIFT	16
+#define OPA_16B_ENTROPY_MASK	0xFFFFull
+
+/*
+ * OPA 16B L2/L4 Encodings
+ */
+#define OPA_16B_L2_TYPE		0x02
+#define OPA_16B_L4_IB_LOCAL	0x09
+#define OPA_16B_L4_IB_GLOBAL	0x0A
+#define OPA_16B_L4_ETHR		OPA_VNIC_L4_ETHR
+
+static inline u8 hfi1_16B_get_l4(struct hfi1_16b_header *hdr)
+{
+	return (u8)(hdr->lrh[2] & OPA_16B_L4_MASK);
+}
+
+static inline u8 hfi1_16B_get_sc(struct hfi1_16b_header *hdr)
+{
+	return (u8)((hdr->lrh[1] & OPA_16B_SC_MASK) >> OPA_16B_SC_SHIFT);
+}
+
+static inline u32 hfi1_16B_get_dlid(struct hfi1_16b_header *hdr)
+{
+	return (u32)((hdr->lrh[1] & OPA_16B_LID_MASK) |
+		     (((hdr->lrh[2] & OPA_16B_DLID_MASK) >>
+		     OPA_16B_DLID_HIGH_SHIFT) << OPA_16B_DLID_SHIFT));
+}
+
+static inline u32 hfi1_16B_get_slid(struct hfi1_16b_header *hdr)
+{
+	return (u32)((hdr->lrh[0] & OPA_16B_LID_MASK) |
+		     (((hdr->lrh[2] & OPA_16B_SLID_MASK) >>
+		     OPA_16B_SLID_HIGH_SHIFT) << OPA_16B_SLID_SHIFT));
+}
+
+static inline u8 hfi1_16B_get_becn(struct hfi1_16b_header *hdr)
+{
+	return (u8)((hdr->lrh[0] & OPA_16B_BECN_MASK) >> OPA_16B_BECN_SHIFT);
+}
+
+static inline u8 hfi1_16B_get_fecn(struct hfi1_16b_header *hdr)
+{
+	return (u8)((hdr->lrh[1] & OPA_16B_FECN_MASK) >> OPA_16B_FECN_SHIFT);
+}
+
+static inline u8 hfi1_16B_get_l2(struct hfi1_16b_header *hdr)
+{
+	return (u8)((hdr->lrh[1] & OPA_16B_L2_MASK) >> OPA_16B_L2_SHIFT);
+}
+
+static inline u16 hfi1_16B_get_pkey(struct hfi1_16b_header *hdr)
+{
+	return (u16)((hdr->lrh[2] & OPA_16B_PKEY_MASK) >> OPA_16B_PKEY_SHIFT);
+}
+
+static inline u8 hfi1_16B_get_rc(struct hfi1_16b_header *hdr)
+{
+	return (u8)((hdr->lrh[1] & OPA_16B_RC_MASK) >> OPA_16B_RC_SHIFT);
+}
+
+static inline u8 hfi1_16B_get_age(struct hfi1_16b_header *hdr)
+{
+	return (u8)((hdr->lrh[3] & OPA_16B_AGE_MASK) >> OPA_16B_AGE_SHIFT);
+}
+
+static inline u16 hfi1_16B_get_len(struct hfi1_16b_header *hdr)
+{
+	return (u16)((hdr->lrh[0] & OPA_16B_LEN_MASK) >> OPA_16B_LEN_SHIFT);
+}
+
+static inline u16 hfi1_16B_get_entropy(struct hfi1_16b_header *hdr)
+{
+	return (u16)(hdr->lrh[3] & OPA_16B_ENTROPY_MASK);
+}
+
+#define OPA_16B_MAKE_QW(low_dw, high_dw) (((u64)(high_dw) << 32) | (low_dw))
+
+/*
+ * BTH
+ */
+#define OPA_16B_BTH_PAD_MASK	7
+static inline u8 hfi1_16B_bth_get_pad(struct ib_other_headers *ohdr)
+{
+	return (u8)((be32_to_cpu(ohdr->bth[0]) >> IB_BTH_PAD_SHIFT) &
+		   OPA_16B_BTH_PAD_MASK);
+}
+
 struct rvt_sge_state;
 
 /*
@@ -512,7 +614,7 @@ static inline void incr_cntr32(u32 *cntr)
 #define MAX_NAME_SIZE 64
 struct hfi1_msix_entry {
 	enum irq_type type;
-	struct msix_entry msix;
+	int irq;
 	void *arg;
 	char name[MAX_NAME_SIZE];
 	cpumask_t mask;
@@ -575,6 +677,9 @@ struct hfi1_pportdata {
 	u8  default_atten;
 	u8  max_power_class;
 
+	/* did we read platform config from scratch registers? */
+	bool config_from_scratch;
+
 	/* GUIDs for this interface, in host order, guids[0] is a port guid */
 	u64 guids[HFI1_GUIDS_PER_PORT];
 
@@ -593,6 +698,7 @@ struct hfi1_pportdata {
 	/* SendDMA related entries */
 
 	struct workqueue_struct *hfi1_wq;
+	struct workqueue_struct *link_wq;
 
 	/* move out of interrupt context */
 	struct work_struct link_vc_work;
@@ -607,8 +713,6 @@ struct hfi1_pportdata {
 	struct mutex hls_lock;
 	u32 host_link_state;
 
-	u32 lstate;	/* logical link state */
-
 	/* these are the "32 bit" regs */
 
 	u32 ibmtu; /* The MTU programmed for this unit */
@@ -619,7 +723,7 @@ struct hfi1_pportdata {
 	u32 ibmaxlen;
 	u32 current_egress_rate; /* units [10^6 bits/sec] */
 	/* LID programmed for this instance */
-	u16 lid;
+	u32 lid;
 	/* list of pkeys programmed; 0 if not set */
 	u16 pkeys[MAX_PKEY_VALUES];
 	u16 link_width_supported;
@@ -654,12 +758,12 @@ struct hfi1_pportdata {
 	u8 link_enabled;	/* link enabled? */
 	u8 linkinit_reason;
 	u8 local_tx_rate;	/* rate given to 8051 firmware */
-	u8 last_pstate;		/* info only */
 	u8 qsfp_retry_count;
 
 	/* placeholders for IB MAD packet settings */
 	u8 overrun_threshold;
 	u8 phy_error_threshold;
+	unsigned int is_link_down_queued;
 
 	/* Used to override LED behavior for things like maintenance beaconing*/
 	/*
@@ -756,6 +860,10 @@ struct hfi1_pportdata {
 typedef int (*rhf_rcv_function_ptr)(struct hfi1_packet *packet);
 
 typedef void (*opcode_handler)(struct hfi1_packet *packet);
+typedef void (*hfi1_make_req)(struct rvt_qp *qp,
+			      struct hfi1_pkt_state *ps,
+			      struct rvt_swqe *wqe);
+
 
 /* return values for the RHF receive functions */
 #define RHF_RCV_CONTINUE  0	/* keep going */
@@ -860,12 +968,15 @@ struct hfi1_devdata {
 	struct device *diag_device;
 	struct device *ui_device;
 
-	/* mem-mapped pointer to base of chip regs */
-	u8 __iomem *kregbase;
-	/* end of mem-mapped chip space excluding sendbuf and user regs */
-	u8 __iomem *kregend;
-	/* physical address of chip for io_remap, etc. */
+	/* first mapping up to RcvArray */
+	u8 __iomem *kregbase1;
 	resource_size_t physaddr;
+
+	/* second uncached mapping from RcvArray to pio send buffers */
+	u8 __iomem *kregbase2;
+	/* for detecting offset above kregbase2 address */
+	u32 base2_start;
+
 	/* Per VL data. Enough for all VLs but not all elements are set/used. */
 	struct per_vl_data vld[PER_VL_SEND_CONTEXTS];
 	/* send context data */
@@ -953,8 +1064,7 @@ struct hfi1_devdata {
 	u64 __iomem *egrtidbase;
 	spinlock_t sendctrl_lock; /* protect changes to SendCtrl */
 	spinlock_t rcvctrl_lock; /* protect changes to RcvCtrl */
-	/* around rcd and (user ctxts) ctxt_cnt use (intr vs free) */
-	spinlock_t uctxt_lock; /* rcd and user context changes */
+	spinlock_t uctxt_lock; /* protect rcd changes */
 	struct mutex dc8051_lock; /* exclusive access to 8051 */
 	struct workqueue_struct *update_cntr_wq;
 	struct work_struct update_cntr_work;
@@ -1229,9 +1339,10 @@ static inline bool hfi1_vnic_is_rsm_full(struct hfi1_devdata *dd, int spare)
 #define dc8051_ver_patch(a) ((a) & 0x0000ff)
 
 /* f_put_tid types */
-#define PT_EXPECTED 0
-#define PT_EAGER    1
-#define PT_INVALID  2
+#define PT_EXPECTED       0
+#define PT_EAGER          1
+#define PT_INVALID_FLUSH  2
+#define PT_INVALID        3
 
 struct tid_rb_node;
 struct mmu_rb_node;
@@ -1276,13 +1387,16 @@ void handle_user_interrupt(struct hfi1_ctxtdata *rcd);
 
 int hfi1_create_rcvhdrq(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd);
 int hfi1_setup_eagerbufs(struct hfi1_ctxtdata *rcd);
-int hfi1_create_ctxts(struct hfi1_devdata *dd);
-struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u32 ctxt,
-					   int numa);
+int hfi1_create_kctxts(struct hfi1_devdata *dd);
+int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
+			 struct hfi1_ctxtdata **rcd);
+void hfi1_free_ctxt(struct hfi1_ctxtdata *rcd);
 void hfi1_init_pportdata(struct pci_dev *pdev, struct hfi1_pportdata *ppd,
 			 struct hfi1_devdata *dd, u8 hw_pidx, u8 port);
 void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd);
-
+int hfi1_rcd_put(struct hfi1_ctxtdata *rcd);
+void hfi1_rcd_get(struct hfi1_ctxtdata *rcd);
+struct hfi1_ctxtdata *hfi1_rcd_get_by_index(struct hfi1_devdata *dd, u16 ctxt);
 int handle_receive_interrupt(struct hfi1_ctxtdata *rcd, int thread);
 int handle_receive_interrupt_nodma_rtail(struct hfi1_ctxtdata *rcd, int thread);
 int handle_receive_interrupt_dma_rtail(struct hfi1_ctxtdata *rcd, int thread);
@@ -1292,6 +1406,13 @@ void hfi1_set_vnic_msix_info(struct hfi1_ctxtdata *rcd);
 void hfi1_reset_vnic_msix_info(struct hfi1_ctxtdata *rcd);
 
 extern const struct pci_device_id hfi1_pci_tbl[];
+void hfi1_make_ud_req_9B(struct rvt_qp *qp,
+			 struct hfi1_pkt_state *ps,
+			 struct rvt_swqe *wqe);
+
+void hfi1_make_ud_req_16B(struct rvt_qp *qp,
+			  struct hfi1_pkt_state *ps,
+			  struct rvt_swqe *wqe);
 
 /* receive packet handler dispositions */
 #define RCV_PKT_OK      0x0 /* keep going */
@@ -1306,21 +1427,6 @@ static inline __le32 *get_rhf_addr(struct hfi1_ctxtdata *rcd)
 
 int hfi1_reset_device(int);
 
-/* return the driver's idea of the logical OPA port state */
-static inline u32 driver_lstate(struct hfi1_pportdata *ppd)
-{
-	/*
-	 * The driver does some processing from the time the logical
-	 * link state is at INIT to the time the SM can be notified
-	 * as such. Return IB_PORT_DOWN until the software state
-	 * is ready.
-	 */
-	if (ppd->lstate == IB_PORT_INIT && !(ppd->host_link_state & HLS_UP))
-		return IB_PORT_DOWN;
-	else
-		return ppd->lstate;
-}
-
 void receive_interrupt_work(struct work_struct *work);
 
 /* extract service channel from header and rhf */
@@ -1413,13 +1519,25 @@ static inline u32 egress_cycles(u32 len, u32 rate)
 }
 
 void set_link_ipg(struct hfi1_pportdata *ppd);
-void process_becn(struct hfi1_pportdata *ppd, u8 sl,  u16 rlid, u32 lqpn,
+void process_becn(struct hfi1_pportdata *ppd, u8 sl, u32 rlid, u32 lqpn,
 		  u32 rqpn, u8 svc_type);
 void return_cnp(struct hfi1_ibport *ibp, struct rvt_qp *qp, u32 remote_qpn,
 		u32 pkey, u32 slid, u32 dlid, u8 sc5,
 		const struct ib_grh *old_grh);
+void return_cnp_16B(struct hfi1_ibport *ibp, struct rvt_qp *qp,
+		    u32 remote_qpn, u32 pkey, u32 slid, u32 dlid,
+		    u8 sc5, const struct ib_grh *old_grh);
+typedef void (*hfi1_handle_cnp)(struct hfi1_ibport *ibp, struct rvt_qp *qp,
+				u32 remote_qpn, u32 pkey, u32 slid, u32 dlid,
+				u8 sc5, const struct ib_grh *old_grh);
+
+/* We support only two types - 9B and 16B for now */
+static const hfi1_handle_cnp hfi1_handle_cnp_tbl[2] = {
+	[HFI1_PKT_TYPE_9B] = &return_cnp,
+	[HFI1_PKT_TYPE_16B] = &return_cnp_16B
+};
 #define PKEY_CHECK_INVALID -1
-int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
+int egress_pkey_check(struct hfi1_pportdata *ppd, u32 slid, u16 pkey,
 		      u8 sc5, int8_t s_pkey_index);
 
 #define PACKET_EGRESS_TIMEOUT 350
@@ -1522,9 +1640,9 @@ static void ingress_pkey_table_fail(struct hfi1_pportdata *ppd, u16 pkey,
  * by HW and rcv_pkey_check function should be called instead.
  */
 static inline int ingress_pkey_check(struct hfi1_pportdata *ppd, u16 pkey,
-				     u8 sc5, u8 idx, u16 slid)
+				     u8 sc5, u8 idx, u32 slid, bool force)
 {
-	if (!(ppd->part_enforce & HFI1_PART_ENFORCE_IN))
+	if (!(force) && !(ppd->part_enforce & HFI1_PART_ENFORCE_IN))
 		return 0;
 
 	/* If SC15, pkey[0:14] must be 0x7fff */
@@ -1658,12 +1776,22 @@ static inline bool process_ecn(struct rvt_qp *qp, struct hfi1_packet *pkt,
 			       bool do_cnp)
 {
 	struct ib_other_headers *ohdr = pkt->ohdr;
-	u32 bth1;
 
-	bth1 = be32_to_cpu(ohdr->bth[1]);
-	if (unlikely(bth1 & (IB_BECN_SMASK | IB_FECN_SMASK))) {
+	u32 bth1;
+	bool becn = false;
+	bool fecn = false;
+
+	if (pkt->etype == RHF_RCV_TYPE_BYPASS) {
+		fecn = hfi1_16B_get_fecn(pkt->hdr);
+		becn = hfi1_16B_get_becn(pkt->hdr);
+	} else {
+		bth1 = be32_to_cpu(ohdr->bth[1]);
+		fecn = bth1 & IB_FECN_SMASK;
+		becn = bth1 & IB_BECN_SMASK;
+	}
+	if (unlikely(fecn || becn)) {
 		hfi1_process_ecn_slowpath(qp, pkt, do_cnp);
-		return !!(bth1 & IB_FECN_SMASK);
+		return fecn;
 	}
 	return false;
 }
@@ -1829,10 +1957,9 @@ void hfi1_pcie_cleanup(struct pci_dev *pdev);
 int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev);
 void hfi1_pcie_ddcleanup(struct hfi1_devdata *);
 int pcie_speeds(struct hfi1_devdata *dd);
-void request_msix(struct hfi1_devdata *dd, u32 *nent,
-		  struct hfi1_msix_entry *entry);
-void hfi1_enable_intx(struct pci_dev *pdev);
-void restore_pci_variables(struct hfi1_devdata *dd);
+int request_msix(struct hfi1_devdata *dd, u32 msireq);
+int restore_pci_variables(struct hfi1_devdata *dd);
+int save_pci_variables(struct hfi1_devdata *dd);
 int do_pcie_gen3_transition(struct hfi1_devdata *dd);
 int parse_platform_config(struct hfi1_devdata *dd);
 int get_platform_config_field(struct hfi1_devdata *dd,
@@ -1860,6 +1987,7 @@ int process_receive_error(struct hfi1_packet *packet);
 int kdeth_process_expected(struct hfi1_packet *packet);
 int kdeth_process_eager(struct hfi1_packet *packet);
 int process_receive_invalid(struct hfi1_packet *packet);
+void seqfile_dump_rcd(struct seq_file *s, struct hfi1_ctxtdata *rcd);
 
 /* global module parameter variables */
 extern unsigned int hfi1_max_mtu;
@@ -1991,9 +2119,15 @@ static inline u64 hfi1_pkt_base_sdma_integrity(struct hfi1_devdata *dd)
 #define dd_dev_emerg(dd, fmt, ...) \
 	dev_emerg(&(dd)->pcidev->dev, "%s: " fmt, \
 		  get_unit_name((dd)->unit), ##__VA_ARGS__)
+
 #define dd_dev_err(dd, fmt, ...) \
 	dev_err(&(dd)->pcidev->dev, "%s: " fmt, \
 			get_unit_name((dd)->unit), ##__VA_ARGS__)
+
+#define dd_dev_err_ratelimited(dd, fmt, ...) \
+	dev_err_ratelimited(&(dd)->pcidev->dev, "%s: " fmt, \
+			get_unit_name((dd)->unit), ##__VA_ARGS__)
+
 #define dd_dev_warn(dd, fmt, ...) \
 	dev_warn(&(dd)->pcidev->dev, "%s: " fmt, \
 			get_unit_name((dd)->unit), ##__VA_ARGS__)
@@ -2087,52 +2221,220 @@ int hfi1_tempsense_rd(struct hfi1_devdata *dd, struct hfi1_temp *temp);
 #define DD_DEV_ENTRY(dd)       __string(dev, dev_name(&(dd)->pcidev->dev))
 #define DD_DEV_ASSIGN(dd)      __assign_str(dev, dev_name(&(dd)->pcidev->dev))
 
-#define packettype_name(etype) { RHF_RCV_TYPE_##etype, #etype }
-#define show_packettype(etype)                  \
-__print_symbolic(etype,                         \
-	packettype_name(EXPECTED),              \
-	packettype_name(EAGER),                 \
-	packettype_name(IB),                    \
-	packettype_name(ERROR),                 \
-	packettype_name(BYPASS))
+static inline void hfi1_update_ah_attr(struct ib_device *ibdev,
+				       struct rdma_ah_attr *attr)
+{
+	struct hfi1_pportdata *ppd;
+	struct hfi1_ibport *ibp;
+	u32 dlid = rdma_ah_get_dlid(attr);
 
-#define ib_opcode_name(opcode) { IB_OPCODE_##opcode, #opcode  }
-#define show_ib_opcode(opcode)                             \
-__print_symbolic(opcode,                                   \
-	ib_opcode_name(RC_SEND_FIRST),                     \
-	ib_opcode_name(RC_SEND_MIDDLE),                    \
-	ib_opcode_name(RC_SEND_LAST),                      \
-	ib_opcode_name(RC_SEND_LAST_WITH_IMMEDIATE),       \
-	ib_opcode_name(RC_SEND_ONLY),                      \
-	ib_opcode_name(RC_SEND_ONLY_WITH_IMMEDIATE),       \
-	ib_opcode_name(RC_RDMA_WRITE_FIRST),               \
-	ib_opcode_name(RC_RDMA_WRITE_MIDDLE),              \
-	ib_opcode_name(RC_RDMA_WRITE_LAST),                \
-	ib_opcode_name(RC_RDMA_WRITE_LAST_WITH_IMMEDIATE), \
-	ib_opcode_name(RC_RDMA_WRITE_ONLY),                \
-	ib_opcode_name(RC_RDMA_WRITE_ONLY_WITH_IMMEDIATE), \
-	ib_opcode_name(RC_RDMA_READ_REQUEST),              \
-	ib_opcode_name(RC_RDMA_READ_RESPONSE_FIRST),       \
-	ib_opcode_name(RC_RDMA_READ_RESPONSE_MIDDLE),      \
-	ib_opcode_name(RC_RDMA_READ_RESPONSE_LAST),        \
-	ib_opcode_name(RC_RDMA_READ_RESPONSE_ONLY),        \
-	ib_opcode_name(RC_ACKNOWLEDGE),                    \
-	ib_opcode_name(RC_ATOMIC_ACKNOWLEDGE),             \
-	ib_opcode_name(RC_COMPARE_SWAP),                   \
-	ib_opcode_name(RC_FETCH_ADD),                      \
-	ib_opcode_name(UC_SEND_FIRST),                     \
-	ib_opcode_name(UC_SEND_MIDDLE),                    \
-	ib_opcode_name(UC_SEND_LAST),                      \
-	ib_opcode_name(UC_SEND_LAST_WITH_IMMEDIATE),       \
-	ib_opcode_name(UC_SEND_ONLY),                      \
-	ib_opcode_name(UC_SEND_ONLY_WITH_IMMEDIATE),       \
-	ib_opcode_name(UC_RDMA_WRITE_FIRST),               \
-	ib_opcode_name(UC_RDMA_WRITE_MIDDLE),              \
-	ib_opcode_name(UC_RDMA_WRITE_LAST),                \
-	ib_opcode_name(UC_RDMA_WRITE_LAST_WITH_IMMEDIATE), \
-	ib_opcode_name(UC_RDMA_WRITE_ONLY),                \
-	ib_opcode_name(UC_RDMA_WRITE_ONLY_WITH_IMMEDIATE), \
-	ib_opcode_name(UD_SEND_ONLY),                      \
-	ib_opcode_name(UD_SEND_ONLY_WITH_IMMEDIATE),       \
-	ib_opcode_name(CNP))
+	/*
+	 * Kernel clients may not have set up GRH information.
+	 * Set that here.
+	 */
+	ibp = to_iport(ibdev, rdma_ah_get_port_num(attr));
+	ppd = ppd_from_ibp(ibp);
+	if ((((dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) ||
+	      (ppd->lid >= be16_to_cpu(IB_MULTICAST_LID_BASE))) &&
+	    (dlid != be32_to_cpu(OPA_LID_PERMISSIVE)) &&
+	    (dlid != be16_to_cpu(IB_LID_PERMISSIVE)) &&
+	    (!(rdma_ah_get_ah_flags(attr) & IB_AH_GRH))) ||
+	    (rdma_ah_get_make_grd(attr))) {
+		rdma_ah_set_ah_flags(attr, IB_AH_GRH);
+		rdma_ah_set_interface_id(attr, OPA_MAKE_ID(dlid));
+		rdma_ah_set_subnet_prefix(attr, ibp->rvp.gid_prefix);
+	}
+}
+
+/*
+ * hfi1_check_mcast - Check if the given lid is
+ * in the OPA multicast range.
+ *
+ * The LID might either reside in ah.dlid or might be
+ * in the GRH of the address handle as DGID if extended
+ * addresses are in use.
+ */
+static inline bool hfi1_check_mcast(u32 lid)
+{
+	return ((lid >= opa_get_mcast_base(OPA_MCAST_NR)) &&
+		(lid != be32_to_cpu(OPA_LID_PERMISSIVE)));
+}
+
+#define opa_get_lid(lid, format)	\
+	__opa_get_lid(lid, OPA_PORT_PACKET_FORMAT_##format)
+
+/* Convert a lid to a specific lid space */
+static inline u32 __opa_get_lid(u32 lid, u8 format)
+{
+	bool is_mcast = hfi1_check_mcast(lid);
+
+	switch (format) {
+	case OPA_PORT_PACKET_FORMAT_8B:
+	case OPA_PORT_PACKET_FORMAT_10B:
+		if (is_mcast)
+			return (lid - opa_get_mcast_base(OPA_MCAST_NR) +
+				0xF0000);
+		return lid & 0xFFFFF;
+	case OPA_PORT_PACKET_FORMAT_16B:
+		if (is_mcast)
+			return (lid - opa_get_mcast_base(OPA_MCAST_NR) +
+				0xF00000);
+		return lid & 0xFFFFFF;
+	case OPA_PORT_PACKET_FORMAT_9B:
+		if (is_mcast)
+			return (lid -
+				opa_get_mcast_base(OPA_MCAST_NR) +
+				be16_to_cpu(IB_MULTICAST_LID_BASE));
+		else
+			return lid & 0xFFFF;
+	default:
+		return lid;
+	}
+}
+
+/* Return true if the given lid is in the OPA 16B multicast range */
+static inline bool hfi1_is_16B_mcast(u32 lid)
+{
+	return ((lid >=
+		opa_get_lid(opa_get_mcast_base(OPA_MCAST_NR), 16B)) &&
+		(lid != opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B)));
+}
+
+static inline void hfi1_make_opa_lid(struct rdma_ah_attr *attr)
+{
+	const struct ib_global_route *grh = rdma_ah_read_grh(attr);
+	u32 dlid = rdma_ah_get_dlid(attr);
+
+	/* Modify ah_attr.dlid to be in the 32 bit LID space.
+	 * This is how the address will be laid out:
+	 * Assuming MCAST_NR to be 4,
+	 * 32 bit permissive LID = 0xFFFFFFFF
+	 * Multicast LID range = 0xFFFFFFFE to 0xF0000000
+	 * Unicast LID range = 0xEFFFFFFF to 1
+	 * Invalid LID = 0
+	 */
+	if (ib_is_opa_gid(&grh->dgid))
+		dlid = opa_get_lid_from_gid(&grh->dgid);
+	else if ((dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
+		 (dlid != be16_to_cpu(IB_LID_PERMISSIVE)) &&
+		 (dlid != be32_to_cpu(OPA_LID_PERMISSIVE)))
+		dlid = dlid - be16_to_cpu(IB_MULTICAST_LID_BASE) +
+			opa_get_mcast_base(OPA_MCAST_NR);
+	else if (dlid == be16_to_cpu(IB_LID_PERMISSIVE))
+		dlid = be32_to_cpu(OPA_LID_PERMISSIVE);
+
+	rdma_ah_set_dlid(attr, dlid);
+}
+
+static inline u8 hfi1_get_packet_type(u32 lid)
+{
+	/* 9B if lid >= 0xF0000000 */
+	if (lid >= opa_get_mcast_base(OPA_MCAST_NR))
+		return HFI1_PKT_TYPE_9B;
+
+	/* 16B if lid >= 0xC000 */
+	if (lid >= opa_get_lid(opa_get_mcast_base(OPA_MCAST_NR), 9B))
+		return HFI1_PKT_TYPE_16B;
+
+	return HFI1_PKT_TYPE_9B;
+}
+
+static inline bool hfi1_get_hdr_type(u32 lid, struct rdma_ah_attr *attr)
+{
+	/*
+	 * If there was an incoming 16B packet with permissive
+	 * LIDs, OPA GIDs would have been programmed when those
+	 * packets were received. A 16B packet will have to
+	 * be sent in response to that packet. Return a 16B
+	 * header type if that's the case.
+	 */
+	if (rdma_ah_get_dlid(attr) == be32_to_cpu(OPA_LID_PERMISSIVE))
+		return (ib_is_opa_gid(&rdma_ah_read_grh(attr)->dgid)) ?
+			HFI1_PKT_TYPE_16B : HFI1_PKT_TYPE_9B;
+
+	/*
+	 * Return a 16B header type if either the destination
+	 * or source lid is extended.
+	 */
+	if (hfi1_get_packet_type(rdma_ah_get_dlid(attr)) == HFI1_PKT_TYPE_16B)
+		return HFI1_PKT_TYPE_16B;
+
+	return hfi1_get_packet_type(lid);
+}
+
+static inline void hfi1_make_ext_grh(struct hfi1_packet *packet,
+				     struct ib_grh *grh, u32 slid,
+				     u32 dlid)
+{
+	struct hfi1_ibport *ibp = &packet->rcd->ppd->ibport_data;
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+
+	if (!ibp)
+		return;
+
+	grh->hop_limit = 1;
+	grh->sgid.global.subnet_prefix = ibp->rvp.gid_prefix;
+	if (slid == opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B))
+		grh->sgid.global.interface_id =
+			OPA_MAKE_ID(be32_to_cpu(OPA_LID_PERMISSIVE));
+	else
+		grh->sgid.global.interface_id = OPA_MAKE_ID(slid);
+
+	/*
+	 * Upper layers (like mad) may compare the dgid in the
+	 * wc that is obtained here with the sgid_index in
+	 * the wr. Since sgid_index in wr is always 0 for
+	 * extended lids, set the dgid here to the default
+	 * IB gid.
+	 */
+	grh->dgid.global.subnet_prefix = ibp->rvp.gid_prefix;
+	grh->dgid.global.interface_id =
+		cpu_to_be64(ppd->guids[HFI1_PORT_GUID_INDEX]);
+}
+
+static inline int hfi1_get_16b_padding(u32 hdr_size, u32 payload)
+{
+	return -(hdr_size + payload + (SIZE_OF_CRC << 2) +
+		     SIZE_OF_LT) & 0x7;
+}
+
+static inline void hfi1_make_ib_hdr(struct ib_header *hdr,
+				    u16 lrh0, u16 len,
+				    u16 dlid, u16 slid)
+{
+	hdr->lrh[0] = cpu_to_be16(lrh0);
+	hdr->lrh[1] = cpu_to_be16(dlid);
+	hdr->lrh[2] = cpu_to_be16(len);
+	hdr->lrh[3] = cpu_to_be16(slid);
+}
+
+static inline void hfi1_make_16b_hdr(struct hfi1_16b_header *hdr,
+				     u32 slid, u32 dlid,
+				     u16 len, u16 pkey,
+				     u8 becn, u8 fecn, u8 l4,
+				     u8 sc)
+{
+	u32 lrh0 = 0;
+	u32 lrh1 = 0x40000000;
+	u32 lrh2 = 0;
+	u32 lrh3 = 0;
+
+	lrh0 = (lrh0 & ~OPA_16B_BECN_MASK) | (becn << OPA_16B_BECN_SHIFT);
+	lrh0 = (lrh0 & ~OPA_16B_LEN_MASK) | (len << OPA_16B_LEN_SHIFT);
+	lrh0 = (lrh0 & ~OPA_16B_LID_MASK)  | (slid & OPA_16B_LID_MASK);
+	lrh1 = (lrh1 & ~OPA_16B_FECN_MASK) | (fecn << OPA_16B_FECN_SHIFT);
+	lrh1 = (lrh1 & ~OPA_16B_SC_MASK) | (sc << OPA_16B_SC_SHIFT);
+	lrh1 = (lrh1 & ~OPA_16B_LID_MASK) | (dlid & OPA_16B_LID_MASK);
+	lrh2 = (lrh2 & ~OPA_16B_SLID_MASK) |
+		((slid >> OPA_16B_SLID_SHIFT) << OPA_16B_SLID_HIGH_SHIFT);
+	lrh2 = (lrh2 & ~OPA_16B_DLID_MASK) |
+		((dlid >> OPA_16B_DLID_SHIFT) << OPA_16B_DLID_HIGH_SHIFT);
+	lrh2 = (lrh2 & ~OPA_16B_PKEY_MASK) | (pkey << OPA_16B_PKEY_SHIFT);
+	lrh2 = (lrh2 & ~OPA_16B_L4_MASK) | l4;
+
+	hdr->lrh[0] = lrh0;
+	hdr->lrh[1] = lrh1;
+	hdr->lrh[2] = lrh2;
+	hdr->lrh[3] = lrh3;
+}
 #endif                          /* _HFI1_KERNEL_H */
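
Note on the LID helpers added to hfi.h above: the following is a minimal
userspace sketch of the 32-bit to 9B/16B LID mapping and the packet-type
choice, not the driver code itself. All sketch_* names are hypothetical. The
constants are assumptions taken from the comments in the patch (MCAST_NR of 4,
so the multicast base is 0xF0000000; permissive LID 0xFFFFFFFF; IB multicast
base 0xC000 in CPU byte order), and the 8B/10B formats are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values, read off the comments in the patch (MCAST_NR == 4). */
#define SKETCH_MCAST_BASE    0xF0000000u /* opa_get_mcast_base(OPA_MCAST_NR) */
#define SKETCH_PERMISSIVE    0xFFFFFFFFu /* 32-bit permissive LID */
#define SKETCH_IB_MCAST_BASE 0xC000u     /* IB multicast base, CPU order */

enum sketch_fmt { SKETCH_FMT_9B, SKETCH_FMT_16B };
enum sketch_pkt { SKETCH_PKT_9B, SKETCH_PKT_16B };

/* Mirrors hfi1_check_mcast(): in the multicast range and not permissive. */
static bool sketch_is_mcast(uint32_t lid)
{
	return lid >= SKETCH_MCAST_BASE && lid != SKETCH_PERMISSIVE;
}

/* Mirrors the 9B and 16B cases of __opa_get_lid(). */
static uint32_t sketch_get_lid(uint32_t lid, enum sketch_fmt fmt)
{
	bool mcast = sketch_is_mcast(lid);

	switch (fmt) {
	case SKETCH_FMT_16B:
		if (mcast)
			return lid - SKETCH_MCAST_BASE + 0xF00000u;
		return lid & 0xFFFFFFu;
	case SKETCH_FMT_9B:
		if (mcast)
			return lid - SKETCH_MCAST_BASE + SKETCH_IB_MCAST_BASE;
		return lid & 0xFFFFu;
	}
	return lid;
}

/* Mirrors hfi1_get_packet_type(): only extended unicast LIDs need 16B. */
static enum sketch_pkt sketch_packet_type(uint32_t lid)
{
	if (lid >= SKETCH_MCAST_BASE)
		return SKETCH_PKT_9B;
	if (lid >= sketch_get_lid(SKETCH_MCAST_BASE, SKETCH_FMT_9B))
		return SKETCH_PKT_16B;
	return SKETCH_PKT_9B;
}

/* Small usage example: a few spot checks of the mapping. */
static void sketch_self_check(void)
{
	/* The 32-bit multicast base maps onto the 9B multicast base. */
	assert(sketch_get_lid(SKETCH_MCAST_BASE, SKETCH_FMT_9B) == 0xC000u);
	/* Unicast LIDs below 0xC000 still fit a 9B header. */
	assert(sketch_packet_type(0x0001u) == SKETCH_PKT_9B);
	/* Unicast LIDs at or above 0xC000 need the 16B header. */
	assert(sketch_packet_type(0xC000u) == SKETCH_PKT_16B);
}
```

Under these assumptions, a unicast LID that collides with the 9B multicast
range (0xC000 and up) selects a 16B header, while 32-bit multicast LIDs are
folded back into the 9B multicast range.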
diff --git a/drivers/infiniband/hw/hfi1/init.c b/drivers/infiniband/hw/hfi1/init.c
index 4a11d4d..fba7700 100644
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -67,6 +67,7 @@
 #include "aspm.h"
 #include "affinity.h"
 #include "vnic.h"
+#include "exp_rcv.h"
 
 #undef pr_fmt
 #define pr_fmt(fmt) DRIVER_NAME ": " fmt
@@ -125,85 +126,198 @@ static struct idr hfi1_unit_table;
 u32 hfi1_cpulist_count;
 unsigned long *hfi1_cpulist;
 
-/*
- * Common code for creating the receive context array.
- */
-int hfi1_create_ctxts(struct hfi1_devdata *dd)
+static int hfi1_create_kctxt(struct hfi1_devdata *dd,
+			     struct hfi1_pportdata *ppd)
 {
-	unsigned i;
+	struct hfi1_ctxtdata *rcd;
 	int ret;
 
 	/* Control context has to be always 0 */
 	BUILD_BUG_ON(HFI1_CTRL_CTXT != 0);
 
-	dd->rcd = kzalloc_node(dd->num_rcv_contexts * sizeof(*dd->rcd),
-			       GFP_KERNEL, dd->node);
-	if (!dd->rcd)
-		goto nomem;
-
-	/* create one or more kernel contexts */
-	for (i = 0; i < dd->first_dyn_alloc_ctxt; ++i) {
-		struct hfi1_pportdata *ppd;
-		struct hfi1_ctxtdata *rcd;
-
-		ppd = dd->pport + (i % dd->num_pports);
-
-		/* dd->rcd[i] gets assigned inside the callee */
-		rcd = hfi1_create_ctxtdata(ppd, i, dd->node);
-		if (!rcd) {
-			dd_dev_err(dd,
-				   "Unable to allocate kernel receive context, failing\n");
-			goto nomem;
-		}
-		/*
-		 * Set up the kernel context flags here and now because they
-		 * use default values for all receive side memories.  User
-		 * contexts will be handled as they are created.
-		 */
-		rcd->flags = HFI1_CAP_KGET(MULTI_PKT_EGR) |
-			HFI1_CAP_KGET(NODROP_RHQ_FULL) |
-			HFI1_CAP_KGET(NODROP_EGR_FULL) |
-			HFI1_CAP_KGET(DMA_RTAIL);
-
-		/* Control context must use DMA_RTAIL */
-		if (rcd->ctxt == HFI1_CTRL_CTXT)
-			rcd->flags |= HFI1_CAP_DMA_RTAIL;
-		rcd->seq_cnt = 1;
-
-		rcd->sc = sc_alloc(dd, SC_ACK, rcd->rcvhdrqentsize, dd->node);
-		if (!rcd->sc) {
-			dd_dev_err(dd,
-				   "Unable to allocate kernel send context, failing\n");
-			goto nomem;
-		}
-
-		hfi1_init_ctxt(rcd->sc);
+	ret = hfi1_create_ctxtdata(ppd, dd->node, &rcd);
+	if (ret < 0) {
+		dd_dev_err(dd, "Kernel receive context allocation failed\n");
+		return ret;
 	}
 
 	/*
-	 * Initialize aspm, to be done after gen3 transition and setting up
-	 * contexts and before enabling interrupts
+	 * Set up the kernel context flags here and now because they use
+	 * default values for all receive side memories.  User contexts will
+	 * be handled as they are created.
 	 */
-	aspm_init(dd);
+	rcd->flags = HFI1_CAP_KGET(MULTI_PKT_EGR) |
+		HFI1_CAP_KGET(NODROP_RHQ_FULL) |
+		HFI1_CAP_KGET(NODROP_EGR_FULL) |
+		HFI1_CAP_KGET(DMA_RTAIL);
+
+	/* Control context must use DMA_RTAIL */
+	if (rcd->ctxt == HFI1_CTRL_CTXT)
+		rcd->flags |= HFI1_CAP_DMA_RTAIL;
+	rcd->seq_cnt = 1;
+
+	rcd->sc = sc_alloc(dd, SC_ACK, rcd->rcvhdrqentsize, dd->node);
+	if (!rcd->sc) {
+		dd_dev_err(dd, "Kernel send context allocation failed\n");
+		return -ENOMEM;
+	}
+	hfi1_init_ctxt(rcd->sc);
 
 	return 0;
-nomem:
-	ret = -ENOMEM;
+}
 
-	if (dd->rcd) {
-		for (i = 0; i < dd->num_rcv_contexts; ++i)
-			hfi1_free_ctxtdata(dd, dd->rcd[i]);
+/*
+ * Create the receive context array and one or more kernel contexts
+ */
+int hfi1_create_kctxts(struct hfi1_devdata *dd)
+{
+	u16 i;
+	int ret;
+
+	dd->rcd = kzalloc_node(dd->num_rcv_contexts * sizeof(*dd->rcd),
+			       GFP_KERNEL, dd->node);
+	if (!dd->rcd)
+		return -ENOMEM;
+
+	for (i = 0; i < dd->first_dyn_alloc_ctxt; ++i) {
+		ret = hfi1_create_kctxt(dd, dd->pport);
+		if (ret)
+			goto bail;
 	}
+
+	return 0;
+bail:
+	for (i = 0; dd->rcd && i < dd->first_dyn_alloc_ctxt; ++i)
+		hfi1_free_ctxt(dd->rcd[i]);
+
+	/* All the contexts should be freed, free the array */
 	kfree(dd->rcd);
 	dd->rcd = NULL;
 	return ret;
 }
 
 /*
- * Common code for user and kernel context setup.
+ * Helper routines for the receive context reference count (rcd and uctxt).
  */
-struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u32 ctxt,
-					   int numa)
+static void hfi1_rcd_init(struct hfi1_ctxtdata *rcd)
+{
+	kref_init(&rcd->kref);
+}
+
+/**
+ * hfi1_rcd_free - When reference is zero clean up.
+ * @kref: pointer to an initialized rcd data structure
+ *
+ */
+static void hfi1_rcd_free(struct kref *kref)
+{
+	unsigned long flags;
+	struct hfi1_ctxtdata *rcd =
+		container_of(kref, struct hfi1_ctxtdata, kref);
+
+	hfi1_free_ctxtdata(rcd->dd, rcd);
+
+	spin_lock_irqsave(&rcd->dd->uctxt_lock, flags);
+	rcd->dd->rcd[rcd->ctxt] = NULL;
+	spin_unlock_irqrestore(&rcd->dd->uctxt_lock, flags);
+
+	kfree(rcd);
+}
+
+/**
+ * hfi1_rcd_put - decrement reference for rcd
+ * @rcd: pointer to an initialized rcd data structure
+ *
+ * Use this to put a reference after the init.
+ */
+int hfi1_rcd_put(struct hfi1_ctxtdata *rcd)
+{
+	if (rcd)
+		return kref_put(&rcd->kref, hfi1_rcd_free);
+
+	return 0;
+}
+
+/**
+ * hfi1_rcd_get - increment reference for rcd
+ * @rcd: pointer to an initialized rcd data structure
+ *
+ * Use this to get a reference after the init.
+ */
+void hfi1_rcd_get(struct hfi1_ctxtdata *rcd)
+{
+	kref_get(&rcd->kref);
+}
+
+/**
+ * allocate_rcd_index - allocate an rcd index from the rcd array
+ * @dd: pointer to a valid devdata structure
+ * @rcd: rcd data structure to assign
+ * @index: pointer to index that is allocated
+ *
+ * Find an empty index in the rcd array, and assign the given rcd to it.
+ * If the array is full, return -EBUSY.
+ *
+ */
+static int allocate_rcd_index(struct hfi1_devdata *dd,
+			      struct hfi1_ctxtdata *rcd, u16 *index)
+{
+	unsigned long flags;
+	u16 ctxt;
+
+	spin_lock_irqsave(&dd->uctxt_lock, flags);
+	for (ctxt = 0; ctxt < dd->num_rcv_contexts; ctxt++)
+		if (!dd->rcd[ctxt])
+			break;
+
+	if (ctxt < dd->num_rcv_contexts) {
+		rcd->ctxt = ctxt;
+		dd->rcd[ctxt] = rcd;
+		hfi1_rcd_init(rcd);
+	}
+	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+
+	if (ctxt >= dd->num_rcv_contexts)
+		return -EBUSY;
+
+	*index = ctxt;
+
+	return 0;
+}
+
+/**
+ * hfi1_rcd_get_by_index
+ * @dd: pointer to a valid devdata structure
+ * @ctxt: the index of a possible rcd
+ *
+ * We need to protect access to the rcd array.  If access is needed to
+ * one or more index, get the protecting spinlock and then increment the
+ * kref.
+ *
+ * The caller is responsible for making the _put().
+ *
+ */
+struct hfi1_ctxtdata *hfi1_rcd_get_by_index(struct hfi1_devdata *dd, u16 ctxt)
+{
+	unsigned long flags;
+	struct hfi1_ctxtdata *rcd = NULL;
+
+	spin_lock_irqsave(&dd->uctxt_lock, flags);
+	if (dd->rcd[ctxt]) {
+		rcd = dd->rcd[ctxt];
+		hfi1_rcd_get(rcd);
+	}
+	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
+
+	return rcd;
+}
+
+/*
+ * Common code for user and kernel context create and setup.
+ * NOTE: the initial kref is done here (hfi1_rcd_init()).
+ */
+int hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, int numa,
+			 struct hfi1_ctxtdata **context)
 {
 	struct hfi1_devdata *dd = ppd->dd;
 	struct hfi1_ctxtdata *rcd;
@@ -217,20 +331,30 @@ struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u32 ctxt,
 	rcd = kzalloc_node(sizeof(*rcd), GFP_KERNEL, numa);
 	if (rcd) {
 		u32 rcvtids, max_entries;
+		u16 ctxt;
+		int ret;
 
-		hfi1_cdbg(PROC, "setting up context %u\n", ctxt);
+		ret = allocate_rcd_index(dd, rcd, &ctxt);
+		if (ret) {
+			*context = NULL;
+			kfree(rcd);
+			return ret;
+		}
 
 		INIT_LIST_HEAD(&rcd->qp_wait_list);
+		hfi1_exp_tid_group_init(&rcd->tid_group_list);
+		hfi1_exp_tid_group_init(&rcd->tid_used_list);
+		hfi1_exp_tid_group_init(&rcd->tid_full_list);
 		rcd->ppd = ppd;
 		rcd->dd = dd;
 		__set_bit(0, rcd->in_use_ctxts);
-		rcd->ctxt = ctxt;
-		dd->rcd[ctxt] = rcd;
 		rcd->numa_id = numa;
 		rcd->rcv_array_groups = dd->rcv_entries.ngroups;
 
 		mutex_init(&rcd->exp_lock);
 
+		hfi1_cdbg(PROC, "setting up context %u\n", rcd->ctxt);
+
 		/*
 		 * Calculate the context's RcvArray entry starting point.
 		 * We do this here because we have to take into account all
@@ -328,14 +452,30 @@ struct hfi1_ctxtdata *hfi1_create_ctxtdata(struct hfi1_pportdata *ppd, u32 ctxt,
 			if (!rcd->opstats)
 				goto bail;
 		}
+
+		*context = rcd;
+		return 0;
 	}
-	return rcd;
+
 bail:
-	dd->rcd[ctxt] = NULL;
-	kfree(rcd->egrbufs.rcvtids);
-	kfree(rcd->egrbufs.buffers);
-	kfree(rcd);
-	return NULL;
+	*context = NULL;
+	hfi1_free_ctxt(rcd);
+	return -ENOMEM;
+}
+
+/**
+ * hfi1_free_ctxt
+ * @rcd: pointer to an initialized rcd data structure
+ *
+ * This wrapper is the free function that matches hfi1_create_ctxtdata().
+ * When a context is done being used (kernel or user), this function is called
+ * for the "final" put to match the kref init from hfi1_create_ctxtdata().
+ * Other users of the context do a get/put sequence to make sure that the
+ * structure isn't removed while in use.
+ */
+void hfi1_free_ctxt(struct hfi1_ctxtdata *rcd)
+{
+	hfi1_rcd_put(rcd);
 }
 
 /*
@@ -483,7 +623,6 @@ void hfi1_init_pportdata(struct pci_dev *pdev, struct hfi1_pportdata *ppd,
 
 	ppd->pkeys[default_pkey_idx] = DEFAULT_P_KEY;
 	ppd->part_enforce |= HFI1_PART_ENFORCE_IN;
-	ppd->part_enforce |= HFI1_PART_ENFORCE_OUT;
 
 	if (loopback) {
 		hfi1_early_err(&pdev->dev,
@@ -559,16 +698,19 @@ static int loadtime_init(struct hfi1_devdata *dd)
 static int init_after_reset(struct hfi1_devdata *dd)
 {
 	int i;
-
+	struct hfi1_ctxtdata *rcd;
 	/*
 	 * Ensure chip does no sends or receives, tail updates, or
 	 * pioavail updates while we re-initialize.  This is mostly
 	 * for the driver data structures, not chip registers.
 	 */
-	for (i = 0; i < dd->num_rcv_contexts; i++)
+	for (i = 0; i < dd->num_rcv_contexts; i++) {
+		rcd = hfi1_rcd_get_by_index(dd, i);
 		hfi1_rcvctrl(dd, HFI1_RCVCTRL_CTXT_DIS |
-				  HFI1_RCVCTRL_INTRAVAIL_DIS |
-				  HFI1_RCVCTRL_TAILUPD_DIS, i);
+			     HFI1_RCVCTRL_INTRAVAIL_DIS |
+			     HFI1_RCVCTRL_TAILUPD_DIS, rcd);
+		hfi1_rcd_put(rcd);
+	}
 	pio_send_control(dd, PSC_GLOBAL_DISABLE);
 	for (i = 0; i < dd->num_send_contexts; i++)
 		sc_disable(dd->send_contexts[i].sc);
@@ -578,8 +720,9 @@ static int init_after_reset(struct hfi1_devdata *dd)
 
 static void enable_chip(struct hfi1_devdata *dd)
 {
+	struct hfi1_ctxtdata *rcd;
 	u32 rcvmask;
-	u32 i;
+	u16 i;
 
 	/* enable PIO send */
 	pio_send_control(dd, PSC_GLOBAL_ENABLE);
@@ -589,17 +732,21 @@ static void enable_chip(struct hfi1_devdata *dd)
 	 * Other ctxts done as user opens and initializes them.
 	 */
 	for (i = 0; i < dd->first_dyn_alloc_ctxt; ++i) {
+		rcd = hfi1_rcd_get_by_index(dd, i);
+		if (!rcd)
+			continue;
 		rcvmask = HFI1_RCVCTRL_CTXT_ENB | HFI1_RCVCTRL_INTRAVAIL_ENB;
-		rcvmask |= HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, DMA_RTAIL) ?
+		rcvmask |= HFI1_CAP_KGET_MASK(rcd->flags, DMA_RTAIL) ?
 			HFI1_RCVCTRL_TAILUPD_ENB : HFI1_RCVCTRL_TAILUPD_DIS;
-		if (!HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, MULTI_PKT_EGR))
+		if (!HFI1_CAP_KGET_MASK(rcd->flags, MULTI_PKT_EGR))
 			rcvmask |= HFI1_RCVCTRL_ONE_PKT_EGR_ENB;
-		if (HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, NODROP_RHQ_FULL))
+		if (HFI1_CAP_KGET_MASK(rcd->flags, NODROP_RHQ_FULL))
 			rcvmask |= HFI1_RCVCTRL_NO_RHQ_DROP_ENB;
-		if (HFI1_CAP_KGET_MASK(dd->rcd[i]->flags, NODROP_EGR_FULL))
+		if (HFI1_CAP_KGET_MASK(rcd->flags, NODROP_EGR_FULL))
 			rcvmask |= HFI1_RCVCTRL_NO_EGR_DROP_ENB;
-		hfi1_rcvctrl(dd, rcvmask, i);
-		sc_enable(dd->rcd[i]->sc);
+		hfi1_rcvctrl(dd, rcvmask, rcd);
+		sc_enable(rcd->sc);
+		hfi1_rcd_put(rcd);
 	}
 }
 
@@ -624,6 +771,20 @@ static int create_workqueues(struct hfi1_devdata *dd)
 			if (!ppd->hfi1_wq)
 				goto wq_error;
 		}
+		if (!ppd->link_wq) {
+			/*
+			 * Make the link workqueue single-threaded to enforce
+			 * serialization.
+			 */
+			ppd->link_wq =
+				alloc_workqueue(
+				    "hfi_link_%d_%d",
+				    WQ_SYSFS | WQ_MEM_RECLAIM | WQ_UNBOUND,
+				    1, /* max_active */
+				    dd->unit, pidx);
+			if (!ppd->link_wq)
+				goto wq_error;
+		}
 	}
 	return 0;
 wq_error:
@@ -634,6 +795,10 @@ static int create_workqueues(struct hfi1_devdata *dd)
 			destroy_workqueue(ppd->hfi1_wq);
 			ppd->hfi1_wq = NULL;
 		}
+		if (ppd->link_wq) {
+			destroy_workqueue(ppd->link_wq);
+			ppd->link_wq = NULL;
+		}
 	}
 	return -ENOMEM;
 }
@@ -656,7 +821,8 @@ static int create_workqueues(struct hfi1_devdata *dd)
 int hfi1_init(struct hfi1_devdata *dd, int reinit)
 {
 	int ret = 0, pidx, lastfail = 0;
-	unsigned i, len;
+	unsigned long len;
+	u16 i;
 	struct hfi1_ctxtdata *rcd;
 	struct hfi1_pportdata *ppd;
 
@@ -725,7 +891,7 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
 		 * existing, and re-allocate.
 		 * Need to re-create rest of ctxt 0 ctxtdata as well.
 		 */
-		rcd = dd->rcd[i];
+		rcd = hfi1_rcd_get_by_index(dd, i);
 		if (!rcd)
 			continue;
 
@@ -739,6 +905,7 @@ int hfi1_init(struct hfi1_devdata *dd, int reinit)
 				   "failed to allocate kernel ctxt's rcvhdrq and/or egr bufs\n");
 			ret = lastfail;
 		}
+		hfi1_rcd_put(rcd);
 	}
 
 	/* Allocate enough memory for user event notification. */
@@ -858,6 +1025,7 @@ static void stop_timers(struct hfi1_devdata *dd)
 static void shutdown_device(struct hfi1_devdata *dd)
 {
 	struct hfi1_pportdata *ppd;
+	struct hfi1_ctxtdata *rcd;
 	unsigned pidx;
 	int i;
 
@@ -876,12 +1044,15 @@ static void shutdown_device(struct hfi1_devdata *dd)
 
 	for (pidx = 0; pidx < dd->num_pports; ++pidx) {
 		ppd = dd->pport + pidx;
-		for (i = 0; i < dd->num_rcv_contexts; i++)
+		for (i = 0; i < dd->num_rcv_contexts; i++) {
+			rcd = hfi1_rcd_get_by_index(dd, i);
 			hfi1_rcvctrl(dd, HFI1_RCVCTRL_TAILUPD_DIS |
-					  HFI1_RCVCTRL_CTXT_DIS |
-					  HFI1_RCVCTRL_INTRAVAIL_DIS |
-					  HFI1_RCVCTRL_PKEY_DIS |
-					  HFI1_RCVCTRL_ONE_PKT_EGR_DIS, i);
+				     HFI1_RCVCTRL_CTXT_DIS |
+				     HFI1_RCVCTRL_INTRAVAIL_DIS |
+				     HFI1_RCVCTRL_PKEY_DIS |
+				     HFI1_RCVCTRL_ONE_PKT_EGR_DIS, rcd);
+			hfi1_rcd_put(rcd);
+		}
 		/*
 		 * Gracefully stop all sends allowing any in progress to
 		 * trickle out first.
@@ -917,6 +1088,10 @@ static void shutdown_device(struct hfi1_devdata *dd)
 			destroy_workqueue(ppd->hfi1_wq);
 			ppd->hfi1_wq = NULL;
 		}
+		if (ppd->link_wq) {
+			destroy_workqueue(ppd->link_wq);
+			ppd->link_wq = NULL;
+		}
 	}
 	sdma_exit(dd);
 }
@@ -927,14 +1102,11 @@ static void shutdown_device(struct hfi1_devdata *dd)
  * @rcd: the ctxtdata structure
  *
  * free up any allocated data for a context
- * This should not touch anything that would affect a simultaneous
- * re-allocation of context data, because it is called after hfi1_mutex
- * is released (and can be called from reinit as well).
  * It should never change any chip state, or global driver state.
  */
 void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
 {
-	unsigned e;
+	u32 e;
 
 	if (!rcd)
 		return;
@@ -953,6 +1125,7 @@ void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
 
 	/* all the RcvArray entries should have been cleared by now */
 	kfree(rcd->egrbufs.rcvtids);
+	rcd->egrbufs.rcvtids = NULL;
 
 	for (e = 0; e < rcd->egrbufs.alloced; e++) {
 		if (rcd->egrbufs.buffers[e].dma)
@@ -962,13 +1135,21 @@ void hfi1_free_ctxtdata(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd)
 					  rcd->egrbufs.buffers[e].dma);
 	}
 	kfree(rcd->egrbufs.buffers);
+	rcd->egrbufs.alloced = 0;
+	rcd->egrbufs.buffers = NULL;
 
 	sc_free(rcd->sc);
+	rcd->sc = NULL;
+
 	vfree(rcd->subctxt_uregbase);
 	vfree(rcd->subctxt_rcvegrbuf);
 	vfree(rcd->subctxt_rcvhdr_base);
 	kfree(rcd->opstats);
-	kfree(rcd);
+
+	rcd->subctxt_uregbase = NULL;
+	rcd->subctxt_rcvegrbuf = NULL;
+	rcd->subctxt_rcvhdr_base = NULL;
+	rcd->opstats = NULL;
 }
 
 /*
@@ -1311,8 +1492,6 @@ static void cleanup_device_data(struct hfi1_devdata *dd)
 {
 	int ctxt;
 	int pidx;
-	struct hfi1_ctxtdata **tmp;
-	unsigned long flags;
 
 	/* users can't do anything more with chip */
 	for (pidx = 0; pidx < dd->num_pports; ++pidx) {
@@ -1337,18 +1516,6 @@ static void cleanup_device_data(struct hfi1_devdata *dd)
 
 	free_credit_return(dd);
 
-	/*
-	 * Free any resources still in use (usually just kernel contexts)
-	 * at unload; we do for ctxtcnt, because that's what we allocate.
-	 * We acquire lock to be really paranoid that rcd isn't being
-	 * accessed from some interrupt-related code (that should not happen,
-	 * but best to be sure).
-	 */
-	spin_lock_irqsave(&dd->uctxt_lock, flags);
-	tmp = dd->rcd;
-	dd->rcd = NULL;
-	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
-
 	if (dd->rcvhdrtail_dummy_kvaddr) {
 		dma_free_coherent(&dd->pcidev->dev, sizeof(u64),
 				  (void *)dd->rcvhdrtail_dummy_kvaddr,
@@ -1356,16 +1523,22 @@ static void cleanup_device_data(struct hfi1_devdata *dd)
 		dd->rcvhdrtail_dummy_kvaddr = NULL;
 	}
 
-	for (ctxt = 0; tmp && ctxt < dd->num_rcv_contexts; ctxt++) {
-		struct hfi1_ctxtdata *rcd = tmp[ctxt];
+	/*
+	 * Free any resources still in use (usually just kernel contexts)
+	 * at unload; we do for ctxtcnt, because that's what we allocate.
+	 */
+	for (ctxt = 0; dd->rcd && ctxt < dd->num_rcv_contexts; ctxt++) {
+		struct hfi1_ctxtdata *rcd = dd->rcd[ctxt];
 
-		tmp[ctxt] = NULL; /* debugging paranoia */
 		if (rcd) {
 			hfi1_clear_tids(rcd);
-			hfi1_free_ctxtdata(dd, rcd);
+			hfi1_free_ctxt(rcd);
 		}
 	}
-	kfree(tmp);
+
+	kfree(dd->rcd);
+	dd->rcd = NULL;
+
 	free_pio_map(dd);
 	/* must follow rcv context free - need to remove rcv's hooks */
 	for (ctxt = 0; ctxt < dd->num_send_contexts; ctxt++)
@@ -1532,6 +1705,10 @@ static int init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 				destroy_workqueue(ppd->hfi1_wq);
 				ppd->hfi1_wq = NULL;
 			}
+			if (ppd->link_wq) {
+				destroy_workqueue(ppd->link_wq);
+				ppd->link_wq = NULL;
+			}
 		}
 		if (!j)
 			hfi1_device_remove(dd);
diff --git a/drivers/infiniband/hw/hfi1/intr.c b/drivers/infiniband/hw/hfi1/intr.c
index 04a5082d..96845df 100644
--- a/drivers/infiniband/hw/hfi1/intr.c
+++ b/drivers/infiniband/hw/hfi1/intr.c
@@ -164,6 +164,7 @@ void handle_linkup_change(struct hfi1_devdata *dd, u32 linkup)
 		ppd->linkup = 0;
 
 		/* clear HW details of the previous connection */
+		ppd->actual_vls_operational = 0;
 		reset_link_credits(dd);
 
 		/* freeze after a link down to guarantee a clean egress */
@@ -196,7 +197,7 @@ void handle_user_interrupt(struct hfi1_ctxtdata *rcd)
 
 	if (test_and_clear_bit(HFI1_CTXT_WAITING_RCV, &rcd->event_flags)) {
 		wake_up_interruptible(&rcd->wait);
-		hfi1_rcvctrl(dd, HFI1_RCVCTRL_INTRAVAIL_DIS, rcd->ctxt);
+		hfi1_rcvctrl(dd, HFI1_RCVCTRL_INTRAVAIL_DIS, rcd);
 	} else if (test_and_clear_bit(HFI1_CTXT_WAITING_URG,
 							&rcd->event_flags)) {
 		rcd->urgent++;
diff --git a/drivers/infiniband/hw/hfi1/iowait.h b/drivers/infiniband/hw/hfi1/iowait.h
index d9740dd..591697d 100644
--- a/drivers/infiniband/hw/hfi1/iowait.h
+++ b/drivers/infiniband/hw/hfi1/iowait.h
@@ -106,7 +106,9 @@ struct iowait {
 		struct sdma_engine *sde,
 		struct iowait *wait,
 		struct sdma_txreq *tx,
-		unsigned seq);
+		uint seq,
+		bool pkts_sent
+		);
 	void (*wakeup)(struct iowait *wait, int reason);
 	void (*sdma_drained)(struct iowait *wait);
 	seqlock_t *lock;
@@ -118,6 +120,7 @@ struct iowait {
 	u32 count;
 	u32 tx_limit;
 	u32 tx_count;
+	u8 starved_cnt;
 };
 
 #define SDMA_AVAIL_REASON 0
@@ -143,7 +146,8 @@ static inline void iowait_init(
 		struct sdma_engine *sde,
 		struct iowait *wait,
 		struct sdma_txreq *tx,
-		unsigned seq),
+		uint seq,
+		bool pkts_sent),
 	void (*wakeup)(struct iowait *wait, int reason),
 	void (*sdma_drained)(struct iowait *wait))
 {
@@ -305,4 +309,66 @@ static inline struct sdma_txreq *iowait_get_txhead(struct iowait *wait)
 	return tx;
 }
 
+/**
+ * iowait_queue - Put the iowait on a wait queue
+ * @pkts_sent: have some packets been sent before queuing?
+ * @w: the iowait struct
+ * @wait_head: the wait queue
+ *
+ * This function is called to insert an iowait struct into a
+ * wait queue after a resource (e.g., sdma descriptor or pio
+ * buffer) has run out.
+ */
+static inline void iowait_queue(bool pkts_sent, struct iowait *w,
+				struct list_head *wait_head)
+{
+	/*
+	 * To play fair, insert the iowait at the tail of the wait queue if it
+	 * has already sent some packets; otherwise, put it at the head.
+	 */
+	if (pkts_sent) {
+		list_add_tail(&w->list, wait_head);
+		w->starved_cnt = 0;
+	} else {
+		list_add(&w->list, wait_head);
+		w->starved_cnt++;
+	}
+}
+
+/**
+ * iowait_starve_clear - clear the wait queue's starve count
+ * @pkts_sent: have some packets been sent?
+ * @w: the iowait struct
+ *
+ * This function is called to clear the starve count. If no
+ * packets have been sent, the starve count will not be cleared.
+ */
+static inline void iowait_starve_clear(bool pkts_sent, struct iowait *w)
+{
+	if (pkts_sent)
+		w->starved_cnt = 0;
+}
+
+/**
+ * iowait_starve_find_max - Find the maximum of the starve count
+ * @w: the iowait struct
+ * @max: a variable containing the max starve count
+ * @idx: the index of the current iowait in an array
+ * @max_idx: a variable containing the array index for the
+ *         iowait entry that has the max starve count
+ *
+ * This function is called to compare the starve count of a
+ * given iowait with the given max starve count. The max starve
+ * count and the index will be updated if the iowait's starve
+ * count is larger.
+ */
+static inline void iowait_starve_find_max(struct iowait *w, u8 *max,
+					  uint idx, uint *max_idx)
+{
+	if (w->starved_cnt > *max) {
+		*max = w->starved_cnt;
+		*max_idx = idx;
+	}
+}
+
 #endif
diff --git a/drivers/infiniband/hw/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
index 5977673..f4c0ffc 100644
--- a/drivers/infiniband/hw/hfi1/mad.c
+++ b/drivers/infiniband/hw/hfi1/mad.c
@@ -46,6 +46,7 @@
  */
 
 #include <linux/net.h>
+#include <rdma/opa_addr.h>
 #define OPA_NUM_PKEY_BLOCKS_PER_SMP (OPA_SMP_DR_DATA_SIZE \
 			/ (OPA_PARTITION_TABLE_BLK_SIZE * sizeof(u16)))
 
@@ -59,6 +60,24 @@
 #define OPA_LINK_WIDTH_RESET_OLD 0x0fff
 #define OPA_LINK_WIDTH_RESET 0xffff
 
+struct trap_node {
+	struct list_head list;
+	struct opa_mad_notice_attr data;
+	__be64 tid;
+	int len;
+	u32 retry;
+	u8 in_use;
+	u8 repress;
+};
+
+static int smp_length_check(u32 data_size, u32 request_len)
+{
+	if (unlikely(request_len < data_size))
+		return -EINVAL;
+
+	return 0;
+}
+
 static int reply(struct ib_mad_hdr *smp)
 {
 	/*
@@ -89,28 +108,222 @@ void hfi1_event_pkey_change(struct hfi1_devdata *dd, u8 port)
 	ib_dispatch_event(&event);
 }
 
-static void send_trap(struct hfi1_ibport *ibp, void *data, unsigned len)
+/*
+ * If the port is down, clean up all pending traps.  We need to be careful
+ * with the given trap, because it may be queued.
+ */
+static void cleanup_traps(struct hfi1_ibport *ibp, struct trap_node *trap)
+{
+	struct trap_node *node, *q;
+	unsigned long flags;
+	struct list_head trap_list;
+	int i;
+
+	for (i = 0; i < RVT_MAX_TRAP_LISTS; i++) {
+		spin_lock_irqsave(&ibp->rvp.lock, flags);
+		list_replace_init(&ibp->rvp.trap_lists[i].list, &trap_list);
+		ibp->rvp.trap_lists[i].list_len = 0;
+		spin_unlock_irqrestore(&ibp->rvp.lock, flags);
+
+		/*
+		 * Remove all items from the list, freeing all the non-given
+		 * traps.
+		 */
+		list_for_each_entry_safe(node, q, &trap_list, list) {
+			list_del(&node->list);
+			if (node != trap)
+				kfree(node);
+		}
+	}
+
+	/*
+	 * If this wasn't on one of the lists it would not be freed.  If it
+	 * was on the list, it is now safe to free.
+	 */
+	kfree(trap);
+}
+
+static struct trap_node *check_and_add_trap(struct hfi1_ibport *ibp,
+					    struct trap_node *trap)
+{
+	struct trap_node *node;
+	struct trap_list *trap_list;
+	unsigned long flags;
+	unsigned long timeout;
+	int found = 0;
+	unsigned int queue_id;
+	static int trap_count;
+
+	queue_id = trap->data.generic_type & 0x0F;
+	if (queue_id >= RVT_MAX_TRAP_LISTS) {
+		trap_count++;
+		pr_err_ratelimited("hfi1: Invalid trap 0x%0x dropped. Total dropped: %d\n",
+				   trap->data.generic_type, trap_count);
+		kfree(trap);
+		return NULL;
+	}
+
+	/*
+	 * Since the retry (handle timeout) does not remove a trap request
+	 * from the list, all we have to do is compare the node.
+	 */
+	spin_lock_irqsave(&ibp->rvp.lock, flags);
+	trap_list = &ibp->rvp.trap_lists[queue_id];
+
+	list_for_each_entry(node, &trap_list->list, list) {
+		if (node == trap) {
+			node->retry++;
+			found = 1;
+			break;
+		}
+	}
+
+	/* If it is not on the list, add it, limited to RVT_MAX_TRAP_LEN. */
+	if (!found) {
+		if (trap_list->list_len < RVT_MAX_TRAP_LEN) {
+			trap_list->list_len++;
+			list_add_tail(&trap->list, &trap_list->list);
+		} else {
+			pr_warn_ratelimited("hfi1: Maximum trap limit reached for 0x%0x traps\n",
+					    trap->data.generic_type);
+			kfree(trap);
+		}
+	}
+
+	/*
+	 * Next check to see if there is a timer pending.  If not, set it up
+	 * and get the first trap from the list.
+	 */
+	node = NULL;
+	if (!timer_pending(&ibp->rvp.trap_timer)) {
+		/*
+		 * o14-2
+		 * If the time out is set we have to wait until it expires
+		 * before the trap can be sent.
+		 * This should be > RVT_TRAP_TIMEOUT
+		 */
+		timeout = (RVT_TRAP_TIMEOUT *
+			   (1UL << ibp->rvp.subnet_timeout)) / 1000;
+		mod_timer(&ibp->rvp.trap_timer,
+			  jiffies + usecs_to_jiffies(timeout));
+		node = list_first_entry(&trap_list->list, struct trap_node,
+					list);
+		node->in_use = 1;
+	}
+	spin_unlock_irqrestore(&ibp->rvp.lock, flags);
+
+	return node;
+}
+
+static void subn_handle_opa_trap_repress(struct hfi1_ibport *ibp,
+					 struct opa_smp *smp)
+{
+	struct trap_list *trap_list;
+	struct trap_node *trap;
+	unsigned long flags;
+	int i;
+
+	if (smp->attr_id != IB_SMP_ATTR_NOTICE)
+		return;
+
+	spin_lock_irqsave(&ibp->rvp.lock, flags);
+	for (i = 0; i < RVT_MAX_TRAP_LISTS; i++) {
+		trap_list = &ibp->rvp.trap_lists[i];
+		trap = list_first_entry_or_null(&trap_list->list,
+						struct trap_node, list);
+		if (trap && trap->tid == smp->tid) {
+			if (trap->in_use) {
+				trap->repress = 1;
+			} else {
+				trap_list->list_len--;
+				list_del(&trap->list);
+				kfree(trap);
+			}
+			break;
+		}
+	}
+	spin_unlock_irqrestore(&ibp->rvp.lock, flags);
+}
+
+static void hfi1_update_sm_ah_attr(struct hfi1_ibport *ibp,
+				   struct rdma_ah_attr *attr, u32 dlid)
+{
+	rdma_ah_set_dlid(attr, dlid);
+	rdma_ah_set_port_num(attr, ppd_from_ibp(ibp)->port);
+	if (dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) {
+		struct ib_global_route *grh = rdma_ah_retrieve_grh(attr);
+
+		rdma_ah_set_ah_flags(attr, IB_AH_GRH);
+		grh->sgid_index = 0;
+		grh->hop_limit = 1;
+		grh->dgid.global.subnet_prefix =
+			ibp->rvp.gid_prefix;
+		grh->dgid.global.interface_id = OPA_MAKE_ID(dlid);
+	}
+}
+
+static int hfi1_modify_qp0_ah(struct hfi1_ibport *ibp,
+			      struct rvt_ah *ah, u32 dlid)
+{
+	struct rdma_ah_attr attr;
+	struct rvt_qp *qp0;
+	int ret = -EINVAL;
+
+	memset(&attr, 0, sizeof(attr));
+	attr.type = ah->ibah.type;
+	hfi1_update_sm_ah_attr(ibp, &attr, dlid);
+	rcu_read_lock();
+	qp0 = rcu_dereference(ibp->rvp.qp[0]);
+	if (qp0)
+		ret = rdma_modify_ah(&ah->ibah, &attr);
+	rcu_read_unlock();
+	return ret;
+}
+
+static struct ib_ah *hfi1_create_qp0_ah(struct hfi1_ibport *ibp, u32 dlid)
+{
+	struct rdma_ah_attr attr;
+	struct ib_ah *ah = ERR_PTR(-EINVAL);
+	struct rvt_qp *qp0;
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	struct hfi1_devdata *dd = dd_from_ppd(ppd);
+	u8 port_num = ppd->port;
+
+	memset(&attr, 0, sizeof(attr));
+	attr.type = rdma_ah_find_type(&dd->verbs_dev.rdi.ibdev, port_num);
+	hfi1_update_sm_ah_attr(ibp, &attr, dlid);
+	rcu_read_lock();
+	qp0 = rcu_dereference(ibp->rvp.qp[0]);
+	if (qp0)
+		ah = rdma_create_ah(qp0->ibqp.pd, &attr);
+	rcu_read_unlock();
+	return ah;
+}
+
+static void send_trap(struct hfi1_ibport *ibp, struct trap_node *trap)
 {
 	struct ib_mad_send_buf *send_buf;
 	struct ib_mad_agent *agent;
 	struct opa_smp *smp;
-	int ret;
 	unsigned long flags;
-	unsigned long timeout;
 	int pkey_idx;
 	u32 qpn = ppd_from_ibp(ibp)->sm_trap_qp;
 
 	agent = ibp->rvp.send_agent;
-	if (!agent)
+	if (!agent) {
+		cleanup_traps(ibp, trap);
 		return;
+	}
 
 	/* o14-3.2.1 */
-	if (ppd_from_ibp(ibp)->lstate != IB_PORT_ACTIVE)
+	if (driver_lstate(ppd_from_ibp(ibp)) != IB_PORT_ACTIVE) {
+		cleanup_traps(ibp, trap);
 		return;
+	}
 
-	/* o14-2 */
-	if (ibp->rvp.trap_timeout && time_before(jiffies,
-						 ibp->rvp.trap_timeout))
+	/* Add the trap to the list if necessary and see if we can send it */
+	trap = check_and_add_trap(ibp, trap);
+	if (!trap)
 		return;
 
 	pkey_idx = hfi1_lookup_pkey_idx(ibp, LIM_MGMT_P_KEY);
@@ -131,11 +344,21 @@ static void send_trap(struct hfi1_ibport *ibp, void *data, unsigned len)
 	smp->mgmt_class = IB_MGMT_CLASS_SUBN_LID_ROUTED;
 	smp->class_version = OPA_SM_CLASS_VERSION;
 	smp->method = IB_MGMT_METHOD_TRAP;
-	ibp->rvp.tid++;
-	smp->tid = cpu_to_be64(ibp->rvp.tid);
+
+	/* Only update the transaction ID for new traps (o13-5). */
+	if (trap->tid == 0) {
+		ibp->rvp.tid++;
+		/* make sure that tid != 0 */
+		if (ibp->rvp.tid == 0)
+			ibp->rvp.tid++;
+		trap->tid = cpu_to_be64(ibp->rvp.tid);
+	}
+	smp->tid = trap->tid;
+
 	smp->attr_id = IB_SMP_ATTR_NOTICE;
 	/* o14-1: smp->mkey = 0; */
-	memcpy(smp->route.lid.data, data, len);
+
+	memcpy(smp->route.lid.data, &trap->data, trap->len);
 
 	spin_lock_irqsave(&ibp->rvp.lock, flags);
 	if (!ibp->rvp.sm_ah) {
@@ -144,65 +367,101 @@ static void send_trap(struct hfi1_ibport *ibp, void *data, unsigned len)
 
 			ah = hfi1_create_qp0_ah(ibp, ibp->rvp.sm_lid);
 			if (IS_ERR(ah)) {
-				ret = PTR_ERR(ah);
-			} else {
-				send_buf->ah = ah;
-				ibp->rvp.sm_ah = ibah_to_rvtah(ah);
-				ret = 0;
+				spin_unlock_irqrestore(&ibp->rvp.lock, flags);
+				return;
 			}
+			send_buf->ah = ah;
+			ibp->rvp.sm_ah = ibah_to_rvtah(ah);
 		} else {
-			ret = -EINVAL;
+			spin_unlock_irqrestore(&ibp->rvp.lock, flags);
+			return;
 		}
 	} else {
 		send_buf->ah = &ibp->rvp.sm_ah->ibah;
-		ret = 0;
+	}
+
+	/*
+	 * If the trap was repressed while things were getting set up, don't
+	 * bother sending it. This could happen for a retry.
+	 */
+	if (trap->repress) {
+		list_del(&trap->list);
+		spin_unlock_irqrestore(&ibp->rvp.lock, flags);
+		kfree(trap);
+		ib_free_send_mad(send_buf);
+		return;
+	}
+
+	trap->in_use = 0;
+	spin_unlock_irqrestore(&ibp->rvp.lock, flags);
+
+	if (ib_post_send_mad(send_buf, NULL))
+		ib_free_send_mad(send_buf);
+}
+
+void hfi1_handle_trap_timer(unsigned long data)
+{
+	struct hfi1_ibport *ibp = (struct hfi1_ibport *)data;
+	struct trap_node *trap = NULL;
+	unsigned long flags;
+	int i;
+
+	/* Find the trap with the highest priority */
+	spin_lock_irqsave(&ibp->rvp.lock, flags);
+	for (i = 0; !trap && i < RVT_MAX_TRAP_LISTS; i++) {
+		trap = list_first_entry_or_null(&ibp->rvp.trap_lists[i].list,
+						struct trap_node, list);
 	}
 	spin_unlock_irqrestore(&ibp->rvp.lock, flags);
 
-	if (!ret)
-		ret = ib_post_send_mad(send_buf, NULL);
-	if (!ret) {
-		/* 4.096 usec. */
-		timeout = (4096 * (1UL << ibp->rvp.subnet_timeout)) / 1000;
-		ibp->rvp.trap_timeout = jiffies + usecs_to_jiffies(timeout);
-	} else {
-		ib_free_send_mad(send_buf);
-		ibp->rvp.trap_timeout = 0;
-	}
+	if (trap)
+		send_trap(ibp, trap);
+}
+
+static struct trap_node *create_trap_node(u8 type, __be16 trap_num, u32 lid)
+{
+	struct trap_node *trap;
+
+	trap = kzalloc(sizeof(*trap), GFP_ATOMIC);
+	if (!trap)
+		return NULL;
+
+	INIT_LIST_HEAD(&trap->list);
+	trap->data.generic_type = type;
+	trap->data.prod_type_lsb = IB_NOTICE_PROD_CA;
+	trap->data.trap_num = trap_num;
+	trap->data.issuer_lid = cpu_to_be32(lid);
+
+	return trap;
 }
 
 /*
- * Send a bad [PQ]_Key trap (ch. 14.3.8).
+ * Send a bad P_Key trap (ch. 14.3.8).
  */
-void hfi1_bad_pqkey(struct hfi1_ibport *ibp, __be16 trap_num, u32 key, u32 sl,
-		    u32 qp1, u32 qp2, u16 lid1, u16 lid2)
+void hfi1_bad_pkey(struct hfi1_ibport *ibp, u32 key, u32 sl,
+		   u32 qp1, u32 qp2, u32 lid1, u32 lid2)
 {
-	struct opa_mad_notice_attr data;
+	struct trap_node *trap;
 	u32 lid = ppd_from_ibp(ibp)->lid;
-	u32 _lid1 = lid1;
-	u32 _lid2 = lid2;
 
-	memset(&data, 0, sizeof(data));
-
-	if (trap_num == OPA_TRAP_BAD_P_KEY)
-		ibp->rvp.pkey_violations++;
-	else
-		ibp->rvp.qkey_violations++;
 	ibp->rvp.n_pkt_drops++;
+	ibp->rvp.pkey_violations++;
+
+	trap = create_trap_node(IB_NOTICE_TYPE_SECURITY, OPA_TRAP_BAD_P_KEY,
+				lid);
+	if (!trap)
+		return;
 
 	/* Send violation trap */
-	data.generic_type = IB_NOTICE_TYPE_SECURITY;
-	data.prod_type_lsb = IB_NOTICE_PROD_CA;
-	data.trap_num = trap_num;
-	data.issuer_lid = cpu_to_be32(lid);
-	data.ntc_257_258.lid1 = cpu_to_be32(_lid1);
-	data.ntc_257_258.lid2 = cpu_to_be32(_lid2);
-	data.ntc_257_258.key = cpu_to_be32(key);
-	data.ntc_257_258.sl = sl << 3;
-	data.ntc_257_258.qp1 = cpu_to_be32(qp1);
-	data.ntc_257_258.qp2 = cpu_to_be32(qp2);
+	trap->data.ntc_257_258.lid1 = cpu_to_be32(lid1);
+	trap->data.ntc_257_258.lid2 = cpu_to_be32(lid2);
+	trap->data.ntc_257_258.key = cpu_to_be32(key);
+	trap->data.ntc_257_258.sl = sl << 3;
+	trap->data.ntc_257_258.qp1 = cpu_to_be32(qp1);
+	trap->data.ntc_257_258.qp2 = cpu_to_be32(qp2);
 
-	send_trap(ibp, &data, sizeof(data));
+	trap->len = sizeof(trap->data);
+	send_trap(ibp, trap);
 }
 
 /*
@@ -211,34 +470,36 @@ void hfi1_bad_pqkey(struct hfi1_ibport *ibp, __be16 trap_num, u32 key, u32 sl,
 static void bad_mkey(struct hfi1_ibport *ibp, struct ib_mad_hdr *mad,
 		     __be64 mkey, __be32 dr_slid, u8 return_path[], u8 hop_cnt)
 {
-	struct opa_mad_notice_attr data;
+	struct trap_node *trap;
 	u32 lid = ppd_from_ibp(ibp)->lid;
 
-	memset(&data, 0, sizeof(data));
+	trap = create_trap_node(IB_NOTICE_TYPE_SECURITY, OPA_TRAP_BAD_M_KEY,
+				lid);
+	if (!trap)
+		return;
+
 	/* Send violation trap */
-	data.generic_type = IB_NOTICE_TYPE_SECURITY;
-	data.prod_type_lsb = IB_NOTICE_PROD_CA;
-	data.trap_num = OPA_TRAP_BAD_M_KEY;
-	data.issuer_lid = cpu_to_be32(lid);
-	data.ntc_256.lid = data.issuer_lid;
-	data.ntc_256.method = mad->method;
-	data.ntc_256.attr_id = mad->attr_id;
-	data.ntc_256.attr_mod = mad->attr_mod;
-	data.ntc_256.mkey = mkey;
+	trap->data.ntc_256.lid = trap->data.issuer_lid;
+	trap->data.ntc_256.method = mad->method;
+	trap->data.ntc_256.attr_id = mad->attr_id;
+	trap->data.ntc_256.attr_mod = mad->attr_mod;
+	trap->data.ntc_256.mkey = mkey;
 	if (mad->mgmt_class == IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE) {
-		data.ntc_256.dr_slid = dr_slid;
-		data.ntc_256.dr_trunc_hop = IB_NOTICE_TRAP_DR_NOTICE;
-		if (hop_cnt > ARRAY_SIZE(data.ntc_256.dr_rtn_path)) {
-			data.ntc_256.dr_trunc_hop |=
+		trap->data.ntc_256.dr_slid = dr_slid;
+		trap->data.ntc_256.dr_trunc_hop = IB_NOTICE_TRAP_DR_NOTICE;
+		if (hop_cnt > ARRAY_SIZE(trap->data.ntc_256.dr_rtn_path)) {
+			trap->data.ntc_256.dr_trunc_hop |=
 				IB_NOTICE_TRAP_DR_TRUNC;
-			hop_cnt = ARRAY_SIZE(data.ntc_256.dr_rtn_path);
+			hop_cnt = ARRAY_SIZE(trap->data.ntc_256.dr_rtn_path);
 		}
-		data.ntc_256.dr_trunc_hop |= hop_cnt;
-		memcpy(data.ntc_256.dr_rtn_path, return_path,
+		trap->data.ntc_256.dr_trunc_hop |= hop_cnt;
+		memcpy(trap->data.ntc_256.dr_rtn_path, return_path,
 		       hop_cnt);
 	}
 
-	send_trap(ibp, &data, sizeof(data));
+	trap->len = sizeof(trap->data);
+
+	send_trap(ibp, trap);
 }
 
 /*
@@ -246,22 +507,24 @@ static void bad_mkey(struct hfi1_ibport *ibp, struct ib_mad_hdr *mad,
  */
 void hfi1_cap_mask_chg(struct rvt_dev_info *rdi, u8 port_num)
 {
-	struct opa_mad_notice_attr data;
+	struct trap_node *trap;
 	struct hfi1_ibdev *verbs_dev = dev_from_rdi(rdi);
 	struct hfi1_devdata *dd = dd_from_dev(verbs_dev);
 	struct hfi1_ibport *ibp = &dd->pport[port_num - 1].ibport_data;
 	u32 lid = ppd_from_ibp(ibp)->lid;
 
-	memset(&data, 0, sizeof(data));
+	trap = create_trap_node(IB_NOTICE_TYPE_INFO,
+				OPA_TRAP_CHANGE_CAPABILITY,
+				lid);
+	if (!trap)
+		return;
 
-	data.generic_type = IB_NOTICE_TYPE_INFO;
-	data.prod_type_lsb = IB_NOTICE_PROD_CA;
-	data.trap_num = OPA_TRAP_CHANGE_CAPABILITY;
-	data.issuer_lid = cpu_to_be32(lid);
-	data.ntc_144.lid = data.issuer_lid;
-	data.ntc_144.new_cap_mask = cpu_to_be32(ibp->rvp.port_cap_flags);
+	trap->data.ntc_144.lid = trap->data.issuer_lid;
+	trap->data.ntc_144.new_cap_mask = cpu_to_be32(ibp->rvp.port_cap_flags);
+	trap->data.ntc_144.cap_mask3 = cpu_to_be16(ibp->rvp.port_cap3_flags);
 
-	send_trap(ibp, &data, sizeof(data));
+	trap->len = sizeof(trap->data);
+	send_trap(ibp, trap);
 }
 
 /*
@@ -269,19 +532,19 @@ void hfi1_cap_mask_chg(struct rvt_dev_info *rdi, u8 port_num)
  */
 void hfi1_sys_guid_chg(struct hfi1_ibport *ibp)
 {
-	struct opa_mad_notice_attr data;
+	struct trap_node *trap;
 	u32 lid = ppd_from_ibp(ibp)->lid;
 
-	memset(&data, 0, sizeof(data));
+	trap = create_trap_node(IB_NOTICE_TYPE_INFO, OPA_TRAP_CHANGE_SYSGUID,
+				lid);
+	if (!trap)
+		return;
 
-	data.generic_type = IB_NOTICE_TYPE_INFO;
-	data.prod_type_lsb = IB_NOTICE_PROD_CA;
-	data.trap_num = OPA_TRAP_CHANGE_SYSGUID;
-	data.issuer_lid = cpu_to_be32(lid);
-	data.ntc_145.new_sys_guid = ib_hfi1_sys_image_guid;
-	data.ntc_145.lid = data.issuer_lid;
+	trap->data.ntc_145.new_sys_guid = ib_hfi1_sys_image_guid;
+	trap->data.ntc_145.lid = trap->data.issuer_lid;
 
-	send_trap(ibp, &data, sizeof(data));
+	trap->len = sizeof(trap->data);
+	send_trap(ibp, trap);
 }
 
 /*
@@ -289,29 +552,30 @@ void hfi1_sys_guid_chg(struct hfi1_ibport *ibp)
  */
 void hfi1_node_desc_chg(struct hfi1_ibport *ibp)
 {
-	struct opa_mad_notice_attr data;
+	struct trap_node *trap;
 	u32 lid = ppd_from_ibp(ibp)->lid;
 
-	memset(&data, 0, sizeof(data));
+	trap = create_trap_node(IB_NOTICE_TYPE_INFO,
+				OPA_TRAP_CHANGE_CAPABILITY,
+				lid);
+	if (!trap)
+		return;
 
-	data.generic_type = IB_NOTICE_TYPE_INFO;
-	data.prod_type_lsb = IB_NOTICE_PROD_CA;
-	data.trap_num = OPA_TRAP_CHANGE_CAPABILITY;
-	data.issuer_lid = cpu_to_be32(lid);
-	data.ntc_144.lid = data.issuer_lid;
-	data.ntc_144.change_flags =
+	trap->data.ntc_144.lid = trap->data.issuer_lid;
+	trap->data.ntc_144.change_flags =
 		cpu_to_be16(OPA_NOTICE_TRAP_NODE_DESC_CHG);
 
-	send_trap(ibp, &data, sizeof(data));
+	trap->len = sizeof(trap->data);
+	send_trap(ibp, trap);
 }
 
 static int __subn_get_opa_nodedesc(struct opa_smp *smp, u32 am,
 				   u8 *data, struct ib_device *ibdev,
-				   u8 port, u32 *resp_len)
+				   u8 port, u32 *resp_len, u32 max_len)
 {
 	struct opa_node_description *nd;
 
-	if (am) {
+	if (am || smp_length_check(sizeof(*nd), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -328,7 +592,7 @@ static int __subn_get_opa_nodedesc(struct opa_smp *smp, u32 am,
 
 static int __subn_get_opa_nodeinfo(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct opa_node_info *ni;
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
@@ -338,6 +602,7 @@ static int __subn_get_opa_nodeinfo(struct opa_smp *smp, u32 am, u8 *data,
 
 	/* GUID 0 is illegal */
 	if (am || pidx >= dd->num_pports || ibdev->node_guid == 0 ||
+	    smp_length_check(sizeof(*ni), max_len) ||
 	    get_sguid(to_iport(ibdev, port), HFI1_PORT_GUID_INDEX) == 0) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
@@ -519,7 +784,7 @@ void read_ltp_rtt(struct hfi1_devdata *dd)
 
 static int __subn_get_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	int i;
 	struct hfi1_devdata *dd;
@@ -535,7 +800,7 @@ static int __subn_get_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 	u32 buffer_units;
 	u64 tmp = 0;
 
-	if (num_ports != 1) {
+	if (num_ports != 1 || smp_length_check(sizeof(*pi), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -605,7 +870,7 @@ static int __subn_get_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 		ppd->offline_disabled_reason;
 
 	pi->port_states.portphysstate_portstate =
-		(hfi1_ibphys_portstate(ppd) << 4) | state;
+		(driver_pstate(ppd) << 4) | state;
 
 	pi->mkeyprotect_lmc = (ibp->rvp.mkeyprot << 6) | ppd->lmc;
 
@@ -704,13 +969,9 @@ static int __subn_get_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 	buffer_units |= (dd->vl15_init << 11) & OPA_PI_MASK_BUF_UNIT_VL15_INIT;
 	pi->buffer_units = cpu_to_be32(buffer_units);
 
-	pi->opa_cap_mask = cpu_to_be16(OPA_CAP_MASK3_IsSharedSpaceSupported |
-				       OPA_CAP_MASK3_IsEthOnFabricSupported);
-	/* Driver does not support mcast/collective configuration */
-	pi->opa_cap_mask &=
-		cpu_to_be16(~OPA_CAP_MASK3_IsAddrRangeConfigSupported);
-	pi->collectivemask_multicastmask = ((HFI1_COLLECTIVE_NR & 0x7)
-					    << 3 | (HFI1_MCAST_NR & 0x7));
+	pi->opa_cap_mask = cpu_to_be16(ibp->rvp.port_cap3_flags);
+	pi->collectivemask_multicastmask = ((OPA_COLLECTIVE_NR & 0x7)
+					    << 3 | (OPA_MCAST_NR & 0x7));
 
 	/* HFI supports a replay buffer 128 LTPs in size */
 	pi->replay_depth.buffer = 0x80;
@@ -748,7 +1009,7 @@ static int get_pkeys(struct hfi1_devdata *dd, u8 port, u16 *pkeys)
 
 static int __subn_get_opa_pkeytable(struct opa_smp *smp, u32 am, u8 *data,
 				    struct ib_device *ibdev, u8 port,
-				    u32 *resp_len)
+				    u32 *resp_len, u32 max_len)
 {
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	u32 n_blocks_req = OPA_AM_NBLK(am);
@@ -771,6 +1032,11 @@ static int __subn_get_opa_pkeytable(struct opa_smp *smp, u32 am, u8 *data,
 
 	size = (n_blocks_req * OPA_PARTITION_TABLE_BLK_SIZE) * sizeof(u16);
 
+	if (smp_length_check(size, max_len)) {
+		smp->status |= IB_SMP_INVALID_FIELD;
+		return reply((struct ib_mad_hdr *)smp);
+	}
+
 	if (start_block + n_blocks_req > n_blocks_avail ||
 	    n_blocks_req > OPA_NUM_PKEY_BLOCKS_PER_SMP) {
 		pr_warn("OPA Get PKey AM Invalid : s 0x%x; req 0x%x; "
@@ -915,8 +1181,8 @@ static int physical_transition_allowed(int old, int new)
 static int port_states_transition_allowed(struct hfi1_pportdata *ppd,
 					  u32 logical_new, u32 physical_new)
 {
-	u32 physical_old = driver_physical_state(ppd);
-	u32 logical_old = driver_logical_state(ppd);
+	u32 physical_old = driver_pstate(ppd);
+	u32 logical_old = driver_lstate(ppd);
 	int ret, logical_allowed, physical_allowed;
 
 	ret = logical_transition_allowed(logical_old, logical_new);
@@ -1074,7 +1340,7 @@ static int set_port_states(struct hfi1_pportdata *ppd, struct opa_smp *smp,
  */
 static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct opa_port_info *pi = (struct opa_port_info *)data;
 	struct ib_event event;
@@ -1083,8 +1349,8 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 	struct hfi1_ibport *ibp;
 	u8 clientrereg;
 	unsigned long flags;
-	u32 smlid, opa_lid; /* tmp vars to hold LID values */
-	u16 lid;
+	u32 smlid;
+	u32 lid;
 	u8 ls_old, ls_new, ps_new;
 	u8 vls;
 	u8 msl;
@@ -1095,27 +1361,26 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 	int ret, i, invalid = 0, call_set_mtu = 0;
 	int call_link_downgrade_policy = 0;
 
-	if (num_ports != 1) {
+	if (num_ports != 1 ||
+	    smp_length_check(sizeof(*pi), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
 
-	opa_lid = be32_to_cpu(pi->lid);
-	if (opa_lid & 0xFFFF0000) {
-		pr_warn("OPA_PortInfo lid out of range: %X\n", opa_lid);
+	lid = be32_to_cpu(pi->lid);
+	if (lid & 0xFF000000) {
+		pr_warn("OPA_PortInfo lid out of range: %X\n", lid);
 		smp->status |= IB_SMP_INVALID_FIELD;
 		goto get_only;
 	}
 
-	lid = (u16)(opa_lid & 0x0000FFFF);
 
 	smlid = be32_to_cpu(pi->sm_lid);
-	if (smlid & 0xFFFF0000) {
+	if (smlid & 0xFF000000) {
 		pr_warn("OPA_PortInfo SM lid out of range: %X\n", smlid);
 		smp->status |= IB_SMP_INVALID_FIELD;
 		goto get_only;
 	}
-	smlid &= 0x0000FFFF;
 
 	clientrereg = (pi->clientrereg_subnettimeout &
 			OPA_PI_MASK_CLIENT_REREGISTER);
@@ -1130,12 +1395,16 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 	ls_old = driver_lstate(ppd);
 
 	ibp->rvp.mkey = pi->mkey;
-	ibp->rvp.gid_prefix = pi->subnet_prefix;
+	if (ibp->rvp.gid_prefix != pi->subnet_prefix) {
+		ibp->rvp.gid_prefix = pi->subnet_prefix;
+		event.event = IB_EVENT_GID_CHANGE;
+		ib_dispatch_event(&event);
+	}
 	ibp->rvp.mkey_lease_period = be16_to_cpu(pi->mkey_lease_period);
 
 	/* Must be a valid unicast LID address. */
 	if ((lid == 0 && ls_old > IB_PORT_INIT) ||
-	    lid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) {
+	     (hfi1_is_16B_mcast(lid))) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		pr_warn("SubnSet(OPA_PortInfo) lid invalid 0x%x\n",
 			lid);
@@ -1148,6 +1417,16 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 		hfi1_set_lid(ppd, lid, pi->mkeyprotect_lmc & OPA_PI_MASK_LMC);
 		event.event = IB_EVENT_LID_CHANGE;
 		ib_dispatch_event(&event);
+
+		if (HFI1_PORT_GUID_INDEX + 1 < HFI1_GUIDS_PER_PORT) {
+			/* Manufacture GID from LID to support extended
+			 * addresses
+			 */
+			ppd->guids[HFI1_PORT_GUID_INDEX + 1] =
+				be64_to_cpu(OPA_MAKE_ID(lid));
+			event.event = IB_EVENT_GID_CHANGE;
+			ib_dispatch_event(&event);
+		}
 	}
 
 	msl = pi->smsl & OPA_PI_MASK_SMSL;
@@ -1158,7 +1437,7 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 
 	/* Must be a valid unicast LID address. */
 	if ((smlid == 0 && ls_old > IB_PORT_INIT) ||
-	    smlid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) {
+	     (hfi1_is_16B_mcast(smlid))) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		pr_warn("SubnSet(OPA_PortInfo) smlid invalid 0x%x\n", smlid);
 	} else if (smlid != ibp->rvp.sm_lid || msl != ibp->rvp.sm_sl) {
@@ -1166,7 +1445,7 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 		spin_lock_irqsave(&ibp->rvp.lock, flags);
 		if (ibp->rvp.sm_ah) {
 			if (smlid != ibp->rvp.sm_lid)
-				rdma_ah_set_dlid(&ibp->rvp.sm_ah->attr, smlid);
+				hfi1_modify_qp0_ah(ibp, ibp->rvp.sm_ah, smlid);
 			if (msl != ibp->rvp.sm_sl)
 				rdma_ah_set_sl(&ibp->rvp.sm_ah->attr, msl);
 		}
@@ -1346,7 +1625,8 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 	if (ret)
 		return ret;
 
-	ret = __subn_get_opa_portinfo(smp, am, data, ibdev, port, resp_len);
+	ret = __subn_get_opa_portinfo(smp, am, data, ibdev, port, resp_len,
+				      max_len);
 
 	/* restore re-reg bit per o14-12.2.1 */
 	pi->clientrereg_subnettimeout |= clientrereg;
@@ -1363,7 +1643,8 @@ static int __subn_set_opa_portinfo(struct opa_smp *smp, u32 am, u8 *data,
 	return ret;
 
 get_only:
-	return __subn_get_opa_portinfo(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_portinfo(smp, am, data, ibdev, port, resp_len,
+				       max_len);
 }
 
 /**
@@ -1424,7 +1705,7 @@ static int set_pkeys(struct hfi1_devdata *dd, u8 port, u16 *pkeys)
 
 static int __subn_set_opa_pkeytable(struct opa_smp *smp, u32 am, u8 *data,
 				    struct ib_device *ibdev, u8 port,
-				    u32 *resp_len)
+				    u32 *resp_len, u32 max_len)
 {
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	u32 n_blocks_sent = OPA_AM_NBLK(am);
@@ -1434,6 +1715,7 @@ static int __subn_set_opa_pkeytable(struct opa_smp *smp, u32 am, u8 *data,
 	int i;
 	u16 n_blocks_avail;
 	unsigned npkeys = hfi1_get_npkeys(dd);
+	u32 size = 0;
 
 	if (n_blocks_sent == 0) {
 		pr_warn("OPA Get PKey AM Invalid : P = %d; B = 0x%x; N = 0x%x\n",
@@ -1444,6 +1726,13 @@ static int __subn_set_opa_pkeytable(struct opa_smp *smp, u32 am, u8 *data,
 
 	n_blocks_avail = (u16)(npkeys / OPA_PARTITION_TABLE_BLK_SIZE) + 1;
 
+	size = sizeof(u16) * (n_blocks_sent * OPA_PARTITION_TABLE_BLK_SIZE);
+
+	if (smp_length_check(size, max_len)) {
+		smp->status |= IB_SMP_INVALID_FIELD;
+		return reply((struct ib_mad_hdr *)smp);
+	}
+
 	if (start_block + n_blocks_sent > n_blocks_avail ||
 	    n_blocks_sent > OPA_NUM_PKEY_BLOCKS_PER_SMP) {
 		pr_warn("OPA Set PKey AM Invalid : s 0x%x; req 0x%x; avail 0x%x; blk/smp 0x%lx\n",
@@ -1461,7 +1750,8 @@ static int __subn_set_opa_pkeytable(struct opa_smp *smp, u32 am, u8 *data,
 		return reply((struct ib_mad_hdr *)smp);
 	}
 
-	return __subn_get_opa_pkeytable(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_pkeytable(smp, am, data, ibdev, port, resp_len,
+					max_len);
 }
 
 #define ILLEGAL_VL 12
@@ -1522,14 +1812,14 @@ static int get_sc2vlt_tables(struct hfi1_devdata *dd, void *data)
 
 static int __subn_get_opa_sl_to_sc(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
 	u8 *p = data;
 	size_t size = ARRAY_SIZE(ibp->sl_to_sc); /* == 32 */
 	unsigned i;
 
-	if (am) {
+	if (am || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1545,14 +1835,15 @@ static int __subn_get_opa_sl_to_sc(struct opa_smp *smp, u32 am, u8 *data,
 
 static int __subn_set_opa_sl_to_sc(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
 	u8 *p = data;
+	size_t size = ARRAY_SIZE(ibp->sl_to_sc);
 	int i;
 	u8 sc;
 
-	if (am) {
+	if (am || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1567,19 +1858,20 @@ static int __subn_set_opa_sl_to_sc(struct opa_smp *smp, u32 am, u8 *data,
 		}
 	}
 
-	return __subn_get_opa_sl_to_sc(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_sl_to_sc(smp, am, data, ibdev, port, resp_len,
+				       max_len);
 }
 
 static int __subn_get_opa_sc_to_sl(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
 	u8 *p = data;
 	size_t size = ARRAY_SIZE(ibp->sc_to_sl); /* == 32 */
 	unsigned i;
 
-	if (am) {
+	if (am || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1595,13 +1887,14 @@ static int __subn_get_opa_sc_to_sl(struct opa_smp *smp, u32 am, u8 *data,
 
 static int __subn_set_opa_sc_to_sl(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
+	size_t size = ARRAY_SIZE(ibp->sc_to_sl);
 	u8 *p = data;
 	int i;
 
-	if (am) {
+	if (am || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1609,19 +1902,20 @@ static int __subn_set_opa_sc_to_sl(struct opa_smp *smp, u32 am, u8 *data,
 	for (i = 0; i < ARRAY_SIZE(ibp->sc_to_sl); i++)
 		ibp->sc_to_sl[i] = *p++;
 
-	return __subn_get_opa_sc_to_sl(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_sc_to_sl(smp, am, data, ibdev, port, resp_len,
+				       max_len);
 }
 
 static int __subn_get_opa_sc_to_vlt(struct opa_smp *smp, u32 am, u8 *data,
 				    struct ib_device *ibdev, u8 port,
-				    u32 *resp_len)
+				    u32 *resp_len, u32 max_len)
 {
 	u32 n_blocks = OPA_AM_NBLK(am);
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	void *vp = (void *)data;
 	size_t size = 4 * sizeof(u64);
 
-	if (n_blocks != 1) {
+	if (n_blocks != 1 || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1636,7 +1930,7 @@ static int __subn_get_opa_sc_to_vlt(struct opa_smp *smp, u32 am, u8 *data,
 
 static int __subn_set_opa_sc_to_vlt(struct opa_smp *smp, u32 am, u8 *data,
 				    struct ib_device *ibdev, u8 port,
-				    u32 *resp_len)
+				    u32 *resp_len, u32 max_len)
 {
 	u32 n_blocks = OPA_AM_NBLK(am);
 	int async_update = OPA_AM_ASYNC(am);
@@ -1644,8 +1938,15 @@ static int __subn_set_opa_sc_to_vlt(struct opa_smp *smp, u32 am, u8 *data,
 	void *vp = (void *)data;
 	struct hfi1_pportdata *ppd;
 	int lstate;
+	/*
+	 * set_sc2vlt_tables writes the information contained in *data
+	 * to four 64-bit registers SendSC2VLt[0-3]. We need to make
+	 * sure max_len is not smaller than the total size of the four
+	 * SendSC2VLt[0-3] registers.
+	 */
+	size_t size = 4 * sizeof(u64);
 
-	if (n_blocks != 1 || async_update) {
+	if (n_blocks != 1 || async_update || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1665,27 +1966,28 @@ static int __subn_set_opa_sc_to_vlt(struct opa_smp *smp, u32 am, u8 *data,
 
 	set_sc2vlt_tables(dd, vp);
 
-	return __subn_get_opa_sc_to_vlt(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_sc_to_vlt(smp, am, data, ibdev, port, resp_len,
+					max_len);
 }
 
 static int __subn_get_opa_sc_to_vlnt(struct opa_smp *smp, u32 am, u8 *data,
 				     struct ib_device *ibdev, u8 port,
-				     u32 *resp_len)
+				     u32 *resp_len, u32 max_len)
 {
 	u32 n_blocks = OPA_AM_NPORT(am);
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	struct hfi1_pportdata *ppd;
 	void *vp = (void *)data;
-	int size;
+	int size = sizeof(struct sc2vlnt);
 
-	if (n_blocks != 1) {
+	if (n_blocks != 1 || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
 
 	ppd = dd->pport + (port - 1);
 
-	size = fm_get_table(ppd, FM_TBL_SC2VLNT, vp);
+	fm_get_table(ppd, FM_TBL_SC2VLNT, vp);
 
 	if (resp_len)
 		*resp_len += size;
@@ -1695,15 +1997,16 @@ static int __subn_get_opa_sc_to_vlnt(struct opa_smp *smp, u32 am, u8 *data,
 
 static int __subn_set_opa_sc_to_vlnt(struct opa_smp *smp, u32 am, u8 *data,
 				     struct ib_device *ibdev, u8 port,
-				     u32 *resp_len)
+				     u32 *resp_len, u32 max_len)
 {
 	u32 n_blocks = OPA_AM_NPORT(am);
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	struct hfi1_pportdata *ppd;
 	void *vp = (void *)data;
 	int lstate;
+	int size = sizeof(struct sc2vlnt);
 
-	if (n_blocks != 1) {
+	if (n_blocks != 1 || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1721,12 +2024,12 @@ static int __subn_set_opa_sc_to_vlnt(struct opa_smp *smp, u32 am, u8 *data,
 	fm_set_table(ppd, FM_TBL_SC2VLNT, vp);
 
 	return __subn_get_opa_sc_to_vlnt(smp, am, data, ibdev, port,
-					 resp_len);
+					 resp_len, max_len);
 }
 
 static int __subn_get_opa_psi(struct opa_smp *smp, u32 am, u8 *data,
 			      struct ib_device *ibdev, u8 port,
-			      u32 *resp_len)
+			      u32 *resp_len, u32 max_len)
 {
 	u32 nports = OPA_AM_NPORT(am);
 	u32 start_of_sm_config = OPA_AM_START_SM_CFG(am);
@@ -1735,7 +2038,7 @@ static int __subn_get_opa_psi(struct opa_smp *smp, u32 am, u8 *data,
 	struct hfi1_pportdata *ppd;
 	struct opa_port_state_info *psi = (struct opa_port_state_info *)data;
 
-	if (nports != 1) {
+	if (nports != 1 || smp_length_check(sizeof(*psi), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1755,7 +2058,7 @@ static int __subn_get_opa_psi(struct opa_smp *smp, u32 am, u8 *data,
 		ppd->offline_disabled_reason;
 
 	psi->port_states.portphysstate_portstate =
-		(hfi1_ibphys_portstate(ppd) << 4) | (lstate & 0xf);
+		(driver_pstate(ppd) << 4) | (lstate & 0xf);
 	psi->link_width_downgrade_tx_active =
 		cpu_to_be16(ppd->link_width_downgrade_tx_active);
 	psi->link_width_downgrade_rx_active =
@@ -1768,7 +2071,7 @@ static int __subn_get_opa_psi(struct opa_smp *smp, u32 am, u8 *data,
 
 static int __subn_set_opa_psi(struct opa_smp *smp, u32 am, u8 *data,
 			      struct ib_device *ibdev, u8 port,
-			      u32 *resp_len)
+			      u32 *resp_len, u32 max_len)
 {
 	u32 nports = OPA_AM_NPORT(am);
 	u32 start_of_sm_config = OPA_AM_START_SM_CFG(am);
@@ -1779,7 +2082,7 @@ static int __subn_set_opa_psi(struct opa_smp *smp, u32 am, u8 *data,
 	struct opa_port_state_info *psi = (struct opa_port_state_info *)data;
 	int ret, invalid = 0;
 
-	if (nports != 1) {
+	if (nports != 1 || smp_length_check(sizeof(*psi), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1809,19 +2112,21 @@ static int __subn_set_opa_psi(struct opa_smp *smp, u32 am, u8 *data,
 	if (invalid)
 		smp->status |= IB_SMP_INVALID_FIELD;
 
-	return __subn_get_opa_psi(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_psi(smp, am, data, ibdev, port, resp_len,
+				  max_len);
 }
 
 static int __subn_get_opa_cable_info(struct opa_smp *smp, u32 am, u8 *data,
 				     struct ib_device *ibdev, u8 port,
-				     u32 *resp_len)
+				     u32 *resp_len, u32 max_len)
 {
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	u32 addr = OPA_AM_CI_ADDR(am);
 	u32 len = OPA_AM_CI_LEN(am) + 1;
 	int ret;
 
-	if (dd->pport->port_type != PORT_TYPE_QSFP) {
+	if (dd->pport->port_type != PORT_TYPE_QSFP ||
+	    smp_length_check(len, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1864,21 +2169,22 @@ static int __subn_get_opa_cable_info(struct opa_smp *smp, u32 am, u8 *data,
 }
 
 static int __subn_get_opa_bct(struct opa_smp *smp, u32 am, u8 *data,
-			      struct ib_device *ibdev, u8 port, u32 *resp_len)
+			      struct ib_device *ibdev, u8 port, u32 *resp_len,
+			      u32 max_len)
 {
 	u32 num_ports = OPA_AM_NPORT(am);
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	struct hfi1_pportdata *ppd;
 	struct buffer_control *p = (struct buffer_control *)data;
-	int size;
+	int size = sizeof(struct buffer_control);
 
-	if (num_ports != 1) {
+	if (num_ports != 1 || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
 
 	ppd = dd->pport + (port - 1);
-	size = fm_get_table(ppd, FM_TBL_BUFFER_CONTROL, p);
+	fm_get_table(ppd, FM_TBL_BUFFER_CONTROL, p);
 	trace_bct_get(dd, p);
 	if (resp_len)
 		*resp_len += size;
@@ -1887,14 +2193,15 @@ static int __subn_get_opa_bct(struct opa_smp *smp, u32 am, u8 *data,
 }
 
 static int __subn_set_opa_bct(struct opa_smp *smp, u32 am, u8 *data,
-			      struct ib_device *ibdev, u8 port, u32 *resp_len)
+			      struct ib_device *ibdev, u8 port, u32 *resp_len,
+			      u32 max_len)
 {
 	u32 num_ports = OPA_AM_NPORT(am);
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	struct hfi1_pportdata *ppd;
 	struct buffer_control *p = (struct buffer_control *)data;
 
-	if (num_ports != 1) {
+	if (num_ports != 1 || smp_length_check(sizeof(*p), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1905,41 +2212,43 @@ static int __subn_set_opa_bct(struct opa_smp *smp, u32 am, u8 *data,
 		return reply((struct ib_mad_hdr *)smp);
 	}
 
-	return __subn_get_opa_bct(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_bct(smp, am, data, ibdev, port, resp_len,
+				  max_len);
 }
 
 static int __subn_get_opa_vl_arb(struct opa_smp *smp, u32 am, u8 *data,
 				 struct ib_device *ibdev, u8 port,
-				 u32 *resp_len)
+				 u32 *resp_len, u32 max_len)
 {
 	struct hfi1_pportdata *ppd = ppd_from_ibp(to_iport(ibdev, port));
 	u32 num_ports = OPA_AM_NPORT(am);
 	u8 section = (am & 0x00ff0000) >> 16;
 	u8 *p = data;
-	int size = 0;
+	int size = 256;
 
-	if (num_ports != 1) {
+	if (num_ports != 1 || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
 
 	switch (section) {
 	case OPA_VLARB_LOW_ELEMENTS:
-		size = fm_get_table(ppd, FM_TBL_VL_LOW_ARB, p);
+		fm_get_table(ppd, FM_TBL_VL_LOW_ARB, p);
 		break;
 	case OPA_VLARB_HIGH_ELEMENTS:
-		size = fm_get_table(ppd, FM_TBL_VL_HIGH_ARB, p);
+		fm_get_table(ppd, FM_TBL_VL_HIGH_ARB, p);
 		break;
 	case OPA_VLARB_PREEMPT_ELEMENTS:
-		size = fm_get_table(ppd, FM_TBL_VL_PREEMPT_ELEMS, p);
+		fm_get_table(ppd, FM_TBL_VL_PREEMPT_ELEMS, p);
 		break;
 	case OPA_VLARB_PREEMPT_MATRIX:
-		size = fm_get_table(ppd, FM_TBL_VL_PREEMPT_MATRIX, p);
+		fm_get_table(ppd, FM_TBL_VL_PREEMPT_MATRIX, p);
 		break;
 	default:
 		pr_warn("OPA SubnGet(VL Arb) AM Invalid : 0x%x\n",
 			be32_to_cpu(smp->attr_mod));
 		smp->status |= IB_SMP_INVALID_FIELD;
+		size = 0;
 		break;
 	}
 
@@ -1951,14 +2260,15 @@ static int __subn_get_opa_vl_arb(struct opa_smp *smp, u32 am, u8 *data,
 
 static int __subn_set_opa_vl_arb(struct opa_smp *smp, u32 am, u8 *data,
 				 struct ib_device *ibdev, u8 port,
-				 u32 *resp_len)
+				 u32 *resp_len, u32 max_len)
 {
 	struct hfi1_pportdata *ppd = ppd_from_ibp(to_iport(ibdev, port));
 	u32 num_ports = OPA_AM_NPORT(am);
 	u8 section = (am & 0x00ff0000) >> 16;
 	u8 *p = data;
+	int size = 256;
 
-	if (num_ports != 1) {
+	if (num_ports != 1 || smp_length_check(size, max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -1986,7 +2296,8 @@ static int __subn_set_opa_vl_arb(struct opa_smp *smp, u32 am, u8 *data,
 		break;
 	}
 
-	return __subn_get_opa_vl_arb(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_vl_arb(smp, am, data, ibdev, port, resp_len,
+				     max_len);
 }
 
 struct opa_pma_mad {
@@ -3282,13 +3593,18 @@ struct opa_congestion_info_attr {
 
 static int __subn_get_opa_cong_info(struct opa_smp *smp, u32 am, u8 *data,
 				    struct ib_device *ibdev, u8 port,
-				    u32 *resp_len)
+				    u32 *resp_len, u32 max_len)
 {
 	struct opa_congestion_info_attr *p =
 		(struct opa_congestion_info_attr *)data;
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
 	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
 
+	if (smp_length_check(sizeof(*p), max_len)) {
+		smp->status |= IB_SMP_INVALID_FIELD;
+		return reply((struct ib_mad_hdr *)smp);
+	}
+
 	p->congestion_info = 0;
 	p->control_table_cap = ppd->cc_max_table_entries;
 	p->congestion_log_length = OPA_CONG_LOG_ELEMS;
@@ -3301,7 +3617,7 @@ static int __subn_get_opa_cong_info(struct opa_smp *smp, u32 am, u8 *data,
 
 static int __subn_get_opa_cong_setting(struct opa_smp *smp, u32 am,
 				       u8 *data, struct ib_device *ibdev,
-				       u8 port, u32 *resp_len)
+				       u8 port, u32 *resp_len, u32 max_len)
 {
 	int i;
 	struct opa_congestion_setting_attr *p =
@@ -3311,6 +3627,11 @@ static int __subn_get_opa_cong_setting(struct opa_smp *smp, u32 am,
 	struct opa_congestion_setting_entry_shadow *entries;
 	struct cc_state *cc_state;
 
+	if (smp_length_check(sizeof(*p), max_len)) {
+		smp->status |= IB_SMP_INVALID_FIELD;
+		return reply((struct ib_mad_hdr *)smp);
+	}
+
 	rcu_read_lock();
 
 	cc_state = get_cc_state(ppd);
@@ -3385,7 +3706,7 @@ static void apply_cc_state(struct hfi1_pportdata *ppd)
 
 static int __subn_set_opa_cong_setting(struct opa_smp *smp, u32 am, u8 *data,
 				       struct ib_device *ibdev, u8 port,
-				       u32 *resp_len)
+				       u32 *resp_len, u32 max_len)
 {
 	struct opa_congestion_setting_attr *p =
 		(struct opa_congestion_setting_attr *)data;
@@ -3394,6 +3715,11 @@ static int __subn_set_opa_cong_setting(struct opa_smp *smp, u32 am, u8 *data,
 	struct opa_congestion_setting_entry_shadow *entries;
 	int i;
 
+	if (smp_length_check(sizeof(*p), max_len)) {
+		smp->status |= IB_SMP_INVALID_FIELD;
+		return reply((struct ib_mad_hdr *)smp);
+	}
+
 	/*
 	 * Save details from packet into the ppd.  Hold the cc_state_lock so
 	 * our information is consistent with anyone trying to apply the state.
@@ -3415,12 +3741,12 @@ static int __subn_set_opa_cong_setting(struct opa_smp *smp, u32 am, u8 *data,
 	apply_cc_state(ppd);
 
 	return __subn_get_opa_cong_setting(smp, am, data, ibdev, port,
-					   resp_len);
+					   resp_len, max_len);
 }
 
 static int __subn_get_opa_hfi1_cong_log(struct opa_smp *smp, u32 am,
 					u8 *data, struct ib_device *ibdev,
-					u8 port, u32 *resp_len)
+					u8 port, u32 *resp_len, u32 max_len)
 {
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
 	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
@@ -3428,7 +3754,7 @@ static int __subn_get_opa_hfi1_cong_log(struct opa_smp *smp, u32 am,
 	s64 ts;
 	int i;
 
-	if (am != 0) {
+	if (am || smp_length_check(sizeof(*cong_log), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -3486,7 +3812,7 @@ static int __subn_get_opa_hfi1_cong_log(struct opa_smp *smp, u32 am,
 
 static int __subn_get_opa_cc_table(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct ib_cc_table_attr *cc_table_attr =
 		(struct ib_cc_table_attr *)data;
@@ -3498,9 +3824,10 @@ static int __subn_get_opa_cc_table(struct opa_smp *smp, u32 am, u8 *data,
 	int i, j;
 	u32 sentry, eentry;
 	struct cc_state *cc_state;
+	u32 size = sizeof(u16) * (IB_CCT_ENTRIES * n_blocks + 1);
 
 	/* sanity check n_blocks, start_block */
-	if (n_blocks == 0 ||
+	if (n_blocks == 0 || smp_length_check(size, max_len) ||
 	    start_block + n_blocks > ppd->cc_max_table_entries) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
@@ -3530,14 +3857,14 @@ static int __subn_get_opa_cc_table(struct opa_smp *smp, u32 am, u8 *data,
 	rcu_read_unlock();
 
 	if (resp_len)
-		*resp_len += sizeof(u16) * (IB_CCT_ENTRIES * n_blocks + 1);
+		*resp_len += size;
 
 	return reply((struct ib_mad_hdr *)smp);
 }
 
 static int __subn_set_opa_cc_table(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct ib_cc_table_attr *p = (struct ib_cc_table_attr *)data;
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
@@ -3548,9 +3875,10 @@ static int __subn_set_opa_cc_table(struct opa_smp *smp, u32 am, u8 *data,
 	int i, j;
 	u32 sentry, eentry;
 	u16 ccti_limit;
+	u32 size = sizeof(u16) * (IB_CCT_ENTRIES * n_blocks + 1);
 
 	/* sanity check n_blocks, start_block */
-	if (n_blocks == 0 ||
+	if (n_blocks == 0 || smp_length_check(size, max_len) ||
 	    start_block + n_blocks > ppd->cc_max_table_entries) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
@@ -3581,7 +3909,8 @@ static int __subn_set_opa_cc_table(struct opa_smp *smp, u32 am, u8 *data,
 	/* now apply the information */
 	apply_cc_state(ppd);
 
-	return __subn_get_opa_cc_table(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_cc_table(smp, am, data, ibdev, port, resp_len,
+				       max_len);
 }
 
 struct opa_led_info {
@@ -3594,7 +3923,7 @@ struct opa_led_info {
 
 static int __subn_get_opa_led_info(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	struct hfi1_pportdata *ppd = dd->pport;
@@ -3602,7 +3931,7 @@ static int __subn_get_opa_led_info(struct opa_smp *smp, u32 am, u8 *data,
 	u32 nport = OPA_AM_NPORT(am);
 	u32 is_beaconing_active;
 
-	if (nport != 1) {
+	if (nport != 1 || smp_length_check(sizeof(*p), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -3624,14 +3953,14 @@ static int __subn_get_opa_led_info(struct opa_smp *smp, u32 am, u8 *data,
 
 static int __subn_set_opa_led_info(struct opa_smp *smp, u32 am, u8 *data,
 				   struct ib_device *ibdev, u8 port,
-				   u32 *resp_len)
+				   u32 *resp_len, u32 max_len)
 {
 	struct hfi1_devdata *dd = dd_from_ibdev(ibdev);
 	struct opa_led_info *p = (struct opa_led_info *)data;
 	u32 nport = OPA_AM_NPORT(am);
 	int on = !!(be32_to_cpu(p->rsvd_led_mask) & OPA_LED_MASK);
 
-	if (nport != 1) {
+	if (nport != 1 || smp_length_check(sizeof(*p), max_len)) {
 		smp->status |= IB_SMP_INVALID_FIELD;
 		return reply((struct ib_mad_hdr *)smp);
 	}
@@ -3641,12 +3970,13 @@ static int __subn_set_opa_led_info(struct opa_smp *smp, u32 am, u8 *data,
 	else
 		shutdown_led_override(dd->pport);
 
-	return __subn_get_opa_led_info(smp, am, data, ibdev, port, resp_len);
+	return __subn_get_opa_led_info(smp, am, data, ibdev, port, resp_len,
+				       max_len);
 }
 
 static int subn_get_opa_sma(__be16 attr_id, struct opa_smp *smp, u32 am,
 			    u8 *data, struct ib_device *ibdev, u8 port,
-			    u32 *resp_len)
+			    u32 *resp_len, u32 max_len)
 {
 	int ret;
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
@@ -3654,71 +3984,71 @@ static int subn_get_opa_sma(__be16 attr_id, struct opa_smp *smp, u32 am,
 	switch (attr_id) {
 	case IB_SMP_ATTR_NODE_DESC:
 		ret = __subn_get_opa_nodedesc(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_NODE_INFO:
 		ret = __subn_get_opa_nodeinfo(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_PORT_INFO:
 		ret = __subn_get_opa_portinfo(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_PKEY_TABLE:
 		ret = __subn_get_opa_pkeytable(smp, am, data, ibdev, port,
-					       resp_len);
+					       resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_SL_TO_SC_MAP:
 		ret = __subn_get_opa_sl_to_sc(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_SC_TO_SL_MAP:
 		ret = __subn_get_opa_sc_to_sl(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_SC_TO_VLT_MAP:
 		ret = __subn_get_opa_sc_to_vlt(smp, am, data, ibdev, port,
-					       resp_len);
+					       resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_SC_TO_VLNT_MAP:
 		ret = __subn_get_opa_sc_to_vlnt(smp, am, data, ibdev, port,
-						resp_len);
+						resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_PORT_STATE_INFO:
 		ret = __subn_get_opa_psi(smp, am, data, ibdev, port,
-					 resp_len);
+					 resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_BUFFER_CONTROL_TABLE:
 		ret = __subn_get_opa_bct(smp, am, data, ibdev, port,
-					 resp_len);
+					 resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_CABLE_INFO:
 		ret = __subn_get_opa_cable_info(smp, am, data, ibdev, port,
-						resp_len);
+						resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_VL_ARB_TABLE:
 		ret = __subn_get_opa_vl_arb(smp, am, data, ibdev, port,
-					    resp_len);
+					    resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_CONGESTION_INFO:
 		ret = __subn_get_opa_cong_info(smp, am, data, ibdev, port,
-					       resp_len);
+					       resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_HFI_CONGESTION_SETTING:
 		ret = __subn_get_opa_cong_setting(smp, am, data, ibdev,
-						  port, resp_len);
+						  port, resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_HFI_CONGESTION_LOG:
 		ret = __subn_get_opa_hfi1_cong_log(smp, am, data, ibdev,
-						   port, resp_len);
+						   port, resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_CONGESTION_CONTROL_TABLE:
 		ret = __subn_get_opa_cc_table(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_LED_INFO:
 		ret = __subn_get_opa_led_info(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_SM_INFO:
 		if (ibp->rvp.port_cap_flags & IB_PORT_SM_DISABLED)
@@ -3736,7 +4066,7 @@ static int subn_get_opa_sma(__be16 attr_id, struct opa_smp *smp, u32 am,
 
 static int subn_set_opa_sma(__be16 attr_id, struct opa_smp *smp, u32 am,
 			    u8 *data, struct ib_device *ibdev, u8 port,
-			    u32 *resp_len)
+			    u32 *resp_len, u32 max_len)
 {
 	int ret;
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
@@ -3744,51 +4074,51 @@ static int subn_set_opa_sma(__be16 attr_id, struct opa_smp *smp, u32 am,
 	switch (attr_id) {
 	case IB_SMP_ATTR_PORT_INFO:
 		ret = __subn_set_opa_portinfo(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_PKEY_TABLE:
 		ret = __subn_set_opa_pkeytable(smp, am, data, ibdev, port,
-					       resp_len);
+					       resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_SL_TO_SC_MAP:
 		ret = __subn_set_opa_sl_to_sc(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_SC_TO_SL_MAP:
 		ret = __subn_set_opa_sc_to_sl(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_SC_TO_VLT_MAP:
 		ret = __subn_set_opa_sc_to_vlt(smp, am, data, ibdev, port,
-					       resp_len);
+					       resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_SC_TO_VLNT_MAP:
 		ret = __subn_set_opa_sc_to_vlnt(smp, am, data, ibdev, port,
-						resp_len);
+						resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_PORT_STATE_INFO:
 		ret = __subn_set_opa_psi(smp, am, data, ibdev, port,
-					 resp_len);
+					 resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_BUFFER_CONTROL_TABLE:
 		ret = __subn_set_opa_bct(smp, am, data, ibdev, port,
-					 resp_len);
+					 resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_VL_ARB_TABLE:
 		ret = __subn_set_opa_vl_arb(smp, am, data, ibdev, port,
-					    resp_len);
+					    resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_HFI_CONGESTION_SETTING:
 		ret = __subn_set_opa_cong_setting(smp, am, data, ibdev,
-						  port, resp_len);
+						  port, resp_len, max_len);
 		break;
 	case OPA_ATTRIB_ID_CONGESTION_CONTROL_TABLE:
 		ret = __subn_set_opa_cc_table(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_LED_INFO:
 		ret = __subn_set_opa_led_info(smp, am, data, ibdev, port,
-					      resp_len);
+					      resp_len, max_len);
 		break;
 	case IB_SMP_ATTR_SM_INFO:
 		if (ibp->rvp.port_cap_flags & IB_PORT_SM_DISABLED)
@@ -3844,7 +4174,10 @@ static int subn_get_opa_aggregate(struct opa_smp *smp,
 		memset(next_smp + sizeof(*agg), 0, agg_data_len);
 
 		(void)subn_get_opa_sma(agg->attr_id, smp, am, agg->data,
-					ibdev, port, NULL);
+				       ibdev, port, NULL, (u32)agg_data_len);
+
+		if (smp->status & IB_SMP_INVALID_FIELD)
+			break;
 		if (smp->status & ~IB_SMP_DIRECTION) {
 			set_aggr_error(agg);
 			return reply((struct ib_mad_hdr *)smp);
@@ -3887,7 +4220,9 @@ static int subn_set_opa_aggregate(struct opa_smp *smp,
 		}
 
 		(void)subn_set_opa_sma(agg->attr_id, smp, am, agg->data,
-					ibdev, port, NULL);
+				       ibdev, port, NULL, (u32)agg_data_len);
+		if (smp->status & IB_SMP_INVALID_FIELD)
+			break;
 		if (smp->status & ~IB_SMP_DIRECTION) {
 			set_aggr_error(agg);
 			return reply((struct ib_mad_hdr *)smp);
@@ -3958,7 +4293,7 @@ static int opa_local_smp_check(struct hfi1_ibport *ibp,
 			       const struct ib_wc *in_wc)
 {
 	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
-	u16 slid = in_wc->slid;
+	u16 slid = ib_lid_cpu16(in_wc->slid);
 	u16 pkey;
 
 	if (in_wc->pkey_index >= ARRAY_SIZE(ppd->pkeys))
@@ -3997,12 +4332,13 @@ static int process_subn_opa(struct ib_device *ibdev, int mad_flags,
 	struct opa_smp *smp = (struct opa_smp *)out_mad;
 	struct hfi1_ibport *ibp = to_iport(ibdev, port);
 	u8 *data;
-	u32 am;
+	u32 am, data_size;
 	__be16 attr_id;
 	int ret;
 
 	*out_mad = *in_mad;
 	data = opa_get_smp_data(smp);
+	data_size = (u32)opa_get_smp_data_size(smp);
 
 	am = be32_to_cpu(smp->attr_mod);
 	attr_id = smp->attr_id;
@@ -4046,7 +4382,8 @@ static int process_subn_opa(struct ib_device *ibdev, int mad_flags,
 		default:
 			clear_opa_smp_data(smp);
 			ret = subn_get_opa_sma(attr_id, smp, am, data,
-					       ibdev, port, resp_len);
+					       ibdev, port, resp_len,
+					       data_size);
 			break;
 		case OPA_ATTRIB_ID_AGGREGATE:
 			ret = subn_get_opa_aggregate(smp, ibdev, port,
@@ -4058,7 +4395,8 @@ static int process_subn_opa(struct ib_device *ibdev, int mad_flags,
 		switch (attr_id) {
 		default:
 			ret = subn_set_opa_sma(attr_id, smp, am, data,
-					       ibdev, port, resp_len);
+					       ibdev, port, resp_len,
+					       data_size);
 			break;
 		case OPA_ATTRIB_ID_AGGREGATE:
 			ret = subn_set_opa_aggregate(smp, ibdev, port,
@@ -4077,6 +4415,11 @@ static int process_subn_opa(struct ib_device *ibdev, int mad_flags,
 		 */
 		ret = IB_MAD_RESULT_SUCCESS;
 		break;
+	case IB_MGMT_METHOD_TRAP_REPRESS:
+		subn_handle_opa_trap_repress(ibp, smp);
+		/* Always successful */
+		ret = IB_MAD_RESULT_SUCCESS;
+		break;
 	default:
 		smp->status |= IB_SMP_UNSUP_METHOD;
 		ret = reply((struct ib_mad_hdr *)smp);
diff --git a/drivers/infiniband/hw/hfi1/mad.h b/drivers/infiniband/hw/hfi1/mad.h
index 5aa3fd1..4c12450 100644
--- a/drivers/infiniband/hw/hfi1/mad.h
+++ b/drivers/infiniband/hw/hfi1/mad.h
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -115,7 +115,7 @@ struct opa_mad_notice_attr {
 			__be32	lid;		/* LID where change occurred */
 			__be32	new_cap_mask;	/* new capability mask */
 			__be16	reserved2;
-			__be16	cap_mask;
+			__be16	cap_mask3;
 			__be16	change_flags;	/* low 4 bits only */
 		} __packed ntc_144;
 
@@ -428,5 +428,6 @@ struct sc2vlnt {
 		    COUNTER_MASK(1, 4))
 
 void hfi1_event_pkey_change(struct hfi1_devdata *dd, u8 port);
+void hfi1_handle_trap_timer(unsigned long data);
 
 #endif				/* _HFI1_MAD_H */
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index ccbf52c..2f0d285 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -67,8 +67,6 @@ struct mmu_rb_handler {
 
 static unsigned long mmu_node_start(struct mmu_rb_node *);
 static unsigned long mmu_node_last(struct mmu_rb_node *);
-static inline void mmu_notifier_page(struct mmu_notifier *, struct mm_struct *,
-				     unsigned long);
 static inline void mmu_notifier_range_start(struct mmu_notifier *,
 					    struct mm_struct *,
 					    unsigned long, unsigned long);
@@ -82,7 +80,6 @@ static void do_remove(struct mmu_rb_handler *handler,
 static void handle_remove(struct work_struct *work);
 
 static const struct mmu_notifier_ops mn_opts = {
-	.invalidate_page = mmu_notifier_page,
 	.invalidate_range_start = mmu_notifier_range_start,
 };
 
@@ -172,9 +169,8 @@ int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
 	unsigned long flags;
 	int ret = 0;
 
+	trace_hfi1_mmu_rb_insert(mnode->addr, mnode->len);
 	spin_lock_irqsave(&handler->lock, flags);
-	hfi1_cdbg(MMU, "Inserting node addr 0x%llx, len %u", mnode->addr,
-		  mnode->len);
 	node = __mmu_rb_search(handler, mnode->addr, mnode->len);
 	if (node) {
 		ret = -EINVAL;
@@ -200,7 +196,7 @@ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
 {
 	struct mmu_rb_node *node = NULL;
 
-	hfi1_cdbg(MMU, "Searching for addr 0x%llx, len %u", addr, len);
+	trace_hfi1_mmu_rb_search(addr, len);
 	if (!handler->ops->filter) {
 		node = __mmu_int_rb_iter_first(&handler->root, addr,
 					       (addr + len) - 1);
@@ -217,21 +213,27 @@ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
 	return node;
 }
 
-struct mmu_rb_node *hfi1_mmu_rb_extract(struct mmu_rb_handler *handler,
-					unsigned long addr, unsigned long len)
+bool hfi1_mmu_rb_remove_unless_exact(struct mmu_rb_handler *handler,
+				     unsigned long addr, unsigned long len,
+				     struct mmu_rb_node **rb_node)
 {
 	struct mmu_rb_node *node;
 	unsigned long flags;
+	bool ret = false;
 
 	spin_lock_irqsave(&handler->lock, flags);
 	node = __mmu_rb_search(handler, addr, len);
 	if (node) {
+		if (node->addr == addr && node->len == len)
+			goto unlock;
 		__mmu_int_rb_remove(node, &handler->root);
 		list_del(&node->list); /* remove from LRU list */
+		ret = true;
 	}
+unlock:
 	spin_unlock_irqrestore(&handler->lock, flags);
-
-	return node;
+	*rb_node = node;
+	return ret;
 }
 
 void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
@@ -275,8 +277,7 @@ void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 	unsigned long flags;
 
 	/* Validity of handler and node pointers has been checked by caller. */
-	hfi1_cdbg(MMU, "Removing node addr 0x%llx, len %u", node->addr,
-		  node->len);
+	trace_hfi1_mmu_rb_remove(node->addr, node->len);
 	spin_lock_irqsave(&handler->lock, flags);
 	__mmu_int_rb_remove(node, &handler->root);
 	list_del(&node->list); /* remove from LRU list */
@@ -285,12 +286,6 @@ void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 	handler->ops->remove(handler->ops_arg, node);
 }
 
-static inline void mmu_notifier_page(struct mmu_notifier *mn,
-				     struct mm_struct *mm, unsigned long addr)
-{
-	mmu_notifier_mem_invalidate(mn, mm, addr, addr + PAGE_SIZE);
-}
-
 static inline void mmu_notifier_range_start(struct mmu_notifier *mn,
 					    struct mm_struct *mm,
 					    unsigned long start,
@@ -315,8 +310,7 @@ static void mmu_notifier_mem_invalidate(struct mmu_notifier *mn,
 	     node; node = ptr) {
 		/* Guard against node removal. */
 		ptr = __mmu_int_rb_iter_next(node, start, end - 1);
-		hfi1_cdbg(MMU, "Invalidating node addr 0x%llx, len %u",
-			  node->addr, node->len);
+		trace_hfi1_mmu_mem_invalidate(node->addr, node->len);
 		if (handler->ops->invalidate(handler->ops_arg, node)) {
 			__mmu_int_rb_remove(node, root);
 			/* move from LRU list to delete list */
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
index 754f6eb..f04cec1 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.h
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.h
@@ -81,7 +81,8 @@ int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
 void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg);
 void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 			struct mmu_rb_node *mnode);
-struct mmu_rb_node *hfi1_mmu_rb_extract(struct mmu_rb_handler *handler,
-					unsigned long addr, unsigned long len);
+bool hfi1_mmu_rb_remove_unless_exact(struct mmu_rb_handler *handler,
+				     unsigned long addr, unsigned long len,
+				     struct mmu_rb_node **rb_node);
 
 #endif /* _HFI1_MMU_RB_H */
diff --git a/drivers/infiniband/hw/hfi1/opa_compat.h b/drivers/infiniband/hw/hfi1/opa_compat.h
index 6ef3c1c..774215b 100644
--- a/drivers/infiniband/hw/hfi1/opa_compat.h
+++ b/drivers/infiniband/hw/hfi1/opa_compat.h
@@ -84,7 +84,8 @@ static inline u8 port_states_to_phys_state(struct opa_port_states *ps)
 /*
  * OPA port physical states
  * IB Volume 1, Table 146 PortInfo/IB Volume 2 Section 5.4.2(1) PortPhysState
- * values.
+ * values are the same in OmniPath Architecture. OPA leverages some of the same
+ * concepts as InfiniBand, but has a few other states as well.
  *
  * When writing, only values 0-3 are valid, other values are ignored.
  * When reading, 0 is reserved.
@@ -92,6 +93,8 @@ static inline u8 port_states_to_phys_state(struct opa_port_states *ps)
  * Returned by the ibphys_portstate() routine.
  */
 enum opa_port_phys_state {
+	/* Values 0-7 have the same meaning in OPA as in InfiniBand. */
+
 	IB_PORTPHYSSTATE_NOP = 0,
 	/* 1 is reserved */
 	IB_PORTPHYSSTATE_POLLING = 2,
@@ -101,9 +104,23 @@ enum opa_port_phys_state {
 	IB_PORTPHYSSTATE_LINK_ERROR_RECOVERY = 6,
 	IB_PORTPHYSSTATE_PHY_TEST = 7,
 	/* 8 is reserved */
+
+	/*
+	 * Offline: Port is quiet (transmitters disabled) due to lack of
+	 * physical media, unsupported media, or transition between link up
+	 * and next link up attempt
+	 */
 	OPA_PORTPHYSSTATE_OFFLINE = 9,
-	OPA_PORTPHYSSTATE_GANGED = 10,
+
+	/* 10 is reserved */
+
+	/*
+	 * Phy_Test: Specific test patterns are transmitted, and receiver BER
+	 * can be monitored. This facilitates signal integrity testing for the
+	 * physical layer of the port.
+	 */
 	OPA_PORTPHYSSTATE_TEST = 11,
+
 	OPA_PORTPHYSSTATE_MAX = 11,
 	/* values 12-15 are reserved/ignored */
 };
diff --git a/drivers/infiniband/hw/hfi1/pcie.c b/drivers/infiniband/hw/hfi1/pcie.c
index 6a9f6f9..82447b7 100644
--- a/drivers/infiniband/hw/hfi1/pcie.c
+++ b/drivers/infiniband/hw/hfi1/pcie.c
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -68,7 +68,7 @@
 /*
  * Code to adjust PCIe capabilities.
  */
-static void tune_pcie_caps(struct hfi1_devdata *);
+static int tune_pcie_caps(struct hfi1_devdata *);
 
 /*
  * Do all the common PCIe setup and initialization.
@@ -161,6 +161,7 @@ int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev)
 {
 	unsigned long len;
 	resource_size_t addr;
+	int ret = 0;
 
 	dd->pcidev = pdev;
 	pci_set_drvdata(pdev, dd);
@@ -179,47 +180,54 @@ int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev)
 		return -EINVAL;
 	}
 
-	dd->kregbase = ioremap_nocache(addr, TXE_PIO_SEND);
-	if (!dd->kregbase)
+	dd->kregbase1 = ioremap_nocache(addr, RCV_ARRAY);
+	if (!dd->kregbase1) {
+		dd_dev_err(dd, "UC mapping of kregbase1 failed\n");
 		return -ENOMEM;
+	}
+	dd_dev_info(dd, "UC base1: %p for %x\n", dd->kregbase1, RCV_ARRAY);
+	dd->chip_rcv_array_count = readq(dd->kregbase1 + RCV_ARRAY_CNT);
+	dd_dev_info(dd, "RcvArray count: %u\n", dd->chip_rcv_array_count);
+	dd->base2_start = RCV_ARRAY + dd->chip_rcv_array_count * 8;
+
+	dd->kregbase2 = ioremap_nocache(
+		addr + dd->base2_start,
+		TXE_PIO_SEND - dd->base2_start);
+	if (!dd->kregbase2) {
+		dd_dev_err(dd, "UC mapping of kregbase2 failed\n");
+		goto nomem;
+	}
+	dd_dev_info(dd, "UC base2: %p for %x\n", dd->kregbase2,
+		    TXE_PIO_SEND - dd->base2_start);
 
 	dd->piobase = ioremap_wc(addr + TXE_PIO_SEND, TXE_PIO_SIZE);
 	if (!dd->piobase) {
-		iounmap(dd->kregbase);
-		return -ENOMEM;
+		dd_dev_err(dd, "WC mapping of send buffers failed\n");
+		goto nomem;
 	}
+	dd_dev_info(dd, "WC piobase: %p for %x\n", dd->piobase, TXE_PIO_SIZE);
 
-	dd->flags |= HFI1_PRESENT;	/* now register routines work */
-
-	dd->kregend = dd->kregbase + TXE_PIO_SEND;
 	dd->physaddr = addr;        /* used for io_remap, etc. */
 
 	/*
-	 * Re-map the chip's RcvArray as write-combining to allow us
+	 * Map the chip's RcvArray as write-combining to allow us
 	 * to write an entire cacheline worth of entries in one shot.
-	 * If this re-map fails, just continue - the RcvArray programming
-	 * function will handle both cases.
 	 */
-	dd->chip_rcv_array_count = read_csr(dd, RCV_ARRAY_CNT);
 	dd->rcvarray_wc = ioremap_wc(addr + RCV_ARRAY,
 				     dd->chip_rcv_array_count * 8);
-	dd_dev_info(dd, "WC Remapped RcvArray: %p\n", dd->rcvarray_wc);
-	/*
-	 * Save BARs and command to rewrite after device reset.
-	 */
-	pci_read_config_dword(dd->pcidev, PCI_BASE_ADDRESS_0, &dd->pcibar0);
-	pci_read_config_dword(dd->pcidev, PCI_BASE_ADDRESS_1, &dd->pcibar1);
-	pci_read_config_dword(dd->pcidev, PCI_ROM_ADDRESS, &dd->pci_rom);
-	pci_read_config_word(dd->pcidev, PCI_COMMAND, &dd->pci_command);
-	pcie_capability_read_word(dd->pcidev, PCI_EXP_DEVCTL, &dd->pcie_devctl);
-	pcie_capability_read_word(dd->pcidev, PCI_EXP_LNKCTL, &dd->pcie_lnkctl);
-	pcie_capability_read_word(dd->pcidev, PCI_EXP_DEVCTL2,
-				  &dd->pcie_devctl2);
-	pci_read_config_dword(dd->pcidev, PCI_CFG_MSIX0, &dd->pci_msix0);
-	pci_read_config_dword(dd->pcidev, PCIE_CFG_SPCIE1, &dd->pci_lnkctl3);
-	pci_read_config_dword(dd->pcidev, PCIE_CFG_TPH2, &dd->pci_tph2);
+	if (!dd->rcvarray_wc) {
+		dd_dev_err(dd, "WC mapping of receive array failed\n");
+		goto nomem;
+	}
+	dd_dev_info(dd, "WC RcvArray: %p for %x\n",
+		    dd->rcvarray_wc, dd->chip_rcv_array_count * 8);
 
+	dd->flags |= HFI1_PRESENT;	/* chip.c CSR routines now work */
 	return 0;
+nomem:
+	ret = -ENOMEM;
+	hfi1_pcie_ddcleanup(dd);
+	return ret;
 }
 
 /*
@@ -229,59 +237,19 @@ int hfi1_pcie_ddinit(struct hfi1_devdata *dd, struct pci_dev *pdev)
  */
 void hfi1_pcie_ddcleanup(struct hfi1_devdata *dd)
 {
-	u64 __iomem *base = (void __iomem *)dd->kregbase;
-
 	dd->flags &= ~HFI1_PRESENT;
-	dd->kregbase = NULL;
-	iounmap(base);
+	if (dd->kregbase1)
+		iounmap(dd->kregbase1);
+	dd->kregbase1 = NULL;
+	if (dd->kregbase2)
+		iounmap(dd->kregbase2);
+	dd->kregbase2 = NULL;
 	if (dd->rcvarray_wc)
 		iounmap(dd->rcvarray_wc);
+	dd->rcvarray_wc = NULL;
 	if (dd->piobase)
 		iounmap(dd->piobase);
-}
-
-static void msix_setup(struct hfi1_devdata *dd, int pos, u32 *msixcnt,
-		       struct hfi1_msix_entry *hfi1_msix_entry)
-{
-	int ret;
-	int nvec = *msixcnt;
-	struct msix_entry *msix_entry;
-	int i;
-
-	/*
-	 * We can't pass hfi1_msix_entry array to msix_setup
-	 * so use a dummy msix_entry array and copy the allocated
-	 * irq back to the hfi1_msix_entry array.
-	 */
-	msix_entry = kmalloc_array(nvec, sizeof(*msix_entry), GFP_KERNEL);
-	if (!msix_entry) {
-		ret = -ENOMEM;
-		goto do_intx;
-	}
-
-	for (i = 0; i < nvec; i++)
-		msix_entry[i] = hfi1_msix_entry[i].msix;
-
-	ret = pci_enable_msix_range(dd->pcidev, msix_entry, 1, nvec);
-	if (ret < 0)
-		goto free_msix_entry;
-	nvec = ret;
-
-	for (i = 0; i < nvec; i++)
-		hfi1_msix_entry[i].msix = msix_entry[i];
-
-	kfree(msix_entry);
-	*msixcnt = nvec;
-	return;
-
-free_msix_entry:
-	kfree(msix_entry);
-
-do_intx:
-	dd_dev_err(dd, "pci_enable_msix_range %d vectors failed: %d, falling back to INTx\n",
-		   nvec, ret);
-	*msixcnt = 0;
-	hfi1_enable_intx(dd->pcidev);
+	dd->piobase = NULL;
 }
 
 /* return the PCIe link speed from the given link status */
@@ -314,8 +282,14 @@ static u32 extract_width(u16 linkstat)
 static void update_lbus_info(struct hfi1_devdata *dd)
 {
 	u16 linkstat;
+	int ret;
 
-	pcie_capability_read_word(dd->pcidev, PCI_EXP_LNKSTA, &linkstat);
+	ret = pcie_capability_read_word(dd->pcidev, PCI_EXP_LNKSTA, &linkstat);
+	if (ret) {
+		dd_dev_err(dd, "Unable to read from PCI config\n");
+		return;
+	}
+
 	dd->lbus_width = extract_width(linkstat);
 	dd->lbus_speed = extract_speed(linkstat);
 	snprintf(dd->lbus_info, sizeof(dd->lbus_info),
@@ -330,6 +304,7 @@ int pcie_speeds(struct hfi1_devdata *dd)
 {
 	u32 linkcap;
 	struct pci_dev *parent = dd->pcidev->bus->self;
+	int ret;
 
 	if (!pci_is_pcie(dd->pcidev)) {
 		dd_dev_err(dd, "Can't find PCI Express capability!\n");
@@ -339,7 +314,12 @@ int pcie_speeds(struct hfi1_devdata *dd)
 	/* find if our max speed is Gen3 and parent supports Gen3 speeds */
 	dd->link_gen3_capable = 1;
 
-	pcie_capability_read_dword(dd->pcidev, PCI_EXP_LNKCAP, &linkcap);
+	ret = pcie_capability_read_dword(dd->pcidev, PCI_EXP_LNKCAP, &linkcap);
+	if (ret) {
+		dd_dev_err(dd, "Unable to read from PCI config\n");
+		return ret;
+	}
+
 	if ((linkcap & PCI_EXP_LNKCAP_SLS) != GEN3_SPEED_VECTOR) {
 		dd_dev_info(dd,
 			    "This HFI is not Gen3 capable, max speed 0x%x, need 0x3\n",
@@ -364,49 +344,150 @@ int pcie_speeds(struct hfi1_devdata *dd)
 }
 
 /*
- * Returns in *nent:
- *	- actual number of interrupts allocated
+ * Returns:
+ *	- actual number of interrupts allocated or
  *	- 0 if fell back to INTx.
+ *	- an error code on failure
  */
-void request_msix(struct hfi1_devdata *dd, u32 *nent,
-		  struct hfi1_msix_entry *entry)
+int request_msix(struct hfi1_devdata *dd, u32 msireq)
 {
-	int pos;
+	int nvec, ret;
 
-	pos = dd->pcidev->msix_cap;
-	if (*nent && pos) {
-		msix_setup(dd, pos, nent, entry);
-		/* did it, either MSI-X or INTx */
-	} else {
-		*nent = 0;
-		hfi1_enable_intx(dd->pcidev);
+	nvec = pci_alloc_irq_vectors(dd->pcidev, 1, msireq,
+				     PCI_IRQ_MSIX | PCI_IRQ_LEGACY);
+	if (nvec < 0) {
+		dd_dev_err(dd, "pci_alloc_irq_vectors() failed: %d\n", nvec);
+		return nvec;
 	}
 
-	tune_pcie_caps(dd);
-}
+	ret = tune_pcie_caps(dd);
+	if (ret) {
+		dd_dev_err(dd, "tune_pcie_caps() failed: %d\n", ret);
+		pci_free_irq_vectors(dd->pcidev);
+		return ret;
+	}
 
-void hfi1_enable_intx(struct pci_dev *pdev)
-{
-	/* first, turn on INTx */
-	pci_intx(pdev, 1);
-	/* then turn off MSI-X */
-	pci_disable_msix(pdev);
+	/* check for legacy IRQ */
+	if (nvec == 1 && !dd->pcidev->msix_enabled)
+		return 0;
+
+	return nvec;
 }
 
 /* restore command and BARs after a reset has wiped them out */
-void restore_pci_variables(struct hfi1_devdata *dd)
+int restore_pci_variables(struct hfi1_devdata *dd)
 {
-	pci_write_config_word(dd->pcidev, PCI_COMMAND, dd->pci_command);
-	pci_write_config_dword(dd->pcidev, PCI_BASE_ADDRESS_0, dd->pcibar0);
-	pci_write_config_dword(dd->pcidev, PCI_BASE_ADDRESS_1, dd->pcibar1);
-	pci_write_config_dword(dd->pcidev, PCI_ROM_ADDRESS, dd->pci_rom);
-	pcie_capability_write_word(dd->pcidev, PCI_EXP_DEVCTL, dd->pcie_devctl);
-	pcie_capability_write_word(dd->pcidev, PCI_EXP_LNKCTL, dd->pcie_lnkctl);
-	pcie_capability_write_word(dd->pcidev, PCI_EXP_DEVCTL2,
-				   dd->pcie_devctl2);
-	pci_write_config_dword(dd->pcidev, PCI_CFG_MSIX0, dd->pci_msix0);
-	pci_write_config_dword(dd->pcidev, PCIE_CFG_SPCIE1, dd->pci_lnkctl3);
-	pci_write_config_dword(dd->pcidev, PCIE_CFG_TPH2, dd->pci_tph2);
+	int ret = 0;
+
+	ret = pci_write_config_word(dd->pcidev, PCI_COMMAND, dd->pci_command);
+	if (ret)
+		goto error;
+
+	ret = pci_write_config_dword(dd->pcidev, PCI_BASE_ADDRESS_0,
+				     dd->pcibar0);
+	if (ret)
+		goto error;
+
+	ret = pci_write_config_dword(dd->pcidev, PCI_BASE_ADDRESS_1,
+				     dd->pcibar1);
+	if (ret)
+		goto error;
+
+	ret = pci_write_config_dword(dd->pcidev, PCI_ROM_ADDRESS, dd->pci_rom);
+	if (ret)
+		goto error;
+
+	ret = pcie_capability_write_word(dd->pcidev, PCI_EXP_DEVCTL,
+					 dd->pcie_devctl);
+	if (ret)
+		goto error;
+
+	ret = pcie_capability_write_word(dd->pcidev, PCI_EXP_LNKCTL,
+					 dd->pcie_lnkctl);
+	if (ret)
+		goto error;
+
+	ret = pcie_capability_write_word(dd->pcidev, PCI_EXP_DEVCTL2,
+					 dd->pcie_devctl2);
+	if (ret)
+		goto error;
+
+	ret = pci_write_config_dword(dd->pcidev, PCI_CFG_MSIX0, dd->pci_msix0);
+	if (ret)
+		goto error;
+
+	ret = pci_write_config_dword(dd->pcidev, PCIE_CFG_SPCIE1,
+				     dd->pci_lnkctl3);
+	if (ret)
+		goto error;
+
+	ret = pci_write_config_dword(dd->pcidev, PCIE_CFG_TPH2, dd->pci_tph2);
+	if (ret)
+		goto error;
+
+	return 0;
+
+error:
+	dd_dev_err(dd, "Unable to write to PCI config\n");
+	return ret;
+}
+
+/* Save BARs and command to rewrite after device reset */
+int save_pci_variables(struct hfi1_devdata *dd)
+{
+	int ret = 0;
+
+	ret = pci_read_config_dword(dd->pcidev, PCI_BASE_ADDRESS_0,
+				    &dd->pcibar0);
+	if (ret)
+		goto error;
+
+	ret = pci_read_config_dword(dd->pcidev, PCI_BASE_ADDRESS_1,
+				    &dd->pcibar1);
+	if (ret)
+		goto error;
+
+	ret = pci_read_config_dword(dd->pcidev, PCI_ROM_ADDRESS, &dd->pci_rom);
+	if (ret)
+		goto error;
+
+	ret = pci_read_config_word(dd->pcidev, PCI_COMMAND, &dd->pci_command);
+	if (ret)
+		goto error;
+
+	ret = pcie_capability_read_word(dd->pcidev, PCI_EXP_DEVCTL,
+					&dd->pcie_devctl);
+	if (ret)
+		goto error;
+
+	ret = pcie_capability_read_word(dd->pcidev, PCI_EXP_LNKCTL,
+					&dd->pcie_lnkctl);
+	if (ret)
+		goto error;
+
+	ret = pcie_capability_read_word(dd->pcidev, PCI_EXP_DEVCTL2,
+					&dd->pcie_devctl2);
+	if (ret)
+		goto error;
+
+	ret = pci_read_config_dword(dd->pcidev, PCI_CFG_MSIX0, &dd->pci_msix0);
+	if (ret)
+		goto error;
+
+	ret = pci_read_config_dword(dd->pcidev, PCIE_CFG_SPCIE1,
+				    &dd->pci_lnkctl3);
+	if (ret)
+		goto error;
+
+	ret = pci_read_config_dword(dd->pcidev, PCIE_CFG_TPH2, &dd->pci_tph2);
+	if (ret)
+		goto error;
+
+	return 0;
+
+error:
+	dd_dev_err(dd, "Unable to read from PCI config\n");
+	return ret;
 }
 
 /*
@@ -421,21 +502,33 @@ uint aspm_mode = ASPM_MODE_DISABLED;
 module_param_named(aspm, aspm_mode, uint, S_IRUGO);
 MODULE_PARM_DESC(aspm, "PCIe ASPM: 0: disable, 1: enable, 2: dynamic");
 
-static void tune_pcie_caps(struct hfi1_devdata *dd)
+static int tune_pcie_caps(struct hfi1_devdata *dd)
 {
 	struct pci_dev *parent;
 	u16 rc_mpss, rc_mps, ep_mpss, ep_mps;
 	u16 rc_mrrs, ep_mrrs, max_mrrs, ectl;
+	int ret;
 
 	/*
 	 * Turn on extended tags in DevCtl in case the BIOS has turned it off
 	 * to improve WFR SDMA bandwidth
 	 */
-	pcie_capability_read_word(dd->pcidev, PCI_EXP_DEVCTL, &ectl);
+	ret = pcie_capability_read_word(dd->pcidev,
+					PCI_EXP_DEVCTL, &ectl);
+	if (ret) {
+		dd_dev_err(dd, "Unable to read from PCI config\n");
+		return ret;
+	}
+
 	if (!(ectl & PCI_EXP_DEVCTL_EXT_TAG)) {
 		dd_dev_info(dd, "Enabling PCIe extended tags\n");
 		ectl |= PCI_EXP_DEVCTL_EXT_TAG;
-		pcie_capability_write_word(dd->pcidev, PCI_EXP_DEVCTL, ectl);
+		ret = pcie_capability_write_word(dd->pcidev,
+						 PCI_EXP_DEVCTL, ectl);
+		if (ret) {
+			dd_dev_err(dd, "Unable to write to PCI config\n");
+			return ret;
+		}
 	}
 	/* Find out supported and configured values for parent (root) */
 	parent = dd->pcidev->bus->self;
@@ -444,14 +537,14 @@ static void tune_pcie_caps(struct hfi1_devdata *dd)
 	 * access to the upstream component.
 	 */
 	if (!parent)
-		return;
+		return -EINVAL;
 	if (!pci_is_root_bus(parent->bus)) {
 		dd_dev_info(dd, "Parent not root\n");
-		return;
+		return -EINVAL;
 	}
 
 	if (!pci_is_pcie(parent) || !pci_is_pcie(dd->pcidev))
-		return;
+		return -EINVAL;
 	rc_mpss = parent->pcie_mpss;
 	rc_mps = ffs(pcie_get_mps(parent)) - 8;
 	/* Find out supported and configured values for endpoint (us) */
@@ -497,6 +590,8 @@ static void tune_pcie_caps(struct hfi1_devdata *dd)
 		ep_mrrs = max_mrrs;
 		pcie_set_readrq(dd->pcidev, ep_mrrs);
 	}
+
+	return 0;
 }
 
 /* End of PCIe capability tuning */
@@ -728,6 +823,7 @@ static int load_eq_table(struct hfi1_devdata *dd, const u8 eq[11][3], u8 fs,
 	u32 violation;
 	u32 i;
 	u8 c_minus1, c0, c_plus1;
+	int ret;
 
 	for (i = 0; i < 11; i++) {
 		/* set index */
@@ -739,8 +835,14 @@ static int load_eq_table(struct hfi1_devdata *dd, const u8 eq[11][3], u8 fs,
 		pci_write_config_dword(pdev, PCIE_CFG_REG_PL102,
 				       eq_value(c_minus1, c0, c_plus1));
 		/* check if these coefficients violate EQ rules */
-		pci_read_config_dword(dd->pcidev, PCIE_CFG_REG_PL105,
-				      &violation);
+		ret = pci_read_config_dword(dd->pcidev,
+					    PCIE_CFG_REG_PL105, &violation);
+		if (ret) {
+			dd_dev_err(dd, "Unable to read from PCI config\n");
+			hit_error = 1;
+			break;
+		}
+
 		if (violation
 		    & PCIE_CFG_REG_PL105_GEN3_EQ_VIOLATE_COEF_RULES_SMASK){
 			if (hit_error == 0) {
@@ -1194,7 +1296,13 @@ int do_pcie_gen3_transition(struct hfi1_devdata *dd)
 	 * that it is Gen3 capable earlier.
 	 */
 	dd_dev_info(dd, "%s: setting parent target link speed\n", __func__);
-	pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
+	ret = pcie_capability_read_word(parent, PCI_EXP_LNKCTL2, &lnkctl2);
+	if (ret) {
+		dd_dev_err(dd, "Unable to read from PCI config\n");
+		return_error = 1;
+		goto done;
+	}
+
 	dd_dev_info(dd, "%s: ..old link control2: 0x%x\n", __func__,
 		    (u32)lnkctl2);
 	/* only write to parent if target is not as high as ours */
@@ -1203,20 +1311,37 @@ int do_pcie_gen3_transition(struct hfi1_devdata *dd)
 		lnkctl2 |= target_vector;
 		dd_dev_info(dd, "%s: ..new link control2: 0x%x\n", __func__,
 			    (u32)lnkctl2);
-		pcie_capability_write_word(parent, PCI_EXP_LNKCTL2, lnkctl2);
+		ret = pcie_capability_write_word(parent,
+						 PCI_EXP_LNKCTL2, lnkctl2);
+		if (ret) {
+			dd_dev_err(dd, "Unable to write to PCI config\n");
+			return_error = 1;
+			goto done;
+		}
 	} else {
 		dd_dev_info(dd, "%s: ..target speed is OK\n", __func__);
 	}
 
 	dd_dev_info(dd, "%s: setting target link speed\n", __func__);
-	pcie_capability_read_word(dd->pcidev, PCI_EXP_LNKCTL2, &lnkctl2);
+	ret = pcie_capability_read_word(dd->pcidev, PCI_EXP_LNKCTL2, &lnkctl2);
+	if (ret) {
+		dd_dev_err(dd, "Unable to read from PCI config\n");
+		return_error = 1;
+		goto done;
+	}
+
 	dd_dev_info(dd, "%s: ..old link control2: 0x%x\n", __func__,
 		    (u32)lnkctl2);
 	lnkctl2 &= ~LNKCTL2_TARGET_LINK_SPEED_MASK;
 	lnkctl2 |= target_vector;
 	dd_dev_info(dd, "%s: ..new link control2: 0x%x\n", __func__,
 		    (u32)lnkctl2);
-	pcie_capability_write_word(dd->pcidev, PCI_EXP_LNKCTL2, lnkctl2);
+	ret = pcie_capability_write_word(dd->pcidev, PCI_EXP_LNKCTL2, lnkctl2);
+	if (ret) {
+		dd_dev_err(dd, "Unable to write to PCI config\n");
+		return_error = 1;
+		goto done;
+	}
 
 	/* step 5h: arm gasket logic */
 	/* hold DC in reset across the SBR */
@@ -1266,7 +1391,14 @@ int do_pcie_gen3_transition(struct hfi1_devdata *dd)
 
 	/* restore PCI space registers we know were reset */
 	dd_dev_info(dd, "%s: calling restore_pci_variables\n", __func__);
-	restore_pci_variables(dd);
+	ret = restore_pci_variables(dd);
+	if (ret) {
+		dd_dev_err(dd, "%s: Could not restore PCI variables\n",
+			   __func__);
+		return_error = 1;
+		goto done;
+	}
+
 	/* restore firmware control */
 	write_csr(dd, MISC_CFG_FW_CTRL, fw_ctrl);
 
@@ -1296,7 +1428,13 @@ int do_pcie_gen3_transition(struct hfi1_devdata *dd)
 	setextled(dd, 0);
 
 	/* check for any per-lane errors */
-	pci_read_config_dword(dd->pcidev, PCIE_CFG_SPCIE2, &reg32);
+	ret = pci_read_config_dword(dd->pcidev, PCIE_CFG_SPCIE2, &reg32);
+	if (ret) {
+		dd_dev_err(dd, "Unable to read from PCI config\n");
+		return_error = 1;
+		goto done;
+	}
+
 	dd_dev_info(dd, "%s: per-lane errors: 0x%x\n", __func__, reg32);
 
 	/* extract status, look for our HFI */
diff --git a/drivers/infiniband/hw/hfi1/pio.c b/drivers/infiniband/hw/hfi1/pio.c
index ed72b5a..7108a4b 100644
--- a/drivers/infiniband/hw/hfi1/pio.c
+++ b/drivers/infiniband/hw/hfi1/pio.c
@@ -1012,7 +1012,7 @@ static void sc_wait_for_packet_egress(struct send_context *sc, int pause)
 				   "%s: context %u(%u) timeout waiting for packets to egress, remaining count %u, bouncing link\n",
 				   __func__, sc->sw_index,
 				   sc->hw_context, (u32)reg);
-			queue_work(dd->pport->hfi1_wq,
+			queue_work(dd->pport->link_wq,
 				   &dd->pport->link_bounce_work);
 			break;
 		}
@@ -1568,7 +1568,8 @@ static void sc_piobufavail(struct send_context *sc)
 	struct rvt_qp *qp;
 	struct hfi1_qp_priv *priv;
 	unsigned long flags;
-	unsigned i, n = 0;
+	uint i, n = 0, max_idx = 0;
+	u8 max_starved_cnt = 0;
 
 	if (dd->send_contexts[sc->sw_index].type != SC_KERNEL &&
 	    dd->send_contexts[sc->sw_index].type != SC_VL15)
@@ -1591,6 +1592,7 @@ static void sc_piobufavail(struct send_context *sc)
 		priv = qp->priv;
 		list_del_init(&priv->s_iowait.list);
 		priv->s_iowait.lock = NULL;
+		iowait_starve_find_max(wait, &max_starved_cnt, n, &max_idx);
 		/* refcount held until actual wake up */
 		qps[n++] = qp;
 	}
@@ -1605,9 +1607,14 @@ static void sc_piobufavail(struct send_context *sc)
 	}
 	write_sequnlock_irqrestore(&dev->iowait_lock, flags);
 
-	for (i = 0; i < n; i++)
-		hfi1_qp_wakeup(qps[i],
+	/* Wake up the most starved one first */
+	if (n)
+		hfi1_qp_wakeup(qps[max_idx],
 			       RVT_S_WAIT_PIO | RVT_S_WAIT_PIO_DRAIN);
+	for (i = 0; i < n; i++)
+		if (i != max_idx)
+			hfi1_qp_wakeup(qps[i],
+				       RVT_S_WAIT_PIO | RVT_S_WAIT_PIO_DRAIN);
 }
 
 /* translate a send credit update to a bit code of reasons */
diff --git a/drivers/infiniband/hw/hfi1/platform.c b/drivers/infiniband/hw/hfi1/platform.c
index 838fe84..a8af96d 100644
--- a/drivers/infiniband/hw/hfi1/platform.c
+++ b/drivers/infiniband/hw/hfi1/platform.c
@@ -45,10 +45,14 @@
  *
  */
 
+#include <linux/firmware.h>
+
 #include "hfi.h"
 #include "efivar.h"
 #include "eprom.h"
 
+#define DEFAULT_PLATFORM_CONFIG_NAME "hfi1_platform.dat"
+
 static int validate_scratch_checksum(struct hfi1_devdata *dd)
 {
 	u64 checksum = 0, temp_scratch = 0;
@@ -58,8 +62,13 @@ static int validate_scratch_checksum(struct hfi1_devdata *dd)
 	version = (temp_scratch & BITMAP_VERSION_SMASK) >> BITMAP_VERSION_SHIFT;
 
 	/* Prevent power on default of all zeroes from passing checksum */
-	if (!version)
+	if (!version) {
+		dd_dev_err(dd, "%s: Config bitmap uninitialized\n", __func__);
+		dd_dev_err(dd,
+			   "%s: Please update your BIOS to support active channels\n",
+			   __func__);
 		return 0;
+	}
 
 	/*
 	 * ASIC scratch 0 only contains the checksum and bitmap version as
@@ -84,6 +93,8 @@ static int validate_scratch_checksum(struct hfi1_devdata *dd)
 
 	if (checksum + temp_scratch == 0xFFFF)
 		return 1;
+
+	dd_dev_err(dd, "%s: Configuration bitmap corrupted\n", __func__);
 	return 0;
 }
 
@@ -131,25 +142,22 @@ static void save_platform_config_fields(struct hfi1_devdata *dd)
 
 	ppd->max_power_class = (temp_scratch & QSFP_MAX_POWER_SMASK) >>
 				QSFP_MAX_POWER_SHIFT;
+
+	ppd->config_from_scratch = true;
 }
 
 void get_platform_config(struct hfi1_devdata *dd)
 {
 	int ret = 0;
-	unsigned long size = 0;
 	u8 *temp_platform_config = NULL;
 	u32 esize;
+	const struct firmware *platform_config_file = NULL;
 
 	if (is_integrated(dd)) {
 		if (validate_scratch_checksum(dd)) {
 			save_platform_config_fields(dd);
 			return;
 		}
-		dd_dev_err(dd, "%s: Config bitmap corrupted/uninitialized\n",
-			   __func__);
-		dd_dev_err(dd,
-			   "%s: Please update your BIOS to support active channels\n",
-			   __func__);
 	} else {
 		ret = eprom_read_platform_config(dd,
 						 (void **)&temp_platform_config,
@@ -160,36 +168,37 @@ void get_platform_config(struct hfi1_devdata *dd)
 			dd->platform_config.size = esize;
 			return;
 		}
-		/* fail, try EFI variable */
-
-		ret = read_hfi1_efi_var(dd, "configuration", &size,
-					(void **)&temp_platform_config);
-		if (!ret) {
-			dd->platform_config.data = temp_platform_config;
-			dd->platform_config.size = size;
-			return;
-		}
 	}
 	dd_dev_err(dd,
 		   "%s: Failed to get platform config, falling back to sub-optimal default file\n",
 		   __func__);
-	/* fall back to request firmware */
-	platform_config_load = 1;
+
+	ret = request_firmware(&platform_config_file,
+			       DEFAULT_PLATFORM_CONFIG_NAME,
+			       &dd->pcidev->dev);
+	if (ret) {
+		dd_dev_err(dd,
+			   "%s: No default platform config file found\n",
+			   __func__);
+		return;
+	}
+
+	/*
+	 * Allocate separate memory block to store data and free firmware
+	 * structure. This allows free_platform_config to treat EPROM and
+	 * fallback configs in the same manner.
+	 */
+	dd->platform_config.data = kmemdup(platform_config_file->data,
+					   platform_config_file->size,
+					   GFP_KERNEL);
+	dd->platform_config.size = platform_config_file->size;
+	release_firmware(platform_config_file);
 }
 
 void free_platform_config(struct hfi1_devdata *dd)
 {
-	if (!platform_config_load) {
-		/*
-		 * was loaded from EFI or the EPROM, release memory
-		 * allocated by read_efi_var/eprom_read_platform_config
-		 */
-		kfree(dd->platform_config.data);
-	}
-	/*
-	 * else do nothing, dispose_firmware will release
-	 * struct firmware platform_config on driver exit
-	 */
+	/* Release memory allocated for eprom or fallback file read. */
+	kfree(dd->platform_config.data);
 }
 
 void get_port_type(struct hfi1_pportdata *ppd)
@@ -242,7 +251,7 @@ static int qual_power(struct hfi1_pportdata *ppd)
 
 	if (ppd->offline_disabled_reason ==
 			HFI1_ODR_MASK(OPA_LINKDOWN_REASON_POWER_POLICY)) {
-		dd_dev_info(
+		dd_dev_err(
 			ppd->dd,
 			"%s: Port disabled due to system power restrictions\n",
 			__func__);
@@ -268,7 +277,7 @@ static int qual_bitrate(struct hfi1_pportdata *ppd)
 
 	if (ppd->offline_disabled_reason ==
 			HFI1_ODR_MASK(OPA_LINKDOWN_REASON_LINKSPEED_POLICY)) {
-		dd_dev_info(
+		dd_dev_err(
 			ppd->dd,
 			"%s: Cable failed bitrate check, disabling port\n",
 			__func__);
@@ -709,15 +718,15 @@ static void apply_tunings(
 		ret = load_8051_config(ppd->dd, DC_HOST_COMM_SETTINGS,
 				       GENERAL_CONFIG, config_data);
 		if (ret != HCMD_SUCCESS)
-			dd_dev_info(ppd->dd,
-				    "%s: Failed set ext device config params\n",
-				    __func__);
+			dd_dev_err(ppd->dd,
+				   "%s: Failed set ext device config params\n",
+				   __func__);
 	}
 
 	if (tx_preset_index == OPA_INVALID_INDEX) {
 		if (ppd->port_type == PORT_TYPE_QSFP && limiting_active)
-			dd_dev_info(ppd->dd, "%s: Invalid Tx preset index\n",
-				    __func__);
+			dd_dev_err(ppd->dd, "%s: Invalid Tx preset index\n",
+				   __func__);
 		return;
 	}
 
@@ -900,7 +909,7 @@ static int tune_qsfp(struct hfi1_pportdata *ppd,
 	case 0xD: /* fallthrough */
 	case 0xF:
 	default:
-		dd_dev_info(ppd->dd, "%s: Unknown/unsupported cable\n",
+		dd_dev_warn(ppd->dd, "%s: Unknown/unsupported cable\n",
 			    __func__);
 		break;
 	}
@@ -935,6 +944,21 @@ void tune_serdes(struct hfi1_pportdata *ppd)
 	if (loopback != LOOPBACK_NONE ||
 	    ppd->dd->icode == ICODE_FUNCTIONAL_SIMULATOR) {
 		ppd->driver_link_ready = 1;
+
+		if (qsfp_mod_present(ppd)) {
+			ret = acquire_chip_resource(ppd->dd,
+						    qsfp_resource(ppd->dd),
+						    QSFP_WAIT);
+			if (ret) {
+				dd_dev_err(ppd->dd, "%s: hfi%d: cannot lock i2c chain\n",
+					   __func__, (int)ppd->dd->hfi1_id);
+				goto bail;
+			}
+
+			refresh_qsfp_cache(ppd, &ppd->qsfp_info);
+			release_chip_resource(ppd->dd, qsfp_resource(ppd->dd));
+		}
+
 		return;
 	}
 
@@ -942,7 +966,7 @@ void tune_serdes(struct hfi1_pportdata *ppd)
 	case PORT_TYPE_DISCONNECTED:
 		ppd->offline_disabled_reason =
 			HFI1_ODR_MASK(OPA_LINKDOWN_REASON_DISCONNECTED);
-		dd_dev_info(dd, "%s: Port disconnected, disabling port\n",
+		dd_dev_warn(dd, "%s: Port disconnected, disabling port\n",
 			    __func__);
 		goto bail;
 	case PORT_TYPE_FIXED:
@@ -1027,7 +1051,7 @@ void tune_serdes(struct hfi1_pportdata *ppd)
 		}
 		break;
 	default:
-		dd_dev_info(ppd->dd, "%s: Unknown port type\n", __func__);
+		dd_dev_warn(ppd->dd, "%s: Unknown port type\n", __func__);
 		ppd->port_type = PORT_TYPE_UNKNOWN;
 		tuning_method = OPA_UNKNOWN_TUNING;
 		total_atten = 0;
diff --git a/drivers/infiniband/hw/hfi1/qp.c b/drivers/infiniband/hw/hfi1/qp.c
index 1a7af9f..4b01ccd 100644
--- a/drivers/infiniband/hw/hfi1/qp.c
+++ b/drivers/infiniband/hw/hfi1/qp.c
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -68,17 +68,12 @@ static int iowait_sleep(
 	struct sdma_engine *sde,
 	struct iowait *wait,
 	struct sdma_txreq *stx,
-	unsigned seq);
+	unsigned int seq,
+	bool pkts_sent);
 static void iowait_wakeup(struct iowait *wait, int reason);
 static void iowait_sdma_drained(struct iowait *wait);
 static void qp_pio_drain(struct rvt_qp *qp);
 
-static inline unsigned mk_qpn(struct rvt_qpn_table *qpt,
-			      struct rvt_qpn_map *map, unsigned off)
-{
-	return (map - qpt->map) * RVT_BITS_PER_PAGE + off;
-}
-
 const struct rvt_operation_params hfi1_post_parms[RVT_OPERATION_MAX] = {
 [IB_WR_RDMA_WRITE] = {
 	.length = sizeof(struct ib_rdma_wr),
@@ -237,6 +232,31 @@ int hfi1_check_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
 	return 0;
 }
 
+/*
+ * qp_set_16b - Set the hdr_type based on whether the slid or the
+ * dlid in the connection is extended. Only applicable for RC and UC
+ * QPs. UD QPs determine this on the fly from the ah in the wqe
+ */
+static inline void qp_set_16b(struct rvt_qp *qp)
+{
+	struct hfi1_pportdata *ppd;
+	struct hfi1_ibport *ibp;
+	struct hfi1_qp_priv *priv = qp->priv;
+
+	/* Update ah_attr to account for extended LIDs */
+	hfi1_update_ah_attr(qp->ibqp.device, &qp->remote_ah_attr);
+
+	/* Create 32 bit LIDs */
+	hfi1_make_opa_lid(&qp->remote_ah_attr);
+
+	if (!(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH))
+		return;
+
+	ibp = to_iport(qp->ibqp.device, qp->port_num);
+	ppd = ppd_from_ibp(ibp);
+	priv->hdr_type = hfi1_get_hdr_type(ppd->lid, &qp->remote_ah_attr);
+}
+
 void hfi1_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
 		    int attr_mask, struct ib_udata *udata)
 {
@@ -247,6 +267,7 @@ void hfi1_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
 		priv->s_sc = ah_to_sc(ibqp->device, &qp->remote_ah_attr);
 		priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
 		priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
+		qp_set_16b(qp);
 	}
 
 	if (attr_mask & IB_QP_PATH_MIG_STATE &&
@@ -256,6 +277,7 @@ void hfi1_modify_qp(struct rvt_qp *qp, struct ib_qp_attr *attr,
 		priv->s_sc = ah_to_sc(ibqp->device, &qp->remote_ah_attr);
 		priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
 		priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
+		qp_set_16b(qp);
 	}
 }
 
@@ -377,7 +399,8 @@ static int iowait_sleep(
 	struct sdma_engine *sde,
 	struct iowait *wait,
 	struct sdma_txreq *stx,
-	unsigned seq)
+	unsigned int seq,
+	bool pkts_sent)
 {
 	struct verbs_txreq *tx = container_of(stx, struct verbs_txreq, txreq);
 	struct rvt_qp *qp;
@@ -408,7 +431,8 @@ static int iowait_sleep(
 
 			ibp->rvp.n_dmawait++;
 			qp->s_flags |= RVT_S_WAIT_DMA_DESC;
-			list_add_tail(&priv->s_iowait.list, &sde->dmawait);
+			iowait_queue(pkts_sent, &priv->s_iowait,
+				     &sde->dmawait);
 			priv->s_iowait.lock = &dev->iowait_lock;
 			trace_hfi1_qpsleep(qp, RVT_S_WAIT_DMA_DESC);
 			rvt_get_qp(qp);
@@ -506,82 +530,6 @@ struct send_context *qp_to_send_context(struct rvt_qp *qp, u8 sc5)
 					  sc5);
 }
 
-struct qp_iter {
-	struct hfi1_ibdev *dev;
-	struct rvt_qp *qp;
-	int specials;
-	int n;
-};
-
-struct qp_iter *qp_iter_init(struct hfi1_ibdev *dev)
-{
-	struct qp_iter *iter;
-
-	iter = kzalloc(sizeof(*iter), GFP_KERNEL);
-	if (!iter)
-		return NULL;
-
-	iter->dev = dev;
-	iter->specials = dev->rdi.ibdev.phys_port_cnt * 2;
-
-	return iter;
-}
-
-int qp_iter_next(struct qp_iter *iter)
-{
-	struct hfi1_ibdev *dev = iter->dev;
-	int n = iter->n;
-	int ret = 1;
-	struct rvt_qp *pqp = iter->qp;
-	struct rvt_qp *qp;
-
-	/*
-	 * The approach is to consider the special qps
-	 * as an additional table entries before the
-	 * real hash table.  Since the qp code sets
-	 * the qp->next hash link to NULL, this works just fine.
-	 *
-	 * iter->specials is 2 * # ports
-	 *
-	 * n = 0..iter->specials is the special qp indices
-	 *
-	 * n = iter->specials..dev->rdi.qp_dev->qp_table_size+iter->specials are
-	 * the potential hash bucket entries
-	 *
-	 */
-	for (; n <  dev->rdi.qp_dev->qp_table_size + iter->specials; n++) {
-		if (pqp) {
-			qp = rcu_dereference(pqp->next);
-		} else {
-			if (n < iter->specials) {
-				struct hfi1_pportdata *ppd;
-				struct hfi1_ibport *ibp;
-				int pidx;
-
-				pidx = n % dev->rdi.ibdev.phys_port_cnt;
-				ppd = &dd_from_dev(dev)->pport[pidx];
-				ibp = &ppd->ibport_data;
-
-				if (!(n & 1))
-					qp = rcu_dereference(ibp->rvp.qp[0]);
-				else
-					qp = rcu_dereference(ibp->rvp.qp[1]);
-			} else {
-				qp = rcu_dereference(
-					dev->rdi.qp_dev->qp_table[
-						(n - iter->specials)]);
-			}
-		}
-		pqp = qp;
-		if (qp) {
-			iter->qp = qp;
-			iter->n = n;
-			return 0;
-		}
-	}
-	return ret;
-}
-
 static const char * const qp_type_str[] = {
 	"SMI", "GSI", "RC", "UC", "UD",
 };
@@ -595,19 +543,27 @@ static int qp_idle(struct rvt_qp *qp)
 		qp->s_tail == qp->s_head;
 }
 
-void qp_iter_print(struct seq_file *s, struct qp_iter *iter)
+/**
+ * qp_iter_print - print the qp information to seq_file
+ * @s: the seq_file to emit the qp information on
+ * @iter: the iterator for the qp hash list
+ */
+void qp_iter_print(struct seq_file *s, struct rvt_qp_iter *iter)
 {
 	struct rvt_swqe *wqe;
 	struct rvt_qp *qp = iter->qp;
 	struct hfi1_qp_priv *priv = qp->priv;
 	struct sdma_engine *sde;
 	struct send_context *send_context;
+	struct rvt_ack_entry *e = NULL;
 
 	sde = qp_to_sdma_engine(qp, priv->s_sc);
 	wqe = rvt_get_swqe_ptr(qp, qp->s_last);
 	send_context = qp_to_send_context(qp, priv->s_sc);
+	if (qp->s_ack_queue)
+		e = &qp->s_ack_queue[qp->s_tail_ack_queue];
 	seq_printf(s,
-		   "N %d %s QP %x R %u %s %u %u %u f=%x %u %u %u %u %u %u SPSN %x %x %x %x %x RPSN %x (%u %u %u %u %u %u %u) RQP %x LID %x SL %u MTU %u %u %u %u %u SDE %p,%u SC %p,%u SCQ %u %u PID %d\n",
+		   "N %d %s QP %x R %u %s %u %u %u f=%x %u %u %u %u %u %u SPSN %x %x %x %x %x RPSN %x S(%u %u %u %u %u %u %u) R(%u %u %u) RQP %x LID %x SL %u MTU %u %u %u %u %u SDE %p,%u SC %p,%u SCQ %u %u PID %d OS %x %x E %x %x %x\n",
 		   iter->n,
 		   qp_idle(qp) ? "I" : "B",
 		   qp->ibqp.qp_num,
@@ -630,6 +586,10 @@ void qp_iter_print(struct seq_file *s, struct qp_iter *iter)
 		   qp->s_last, qp->s_acked, qp->s_cur,
 		   qp->s_tail, qp->s_head, qp->s_size,
 		   qp->s_avail,
+		   /* ack_queue ring pointers, size */
+		   qp->s_tail_ack_queue, qp->r_head_ack_queue,
+		   rvt_max_atomic(&to_idev(qp->ibqp.device)->rdi),
+		   /* remote QP info */
 		   qp->remote_qpn,
 		   rdma_ah_get_dlid(&qp->remote_ah_attr),
 		   rdma_ah_get_sl(&qp->remote_ah_attr),
@@ -644,7 +604,13 @@ void qp_iter_print(struct seq_file *s, struct qp_iter *iter)
 		   send_context ? send_context->sw_index : 0,
 		   ibcq_to_rvtcq(qp->ibqp.send_cq)->queue->head,
 		   ibcq_to_rvtcq(qp->ibqp.send_cq)->queue->tail,
-		   qp->pid);
+		   qp->pid,
+		   qp->s_state,
+		   qp->s_ack_state,
+		   /* ack queue information */
+		   e ? e->opcode : 0,
+		   e ? e->psn : 0,
+		   e ? e->lpsn : 0);
 }
 
 void *qp_priv_alloc(struct rvt_dev_info *rdi, struct rvt_qp *qp)
@@ -750,6 +716,7 @@ void hfi1_migrate_qp(struct rvt_qp *qp)
 	qp->s_flags |= RVT_S_AHG_CLEAR;
 	priv->s_sc = ah_to_sc(qp->ibqp.device, &qp->remote_ah_attr);
 	priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
+	qp_set_16b(qp);
 
 	ev.device = qp->ibqp.device;
 	ev.element.qp = &qp->ibqp;
@@ -832,6 +799,45 @@ void notify_error_qp(struct rvt_qp *qp)
 }
 
 /**
+ * hfi1_qp_iter_cb - callback for iterator
+ * @qp: the qp
+ * @v: the sl in low bits of v
+ *
+ * This is called from the iterator callback to work
+ * on an individual qp.
+ */
+static void hfi1_qp_iter_cb(struct rvt_qp *qp, u64 v)
+{
+	int lastwqe;
+	struct ib_event ev;
+	struct hfi1_ibport *ibp =
+		to_iport(qp->ibqp.device, qp->port_num);
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	u8 sl = (u8)v;
+
+	if (qp->port_num != ppd->port ||
+	    (qp->ibqp.qp_type != IB_QPT_UC &&
+	     qp->ibqp.qp_type != IB_QPT_RC) ||
+	    rdma_ah_get_sl(&qp->remote_ah_attr) != sl ||
+	    !(ib_rvt_state_ops[qp->state] & RVT_POST_SEND_OK))
+		return;
+
+	spin_lock_irq(&qp->r_lock);
+	spin_lock(&qp->s_hlock);
+	spin_lock(&qp->s_lock);
+	lastwqe = rvt_error_qp(qp, IB_WC_WR_FLUSH_ERR);
+	spin_unlock(&qp->s_lock);
+	spin_unlock(&qp->s_hlock);
+	spin_unlock_irq(&qp->r_lock);
+	if (lastwqe) {
+		ev.device = qp->ibqp.device;
+		ev.element.qp = &qp->ibqp;
+		ev.event = IB_EVENT_QP_LAST_WQE_REACHED;
+		qp->ibqp.event_handler(&ev, qp->ibqp.qp_context);
+	}
+}
+
+/**
  * hfi1_error_port_qps - put a port's RC/UC qps into error state
  * @ibp: the ibport.
  * @sl: the service level.
@@ -842,44 +848,8 @@ void notify_error_qp(struct rvt_qp *qp)
  */
 void hfi1_error_port_qps(struct hfi1_ibport *ibp, u8 sl)
 {
-	struct rvt_qp *qp = NULL;
 	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
 	struct hfi1_ibdev *dev = &ppd->dd->verbs_dev;
-	int n;
-	int lastwqe;
-	struct ib_event ev;
 
-	rcu_read_lock();
-
-	/* Deal only with RC/UC qps that use the given SL. */
-	for (n = 0; n < dev->rdi.qp_dev->qp_table_size; n++) {
-		for (qp = rcu_dereference(dev->rdi.qp_dev->qp_table[n]); qp;
-			qp = rcu_dereference(qp->next)) {
-			if (qp->port_num == ppd->port &&
-			    (qp->ibqp.qp_type == IB_QPT_UC ||
-			     qp->ibqp.qp_type == IB_QPT_RC) &&
-			    rdma_ah_get_sl(&qp->remote_ah_attr) == sl &&
-			    (ib_rvt_state_ops[qp->state] &
-			     RVT_POST_SEND_OK)) {
-				spin_lock_irq(&qp->r_lock);
-				spin_lock(&qp->s_hlock);
-				spin_lock(&qp->s_lock);
-				lastwqe = rvt_error_qp(qp,
-						       IB_WC_WR_FLUSH_ERR);
-				spin_unlock(&qp->s_lock);
-				spin_unlock(&qp->s_hlock);
-				spin_unlock_irq(&qp->r_lock);
-				if (lastwqe) {
-					ev.device = qp->ibqp.device;
-					ev.element.qp = &qp->ibqp;
-					ev.event =
-						IB_EVENT_QP_LAST_WQE_REACHED;
-					qp->ibqp.event_handler(&ev,
-						qp->ibqp.qp_context);
-				}
-			}
-		}
-	}
-
-	rcu_read_unlock();
+	rvt_qp_iter(&dev->rdi, sl, hfi1_qp_iter_cb);
 }
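Note on the hunk above: hfi1_error_port_qps() no longer walks the QP hash table itself. It hands a per-QP callback to rvt_qp_iter(), with the SL passed in the low bits of a u64. The following is a minimal user-space sketch of that callback-iterator shape only. Every toy_* name is hypothetical, the table layout is invented for illustration, and all locking is left out (the removed loop ran under rcu_read_lock()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rvt_qp: only what the sketch needs. */
struct toy_qp {
	struct toy_qp *next;	/* hash-chain link */
	uint8_t sl;		/* service level */
	int in_error;		/* set once the callback has handled the QP */
};

#define TOY_TABLE_SIZE 4

typedef void (*toy_qp_cb)(struct toy_qp *qp, uint64_t v);

/* Walk every bucket and every chain entry, handing each QP to the callback. */
static void toy_qp_iter(struct toy_qp *table[TOY_TABLE_SIZE], uint64_t v,
			toy_qp_cb cb)
{
	size_t n;
	struct toy_qp *qp;

	assert(cb != NULL);
	for (n = 0; n < TOY_TABLE_SIZE; n++)
		for (qp = table[n]; qp; qp = qp->next)
			cb(qp, v);
}

/* Per-QP work: the SL travels in the low bits of v. */
static void toy_error_cb(struct toy_qp *qp, uint64_t v)
{
	uint8_t sl = (uint8_t)v;

	if (qp->sl != sl)
		return;		/* different SL: leave this QP untouched */
	qp->in_error = 1;
}

/* Build a tiny table, run the walk for one SL, return how many QPs matched. */
static int toy_demo(uint8_t sl)
{
	struct toy_qp c = { NULL, 2, 0 };
	struct toy_qp b = { &c, 1, 0 };
	struct toy_qp a = { NULL, 2, 0 };
	struct toy_qp *table[TOY_TABLE_SIZE] = { &a, NULL, &b, NULL };

	toy_qp_iter(table, sl, toy_error_cb);
	return a.in_error + b.in_error + c.in_error;
}
```

Moving the filter into the callback keeps the traversal in one shared place. The caller supplies only the per-QP policy.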
diff --git a/drivers/infiniband/hw/hfi1/qp.h b/drivers/infiniband/hw/hfi1/qp.h
index 6fe542b..c06d2f8 100644
--- a/drivers/infiniband/hw/hfi1/qp.h
+++ b/drivers/infiniband/hw/hfi1/qp.h
@@ -1,7 +1,7 @@
 #ifndef _QP_H
 #define _QP_H
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -94,26 +94,7 @@ void hfi1_qp_wakeup(struct rvt_qp *qp, u32 flag);
 struct sdma_engine *qp_to_sdma_engine(struct rvt_qp *qp, u8 sc5);
 struct send_context *qp_to_send_context(struct rvt_qp *qp, u8 sc5);
 
-struct qp_iter;
-
-/**
- * qp_iter_init - initialize the iterator for the qp hash list
- * @dev: the hfi1_ibdev
- */
-struct qp_iter *qp_iter_init(struct hfi1_ibdev *dev);
-
-/**
- * qp_iter_next - Find the next qp in the hash list
- * @iter: the iterator for the qp hash list
- */
-int qp_iter_next(struct qp_iter *iter);
-
-/**
- * qp_iter_print - print the qp information to seq_file
- * @s: the seq_file to emit the qp information on
- * @iter: the iterator for the qp hash list
- */
-void qp_iter_print(struct seq_file *s, struct qp_iter *iter);
+void qp_iter_print(struct seq_file *s, struct rvt_qp_iter *iter);
 
 void _hfi1_schedule_send(struct rvt_qp *qp);
 void hfi1_schedule_send(struct rvt_qp *qp);
diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index 1080778..e1cf0c0 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -100,8 +100,12 @@ static int make_rc_ack(struct hfi1_ibdev *dev, struct rvt_qp *qp,
 	if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK))
 		goto bail;
 
-	/* header size in 32-bit words LRH+BTH = (8+12)/4. */
-	hwords = 5;
+	if (priv->hdr_type == HFI1_PKT_TYPE_9B)
+		/* header size in 32-bit words LRH+BTH = (8+12)/4. */
+		hwords = 5;
+	else
+		/* header size in 32-bit words 16B LRH+BTH = (16+12)/4. */
+		hwords = 7;
 
 	switch (qp->s_ack_state) {
 	case OP(RDMA_READ_RESPONSE_LAST):
@@ -258,8 +262,7 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	struct ib_other_headers *ohdr;
 	struct rvt_sge_state *ss;
 	struct rvt_swqe *wqe;
-	/* header size in 32-bit words LRH+BTH = (8+12)/4. */
-	u32 hwords = 5;
+	u32 hwords;
 	u32 len;
 	u32 bth0 = 0;
 	u32 bth2;
@@ -273,9 +276,23 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	if (IS_ERR(ps->s_txreq))
 		goto bail_no_tx;
 
-	ohdr = &ps->s_txreq->phdr.hdr.u.oth;
-	if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
-		ohdr = &ps->s_txreq->phdr.hdr.u.l.oth;
+	ps->s_txreq->phdr.hdr.hdr_type = priv->hdr_type;
+	if (priv->hdr_type == HFI1_PKT_TYPE_9B) {
+		/* header size in 32-bit words LRH+BTH = (8+12)/4. */
+		hwords = 5;
+		if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
+			ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
+		else
+			ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
+	} else {
+		/* header size in 32-bit words 16B LRH+BTH = (16+12)/4. */
+		hwords = 7;
+		if ((rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH) &&
+		    (hfi1_check_mcast(rdma_ah_get_dlid(&qp->remote_ah_attr))))
+			ohdr = &ps->s_txreq->phdr.hdr.opah.u.l.oth;
+		else
+			ohdr = &ps->s_txreq->phdr.hdr.opah.u.oth;
+	}
 
 	/* Sending responses has higher priority over sending requests. */
 	if ((qp->s_flags & RVT_S_RESP_PENDING) &&
@@ -425,7 +442,7 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 		case IB_WR_RDMA_WRITE:
 			if (newreq && !(qp->s_flags & RVT_S_UNLIMITED_CREDIT))
 				qp->s_lsn++;
-			/* FALLTHROUGH */
+			goto no_flow_control;
 		case IB_WR_RDMA_WRITE_WITH_IMM:
 			/* If no credit, return. */
 			if (!(qp->s_flags & RVT_S_UNLIMITED_CREDIT) &&
@@ -433,6 +450,7 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 				qp->s_flags |= RVT_S_WAIT_SSN_CREDIT;
 				goto bail;
 			}
+no_flow_control:
 			put_ib_reth_vaddr(
 				wqe->rdma_wr.remote_addr,
 				&ohdr->u.rc.reth);
@@ -703,109 +721,27 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	return 0;
 }
 
-/**
- * hfi1_send_rc_ack - Construct an ACK packet and send it
- * @qp: a pointer to the QP
- *
- * This is called from hfi1_rc_rcv() and handle_receive_interrupt().
- * Note that RDMA reads and atomics are handled in the
- * send side QP state and send engine.
- */
-void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
-		      int is_fecn)
+static inline void hfi1_make_bth_aeth(struct rvt_qp *qp,
+				      struct ib_other_headers *ohdr,
+				      u32 bth0, u32 bth1)
 {
-	struct hfi1_ibport *ibp = rcd_to_iport(rcd);
-	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
-	u64 pbc, pbc_flags = 0;
-	u16 lrh0;
-	u16 sc5;
-	u32 bth0;
-	u32 hwords;
-	u32 vl, plen;
-	struct send_context *sc;
-	struct pio_buf *pbuf;
-	struct ib_header hdr;
-	struct ib_other_headers *ohdr;
-	unsigned long flags;
-
-	/* clear the defer count */
-	qp->r_adefered = 0;
-
-	/* Don't send ACK or NAK if a RDMA read or atomic is pending. */
-	if (qp->s_flags & RVT_S_RESP_PENDING)
-		goto queue_ack;
-
-	/* Ensure s_rdma_ack_cnt changes are committed */
-	smp_read_barrier_depends();
-	if (qp->s_rdma_ack_cnt)
-		goto queue_ack;
-
-	/* Construct the header */
-	/* header size in 32-bit words LRH+BTH+AETH = (8+12+4)/4 */
-	hwords = 6;
-	if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
-		hwords += hfi1_make_grh(ibp, &hdr.u.l.grh,
-					rdma_ah_read_grh(&qp->remote_ah_attr),
-					hwords, 0);
-		ohdr = &hdr.u.l.oth;
-		lrh0 = HFI1_LRH_GRH;
-	} else {
-		ohdr = &hdr.u.oth;
-		lrh0 = HFI1_LRH_BTH;
-	}
-	/* read pkey_index w/o lock (its atomic) */
-	bth0 = hfi1_get_pkey(ibp, qp->s_pkey_index) | (OP(ACKNOWLEDGE) << 24);
-	if (qp->s_mig_state == IB_MIG_MIGRATED)
-		bth0 |= IB_BTH_MIG_REQ;
 	if (qp->r_nak_state)
 		ohdr->u.aeth = cpu_to_be32((qp->r_msn & IB_MSN_MASK) |
 					    (qp->r_nak_state <<
 					     IB_AETH_CREDIT_SHIFT));
 	else
 		ohdr->u.aeth = rvt_compute_aeth(qp);
-	sc5 = ibp->sl_to_sc[rdma_ah_get_sl(&qp->remote_ah_attr)];
-	/* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
-	pbc_flags |= ((!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT);
-	lrh0 |= (sc5 & 0xf) << 12 | (rdma_ah_get_sl(&qp->remote_ah_attr)
-				     & 0xf) << 4;
-	hdr.lrh[0] = cpu_to_be16(lrh0);
-	hdr.lrh[1] = cpu_to_be16(rdma_ah_get_dlid(&qp->remote_ah_attr));
-	hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
-	hdr.lrh[3] = cpu_to_be16(ppd->lid |
-				 rdma_ah_get_path_bits(&qp->remote_ah_attr));
+
 	ohdr->bth[0] = cpu_to_be32(bth0);
-	ohdr->bth[1] = cpu_to_be32(qp->remote_qpn);
-	ohdr->bth[1] |= cpu_to_be32((!!is_fecn) << IB_BECN_SHIFT);
+	ohdr->bth[1] = cpu_to_be32(bth1 | qp->remote_qpn);
 	ohdr->bth[2] = cpu_to_be32(mask_psn(qp->r_ack_psn));
+}
 
-	/* Don't try to send ACKs if the link isn't ACTIVE */
-	if (driver_lstate(ppd) != IB_PORT_ACTIVE)
-		return;
+static inline void hfi1_queue_rc_ack(struct rvt_qp *qp, bool is_fecn)
+{
+	struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
+	unsigned long flags;
 
-	sc = rcd->sc;
-	plen = 2 /* PBC */ + hwords;
-	vl = sc_to_vlt(ppd->dd, sc5);
-	pbc = create_pbc(ppd, pbc_flags, qp->srate_mbps, vl, plen);
-
-	pbuf = sc_buffer_alloc(sc, plen, NULL, NULL);
-	if (!pbuf) {
-		/*
-		 * We have no room to send at the moment.  Pass
-		 * responsibility for sending the ACK to the send engine
-		 * so that when enough buffer space becomes available,
-		 * the ACK is sent ahead of other outgoing packets.
-		 */
-		goto queue_ack;
-	}
-
-	trace_ack_output_ibhdr(dd_from_ibdev(qp->ibqp.device), &hdr);
-
-	/* write the pbc and data */
-	ppd->dd->pio_inline_send(ppd->dd, pbuf, pbc, &hdr, hwords);
-
-	return;
-
-queue_ack:
 	spin_lock_irqsave(&qp->s_lock, flags);
 	if (!(ib_rvt_state_ops[qp->state] & RVT_PROCESS_RECV_OK))
 		goto unlock;
@@ -816,12 +752,194 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
 	if (is_fecn)
 		qp->s_flags |= RVT_S_ECN;
 
-	/* Schedule the send engine. */
+	/* Schedule the send tasklet. */
 	hfi1_schedule_send(qp);
 unlock:
 	spin_unlock_irqrestore(&qp->s_lock, flags);
 }
 
+static inline void hfi1_make_rc_ack_9B(struct rvt_qp *qp,
+				       struct hfi1_opa_header *opa_hdr,
+				       u8 sc5, bool is_fecn,
+				       u64 *pbc_flags, u32 *hwords,
+				       u32 *nwords)
+{
+	struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	struct ib_header *hdr = &opa_hdr->ibh;
+	struct ib_other_headers *ohdr;
+	u16 lrh0 = HFI1_LRH_BTH;
+	u16 pkey;
+	u32 bth0, bth1;
+
+	opa_hdr->hdr_type = HFI1_PKT_TYPE_9B;
+	ohdr = &hdr->u.oth;
+	/* header size in 32-bit words LRH+BTH+AETH = (8+12+4)/4 */
+	*hwords = 6;
+
+	if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
+		*hwords += hfi1_make_grh(ibp, &hdr->u.l.grh,
+					 rdma_ah_read_grh(&qp->remote_ah_attr),
+					 *hwords - 2, SIZE_OF_CRC);
+		ohdr = &hdr->u.l.oth;
+		lrh0 = HFI1_LRH_GRH;
+	}
+	/* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
+	*pbc_flags |= ((!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT);
+
+	/* read pkey_index w/o lock (it's atomic) */
+	pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+
+	lrh0 |= (sc5 & IB_SC_MASK) << IB_SC_SHIFT |
+		(rdma_ah_get_sl(&qp->remote_ah_attr) & IB_SL_MASK) <<
+			IB_SL_SHIFT;
+
+	hfi1_make_ib_hdr(hdr, lrh0, *hwords + SIZE_OF_CRC,
+			 opa_get_lid(rdma_ah_get_dlid(&qp->remote_ah_attr), 9B),
+			 ppd->lid | rdma_ah_get_path_bits(&qp->remote_ah_attr));
+
+	bth0 = pkey | (OP(ACKNOWLEDGE) << 24);
+	if (qp->s_mig_state == IB_MIG_MIGRATED)
+		bth0 |= IB_BTH_MIG_REQ;
+	bth1 = (!!is_fecn) << IB_BECN_SHIFT;
+	hfi1_make_bth_aeth(qp, ohdr, bth0, bth1);
+}
+
+static inline void hfi1_make_rc_ack_16B(struct rvt_qp *qp,
+					struct hfi1_opa_header *opa_hdr,
+					u8 sc5, bool is_fecn,
+					u64 *pbc_flags, u32 *hwords,
+					u32 *nwords)
+{
+	struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	struct hfi1_16b_header *hdr = &opa_hdr->opah;
+	struct ib_other_headers *ohdr;
+	u32 bth0, bth1 = 0;
+	u16 len, pkey;
+	u8 becn = !!is_fecn;
+	u8 l4 = OPA_16B_L4_IB_LOCAL;
+	u8 extra_bytes;
+
+	opa_hdr->hdr_type = HFI1_PKT_TYPE_16B;
+	ohdr = &hdr->u.oth;
+	/* header size in 32-bit words 16B LRH+BTH+AETH = (16+12+4)/4 */
+	*hwords = 8;
+	extra_bytes = hfi1_get_16b_padding(*hwords << 2, 0);
+	*nwords = SIZE_OF_CRC + ((extra_bytes + SIZE_OF_LT) >> 2);
+
+	if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH) &&
+	    hfi1_check_mcast(rdma_ah_get_dlid(&qp->remote_ah_attr))) {
+		*hwords += hfi1_make_grh(ibp, &hdr->u.l.grh,
+					 rdma_ah_read_grh(&qp->remote_ah_attr),
+					 *hwords - 4, *nwords);
+		ohdr = &hdr->u.l.oth;
+		l4 = OPA_16B_L4_IB_GLOBAL;
+	}
+	*pbc_flags |= PBC_PACKET_BYPASS | PBC_INSERT_BYPASS_ICRC;
+
+	/* read pkey_index w/o lock (it's atomic) */
+	pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+
+	/* Convert dwords to flits */
+	len = (*hwords + *nwords) >> 1;
+
+	hfi1_make_16b_hdr(hdr,
+			  ppd->lid | rdma_ah_get_path_bits(&qp->remote_ah_attr),
+			  opa_get_lid(rdma_ah_get_dlid(&qp->remote_ah_attr),
+				      16B),
+			  len, pkey, becn, 0, l4, sc5);
+
+	bth0 = pkey | (OP(ACKNOWLEDGE) << 24);
+	bth0 |= extra_bytes << 20;
+	if (qp->s_mig_state == IB_MIG_MIGRATED)
+		bth1 = OPA_BTH_MIG_REQ;
+	hfi1_make_bth_aeth(qp, ohdr, bth0, bth1);
+}
+
+typedef void (*hfi1_make_rc_ack)(struct rvt_qp *qp,
+				 struct hfi1_opa_header *opa_hdr,
+				 u8 sc5, bool is_fecn,
+				 u64 *pbc_flags, u32 *hwords,
+				 u32 *nwords);
+
+/* We support only two types - 9B and 16B for now */
+static const hfi1_make_rc_ack hfi1_make_rc_ack_tbl[2] = {
+	[HFI1_PKT_TYPE_9B] = &hfi1_make_rc_ack_9B,
+	[HFI1_PKT_TYPE_16B] = &hfi1_make_rc_ack_16B
+};
+
+/**
+ * hfi1_send_rc_ack - Construct an ACK packet and send it
+ * @qp: a pointer to the QP
+ *
+ * This is called from hfi1_rc_rcv() and handle_receive_interrupt().
+ * Note that RDMA reads and atomics are handled in the
+ * send side QP state and send engine.
+ */
+void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd,
+		      struct rvt_qp *qp, bool is_fecn)
+{
+	struct hfi1_ibport *ibp = rcd_to_iport(rcd);
+	struct hfi1_qp_priv *priv = qp->priv;
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	u8 sc5 = ibp->sl_to_sc[rdma_ah_get_sl(&qp->remote_ah_attr)];
+	u64 pbc, pbc_flags = 0;
+	u32 hwords = 0;
+	u32 nwords = 0;
+	u32 plen;
+	struct pio_buf *pbuf;
+	struct hfi1_opa_header opa_hdr;
+
+	/* clear the defer count */
+	qp->r_adefered = 0;
+
+	/* Don't send ACK or NAK if a RDMA read or atomic is pending. */
+	if (qp->s_flags & RVT_S_RESP_PENDING) {
+		hfi1_queue_rc_ack(qp, is_fecn);
+		return;
+	}
+
+	/* Ensure s_rdma_ack_cnt changes are committed */
+	smp_read_barrier_depends();
+	if (qp->s_rdma_ack_cnt) {
+		hfi1_queue_rc_ack(qp, is_fecn);
+		return;
+	}
+
+	/* Don't try to send ACKs if the link isn't ACTIVE */
+	if (driver_lstate(ppd) != IB_PORT_ACTIVE)
+		return;
+
+	/* Make the appropriate header */
+	hfi1_make_rc_ack_tbl[priv->hdr_type](qp, &opa_hdr, sc5, is_fecn,
+					     &pbc_flags, &hwords, &nwords);
+
+	plen = 2 /* PBC */ + hwords + nwords;
+	pbc = create_pbc(ppd, pbc_flags, qp->srate_mbps,
+			 sc_to_vlt(ppd->dd, sc5), plen);
+	pbuf = sc_buffer_alloc(rcd->sc, plen, NULL, NULL);
+	if (!pbuf) {
+		/*
+		 * We have no room to send at the moment.  Pass
+		 * responsibility for sending the ACK to the send engine
+		 * so that when enough buffer space becomes available,
+		 * the ACK is sent ahead of other outgoing packets.
+		 */
+		hfi1_queue_rc_ack(qp, is_fecn);
+		return;
+	}
+	trace_ack_output_ibhdr(dd_from_ibdev(qp->ibqp.device),
+			       &opa_hdr, ib_is_sc5(sc5));
+
+	/* write the pbc and data */
+	ppd->dd->pio_inline_send(ppd->dd, pbuf, pbc,
+				 (priv->hdr_type == HFI1_PKT_TYPE_9B ?
+				 (void *)&opa_hdr.ibh :
+				 (void *)&opa_hdr.opah), hwords);
+	return;
+}
+
 /**
  * reset_psn - reset the QP state to send starting from PSN
  * @qp: the QP
@@ -984,10 +1102,13 @@ static void reset_sending_psn(struct rvt_qp *qp, u32 psn)
 /*
  * This should be called with the QP s_lock held and interrupts disabled.
  */
-void hfi1_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr)
+void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah)
 {
 	struct ib_other_headers *ohdr;
+	struct hfi1_qp_priv *priv = qp->priv;
 	struct rvt_swqe *wqe;
+	struct ib_header *hdr = NULL;
+	struct hfi1_16b_header *hdr_16b = NULL;
 	u32 opcode;
 	u32 psn;
 
@@ -996,10 +1117,22 @@ void hfi1_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr)
 		return;
 
 	/* Find out where the BTH is */
-	if (ib_get_lnh(hdr) == HFI1_LRH_BTH)
-		ohdr = &hdr->u.oth;
-	else
-		ohdr = &hdr->u.l.oth;
+	if (priv->hdr_type == HFI1_PKT_TYPE_9B) {
+		hdr = &opah->ibh;
+		if (ib_get_lnh(hdr) == HFI1_LRH_BTH)
+			ohdr = &hdr->u.oth;
+		else
+			ohdr = &hdr->u.l.oth;
+	} else {
+		u8 l4;
+
+		hdr_16b = &opah->opah;
+		l4 = hfi1_16B_get_l4(hdr_16b);
+		if (l4 == OPA_16B_L4_IB_LOCAL)
+			ohdr = &hdr_16b->u.oth;
+		else
+			ohdr = &hdr_16b->u.l.oth;
+	}
 
 	opcode = ib_bth_get_opcode(ohdr);
 	if (opcode >= OP(RDMA_READ_RESPONSE_FIRST) &&
@@ -1009,7 +1142,7 @@ void hfi1_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr)
 		return;
 	}
 
-	psn = be32_to_cpu(ohdr->bth[2]);
+	psn = ib_bth_get_psn(ohdr);
 	reset_sending_psn(qp, psn);
 
 	/*
@@ -1399,36 +1532,34 @@ static void rdma_seq_err(struct rvt_qp *qp, struct hfi1_ibport *ibp, u32 psn,
 
 /**
  * rc_rcv_resp - process an incoming RC response packet
- * @ibp: the port this packet came in on
- * @ohdr: the other headers for this packet
- * @data: the packet data
- * @tlen: the packet length
- * @qp: the QP for this packet
- * @opcode: the opcode for this packet
- * @psn: the packet sequence number for this packet
- * @hdrsize: the header length
- * @pmtu: the path MTU
+ * @packet: data packet information
  *
  * This is called from hfi1_rc_rcv() to process an incoming RC response
  * packet for the given QP.
  * Called at interrupt level.
  */
-static void rc_rcv_resp(struct hfi1_ibport *ibp,
-			struct ib_other_headers *ohdr,
-			void *data, u32 tlen, struct rvt_qp *qp,
-			u32 opcode, u32 psn, u32 hdrsize, u32 pmtu,
-			struct hfi1_ctxtdata *rcd)
+static void rc_rcv_resp(struct hfi1_packet *packet)
 {
+	struct hfi1_ctxtdata *rcd = packet->rcd;
+	void *data = packet->payload;
+	u32 tlen = packet->tlen;
+	struct rvt_qp *qp = packet->qp;
+	struct hfi1_ibport *ibp = to_iport(qp->ibqp.device, qp->port_num);
+	struct ib_other_headers *ohdr = packet->ohdr;
 	struct rvt_swqe *wqe;
 	enum ib_wc_status status;
 	unsigned long flags;
 	int diff;
-	u32 pad;
-	u32 aeth;
 	u64 val;
+	u32 aeth;
+	u32 psn = ib_bth_get_psn(packet->ohdr);
+	u32 pmtu = qp->pmtu;
+	u16 hdrsize = packet->hlen;
+	u8 opcode = packet->opcode;
+	u8 pad = packet->pad;
+	u8 extra_bytes = pad + packet->extra_byte + (SIZE_OF_CRC << 2);
 
 	spin_lock_irqsave(&qp->s_lock, flags);
-
 	trace_hfi1_ack(qp, psn);
 
 	/* Ignore invalid responses. */
@@ -1494,7 +1625,7 @@ static void rc_rcv_resp(struct hfi1_ibport *ibp,
 		if (unlikely(wqe->wr.opcode != IB_WR_RDMA_READ))
 			goto ack_op_err;
 read_middle:
-		if (unlikely(tlen != (hdrsize + pmtu + 4)))
+		if (unlikely(tlen != (hdrsize + pmtu + extra_bytes)))
 			goto ack_len_err;
 		if (unlikely(pmtu >= qp->s_rdma_read_len))
 			goto ack_len_err;
@@ -1526,13 +1657,11 @@ static void rc_rcv_resp(struct hfi1_ibport *ibp,
 		aeth = be32_to_cpu(ohdr->u.aeth);
 		if (!do_rc_ack(qp, aeth, psn, opcode, 0, rcd))
 			goto ack_done;
-		/* Get the number of bytes the message was padded by. */
-		pad = ib_bth_get_pad(ohdr);
 		/*
 		 * Check that the data size is >= 0 && <= pmtu.
 		 * Remember to account for ICRC (4).
 		 */
-		if (unlikely(tlen < (hdrsize + pad + 4)))
+		if (unlikely(tlen < (hdrsize + extra_bytes)))
 			goto ack_len_err;
 		/*
 		 * If this is a response to a resent RDMA read, we
@@ -1550,16 +1679,14 @@ static void rc_rcv_resp(struct hfi1_ibport *ibp,
 			goto ack_seq_err;
 		if (unlikely(wqe->wr.opcode != IB_WR_RDMA_READ))
 			goto ack_op_err;
-		/* Get the number of bytes the message was padded by. */
-		pad = ib_bth_get_pad(ohdr);
 		/*
 		 * Check that the data size is >= 1 && <= pmtu.
 		 * Remember to account for ICRC (4).
 		 */
-		if (unlikely(tlen <= (hdrsize + pad + 4)))
+		if (unlikely(tlen <= (hdrsize + extra_bytes)))
 			goto ack_len_err;
 read_last:
-		tlen -= hdrsize + pad + 4;
+		tlen -= hdrsize + extra_bytes;
 		if (unlikely(tlen != qp->s_rdma_read_len))
 			goto ack_len_err;
 		aeth = be32_to_cpu(ohdr->u.aeth);
@@ -1844,7 +1971,7 @@ static void log_cca_event(struct hfi1_pportdata *ppd, u8 sl, u32 rlid,
 	spin_unlock_irqrestore(&ppd->cc_log_lock, flags);
 }
 
-void process_becn(struct hfi1_pportdata *ppd, u8 sl, u16 rlid, u32 lqpn,
+void process_becn(struct hfi1_pportdata *ppd, u8 sl, u32 rlid, u32 lqpn,
 		  u32 rqpn, u8 svc_type)
 {
 	struct cca_timer *cca_timer;
@@ -1901,12 +2028,7 @@ void process_becn(struct hfi1_pportdata *ppd, u8 sl, u16 rlid, u32 lqpn,
 
 /**
  * hfi1_rc_rcv - process an incoming RC packet
- * @rcd: the context pointer
- * @hdr: the header of this packet
- * @rcv_flags: flags relevant to rcv processing
- * @data: the packet data
- * @tlen: the packet length
- * @qp: the QP for this packet
+ * @packet: data packet information
  *
  * This is called from qp_rcv() to process an incoming RC packet
  * for the given QP.
@@ -1915,17 +2037,16 @@ void process_becn(struct hfi1_pportdata *ppd, u8 sl, u16 rlid, u32 lqpn,
 void hfi1_rc_rcv(struct hfi1_packet *packet)
 {
 	struct hfi1_ctxtdata *rcd = packet->rcd;
-	struct ib_header *hdr = packet->hdr;
-	u32 rcv_flags = packet->rcv_flags;
-	void *data = packet->ebuf;
+	void *data = packet->payload;
 	u32 tlen = packet->tlen;
 	struct rvt_qp *qp = packet->qp;
 	struct hfi1_ibport *ibp = rcd_to_iport(rcd);
 	struct ib_other_headers *ohdr = packet->ohdr;
-	u32 bth0, opcode;
+	u32 bth0 = be32_to_cpu(ohdr->bth[0]);
+	u32 opcode = packet->opcode;
 	u32 hdrsize = packet->hlen;
-	u32 psn;
-	u32 pad;
+	u32 psn = ib_bth_get_psn(packet->ohdr);
+	u32 pad = packet->pad;
 	struct ib_wc wc;
 	u32 pmtu = qp->pmtu;
 	int diff;
@@ -1935,17 +2056,15 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
 	bool is_fecn = false;
 	bool copy_last = false;
 	u32 rkey;
+	u8 extra_bytes = pad + packet->extra_byte + (SIZE_OF_CRC << 2);
 
 	lockdep_assert_held(&qp->r_lock);
-	bth0 = be32_to_cpu(ohdr->bth[0]);
-	if (hfi1_ruc_check_hdr(ibp, hdr, rcv_flags & HFI1_HAS_GRH, qp, bth0))
+
+	if (hfi1_ruc_check_hdr(ibp, packet))
 		return;
 
 	is_fecn = process_ecn(qp, packet, false);
 
-	psn = be32_to_cpu(ohdr->bth[2]);
-	opcode = ib_bth_get_opcode(ohdr);
-
 	/*
 	 * Process responses (ACKs) before anything else.  Note that the
 	 * packet sequence number will be for something in the send work
@@ -1954,8 +2073,7 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
 	 */
 	if (opcode >= OP(RDMA_READ_RESPONSE_FIRST) &&
 	    opcode <= OP(ATOMIC_ACKNOWLEDGE)) {
-		rc_rcv_resp(ibp, ohdr, data, tlen, qp, opcode, psn,
-			    hdrsize, pmtu, rcd);
+		rc_rcv_resp(packet);
 		if (is_fecn)
 			goto send_ack;
 		return;
@@ -2022,7 +2140,12 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
 	case OP(RDMA_WRITE_MIDDLE):
 send_middle:
 		/* Check for invalid length PMTU or posted rwqe len. */
-		if (unlikely(tlen != (hdrsize + pmtu + 4)))
+		/*
+		 * There will be no padding for 9B packet but 16B packets
+		 * will come in with some padding since we always add
+		 * CRC and LT bytes which will need to be flit aligned.
+		 */
+		if (unlikely(tlen != (hdrsize + pmtu + extra_bytes)))
 			goto nack_inv;
 		qp->r_rcv_len += pmtu;
 		if (unlikely(qp->r_rcv_len > qp->r_len))
@@ -2074,14 +2197,12 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
 		wc.wc_flags = 0;
 		wc.ex.imm_data = 0;
 send_last:
-		/* Get the number of bytes the message was padded by. */
-		pad = ib_bth_get_pad(ohdr);
 		/* Check for invalid length. */
 		/* LAST len should be >= 1 */
-		if (unlikely(tlen < (hdrsize + pad + 4)))
+		if (unlikely(tlen < (hdrsize + extra_bytes)))
 			goto nack_inv;
-		/* Don't count the CRC. */
-		tlen -= (hdrsize + pad + 4);
+		/* Don't count the CRC (and padding and LT byte for 16B). */
+		tlen -= (hdrsize + extra_bytes);
 		wc.byte_len = tlen + qp->r_rcv_len;
 		if (unlikely(wc.byte_len > qp->r_len))
 			goto nack_inv;
@@ -2368,28 +2489,19 @@ void hfi1_rc_rcv(struct hfi1_packet *packet)
 
 void hfi1_rc_hdrerr(
 	struct hfi1_ctxtdata *rcd,
-	struct ib_header *hdr,
-	u32 rcv_flags,
+	struct hfi1_packet *packet,
 	struct rvt_qp *qp)
 {
-	int has_grh = rcv_flags & HFI1_HAS_GRH;
-	struct ib_other_headers *ohdr;
 	struct hfi1_ibport *ibp = rcd_to_iport(rcd);
 	int diff;
 	u32 opcode;
-	u32 psn, bth0;
+	u32 psn;
 
-	/* Check for GRH */
-	ohdr = &hdr->u.oth;
-	if (has_grh)
-		ohdr = &hdr->u.l.oth;
-
-	bth0 = be32_to_cpu(ohdr->bth[0]);
-	if (hfi1_ruc_check_hdr(ibp, hdr, has_grh, qp, bth0))
+	if (hfi1_ruc_check_hdr(ibp, packet))
 		return;
 
-	psn = be32_to_cpu(ohdr->bth[2]);
-	opcode = ib_bth_get_opcode(ohdr);
+	psn = ib_bth_get_psn(packet->ohdr);
+	opcode = ib_bth_get_opcode(packet->ohdr);
 
 	/* Only deal with RDMA Writes for now */
 	if (opcode < IB_OPCODE_RC_RDMA_READ_RESPONSE_FIRST) {
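Note on the rc.c hunks above: the header word counts and the receive-side length checks now depend on the packet type. The sketch below restates that arithmetic in stand-alone C. The toy_* names are hypothetical, and TOY_SIZE_OF_CRC = 1 (one 32-bit word) is an assumption inferred from the old "+ 4" ICRC term being replaced by SIZE_OF_CRC << 2. It is not the driver's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SIZE_OF_CRC 1	/* assumed: ICRC length in 32-bit words */

enum toy_hdr_type { TOY_9B, TOY_16B };

/* LRH + BTH in 32-bit words: (8+12)/4 for 9B, (16+12)/4 for 16B. */
static uint32_t toy_hdr_words(enum toy_hdr_type t)
{
	uint32_t lrh_bytes = (t == TOY_9B) ? 8 : 16;

	return (lrh_bytes + 12) / 4;
}

/* Trailer bytes after the payload: pad + 16B-only extra byte(s) + ICRC. */
static uint32_t toy_extra_bytes(uint32_t pad, uint32_t extra_byte)
{
	return pad + extra_byte + (TOY_SIZE_OF_CRC << 2);
}

/* A MIDDLE packet must carry exactly one PMTU of payload. */
static bool toy_middle_len_ok(uint32_t tlen, uint32_t hdrsize,
			      uint32_t pmtu, uint32_t extra_bytes)
{
	assert(pmtu > 0);
	return tlen == hdrsize + pmtu + extra_bytes;
}
```

With pad and extra_byte both zero, toy_extra_bytes() returns the 4 ICRC bytes that the pre-patch checks hard-coded.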
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index 3a17dab..b3291f0 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -74,8 +74,10 @@ static int init_sge(struct rvt_qp *qp, struct rvt_rwqe *wqe)
 		if (wqe->sg_list[i].length == 0)
 			continue;
 		/* Check LKEY */
-		if (!rvt_lkey_ok(rkt, pd, j ? &ss->sg_list[j - 1] : &ss->sge,
-				 &wqe->sg_list[i], IB_ACCESS_LOCAL_WRITE))
+		ret = rvt_lkey_ok(rkt, pd, j ? &ss->sg_list[j - 1] : &ss->sge,
+				  NULL, &wqe->sg_list[i],
+				  IB_ACCESS_LOCAL_WRITE);
+		if (unlikely(ret <= 0))
 			goto bad_lkey;
 		qp->r_len += wqe->sg_list[i].length;
 		j++;
@@ -214,100 +216,104 @@ static int gid_ok(union ib_gid *gid, __be64 gid_prefix, __be64 id)
  *
  * The s_lock will be acquired around the hfi1_migrate_qp() call.
  */
-int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct ib_header *hdr,
-		       int has_grh, struct rvt_qp *qp, u32 bth0)
+int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct hfi1_packet *packet)
 {
 	__be64 guid;
 	unsigned long flags;
+	struct rvt_qp *qp = packet->qp;
 	u8 sc5 = ibp->sl_to_sc[rdma_ah_get_sl(&qp->remote_ah_attr)];
+	u32 dlid = packet->dlid;
+	u32 slid = packet->slid;
+	u32 sl = packet->sl;
+	int migrated;
+	u32 bth0, bth1;
+	u16 pkey;
 
-	if (qp->s_mig_state == IB_MIG_ARMED && (bth0 & IB_BTH_MIG_REQ)) {
-		if (!has_grh) {
-			if (rdma_ah_get_ah_flags(&qp->alt_ah_attr) &
-			    IB_AH_GRH)
-				goto err;
+	bth0 = be32_to_cpu(packet->ohdr->bth[0]);
+	bth1 = be32_to_cpu(packet->ohdr->bth[1]);
+	if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+		pkey = hfi1_16B_get_pkey(packet->hdr);
+		migrated = bth1 & OPA_BTH_MIG_REQ;
+	} else {
+		pkey = ib_bth_get_pkey(packet->ohdr);
+		migrated = bth0 & IB_BTH_MIG_REQ;
+	}
+
+	if (qp->s_mig_state == IB_MIG_ARMED && migrated) {
+		if (!packet->grh) {
+			if ((rdma_ah_get_ah_flags(&qp->alt_ah_attr) &
+			     IB_AH_GRH) &&
+			    (packet->etype != RHF_RCV_TYPE_BYPASS))
+				return 1;
 		} else {
 			const struct ib_global_route *grh;
 
 			if (!(rdma_ah_get_ah_flags(&qp->alt_ah_attr) &
 			      IB_AH_GRH))
-				goto err;
+				return 1;
 			grh = rdma_ah_read_grh(&qp->alt_ah_attr);
 			guid = get_sguid(ibp, grh->sgid_index);
-			if (!gid_ok(&hdr->u.l.grh.dgid, ibp->rvp.gid_prefix,
+			if (!gid_ok(&packet->grh->dgid, ibp->rvp.gid_prefix,
 				    guid))
-				goto err;
+				return 1;
 			if (!gid_ok(
-				&hdr->u.l.grh.sgid,
+				&packet->grh->sgid,
 				grh->dgid.global.subnet_prefix,
 				grh->dgid.global.interface_id))
-				goto err;
+				return 1;
 		}
-		if (unlikely(rcv_pkey_check(ppd_from_ibp(ibp), (u16)bth0, sc5,
-					    ib_get_slid(hdr)))) {
-			hfi1_bad_pqkey(ibp, OPA_TRAP_BAD_P_KEY,
-				       (u16)bth0,
-				       ib_get_sl(hdr),
-				       0, qp->ibqp.qp_num,
-				       ib_get_slid(hdr),
-				       ib_get_dlid(hdr));
-			goto err;
+		if (unlikely(rcv_pkey_check(ppd_from_ibp(ibp), pkey,
+					    sc5, slid))) {
+			hfi1_bad_pkey(ibp, pkey, sl, 0, qp->ibqp.qp_num,
+				      slid, dlid);
+			return 1;
 		}
 		/* Validate the SLID. See Ch. 9.6.1.5 and 17.2.8 */
-		if (ib_get_slid(hdr) !=
-			rdma_ah_get_dlid(&qp->alt_ah_attr) ||
+		if (slid != rdma_ah_get_dlid(&qp->alt_ah_attr) ||
 		    ppd_from_ibp(ibp)->port !=
 			rdma_ah_get_port_num(&qp->alt_ah_attr))
-			goto err;
+			return 1;
 		spin_lock_irqsave(&qp->s_lock, flags);
 		hfi1_migrate_qp(qp);
 		spin_unlock_irqrestore(&qp->s_lock, flags);
 	} else {
-		if (!has_grh) {
-			if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) &
-						 IB_AH_GRH)
-				goto err;
+		if (!packet->grh) {
+			if ((rdma_ah_get_ah_flags(&qp->remote_ah_attr) &
+			     IB_AH_GRH) &&
+			    (packet->etype != RHF_RCV_TYPE_BYPASS))
+				return 1;
 		} else {
 			const struct ib_global_route *grh;
 
 			if (!(rdma_ah_get_ah_flags(&qp->remote_ah_attr) &
 						   IB_AH_GRH))
-				goto err;
+				return 1;
 			grh = rdma_ah_read_grh(&qp->remote_ah_attr);
 			guid = get_sguid(ibp, grh->sgid_index);
-			if (!gid_ok(&hdr->u.l.grh.dgid, ibp->rvp.gid_prefix,
+			if (!gid_ok(&packet->grh->dgid, ibp->rvp.gid_prefix,
 				    guid))
-				goto err;
+				return 1;
 			if (!gid_ok(
-			     &hdr->u.l.grh.sgid,
+			     &packet->grh->sgid,
 			     grh->dgid.global.subnet_prefix,
 			     grh->dgid.global.interface_id))
-				goto err;
+				return 1;
 		}
-		if (unlikely(rcv_pkey_check(ppd_from_ibp(ibp), (u16)bth0, sc5,
-					    ib_get_slid(hdr)))) {
-			hfi1_bad_pqkey(ibp, OPA_TRAP_BAD_P_KEY,
-				       (u16)bth0,
-				       ib_get_sl(hdr),
-				       0, qp->ibqp.qp_num,
-				       ib_get_slid(hdr),
-				       ib_get_dlid(hdr));
-			goto err;
+		if (unlikely(rcv_pkey_check(ppd_from_ibp(ibp), pkey,
+					    sc5, slid))) {
+			hfi1_bad_pkey(ibp, pkey, sl, 0, qp->ibqp.qp_num,
+				      slid, dlid);
+			return 1;
 		}
 		/* Validate the SLID. See Ch. 9.6.1.5 */
-		if (ib_get_slid(hdr) !=
-			rdma_ah_get_dlid(&qp->remote_ah_attr) ||
+		if ((slid != rdma_ah_get_dlid(&qp->remote_ah_attr)) ||
 		    ppd_from_ibp(ibp)->port != qp->port_num)
-			goto err;
-		if (qp->s_mig_state == IB_MIG_REARM &&
-		    !(bth0 & IB_BTH_MIG_REQ))
+			return 1;
+		if (qp->s_mig_state == IB_MIG_REARM && !migrated)
 			qp->s_mig_state = IB_MIG_ARMED;
 	}
 
 	return 0;
-
-err:
-	return 1;
 }
 
 /**
@@ -643,7 +649,7 @@ static void ruc_loopback(struct rvt_qp *sqp)
  * @ibp: a pointer to the IB port
  * @hdr: a pointer to the GRH header being constructed
  * @grh: the global route address to send to
- * @hwords: the number of 32 bit words of header being sent
+ * @hwords: size in dwords of the header that follows the GRH
  * @nwords: the number of 32 bit words of data being sent
  *
  * Return the size of the header in 32 bit words.
@@ -655,7 +661,7 @@ u32 hfi1_make_grh(struct hfi1_ibport *ibp, struct ib_grh *hdr,
 		cpu_to_be32((IB_GRH_VERSION << IB_GRH_VERSION_SHIFT) |
 			    (grh->traffic_class << IB_GRH_TCLASS_SHIFT) |
 			    (grh->flow_label << IB_GRH_FLOW_SHIFT));
-	hdr->paylen = cpu_to_be16((hwords - 2 + nwords + SIZE_OF_CRC) << 2);
+	hdr->paylen = cpu_to_be16((hwords + nwords) << 2);
 	/* next_hdr is defined by C8-7 in ch. 8.4.1 */
 	hdr->next_hdr = IB_GRH_NEXT_HDR;
 	hdr->hop_limit = grh->hop_limit;
@@ -671,7 +677,8 @@ u32 hfi1_make_grh(struct hfi1_ibport *ibp, struct ib_grh *hdr,
 	return sizeof(struct ib_grh) / sizeof(u32);
 }
 
-#define BTH2_OFFSET (offsetof(struct hfi1_sdma_header, hdr.u.oth.bth[2]) / 4)
+#define BTH2_OFFSET (offsetof(struct hfi1_sdma_header, \
+			      hdr.ibh.u.oth.bth[2]) / 4)
 
 /**
  * build_ahg - create ahg in s_ahg
@@ -728,32 +735,169 @@ static inline void build_ahg(struct rvt_qp *qp, u32 npsn)
 	}
 }
 
+static inline void hfi1_make_ruc_bth(struct rvt_qp *qp,
+				     struct ib_other_headers *ohdr,
+				     u32 bth0, u32 bth1, u32 bth2)
+{
+	bth1 |= qp->remote_qpn;
+	ohdr->bth[0] = cpu_to_be32(bth0);
+	ohdr->bth[1] = cpu_to_be32(bth1);
+	ohdr->bth[2] = cpu_to_be32(bth2);
+}
+
+static inline void hfi1_make_ruc_header_16B(struct rvt_qp *qp,
+					    struct ib_other_headers *ohdr,
+					    u32 bth0, u32 bth2, int middle,
+					    struct hfi1_pkt_state *ps)
+{
+	struct hfi1_qp_priv *priv = qp->priv;
+	struct hfi1_ibport *ibp = ps->ibp;
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	u32 bth1 = 0;
+	u32 slid;
+	u16 pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+	u8 l4 = OPA_16B_L4_IB_LOCAL;
+	u8 extra_bytes = hfi1_get_16b_padding((qp->s_hdrwords << 2),
+				   ps->s_txreq->s_cur_size);
+	u32 nwords = SIZE_OF_CRC + ((ps->s_txreq->s_cur_size +
+				 extra_bytes + SIZE_OF_LT) >> 2);
+	u8 becn = 0;
+
+	if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH) &&
+	    hfi1_check_mcast(rdma_ah_get_dlid(&qp->remote_ah_attr))) {
+		struct ib_grh *grh;
+		struct ib_global_route *grd =
+			rdma_ah_retrieve_grh(&qp->remote_ah_attr);
+		int hdrwords;
+
+		/*
+		 * Ensure OPA GIDs are transformed to IB gids
+		 * before creating the GRH.
+		 */
+		if (grd->sgid_index == OPA_GID_INDEX)
+			grd->sgid_index = 0;
+		grh = &ps->s_txreq->phdr.hdr.opah.u.l.grh;
+		l4 = OPA_16B_L4_IB_GLOBAL;
+		hdrwords = qp->s_hdrwords - 4;
+		qp->s_hdrwords += hfi1_make_grh(ibp, grh, grd,
+						hdrwords, nwords);
+		middle = 0;
+	}
+
+	if (qp->s_mig_state == IB_MIG_MIGRATED)
+		bth1 |= OPA_BTH_MIG_REQ;
+	else
+		middle = 0;
+
+	if (middle)
+		build_ahg(qp, bth2);
+	else
+		qp->s_flags &= ~RVT_S_AHG_VALID;
+
+	bth0 |= pkey;
+	bth0 |= extra_bytes << 20;
+	if (qp->s_flags & RVT_S_ECN) {
+		qp->s_flags &= ~RVT_S_ECN;
+		/* we recently received a FECN, so return a BECN */
+		becn = 1;
+	}
+	hfi1_make_ruc_bth(qp, ohdr, bth0, bth1, bth2);
+
+	if (!ppd->lid)
+		slid = be32_to_cpu(OPA_LID_PERMISSIVE);
+	else
+		slid = ppd->lid |
+			(rdma_ah_get_path_bits(&qp->remote_ah_attr) &
+			((1 << ppd->lmc) - 1));
+
+	hfi1_make_16b_hdr(&ps->s_txreq->phdr.hdr.opah,
+			  slid,
+			  opa_get_lid(rdma_ah_get_dlid(&qp->remote_ah_attr),
+				      16B),
+			  (qp->s_hdrwords + nwords) >> 1,
+			  pkey, becn, 0, l4, priv->s_sc);
+}
+
+static inline void hfi1_make_ruc_header_9B(struct rvt_qp *qp,
+					   struct ib_other_headers *ohdr,
+					   u32 bth0, u32 bth2, int middle,
+					   struct hfi1_pkt_state *ps)
+{
+	struct hfi1_qp_priv *priv = qp->priv;
+	struct hfi1_ibport *ibp = ps->ibp;
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	u32 bth1 = 0;
+	u16 pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+	u16 lrh0 = HFI1_LRH_BTH;
+	u16 slid;
+	u8 extra_bytes = -ps->s_txreq->s_cur_size & 3;
+	u32 nwords = SIZE_OF_CRC + ((ps->s_txreq->s_cur_size +
+					 extra_bytes) >> 2);
+
+	if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
+		struct ib_grh *grh = &ps->s_txreq->phdr.hdr.ibh.u.l.grh;
+		int hdrwords = qp->s_hdrwords - 2;
+
+		lrh0 = HFI1_LRH_GRH;
+		qp->s_hdrwords +=
+			hfi1_make_grh(ibp, grh,
+				      rdma_ah_read_grh(&qp->remote_ah_attr),
+				      hdrwords, nwords);
+		middle = 0;
+	}
+	lrh0 |= (priv->s_sc & 0xf) << 12 |
+		(rdma_ah_get_sl(&qp->remote_ah_attr) & 0xf) << 4;
+
+	if (qp->s_mig_state == IB_MIG_MIGRATED)
+		bth0 |= IB_BTH_MIG_REQ;
+	else
+		middle = 0;
+
+	if (middle)
+		build_ahg(qp, bth2);
+	else
+		qp->s_flags &= ~RVT_S_AHG_VALID;
+
+	bth0 |= pkey;
+	bth0 |= extra_bytes << 20;
+	if (qp->s_flags & RVT_S_ECN) {
+		qp->s_flags &= ~RVT_S_ECN;
+		/* we recently received a FECN, so return a BECN */
+		bth1 |= (IB_BECN_MASK << IB_BECN_SHIFT);
+	}
+	hfi1_make_ruc_bth(qp, ohdr, bth0, bth1, bth2);
+
+	if (!ppd->lid)
+		slid = be16_to_cpu(IB_LID_PERMISSIVE);
+	else
+		slid = ppd->lid |
+			(rdma_ah_get_path_bits(&qp->remote_ah_attr) &
+			((1 << ppd->lmc) - 1));
+	hfi1_make_ib_hdr(&ps->s_txreq->phdr.hdr.ibh,
+			 lrh0,
+			 qp->s_hdrwords + nwords,
+			 opa_get_lid(rdma_ah_get_dlid(&qp->remote_ah_attr), 9B),
+			 ppd_from_ibp(ibp)->lid |
+				rdma_ah_get_path_bits(&qp->remote_ah_attr));
+}
+
+typedef void (*hfi1_make_ruc_hdr)(struct rvt_qp *qp,
+				  struct ib_other_headers *ohdr,
+				  u32 bth0, u32 bth2, int middle,
+				  struct hfi1_pkt_state *ps);
+
+/* We support only two types - 9B and 16B for now */
+static const hfi1_make_ruc_hdr hfi1_ruc_header_tbl[2] = {
+	[HFI1_PKT_TYPE_9B] = &hfi1_make_ruc_header_9B,
+	[HFI1_PKT_TYPE_16B] = &hfi1_make_ruc_header_16B
+};
+
 void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
 			  u32 bth0, u32 bth2, int middle,
 			  struct hfi1_pkt_state *ps)
 {
 	struct hfi1_qp_priv *priv = qp->priv;
-	struct hfi1_ibport *ibp = ps->ibp;
-	u16 lrh0;
-	u32 nwords;
-	u32 extra_bytes;
-	u32 bth1;
 
-	/* Construct the header. */
-	extra_bytes = -ps->s_txreq->s_cur_size & 3;
-	nwords = (ps->s_txreq->s_cur_size + extra_bytes) >> 2;
-	lrh0 = HFI1_LRH_BTH;
-	if (unlikely(rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)) {
-		qp->s_hdrwords +=
-			hfi1_make_grh(ibp,
-				      &ps->s_txreq->phdr.hdr.u.l.grh,
-				      rdma_ah_read_grh(&qp->remote_ah_attr),
-				      qp->s_hdrwords, nwords);
-		lrh0 = HFI1_LRH_GRH;
-		middle = 0;
-	}
-	lrh0 |= (priv->s_sc & 0xf) << 12 |
-		(rdma_ah_get_sl(&qp->remote_ah_attr) & 0xf) << 4;
 	/*
 	 * reset s_ahg/AHG fields
 	 *
@@ -768,33 +912,9 @@ void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
 	priv->s_ahg->tx_flags = 0;
 	priv->s_ahg->ahgcount = 0;
 	priv->s_ahg->ahgidx = 0;
-	if (qp->s_mig_state == IB_MIG_MIGRATED)
-		bth0 |= IB_BTH_MIG_REQ;
-	else
-		middle = 0;
-	if (middle)
-		build_ahg(qp, bth2);
-	else
-		qp->s_flags &= ~RVT_S_AHG_VALID;
-	ps->s_txreq->phdr.hdr.lrh[0] = cpu_to_be16(lrh0);
-	ps->s_txreq->phdr.hdr.lrh[1] =
-		cpu_to_be16(rdma_ah_get_dlid(&qp->remote_ah_attr));
-	ps->s_txreq->phdr.hdr.lrh[2] =
-		cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
-	ps->s_txreq->phdr.hdr.lrh[3] =
-		cpu_to_be16(ppd_from_ibp(ibp)->lid |
-			    rdma_ah_get_path_bits(&qp->remote_ah_attr));
-	bth0 |= hfi1_get_pkey(ibp, qp->s_pkey_index);
-	bth0 |= extra_bytes << 20;
-	ohdr->bth[0] = cpu_to_be32(bth0);
-	bth1 = qp->remote_qpn;
-	if (qp->s_flags & RVT_S_ECN) {
-		qp->s_flags &= ~RVT_S_ECN;
-		/* we recently received a FECN, so return a BECN */
-		bth1 |= (IB_BECN_MASK << IB_BECN_SHIFT);
-	}
-	ohdr->bth[1] = cpu_to_be32(bth1);
-	ohdr->bth[2] = cpu_to_be32(bth2);
+
+	/* Make the appropriate header */
+	hfi1_ruc_header_tbl[priv->hdr_type](qp, ohdr, bth0, bth2, middle, ps);
 }
 
 /* when sending, force a reschedule every one of these periods */
@@ -816,6 +936,8 @@ void hfi1_make_ruc_header(struct rvt_qp *qp, struct ib_other_headers *ohdr,
 static bool schedule_send_yield(struct rvt_qp *qp,
 				struct hfi1_pkt_state *ps)
 {
+	ps->pkts_sent = true;
+
 	if (unlikely(time_after(jiffies, ps->timeout))) {
 		if (!ps->in_thread ||
 		    workqueue_congested(ps->cpu, ps->ppd->hfi1_wq)) {
@@ -912,6 +1034,7 @@ void hfi1_do_send(struct rvt_qp *qp, bool in_thread)
 	ps.timeout = jiffies + ps.timeout_int;
 	ps.cpu = priv->s_sde ? priv->s_sde->cpu :
 			cpumask_first(cpumask_of_node(ps.ppd->dd->node));
+	ps.pkts_sent = false;
 
 	/* insure a pre-built packet is handled  */
 	ps.s_txreq = get_waiting_verbs_txreq(qp);
@@ -934,7 +1057,7 @@ void hfi1_do_send(struct rvt_qp *qp, bool in_thread)
 			spin_lock_irqsave(&qp->s_lock, ps.flags);
 		}
 	} while (make_req(qp, &ps));
-
+	iowait_starve_clear(ps.pkts_sent, &priv->s_iowait);
 	spin_unlock_irqrestore(&qp->s_lock, ps.flags);
 }
 
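Note on the hfi1_make_grh() change above: the paylen expression lost its
"- 2" and "+ SIZE_OF_CRC" terms because the 9B caller now subtracts two
words from s_hdrwords and folds the CRC into nwords before the call. The
sketch below checks that both conventions give the same byte count. It is
an illustration only: pad_9b(), paylen_old() and paylen_new() are
hypothetical names, and SIZE_OF_CRC is assumed to be one 32-bit word.

```c
#include <stdint.h>

#define SIZE_OF_CRC 1 /* assumption: CRC size in 32-bit words */

/* Bytes needed to pad size up to a 4-byte boundary (the "-size & 3" idiom). */
static uint8_t pad_9b(uint32_t size)
{
	return (uint8_t)(-size & 3);
}

/* Old convention: hwords is the full header count, nwords excludes the CRC. */
static uint16_t paylen_old(uint32_t s_hdrwords, uint32_t size)
{
	uint32_t nwords = (size + pad_9b(size)) >> 2;

	return (uint16_t)((s_hdrwords - 2 + nwords + SIZE_OF_CRC) << 2);
}

/* New convention: the caller adjusts both counts, the callee only adds them. */
static uint16_t paylen_new(uint32_t s_hdrwords, uint32_t size)
{
	uint32_t hdrwords = s_hdrwords - 2;
	uint32_t nwords = SIZE_OF_CRC + ((size + pad_9b(size)) >> 2);

	return (uint16_t)((hdrwords + nwords) << 2);
}
```

The 16B caller subtracts 4 instead of 2 and uses a different padding
helper, which is presumably why the adjustment moved out of
hfi1_make_grh() and into the callers.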
diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
index bfd0d51..6781bcd 100644
--- a/drivers/infiniband/hw/hfi1/sdma.c
+++ b/drivers/infiniband/hw/hfi1/sdma.c
@@ -246,7 +246,7 @@ static void __sdma_process_event(
 	enum sdma_events event);
 static void dump_sdma_state(struct sdma_engine *sde);
 static void sdma_make_progress(struct sdma_engine *sde, u64 status);
-static void sdma_desc_avail(struct sdma_engine *sde, unsigned avail);
+static void sdma_desc_avail(struct sdma_engine *sde, uint avail);
 static void sdma_flush_descq(struct sdma_engine *sde);
 
 /**
@@ -325,7 +325,7 @@ static void sdma_wait_for_packet_egress(struct sdma_engine *sde,
 			/* timed out - bounce the link */
 			dd_dev_err(dd, "%s: engine %u timeout waiting for packets to egress, remaining count %u, bouncing link\n",
 				   __func__, sde->this_idx, (u32)reg);
-			queue_work(dd->pport->hfi1_wq,
+			queue_work(dd->pport->link_wq,
 				   &dd->pport->link_bounce_work);
 			break;
 		}
@@ -1340,10 +1340,8 @@ static void sdma_clean(struct hfi1_devdata *dd, size_t num_engines)
  * @dd: hfi1_devdata
  * @port: port number (currently only zero)
  *
- * sdma_init initializes the specified number of engines.
- *
- * The code initializes each sde, its csrs.  Interrupts
- * are not required to be enabled.
+ * Initializes each sde and its csrs.
+ * Interrupts are not required to be enabled.
  *
  * Returns:
  * 0 - success, -errno on failure
@@ -1764,13 +1762,14 @@ static inline u16 sdma_gethead(struct sdma_engine *sde)
  *
  * This is called with head_lock held.
  */
-static void sdma_desc_avail(struct sdma_engine *sde, unsigned avail)
+static void sdma_desc_avail(struct sdma_engine *sde, uint avail)
 {
 	struct iowait *wait, *nw;
 	struct iowait *waits[SDMA_WAIT_BATCH_SIZE];
-	unsigned i, n = 0, seq;
+	uint i, n = 0, seq, max_idx = 0;
 	struct sdma_txreq *stx;
 	struct hfi1_ibdev *dev = &sde->dd->verbs_dev;
+	u8 max_starved_cnt = 0;
 
 #ifdef CONFIG_SDMA_VERBOSITY
 	dd_dev_err(sde->dd, "CONFIG SDMA(%u) %s:%d %s()\n", sde->this_idx,
@@ -1805,6 +1804,9 @@ static void sdma_desc_avail(struct sdma_engine *sde, unsigned avail)
 				if (num_desc > avail)
 					break;
 				avail -= num_desc;
+				/* Find the most starved wait member */
+				iowait_starve_find_max(wait, &max_starved_cnt,
+						       n, &max_idx);
 				list_del_init(&wait->list);
 				waits[n++] = wait;
 			}
@@ -1813,8 +1815,13 @@ static void sdma_desc_avail(struct sdma_engine *sde, unsigned avail)
 		}
 	} while (read_seqretry(&dev->iowait_lock, seq));
 
+	/* Schedule the most starved one first */
+	if (n)
+		waits[max_idx]->wakeup(waits[max_idx], SDMA_AVAIL_REASON);
+
 	for (i = 0; i < n; i++)
-		waits[i]->wakeup(waits[i], SDMA_AVAIL_REASON);
+		if (i != max_idx)
+			waits[i]->wakeup(waits[i], SDMA_AVAIL_REASON);
 }
 
 /* head_lock must be held */
@@ -2351,7 +2358,8 @@ static inline u16 submit_tx(struct sdma_engine *sde, struct sdma_txreq *tx)
 static int sdma_check_progress(
 	struct sdma_engine *sde,
 	struct iowait *wait,
-	struct sdma_txreq *tx)
+	struct sdma_txreq *tx,
+	bool pkts_sent)
 {
 	int ret;
 
@@ -2364,7 +2372,7 @@ static int sdma_check_progress(
 
 		seq = raw_seqcount_begin(
 			(const seqcount_t *)&sde->head_lock.seqcount);
-		ret = wait->sleep(sde, wait, tx, seq);
+		ret = wait->sleep(sde, wait, tx, seq, pkts_sent);
 		if (ret == -EAGAIN)
 			sde->desc_avail = sdma_descq_freecnt(sde);
 	} else {
@@ -2378,6 +2386,7 @@ static int sdma_check_progress(
  * @sde: sdma engine to use
  * @wait: wait structure to use when full (may be NULL)
  * @tx: sdma_txreq to submit
+ * @pkts_sent: has any packet been sent yet?
  *
  * The call submits the tx into the ring.  If a iowait structure is non-NULL
  * the packet will be queued to the list in wait.
@@ -2389,7 +2398,8 @@ static int sdma_check_progress(
  */
 int sdma_send_txreq(struct sdma_engine *sde,
 		    struct iowait *wait,
-		    struct sdma_txreq *tx)
+		    struct sdma_txreq *tx,
+		    bool pkts_sent)
 {
 	int ret = 0;
 	u16 tail;
@@ -2431,7 +2441,7 @@ int sdma_send_txreq(struct sdma_engine *sde,
 	ret = -ECOMM;
 	goto unlock;
 nodesc:
-	ret = sdma_check_progress(sde, wait, tx);
+	ret = sdma_check_progress(sde, wait, tx, pkts_sent);
 	if (ret == -EAGAIN) {
 		ret = 0;
 		goto retry;
@@ -2500,8 +2510,10 @@ int sdma_send_txlist(struct sdma_engine *sde, struct iowait *wait,
 	}
 update_tail:
 	total_count = submit_count + flush_count;
-	if (wait)
+	if (wait) {
 		iowait_sdma_add(wait, total_count);
+		iowait_starve_clear(submit_count > 0, wait);
+	}
 	if (tail != INVALID_TAIL)
 		sdma_update_tail(sde, tail);
 	spin_unlock_irqrestore(&sde->tail_lock, flags);
@@ -2529,7 +2541,7 @@ int sdma_send_txlist(struct sdma_engine *sde, struct iowait *wait,
 	ret = -ECOMM;
 	goto update_tail;
 nodesc:
-	ret = sdma_check_progress(sde, wait, tx);
+	ret = sdma_check_progress(sde, wait, tx, submit_count > 0);
 	if (ret == -EAGAIN) {
 		ret = 0;
 		goto retry;
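Note on the sdma_desc_avail() change above: the waiter with the highest
starvation count is woken first and the rest follow in list order. The
sketch below models only that ordering. It is an illustration under
assumptions: struct waiter, starve_find_max() and wake_order() are
hypothetical stand-ins, and the comparison inside starve_find_max() is a
guess at what iowait_starve_find_max() does, since its body is not part of
this diff.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct iowait; only the counter matters here. */
struct waiter {
	unsigned char starved_cnt;
};

/* Remember the index of the largest starved_cnt seen so far (assumed logic). */
static void starve_find_max(const struct waiter *w, unsigned char *max,
			    unsigned int idx, unsigned int *max_idx)
{
	if (w->starved_cnt > *max) {
		*max = w->starved_cnt;
		*max_idx = idx;
	}
}

/* Fill order[] with the wakeup order for n waiters; returns entries written. */
static unsigned int wake_order(const struct waiter *waits, unsigned int n,
			       unsigned int *order)
{
	unsigned char max_starved_cnt = 0;
	unsigned int i, k = 0, max_idx = 0;

	for (i = 0; i < n; i++)
		starve_find_max(&waits[i], &max_starved_cnt, i, &max_idx);

	/* Most starved waiter first, then everyone else in original order. */
	if (n)
		order[k++] = max_idx;
	for (i = 0; i < n; i++)
		if (i != max_idx)
			order[k++] = i;
	return k;
}
```

With every counter at zero, max_idx stays 0 and the order matches the old
loop, so the change only affects behaviour once some waiter has actually
been starved.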
diff --git a/drivers/infiniband/hw/hfi1/sdma.h b/drivers/infiniband/hw/hfi1/sdma.h
index 64f10b8..107011d 100644
--- a/drivers/infiniband/hw/hfi1/sdma.h
+++ b/drivers/infiniband/hw/hfi1/sdma.h
@@ -852,7 +852,8 @@ struct iowait;
 
 int sdma_send_txreq(struct sdma_engine *sde,
 		    struct iowait *wait,
-		    struct sdma_txreq *tx);
+		    struct sdma_txreq *tx,
+		    bool pkts_sent);
 int sdma_send_txlist(struct sdma_engine *sde,
 		     struct iowait *wait,
 		     struct list_head *tx_list,
diff --git a/drivers/infiniband/hw/hfi1/sysfs.c b/drivers/infiniband/hw/hfi1/sysfs.c
index 2f3bbca..6d2702ef 100644
--- a/drivers/infiniband/hw/hfi1/sysfs.c
+++ b/drivers/infiniband/hw/hfi1/sysfs.c
@@ -95,7 +95,7 @@ static void port_release(struct kobject *kobj)
 	/* nothing to do since memory is freed by hfi1_free_devdata() */
 }
 
-static struct bin_attribute cc_table_bin_attr = {
+static const struct bin_attribute cc_table_bin_attr = {
 	.attr = {.name = "cc_table_bin", .mode = 0444},
 	.read = read_cc_table_bin,
 	.size = PAGE_SIZE,
@@ -137,7 +137,7 @@ static ssize_t read_cc_setting_bin(struct file *filp, struct kobject *kobj,
 	return count;
 }
 
-static struct bin_attribute cc_setting_bin_attr = {
+static const struct bin_attribute cc_setting_bin_attr = {
 	.attr = {.name = "cc_settings_bin", .mode = 0444},
 	.read = read_cc_setting_bin,
 	.size = PAGE_SIZE,
diff --git a/drivers/infiniband/hw/hfi1/trace.c b/drivers/infiniband/hw/hfi1/trace.c
index eafae48..9938bb9 100644
--- a/drivers/infiniband/hw/hfi1/trace.c
+++ b/drivers/infiniband/hw/hfi1/trace.c
@@ -47,7 +47,7 @@
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 
-u8 ibhdr_exhdr_len(struct ib_header *hdr)
+static u8 __get_ib_hdr_len(struct ib_header *hdr)
 {
 	struct ib_other_headers *ohdr;
 	u8 opcode;
@@ -61,13 +61,69 @@ u8 ibhdr_exhdr_len(struct ib_header *hdr)
 	       0 : hdr_len_by_opcode[opcode] - (12 + 8);
 }
 
-#define IMM_PRN  "imm %d"
-#define RETH_PRN "reth vaddr 0x%.16llx rkey 0x%.8x dlen 0x%.8x"
-#define AETH_PRN "aeth syn 0x%.2x %s msn 0x%.8x"
-#define DETH_PRN "deth qkey 0x%.8x sqpn 0x%.6x"
-#define IETH_PRN "ieth rkey 0x%.8x"
-#define ATOMICACKETH_PRN "origdata %llx"
-#define ATOMICETH_PRN "vaddr 0x%llx rkey 0x%.8x sdata %llx cdata %llx"
+static u8 __get_16b_hdr_len(struct hfi1_16b_header *hdr)
+{
+	struct ib_other_headers *ohdr;
+	u8 opcode;
+
+	if (hfi1_16B_get_l4(hdr) == OPA_16B_L4_IB_LOCAL)
+		ohdr = &hdr->u.oth;
+	else
+		ohdr = &hdr->u.l.oth;
+	opcode = ib_bth_get_opcode(ohdr);
+	return hdr_len_by_opcode[opcode] == 0 ?
+	       0 : hdr_len_by_opcode[opcode] - (12 + 8 + 8);
+}
+
+u8 hfi1_trace_packet_hdr_len(struct hfi1_packet *packet)
+{
+	if (packet->etype != RHF_RCV_TYPE_BYPASS)
+		return __get_ib_hdr_len(packet->hdr);
+	else
+		return __get_16b_hdr_len(packet->hdr);
+}
+
+u8 hfi1_trace_opa_hdr_len(struct hfi1_opa_header *opa_hdr)
+{
+	if (!opa_hdr->hdr_type)
+		return __get_ib_hdr_len(&opa_hdr->ibh);
+	else
+		return __get_16b_hdr_len(&opa_hdr->opah);
+}
+
+const char *hfi1_trace_get_packet_str(struct hfi1_packet *packet)
+{
+	if (packet->etype != RHF_RCV_TYPE_BYPASS)
+		return "IB";
+
+	switch (hfi1_16B_get_l2(packet->hdr)) {
+	case 0:
+		return "0";
+	case 1:
+		return "1";
+	case 2:
+		return "16B";
+	case 3:
+		return "9B";
+	}
+	return "";
+}
+
+const char *hfi1_trace_get_packet_type_str(u8 l4)
+{
+	if (l4)
+		return "16B";
+	else
+		return "9B";
+}
+
+#define IMM_PRN  "imm:%d"
+#define RETH_PRN "reth vaddr:0x%.16llx rkey:0x%.8x dlen:0x%.8x"
+#define AETH_PRN "aeth syn:0x%.2x %s msn:0x%.8x"
+#define DETH_PRN "deth qkey:0x%.8x sqpn:0x%.6x"
+#define IETH_PRN "ieth rkey:0x%.8x"
+#define ATOMICACKETH_PRN "origdata:%llx"
+#define ATOMICETH_PRN "vaddr:0x%llx rkey:0x%.8x sdata:%llx cdata:%llx"
 
 #define OP(transport, op) IB_OPCODE_## transport ## _ ## op
 
@@ -84,6 +140,125 @@ static const char *parse_syndrome(u8 syndrome)
 	return "";
 }
 
+void hfi1_trace_parse_9b_bth(struct ib_other_headers *ohdr,
+			     u8 *ack, u8 *becn, u8 *fecn, u8 *mig,
+			     u8 *se, u8 *pad, u8 *opcode, u8 *tver,
+			     u16 *pkey, u32 *psn, u32 *qpn)
+{
+	*ack = ib_bth_get_ackreq(ohdr);
+	*becn = ib_bth_get_becn(ohdr);
+	*fecn = ib_bth_get_fecn(ohdr);
+	*mig = ib_bth_get_migreq(ohdr);
+	*se = ib_bth_get_se(ohdr);
+	*pad = ib_bth_get_pad(ohdr);
+	*opcode = ib_bth_get_opcode(ohdr);
+	*tver = ib_bth_get_tver(ohdr);
+	*pkey = ib_bth_get_pkey(ohdr);
+	*psn = ib_bth_get_psn(ohdr);
+	*qpn = ib_bth_get_qpn(ohdr);
+}
+
+void hfi1_trace_parse_16b_bth(struct ib_other_headers *ohdr,
+			      u8 *ack, u8 *mig, u8 *opcode,
+			      u8 *pad, u8 *se, u8 *tver,
+			      u32 *psn, u32 *qpn)
+{
+	*ack = ib_bth_get_ackreq(ohdr);
+	*mig = ib_bth_get_migreq(ohdr);
+	*opcode = ib_bth_get_opcode(ohdr);
+	*pad = ib_bth_get_pad(ohdr);
+	*se = ib_bth_get_se(ohdr);
+	*tver = ib_bth_get_tver(ohdr);
+	*psn = ib_bth_get_psn(ohdr);
+	*qpn = ib_bth_get_qpn(ohdr);
+}
+
+void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
+			     u8 *lnh, u8 *lver, u8 *sl, u8 *sc,
+			     u16 *len, u32 *dlid, u32 *slid)
+{
+	*lnh = ib_get_lnh(hdr);
+	*lver = ib_get_lver(hdr);
+	*sl = ib_get_sl(hdr);
+	*sc = ib_get_sc(hdr) | (sc5 << 4);
+	*len = ib_get_len(hdr);
+	*dlid = ib_get_dlid(hdr);
+	*slid = ib_get_slid(hdr);
+}
+
+void hfi1_trace_parse_16b_hdr(struct hfi1_16b_header *hdr,
+			      u8 *age, u8 *becn, u8 *fecn,
+			      u8 *l4, u8 *rc, u8 *sc,
+			      u16 *entropy, u16 *len, u16 *pkey,
+			      u32 *dlid, u32 *slid)
+{
+	*age = hfi1_16B_get_age(hdr);
+	*becn = hfi1_16B_get_becn(hdr);
+	*fecn = hfi1_16B_get_fecn(hdr);
+	*l4 = hfi1_16B_get_l4(hdr);
+	*rc = hfi1_16B_get_rc(hdr);
+	*sc = hfi1_16B_get_sc(hdr);
+	*entropy = hfi1_16B_get_entropy(hdr);
+	*len = hfi1_16B_get_len(hdr);
+	*pkey = hfi1_16B_get_pkey(hdr);
+	*dlid = hfi1_16B_get_dlid(hdr);
+	*slid = hfi1_16B_get_slid(hdr);
+}
+
+#define LRH_PRN "len:%d sc:%d dlid:0x%.4x slid:0x%.4x "
+#define LRH_9B_PRN "lnh:%d,%s lver:%d sl:%d"
+#define LRH_16B_PRN "age:%d becn:%d fecn:%d l4:%d " \
+		    "rc:%d sc:%d pkey:0x%.4x entropy:0x%.4x"
+const char *hfi1_trace_fmt_lrh(struct trace_seq *p, bool bypass,
+			       u8 age, u8 becn, u8 fecn, u8 l4,
+			       u8 lnh, const char *lnh_name, u8 lver,
+			       u8 rc, u8 sc, u8 sl, u16 entropy,
+			       u16 len, u16 pkey, u32 dlid, u32 slid)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+
+	trace_seq_printf(p, LRH_PRN, len, sc, dlid, slid);
+
+	if (bypass)
+		trace_seq_printf(p, LRH_16B_PRN,
+				 age, becn, fecn, l4, rc, sc, pkey, entropy);
+
+	else
+		trace_seq_printf(p, LRH_9B_PRN,
+				 lnh, lnh_name, lver, sl);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+#define BTH_9B_PRN \
+	"op:0x%.2x,%s se:%d m:%d pad:%d tver:%d pkey:0x%.4x " \
+	"f:%d b:%d qpn:0x%.6x a:%d psn:0x%.8x"
+#define BTH_16B_PRN \
+	"op:0x%.2x,%s se:%d m:%d pad:%d tver:%d " \
+	"qpn:0x%.6x a:%d psn:0x%.8x"
+const char *hfi1_trace_fmt_bth(struct trace_seq *p, bool bypass,
+			       u8 ack, u8 becn, u8 fecn, u8 mig,
+			       u8 se, u8 pad, u8 opcode, const char *opname,
+			       u8 tver, u16 pkey, u32 psn, u32 qpn)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+
+	if (bypass)
+		trace_seq_printf(p, BTH_16B_PRN,
+				 opcode, opname,
+				 se, mig, pad, tver, qpn, ack, psn);
+
+	else
+		trace_seq_printf(p, BTH_9B_PRN,
+				 opcode, opname,
+				 se, mig, pad, tver, pkey, fecn, becn,
+				 qpn, ack, psn);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
 const char *parse_everbs_hdrs(
 	struct trace_seq *p,
 	u8 opcode,
diff --git a/drivers/infiniband/hw/hfi1/trace.h b/drivers/infiniband/hw/hfi1/trace.h
index 92dc88f..af50c07 100644
--- a/drivers/infiniband/hw/hfi1/trace.h
+++ b/drivers/infiniband/hw/hfi1/trace.h
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -51,3 +51,4 @@
 #include "trace_rc.h"
 #include "trace_rx.h"
 #include "trace_tx.h"
+#include "trace_mmu.h"
diff --git a/drivers/infiniband/hw/hfi1/trace_ibhdrs.h b/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
index 090f6b5..6721f84 100644
--- a/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
+++ b/drivers/infiniband/hw/hfi1/trace_ibhdrs.h
@@ -55,8 +55,79 @@
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM hfi1_ibhdrs
 
+#define ib_opcode_name(opcode) { IB_OPCODE_##opcode, #opcode  }
+#define show_ib_opcode(opcode)                             \
+__print_symbolic(opcode,                                   \
+	ib_opcode_name(RC_SEND_FIRST),                     \
+	ib_opcode_name(RC_SEND_MIDDLE),                    \
+	ib_opcode_name(RC_SEND_LAST),                      \
+	ib_opcode_name(RC_SEND_LAST_WITH_IMMEDIATE),       \
+	ib_opcode_name(RC_SEND_ONLY),                      \
+	ib_opcode_name(RC_SEND_ONLY_WITH_IMMEDIATE),       \
+	ib_opcode_name(RC_RDMA_WRITE_FIRST),               \
+	ib_opcode_name(RC_RDMA_WRITE_MIDDLE),              \
+	ib_opcode_name(RC_RDMA_WRITE_LAST),                \
+	ib_opcode_name(RC_RDMA_WRITE_LAST_WITH_IMMEDIATE), \
+	ib_opcode_name(RC_RDMA_WRITE_ONLY),                \
+	ib_opcode_name(RC_RDMA_WRITE_ONLY_WITH_IMMEDIATE), \
+	ib_opcode_name(RC_RDMA_READ_REQUEST),              \
+	ib_opcode_name(RC_RDMA_READ_RESPONSE_FIRST),       \
+	ib_opcode_name(RC_RDMA_READ_RESPONSE_MIDDLE),      \
+	ib_opcode_name(RC_RDMA_READ_RESPONSE_LAST),        \
+	ib_opcode_name(RC_RDMA_READ_RESPONSE_ONLY),        \
+	ib_opcode_name(RC_ACKNOWLEDGE),                    \
+	ib_opcode_name(RC_ATOMIC_ACKNOWLEDGE),             \
+	ib_opcode_name(RC_COMPARE_SWAP),                   \
+	ib_opcode_name(RC_FETCH_ADD),                      \
+	ib_opcode_name(UC_SEND_FIRST),                     \
+	ib_opcode_name(UC_SEND_MIDDLE),                    \
+	ib_opcode_name(UC_SEND_LAST),                      \
+	ib_opcode_name(UC_SEND_LAST_WITH_IMMEDIATE),       \
+	ib_opcode_name(UC_SEND_ONLY),                      \
+	ib_opcode_name(UC_SEND_ONLY_WITH_IMMEDIATE),       \
+	ib_opcode_name(UC_RDMA_WRITE_FIRST),               \
+	ib_opcode_name(UC_RDMA_WRITE_MIDDLE),              \
+	ib_opcode_name(UC_RDMA_WRITE_LAST),                \
+	ib_opcode_name(UC_RDMA_WRITE_LAST_WITH_IMMEDIATE), \
+	ib_opcode_name(UC_RDMA_WRITE_ONLY),                \
+	ib_opcode_name(UC_RDMA_WRITE_ONLY_WITH_IMMEDIATE), \
+	ib_opcode_name(UD_SEND_ONLY),                      \
+	ib_opcode_name(UD_SEND_ONLY_WITH_IMMEDIATE),       \
+	ib_opcode_name(CNP))
+
 u8 ibhdr_exhdr_len(struct ib_header *hdr);
 const char *parse_everbs_hdrs(struct trace_seq *p, u8 opcode, void *ehdrs);
+u8 hfi1_trace_opa_hdr_len(struct hfi1_opa_header *opah);
+u8 hfi1_trace_packet_hdr_len(struct hfi1_packet *packet);
+const char *hfi1_trace_get_packet_type_str(u8 l4);
+const char *hfi1_trace_get_packet_str(struct hfi1_packet *packet);
+void hfi1_trace_parse_9b_bth(struct ib_other_headers *ohdr,
+			     u8 *ack, u8 *becn, u8 *fecn, u8 *mig,
+			     u8 *se, u8 *pad, u8 *opcode, u8 *tver,
+			     u16 *pkey, u32 *psn, u32 *qpn);
+void hfi1_trace_parse_9b_hdr(struct ib_header *hdr, bool sc5,
+			     u8 *lnh, u8 *lver, u8 *sl, u8 *sc,
+			     u16 *len, u32 *dlid, u32 *slid);
+void hfi1_trace_parse_16b_bth(struct ib_other_headers *ohdr,
+			      u8 *ack, u8 *mig, u8 *opcode,
+			      u8 *pad, u8 *se, u8 *tver,
+			      u32 *psn, u32 *qpn);
+void hfi1_trace_parse_16b_hdr(struct hfi1_16b_header *hdr,
+			      u8 *age, u8 *becn, u8 *fecn,
+			      u8 *l4, u8 *rc, u8 *sc,
+			      u16 *entropy, u16 *len, u16 *pkey,
+			      u32 *dlid, u32 *slid);
+
+const char *hfi1_trace_fmt_lrh(struct trace_seq *p, bool bypass,
+			       u8 age, u8 becn, u8 fecn, u8 l4,
+			       u8 lnh, const char *lnh_name, u8 lver,
+			       u8 rc, u8 sc, u8 sl, u16 entropy,
+			       u16 len, u16 pkey, u32 dlid, u32 slid);
+
+const char *hfi1_trace_fmt_bth(struct trace_seq *p, bool bypass,
+			       u8 ack, u8 becn, u8 fecn, u8 mig,
+			       u8 se, u8 pad, u8 opcode, const char *opname,
+			       u8 tver, u16 pkey, u32 psn, u32 qpn);
 
 #define __parse_ib_ehdrs(op, ehdrs) parse_everbs_hdrs(p, op, ehdrs)
 
@@ -65,140 +136,303 @@ const char *parse_everbs_hdrs(struct trace_seq *p, u8 opcode, void *ehdrs);
 __print_symbolic(lrh,                    \
 	lrh_name(LRH_BTH),               \
 	lrh_name(LRH_GRH))
+#define PKT_ENTRY(pkt)	__string(ptype,  hfi1_trace_get_packet_str(pkt))
+#define PKT_ASSIGN(pkt) __assign_str(ptype, hfi1_trace_get_packet_str(pkt))
 
-#define LRH_PRN "vl %d lver %d sl %d lnh %d,%s dlid %.4x len %d slid %.4x"
-#define BTH_PRN \
-	"op 0x%.2x,%s se %d m %d pad %d tver %d pkey 0x%.4x " \
-	"f %d b %d qpn 0x%.6x a %d psn 0x%.8x"
-#define EHDR_PRN "%s"
-
-DECLARE_EVENT_CLASS(hfi1_ibhdr_template,
+DECLARE_EVENT_CLASS(hfi1_input_ibhdr_template,
 		    TP_PROTO(struct hfi1_devdata *dd,
-			     struct ib_header *hdr),
-		    TP_ARGS(dd, hdr),
+			     struct hfi1_packet *packet,
+			     bool sc5),
+		    TP_ARGS(dd, packet, sc5),
 		    TP_STRUCT__entry(
 			DD_DEV_ENTRY(dd)
-			/* LRH */
-			__field(u8, vl)
-			__field(u8, lver)
-			__field(u8, sl)
+			PKT_ENTRY(packet)
+			__field(bool, bypass)
+			__field(u8, ack)
+			__field(u8, age)
+			__field(u8, becn)
+			__field(u8, fecn)
+			__field(u8, l4)
 			__field(u8, lnh)
-			__field(u16, dlid)
-			__field(u16, len)
-			__field(u16, slid)
-			/* BTH */
+			__field(u8, lver)
+			__field(u8, mig)
 			__field(u8, opcode)
-			__field(u8, se)
-			__field(u8, m)
 			__field(u8, pad)
+			__field(u8, rc)
+			__field(u8, sc)
+			__field(u8, se)
+			__field(u8, sl)
 			__field(u8, tver)
+			__field(u16, entropy)
+			__field(u16, len)
 			__field(u16, pkey)
-			__field(u8, f)
-			__field(u8, b)
-			__field(u32, qpn)
-			__field(u8, a)
+			__field(u32, dlid)
 			__field(u32, psn)
+			__field(u32, qpn)
+			__field(u32, slid)
 			/* extended headers */
-			__dynamic_array(u8, ehdrs, ibhdr_exhdr_len(hdr))
+			__dynamic_array(u8, ehdrs,
+					hfi1_trace_packet_hdr_len(packet))
 			),
-		      TP_fast_assign(
+		    TP_fast_assign(
+			DD_DEV_ASSIGN(dd);
+			PKT_ASSIGN(packet);
+
+			if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+				__entry->bypass = true;
+				hfi1_trace_parse_16b_hdr(packet->hdr,
+							 &__entry->age,
+							 &__entry->becn,
+							 &__entry->fecn,
+							 &__entry->l4,
+							 &__entry->rc,
+							 &__entry->sc,
+							 &__entry->entropy,
+							 &__entry->len,
+							 &__entry->pkey,
+							 &__entry->dlid,
+							 &__entry->slid);
+
+				  hfi1_trace_parse_16b_bth(packet->ohdr,
+							   &__entry->ack,
+							   &__entry->mig,
+							   &__entry->opcode,
+							   &__entry->pad,
+							   &__entry->se,
+							   &__entry->tver,
+							   &__entry->psn,
+							   &__entry->qpn);
+			} else {
+				__entry->bypass = false;
+				hfi1_trace_parse_9b_hdr(packet->hdr, sc5,
+							&__entry->lnh,
+							&__entry->lver,
+							&__entry->sl,
+							&__entry->sc,
+							&__entry->len,
+							&__entry->dlid,
+							&__entry->slid);
+
+				  hfi1_trace_parse_9b_bth(packet->ohdr,
+							  &__entry->ack,
+							  &__entry->becn,
+							  &__entry->fecn,
+							  &__entry->mig,
+							  &__entry->se,
+							  &__entry->pad,
+							  &__entry->opcode,
+							  &__entry->tver,
+							  &__entry->pkey,
+							  &__entry->psn,
+							  &__entry->qpn);
+				}
+				/* extended headers */
+				memcpy(__get_dynamic_array(ehdrs),
+				       &packet->ohdr->u,
+				       __get_dynamic_array_len(ehdrs));
+			 ),
+		    TP_printk("[%s] (%s) %s %s hlen:%d %s",
+			      __get_str(dev),
+			      __get_str(ptype),
+			      hfi1_trace_fmt_lrh(p,
+						 __entry->bypass,
+						 __entry->age,
+						 __entry->becn,
+						 __entry->fecn,
+						 __entry->l4,
+						 __entry->lnh,
+						 show_lnh(__entry->lnh),
+						 __entry->lver,
+						 __entry->rc,
+						 __entry->sc,
+						 __entry->sl,
+						 __entry->entropy,
+						 __entry->len,
+						 __entry->pkey,
+						 __entry->dlid,
+						 __entry->slid),
+			      hfi1_trace_fmt_bth(p,
+						 __entry->bypass,
+						 __entry->ack,
+						 __entry->becn,
+						 __entry->fecn,
+						 __entry->mig,
+						 __entry->se,
+						 __entry->pad,
+						 __entry->opcode,
+						 show_ib_opcode(__entry->opcode),
+						 __entry->tver,
+						 __entry->pkey,
+						 __entry->psn,
+						 __entry->qpn),
+			      /* extended headers */
+			      __get_dynamic_array_len(ehdrs),
+			      __parse_ib_ehdrs(
+					__entry->opcode,
+					(void *)__get_dynamic_array(ehdrs))
+			     )
+);
+
+DEFINE_EVENT(hfi1_input_ibhdr_template, input_ibhdr,
+	     TP_PROTO(struct hfi1_devdata *dd,
+		      struct hfi1_packet *packet, bool sc5),
+	     TP_ARGS(dd, packet, sc5));
+
+DECLARE_EVENT_CLASS(hfi1_output_ibhdr_template,
+		    TP_PROTO(struct hfi1_devdata *dd,
+			     struct hfi1_opa_header *opah, bool sc5),
+		    TP_ARGS(dd, opah, sc5),
+		    TP_STRUCT__entry(
+			DD_DEV_ENTRY(dd)
+			__field(bool, bypass)
+			__field(u8, ack)
+			__field(u8, age)
+			__field(u8, becn)
+			__field(u8, fecn)
+			__field(u8, l4)
+			__field(u8, lnh)
+			__field(u8, lver)
+			__field(u8, mig)
+			__field(u8, opcode)
+			__field(u8, pad)
+			__field(u8, rc)
+			__field(u8, sc)
+			__field(u8, se)
+			__field(u8, sl)
+			__field(u8, tver)
+			__field(u16, entropy)
+			__field(u16, len)
+			__field(u16, pkey)
+			__field(u32, dlid)
+			__field(u32, psn)
+			__field(u32, qpn)
+			__field(u32, slid)
+			/* extended headers */
+			__dynamic_array(u8, ehdrs,
+					hfi1_trace_opa_hdr_len(opah))
+			),
+		    TP_fast_assign(
 			struct ib_other_headers *ohdr;
 
 			DD_DEV_ASSIGN(dd);
-			/* LRH */
-			__entry->vl =
-			(u8)(be16_to_cpu(hdr->lrh[0]) >> 12);
-			__entry->lver =
-			(u8)(be16_to_cpu(hdr->lrh[0]) >> 8) & 0xf;
-			__entry->sl =
-			(u8)(be16_to_cpu(hdr->lrh[0]) >> 4) & 0xf;
-			__entry->lnh =
-			(u8)(be16_to_cpu(hdr->lrh[0]) & 3);
-			__entry->dlid =
-			be16_to_cpu(hdr->lrh[1]);
-			/* allow for larger len */
-			__entry->len =
-			be16_to_cpu(hdr->lrh[2]);
-			__entry->slid =
-			be16_to_cpu(hdr->lrh[3]);
-			/* BTH */
-			if (__entry->lnh == HFI1_LRH_BTH)
-			ohdr = &hdr->u.oth;
-			else
-			ohdr = &hdr->u.l.oth;
-			__entry->opcode =
-			(be32_to_cpu(ohdr->bth[0]) >> 24) & 0xff;
-			__entry->se =
-			(be32_to_cpu(ohdr->bth[0]) >> 23) & 1;
-			__entry->m =
-			(be32_to_cpu(ohdr->bth[0]) >> 22) & 1;
-			__entry->pad =
-			(be32_to_cpu(ohdr->bth[0]) >> 20) & 3;
-			__entry->tver =
-			(be32_to_cpu(ohdr->bth[0]) >> 16) & 0xf;
-			__entry->pkey =
-			be32_to_cpu(ohdr->bth[0]) & 0xffff;
-			__entry->f =
-			(be32_to_cpu(ohdr->bth[1]) >> IB_FECN_SHIFT) &
-			IB_FECN_MASK;
-			__entry->b =
-			(be32_to_cpu(ohdr->bth[1]) >> IB_BECN_SHIFT) &
-			IB_BECN_MASK;
-			__entry->qpn =
-			be32_to_cpu(ohdr->bth[1]) & RVT_QPN_MASK;
-			__entry->a =
-			(be32_to_cpu(ohdr->bth[2]) >> 31) & 1;
-			/* allow for larger PSN */
-			__entry->psn =
-			be32_to_cpu(ohdr->bth[2]) & 0x7fffffff;
+
+			if (opah->hdr_type)  {
+				__entry->bypass = true;
+				hfi1_trace_parse_16b_hdr(&opah->opah,
+							 &__entry->age,
+							 &__entry->becn,
+							 &__entry->fecn,
+							 &__entry->l4,
+							 &__entry->rc,
+							 &__entry->sc,
+							 &__entry->entropy,
+							 &__entry->len,
+							 &__entry->pkey,
+							 &__entry->dlid,
+							 &__entry->slid);
+
+				if (__entry->l4 == OPA_16B_L4_IB_LOCAL)
+					ohdr = &opah->opah.u.oth;
+				else
+					ohdr = &opah->opah.u.l.oth;
+				hfi1_trace_parse_16b_bth(ohdr,
+							 &__entry->ack,
+							 &__entry->mig,
+							 &__entry->opcode,
+							 &__entry->pad,
+							 &__entry->se,
+							 &__entry->tver,
+							 &__entry->psn,
+							 &__entry->qpn);
+			} else {
+				__entry->bypass = false;
+				hfi1_trace_parse_9b_hdr(&opah->ibh, sc5,
+							&__entry->lnh,
+							&__entry->lver,
+							&__entry->sl,
+							&__entry->sc,
+							&__entry->len,
+							&__entry->dlid,
+							&__entry->slid);
+				if (__entry->lnh == HFI1_LRH_BTH)
+					ohdr = &opah->ibh.u.oth;
+				else
+					ohdr = &opah->ibh.u.l.oth;
+				hfi1_trace_parse_9b_bth(ohdr,
+							&__entry->ack,
+							&__entry->becn,
+							&__entry->fecn,
+							&__entry->mig,
+							&__entry->se,
+							&__entry->pad,
+							&__entry->opcode,
+							&__entry->tver,
+							&__entry->pkey,
+							&__entry->psn,
+							&__entry->qpn);
+			}
+
 			/* extended headers */
-			memcpy(__get_dynamic_array(ehdrs), &ohdr->u,
-			       ibhdr_exhdr_len(hdr));
-			),
-		TP_printk("[%s] " LRH_PRN " " BTH_PRN " " EHDR_PRN,
-			  __get_str(dev),
-			  /* LRH */
-			  __entry->vl,
-			  __entry->lver,
-			  __entry->sl,
-			  __entry->lnh, show_lnh(__entry->lnh),
-			  __entry->dlid,
-			  __entry->len,
-			  __entry->slid,
-			  /* BTH */
-			  __entry->opcode, show_ib_opcode(__entry->opcode),
-			  __entry->se,
-			  __entry->m,
-			  __entry->pad,
-			  __entry->tver,
-			  __entry->pkey,
-			  __entry->f,
-			  __entry->b,
-			  __entry->qpn,
-			  __entry->a,
-			  __entry->psn,
-			  /* extended headers */
-			  __parse_ib_ehdrs(
-				__entry->opcode,
-				(void *)__get_dynamic_array(ehdrs))
-			)
+			memcpy(__get_dynamic_array(ehdrs),
+			       &ohdr->u, __get_dynamic_array_len(ehdrs));
+		    ),
+		    TP_printk("[%s] (%s) %s %s hlen:%d %s",
+			      __get_str(dev),
+			      hfi1_trace_get_packet_type_str(__entry->l4),
+			      hfi1_trace_fmt_lrh(p,
+						 __entry->bypass,
+						 __entry->age,
+						 __entry->becn,
+						 __entry->fecn,
+						 __entry->l4,
+						 __entry->lnh,
+						 show_lnh(__entry->lnh),
+						 __entry->lver,
+						 __entry->rc,
+						 __entry->sc,
+						 __entry->sl,
+						 __entry->entropy,
+						 __entry->len,
+						 __entry->pkey,
+						 __entry->dlid,
+						 __entry->slid),
+			      hfi1_trace_fmt_bth(p,
+						 __entry->bypass,
+						 __entry->ack,
+						 __entry->becn,
+						 __entry->fecn,
+						 __entry->mig,
+						 __entry->se,
+						 __entry->pad,
+						 __entry->opcode,
+						 show_ib_opcode(__entry->opcode),
+						 __entry->tver,
+						 __entry->pkey,
+						 __entry->psn,
+						 __entry->qpn),
+			      /* extended headers */
+			      __get_dynamic_array_len(ehdrs),
+			      __parse_ib_ehdrs(
+					__entry->opcode,
+					(void *)__get_dynamic_array(ehdrs))
+			     )
 );
 
-DEFINE_EVENT(hfi1_ibhdr_template, input_ibhdr,
-	     TP_PROTO(struct hfi1_devdata *dd, struct ib_header *hdr),
-	     TP_ARGS(dd, hdr));
+DEFINE_EVENT(hfi1_output_ibhdr_template, pio_output_ibhdr,
+	     TP_PROTO(struct hfi1_devdata *dd,
+		      struct hfi1_opa_header *opah, bool sc5),
+	     TP_ARGS(dd, opah, sc5));
 
-DEFINE_EVENT(hfi1_ibhdr_template, pio_output_ibhdr,
-	     TP_PROTO(struct hfi1_devdata *dd, struct ib_header *hdr),
-	     TP_ARGS(dd, hdr));
+DEFINE_EVENT(hfi1_output_ibhdr_template, ack_output_ibhdr,
+	     TP_PROTO(struct hfi1_devdata *dd,
+		      struct hfi1_opa_header *opah, bool sc5),
+	     TP_ARGS(dd, opah, sc5));
 
-DEFINE_EVENT(hfi1_ibhdr_template, ack_output_ibhdr,
-	     TP_PROTO(struct hfi1_devdata *dd, struct ib_header *hdr),
-	     TP_ARGS(dd, hdr));
+DEFINE_EVENT(hfi1_output_ibhdr_template, sdma_output_ibhdr,
+	     TP_PROTO(struct hfi1_devdata *dd,
+		      struct hfi1_opa_header *opah, bool sc5),
+	     TP_ARGS(dd, opah, sc5));
 
-DEFINE_EVENT(hfi1_ibhdr_template, sdma_output_ibhdr,
-	     TP_PROTO(struct hfi1_devdata *dd, struct ib_header *hdr),
-	     TP_ARGS(dd, hdr));
 
 #endif /* __HFI1_TRACE_IBHDRS_H */
 
diff --git a/drivers/infiniband/hw/hfi1/trace_misc.h b/drivers/infiniband/hw/hfi1/trace_misc.h
index deac77d..8db2253 100644
--- a/drivers/infiniband/hw/hfi1/trace_misc.h
+++ b/drivers/infiniband/hw/hfi1/trace_misc.h
@@ -72,6 +72,26 @@ TRACE_EVENT(hfi1_interrupt,
 		      __entry->src)
 );
 
+DECLARE_EVENT_CLASS(
+	hfi1_csr_template,
+	TP_PROTO(void __iomem *addr, u64 value),
+	TP_ARGS(addr, value),
+	TP_STRUCT__entry(
+		__field(void __iomem *, addr)
+		__field(u64, value)
+	),
+	TP_fast_assign(
+		__entry->addr = addr;
+		__entry->value = value;
+	),
+	TP_printk("addr %p value %llx", __entry->addr, __entry->value)
+);
+
+DEFINE_EVENT(
+	hfi1_csr_template, hfi1_write_rcvarray,
+	TP_PROTO(void __iomem *addr, u64 value),
+	TP_ARGS(addr, value));
+
 #ifdef CONFIG_FAULT_INJECTION
 TRACE_EVENT(hfi1_fault_opcode,
 	    TP_PROTO(struct rvt_qp *qp, u8 opcode),
diff --git a/drivers/infiniband/hw/hfi1/trace_mmu.h b/drivers/infiniband/hw/hfi1/trace_mmu.h
new file mode 100644
index 0000000..3b7abbc
--- /dev/null
+++ b/drivers/infiniband/hw/hfi1/trace_mmu.h
@@ -0,0 +1,95 @@
+/*
+ * Copyright(c) 2017 Intel Corporation.
+ *
+ * This file is provided under a dual BSD/GPLv2 license.  When using or
+ * redistributing this file, you may do so under either license.
+ *
+ * GPL LICENSE SUMMARY
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of version 2 of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * BSD LICENSE
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ *  - Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer.
+ *  - Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *  - Neither the name of Intel Corporation nor the names of its
+ *    contributors may be used to endorse or promote products derived
+ *    from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ *
+ */
+#if !defined(__HFI1_TRACE_MMU_H) || defined(TRACE_HEADER_MULTI_READ)
+#define __HFI1_TRACE_MMU_H
+
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+#include "hfi.h"
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM hfi1_mmu
+
+DECLARE_EVENT_CLASS(hfi1_mmu_rb_template,
+		    TP_PROTO(unsigned long addr, unsigned long len),
+		    TP_ARGS(addr, len),
+		    TP_STRUCT__entry(__field(unsigned long, addr)
+				     __field(unsigned long, len)
+			    ),
+		    TP_fast_assign(__entry->addr = addr;
+				   __entry->len = len;
+			    ),
+		    TP_printk("MMU node addr 0x%lx, len %lu",
+			      __entry->addr,
+			      __entry->len
+			    )
+);
+
+DEFINE_EVENT(hfi1_mmu_rb_template, hfi1_mmu_rb_insert,
+	     TP_PROTO(unsigned long addr, unsigned long len),
+	     TP_ARGS(addr, len));
+
+DEFINE_EVENT(hfi1_mmu_rb_template, hfi1_mmu_rb_search,
+	     TP_PROTO(unsigned long addr, unsigned long len),
+	     TP_ARGS(addr, len));
+
+DEFINE_EVENT(hfi1_mmu_rb_template, hfi1_mmu_rb_remove,
+	     TP_PROTO(unsigned long addr, unsigned long len),
+	     TP_ARGS(addr, len));
+
+DEFINE_EVENT(hfi1_mmu_rb_template, hfi1_mmu_mem_invalidate,
+	     TP_PROTO(unsigned long addr, unsigned long len),
+	     TP_ARGS(addr, len));
+
+#endif /* __HFI1_TRACE_MMU_H */
+
+#undef TRACE_INCLUDE_PATH
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_PATH .
+#define TRACE_INCLUDE_FILE trace_mmu
+#include <trace/define_trace.h>
diff --git a/drivers/infiniband/hw/hfi1/trace_rx.h b/drivers/infiniband/hw/hfi1/trace_rx.h
index f77e59f..f9909d2 100644
--- a/drivers/infiniband/hw/hfi1/trace_rx.h
+++ b/drivers/infiniband/hw/hfi1/trace_rx.h
@@ -52,9 +52,25 @@
 
 #include "hfi.h"
 
+#define tidtype_name(type) { PT_##type, #type }
+#define show_tidtype(type)                   \
+__print_symbolic(type,                       \
+	tidtype_name(EXPECTED),              \
+	tidtype_name(EAGER),                 \
+	tidtype_name(INVALID))
+
 #undef TRACE_SYSTEM
 #define TRACE_SYSTEM hfi1_rx
 
+#define packettype_name(etype) { RHF_RCV_TYPE_##etype, #etype }
+#define show_packettype(etype)                  \
+__print_symbolic(etype,                         \
+	packettype_name(EXPECTED),              \
+	packettype_name(EAGER),                 \
+	packettype_name(IB),                    \
+	packettype_name(ERROR),                 \
+	packettype_name(BYPASS))
+
 TRACE_EVENT(hfi1_rcvhdr,
 	    TP_PROTO(struct hfi1_devdata *dd,
 		     u32 ctxt,
@@ -98,24 +114,24 @@ TRACE_EVENT(hfi1_rcvhdr,
 );
 
 TRACE_EVENT(hfi1_receive_interrupt,
-	    TP_PROTO(struct hfi1_devdata *dd, u32 ctxt),
-	    TP_ARGS(dd, ctxt),
+	    TP_PROTO(struct hfi1_devdata *dd, struct hfi1_ctxtdata *rcd),
+	    TP_ARGS(dd, rcd),
 	    TP_STRUCT__entry(DD_DEV_ENTRY(dd)
 			     __field(u32, ctxt)
 			     __field(u8, slow_path)
 			     __field(u8, dma_rtail)
 			     ),
 	    TP_fast_assign(DD_DEV_ASSIGN(dd);
-			__entry->ctxt = ctxt;
-			if (dd->rcd[ctxt]->do_interrupt ==
+			__entry->ctxt = rcd->ctxt;
+			if (rcd->do_interrupt ==
 			    &handle_receive_interrupt) {
 				__entry->slow_path = 1;
 				__entry->dma_rtail = 0xFF;
-			} else if (dd->rcd[ctxt]->do_interrupt ==
+			} else if (rcd->do_interrupt ==
 					&handle_receive_interrupt_dma_rtail){
 				__entry->dma_rtail = 1;
 				__entry->slow_path = 0;
-			} else if (dd->rcd[ctxt]->do_interrupt ==
+			} else if (rcd->do_interrupt ==
 					&handle_receive_interrupt_nodma_rtail) {
 				__entry->dma_rtail = 0;
 				__entry->slow_path = 0;
@@ -129,7 +145,8 @@ TRACE_EVENT(hfi1_receive_interrupt,
 		      )
 );
 
-TRACE_EVENT(hfi1_exp_tid_reg,
+DECLARE_EVENT_CLASS(
+	    hfi1_exp_tid_reg_unreg,
 	    TP_PROTO(unsigned int ctxt, u16 subctxt, u32 rarr,
 		     u32 npages, unsigned long va, unsigned long pa,
 		     dma_addr_t dma),
@@ -163,38 +180,45 @@ TRACE_EVENT(hfi1_exp_tid_reg,
 		      )
 	);
 
-TRACE_EVENT(hfi1_exp_tid_unreg,
-	    TP_PROTO(unsigned int ctxt, u16 subctxt, u32 rarr, u32 npages,
-		     unsigned long va, unsigned long pa, dma_addr_t dma),
-	    TP_ARGS(ctxt, subctxt, rarr, npages, va, pa, dma),
-	    TP_STRUCT__entry(
-			     __field(unsigned int, ctxt)
-			     __field(u16, subctxt)
-			     __field(u32, rarr)
-			     __field(u32, npages)
-			     __field(unsigned long, va)
-			     __field(unsigned long, pa)
-			     __field(dma_addr_t, dma)
-			     ),
-	    TP_fast_assign(
-			   __entry->ctxt = ctxt;
-			   __entry->subctxt = subctxt;
-			   __entry->rarr = rarr;
-			   __entry->npages = npages;
-			   __entry->va = va;
-			   __entry->pa = pa;
-			   __entry->dma = dma;
-			   ),
-	    TP_printk("[%u:%u] entry:%u, %u pages @ 0x%lx, va:0x%lx dma:0x%llx",
-		      __entry->ctxt,
-		      __entry->subctxt,
-		      __entry->rarr,
-		      __entry->npages,
-		      __entry->pa,
-		      __entry->va,
-		      __entry->dma
-		      )
-	);
+DEFINE_EVENT(
+	hfi1_exp_tid_reg_unreg, hfi1_exp_tid_unreg,
+	TP_PROTO(unsigned int ctxt, u16 subctxt, u32 rarr, u32 npages,
+		 unsigned long va, unsigned long pa, dma_addr_t dma),
+	TP_ARGS(ctxt, subctxt, rarr, npages, va, pa, dma));
+
+DEFINE_EVENT(
+	hfi1_exp_tid_reg_unreg, hfi1_exp_tid_reg,
+	TP_PROTO(unsigned int ctxt, u16 subctxt, u32 rarr, u32 npages,
+		 unsigned long va, unsigned long pa, dma_addr_t dma),
+	TP_ARGS(ctxt, subctxt, rarr, npages, va, pa, dma));
+
+TRACE_EVENT(
+	hfi1_put_tid,
+	TP_PROTO(struct hfi1_devdata *dd,
+		 u32 index, u32 type, unsigned long pa, u16 order),
+	TP_ARGS(dd, index, type, pa, order),
+	TP_STRUCT__entry(
+		DD_DEV_ENTRY(dd)
+		__field(unsigned long, pa)
+		__field(u32, index)
+		__field(u32, type)
+		__field(u16, order)
+	),
+	TP_fast_assign(
+		DD_DEV_ASSIGN(dd);
+		__entry->pa = pa;
+		__entry->index = index;
+		__entry->type = type;
+		__entry->order = order;
+	),
+	TP_printk("[%s] type %s pa %lx index %u order %u",
+		  __get_str(dev),
+		  show_tidtype(__entry->type),
+		  __entry->pa,
+		  __entry->index,
+		  __entry->order
+	)
+);
 
 TRACE_EVENT(hfi1_exp_tid_inval,
 	    TP_PROTO(unsigned int ctxt, u16 subctxt, unsigned long va, u32 rarr,
diff --git a/drivers/infiniband/hw/hfi1/trace_tx.h b/drivers/infiniband/hw/hfi1/trace_tx.h
index c59809a..c57af3b 100644
--- a/drivers/infiniband/hw/hfi1/trace_tx.h
+++ b/drivers/infiniband/hw/hfi1/trace_tx.h
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2015, 2016 Intel Corporation.
+ * Copyright(c) 2015 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -198,6 +198,140 @@ TRACE_EVENT(hfi1_sdma_engine_select,
 		      )
 );
 
+TRACE_EVENT(hfi1_sdma_user_free_queues,
+	    TP_PROTO(struct hfi1_devdata *dd, u16 ctxt, u16 subctxt),
+	    TP_ARGS(dd, ctxt, subctxt),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(dd)
+			     __field(u16, ctxt)
+			     __field(u16, subctxt)
+			     ),
+	    TP_fast_assign(DD_DEV_ASSIGN(dd);
+			   __entry->ctxt = ctxt;
+			   __entry->subctxt = subctxt;
+			   ),
+	    TP_printk("[%s] SDMA [%u:%u] Freeing user SDMA queues",
+		      __get_str(dev),
+		      __entry->ctxt,
+		      __entry->subctxt
+		      )
+);
+
+TRACE_EVENT(hfi1_sdma_user_process_request,
+	    TP_PROTO(struct hfi1_devdata *dd, u16 ctxt, u16 subctxt,
+		     u16 comp_idx),
+	    TP_ARGS(dd, ctxt, subctxt, comp_idx),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(dd)
+			     __field(u16, ctxt)
+			     __field(u16, subctxt)
+			     __field(u16, comp_idx)
+			     ),
+	    TP_fast_assign(DD_DEV_ASSIGN(dd);
+			   __entry->ctxt = ctxt;
+			   __entry->subctxt = subctxt;
+			   __entry->comp_idx = comp_idx;
+			   ),
+	    TP_printk("[%s] SDMA [%u:%u] Using req/comp entry: %u",
+		      __get_str(dev),
+		      __entry->ctxt,
+		      __entry->subctxt,
+		      __entry->comp_idx
+		      )
+);
+
+DECLARE_EVENT_CLASS(
+	hfi1_sdma_value_template,
+	TP_PROTO(struct hfi1_devdata *dd, u16 ctxt, u16 subctxt, u16 comp_idx,
+		 u32 value),
+	TP_ARGS(dd, ctxt, subctxt, comp_idx, value),
+	TP_STRUCT__entry(DD_DEV_ENTRY(dd)
+			 __field(u16, ctxt)
+			 __field(u16, subctxt)
+			 __field(u16, comp_idx)
+			 __field(u32, value)
+		),
+	TP_fast_assign(DD_DEV_ASSIGN(dd);
+		       __entry->ctxt = ctxt;
+		       __entry->subctxt = subctxt;
+		       __entry->comp_idx = comp_idx;
+		       __entry->value = value;
+		),
+	TP_printk("[%s] SDMA [%u:%u:%u] value: %u",
+		  __get_str(dev),
+		  __entry->ctxt,
+		  __entry->subctxt,
+		  __entry->comp_idx,
+		  __entry->value
+		)
+);
+
+DEFINE_EVENT(hfi1_sdma_value_template, hfi1_sdma_user_initial_tidoffset,
+	     TP_PROTO(struct hfi1_devdata *dd, u16 ctxt, u16 subctxt,
+		      u16 comp_idx, u32 tidoffset),
+	     TP_ARGS(dd, ctxt, subctxt, comp_idx, tidoffset));
+
+DEFINE_EVENT(hfi1_sdma_value_template, hfi1_sdma_user_data_length,
+	     TP_PROTO(struct hfi1_devdata *dd, u16 ctxt, u16 subctxt,
+		      u16 comp_idx, u32 data_len),
+	     TP_ARGS(dd, ctxt, subctxt, comp_idx, data_len));
+
+DEFINE_EVENT(hfi1_sdma_value_template, hfi1_sdma_user_compute_length,
+	     TP_PROTO(struct hfi1_devdata *dd, u16 ctxt, u16 subctxt,
+		      u16 comp_idx, u32 data_len),
+	     TP_ARGS(dd, ctxt, subctxt, comp_idx, data_len));
+
+TRACE_EVENT(hfi1_sdma_user_tid_info,
+	    TP_PROTO(struct hfi1_devdata *dd, u16 ctxt, u16 subctxt,
+		     u16 comp_idx, u32 tidoffset, u32 units, u8 shift),
+	    TP_ARGS(dd, ctxt, subctxt, comp_idx, tidoffset, units, shift),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(dd)
+			     __field(u16, ctxt)
+			     __field(u16, subctxt)
+			     __field(u16, comp_idx)
+			     __field(u32, tidoffset)
+			     __field(u32, units)
+			     __field(u8, shift)
+			     ),
+	    TP_fast_assign(DD_DEV_ASSIGN(dd);
+			   __entry->ctxt = ctxt;
+			   __entry->subctxt = subctxt;
+			   __entry->comp_idx = comp_idx;
+			   __entry->tidoffset = tidoffset;
+			   __entry->units = units;
+			   __entry->shift = shift;
+			   ),
+	    TP_printk("[%s] SDMA [%u:%u:%u] TID offset %ubytes %uunits om %u",
+		      __get_str(dev),
+		      __entry->ctxt,
+		      __entry->subctxt,
+		      __entry->comp_idx,
+		      __entry->tidoffset,
+		      __entry->units,
+		      __entry->shift
+		      )
+);
+
+TRACE_EVENT(hfi1_sdma_request,
+	    TP_PROTO(struct hfi1_devdata *dd, u16 ctxt, u16 subctxt,
+		     unsigned long dim),
+	    TP_ARGS(dd, ctxt, subctxt, dim),
+	    TP_STRUCT__entry(DD_DEV_ENTRY(dd)
+			     __field(u16, ctxt)
+			     __field(u16, subctxt)
+			     __field(unsigned long, dim)
+			     ),
+	    TP_fast_assign(DD_DEV_ASSIGN(dd);
+			   __entry->ctxt = ctxt;
+			   __entry->subctxt = subctxt;
+			   __entry->dim = dim;
+			   ),
+	    TP_printk("[%s] SDMA from %u:%u (%lu)",
+		      __get_str(dev),
+		      __entry->ctxt,
+		      __entry->subctxt,
+		      __entry->dim
+		      )
+);
+
 DECLARE_EVENT_CLASS(hfi1_sdma_engine_class,
 		    TP_PROTO(struct sdma_engine *sde, u64 status),
 		    TP_ARGS(sde, status),
diff --git a/drivers/infiniband/hw/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
index 5da1e45..0b64617 100644
--- a/drivers/infiniband/hw/hfi1/uc.c
+++ b/drivers/infiniband/hw/hfi1/uc.c
@@ -65,7 +65,7 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	struct hfi1_qp_priv *priv = qp->priv;
 	struct ib_other_headers *ohdr;
 	struct rvt_swqe *wqe;
-	u32 hwords = 5;
+	u32 hwords;
 	u32 bth0 = 0;
 	u32 len;
 	u32 pmtu = qp->pmtu;
@@ -93,9 +93,23 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 		goto done_free_tx;
 	}
 
-	ohdr = &ps->s_txreq->phdr.hdr.u.oth;
-	if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
-		ohdr = &ps->s_txreq->phdr.hdr.u.l.oth;
+	ps->s_txreq->phdr.hdr.hdr_type = priv->hdr_type;
+	if (priv->hdr_type == HFI1_PKT_TYPE_9B) {
+		/* header size in 32-bit words LRH+BTH = (8+12)/4. */
+		hwords = 5;
+		if (rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH)
+			ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
+		else
+			ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
+	} else {
+		/* header size in 32-bit words 16B LRH+BTH = (16+12)/4. */
+		hwords = 7;
+		if ((rdma_ah_get_ah_flags(&qp->remote_ah_attr) & IB_AH_GRH) &&
+		    (hfi1_check_mcast(rdma_ah_get_dlid(&qp->remote_ah_attr))))
+			ohdr = &ps->s_txreq->phdr.hdr.opah.u.l.oth;
+		else
+			ohdr = &ps->s_txreq->phdr.hdr.opah.u.oth;
+	}
 
 	/* Get the next send request. */
 	wqe = rvt_get_swqe_ptr(qp, qp->s_cur);
@@ -297,31 +311,26 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 void hfi1_uc_rcv(struct hfi1_packet *packet)
 {
 	struct hfi1_ibport *ibp = rcd_to_iport(packet->rcd);
-	struct ib_header *hdr = packet->hdr;
-	u32 rcv_flags = packet->rcv_flags;
-	void *data = packet->ebuf;
+	void *data = packet->payload;
 	u32 tlen = packet->tlen;
 	struct rvt_qp *qp = packet->qp;
 	struct ib_other_headers *ohdr = packet->ohdr;
-	u32 bth0, opcode;
+	u32 opcode = packet->opcode;
 	u32 hdrsize = packet->hlen;
 	u32 psn;
-	u32 pad;
+	u32 pad = packet->pad;
 	struct ib_wc wc;
 	u32 pmtu = qp->pmtu;
 	struct ib_reth *reth;
-	int has_grh = rcv_flags & HFI1_HAS_GRH;
 	int ret;
+	u8 extra_bytes = pad + packet->extra_byte + (SIZE_OF_CRC << 2);
 
-	bth0 = be32_to_cpu(ohdr->bth[0]);
-	if (hfi1_ruc_check_hdr(ibp, hdr, has_grh, qp, bth0))
+	if (hfi1_ruc_check_hdr(ibp, packet))
 		return;
 
 	process_ecn(qp, packet, true);
 
-	psn = be32_to_cpu(ohdr->bth[2]);
-	opcode = ib_bth_get_opcode(ohdr);
-
+	psn = ib_bth_get_psn(ohdr);
 	/* Compare the PSN verses the expected PSN. */
 	if (unlikely(cmp_psn(psn, qp->r_psn) != 0)) {
 		/*
@@ -414,7 +423,12 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
 		/* FALLTHROUGH */
 	case OP(SEND_MIDDLE):
 		/* Check for invalid length PMTU or posted rwqe len. */
-		if (unlikely(tlen != (hdrsize + pmtu + 4)))
+		/*
+		 * A 9B packet has no padding here, but a 16B packet
+		 * arrives with some padding because the CRC and LT bytes
+		 * are always added and must be flit aligned.
+		 */
+		if (unlikely(tlen != (hdrsize + pmtu + extra_bytes)))
 			goto rewind;
 		qp->r_rcv_len += pmtu;
 		if (unlikely(qp->r_rcv_len > qp->r_len))
@@ -432,14 +446,12 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
 		wc.ex.imm_data = 0;
 		wc.wc_flags = 0;
 send_last:
-		/* Get the number of bytes the message was padded by. */
-		pad = ib_bth_get_pad(ohdr);
 		/* Check for invalid length. */
 		/* LAST len should be >= 1 */
-		if (unlikely(tlen < (hdrsize + pad + 4)))
+		if (unlikely(tlen < (hdrsize + extra_bytes)))
 			goto rewind;
 		/* Don't count the CRC. */
-		tlen -= (hdrsize + pad + 4);
+		tlen -= (hdrsize + extra_bytes);
 		wc.byte_len = tlen + qp->r_rcv_len;
 		if (unlikely(wc.byte_len > qp->r_len))
 			goto rewind;
@@ -527,14 +539,12 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
 rdma_last_imm:
 		wc.wc_flags = IB_WC_WITH_IMM;
 
-		/* Get the number of bytes the message was padded by. */
-		pad = ib_bth_get_pad(ohdr);
 		/* Check for invalid length. */
 		/* LAST len should be >= 1 */
 		if (unlikely(tlen < (hdrsize + pad + 4)))
 			goto drop;
 		/* Don't count the CRC. */
-		tlen -= (hdrsize + pad + 4);
+		tlen -= (hdrsize + extra_bytes);
 		if (unlikely(tlen + qp->r_rcv_len != qp->r_len))
 			goto drop;
 		if (test_and_clear_bit(RVT_R_REWIND_SGE, &qp->r_aflags)) {
@@ -554,14 +564,12 @@ void hfi1_uc_rcv(struct hfi1_packet *packet)
 
 	case OP(RDMA_WRITE_LAST):
 rdma_last:
-		/* Get the number of bytes the message was padded by. */
-		pad = ib_bth_get_pad(ohdr);
 		/* Check for invalid length. */
 		/* LAST len should be >= 1 */
 		if (unlikely(tlen < (hdrsize + pad + 4)))
 			goto drop;
 		/* Don't count the CRC. */
-		tlen -= (hdrsize + pad + 4);
+		tlen -= (hdrsize + extra_bytes);
 		if (unlikely(tlen + qp->r_rcv_len != qp->r_len))
 			goto drop;
 		hfi1_copy_sge(&qp->r_sge, data, tlen, true, false);
diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
index 6a4e95c..2ba74fd 100644
--- a/drivers/infiniband/hw/hfi1/ud.c
+++ b/drivers/infiniband/hw/hfi1/ud.c
@@ -53,6 +53,12 @@
 #include "verbs_txreq.h"
 #include "qp.h"
 
+/* We support only two types - 9B and 16B for now */
+static const hfi1_make_req hfi1_make_ud_req_tbl[2] = {
+	[HFI1_PKT_TYPE_9B] = &hfi1_make_ud_req_9B,
+	[HFI1_PKT_TYPE_16B] = &hfi1_make_ud_req_16B
+};
+
 /**
  * ud_loopback - handle send on loopback QPs
  * @sqp: the sending QP
@@ -67,6 +73,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 {
 	struct hfi1_ibport *ibp = to_iport(sqp->ibqp.device, sqp->port_num);
 	struct hfi1_pportdata *ppd;
+	struct hfi1_qp_priv *priv = sqp->priv;
 	struct rvt_qp *qp;
 	struct rdma_ah_attr *ah_attr;
 	unsigned long flags;
@@ -102,18 +109,19 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 
 	if (qp->ibqp.qp_num > 1) {
 		u16 pkey;
-		u16 slid;
+		u32 slid;
 		u8 sc5 = ibp->sl_to_sc[rdma_ah_get_sl(ah_attr)];
 
 		pkey = hfi1_get_pkey(ibp, sqp->s_pkey_index);
 		slid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
 				   ((1 << ppd->lmc) - 1));
 		if (unlikely(ingress_pkey_check(ppd, pkey, sc5,
-						qp->s_pkey_index, slid))) {
-			hfi1_bad_pqkey(ibp, OPA_TRAP_BAD_P_KEY, pkey,
-				       rdma_ah_get_sl(ah_attr),
-				       sqp->ibqp.qp_num, qp->ibqp.qp_num,
-				       slid, rdma_ah_get_dlid(ah_attr));
+						qp->s_pkey_index,
+						slid, false))) {
+			hfi1_bad_pkey(ibp, pkey,
+				      rdma_ah_get_sl(ah_attr),
+				      sqp->ibqp.qp_num, qp->ibqp.qp_num,
+				      slid, rdma_ah_get_dlid(ah_attr));
 			goto drop;
 		}
 	}
@@ -128,18 +136,8 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 
 		qkey = (int)swqe->ud_wr.remote_qkey < 0 ?
 			sqp->qkey : swqe->ud_wr.remote_qkey;
-		if (unlikely(qkey != qp->qkey)) {
-			u16 lid;
-
-			lid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
-					  ((1 << ppd->lmc) - 1));
-			hfi1_bad_pqkey(ibp, OPA_TRAP_BAD_Q_KEY, qkey,
-				       rdma_ah_get_sl(ah_attr),
-				       sqp->ibqp.qp_num, qp->ibqp.qp_num,
-				       lid,
-				       rdma_ah_get_dlid(ah_attr));
-			goto drop;
-		}
+		if (unlikely(qkey != qp->qkey))
+			goto drop; /* silently drop per IBTA spec */
 	}
 
 	/*
@@ -185,9 +183,33 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 
 	if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
 		struct ib_grh grh;
-		const struct ib_global_route *grd = rdma_ah_read_grh(ah_attr);
+		struct ib_global_route grd = *(rdma_ah_read_grh(ah_attr));
 
-		hfi1_make_grh(ibp, &grh, grd, 0, 0);
+		/*
+		 * For loopback packets with extended LIDs, the
+		 * sgid_index in the GRH is 0 and the dgid is
+		 * OPA GID of the sender. While creating a response
+		 * to the loopback packet, IB core creates the new
+		 * sgid_index from the DGID and that will be the
+		 * OPA_GID_INDEX. The new dgid is from the sgid
+		 * index and that will be in the IB GID format.
+		 *
+		 * We now have a case where the sent packet had a
+		 * different sgid_index and dgid compared to the
+		 * one that was received in response.
+		 *
+		 * Fix this inconsistency.
+		 */
+		if (priv->hdr_type == HFI1_PKT_TYPE_16B) {
+			if (grd.sgid_index == 0)
+				grd.sgid_index = OPA_GID_INDEX;
+
+			if (ib_is_opa_gid(&grd.dgid))
+				grd.dgid.global.interface_id =
+				cpu_to_be64(ppd->guids[HFI1_PORT_GUID_INDEX]);
+		}
+
+		hfi1_make_grh(ibp, &grh, &grd, 0, 0);
 		hfi1_copy_sge(&qp->r_sge, &grh,
 			      sizeof(grh), true, false);
 		wc.wc_flags |= IB_WC_GRH;
@@ -244,7 +266,7 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 		wc.pkey_index = 0;
 	}
 	wc.slid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
-			      ((1 << ppd->lmc) - 1));
+				   ((1 << ppd->lmc) - 1));
 	/* Check for loopback when the port lid is not set */
 	if (wc.slid == 0 && sqp->ibqp.qp_type == IB_QPT_GSI)
 		wc.slid = be16_to_cpu(IB_LID_PERMISSIVE);
@@ -261,6 +283,183 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 	rcu_read_unlock();
 }
 
+static void hfi1_make_bth_deth(struct rvt_qp *qp, struct rvt_swqe *wqe,
+			       struct ib_other_headers *ohdr,
+			       u16 *pkey, u32 extra_bytes, bool bypass)
+{
+	u32 bth0;
+	struct hfi1_ibport *ibp;
+
+	ibp = to_iport(qp->ibqp.device, qp->port_num);
+	if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) {
+		ohdr->u.ud.imm_data = wqe->wr.ex.imm_data;
+		bth0 = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE << 24;
+	} else {
+		bth0 = IB_OPCODE_UD_SEND_ONLY << 24;
+	}
+
+	if (wqe->wr.send_flags & IB_SEND_SOLICITED)
+		bth0 |= IB_BTH_SOLICITED;
+	bth0 |= extra_bytes << 20;
+	if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_SMI)
+		*pkey = hfi1_get_pkey(ibp, wqe->ud_wr.pkey_index);
+	else
+		*pkey = hfi1_get_pkey(ibp, qp->s_pkey_index);
+	if (!bypass)
+		bth0 |= *pkey;
+	ohdr->bth[0] = cpu_to_be32(bth0);
+	ohdr->bth[1] = cpu_to_be32(wqe->ud_wr.remote_qpn);
+	ohdr->bth[2] = cpu_to_be32(mask_psn(wqe->psn));
+	/*
+	 * Qkeys with the high order bit set mean use the
+	 * qkey from the QP context instead of the WR (see 10.2.5).
+	 */
+	ohdr->u.ud.deth[0] = cpu_to_be32((int)wqe->ud_wr.remote_qkey < 0 ?
+					 qp->qkey : wqe->ud_wr.remote_qkey);
+	ohdr->u.ud.deth[1] = cpu_to_be32(qp->ibqp.qp_num);
+}
+
+void hfi1_make_ud_req_9B(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
+			 struct rvt_swqe *wqe)
+{
+	u32 nwords, extra_bytes;
+	u16 len, slid, dlid, pkey;
+	u16 lrh0 = 0;
+	u8 sc5;
+	struct hfi1_qp_priv *priv = qp->priv;
+	struct ib_other_headers *ohdr;
+	struct rdma_ah_attr *ah_attr;
+	struct hfi1_pportdata *ppd;
+	struct hfi1_ibport *ibp;
+	struct ib_grh *grh;
+
+	ibp = to_iport(qp->ibqp.device, qp->port_num);
+	ppd = ppd_from_ibp(ibp);
+	ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
+
+	extra_bytes = -wqe->length & 3;
+	nwords = ((wqe->length + extra_bytes) >> 2) + SIZE_OF_CRC;
+	/* header size in dwords LRH+BTH+DETH = (8+12+8)/4. */
+	qp->s_hdrwords = 7;
+	if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM)
+		qp->s_hdrwords++;
+
+	if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
+		grh = &ps->s_txreq->phdr.hdr.ibh.u.l.grh;
+		qp->s_hdrwords += hfi1_make_grh(ibp, grh,
+						rdma_ah_read_grh(ah_attr),
+						qp->s_hdrwords - 2, nwords);
+		lrh0 = HFI1_LRH_GRH;
+		ohdr = &ps->s_txreq->phdr.hdr.ibh.u.l.oth;
+	} else {
+		lrh0 = HFI1_LRH_BTH;
+		ohdr = &ps->s_txreq->phdr.hdr.ibh.u.oth;
+	}
+
+	sc5 = ibp->sl_to_sc[rdma_ah_get_sl(ah_attr)];
+	lrh0 |= (rdma_ah_get_sl(ah_attr) & 0xf) << 4;
+	if (qp->ibqp.qp_type == IB_QPT_SMI) {
+		lrh0 |= 0xF000; /* Set VL (see ch. 13.5.3.1) */
+		priv->s_sc = 0xf;
+	} else {
+		lrh0 |= (sc5 & 0xf) << 12;
+		priv->s_sc = sc5;
+	}
+
+	dlid = opa_get_lid(rdma_ah_get_dlid(ah_attr), 9B);
+	if (dlid == be16_to_cpu(IB_LID_PERMISSIVE)) {
+		slid = be16_to_cpu(IB_LID_PERMISSIVE);
+	} else {
+		u16 lid = (u16)ppd->lid;
+
+		if (lid) {
+			lid |= rdma_ah_get_path_bits(ah_attr) &
+				((1 << ppd->lmc) - 1);
+			slid = lid;
+		} else {
+			slid = be16_to_cpu(IB_LID_PERMISSIVE);
+		}
+	}
+	hfi1_make_bth_deth(qp, wqe, ohdr, &pkey, extra_bytes, false);
+	len = qp->s_hdrwords + nwords;
+
+	/* Setup the packet */
+	ps->s_txreq->phdr.hdr.hdr_type = HFI1_PKT_TYPE_9B;
+	hfi1_make_ib_hdr(&ps->s_txreq->phdr.hdr.ibh,
+			 lrh0, len, dlid, slid);
+}
+
+void hfi1_make_ud_req_16B(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
+			  struct rvt_swqe *wqe)
+{
+	struct hfi1_qp_priv *priv = qp->priv;
+	struct ib_other_headers *ohdr;
+	struct rdma_ah_attr *ah_attr;
+	struct hfi1_pportdata *ppd;
+	struct hfi1_ibport *ibp;
+	u32 dlid, slid, nwords, extra_bytes;
+	u16 len, pkey;
+	u8 l4, sc5;
+
+	ibp = to_iport(qp->ibqp.device, qp->port_num);
+	ppd = ppd_from_ibp(ibp);
+	ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
+	/* header size in dwords 16B LRH+BTH+DETH = (16+12+8)/4. */
+	qp->s_hdrwords = 9;
+	if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM)
+		qp->s_hdrwords++;
+
+	/* SW provides space for CRC and LT for bypass packets. */
+	extra_bytes = hfi1_get_16b_padding((qp->s_hdrwords << 2),
+					   wqe->length);
+	nwords = ((wqe->length + extra_bytes + SIZE_OF_LT) >> 2) + SIZE_OF_CRC;
+
+	if ((rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) &&
+	    hfi1_check_mcast(rdma_ah_get_dlid(ah_attr))) {
+		struct ib_grh *grh;
+		struct ib_global_route *grd = rdma_ah_retrieve_grh(ah_attr);
+		/*
+		 * Ensure OPA GIDs are transformed to IB GIDs
+		 * before creating the GRH.
+		 */
+		if (grd->sgid_index == OPA_GID_INDEX) {
+			dd_dev_warn(ppd->dd, "Bad sgid_index. sgid_index: %d\n",
+				    grd->sgid_index);
+			grd->sgid_index = 0;
+		}
+		grh = &ps->s_txreq->phdr.hdr.opah.u.l.grh;
+		qp->s_hdrwords += hfi1_make_grh(ibp, grh, grd,
+					qp->s_hdrwords - 4, nwords);
+		ohdr = &ps->s_txreq->phdr.hdr.opah.u.l.oth;
+		l4 = OPA_16B_L4_IB_GLOBAL;
+	} else {
+		ohdr = &ps->s_txreq->phdr.hdr.opah.u.oth;
+		l4 = OPA_16B_L4_IB_LOCAL;
+	}
+
+	sc5 = ibp->sl_to_sc[rdma_ah_get_sl(ah_attr)];
+	if (qp->ibqp.qp_type == IB_QPT_SMI)
+		priv->s_sc = 0xf;
+	else
+		priv->s_sc = sc5;
+
+	dlid = opa_get_lid(rdma_ah_get_dlid(ah_attr), 16B);
+	if (!ppd->lid)
+		slid = be32_to_cpu(OPA_LID_PERMISSIVE);
+	else
+		slid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
+			   ((1 << ppd->lmc) - 1));
+
+	hfi1_make_bth_deth(qp, wqe, ohdr, &pkey, extra_bytes, true);
+	/* Convert dwords to flits */
+	len = (qp->s_hdrwords + nwords) >> 1;
+
+	/* Setup the packet */
+	ps->s_txreq->phdr.hdr.hdr_type = HFI1_PKT_TYPE_16B;
+	hfi1_make_16b_hdr(&ps->s_txreq->phdr.hdr.opah,
+			  slid, dlid, len, pkey, 0, 0, l4, priv->s_sc);
+}
+
 /**
  * hfi1_make_ud_req - construct a UD request packet
  * @qp: the QP
@@ -272,18 +471,12 @@ static void ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 {
 	struct hfi1_qp_priv *priv = qp->priv;
-	struct ib_other_headers *ohdr;
 	struct rdma_ah_attr *ah_attr;
 	struct hfi1_pportdata *ppd;
 	struct hfi1_ibport *ibp;
 	struct rvt_swqe *wqe;
-	u32 nwords;
-	u32 extra_bytes;
-	u32 bth0;
-	u16 lrh0;
-	u16 lid;
 	int next_cur;
-	u8 sc5;
+	u32 lid;
 
 	ps->s_txreq = get_txreq(ps->dev, qp);
 	if (IS_ERR(ps->s_txreq))
@@ -320,13 +513,14 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	ibp = to_iport(qp->ibqp.device, qp->port_num);
 	ppd = ppd_from_ibp(ibp);
 	ah_attr = &ibah_to_rvtah(wqe->ud_wr.ah)->attr;
-	if (rdma_ah_get_dlid(ah_attr) < be16_to_cpu(IB_MULTICAST_LID_BASE) ||
-	    rdma_ah_get_dlid(ah_attr) == be16_to_cpu(IB_LID_PERMISSIVE)) {
+	priv->hdr_type = hfi1_get_hdr_type(ppd->lid, ah_attr);
+	if ((!hfi1_check_mcast(rdma_ah_get_dlid(ah_attr))) ||
+	    (rdma_ah_get_dlid(ah_attr) == be32_to_cpu(OPA_LID_PERMISSIVE))) {
 		lid = rdma_ah_get_dlid(ah_attr) & ~((1 << ppd->lmc) - 1);
 		if (unlikely(!loopback &&
-			     (lid == ppd->lid ||
-			      (lid == be16_to_cpu(IB_LID_PERMISSIVE) &&
-			      qp->ibqp.qp_type == IB_QPT_GSI)))) {
+			     ((lid == ppd->lid) ||
+			      ((lid == be32_to_cpu(OPA_LID_PERMISSIVE)) &&
+			       (qp->ibqp.qp_type == IB_QPT_GSI))))) {
 			unsigned long tflags = ps->flags;
 			/*
 			 * If DMAs are in progress, we can't generate
@@ -350,11 +544,6 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	}
 
 	qp->s_cur = next_cur;
-	extra_bytes = -wqe->length & 3;
-	nwords = (wqe->length + extra_bytes) >> 2;
-
-	/* header size in 32-bit words LRH+BTH+DETH = (8+12+8)/4. */
-	qp->s_hdrwords = 7;
 	ps->s_txreq->s_cur_size = wqe->length;
 	ps->s_txreq->ss = &qp->s_sge;
 	qp->s_srate = rdma_ah_get_static_rate(ah_attr);
@@ -365,77 +554,12 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	qp->s_sge.num_sge = wqe->wr.num_sge;
 	qp->s_sge.total_len = wqe->length;
 
-	if (rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH) {
-		/* Header size in 32-bit words. */
-		qp->s_hdrwords += hfi1_make_grh(ibp,
-						&ps->s_txreq->phdr.hdr.u.l.grh,
-						rdma_ah_read_grh(ah_attr),
-						qp->s_hdrwords, nwords);
-		lrh0 = HFI1_LRH_GRH;
-		ohdr = &ps->s_txreq->phdr.hdr.u.l.oth;
-		/*
-		 * Don't worry about sending to locally attached multicast
-		 * QPs.  It is unspecified by the spec. what happens.
-		 */
-	} else {
-		/* Header size in 32-bit words. */
-		lrh0 = HFI1_LRH_BTH;
-		ohdr = &ps->s_txreq->phdr.hdr.u.oth;
-	}
-	if (wqe->wr.opcode == IB_WR_SEND_WITH_IMM) {
-		qp->s_hdrwords++;
-		ohdr->u.ud.imm_data = wqe->wr.ex.imm_data;
-		bth0 = IB_OPCODE_UD_SEND_ONLY_WITH_IMMEDIATE << 24;
-	} else {
-		bth0 = IB_OPCODE_UD_SEND_ONLY << 24;
-	}
-	sc5 = ibp->sl_to_sc[rdma_ah_get_sl(ah_attr)];
-	lrh0 |= (rdma_ah_get_sl(ah_attr) & 0xf) << 4;
-	if (qp->ibqp.qp_type == IB_QPT_SMI) {
-		lrh0 |= 0xF000; /* Set VL (see ch. 13.5.3.1) */
-		priv->s_sc = 0xf;
-	} else {
-		lrh0 |= (sc5 & 0xf) << 12;
-		priv->s_sc = sc5;
-	}
+	/* Make the appropriate header */
+	hfi1_make_ud_req_tbl[priv->hdr_type](qp, ps, qp->s_wqe);
 	priv->s_sde = qp_to_sdma_engine(qp, priv->s_sc);
 	ps->s_txreq->sde = priv->s_sde;
 	priv->s_sendcontext = qp_to_send_context(qp, priv->s_sc);
 	ps->s_txreq->psc = priv->s_sendcontext;
-	ps->s_txreq->phdr.hdr.lrh[0] = cpu_to_be16(lrh0);
-	ps->s_txreq->phdr.hdr.lrh[1] =
-		cpu_to_be16(rdma_ah_get_dlid(ah_attr));
-	ps->s_txreq->phdr.hdr.lrh[2] =
-		cpu_to_be16(qp->s_hdrwords + nwords + SIZE_OF_CRC);
-	if (rdma_ah_get_dlid(ah_attr) == be16_to_cpu(IB_LID_PERMISSIVE)) {
-		ps->s_txreq->phdr.hdr.lrh[3] = IB_LID_PERMISSIVE;
-	} else {
-		lid = ppd->lid;
-		if (lid) {
-			lid |= rdma_ah_get_path_bits(ah_attr) &
-				((1 << ppd->lmc) - 1);
-			ps->s_txreq->phdr.hdr.lrh[3] = cpu_to_be16(lid);
-		} else {
-			ps->s_txreq->phdr.hdr.lrh[3] = IB_LID_PERMISSIVE;
-		}
-	}
-	if (wqe->wr.send_flags & IB_SEND_SOLICITED)
-		bth0 |= IB_BTH_SOLICITED;
-	bth0 |= extra_bytes << 20;
-	if (qp->ibqp.qp_type == IB_QPT_GSI || qp->ibqp.qp_type == IB_QPT_SMI)
-		bth0 |= hfi1_get_pkey(ibp, wqe->ud_wr.pkey_index);
-	else
-		bth0 |= hfi1_get_pkey(ibp, qp->s_pkey_index);
-	ohdr->bth[0] = cpu_to_be32(bth0);
-	ohdr->bth[1] = cpu_to_be32(wqe->ud_wr.remote_qpn);
-	ohdr->bth[2] = cpu_to_be32(mask_psn(wqe->psn));
-	/*
-	 * Qkeys with the high order bit set mean use the
-	 * qkey from the QP context instead of the WR (see 10.2.5).
-	 */
-	ohdr->u.ud.deth[0] = cpu_to_be32((int)wqe->ud_wr.remote_qkey < 0 ?
-					 qp->qkey : wqe->ud_wr.remote_qkey);
-	ohdr->u.ud.deth[1] = cpu_to_be32(qp->ibqp.qp_num);
 	/* disarm any ahg */
 	priv->s_ahg->ahgcount = 0;
 	priv->s_ahg->ahgidx = 0;
@@ -505,6 +629,64 @@ int hfi1_lookup_pkey_idx(struct hfi1_ibport *ibp, u16 pkey)
 	return -1;
 }
 
+void return_cnp_16B(struct hfi1_ibport *ibp, struct rvt_qp *qp,
+		    u32 remote_qpn, u32 pkey, u32 slid, u32 dlid,
+		    u8 sc5, const struct ib_grh *old_grh)
+{
+	u64 pbc, pbc_flags = 0;
+	u32 bth0, plen, vl, hwords = 7;
+	u16 len;
+	u8 l4;
+	struct hfi1_16b_header hdr;
+	struct ib_other_headers *ohdr;
+	struct pio_buf *pbuf;
+	struct send_context *ctxt = qp_to_send_context(qp, sc5);
+	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
+	u32 nwords;
+
+	/* Populate length */
+	nwords = ((hfi1_get_16b_padding(hwords << 2, 0) +
+		   SIZE_OF_LT) >> 2) + SIZE_OF_CRC;
+	if (old_grh) {
+		struct ib_grh *grh = &hdr.u.l.grh;
+
+		grh->version_tclass_flow = old_grh->version_tclass_flow;
+		grh->paylen = cpu_to_be16((hwords - 4 + nwords) << 2);
+		grh->hop_limit = 0xff;
+		grh->sgid = old_grh->dgid;
+		grh->dgid = old_grh->sgid;
+		ohdr = &hdr.u.l.oth;
+		l4 = OPA_16B_L4_IB_GLOBAL;
+		hwords += sizeof(struct ib_grh) / sizeof(u32);
+	} else {
+		ohdr = &hdr.u.oth;
+		l4 = OPA_16B_L4_IB_LOCAL;
+	}
+
+	/* Bits 16 to 19 are TVER. Bits 20 to 22 are the pad count */
+	bth0 = (IB_OPCODE_CNP << 24) | (1 << 16) |
+	       (hfi1_get_16b_padding(hwords << 2, 0) << 20);
+	ohdr->bth[0] = cpu_to_be32(bth0);
+
+	ohdr->bth[1] = cpu_to_be32(remote_qpn);
+	ohdr->bth[2] = 0; /* PSN 0 */
+
+	/* Convert dwords to flits */
+	len = (hwords + nwords) >> 1;
+	hfi1_make_16b_hdr(&hdr, slid, dlid, len, pkey, 1, 0, l4, sc5);
+
+	plen = 2 /* PBC */ + hwords + nwords;
+	pbc_flags |= PBC_PACKET_BYPASS | PBC_INSERT_BYPASS_ICRC;
+	vl = sc_to_vlt(ppd->dd, sc5);
+	pbc = create_pbc(ppd, pbc_flags, qp->srate_mbps, vl, plen);
+	if (ctxt) {
+		pbuf = sc_buffer_alloc(ctxt, plen, NULL, NULL);
+		if (pbuf)
+			ppd->dd->pio_inline_send(ppd->dd, pbuf, pbc,
+						 &hdr, hwords);
+	}
+}
+
 void return_cnp(struct hfi1_ibport *ibp, struct rvt_qp *qp, u32 remote_qpn,
 		u32 pkey, u32 slid, u32 dlid, u8 sc5,
 		const struct ib_grh *old_grh)
@@ -543,13 +725,9 @@ void return_cnp(struct hfi1_ibport *ibp, struct rvt_qp *qp, u32 remote_qpn,
 	ohdr->bth[1] = cpu_to_be32(remote_qpn | (1 << IB_BECN_SHIFT));
 	ohdr->bth[2] = 0; /* PSN 0 */
 
-	hdr.lrh[0] = cpu_to_be16(lrh0);
-	hdr.lrh[1] = cpu_to_be16(dlid);
-	hdr.lrh[2] = cpu_to_be16(hwords + SIZE_OF_CRC);
-	hdr.lrh[3] = cpu_to_be16(slid);
-
+	hfi1_make_ib_hdr(&hdr, lrh0, hwords + SIZE_OF_CRC, dlid, slid);
 	plen = 2 /* PBC */ + hwords;
-	pbc_flags |= (!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT;
+	pbc_flags |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
 	vl = sc_to_vlt(ppd->dd, sc5);
 	pbc = create_pbc(ppd, pbc_flags, qp->srate_mbps, vl, plen);
 	if (ctxt) {
@@ -668,37 +846,45 @@ static int opa_smp_check(struct hfi1_ibport *ibp, u16 pkey, u8 sc5,
 void hfi1_ud_rcv(struct hfi1_packet *packet)
 {
 	struct ib_other_headers *ohdr = packet->ohdr;
-	int opcode;
 	u32 hdrsize = packet->hlen;
 	struct ib_wc wc;
 	u32 qkey;
 	u32 src_qp;
-	u16 dlid, pkey;
+	u16 pkey;
 	int mgmt_pkey_idx = -1;
 	struct hfi1_ibport *ibp = rcd_to_iport(packet->rcd);
 	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
 	struct ib_header *hdr = packet->hdr;
-	u32 rcv_flags = packet->rcv_flags;
-	void *data = packet->ebuf;
+	void *data = packet->payload;
 	u32 tlen = packet->tlen;
 	struct rvt_qp *qp = packet->qp;
-	bool has_grh = rcv_flags & HFI1_HAS_GRH;
-	u8 sc5 = hfi1_9B_get_sc5(hdr, packet->rhf);
-	u32 bth1;
-	u8 sl_from_sc, sl;
-	u16 slid;
+	u8 sc5 = packet->sc;
+	u8 sl_from_sc;
+	u8 opcode = packet->opcode;
+	u8 sl = packet->sl;
+	u32 dlid = packet->dlid;
+	u32 slid = packet->slid;
 	u8 extra_bytes;
+	bool dlid_is_permissive;
+	bool slid_is_permissive;
 
-	qkey = be32_to_cpu(ohdr->u.ud.deth[0]);
-	src_qp = be32_to_cpu(ohdr->u.ud.deth[1]) & RVT_QPN_MASK;
-	dlid = ib_get_dlid(hdr);
-	bth1 = be32_to_cpu(ohdr->bth[1]);
-	slid = ib_get_slid(hdr);
-	pkey = ib_bth_get_pkey(ohdr);
-	opcode = ib_bth_get_opcode(ohdr);
-	sl = ib_get_sl(hdr);
-	extra_bytes = ib_bth_get_pad(ohdr);
-	extra_bytes += (SIZE_OF_CRC << 2);
+	extra_bytes = packet->pad + packet->extra_byte + (SIZE_OF_CRC << 2);
+	qkey = ib_get_qkey(ohdr);
+	src_qp = ib_get_sqpn(ohdr);
+
+	if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+		u32 permissive_lid =
+			opa_get_lid(be32_to_cpu(OPA_LID_PERMISSIVE), 16B);
+
+		pkey = hfi1_16B_get_pkey(packet->hdr);
+		dlid_is_permissive = (dlid == permissive_lid);
+		slid_is_permissive = (slid == permissive_lid);
+	} else {
+		hdr = packet->hdr;
+		pkey = ib_bth_get_pkey(ohdr);
+		dlid_is_permissive = (dlid == be16_to_cpu(IB_LID_PERMISSIVE));
+		slid_is_permissive = (slid == be16_to_cpu(IB_LID_PERMISSIVE));
+	}
 	sl_from_sc = ibp->sc_to_sl[sc5];
 
 	process_ecn(qp, packet, (opcode != IB_OPCODE_CNP));
@@ -716,8 +902,7 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
 	 * and the QKEY matches (see 9.6.1.4.1 and 9.6.1.5.1).
 	 */
 	if (qp->ibqp.qp_num) {
-		if (unlikely(hdr->lrh[1] == IB_LID_PERMISSIVE ||
-			     hdr->lrh[3] == IB_LID_PERMISSIVE))
+		if (unlikely(dlid_is_permissive || slid_is_permissive))
 			goto drop;
 		if (qp->ibqp.qp_num > 1) {
 			if (unlikely(rcv_pkey_check(ppd, pkey, sc5, slid))) {
@@ -727,10 +912,10 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
 				 * for invalid pkeys is optional according to
 				 * IB spec (release 1.3, section 10.9.4)
 				 */
-				hfi1_bad_pqkey(ibp, OPA_TRAP_BAD_P_KEY,
-					       pkey, sl,
-					       src_qp, qp->ibqp.qp_num,
-					       slid, dlid);
+				hfi1_bad_pkey(ibp,
+					      pkey, sl,
+					      src_qp, qp->ibqp.qp_num,
+					      slid, dlid);
 				return;
 			}
 		} else {
@@ -739,12 +924,9 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
 			if (mgmt_pkey_idx < 0)
 				goto drop;
 		}
-		if (unlikely(qkey != qp->qkey)) {
-			hfi1_bad_pqkey(ibp, OPA_TRAP_BAD_Q_KEY, qkey, sl,
-				       src_qp, qp->ibqp.qp_num,
-				       slid, dlid);
+		if (unlikely(qkey != qp->qkey)) /* Silent drop */
 			return;
-		}
+
 		/* Drop invalid MAD packets (see 13.5.3.1). */
 		if (unlikely(qp->ibqp.qp_num == 1 &&
 			     (tlen > 2048 || (sc5 == 0xF))))
@@ -758,8 +940,7 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
 
 		if (tlen > 2048)
 			goto drop;
-		if ((hdr->lrh[1] == IB_LID_PERMISSIVE ||
-		     hdr->lrh[3] == IB_LID_PERMISSIVE) &&
+		if ((dlid_is_permissive || slid_is_permissive) &&
 		    smp->mgmt_class != IB_MGMT_CLASS_SUBN_DIRECTED_ROUTE)
 			goto drop;
 
@@ -811,8 +992,19 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
 		qp->r_flags |= RVT_R_REUSE_SGE;
 		goto drop;
 	}
-	if (has_grh) {
-		hfi1_copy_sge(&qp->r_sge, &hdr->u.l.grh,
+	if (packet->grh) {
+		hfi1_copy_sge(&qp->r_sge, packet->grh,
+			      sizeof(struct ib_grh), true, false);
+		wc.wc_flags |= IB_WC_GRH;
+	} else if (packet->etype == RHF_RCV_TYPE_BYPASS) {
+		struct ib_grh grh;
+		/*
+		 * Assume the send side created a 16B packet only
+		 * because it wanted to use large LIDs. The GRH was
+		 * stripped out when creating 16B, so add it back here.
+		 */
+		hfi1_make_ext_grh(packet, &grh, slid, dlid);
+		hfi1_copy_sge(&qp->r_sge, &grh,
 			      sizeof(struct ib_grh), true, false);
 		wc.wc_flags |= IB_WC_GRH;
 	} else {
@@ -845,14 +1037,15 @@ void hfi1_ud_rcv(struct hfi1_packet *packet)
 	} else {
 		wc.pkey_index = 0;
 	}
-
+	if (slid_is_permissive)
+		slid = be32_to_cpu(OPA_LID_PERMISSIVE);
 	wc.slid = slid;
 	wc.sl = sl_from_sc;
 
 	/*
 	 * Save the LMC lower bits if the destination LID is a unicast LID.
 	 */
-	wc.dlid_path_bits = dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE) ? 0 :
+	wc.dlid_path_bits = hfi1_check_mcast(dlid) ? 0 :
 		dlid & ((1 << ppd_from_ibp(ibp)->lmc) - 1);
 	wc.port_num = qp->port_num;
 	/* Signal completion event if the solicited bit is set. */
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index a8f0aa4..6f6c14d 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -47,58 +47,28 @@
 #include <asm/page.h>
 #include <linux/string.h>
 
+#include "mmu_rb.h"
 #include "user_exp_rcv.h"
 #include "trace.h"
-#include "mmu_rb.h"
-
-struct tid_group {
-	struct list_head list;
-	u32 base;
-	u8 size;
-	u8 used;
-	u8 map;
-};
-
-struct tid_rb_node {
-	struct mmu_rb_node mmu;
-	unsigned long phys;
-	struct tid_group *grp;
-	u32 rcventry;
-	dma_addr_t dma_addr;
-	bool freed;
-	unsigned npages;
-	struct page *pages[0];
-};
-
-struct tid_pageset {
-	u16 idx;
-	u16 count;
-};
-
-#define EXP_TID_SET_EMPTY(set) (set.count == 0 && list_empty(&set.list))
-
-#define num_user_pages(vaddr, len)				       \
-	(1 + (((((unsigned long)(vaddr) +			       \
-		 (unsigned long)(len) - 1) & PAGE_MASK) -	       \
-	       ((unsigned long)vaddr & PAGE_MASK)) >> PAGE_SHIFT))
 
 static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 			    struct exp_tid_set *set,
 			    struct hfi1_filedata *fd);
-static u32 find_phys_blocks(struct page **pages, unsigned npages,
-			    struct tid_pageset *list);
-static int set_rcvarray_entry(struct hfi1_filedata *fd, unsigned long vaddr,
+static u32 find_phys_blocks(struct tid_user_buf *tidbuf, unsigned int npages);
+static int set_rcvarray_entry(struct hfi1_filedata *fd,
+			      struct tid_user_buf *tbuf,
 			      u32 rcventry, struct tid_group *grp,
-			      struct page **pages, unsigned npages);
+			      u16 pageidx, unsigned int npages);
 static int tid_rb_insert(void *arg, struct mmu_rb_node *node);
 static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
 				    struct tid_rb_node *tnode);
 static void tid_rb_remove(void *arg, struct mmu_rb_node *node);
 static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode);
-static int program_rcvarray(struct hfi1_filedata *fd, unsigned long vaddr,
-			    struct tid_group *grp, struct tid_pageset *sets,
-			    unsigned start, u16 count, struct page **pages,
-			    u32 *tidlist, unsigned *tididx, unsigned *pmapped);
+static int program_rcvarray(struct hfi1_filedata *fd, struct tid_user_buf *tbuf,
+			    struct tid_group *grp,
+			    unsigned int start, u16 count,
+			    u32 *tidlist, unsigned int *tididx,
+			    unsigned int *pmapped);
 static int unprogram_rcvarray(struct hfi1_filedata *fd, u32 tidinfo,
 			      struct tid_group **grp);
 static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node);
@@ -109,96 +79,14 @@ static struct mmu_rb_ops tid_rb_ops = {
 	.invalidate = tid_rb_invalidate
 };
 
-static inline u32 rcventry2tidinfo(u32 rcventry)
-{
-	u32 pair = rcventry & ~0x1;
-
-	return EXP_TID_SET(IDX, pair >> 1) |
-		EXP_TID_SET(CTRL, 1 << (rcventry - pair));
-}
-
-static inline void exp_tid_group_init(struct exp_tid_set *set)
-{
-	INIT_LIST_HEAD(&set->list);
-	set->count = 0;
-}
-
-static inline void tid_group_remove(struct tid_group *grp,
-				    struct exp_tid_set *set)
-{
-	list_del_init(&grp->list);
-	set->count--;
-}
-
-static inline void tid_group_add_tail(struct tid_group *grp,
-				      struct exp_tid_set *set)
-{
-	list_add_tail(&grp->list, &set->list);
-	set->count++;
-}
-
-static inline struct tid_group *tid_group_pop(struct exp_tid_set *set)
-{
-	struct tid_group *grp =
-		list_first_entry(&set->list, struct tid_group, list);
-	list_del_init(&grp->list);
-	set->count--;
-	return grp;
-}
-
-static inline void tid_group_move(struct tid_group *group,
-				  struct exp_tid_set *s1,
-				  struct exp_tid_set *s2)
-{
-	tid_group_remove(group, s1);
-	tid_group_add_tail(group, s2);
-}
-
-int hfi1_user_exp_rcv_grp_init(struct hfi1_filedata *fd)
-{
-	struct hfi1_ctxtdata *uctxt = fd->uctxt;
-	struct hfi1_devdata *dd = fd->dd;
-	u32 tidbase;
-	u32 i;
-	struct tid_group *grp, *gptr;
-
-	exp_tid_group_init(&uctxt->tid_group_list);
-	exp_tid_group_init(&uctxt->tid_used_list);
-	exp_tid_group_init(&uctxt->tid_full_list);
-
-	tidbase = uctxt->expected_base;
-	for (i = 0; i < uctxt->expected_count /
-		     dd->rcv_entries.group_size; i++) {
-		grp = kzalloc(sizeof(*grp), GFP_KERNEL);
-		if (!grp)
-			goto grp_failed;
-
-		grp->size = dd->rcv_entries.group_size;
-		grp->base = tidbase;
-		tid_group_add_tail(grp, &uctxt->tid_group_list);
-		tidbase += dd->rcv_entries.group_size;
-	}
-
-	return 0;
-
-grp_failed:
-	list_for_each_entry_safe(grp, gptr, &uctxt->tid_group_list.list,
-				 list) {
-		list_del_init(&grp->list);
-		kfree(grp);
-	}
-
-	return -ENOMEM;
-}
-
 /*
  * Initialize context and file private data needed for Expected
  * receive caching. This needs to be done after the context has
  * been configured with the eager/expected RcvEntry counts.
  */
-int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd)
+int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
+			   struct hfi1_ctxtdata *uctxt)
 {
-	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 	struct hfi1_devdata *dd = uctxt->dd;
 	int ret = 0;
 
@@ -266,18 +154,6 @@ int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd)
 	return ret;
 }
 
-void hfi1_user_exp_rcv_grp_free(struct hfi1_ctxtdata *uctxt)
-{
-	struct tid_group *grp, *gptr;
-
-	list_for_each_entry_safe(grp, gptr, &uctxt->tid_group_list.list,
-				 list) {
-		list_del_init(&grp->list);
-		kfree(grp);
-	}
-	hfi1_clear_tids(uctxt);
-}
-
 void hfi1_user_exp_rcv_free(struct hfi1_filedata *fd)
 {
 	struct hfi1_ctxtdata *uctxt = fd->uctxt;
@@ -302,21 +178,90 @@ void hfi1_user_exp_rcv_free(struct hfi1_filedata *fd)
 	fd->entry_to_rb = NULL;
 }
 
-/*
- * Write an "empty" RcvArray entry.
- * This function exists so the TID registaration code can use it
- * to write to unused/unneeded entries and still take advantage
- * of the WC performance improvements. The HFI will ignore this
- * write to the RcvArray entry.
+/**
+ * unpin_rcv_pages() - release pinned receive buffer pages
+ *
+ * @mapped: true if the pages have been DMA mapped, false otherwise.
+ * @idx: index of the first page to unpin.
+ * @npages: number of pages to unpin.
+ *
+ * If the pages have been DMA mapped (indicated by the mapped parameter), their
+ * info will be passed via a struct tid_rb_node. If they haven't been mapped,
+ * their info will be passed via a struct tid_user_buf.
  */
-static inline void rcv_array_wc_fill(struct hfi1_devdata *dd, u32 index)
+static void unpin_rcv_pages(struct hfi1_filedata *fd,
+			    struct tid_user_buf *tidbuf,
+			    struct tid_rb_node *node,
+			    unsigned int idx,
+			    unsigned int npages,
+			    bool mapped)
 {
+	struct page **pages;
+	struct hfi1_devdata *dd = fd->uctxt->dd;
+
+	if (mapped) {
+		pci_unmap_single(dd->pcidev, node->dma_addr,
+				 node->mmu.len, PCI_DMA_FROMDEVICE);
+		pages = &node->pages[idx];
+	} else {
+		pages = &tidbuf->pages[idx];
+	}
+	hfi1_release_user_pages(fd->mm, pages, npages, mapped);
+	fd->tid_n_pinned -= npages;
+}
+
+/**
+ * Pin receive buffer pages.
+ */
+static int pin_rcv_pages(struct hfi1_filedata *fd, struct tid_user_buf *tidbuf)
+{
+	int pinned;
+	unsigned int npages;
+	unsigned long vaddr = tidbuf->vaddr;
+	struct page **pages = NULL;
+	struct hfi1_devdata *dd = fd->uctxt->dd;
+
+	/* Get the number of pages the user buffer spans */
+	npages = num_user_pages(vaddr, tidbuf->length);
+	if (!npages)
+		return -EINVAL;
+
+	if (npages > fd->uctxt->expected_count) {
+		dd_dev_err(dd, "Expected buffer too big\n");
+		return -EINVAL;
+	}
+
+	/* Verify that access is OK for the user buffer */
+	if (!access_ok(VERIFY_WRITE, (void __user *)vaddr,
+		       npages * PAGE_SIZE)) {
+		dd_dev_err(dd, "Fail vaddr %p, %u pages, !access_ok\n",
+			   (void *)vaddr, npages);
+		return -EFAULT;
+	}
+	/* Allocate the array of struct page pointers needed for pinning */
+	pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
+	if (!pages)
+		return -ENOMEM;
+
 	/*
-	 * Doing the WC fill writes only makes sense if the device is
-	 * present and the RcvArray has been mapped as WC memory.
+	 * Pin all the pages of the user buffer. If we can't pin all the
+	 * pages, accept the amount pinned so far and program only that.
+	 * User space knows how to deal with partially programmed buffers.
 	 */
-	if ((dd->flags & HFI1_PRESENT) && dd->rcvarray_wc)
-		writeq(0, dd->rcvarray_wc + (index * 8));
+	if (!hfi1_can_pin_pages(dd, fd->mm, fd->tid_n_pinned, npages)) {
+		kfree(pages);
+		return -ENOMEM;
+	}
+
+	pinned = hfi1_acquire_user_pages(fd->mm, vaddr, npages, true, pages);
+	if (pinned <= 0) {
+		kfree(pages);
+		return pinned;
+	}
+	tidbuf->pages = pages;
+	tidbuf->npages = npages;
+	fd->tid_n_pinned += pinned;
+	return pinned;
 }
 
 /*
@@ -374,62 +319,33 @@ int hfi1_user_exp_rcv_setup(struct hfi1_filedata *fd,
 	int ret = 0, need_group = 0, pinned;
 	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 	struct hfi1_devdata *dd = uctxt->dd;
-	unsigned npages, ngroups, pageidx = 0, pageset_count, npagesets,
+	unsigned int ngroups, pageidx = 0, pageset_count,
 		tididx = 0, mapped, mapped_pages = 0;
-	unsigned long vaddr = tinfo->vaddr;
-	struct page **pages = NULL;
 	u32 *tidlist = NULL;
-	struct tid_pageset *pagesets = NULL;
+	struct tid_user_buf *tidbuf;
 
-	/* Get the number of pages the user buffer spans */
-	npages = num_user_pages(vaddr, tinfo->length);
-	if (!npages)
-		return -EINVAL;
-
-	if (npages > uctxt->expected_count) {
-		dd_dev_err(dd, "Expected buffer too big\n");
-		return -EINVAL;
-	}
-
-	/* Verify that access is OK for the user buffer */
-	if (!access_ok(VERIFY_WRITE, (void __user *)vaddr,
-		       npages * PAGE_SIZE)) {
-		dd_dev_err(dd, "Fail vaddr %p, %u pages, !access_ok\n",
-			   (void *)vaddr, npages);
-		return -EFAULT;
-	}
-
-	pagesets = kcalloc(uctxt->expected_count, sizeof(*pagesets),
-			   GFP_KERNEL);
-	if (!pagesets)
+	tidbuf = kzalloc(sizeof(*tidbuf), GFP_KERNEL);
+	if (!tidbuf)
 		return -ENOMEM;
 
-	/* Allocate the array of struct page pointers needed for pinning */
-	pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
-	if (!pages) {
-		ret = -ENOMEM;
-		goto bail;
+	tidbuf->vaddr = tinfo->vaddr;
+	tidbuf->length = tinfo->length;
+	tidbuf->psets = kcalloc(uctxt->expected_count, sizeof(*tidbuf->psets),
+				GFP_KERNEL);
+	if (!tidbuf->psets) {
+		kfree(tidbuf);
+		return -ENOMEM;
 	}
 
-	/*
-	 * Pin all the pages of the user buffer. If we can't pin all the
-	 * pages, accept the amount pinned so far and program only that.
-	 * User space knows how to deal with partially programmed buffers.
-	 */
-	if (!hfi1_can_pin_pages(dd, fd->mm, fd->tid_n_pinned, npages)) {
-		ret = -ENOMEM;
-		goto bail;
-	}
-
-	pinned = hfi1_acquire_user_pages(fd->mm, vaddr, npages, true, pages);
+	pinned = pin_rcv_pages(fd, tidbuf);
 	if (pinned <= 0) {
-		ret = pinned;
-		goto bail;
+		kfree(tidbuf->psets);
+		kfree(tidbuf);
+		return pinned;
 	}
-	fd->tid_n_pinned += npages;
 
 	/* Find sets of physically contiguous pages */
-	npagesets = find_phys_blocks(pages, pinned, pagesets);
+	tidbuf->n_psets = find_phys_blocks(tidbuf, pinned);
 
 	/*
 	 * We don't need to access this under a lock since tid_used is per
@@ -437,10 +353,10 @@ int hfi1_user_exp_rcv_setup(struct hfi1_filedata *fd,
 	 * and hfi1_user_exp_rcv_setup() at the same time.
 	 */
 	spin_lock(&fd->tid_lock);
-	if (fd->tid_used + npagesets > fd->tid_limit)
+	if (fd->tid_used + tidbuf->n_psets > fd->tid_limit)
 		pageset_count = fd->tid_limit - fd->tid_used;
 	else
-		pageset_count = npagesets;
+		pageset_count = tidbuf->n_psets;
 	spin_unlock(&fd->tid_lock);
 
 	if (!pageset_count)
@@ -468,9 +384,9 @@ int hfi1_user_exp_rcv_setup(struct hfi1_filedata *fd,
 		struct tid_group *grp =
 			tid_group_pop(&uctxt->tid_group_list);
 
-		ret = program_rcvarray(fd, vaddr, grp, pagesets,
+		ret = program_rcvarray(fd, tidbuf, grp,
 				       pageidx, dd->rcv_entries.group_size,
-				       pages, tidlist, &tididx, &mapped);
+				       tidlist, &tididx, &mapped);
 		/*
 		 * If there was a failure to program the RcvArray
 		 * entries for the entire group, reset the grp fields
@@ -514,8 +430,8 @@ int hfi1_user_exp_rcv_setup(struct hfi1_filedata *fd,
 			unsigned use = min_t(unsigned, pageset_count - pageidx,
 					     grp->size - grp->used);
 
-			ret = program_rcvarray(fd, vaddr, grp, pagesets,
-					       pageidx, use, pages, tidlist,
+			ret = program_rcvarray(fd, tidbuf, grp,
+					       pageidx, use, tidlist,
 					       &tididx, &mapped);
 			if (ret < 0) {
 				hfi1_cdbg(TID,
@@ -575,16 +491,14 @@ int hfi1_user_exp_rcv_setup(struct hfi1_filedata *fd,
 	 * If not everything was mapped (due to insufficient RcvArray entries,
 	 * for example), unpin all unmapped pages so we can pin them nex time.
 	 */
-	if (mapped_pages != pinned) {
-		hfi1_release_user_pages(fd->mm, &pages[mapped_pages],
-					pinned - mapped_pages,
-					false);
-		fd->tid_n_pinned -= pinned - mapped_pages;
-	}
+	if (mapped_pages != pinned)
+		unpin_rcv_pages(fd, tidbuf, NULL, mapped_pages,
+				(pinned - mapped_pages), false);
 bail:
-	kfree(pagesets);
-	kfree(pages);
+	kfree(tidbuf->psets);
 	kfree(tidlist);
+	kfree(tidbuf->pages);
+	kfree(tidbuf);
 	return ret > 0 ? 0 : ret;
 }
 
@@ -674,11 +588,12 @@ int hfi1_user_exp_rcv_invalid(struct hfi1_filedata *fd,
 	return ret;
 }
 
-static u32 find_phys_blocks(struct page **pages, unsigned npages,
-			    struct tid_pageset *list)
+static u32 find_phys_blocks(struct tid_user_buf *tidbuf, unsigned int npages)
 {
 	unsigned pagecount, pageidx, setcount = 0, i;
 	unsigned long pfn, this_pfn;
+	struct page **pages = tidbuf->pages;
+	struct tid_pageset *list = tidbuf->psets;
 
 	if (!npages)
 		return 0;
@@ -741,13 +656,13 @@ static u32 find_phys_blocks(struct page **pages, unsigned npages,
 /**
  * program_rcvarray() - program an RcvArray group with receive buffers
  * @fd: filedata pointer
- * @vaddr: starting user virtual address
+ * @tbuf: pointer to struct tid_user_buf that has the user buffer starting
+ *	  virtual address, buffer length, page pointers, pagesets (array of
+ *	  struct tid_pageset holding information on physically contiguous
+ *	  chunks from the user buffer), and other fields.
  * @grp: RcvArray group
- * @sets: array of struct tid_pageset holding information on physically
- *        contiguous chunks from the user buffer
  * @start: starting index into sets array
  * @count: number of struct tid_pageset's to program
- * @pages: an array of struct page * for the user buffer
  * @tidlist: the array of u32 elements when the information about the
  *           programmed RcvArray entries is to be encoded.
  * @tididx: starting offset into tidlist
@@ -765,11 +680,11 @@ static u32 find_phys_blocks(struct page **pages, unsigned npages,
  * -ENOMEM or -EFAULT on error from set_rcvarray_entry(), or
  * number of RcvArray entries programmed.
  */
-static int program_rcvarray(struct hfi1_filedata *fd, unsigned long vaddr,
+static int program_rcvarray(struct hfi1_filedata *fd, struct tid_user_buf *tbuf,
 			    struct tid_group *grp,
-			    struct tid_pageset *sets,
-			    unsigned start, u16 count, struct page **pages,
-			    u32 *tidlist, unsigned *tididx, unsigned *pmapped)
+			    unsigned int start, u16 count,
+			    u32 *tidlist, unsigned int *tididx,
+			    unsigned int *pmapped)
 {
 	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 	struct hfi1_devdata *dd = uctxt->dd;
@@ -808,11 +723,11 @@ static int program_rcvarray(struct hfi1_filedata *fd, unsigned long vaddr,
 		}
 
 		rcventry = grp->base + useidx;
-		npages = sets[setidx].count;
-		pageidx = sets[setidx].idx;
+		npages = tbuf->psets[setidx].count;
+		pageidx = tbuf->psets[setidx].idx;
 
-		ret = set_rcvarray_entry(fd, vaddr + (pageidx * PAGE_SIZE),
-					 rcventry, grp, pages + pageidx,
+		ret = set_rcvarray_entry(fd, tbuf,
+					 rcventry, grp, pageidx,
 					 npages);
 		if (ret)
 			return ret;
@@ -833,15 +748,17 @@ static int program_rcvarray(struct hfi1_filedata *fd, unsigned long vaddr,
 	return idx;
 }
 
-static int set_rcvarray_entry(struct hfi1_filedata *fd, unsigned long vaddr,
+static int set_rcvarray_entry(struct hfi1_filedata *fd,
+			      struct tid_user_buf *tbuf,
 			      u32 rcventry, struct tid_group *grp,
-			      struct page **pages, unsigned npages)
+			      u16 pageidx, unsigned int npages)
 {
 	int ret;
 	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 	struct tid_rb_node *node;
 	struct hfi1_devdata *dd = uctxt->dd;
 	dma_addr_t phys;
+	struct page **pages = tbuf->pages + pageidx;
 
 	/*
 	 * Allocate the node first so we can handle a potential
@@ -862,7 +779,7 @@ static int set_rcvarray_entry(struct hfi1_filedata *fd, unsigned long vaddr,
 		return -EFAULT;
 	}
 
-	node->mmu.addr = vaddr;
+	node->mmu.addr = tbuf->vaddr + (pageidx * PAGE_SIZE);
 	node->mmu.len = npages * PAGE_SIZE;
 	node->phys = page_to_phys(pages[0]);
 	node->npages = npages;
@@ -935,17 +852,13 @@ static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node)
 				 node->npages, node->mmu.addr, node->phys,
 				 node->dma_addr);
 
-	hfi1_put_tid(dd, node->rcventry, PT_INVALID, 0, 0);
 	/*
 	 * Make sure device has seen the write before we unpin the
 	 * pages.
 	 */
-	flush_wc();
+	hfi1_put_tid(dd, node->rcventry, PT_INVALID_FLUSH, 0, 0);
 
-	pci_unmap_single(dd->pcidev, node->dma_addr, node->mmu.len,
-			 PCI_DMA_FROMDEVICE);
-	hfi1_release_user_pages(fd->mm, node->pages, node->npages, true);
-	fd->tid_n_pinned -= node->npages;
+	unpin_rcv_pages(fd, NULL, node, 0, node->npages, true);
 
 	node->grp->used--;
 	node->grp->map &= ~(1 << (node->rcventry - node->grp->base));
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.h b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
index 5250c89..e383cc0 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.h
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.h
@@ -49,30 +49,44 @@
 
 #include "hfi.h"
 
-#define EXP_TID_TIDLEN_MASK   0x7FFULL
-#define EXP_TID_TIDLEN_SHIFT  0
-#define EXP_TID_TIDCTRL_MASK  0x3ULL
-#define EXP_TID_TIDCTRL_SHIFT 20
-#define EXP_TID_TIDIDX_MASK   0x3FFULL
-#define EXP_TID_TIDIDX_SHIFT  22
-#define EXP_TID_GET(tid, field)	\
-	(((tid) >> EXP_TID_TID##field##_SHIFT) & EXP_TID_TID##field##_MASK)
+#include "exp_rcv.h"
 
-#define EXP_TID_SET(field, value)			\
-	(((value) & EXP_TID_TID##field##_MASK) <<	\
-	 EXP_TID_TID##field##_SHIFT)
-#define EXP_TID_CLEAR(tid, field) ({					\
-		(tid) &= ~(EXP_TID_TID##field##_MASK <<			\
-			   EXP_TID_TID##field##_SHIFT);			\
-		})
-#define EXP_TID_RESET(tid, field, value) do {				\
-		EXP_TID_CLEAR(tid, field);				\
-		(tid) |= EXP_TID_SET(field, (value));			\
-	} while (0)
+struct tid_pageset {
+	u16 idx;
+	u16 count;
+};
 
-void hfi1_user_exp_rcv_grp_free(struct hfi1_ctxtdata *uctxt);
-int hfi1_user_exp_rcv_grp_init(struct hfi1_filedata *fd);
-int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd);
+struct tid_user_buf {
+	unsigned long vaddr;
+	unsigned long length;
+	unsigned int npages;
+	struct page **pages;
+	struct tid_pageset *psets;
+	unsigned int n_psets;
+};
+
+struct tid_rb_node {
+	struct mmu_rb_node mmu;
+	unsigned long phys;
+	struct tid_group *grp;
+	u32 rcventry;
+	dma_addr_t dma_addr;
+	bool freed;
+	unsigned int npages;
+	struct page *pages[0];
+};
+
+static inline int num_user_pages(unsigned long addr,
+				 unsigned long len)
+{
+	const unsigned long spage = addr & PAGE_MASK;
+	const unsigned long epage = (addr + len - 1) & PAGE_MASK;
+
+	return 1 + ((epage - spage) >> PAGE_SHIFT);
+}
+
+int hfi1_user_exp_rcv_init(struct hfi1_filedata *fd,
+			   struct hfi1_ctxtdata *uctxt);
 void hfi1_user_exp_rcv_free(struct hfi1_filedata *fd);
 int hfi1_user_exp_rcv_setup(struct hfi1_filedata *fd,
 			    struct hfi1_tid_info *tinfo);
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index d55339f..c0c0e04 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -64,224 +64,20 @@
 
 #include "hfi.h"
 #include "sdma.h"
+#include "mmu_rb.h"
 #include "user_sdma.h"
 #include "verbs.h"  /* for the headers */
 #include "common.h" /* for struct hfi1_tid_info */
 #include "trace.h"
-#include "mmu_rb.h"
 
 static uint hfi1_sdma_comp_ring_size = 128;
 module_param_named(sdma_comp_size, hfi1_sdma_comp_ring_size, uint, S_IRUGO);
 MODULE_PARM_DESC(sdma_comp_size, "Size of User SDMA completion ring. Default: 128");
 
-/* The maximum number of Data io vectors per message/request */
-#define MAX_VECTORS_PER_REQ 8
-/*
- * Maximum number of packet to send from each message/request
- * before moving to the next one.
- */
-#define MAX_PKTS_PER_QUEUE 16
-
-#define num_pages(x) (1 + ((((x) - 1) & PAGE_MASK) >> PAGE_SHIFT))
-
-#define req_opcode(x) \
-	(((x) >> HFI1_SDMA_REQ_OPCODE_SHIFT) & HFI1_SDMA_REQ_OPCODE_MASK)
-#define req_version(x) \
-	(((x) >> HFI1_SDMA_REQ_VERSION_SHIFT) & HFI1_SDMA_REQ_OPCODE_MASK)
-#define req_iovcnt(x) \
-	(((x) >> HFI1_SDMA_REQ_IOVCNT_SHIFT) & HFI1_SDMA_REQ_IOVCNT_MASK)
-
-/* Number of BTH.PSN bits used for sequence number in expected rcvs */
-#define BTH_SEQ_MASK 0x7ffull
-
-/*
- * Define fields in the KDETH header so we can update the header
- * template.
- */
-#define KDETH_OFFSET_SHIFT        0
-#define KDETH_OFFSET_MASK         0x7fff
-#define KDETH_OM_SHIFT            15
-#define KDETH_OM_MASK             0x1
-#define KDETH_TID_SHIFT           16
-#define KDETH_TID_MASK            0x3ff
-#define KDETH_TIDCTRL_SHIFT       26
-#define KDETH_TIDCTRL_MASK        0x3
-#define KDETH_INTR_SHIFT          28
-#define KDETH_INTR_MASK           0x1
-#define KDETH_SH_SHIFT            29
-#define KDETH_SH_MASK             0x1
-#define KDETH_HCRC_UPPER_SHIFT    16
-#define KDETH_HCRC_UPPER_MASK     0xff
-#define KDETH_HCRC_LOWER_SHIFT    24
-#define KDETH_HCRC_LOWER_MASK     0xff
-
-#define AHG_KDETH_INTR_SHIFT 12
-#define AHG_KDETH_SH_SHIFT   13
-
-#define PBC2LRH(x) ((((x) & 0xfff) << 2) - 4)
-#define LRH2PBC(x) ((((x) >> 2) + 1) & 0xfff)
-
-#define KDETH_GET(val, field)						\
-	(((le32_to_cpu((val))) >> KDETH_##field##_SHIFT) & KDETH_##field##_MASK)
-#define KDETH_SET(dw, field, val) do {					\
-		u32 dwval = le32_to_cpu(dw);				\
-		dwval &= ~(KDETH_##field##_MASK << KDETH_##field##_SHIFT); \
-		dwval |= (((val) & KDETH_##field##_MASK) << \
-			  KDETH_##field##_SHIFT);			\
-		dw = cpu_to_le32(dwval);				\
-	} while (0)
-
-#define AHG_HEADER_SET(arr, idx, dw, bit, width, value)			\
-	do {								\
-		if ((idx) < ARRAY_SIZE((arr)))				\
-			(arr)[(idx++)] = sdma_build_ahg_descriptor(	\
-				(__force u16)(value), (dw), (bit),	\
-							(width));	\
-		else							\
-			return -ERANGE;					\
-	} while (0)
-
-/* KDETH OM multipliers and switch over point */
-#define KDETH_OM_SMALL     4
-#define KDETH_OM_SMALL_SHIFT     2
-#define KDETH_OM_LARGE     64
-#define KDETH_OM_LARGE_SHIFT     6
-#define KDETH_OM_MAX_SIZE  (1 << ((KDETH_OM_LARGE / KDETH_OM_SMALL) + 1))
-
-/* Tx request flag bits */
-#define TXREQ_FLAGS_REQ_ACK   BIT(0)      /* Set the ACK bit in the header */
-#define TXREQ_FLAGS_REQ_DISABLE_SH BIT(1) /* Disable header suppression */
-
-/* SDMA request flag bits */
-#define SDMA_REQ_FOR_THREAD 1
-#define SDMA_REQ_SEND_DONE  2
-#define SDMA_REQ_HAS_ERROR  3
-#define SDMA_REQ_DONE_ERROR 4
-
-#define SDMA_PKT_Q_INACTIVE BIT(0)
-#define SDMA_PKT_Q_ACTIVE   BIT(1)
-#define SDMA_PKT_Q_DEFERRED BIT(2)
-
-/*
- * Maximum retry attempts to submit a TX request
- * before putting the process to sleep.
- */
-#define MAX_DEFER_RETRY_COUNT 1
-
 static unsigned initial_pkt_count = 8;
 
-#define SDMA_IOWAIT_TIMEOUT 1000 /* in milliseconds */
-
-struct sdma_mmu_node;
-
-struct user_sdma_iovec {
-	struct list_head list;
-	struct iovec iov;
-	/* number of pages in this vector */
-	unsigned npages;
-	/* array of pinned pages for this vector */
-	struct page **pages;
-	/*
-	 * offset into the virtual address space of the vector at
-	 * which we last left off.
-	 */
-	u64 offset;
-	struct sdma_mmu_node *node;
-};
-
-struct sdma_mmu_node {
-	struct mmu_rb_node rb;
-	struct hfi1_user_sdma_pkt_q *pq;
-	atomic_t refcount;
-	struct page **pages;
-	unsigned npages;
-};
-
-/* evict operation argument */
-struct evict_data {
-	u32 cleared;	/* count evicted so far */
-	u32 target;	/* target count to evict */
-};
-
-struct user_sdma_request {
-	struct sdma_req_info info;
-	struct hfi1_user_sdma_pkt_q *pq;
-	struct hfi1_user_sdma_comp_q *cq;
-	/* This is the original header from user space */
-	struct hfi1_pkt_header hdr;
-	/*
-	 * Pointer to the SDMA engine for this request.
-	 * Since different request could be on different VLs,
-	 * each request will need it's own engine pointer.
-	 */
-	struct sdma_engine *sde;
-	s8 ahg_idx;
-	u32 ahg[9];
-	/*
-	 * KDETH.Offset (Eager) field
-	 * We need to remember the initial value so the headers
-	 * can be updated properly.
-	 */
-	u32 koffset;
-	/*
-	 * KDETH.OFFSET (TID) field
-	 * The offset can cover multiple packets, depending on the
-	 * size of the TID entry.
-	 */
-	u32 tidoffset;
-	/*
-	 * We copy the iovs for this request (based on
-	 * info.iovcnt). These are only the data vectors
-	 */
-	unsigned data_iovs;
-	/* total length of the data in the request */
-	u32 data_len;
-	/* progress index moving along the iovs array */
-	unsigned iov_idx;
-	struct user_sdma_iovec iovs[MAX_VECTORS_PER_REQ];
-	/* number of elements copied to the tids array */
-	u16 n_tids;
-	/* TID array values copied from the tid_iov vector */
-	u32 *tids;
-	u16 tididx;
-	u32 sent;
-	u64 seqnum;
-	u64 seqcomp;
-	u64 seqsubmitted;
-	struct list_head txps;
-	unsigned long flags;
-	/* status of the last txreq completed */
-	int status;
-};
-
-/*
- * A single txreq could span up to 3 physical pages when the MTU
- * is sufficiently large (> 4K). Each of the IOV pointers also
- * needs it's own set of flags so the vector has been handled
- * independently of each other.
- */
-struct user_sdma_txreq {
-	/* Packet header for the txreq */
-	struct hfi1_pkt_header hdr;
-	struct sdma_txreq txreq;
-	struct list_head list;
-	struct user_sdma_request *req;
-	u16 flags;
-	unsigned busycount;
-	u64 seqnum;
-};
-
-#define SDMA_DBG(req, fmt, ...)				     \
-	hfi1_cdbg(SDMA, "[%u:%u:%u:%u] " fmt, (req)->pq->dd->unit, \
-		 (req)->pq->ctxt, (req)->pq->subctxt, (req)->info.comp_idx, \
-		 ##__VA_ARGS__)
-#define SDMA_Q_DBG(pq, fmt, ...)			 \
-	hfi1_cdbg(SDMA, "[%u:%u:%u] " fmt, (pq)->dd->unit, (pq)->ctxt, \
-		 (pq)->subctxt, ##__VA_ARGS__)
-
 static int user_sdma_send_pkts(struct user_sdma_request *req,
 			       unsigned maxpkts);
-static int num_user_pages(const struct iovec *iov);
 static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status);
 static inline void pq_update(struct hfi1_user_sdma_pkt_q *pq);
 static void user_sdma_free_request(struct user_sdma_request *req, bool unpin);
@@ -307,7 +103,8 @@ static int defer_packet_queue(
 	struct sdma_engine *sde,
 	struct iowait *wait,
 	struct sdma_txreq *txreq,
-	unsigned int seq);
+	uint seq,
+	bool pkts_sent);
 static void activate_packet_queue(struct iowait *wait, int reason);
 static bool sdma_rb_filter(struct mmu_rb_node *node, unsigned long addr,
 			   unsigned long len);
@@ -329,7 +126,8 @@ static int defer_packet_queue(
 	struct sdma_engine *sde,
 	struct iowait *wait,
 	struct sdma_txreq *txreq,
-	unsigned seq)
+	uint seq,
+	bool pkts_sent)
 {
 	struct hfi1_user_sdma_pkt_q *pq =
 		container_of(wait, struct hfi1_user_sdma_pkt_q, busy);
@@ -349,7 +147,7 @@ static int defer_packet_queue(
 	xchg(&pq->state, SDMA_PKT_Q_DEFERRED);
 	write_seqlock(&dev->iowait_lock);
 	if (list_empty(&pq->busy.list))
-		list_add_tail(&pq->busy.list, &sde->dmawait);
+		iowait_queue(pkts_sent, &pq->busy, &sde->dmawait);
 	write_sequnlock(&dev->iowait_lock);
 	return -EBUSY;
 eagain:
@@ -364,13 +162,6 @@ static void activate_packet_queue(struct iowait *wait, int reason)
 	wake_up(&wait->wait_dma);
 };
 
-static void sdma_kmem_cache_ctor(void *obj)
-{
-	struct user_sdma_txreq *tx = obj;
-
-	memset(tx, 0, sizeof(*tx));
-}
-
 int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
 				struct hfi1_filedata *fd)
 {
@@ -379,7 +170,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
 	struct hfi1_devdata *dd;
 	struct hfi1_user_sdma_comp_q *cq;
 	struct hfi1_user_sdma_pkt_q *pq;
-	unsigned long flags;
 
 	if (!uctxt || !fd)
 		return -EBADF;
@@ -393,7 +183,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
 	if (!pq)
 		return -ENOMEM;
 
-	INIT_LIST_HEAD(&pq->list);
 	pq->dd = dd;
 	pq->ctxt = uctxt->ctxt;
 	pq->subctxt = fd->subctxt;
@@ -426,7 +215,7 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
 					    sizeof(struct user_sdma_txreq),
 					    L1_CACHE_BYTES,
 					    SLAB_HWCACHE_ALIGN,
-					    sdma_kmem_cache_ctor);
+					    NULL);
 	if (!pq->txreq_cache) {
 		dd_dev_err(dd, "[%u] Failed to allocate TxReq cache\n",
 			   uctxt->ctxt);
@@ -454,10 +243,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
 	fd->pq = pq;
 	fd->cq = cq;
 
-	spin_lock_irqsave(&uctxt->sdma_qlock, flags);
-	list_add(&pq->list, &uctxt->sdma_queues);
-	spin_unlock_irqrestore(&uctxt->sdma_qlock, flags);
-
 	return 0;
 
 pq_mmu_fail:
@@ -476,22 +261,17 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
 	return ret;
 }
 
-int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd)
+int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd,
+			       struct hfi1_ctxtdata *uctxt)
 {
-	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 	struct hfi1_user_sdma_pkt_q *pq;
-	unsigned long flags;
 
-	hfi1_cdbg(SDMA, "[%u:%u:%u] Freeing user SDMA queues", uctxt->dd->unit,
-		  uctxt->ctxt, fd->subctxt);
+	trace_hfi1_sdma_user_free_queues(uctxt->dd, uctxt->ctxt, fd->subctxt);
+
 	pq = fd->pq;
 	if (pq) {
 		if (pq->handler)
 			hfi1_mmu_rb_unregister(pq->handler);
-		spin_lock_irqsave(&uctxt->sdma_qlock, flags);
-		if (!list_empty(&pq->list))
-			list_del_init(&pq->list);
-		spin_unlock_irqrestore(&uctxt->sdma_qlock, flags);
 		iowait_sdma_drain(&pq->busy);
 		/* Wait until all requests have been freed. */
 		wait_event_interruptible(
@@ -546,6 +326,8 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 	struct sdma_req_info info;
 	struct user_sdma_request *req;
 	u8 opcode, sc, vl;
+	u16 pkey;
+	u32 slid;
 	int req_queued = 0;
 	u16 dlid;
 	u32 selector;
@@ -567,7 +349,6 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 
 	trace_hfi1_sdma_user_reqinfo(dd, uctxt->ctxt, fd->subctxt,
 				     (u16 *)&info);
-
 	if (info.comp_idx >= hfi1_sdma_comp_ring_size) {
 		hfi1_cdbg(SDMA,
 			  "[%u:%u:%u:%u] Invalid comp index",
@@ -604,15 +385,23 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 	/*
 	 * All safety checks have been done and this request has been claimed.
 	 */
-	hfi1_cdbg(SDMA, "[%u:%u:%u] Using req/comp entry %u\n", dd->unit,
-		  uctxt->ctxt, fd->subctxt, info.comp_idx);
+	trace_hfi1_sdma_user_process_request(dd, uctxt->ctxt, fd->subctxt,
+					     info.comp_idx);
 	req = pq->reqs + info.comp_idx;
-	memset(req, 0, sizeof(*req));
 	req->data_iovs = req_iovcnt(info.ctrl) - 1; /* subtract header vector */
+	req->data_len  = 0;
 	req->pq = pq;
 	req->cq = cq;
 	req->status = -1;
 	req->ahg_idx = -1;
+	req->iov_idx = 0;
+	req->sent = 0;
+	req->seqnum = 0;
+	req->seqcomp = 0;
+	req->seqsubmitted = 0;
+	req->tids = NULL;
+	req->done = 0;
+	req->has_error = 0;
 	INIT_LIST_HEAD(&req->txps);
 
 	memcpy(&req->info, &info, sizeof(info));
@@ -671,8 +460,9 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 	}
 
 	/* Checking P_KEY for requests from user-space */
-	if (egress_pkey_check(dd->pport, req->hdr.lrh, req->hdr.bth, sc,
-			      PKEY_CHECK_INVALID)) {
+	pkey = (u16)be32_to_cpu(req->hdr.bth[0]);
+	slid = be16_to_cpu(req->hdr.lrh[3]);
+	if (egress_pkey_check(dd->pport, slid, pkey, sc, PKEY_CHECK_INVALID)) {
 		ret = -EINVAL;
 		goto free_req;
 	}
@@ -696,24 +486,27 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 	req->tidoffset = KDETH_GET(req->hdr.kdeth.ver_tid_offset, OFFSET) *
 		(KDETH_GET(req->hdr.kdeth.ver_tid_offset, OM) ?
 		 KDETH_OM_LARGE : KDETH_OM_SMALL);
-	SDMA_DBG(req, "Initial TID offset %u", req->tidoffset);
+	trace_hfi1_sdma_user_initial_tidoffset(dd, uctxt->ctxt, fd->subctxt,
+					       info.comp_idx, req->tidoffset);
 	idx++;
 
 	/* Save all the IO vector structures */
 	for (i = 0; i < req->data_iovs; i++) {
+		req->iovs[i].offset = 0;
 		INIT_LIST_HEAD(&req->iovs[i].list);
 		memcpy(&req->iovs[i].iov,
 		       iovec + idx++,
 		       sizeof(req->iovs[i].iov));
 		ret = pin_vector_pages(req, &req->iovs[i]);
 		if (ret) {
+			req->data_iovs = i;
 			req->status = ret;
 			goto free_req;
 		}
 		req->data_len += req->iovs[i].iov.iov_len;
 	}
-	SDMA_DBG(req, "total data length %u", req->data_len);
-
+	trace_hfi1_sdma_user_data_length(dd, uctxt->ctxt, fd->subctxt,
+					 info.comp_idx, req->data_len);
 	if (pcount > req->info.npkts)
 		pcount = req->info.npkts;
 	/*
@@ -749,6 +542,7 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 		}
 		req->tids = tmp;
 		req->n_tids = ntids;
+		req->tididx = 0;
 		idx++;
 	}
 
@@ -791,12 +585,12 @@ int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 	 * request have been submitted to the SDMA engine. However, it
 	 * will not wait for send completions.
 	 */
-	while (!test_bit(SDMA_REQ_SEND_DONE, &req->flags)) {
+	while (req->seqsubmitted != req->info.npkts) {
 		ret = user_sdma_send_pkts(req, pcount);
 		if (ret < 0) {
 			if (ret != -EBUSY) {
 				req->status = ret;
-				set_bit(SDMA_REQ_DONE_ERROR, &req->flags);
+				WRITE_ONCE(req->has_error, 1);
 				if (ACCESS_ONCE(req->seqcomp) ==
 				    req->seqsubmitted - 1)
 					goto free_req;
@@ -867,7 +661,11 @@ static inline u32 compute_data_length(struct user_sdma_request *req,
 	} else {
 		len = min(req->data_len - req->sent, (u32)req->info.fragsize);
 	}
-	SDMA_DBG(req, "Data Length = %u", len);
+	trace_hfi1_sdma_user_compute_length(req->pq->dd,
+					    req->pq->ctxt,
+					    req->pq->subctxt,
+					    req->info.comp_idx,
+					    len);
 	return len;
 }
 
@@ -884,6 +682,84 @@ static inline u32 get_lrh_len(struct hfi1_pkt_header hdr, u32 len)
 	return ((sizeof(hdr) - sizeof(hdr.pbc)) + 4 + len);
 }
 
+static int user_sdma_txadd_ahg(struct user_sdma_request *req,
+			       struct user_sdma_txreq *tx,
+			       u32 datalen)
+{
+	int ret;
+	u16 pbclen = le16_to_cpu(req->hdr.pbc[0]);
+	u32 lrhlen = get_lrh_len(req->hdr, pad_len(datalen));
+	struct hfi1_user_sdma_pkt_q *pq = req->pq;
+
+	/*
+	 * Copy the request header into the tx header
+	 * because the HW needs a cacheline-aligned
+	 * address.
+	 * This copy can be optimized out if the hdr
+	 * member of user_sdma_request were also
+	 * cacheline aligned.
+	 */
+	memcpy(&tx->hdr, &req->hdr, sizeof(tx->hdr));
+	if (PBC2LRH(pbclen) != lrhlen) {
+		pbclen = (pbclen & 0xf000) | LRH2PBC(lrhlen);
+		tx->hdr.pbc[0] = cpu_to_le16(pbclen);
+	}
+	ret = check_header_template(req, &tx->hdr, lrhlen, datalen);
+	if (ret)
+		return ret;
+	ret = sdma_txinit_ahg(&tx->txreq, SDMA_TXREQ_F_AHG_COPY,
+			      sizeof(tx->hdr) + datalen, req->ahg_idx,
+			      0, NULL, 0, user_sdma_txreq_cb);
+	if (ret)
+		return ret;
+	ret = sdma_txadd_kvaddr(pq->dd, &tx->txreq, &tx->hdr, sizeof(tx->hdr));
+	if (ret)
+		sdma_txclean(pq->dd, &tx->txreq);
+	return ret;
+}
+
+static int user_sdma_txadd(struct user_sdma_request *req,
+			   struct user_sdma_txreq *tx,
+			   struct user_sdma_iovec *iovec, u32 datalen,
+			   u32 *queued_ptr, u32 *data_sent_ptr,
+			   u64 *iov_offset_ptr)
+{
+	int ret;
+	unsigned int pageidx, len;
+	unsigned long base, offset;
+	u64 iov_offset = *iov_offset_ptr;
+	u32 queued = *queued_ptr, data_sent = *data_sent_ptr;
+	struct hfi1_user_sdma_pkt_q *pq = req->pq;
+
+	base = (unsigned long)iovec->iov.iov_base;
+	offset = offset_in_page(base + iovec->offset + iov_offset);
+	pageidx = (((iovec->offset + iov_offset + base) - (base & PAGE_MASK)) >>
+		   PAGE_SHIFT);
+	len = offset + req->info.fragsize > PAGE_SIZE ?
+		PAGE_SIZE - offset : req->info.fragsize;
+	len = min((datalen - queued), len);
+	ret = sdma_txadd_page(pq->dd, &tx->txreq, iovec->pages[pageidx],
+			      offset, len);
+	if (ret) {
+		SDMA_DBG(req, "SDMA txreq add page failed %d\n", ret);
+		return ret;
+	}
+	iov_offset += len;
+	queued += len;
+	data_sent += len;
+	if (unlikely(queued < datalen && pageidx == iovec->npages &&
+		     req->iov_idx < req->data_iovs - 1)) {
+		iovec->offset += iov_offset;
+		iovec = &req->iovs[++req->iov_idx];
+		iov_offset = 0;
+	}
+
+	*queued_ptr = queued;
+	*data_sent_ptr = data_sent;
+	*iov_offset_ptr = iov_offset;
+	return ret;
+}
+
 static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 {
 	int ret = 0, count;
@@ -898,10 +774,8 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 	pq = req->pq;
 
 	/* If tx completion has reported an error, we are done. */
-	if (test_bit(SDMA_REQ_HAS_ERROR, &req->flags)) {
-		set_bit(SDMA_REQ_DONE_ERROR, &req->flags);
+	if (READ_ONCE(req->has_error))
 		return -EFAULT;
-	}
 
 	/*
 	 * Check if we might have sent the entire request already
@@ -924,10 +798,8 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 		 * with errors. If so, we are not going to process any
 		 * more packets from this request.
 		 */
-		if (test_bit(SDMA_REQ_HAS_ERROR, &req->flags)) {
-			set_bit(SDMA_REQ_DONE_ERROR, &req->flags);
+		if (READ_ONCE(req->has_error))
 			return -EFAULT;
-		}
 
 		tx = kmem_cache_alloc(pq->txreq_cache, GFP_KERNEL);
 		if (!tx)
@@ -984,39 +856,9 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 
 		if (req->ahg_idx >= 0) {
 			if (!req->seqnum) {
-				u16 pbclen = le16_to_cpu(req->hdr.pbc[0]);
-				u32 lrhlen = get_lrh_len(req->hdr,
-							 pad_len(datalen));
-				/*
-				 * Copy the request header into the tx header
-				 * because the HW needs a cacheline-aligned
-				 * address.
-				 * This copy can be optimized out if the hdr
-				 * member of user_sdma_request were also
-				 * cacheline aligned.
-				 */
-				memcpy(&tx->hdr, &req->hdr, sizeof(tx->hdr));
-				if (PBC2LRH(pbclen) != lrhlen) {
-					pbclen = (pbclen & 0xf000) |
-						LRH2PBC(lrhlen);
-					tx->hdr.pbc[0] = cpu_to_le16(pbclen);
-				}
-				ret = check_header_template(req, &tx->hdr,
-							    lrhlen, datalen);
+				ret = user_sdma_txadd_ahg(req, tx, datalen);
 				if (ret)
 					goto free_tx;
-				ret = sdma_txinit_ahg(&tx->txreq,
-						      SDMA_TXREQ_F_AHG_COPY,
-						      sizeof(tx->hdr) + datalen,
-						      req->ahg_idx, 0, NULL, 0,
-						      user_sdma_txreq_cb);
-				if (ret)
-					goto free_tx;
-				ret = sdma_txadd_kvaddr(pq->dd, &tx->txreq,
-							&tx->hdr,
-							sizeof(tx->hdr));
-				if (ret)
-					goto free_txreq;
 			} else {
 				int changes;
 
@@ -1024,11 +866,6 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 							       datalen);
 				if (changes < 0)
 					goto free_tx;
-				sdma_txinit_ahg(&tx->txreq,
-						SDMA_TXREQ_F_USE_AHG,
-						datalen, req->ahg_idx, changes,
-						req->ahg, sizeof(req->hdr),
-						user_sdma_txreq_cb);
 			}
 		} else {
 			ret = sdma_txinit(&tx->txreq, 0, sizeof(req->hdr) +
@@ -1052,35 +889,10 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 		 */
 		while (queued < datalen &&
 		       (req->sent + data_sent) < req->data_len) {
-			unsigned long base, offset;
-			unsigned pageidx, len;
-
-			base = (unsigned long)iovec->iov.iov_base;
-			offset = offset_in_page(base + iovec->offset +
-						iov_offset);
-			pageidx = (((iovec->offset + iov_offset +
-				     base) - (base & PAGE_MASK)) >> PAGE_SHIFT);
-			len = offset + req->info.fragsize > PAGE_SIZE ?
-				PAGE_SIZE - offset : req->info.fragsize;
-			len = min((datalen - queued), len);
-			ret = sdma_txadd_page(pq->dd, &tx->txreq,
-					      iovec->pages[pageidx],
-					      offset, len);
-			if (ret) {
-				SDMA_DBG(req, "SDMA txreq add page failed %d\n",
-					 ret);
+			ret = user_sdma_txadd(req, tx, iovec, datalen,
+					      &queued, &data_sent, &iov_offset);
+			if (ret)
 				goto free_txreq;
-			}
-			iov_offset += len;
-			queued += len;
-			data_sent += len;
-			if (unlikely(queued < datalen &&
-				     pageidx == iovec->npages &&
-				     req->iov_idx < req->data_iovs - 1)) {
-				iovec->offset += iov_offset;
-				iovec = &req->iovs[++req->iov_idx];
-				iov_offset = 0;
-			}
 		}
 		/*
 		 * The txreq was submitted successfully so we can update
@@ -1105,7 +917,7 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 	ret = sdma_send_txlist(req->sde, &pq->busy, &req->txps, &count);
 	req->seqsubmitted += count;
 	if (req->seqsubmitted == req->info.npkts) {
-		set_bit(SDMA_REQ_SEND_DONE, &req->flags);
+		WRITE_ONCE(req->done, 1);
 		/*
 		 * The txreq has already been submitted to the HW queue
 		 * so we can free the AHG entry now. Corruption will not
@@ -1124,19 +936,6 @@ static int user_sdma_send_pkts(struct user_sdma_request *req, unsigned maxpkts)
 	return ret;
 }
 
-/*
- * How many pages in this iovec element?
- */
-static inline int num_user_pages(const struct iovec *iov)
-{
-	const unsigned long addr  = (unsigned long)iov->iov_base;
-	const unsigned long len   = iov->iov_len;
-	const unsigned long spage = addr & PAGE_MASK;
-	const unsigned long epage = (addr + len - 1) & PAGE_MASK;
-
-	return 1 + ((epage - spage) >> PAGE_SHIFT);
-}
-
 static u32 sdma_cache_evict(struct hfi1_user_sdma_pkt_q *pq, u32 npages)
 {
 	struct evict_data evict_data;
@@ -1147,22 +946,82 @@ static u32 sdma_cache_evict(struct hfi1_user_sdma_pkt_q *pq, u32 npages)
 	return evict_data.cleared;
 }
 
+static int pin_sdma_pages(struct user_sdma_request *req,
+			  struct user_sdma_iovec *iovec,
+			  struct sdma_mmu_node *node,
+			  int npages)
+{
+	int pinned, cleared;
+	struct page **pages;
+	struct hfi1_user_sdma_pkt_q *pq = req->pq;
+
+	pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
+	if (!pages) {
+		SDMA_DBG(req, "Failed page array alloc");
+		return -ENOMEM;
+	}
+	memcpy(pages, node->pages, node->npages * sizeof(*pages));
+
+	npages -= node->npages;
+retry:
+	if (!hfi1_can_pin_pages(pq->dd, pq->mm,
+				atomic_read(&pq->n_locked), npages)) {
+		cleared = sdma_cache_evict(pq, npages);
+		if (cleared >= npages)
+			goto retry;
+	}
+	pinned = hfi1_acquire_user_pages(pq->mm,
+					 ((unsigned long)iovec->iov.iov_base +
+					 (node->npages * PAGE_SIZE)), npages, 0,
+					 pages + node->npages);
+	if (pinned < 0) {
+		kfree(pages);
+		return pinned;
+	}
+	if (pinned != npages) {
+		unpin_vector_pages(pq->mm, pages, node->npages, pinned);
+		return -EFAULT;
+	}
+	kfree(node->pages);
+	node->rb.len = iovec->iov.iov_len;
+	node->pages = pages;
+	atomic_add(pinned, &pq->n_locked);
+	return pinned;
+}
+
+static void unpin_sdma_pages(struct sdma_mmu_node *node)
+{
+	if (node->npages) {
+		unpin_vector_pages(node->pq->mm, node->pages, 0, node->npages);
+		atomic_sub(node->npages, &node->pq->n_locked);
+	}
+}
+
 static int pin_vector_pages(struct user_sdma_request *req,
 			    struct user_sdma_iovec *iovec)
 {
-	int ret = 0, pinned, npages, cleared;
-	struct page **pages;
+	int ret = 0, pinned, npages;
 	struct hfi1_user_sdma_pkt_q *pq = req->pq;
 	struct sdma_mmu_node *node = NULL;
 	struct mmu_rb_node *rb_node;
+	struct iovec *iov;
+	bool extracted;
 
-	rb_node = hfi1_mmu_rb_extract(pq->handler,
-				      (unsigned long)iovec->iov.iov_base,
-				      iovec->iov.iov_len);
-	if (rb_node)
+	extracted =
+		hfi1_mmu_rb_remove_unless_exact(pq->handler,
+						(unsigned long)
+						iovec->iov.iov_base,
+						iovec->iov.iov_len, &rb_node);
+	if (rb_node) {
 		node = container_of(rb_node, struct sdma_mmu_node, rb);
-	else
-		rb_node = NULL;
+		if (!extracted) {
+			atomic_inc(&node->refcount);
+			iovec->pages = node->pages;
+			iovec->npages = node->npages;
+			iovec->node = node;
+			return 0;
+		}
+	}
 
 	if (!node) {
 		node = kzalloc(sizeof(*node), GFP_KERNEL);
@@ -1174,46 +1033,16 @@ static int pin_vector_pages(struct user_sdma_request *req,
 		atomic_set(&node->refcount, 0);
 	}
 
-	npages = num_user_pages(&iovec->iov);
+	iov = &iovec->iov;
+	npages = num_user_pages((unsigned long)iov->iov_base, iov->iov_len);
 	if (node->npages < npages) {
-		pages = kcalloc(npages, sizeof(*pages), GFP_KERNEL);
-		if (!pages) {
-			SDMA_DBG(req, "Failed page array alloc");
-			ret = -ENOMEM;
-			goto bail;
-		}
-		memcpy(pages, node->pages, node->npages * sizeof(*pages));
-
-		npages -= node->npages;
-
-retry:
-		if (!hfi1_can_pin_pages(pq->dd, pq->mm,
-					atomic_read(&pq->n_locked), npages)) {
-			cleared = sdma_cache_evict(pq, npages);
-			if (cleared >= npages)
-				goto retry;
-		}
-		pinned = hfi1_acquire_user_pages(pq->mm,
-			((unsigned long)iovec->iov.iov_base +
-			 (node->npages * PAGE_SIZE)), npages, 0,
-			pages + node->npages);
+		pinned = pin_sdma_pages(req, iovec, node, npages);
 		if (pinned < 0) {
-			kfree(pages);
 			ret = pinned;
 			goto bail;
 		}
-		if (pinned != npages) {
-			unpin_vector_pages(pq->mm, pages, node->npages,
-					   pinned);
-			ret = -EFAULT;
-			goto bail;
-		}
-		kfree(node->pages);
-		node->rb.len = iovec->iov.iov_len;
-		node->pages = pages;
 		node->npages += pinned;
 		npages = node->npages;
-		atomic_add(pinned, &pq->n_locked);
 	}
 	iovec->pages = node->pages;
 	iovec->npages = npages;
@@ -1221,14 +1050,12 @@ static int pin_vector_pages(struct user_sdma_request *req,
 
 	ret = hfi1_mmu_rb_insert(req->pq->handler, &node->rb);
 	if (ret) {
-		atomic_sub(node->npages, &pq->n_locked);
 		iovec->node = NULL;
 		goto bail;
 	}
 	return 0;
 bail:
-	if (rb_node)
-		unpin_vector_pages(pq->mm, node->pages, 0, node->npages);
+	unpin_sdma_pages(node);
 	kfree(node);
 	return ret;
 }
@@ -1408,9 +1235,10 @@ static int set_txreq_header(struct user_sdma_request *req,
 		 * Set the KDETH.OFFSET and KDETH.OM based on size of
 		 * transfer.
 		 */
-		SDMA_DBG(req, "TID offset %ubytes %uunits om%u",
-			 req->tidoffset, req->tidoffset >> omfactor,
-			 omfactor != KDETH_OM_SMALL_SHIFT);
+		trace_hfi1_sdma_user_tid_info(
+			pq->dd, pq->ctxt, pq->subctxt, req->info.comp_idx,
+			req->tidoffset, req->tidoffset >> omfactor,
+			omfactor != KDETH_OM_SMALL_SHIFT);
 		KDETH_SET(hdr->kdeth.ver_tid_offset, OFFSET,
 			  req->tidoffset >> omfactor);
 		KDETH_SET(hdr->kdeth.ver_tid_offset, OM,
@@ -1423,21 +1251,22 @@ static int set_txreq_header(struct user_sdma_request *req,
 }
 
 static int set_txreq_header_ahg(struct user_sdma_request *req,
-				struct user_sdma_txreq *tx, u32 len)
+				struct user_sdma_txreq *tx, u32 datalen)
 {
+	u32 ahg[AHG_KDETH_ARRAY_SIZE];
 	int diff = 0;
 	u8 omfactor; /* KDETH.OM */
 	struct hfi1_user_sdma_pkt_q *pq = req->pq;
 	struct hfi1_pkt_header *hdr = &req->hdr;
 	u16 pbclen = le16_to_cpu(hdr->pbc[0]);
-	u32 val32, tidval = 0, lrhlen = get_lrh_len(*hdr, pad_len(len));
+	u32 val32, tidval = 0, lrhlen = get_lrh_len(*hdr, pad_len(datalen));
 
 	if (PBC2LRH(pbclen) != lrhlen) {
 		/* PBC.PbcLengthDWs */
-		AHG_HEADER_SET(req->ahg, diff, 0, 0, 12,
+		AHG_HEADER_SET(ahg, diff, 0, 0, 12,
 			       cpu_to_le16(LRH2PBC(lrhlen)));
 		/* LRH.PktLen (we need the full 16 bits due to byte swap) */
-		AHG_HEADER_SET(req->ahg, diff, 3, 0, 16,
+		AHG_HEADER_SET(ahg, diff, 3, 0, 16,
 			       cpu_to_be16(lrhlen >> 2));
 	}
 
@@ -1449,13 +1278,12 @@ static int set_txreq_header_ahg(struct user_sdma_request *req,
 		(HFI1_CAP_IS_KSET(EXTENDED_PSN) ? 0x7fffffff : 0xffffff);
 	if (unlikely(tx->flags & TXREQ_FLAGS_REQ_ACK))
 		val32 |= 1UL << 31;
-	AHG_HEADER_SET(req->ahg, diff, 6, 0, 16, cpu_to_be16(val32 >> 16));
-	AHG_HEADER_SET(req->ahg, diff, 6, 16, 16, cpu_to_be16(val32 & 0xffff));
+	AHG_HEADER_SET(ahg, diff, 6, 0, 16, cpu_to_be16(val32 >> 16));
+	AHG_HEADER_SET(ahg, diff, 6, 16, 16, cpu_to_be16(val32 & 0xffff));
 	/* KDETH.Offset */
-	AHG_HEADER_SET(req->ahg, diff, 15, 0, 16,
+	AHG_HEADER_SET(ahg, diff, 15, 0, 16,
 		       cpu_to_le16(req->koffset & 0xffff));
-	AHG_HEADER_SET(req->ahg, diff, 15, 16, 16,
-		       cpu_to_le16(req->koffset >> 16));
+	AHG_HEADER_SET(ahg, diff, 15, 16, 16, cpu_to_le16(req->koffset >> 16));
 	if (req_opcode(req->info.ctrl) == EXPECTED) {
 		__le16 val;
 
@@ -1473,9 +1301,8 @@ static int set_txreq_header_ahg(struct user_sdma_request *req,
 			 * we have to check again.
 			 */
 			if (++req->tididx > req->n_tids - 1 ||
-			    !req->tids[req->tididx]) {
+			    !req->tids[req->tididx])
 				return -EINVAL;
-			}
 			tidval = req->tids[req->tididx];
 		}
 		omfactor = ((EXP_TID_GET(tidval, LEN) *
@@ -1483,7 +1310,7 @@ static int set_txreq_header_ahg(struct user_sdma_request *req,
 				 KDETH_OM_MAX_SIZE) ? KDETH_OM_LARGE_SHIFT :
 				 KDETH_OM_SMALL_SHIFT;
 		/* KDETH.OM and KDETH.OFFSET (TID) */
-		AHG_HEADER_SET(req->ahg, diff, 7, 0, 16,
+		AHG_HEADER_SET(ahg, diff, 7, 0, 16,
 			       ((!!(omfactor - KDETH_OM_SMALL_SHIFT)) << 15 |
 				((req->tidoffset >> omfactor)
 				 & 0x7fff)));
@@ -1503,12 +1330,20 @@ static int set_txreq_header_ahg(struct user_sdma_request *req,
 					     AHG_KDETH_INTR_SHIFT));
 		}
 
-		AHG_HEADER_SET(req->ahg, diff, 7, 16, 14, val);
+		AHG_HEADER_SET(ahg, diff, 7, 16, 14, val);
 	}
+	if (diff < 0)
+		return diff;
 
 	trace_hfi1_sdma_user_header_ahg(pq->dd, pq->ctxt, pq->subctxt,
 					req->info.comp_idx, req->sde->this_idx,
-					req->ahg_idx, req->ahg, diff, tidval);
+					req->ahg_idx, ahg, diff, tidval);
+	sdma_txinit_ahg(&tx->txreq,
+			SDMA_TXREQ_F_USE_AHG,
+			datalen, req->ahg_idx, diff,
+			ahg, sizeof(req->hdr),
+			user_sdma_txreq_cb);
+
 	return diff;
 }
 
@@ -1537,7 +1372,7 @@ static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status)
 	if (status != SDMA_TXREQ_S_OK) {
 		SDMA_DBG(req, "SDMA completion with error %d",
 			 status);
-		set_bit(SDMA_REQ_HAS_ERROR, &req->flags);
+		WRITE_ONCE(req->has_error, 1);
 	}
 
 	req->seqcomp = tx->seqnum;
@@ -1556,8 +1391,8 @@ static void user_sdma_txreq_cb(struct sdma_txreq *txreq, int status)
 		if (status != SDMA_TXREQ_S_OK)
 			req->status = status;
 		if (req->seqcomp == (ACCESS_ONCE(req->seqsubmitted) - 1) &&
-		    (test_bit(SDMA_REQ_SEND_DONE, &req->flags) ||
-		     test_bit(SDMA_REQ_DONE_ERROR, &req->flags))) {
+		    (READ_ONCE(req->done) ||
+		     READ_ONCE(req->has_error))) {
 			user_sdma_free_request(req, false);
 			pq_update(pq);
 			set_comp_state(pq, cq, idx, ERROR, req->status);
@@ -1611,8 +1446,6 @@ static inline void set_comp_state(struct hfi1_user_sdma_pkt_q *pq,
 				  u16 idx, enum hfi1_sdma_comp_state state,
 				  int ret)
 {
-	hfi1_cdbg(SDMA, "[%u:%u:%u:%u] Setting completion status %u %d",
-		  pq->dd->unit, pq->ctxt, pq->subctxt, idx, state, ret);
 	if (state == ERROR)
 		cq->comps[idx].errcode = -ret;
 	smp_wmb(); /* make sure errcode is visible first */
@@ -1667,10 +1500,7 @@ static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode)
 	struct sdma_mmu_node *node =
 		container_of(mnode, struct sdma_mmu_node, rb);
 
-	atomic_sub(node->npages, &node->pq->n_locked);
-
-	unpin_vector_pages(node->pq->mm, node->pages, 0, node->npages);
-
+	unpin_sdma_pages(node);
 	kfree(node);
 }
 
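Note on the AHG changes above: set_txreq_header_ahg() now fills a local array through AHG_HEADER_SET(), which appends one descriptor per call and returns -ERANGE from the enclosing function once the array is full. The following is a minimal user-space sketch of that pattern, not the driver code. Every sketch_/SKETCH_ name is hypothetical, and sketch_build_desc() is only a stand-in for sdma_build_ahg_descriptor(), whose real bit layout is not shown in this diff.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
#define SKETCH_AHG_ENTRIES 9	/* mirrors AHG_KDETH_ARRAY_SIZE in the patch */

/* Hypothetical stand-in for sdma_build_ahg_descriptor(); packing is illustrative only. */
static inline uint32_t sketch_build_desc(uint16_t value, uint8_t dw,
					 uint8_t bit, uint8_t width)
{
	return ((uint32_t)value << 16) | ((uint32_t)(dw & 0xf) << 12) |
	       ((uint32_t)(bit & 0x1f) << 7) | (uint32_t)(width & 0x1f);
}

/*
 * Same shape as AHG_HEADER_SET(): append while there is room, otherwise
 * return -ERANGE from the function that expanded the macro.
 */
#define SKETCH_HEADER_SET(arr, idx, dw, bit, width, value)		\
	do {								\
		if ((size_t)(idx) < SKETCH_ARRAY_SIZE(arr))		\
			(arr)[(idx)++] = sketch_build_desc(		\
				(uint16_t)(value), (dw), (bit), (width)); \
		else							\
			return -ERANGE;					\
	} while (0)

/*
 * Build n descriptors in a local array, then copy them to out.
 * Returns the number of descriptors written, or -ERANGE on overflow.
 * out must have room for SKETCH_AHG_ENTRIES entries.
 */
static int sketch_fill(uint32_t *out, int n)
{
	uint32_t ahg[SKETCH_AHG_ENTRIES];
	int idx = 0;
	int i;

	for (i = 0; i < n; i++)
		SKETCH_HEADER_SET(ahg, idx, i & 0xf, 0, 16, i);

	memcpy(out, ahg, (size_t)idx * sizeof(ahg[0]));
	return idx;
}
```

Because the macro hides a return statement, it can only be used in functions that return int, which is the case for set_txreq_header_ahg() in the patch.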
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index e5b10ae..9b8bb56 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -53,11 +53,68 @@
 #include "iowait.h"
 #include "user_exp_rcv.h"
 
+/* The maximum number of Data io vectors per message/request */
+#define MAX_VECTORS_PER_REQ 8
+/*
+ * Maximum number of packets to send from each message/request
+ * before moving to the next one.
+ */
+#define MAX_PKTS_PER_QUEUE 16
+
+#define num_pages(x) (1 + ((((x) - 1) & PAGE_MASK) >> PAGE_SHIFT))
+
+#define req_opcode(x) \
+	(((x) >> HFI1_SDMA_REQ_OPCODE_SHIFT) & HFI1_SDMA_REQ_OPCODE_MASK)
+#define req_version(x) \
+	(((x) >> HFI1_SDMA_REQ_VERSION_SHIFT) & HFI1_SDMA_REQ_OPCODE_MASK)
+#define req_iovcnt(x) \
+	(((x) >> HFI1_SDMA_REQ_IOVCNT_SHIFT) & HFI1_SDMA_REQ_IOVCNT_MASK)
+
+/* Number of BTH.PSN bits used for sequence number in expected rcvs */
+#define BTH_SEQ_MASK 0x7ffull
+
+#define AHG_KDETH_INTR_SHIFT 12
+#define AHG_KDETH_SH_SHIFT   13
+#define AHG_KDETH_ARRAY_SIZE  9
+
+#define PBC2LRH(x) ((((x) & 0xfff) << 2) - 4)
+#define LRH2PBC(x) ((((x) >> 2) + 1) & 0xfff)
+
+#define AHG_HEADER_SET(arr, idx, dw, bit, width, value)			\
+	do {								\
+		if ((idx) < ARRAY_SIZE((arr)))				\
+			(arr)[(idx++)] = sdma_build_ahg_descriptor(	\
+				(__force u16)(value), (dw), (bit),	\
+							(width));	\
+		else							\
+			return -ERANGE;					\
+	} while (0)
+
+/* Tx request flag bits */
+#define TXREQ_FLAGS_REQ_ACK   BIT(0)      /* Set the ACK bit in the header */
+#define TXREQ_FLAGS_REQ_DISABLE_SH BIT(1) /* Disable header suppression */
+
+#define SDMA_PKT_Q_INACTIVE BIT(0)
+#define SDMA_PKT_Q_ACTIVE   BIT(1)
+#define SDMA_PKT_Q_DEFERRED BIT(2)
+
+/*
+ * Maximum retry attempts to submit a TX request
+ * before putting the process to sleep.
+ */
+#define MAX_DEFER_RETRY_COUNT 1
+
+#define SDMA_IOWAIT_TIMEOUT 1000 /* in milliseconds */
+
+#define SDMA_DBG(req, fmt, ...)				     \
+	hfi1_cdbg(SDMA, "[%u:%u:%u:%u] " fmt, (req)->pq->dd->unit, \
+		 (req)->pq->ctxt, (req)->pq->subctxt, (req)->info.comp_idx, \
+		 ##__VA_ARGS__)
+
 extern uint extended_psn;
 
 struct hfi1_user_sdma_pkt_q {
-	struct list_head list;
-	unsigned ctxt;
+	u16 ctxt;
 	u16 subctxt;
 	u16 n_max_reqs;
 	atomic_t n_reqs;
@@ -80,9 +137,115 @@ struct hfi1_user_sdma_comp_q {
 	struct hfi1_sdma_comp_entry *comps;
 };
 
+struct sdma_mmu_node {
+	struct mmu_rb_node rb;
+	struct hfi1_user_sdma_pkt_q *pq;
+	atomic_t refcount;
+	struct page **pages;
+	unsigned int npages;
+};
+
+struct user_sdma_iovec {
+	struct list_head list;
+	struct iovec iov;
+	/* number of pages in this vector */
+	unsigned int npages;
+	/* array of pinned pages for this vector */
+	struct page **pages;
+	/*
+	 * offset into the virtual address space of the vector at
+	 * which we last left off.
+	 */
+	u64 offset;
+	struct sdma_mmu_node *node;
+};
+
+/* evict operation argument */
+struct evict_data {
+	u32 cleared;	/* count evicted so far */
+	u32 target;	/* target count to evict */
+};
+
+struct user_sdma_request {
+	/* This is the original header from user space */
+	struct hfi1_pkt_header hdr;
+
+	/* Read mostly fields */
+	struct hfi1_user_sdma_pkt_q *pq ____cacheline_aligned_in_smp;
+	struct hfi1_user_sdma_comp_q *cq;
+	/*
+	 * Pointer to the SDMA engine for this request.
+	 * Since different requests could be on different VLs,
+	 * each request will need its own engine pointer.
+	 */
+	struct sdma_engine *sde;
+	struct sdma_req_info info;
+	/* TID array values copied from the tid_iov vector */
+	u32 *tids;
+	/* total length of the data in the request */
+	u32 data_len;
+	/* number of elements copied to the tids array */
+	u16 n_tids;
+	/*
+	 * We copy the iovs for this request (based on
+	 * info.iovcnt). These are only the data vectors
+	 */
+	u8 data_iovs;
+	s8 ahg_idx;
+
+	/* Writeable fields shared with interrupt */
+	u64 seqcomp ____cacheline_aligned_in_smp;
+	u64 seqsubmitted;
+	/* status of the last txreq completed */
+	int status;
+
+	/* Send side fields */
+	struct list_head txps ____cacheline_aligned_in_smp;
+	u64 seqnum;
+	/*
+	 * KDETH.OFFSET (TID) field
+	 * The offset can cover multiple packets, depending on the
+	 * size of the TID entry.
+	 */
+	u32 tidoffset;
+	/*
+	 * KDETH.Offset (Eager) field
+	 * We need to remember the initial value so the headers
+	 * can be updated properly.
+	 */
+	u32 koffset;
+	u32 sent;
+	/* TID index copied from the tid_iov vector */
+	u16 tididx;
+	/* progress index moving along the iovs array */
+	u8 iov_idx;
+	u8 done;
+	u8 has_error;
+
+	struct user_sdma_iovec iovs[MAX_VECTORS_PER_REQ];
+} ____cacheline_aligned_in_smp;
+
+/*
+ * A single txreq could span up to 3 physical pages when the MTU
+ * is sufficiently large (> 4K). Each of the IOV pointers also
+ * needs its own set of flags so that the vectors can be handled
+ * independently of each other.
+ */
+struct user_sdma_txreq {
+	/* Packet header for the txreq */
+	struct hfi1_pkt_header hdr;
+	struct sdma_txreq txreq;
+	struct list_head list;
+	struct user_sdma_request *req;
+	u16 flags;
+	unsigned int busycount;
+	u64 seqnum;
+};
+
 int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt,
 				struct hfi1_filedata *fd);
-int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd);
+int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd,
+			       struct hfi1_ctxtdata *uctxt);
 int hfi1_user_sdma_process_request(struct hfi1_filedata *fd,
 				   struct iovec *iovec, unsigned long dim,
 				   unsigned long *count);
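Note on the length macros moved into user_sdma.h above: num_pages() counts the pages covered by a byte length, and PBC2LRH()/LRH2PBC() convert between the PBC length in dwords and the LRH length in bytes. A small stand-alone sketch of the same arithmetic follows. It assumes 4 KiB pages (the kernel takes PAGE_SHIFT and PAGE_MASK from its own headers), and all sketch_/SKETCH_ names are hypothetical.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1UL << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK  (~(SKETCH_PAGE_SIZE - 1))

/* Same expressions as num_pages(), PBC2LRH() and LRH2PBC() in the patch. */
#define sketch_num_pages(x) \
	(1 + ((((x) - 1) & SKETCH_PAGE_MASK) >> SKETCH_PAGE_SHIFT))
#define SKETCH_PBC2LRH(x) ((((x) & 0xfff) << 2) - 4)
#define SKETCH_LRH2PBC(x) ((((x) >> 2) + 1) & 0xfff)

/* Returns 1 when an LRH byte length survives the LRH -> PBC -> LRH round trip. */
static inline int sketch_lrh_roundtrip_ok(uint32_t lrhlen)
{
	return SKETCH_PBC2LRH(SKETCH_LRH2PBC(lrhlen)) == lrhlen;
}
```

The round trip only holds for lengths that are a multiple of 4 bytes and fit in the 12-bit dword field, since LRH2PBC() shifts out the low two bits and masks with 0xfff.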
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index 2d19f9b..e232f3c 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -53,6 +53,7 @@
 #include <linux/rculist.h>
 #include <linux/mm.h>
 #include <linux/vmalloc.h>
+#include <rdma/opa_addr.h>
 
 #include "hfi.h"
 #include "common.h"
@@ -508,13 +509,14 @@ void hfi1_copy_sge(
 /*
  * Make sure the QP is ready and able to accept the given opcode.
  */
-static inline opcode_handler qp_ok(int opcode, struct hfi1_packet *packet)
+static inline opcode_handler qp_ok(struct hfi1_packet *packet)
 {
 	if (!(ib_rvt_state_ops[packet->qp->state] & RVT_PROCESS_RECV_OK))
 		return NULL;
-	if (((opcode & RVT_OPCODE_QP_MASK) == packet->qp->allowed_ops) ||
-	    (opcode == IB_OPCODE_CNP))
-		return opcode_handler_tbl[opcode];
+	if (((packet->opcode & RVT_OPCODE_QP_MASK) ==
+	     packet->qp->allowed_ops) ||
+	    (packet->opcode == IB_OPCODE_CNP))
+		return opcode_handler_tbl[packet->opcode];
 
 	return NULL;
 }
@@ -548,69 +550,54 @@ static u64 hfi1_fault_tx(struct rvt_qp *qp, u8 opcode, u64 pbc)
 	return pbc;
 }
 
-/**
- * hfi1_ib_rcv - process an incoming packet
- * @packet: data packet information
- *
- * This is called to process an incoming packet at interrupt level.
- *
- * Tlen is the length of the header + data + CRC in bytes.
- */
-void hfi1_ib_rcv(struct hfi1_packet *packet)
+static int hfi1_do_pkey_check(struct hfi1_packet *packet)
 {
 	struct hfi1_ctxtdata *rcd = packet->rcd;
-	struct ib_header *hdr = packet->hdr;
-	u32 tlen = packet->tlen;
+	struct hfi1_pportdata *ppd = rcd->ppd;
+	struct hfi1_16b_header *hdr = packet->hdr;
+	u16 pkey;
+
+	/* Pkey check needed only for bypass packets */
+	if (packet->etype != RHF_RCV_TYPE_BYPASS)
+		return 0;
+
+	/* Perform pkey check */
+	pkey = hfi1_16B_get_pkey(hdr);
+	return ingress_pkey_check(ppd, pkey, packet->sc,
+				  packet->qp->s_pkey_index,
+				  packet->slid, true);
+}
+
+static inline void hfi1_handle_packet(struct hfi1_packet *packet,
+				      bool is_mcast)
+{
+	u32 qp_num;
+	struct hfi1_ctxtdata *rcd = packet->rcd;
 	struct hfi1_pportdata *ppd = rcd->ppd;
 	struct hfi1_ibport *ibp = rcd_to_iport(rcd);
 	struct rvt_dev_info *rdi = &ppd->dd->verbs_dev.rdi;
 	opcode_handler packet_handler;
 	unsigned long flags;
-	u32 qp_num;
-	int lnh;
-	u8 opcode;
-	u16 lid;
 
-	/* Check for GRH */
-	lnh = ib_get_lnh(hdr);
-	if (lnh == HFI1_LRH_BTH) {
-		packet->ohdr = &hdr->u.oth;
-	} else if (lnh == HFI1_LRH_GRH) {
-		u32 vtf;
+	inc_opstats(packet->tlen, &rcd->opstats->stats[packet->opcode]);
 
-		packet->ohdr = &hdr->u.l.oth;
-		if (hdr->u.l.grh.next_hdr != IB_GRH_NEXT_HDR)
-			goto drop;
-		vtf = be32_to_cpu(hdr->u.l.grh.version_tclass_flow);
-		if ((vtf >> IB_GRH_VERSION_SHIFT) != IB_GRH_VERSION)
-			goto drop;
-		packet->rcv_flags |= HFI1_HAS_GRH;
-	} else {
-		goto drop;
-	}
-
-	trace_input_ibhdr(rcd->dd, hdr);
-
-	opcode = ib_bth_get_opcode(packet->ohdr);
-	inc_opstats(tlen, &rcd->opstats->stats[opcode]);
-
-	/* Get the destination QP number. */
-	qp_num = be32_to_cpu(packet->ohdr->bth[1]) & RVT_QPN_MASK;
-	lid = ib_get_dlid(hdr);
-	if (unlikely((lid >= be16_to_cpu(IB_MULTICAST_LID_BASE)) &&
-		     (lid != be16_to_cpu(IB_LID_PERMISSIVE)))) {
+	if (unlikely(is_mcast)) {
 		struct rvt_mcast *mcast;
 		struct rvt_mcast_qp *p;
 
-		if (lnh != HFI1_LRH_GRH)
+		if (!packet->grh)
 			goto drop;
-		mcast = rvt_mcast_find(&ibp->rvp, &hdr->u.l.grh.dgid, lid);
+		mcast = rvt_mcast_find(&ibp->rvp,
+				       &packet->grh->dgid,
+				       opa_get_lid(packet->dlid, 9B));
 		if (!mcast)
 			goto drop;
 		list_for_each_entry_rcu(p, &mcast->qp_list, list) {
 			packet->qp = p->qp;
+			if (hfi1_do_pkey_check(packet))
+				goto drop;
 			spin_lock_irqsave(&packet->qp->r_lock, flags);
-			packet_handler = qp_ok(opcode, packet);
+			packet_handler = qp_ok(packet);
 			if (likely(packet_handler))
 				packet_handler(packet);
 			else
@@ -624,19 +611,22 @@ void hfi1_ib_rcv(struct hfi1_packet *packet)
 		if (atomic_dec_return(&mcast->refcount) <= 1)
 			wake_up(&mcast->wait);
 	} else {
+		/* Get the destination QP number. */
+		qp_num = ib_bth_get_qpn(packet->ohdr);
 		rcu_read_lock();
 		packet->qp = rvt_lookup_qpn(rdi, &ibp->rvp, qp_num);
-		if (!packet->qp) {
-			rcu_read_unlock();
-			goto drop;
-		}
-		if (unlikely(hfi1_dbg_fault_opcode(packet->qp, opcode,
-						   true))) {
-			rcu_read_unlock();
-			goto drop;
-		}
+		if (!packet->qp)
+			goto unlock_drop;
+
+		if (hfi1_do_pkey_check(packet))
+			goto unlock_drop;
+
+		if (unlikely(hfi1_dbg_fault_opcode(packet->qp, packet->opcode,
+						   true)))
+			goto unlock_drop;
+
 		spin_lock_irqsave(&packet->qp->r_lock, flags);
-		packet_handler = qp_ok(opcode, packet);
+		packet_handler = qp_ok(packet);
 		if (likely(packet_handler))
 			packet_handler(packet);
 		else
@@ -645,11 +635,34 @@ void hfi1_ib_rcv(struct hfi1_packet *packet)
 		rcu_read_unlock();
 	}
 	return;
-
+unlock_drop:
+	rcu_read_unlock();
 drop:
 	ibp->rvp.n_pkt_drops++;
 }
 
+/**
+ * hfi1_ib_rcv - process an incoming packet
+ * @packet: data packet information
+ *
+ * This is called to process an incoming packet at interrupt level.
+ */
+void hfi1_ib_rcv(struct hfi1_packet *packet)
+{
+	struct hfi1_ctxtdata *rcd = packet->rcd;
+
+	trace_input_ibhdr(rcd->dd, packet, !!(rhf_dc_info(packet->rhf)));
+	hfi1_handle_packet(packet, hfi1_check_mcast(packet->dlid));
+}
+
+void hfi1_16B_rcv(struct hfi1_packet *packet)
+{
+	struct hfi1_ctxtdata *rcd = packet->rcd;
+
+	trace_input_ibhdr(rcd->dd, packet, false);
+	hfi1_handle_packet(packet, hfi1_check_mcast(packet->dlid));
+}
+
 /*
  * This is called from a timer to check for QPs
  * which need kernel memory in order to send a packet.
@@ -696,7 +709,7 @@ static void verbs_sdma_complete(
 	if (tx->wqe) {
 		hfi1_send_complete(qp, tx->wqe, IB_WC_SUCCESS);
 	} else if (qp->ibqp.qp_type == IB_QPT_RC) {
-		struct ib_header *hdr;
+		struct hfi1_opa_header *hdr;
 
 		hdr = &tx->phdr.hdr;
 		hfi1_rc_send_complete(qp, hdr);
@@ -799,12 +812,27 @@ static int build_verbs_tx_desc(
 	int ret = 0;
 	struct hfi1_sdma_header *phdr = &tx->phdr;
 	u16 hdrbytes = tx->hdr_dwords << 2;
+	u32 *hdr;
+	u8 extra_bytes = 0;
+	static char trail_buf[12]; /* CRC = 4, LT = 1, Pad = 0 to 7 bytes */
 
+	if (tx->phdr.hdr.hdr_type) {
+		/*
+		 * hdrbytes accounts for PBC. Need to subtract 8 bytes
+		 * before calculating padding.
+		 */
+		extra_bytes = hfi1_get_16b_padding(hdrbytes - 8, length) +
+			      (SIZE_OF_CRC << 2) + SIZE_OF_LT;
+		hdr = (u32 *)&phdr->hdr.opah;
+	} else {
+		hdr = (u32 *)&phdr->hdr.ibh;
+	}
 	if (!ahg_info->ahgcount) {
 		ret = sdma_txinit_ahg(
 			&tx->txreq,
 			ahg_info->tx_flags,
-			hdrbytes + length,
+			hdrbytes + length +
+			extra_bytes,
 			ahg_info->ahgidx,
 			0,
 			NULL,
@@ -834,8 +862,17 @@ static int build_verbs_tx_desc(
 			goto bail_txadd;
 	}
 	/* add the ulp payload - if any. tx->ss can be NULL for acks */
-	if (tx->ss)
+	if (tx->ss) {
 		ret = build_verbs_ulp_payload(sde, length, tx);
+		if (ret)
+			goto bail_txadd;
+	}
+
+	/* add icrc, lt byte, and padding to flit */
+	if (extra_bytes != 0)
+		ret = sdma_txadd_kvaddr(sde->dd, &tx->txreq,
+					trail_buf, extra_bytes);
+
 bail_txadd:
 	return ret;
 }
@@ -847,26 +884,42 @@ int hfi1_verbs_send_dma(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 	struct hfi1_ahg_info *ahg_info = priv->s_ahg;
 	u32 hdrwords = qp->s_hdrwords;
 	u32 len = ps->s_txreq->s_cur_size;
-	u32 plen = hdrwords + ((len + 3) >> 2) + 2; /* includes pbc */
+	u32 plen;
 	struct hfi1_ibdev *dev = ps->dev;
 	struct hfi1_pportdata *ppd = ps->ppd;
 	struct verbs_txreq *tx;
 	u8 sc5 = priv->s_sc;
-
 	int ret;
+	u32 dwords;
+	bool bypass = false;
+
+	if (ps->s_txreq->phdr.hdr.hdr_type) {
+		u8 extra_bytes = hfi1_get_16b_padding((hdrwords << 2), len);
+
+		dwords = (len + extra_bytes + (SIZE_OF_CRC << 2) +
+			  SIZE_OF_LT) >> 2;
+		bypass = true;
+	} else {
+		dwords = (len + 3) >> 2;
+	}
+	plen = hdrwords + dwords + 2;
 
 	tx = ps->s_txreq;
 	if (!sdma_txreq_built(&tx->txreq)) {
 		if (likely(pbc == 0)) {
 			u32 vl = sc_to_vlt(dd_from_ibdev(qp->ibqp.device), sc5);
-			u8 opcode = get_opcode(&tx->phdr.hdr);
 
 			/* No vl15 here */
-			/* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
-			pbc |= (!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT;
+			/* set PBC_DC_INFO bit (aka SC[4]) in pbc */
+			if (ps->s_txreq->phdr.hdr.hdr_type)
+				pbc |= PBC_PACKET_BYPASS |
+				       PBC_INSERT_BYPASS_ICRC;
+			else
+				pbc |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
 
-			if (unlikely(hfi1_dbg_fault_opcode(qp, opcode, false)))
-				pbc = hfi1_fault_tx(qp, opcode, pbc);
+			if (unlikely(hfi1_dbg_fault_opcode(qp, ps->opcode,
+							   false)))
+				pbc = hfi1_fault_tx(qp, ps->opcode, pbc);
 			pbc = create_pbc(ppd,
 					 pbc,
 					 qp->srate_mbps,
@@ -878,14 +931,15 @@ int hfi1_verbs_send_dma(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 		if (unlikely(ret))
 			goto bail_build;
 	}
-	ret =  sdma_send_txreq(tx->sde, &priv->s_iowait, &tx->txreq);
+	ret =  sdma_send_txreq(tx->sde, &priv->s_iowait, &tx->txreq,
+			       ps->pkts_sent);
 	if (unlikely(ret < 0)) {
 		if (ret == -ECOMM)
 			goto bail_ecomm;
 		return ret;
 	}
 	trace_sdma_output_ibhdr(dd_from_ibdev(qp->ibqp.device),
-				&ps->s_txreq->phdr.hdr);
+				&ps->s_txreq->phdr.hdr, ib_is_sc5(sc5));
 	return ret;
 
 bail_ecomm:
@@ -935,7 +989,8 @@ static int pio_wait(struct rvt_qp *qp,
 			dev->n_piodrain += !!(flag & RVT_S_WAIT_PIO_DRAIN);
 			qp->s_flags |= flag;
 			was_empty = list_empty(&sc->piowait);
-			list_add_tail(&priv->s_iowait.list, &sc->piowait);
+			iowait_queue(ps->pkts_sent, &priv->s_iowait,
+				     &sc->piowait);
 			priv->s_iowait.lock = &dev->iowait_lock;
 			trace_hfi1_qpsleep(qp, RVT_S_WAIT_PIO);
 			rvt_get_qp(qp);
@@ -967,10 +1022,10 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 	u32 hdrwords = qp->s_hdrwords;
 	struct rvt_sge_state *ss = ps->s_txreq->ss;
 	u32 len = ps->s_txreq->s_cur_size;
-	u32 dwords = (len + 3) >> 2;
-	u32 plen = hdrwords + dwords + 2; /* includes pbc */
+	u32 dwords;
+	u32 plen;
 	struct hfi1_pportdata *ppd = ps->ppd;
-	u32 *hdr = (u32 *)&ps->s_txreq->phdr.hdr;
+	u32 *hdr;
 	u8 sc5;
 	unsigned long flags = 0;
 	struct send_context *sc;
@@ -978,6 +1033,23 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 	int wc_status = IB_WC_SUCCESS;
 	int ret = 0;
 	pio_release_cb cb = NULL;
+	u32 lrh0_16b;
+	bool bypass = false;
+	u8 extra_bytes = 0;
+
+	if (ps->s_txreq->phdr.hdr.hdr_type) {
+		u8 pad_size = hfi1_get_16b_padding((hdrwords << 2), len);
+
+		extra_bytes = pad_size + (SIZE_OF_CRC << 2) + SIZE_OF_LT;
+		dwords = (len + extra_bytes) >> 2;
+		hdr = (u32 *)&ps->s_txreq->phdr.hdr.opah;
+		lrh0_16b = ps->s_txreq->phdr.hdr.opah.lrh[0];
+		bypass = true;
+	} else {
+		dwords = (len + 3) >> 2;
+		hdr = (u32 *)&ps->s_txreq->phdr.hdr.ibh;
+	}
+	plen = hdrwords + dwords + 2;
 
 	/* only RC/UC use complete */
 	switch (qp->ibqp.qp_type) {
@@ -995,13 +1067,14 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 
 	if (likely(pbc == 0)) {
 		u8 vl = sc_to_vlt(dd_from_ibdev(qp->ibqp.device), sc5);
-		struct verbs_txreq *tx = ps->s_txreq;
-		u8 opcode = get_opcode(&tx->phdr.hdr);
 
-		/* set PBC_DC_INFO bit (aka SC[4]) in pbc_flags */
-		pbc |= (!!(sc5 & 0x10)) << PBC_DC_INFO_SHIFT;
-		if (unlikely(hfi1_dbg_fault_opcode(qp, opcode, false)))
-			pbc = hfi1_fault_tx(qp, opcode, pbc);
+		/* set PBC_DC_INFO bit (aka SC[4]) in pbc */
+		if (ps->s_txreq->phdr.hdr.hdr_type)
+			pbc |= PBC_PACKET_BYPASS | PBC_INSERT_BYPASS_ICRC;
+		else
+			pbc |= (ib_is_sc5(sc5) << PBC_DC_INFO_SHIFT);
+		if (unlikely(hfi1_dbg_fault_opcode(qp, ps->opcode, false)))
+			pbc = hfi1_fault_tx(qp, ps->opcode, pbc);
 		pbc = create_pbc(ppd, pbc, qp->srate_mbps, vl, plen);
 	}
 	if (cb)
@@ -1038,11 +1111,12 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 		}
 	}
 
-	if (len == 0) {
+	if (dwords == 0) {
 		pio_copy(ppd->dd, pbuf, pbc, hdr, hdrwords);
 	} else {
+		seg_pio_copy_start(pbuf, pbc,
+				   hdr, hdrwords * 4);
 		if (ss) {
-			seg_pio_copy_start(pbuf, pbc, hdr, hdrwords * 4);
 			while (len) {
 				void *addr = ss->sge.vaddr;
 				u32 slen = ss->sge.length;
@@ -1053,12 +1127,24 @@ int hfi1_verbs_send_pio(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
 				seg_pio_copy_mid(pbuf, addr, slen);
 				len -= slen;
 			}
-			seg_pio_copy_end(pbuf);
 		}
+		/*
+		 * Bypass packets need to copy additional
+		 * bytes to accommodate the CRC and LT bytes
+		 */
+		if (extra_bytes) {
+			u8 *empty_buf;
+
+			empty_buf = kcalloc(extra_bytes, sizeof(u8),
+					    GFP_KERNEL);
+			seg_pio_copy_mid(pbuf, empty_buf, extra_bytes);
+			kfree(empty_buf);
+		}
+		seg_pio_copy_end(pbuf);
 	}
 
 	trace_pio_output_ibhdr(dd_from_ibdev(qp->ibqp.device),
-			       &ps->s_txreq->phdr.hdr);
+			       &ps->s_txreq->phdr.hdr, ib_is_sc5(sc5));
 
 pio_bail:
 	if (qp->s_wqe) {
@@ -1104,10 +1190,10 @@ static inline int egress_pkey_matches_entry(u16 pkey, u16 ent)
 
 /**
  * egress_pkey_check - check P_KEY of a packet
- * @ppd:    Physical IB port data
- * @lrh: Local route header
- * @bth: Base transport header
- * @sc5:    SC for packet
+ * @ppd:  Physical IB port data
+ * @slid: SLID for packet
+ * @pkey: PKEY for header
+ * @sc5:  SC for packet
  * @s_pkey_index: It will be used for look up optimization for kernel contexts
  * only. If it is negative value, then it means user contexts is calling this
  * function.
@@ -1116,19 +1202,16 @@ static inline int egress_pkey_matches_entry(u16 pkey, u16 ent)
  *
  * Return: 0 on success, otherwise, 1
  */
-int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
+int egress_pkey_check(struct hfi1_pportdata *ppd, u32 slid, u16 pkey,
 		      u8 sc5, int8_t s_pkey_index)
 {
 	struct hfi1_devdata *dd;
 	int i;
-	u16 pkey;
 	int is_user_ctxt_mechanism = (s_pkey_index < 0);
 
 	if (!(ppd->part_enforce & HFI1_PART_ENFORCE_OUT))
 		return 0;
 
-	pkey = (u16)be32_to_cpu(bth[0]);
-
 	/* If SC15, pkey[0:14] must be 0x7fff */
 	if ((sc5 == 0xf) && ((pkey & PKEY_LOW_15_MASK) != PKEY_LOW_15_MASK))
 		goto bad;
@@ -1161,8 +1244,6 @@ int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
 		dd = ppd->dd;
 		if (!(dd->err_info_xmit_constraint.status &
 		      OPA_EI_STATUS_SMASK)) {
-			u16 slid = be16_to_cpu(lrh[3]);
-
 			dd->err_info_xmit_constraint.status |=
 				OPA_EI_STATUS_SMASK;
 			dd->err_info_xmit_constraint.slid = slid;
@@ -1179,11 +1260,11 @@ int egress_pkey_check(struct hfi1_pportdata *ppd, __be16 *lrh, __be32 *bth,
  * and size
  */
 static inline send_routine get_send_routine(struct rvt_qp *qp,
-					    struct verbs_txreq *tx)
+					    struct hfi1_pkt_state *ps)
 {
 	struct hfi1_devdata *dd = dd_from_ibdev(qp->ibqp.device);
 	struct hfi1_qp_priv *priv = qp->priv;
-	struct ib_header *h = &tx->phdr.hdr;
+	struct verbs_txreq *tx = ps->s_txreq;
 
 	if (unlikely(!(dd->flags & HFI1_HAS_SEND_DMA)))
 		return dd->process_pio_send;
@@ -1195,11 +1276,9 @@ static inline send_routine get_send_routine(struct rvt_qp *qp,
 		break;
 	case IB_QPT_UC:
 	case IB_QPT_RC: {
-		u8 op = get_opcode(h);
-
 		if (piothreshold &&
 		    tx->s_cur_size <= min(piothreshold, qp->pmtu) &&
-		    (BIT(op & OPMASK) & pio_opmask[op >> 5]) &&
+		    (BIT(ps->opcode & OPMASK) & pio_opmask[ps->opcode >> 5]) &&
 		    iowait_sdma_pending(&priv->s_iowait) == 0 &&
 		    !sdma_txreq_built(&tx->txreq))
 			return dd->process_pio_send;
@@ -1224,25 +1303,38 @@ int hfi1_verbs_send(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	struct hfi1_devdata *dd = dd_from_ibdev(qp->ibqp.device);
 	struct hfi1_qp_priv *priv = qp->priv;
 	struct ib_other_headers *ohdr;
-	struct ib_header *hdr;
 	send_routine sr;
 	int ret;
-	u8 lnh;
+	u16 pkey;
+	u32 slid;
 
-	hdr = &ps->s_txreq->phdr.hdr;
 	/* locate the pkey within the headers */
-	lnh = ib_get_lnh(hdr);
-	if (lnh == HFI1_LRH_GRH)
-		ohdr = &hdr->u.l.oth;
-	else
-		ohdr = &hdr->u.oth;
+	if (ps->s_txreq->phdr.hdr.hdr_type) {
+		struct hfi1_16b_header *hdr = &ps->s_txreq->phdr.hdr.opah;
+		u8 l4 = hfi1_16B_get_l4(hdr);
 
-	sr = get_send_routine(qp, ps->s_txreq);
-	ret = egress_pkey_check(dd->pport,
-				hdr->lrh,
-				ohdr->bth,
-				priv->s_sc,
-				qp->s_pkey_index);
+		if (l4 == OPA_16B_L4_IB_GLOBAL)
+			ohdr = &hdr->u.l.oth;
+		else
+			ohdr = &hdr->u.oth;
+		slid = hfi1_16B_get_slid(hdr);
+		pkey = hfi1_16B_get_pkey(hdr);
+	} else {
+		struct ib_header *hdr = &ps->s_txreq->phdr.hdr.ibh;
+		u8 lnh = ib_get_lnh(hdr);
+
+		if (lnh == HFI1_LRH_GRH)
+			ohdr = &hdr->u.l.oth;
+		else
+			ohdr = &hdr->u.oth;
+		slid = ib_get_slid(hdr);
+		pkey = ib_bth_get_pkey(ohdr);
+	}
+
+	ps->opcode = ib_bth_get_opcode(ohdr);
+	sr = get_send_routine(qp, ps);
+	ret = egress_pkey_check(dd->pport, slid, pkey,
+				priv->s_sc, qp->s_pkey_index);
 	if (unlikely(ret)) {
 		/*
 		 * The value we are returning here does not get propagated to
@@ -1361,14 +1453,14 @@ static int query_port(struct rvt_dev_info *rdi, u8 port_num,
 	struct hfi1_ibdev *verbs_dev = dev_from_rdi(rdi);
 	struct hfi1_devdata *dd = dd_from_dev(verbs_dev);
 	struct hfi1_pportdata *ppd = &dd->pport[port_num - 1];
-	u16 lid = ppd->lid;
+	u32 lid = ppd->lid;
 
 	/* props being zeroed by the caller, avoid zeroing it here */
 	props->lid = lid ? lid : 0;
 	props->lmc = ppd->lmc;
 	/* OPA logical states match IB logical states */
 	props->state = driver_lstate(ppd);
-	props->phys_state = hfi1_ibphys_portstate(ppd);
+	props->phys_state = driver_pstate(ppd);
 	props->gid_tbl_len = HFI1_GUIDS_PER_PORT;
 	props->active_width = (u8)opa_width_to_ib(ppd->link_width_active);
 	/* see rate_show() in ib core/sysfs.c */
@@ -1388,6 +1480,15 @@ static int query_port(struct rvt_dev_info *rdi, u8 port_num,
 	props->active_mtu = !valid_ib_mtu(ppd->ibmtu) ? props->max_mtu :
 		mtu_to_enum(ppd->ibmtu, IB_MTU_2048);
 
+	/*
+	 * sm_lid of 0xFFFF needs special handling so that it can
+	 * be differentiated from a permissive LID of 0xFFFF.
+	 * We set the grh_required flag here so the SA can program
+	 * the DGID in the address handle appropriately
+	 */
+	if (props->sm_lid == be16_to_cpu(IB_LID_PERMISSIVE))
+		props->grh_required = true;
+
 	return 0;
 }
 
@@ -1473,6 +1574,10 @@ static int hfi1_check_ah(struct ib_device *ibdev, struct rdma_ah_attr *ah_attr)
 	struct hfi1_devdata *dd;
 	u8 sc5;
 
+	if (hfi1_check_mcast(rdma_ah_get_dlid(ah_attr)) &&
+	    !(rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH))
+		return -EINVAL;
+
 	/* test the mapping for validity */
 	ibp = to_iport(ibdev, rdma_ah_get_port_num(ah_attr));
 	ppd = ppd_from_ibp(ibp);
@@ -1491,6 +1596,7 @@ static void hfi1_notify_new_ah(struct ib_device *ibdev,
 	struct hfi1_pportdata *ppd;
 	struct hfi1_devdata *dd;
 	u8 sc5;
+	struct rdma_ah_attr *attr = &ah->attr;
 
 	/*
 	 * Do not trust reading anything from rvt_ah at this point as it is not
@@ -1500,33 +1606,14 @@ static void hfi1_notify_new_ah(struct ib_device *ibdev,
 	ibp = to_iport(ibdev, rdma_ah_get_port_num(ah_attr));
 	ppd = ppd_from_ibp(ibp);
 	sc5 = ibp->sl_to_sc[rdma_ah_get_sl(&ah->attr)];
+	hfi1_update_ah_attr(ibdev, attr);
+	hfi1_make_opa_lid(attr);
 	dd = dd_from_ppd(ppd);
 	ah->vl = sc_to_vlt(dd, sc5);
 	if (ah->vl < num_vls || ah->vl == 15)
 		ah->log_pmtu = ilog2(dd->vld[ah->vl].mtu);
 }
 
-struct ib_ah *hfi1_create_qp0_ah(struct hfi1_ibport *ibp, u16 dlid)
-{
-	struct rdma_ah_attr attr;
-	struct ib_ah *ah = ERR_PTR(-EINVAL);
-	struct rvt_qp *qp0;
-	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
-	struct hfi1_devdata *dd = dd_from_ppd(ppd);
-	u8 port_num = ppd->port;
-
-	memset(&attr, 0, sizeof(attr));
-	attr.type = rdma_ah_find_type(&dd->verbs_dev.rdi.ibdev, port_num);
-	rdma_ah_set_dlid(&attr, dlid);
-	rdma_ah_set_port_num(&attr, ppd_from_ibp(ibp)->port);
-	rcu_read_lock();
-	qp0 = rcu_dereference(ibp->rvp.qp[0]);
-	if (qp0)
-		ah = rdma_create_ah(qp0->ibqp.pd, &attr);
-	rcu_read_unlock();
-	return ah;
-}
-
 /**
  * hfi1_get_npkeys - return the size of the PKEY table for context 0
  * @dd: the hfi1_ib device
@@ -1547,13 +1634,22 @@ static void init_ibport(struct hfi1_pportdata *ppd)
 		ibp->sc_to_sl[i] = i;
 	}
 
+	for (i = 0; i < RVT_MAX_TRAP_LISTS ; i++)
+		INIT_LIST_HEAD(&ibp->rvp.trap_lists[i].list);
+	setup_timer(&ibp->rvp.trap_timer, hfi1_handle_trap_timer,
+		    (unsigned long)ibp);
+
 	spin_lock_init(&ibp->rvp.lock);
 	/* Set the prefix to the default value (see ch. 4.1.1) */
 	ibp->rvp.gid_prefix = IB_DEFAULT_GID_PREFIX;
 	ibp->rvp.sm_lid = 0;
-	/* Below should only set bits defined in OPA PortInfo.CapabilityMask */
+	/*
+	 * Below should only set bits defined in OPA PortInfo.CapabilityMask
+	 * and PortInfo.CapabilityMask3
+	 */
 	ibp->rvp.port_cap_flags = IB_PORT_AUTO_MIGR_SUP |
 		IB_PORT_CAP_MASK_NOTICE_SUP;
+	ibp->rvp.port_cap3_flags = OPA_CAP_MASK3_IsSharedSpaceSupported;
 	ibp->rvp.pma_counter_select[0] = IB_PMA_PORT_XMIT_DATA;
 	ibp->rvp.pma_counter_select[1] = IB_PMA_PORT_RCV_DATA;
 	ibp->rvp.pma_counter_select[2] = IB_PMA_PORT_XMIT_PKTS;
@@ -1564,14 +1660,13 @@ static void init_ibport(struct hfi1_pportdata *ppd)
 	RCU_INIT_POINTER(ibp->rvp.qp[1], NULL);
 }
 
-static void hfi1_get_dev_fw_str(struct ib_device *ibdev, char *str,
-				size_t str_len)
+static void hfi1_get_dev_fw_str(struct ib_device *ibdev, char *str)
 {
 	struct rvt_dev_info *rdi = ib_to_rvt(ibdev);
 	struct hfi1_ibdev *dev = dev_from_rdi(rdi);
 	u32 ver = dd_from_dev(dev)->dc8051_ver;
 
-	snprintf(str, str_len, "%u.%u.%u", dc8051_ver_maj(ver),
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%u.%u.%u", dc8051_ver_maj(ver),
 		 dc8051_ver_min(ver), dc8051_ver_patch(ver));
 }
 
@@ -1816,7 +1911,8 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
 	dd->verbs_dev.rdi.dparms.psn_mask = PSN_MASK;
 	dd->verbs_dev.rdi.dparms.psn_shift = PSN_SHIFT;
 	dd->verbs_dev.rdi.dparms.psn_modify_mask = PSN_MODIFY_MASK;
-	dd->verbs_dev.rdi.dparms.core_cap_flags = RDMA_CORE_PORT_INTEL_OPA;
+	dd->verbs_dev.rdi.dparms.core_cap_flags = RDMA_CORE_PORT_INTEL_OPA |
+						RDMA_CORE_CAP_OPA_AH;
 	dd->verbs_dev.rdi.dparms.max_mad_size = OPA_MGMT_MAD_SIZE;
 
 	dd->verbs_dev.rdi.driver_f.qp_priv_alloc = qp_priv_alloc;
diff --git a/drivers/infiniband/hw/hfi1/verbs.h b/drivers/infiniband/hw/hfi1/verbs.h
index cd635d0..87d1285 100644
--- a/drivers/infiniband/hw/hfi1/verbs.h
+++ b/drivers/infiniband/hw/hfi1/verbs.h
@@ -95,6 +95,7 @@ struct hfi1_packet;
 #define HFI1_VENDOR_IPG		cpu_to_be16(0xFFA0)
 
 #define IB_DEFAULT_GID_PREFIX	cpu_to_be64(0xfe80000000000000ULL)
+#define OPA_BTH_MIG_REQ		BIT(31)
 
 #define RC_OP(x) IB_OPCODE_RC_##x
 #define UC_OP(x) IB_OPCODE_UC_##x
@@ -104,6 +105,25 @@ enum {
 	HFI1_HAS_GRH = (1 << 0),
 };
 
+struct hfi1_16b_header {
+	u32 lrh[4];
+	union {
+		struct {
+			struct ib_grh grh;
+			struct ib_other_headers oth;
+		} l;
+		struct ib_other_headers oth;
+	} u;
+} __packed;
+
+struct hfi1_opa_header {
+	union {
+		struct ib_header ibh; /* 9B header */
+		struct hfi1_16b_header opah; /* 16B header */
+	};
+	u8 hdr_type; /* 9B or 16B */
+} __packed;
+
 struct hfi1_ahg_info {
 	u32 ahgdesc[2];
 	u16 tx_flags;
@@ -113,7 +133,7 @@ struct hfi1_ahg_info {
 
 struct hfi1_sdma_header {
 	__le64 pbc;
-	struct ib_header hdr;
+	struct hfi1_opa_header hdr;
 } __packed;
 
 /*
@@ -127,6 +147,7 @@ struct hfi1_qp_priv {
 	u8 s_sc;		                  /* SC[0..4] for next packet */
 	struct iowait s_iowait;
 	struct rvt_qp *owner;
+	u8 hdr_type; /* 9B or 16B */
 };
 
 /*
@@ -142,7 +163,9 @@ struct hfi1_pkt_state {
 	unsigned long timeout;
 	unsigned long timeout_int;
 	int cpu;
+	u8 opcode;
 	bool in_thread;
+	bool pkts_sent;
 };
 
 #define HFI1_PSN_CREDIT  16
@@ -236,8 +259,8 @@ static inline int hfi1_send_ok(struct rvt_qp *qp)
 /*
  * This must be called with s_lock held.
  */
-void hfi1_bad_pqkey(struct hfi1_ibport *ibp, __be16 trap_num, u32 key, u32 sl,
-		    u32 qp1, u32 qp2, u16 lid1, u16 lid2);
+void hfi1_bad_pkey(struct hfi1_ibport *ibp, u32 key, u32 sl,
+		   u32 qp1, u32 qp2, u32 lid1, u32 lid2);
 void hfi1_cap_mask_chg(struct rvt_dev_info *rdi, u8 port_num);
 void hfi1_sys_guid_chg(struct hfi1_ibport *ibp);
 void hfi1_node_desc_chg(struct hfi1_ibport *ibp);
@@ -257,13 +280,8 @@ int hfi1_process_mad(struct ib_device *ibdev, int mad_flags, u8 port,
  * necessarily be at least one bit less than
  * the container holding the PSN.
  */
-#ifndef CONFIG_HFI1_VERBS_31BIT_PSN
-#define PSN_MASK 0xFFFFFF
-#define PSN_SHIFT 8
-#else
 #define PSN_MASK 0x7FFFFFFF
 #define PSN_SHIFT 1
-#endif
 #define PSN_MODIFY_MASK 0xFFFFFF
 
 /*
@@ -307,15 +325,12 @@ void hfi1_rc_rcv(struct hfi1_packet *packet);
 
 void hfi1_rc_hdrerr(
 	struct hfi1_ctxtdata *rcd,
-	struct ib_header *hdr,
-	u32 rcv_flags,
+	struct hfi1_packet *packet,
 	struct rvt_qp *qp);
 
 u8 ah_to_sc(struct ib_device *ibdev, struct rdma_ah_attr *ah_attr);
 
-struct ib_ah *hfi1_create_qp0_ah(struct hfi1_ibport *ibp, u16 dlid);
-
-void hfi1_rc_send_complete(struct rvt_qp *qp, struct ib_header *hdr);
+void hfi1_rc_send_complete(struct rvt_qp *qp, struct hfi1_opa_header *opah);
 
 void hfi1_ud_rcv(struct hfi1_packet *packet);
 
@@ -336,18 +351,7 @@ int hfi1_check_send_wqe(struct rvt_qp *qp, struct rvt_swqe *wqe);
 extern const u32 rc_only_opcode;
 extern const u32 uc_only_opcode;
 
-static inline u8 get_opcode(struct ib_header *h)
-{
-	u16 lnh = be16_to_cpu(h->lrh[0]) & 3;
-
-	if (lnh == IB_LNH_IBA_LOCAL)
-		return be32_to_cpu(h->u.oth.bth[0]) >> 24;
-	else
-		return be32_to_cpu(h->u.l.oth.bth[0]) >> 24;
-}
-
-int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct ib_header *hdr,
-		       int has_grh, struct rvt_qp *qp, u32 bth0);
+int hfi1_ruc_check_hdr(struct hfi1_ibport *ibp, struct hfi1_packet *packet);
 
 u32 hfi1_make_grh(struct hfi1_ibport *ibp, struct ib_grh *hdr,
 		  const struct ib_global_route *grh, u32 hwords, u32 nwords);
@@ -365,7 +369,8 @@ void hfi1_do_send(struct rvt_qp *qp, bool in_thread);
 void hfi1_send_complete(struct rvt_qp *qp, struct rvt_swqe *wqe,
 			enum ib_wc_status status);
 
-void hfi1_send_rc_ack(struct hfi1_ctxtdata *, struct rvt_qp *qp, int is_fecn);
+void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd, struct rvt_qp *qp,
+		      bool is_fecn);
 
 int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps);
 
@@ -379,6 +384,8 @@ void hfi1_unregister_ib_device(struct hfi1_devdata *);
 
 void hfi1_ib_rcv(struct hfi1_packet *packet);
 
+void hfi1_16B_rcv(struct hfi1_packet *packet);
+
 unsigned hfi1_get_npkeys(struct hfi1_devdata *);
 
 int hfi1_verbs_send_dma(struct rvt_qp *qp, struct hfi1_pkt_state *ps,
diff --git a/drivers/infiniband/hw/hfi1/verbs_txreq.c b/drivers/infiniband/hw/hfi1/verbs_txreq.c
index 5d23172..873e48e 100644
--- a/drivers/infiniband/hw/hfi1/verbs_txreq.c
+++ b/drivers/infiniband/hw/hfi1/verbs_txreq.c
@@ -1,5 +1,5 @@
 /*
- * Copyright(c) 2016 Intel Corporation.
+ * Copyright(c) 2016 - 2017 Intel Corporation.
  *
  * This file is provided under a dual BSD/GPLv2 license.  When using or
  * redistributing this file, you may do so under either license.
@@ -119,13 +119,6 @@ struct verbs_txreq *__get_txreq(struct hfi1_ibdev *dev,
 	return tx;
 }
 
-static void verbs_txreq_kmem_cache_ctor(void *obj)
-{
-	struct verbs_txreq *tx = (struct verbs_txreq *)obj;
-
-	memset(tx, 0, sizeof(*tx));
-}
-
 int verbs_txreq_init(struct hfi1_ibdev *dev)
 {
 	char buf[TXREQ_LEN];
@@ -135,7 +128,7 @@ int verbs_txreq_init(struct hfi1_ibdev *dev)
 	dev->verbs_txreq_cache = kmem_cache_create(buf,
 						   sizeof(struct verbs_txreq),
 						   0, SLAB_HWCACHE_ALIGN,
-						   verbs_txreq_kmem_cache_ctor);
+						   NULL);
 	if (!dev->verbs_txreq_cache)
 		return -ENOMEM;
 	return 0;
diff --git a/drivers/infiniband/hw/hfi1/vnic.h b/drivers/infiniband/hw/hfi1/vnic.h
index 4a621cd..5ae7815 100644
--- a/drivers/infiniband/hw/hfi1/vnic.h
+++ b/drivers/infiniband/hw/hfi1/vnic.h
@@ -54,21 +54,6 @@
 #define HFI1_VNIC_MAX_TXQ     16
 #define HFI1_VNIC_MAX_PAD     12
 
-/* L2 header definitions */
-#define HFI1_L2_TYPE_OFFSET     0x7
-#define HFI1_L2_TYPE_SHFT       0x5
-#define HFI1_L2_TYPE_MASK       0x3
-
-#define HFI1_GET_L2_TYPE(hdr)                                            \
-	((*((u8 *)(hdr) + HFI1_L2_TYPE_OFFSET) >> HFI1_L2_TYPE_SHFT) &   \
-	 HFI1_L2_TYPE_MASK)
-
-/* L4 type definitions */
-#define HFI1_L4_TYPE_OFFSET 8
-
-#define HFI1_GET_L4_TYPE(data)   \
-	(*((u8 *)(data) + HFI1_L4_TYPE_OFFSET))
-
 /* L4 header definitions */
 #define HFI1_VNIC_L4_HDR_OFFSET  OPA_VNIC_L2_HDR_LEN
 
@@ -103,6 +88,7 @@ struct hfi1_vnic_sdma {
 	struct sdma_txreq stx;
 	unsigned int state;
 	u8 q_idx;
+	bool pkts_sent;
 };
 
 /**
diff --git a/drivers/infiniband/hw/hfi1/vnic_main.c b/drivers/infiniband/hw/hfi1/vnic_main.c
index 339f0cd..f419cbb 100644
--- a/drivers/infiniband/hw/hfi1/vnic_main.c
+++ b/drivers/infiniband/hw/hfi1/vnic_main.c
@@ -95,7 +95,7 @@ static int setup_vnic_ctxt(struct hfi1_devdata *dd, struct hfi1_ctxtdata *uctxt)
 	if (HFI1_CAP_KGET_MASK(uctxt->flags, DMA_RTAIL))
 		rcvctrl_ops |= HFI1_RCVCTRL_TAILUPD_ENB;
 
-	hfi1_rcvctrl(uctxt->dd, rcvctrl_ops, uctxt->ctxt);
+	hfi1_rcvctrl(uctxt->dd, rcvctrl_ops, uctxt);
 
 	uctxt->is_vnic = true;
 done:
@@ -106,22 +106,13 @@ static int allocate_vnic_ctxt(struct hfi1_devdata *dd,
 			      struct hfi1_ctxtdata **vnic_ctxt)
 {
 	struct hfi1_ctxtdata *uctxt;
-	unsigned int ctxt;
 	int ret;
 
 	if (dd->flags & HFI1_FROZEN)
 		return -EIO;
 
-	for (ctxt = dd->first_dyn_alloc_ctxt;
-	     ctxt < dd->num_rcv_contexts; ctxt++)
-		if (!dd->rcd[ctxt])
-			break;
-
-	if (ctxt == dd->num_rcv_contexts)
-		return -EBUSY;
-
-	uctxt = hfi1_create_ctxtdata(dd->pport, ctxt, dd->node);
-	if (!uctxt) {
+	ret = hfi1_create_ctxtdata(dd->pport, dd->node, &uctxt);
+	if (ret < 0) {
 		dd_dev_err(dd, "Unable to create ctxtdata, failing open\n");
 		return -ENOMEM;
 	}
@@ -155,12 +146,7 @@ static int allocate_vnic_ctxt(struct hfi1_devdata *dd,
 
 	return ret;
 bail:
-	/*
-	 * hfi1_free_ctxtdata() also releases send_context
-	 * structure if uctxt->sc is not null
-	 */
-	dd->rcd[uctxt->ctxt] = NULL;
-	hfi1_free_ctxtdata(dd, uctxt);
+	hfi1_free_ctxt(uctxt);
 	dd_dev_dbg(dd, "vnic allocation failed. rc %d\n", ret);
 	return ret;
 }
@@ -168,15 +154,12 @@ static int allocate_vnic_ctxt(struct hfi1_devdata *dd,
 static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
 				 struct hfi1_ctxtdata *uctxt)
 {
-	unsigned long flags;
-
 	dd_dev_dbg(dd, "closing vnic context %d\n", uctxt->ctxt);
 	flush_wc();
 
 	if (dd->num_msix_entries)
 		hfi1_reset_vnic_msix_info(uctxt);
 
-	spin_lock_irqsave(&dd->uctxt_lock, flags);
 	/*
 	 * Disable receive context and interrupt available, reset all
 	 * RcvCtxtCtrl bits to default values.
@@ -186,7 +169,7 @@ static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
 		     HFI1_RCVCTRL_INTRAVAIL_DIS |
 		     HFI1_RCVCTRL_ONE_PKT_EGR_DIS |
 		     HFI1_RCVCTRL_NO_RHQ_DROP_DIS |
-		     HFI1_RCVCTRL_NO_EGR_DROP_DIS, uctxt->ctxt);
+		     HFI1_RCVCTRL_NO_EGR_DROP_DIS, uctxt);
 	/*
 	 * VNIC contexts are allocated from user context pool.
 	 * Release them back to user context pool.
@@ -199,16 +182,15 @@ static void deallocate_vnic_ctxt(struct hfi1_devdata *dd,
 	sc_disable(uctxt->sc);
 
 	dd->send_contexts[uctxt->sc->sw_index].type = SC_USER;
-	spin_unlock_irqrestore(&dd->uctxt_lock, flags);
 
-	dd->rcd[uctxt->ctxt] = NULL;
 	uctxt->event_flags = 0;
 
 	hfi1_clear_tids(uctxt);
 	hfi1_clear_ctxt_pkey(dd, uctxt);
 
 	hfi1_stats.sps_ctxts--;
-	hfi1_free_ctxtdata(dd, uctxt);
+
+	hfi1_free_ctxt(uctxt);
 }
 
 void hfi1_vnic_setup(struct hfi1_devdata *dd)
@@ -582,8 +564,8 @@ void hfi1_vnic_bypass_rcv(struct hfi1_packet *packet)
 	int l4_type, vesw_id = -1;
 	u8 q_idx;
 
-	l4_type = HFI1_GET_L4_TYPE(packet->ebuf);
-	if (likely(l4_type == OPA_VNIC_L4_ETHR)) {
+	l4_type = hfi1_16B_get_l4(packet->ebuf);
+	if (likely(l4_type == OPA_16B_L4_ETHR)) {
 		vesw_id = HFI1_VNIC_GET_VESWID(packet->ebuf);
 		vinfo = idr_find(&dd->vnic.vesw_idr, vesw_id);
 
@@ -751,6 +733,7 @@ static int hfi1_vnic_init(struct hfi1_vnic_vport_info *vinfo)
 		rc = hfi1_vnic_allot_ctxt(dd, &dd->vnic.ctxt[i]);
 		if (rc)
 			break;
+		hfi1_rcd_get(dd->vnic.ctxt[i]);
 		dd->vnic.ctxt[i]->vnic_q_idx = i;
 	}
 
@@ -762,6 +745,7 @@ static int hfi1_vnic_init(struct hfi1_vnic_vport_info *vinfo)
 		 */
 		while (i-- > dd->vnic.num_ctxt) {
 			deallocate_vnic_ctxt(dd, dd->vnic.ctxt[i]);
+			hfi1_rcd_put(dd->vnic.ctxt[i]);
 			dd->vnic.ctxt[i] = NULL;
 		}
 		goto alloc_fail;
@@ -791,6 +775,7 @@ static void hfi1_vnic_deinit(struct hfi1_vnic_vport_info *vinfo)
 	if (--dd->vnic.num_vports == 0) {
 		for (i = 0; i < dd->vnic.num_ctxt; i++) {
 			deallocate_vnic_ctxt(dd, dd->vnic.ctxt[i]);
+			hfi1_rcd_put(dd->vnic.ctxt[i]);
 			dd->vnic.ctxt[i] = NULL;
 		}
 		hfi1_deinit_vnic_rsm(dd);
diff --git a/drivers/infiniband/hw/hfi1/vnic_sdma.c b/drivers/infiniband/hw/hfi1/vnic_sdma.c
index 51a817d..c3c96c5 100644
--- a/drivers/infiniband/hw/hfi1/vnic_sdma.c
+++ b/drivers/infiniband/hw/hfi1/vnic_sdma.c
@@ -198,11 +198,16 @@ int hfi1_vnic_send_dma(struct hfi1_devdata *dd, u8 q_idx,
 		goto free_desc;
 	tx->retry_count = 0;
 
-	ret = sdma_send_txreq(sde, &vnic_sdma->wait, &tx->txreq);
+	ret = sdma_send_txreq(sde, &vnic_sdma->wait, &tx->txreq,
+			      vnic_sdma->pkts_sent);
 	/* When -ECOMM, sdma callback will be called with ABORT status */
 	if (unlikely(ret && unlikely(ret != -ECOMM)))
 		goto free_desc;
 
+	if (!ret) {
+		vnic_sdma->pkts_sent = true;
+		iowait_starve_clear(vnic_sdma->pkts_sent, &vnic_sdma->wait);
+	}
 	return ret;
 
 free_desc:
@@ -211,6 +216,8 @@ int hfi1_vnic_send_dma(struct hfi1_devdata *dd, u8 q_idx,
 tx_err:
 	if (ret != -EBUSY)
 		dev_kfree_skb_any(skb);
+	else
+		vnic_sdma->pkts_sent = false;
 	return ret;
 }
 
@@ -225,7 +232,8 @@ int hfi1_vnic_send_dma(struct hfi1_devdata *dd, u8 q_idx,
 static int hfi1_vnic_sdma_sleep(struct sdma_engine *sde,
 				struct iowait *wait,
 				struct sdma_txreq *txreq,
-				unsigned int seq)
+				uint seq,
+				bool pkts_sent)
 {
 	struct hfi1_vnic_sdma *vnic_sdma =
 		container_of(wait, struct hfi1_vnic_sdma, wait);
@@ -239,7 +247,7 @@ static int hfi1_vnic_sdma_sleep(struct sdma_engine *sde,
 	vnic_sdma->state = HFI1_VNIC_SDMA_Q_DEFERRED;
 	write_seqlock(&dev->iowait_lock);
 	if (list_empty(&vnic_sdma->wait.list))
-		list_add_tail(&vnic_sdma->wait.list, &sde->dmawait);
+		iowait_queue(pkts_sent, wait, &sde->dmawait);
 	write_sequnlock(&dev->iowait_lock);
 	return -EBUSY;
 }
@@ -295,22 +303,15 @@ void hfi1_vnic_sdma_init(struct hfi1_vnic_vport_info *vinfo)
 	}
 }
 
-static void hfi1_vnic_txreq_kmem_cache_ctor(void *obj)
-{
-	struct vnic_txreq *tx = (struct vnic_txreq *)obj;
-
-	memset(tx, 0, sizeof(*tx));
-}
-
 int hfi1_vnic_txreq_init(struct hfi1_devdata *dd)
 {
 	char buf[HFI1_VNIC_TXREQ_NAME_LEN];
 
 	snprintf(buf, sizeof(buf), "hfi1_%u_vnic_txreq_cache", dd->unit);
 	dd->vnic.txreq_cache = kmem_cache_create(buf,
-					  sizeof(struct vnic_txreq),
-					  0, SLAB_HWCACHE_ALIGN,
-					  hfi1_vnic_txreq_kmem_cache_ctor);
+						 sizeof(struct vnic_txreq),
+						 0, SLAB_HWCACHE_ALIGN,
+						 NULL);
 	if (!dd->vnic.txreq_cache)
 		return -ENOMEM;
 	return 0;
diff --git a/drivers/infiniband/hw/hns/Kconfig b/drivers/infiniband/hw/hns/Kconfig
index e1a6e05..61c93bb 100644
--- a/drivers/infiniband/hw/hns/Kconfig
+++ b/drivers/infiniband/hw/hns/Kconfig
@@ -1,7 +1,7 @@
 config INFINIBAND_HNS
 	tristate "HNS RoCE Driver"
 	depends on NET_VENDOR_HISILICON
-	depends on ARM64 && HNS && HNS_DSAF && HNS_ENET
+	depends on (ARM64 || (COMPILE_TEST && 64BIT)) && HNS && HNS_DSAF && HNS_ENET
 	---help---
 	  This is a RoCE/RDMA driver for the Hisilicon RoCE engine. The engine
 	  is used in Hisilicon Hi1610 and more further ICT SoC.
diff --git a/drivers/infiniband/hw/hns/hns_roce_alloc.c b/drivers/infiniband/hw/hns/hns_roce_alloc.c
index 605962f..e1b433c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_alloc.c
+++ b/drivers/infiniband/hw/hns/hns_roce_alloc.c
@@ -32,6 +32,7 @@
  */
 
 #include <linux/platform_device.h>
+#include <linux/vmalloc.h>
 #include "hns_roce_device.h"
 
 int hns_roce_bitmap_alloc(struct hns_roce_bitmap *bitmap, unsigned long *obj)
diff --git a/drivers/infiniband/hw/hns/hns_roce_eq.c b/drivers/infiniband/hw/hns/hns_roce_eq.c
index 50f8649..b0f4373 100644
--- a/drivers/infiniband/hw/hns/hns_roce_eq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_eq.c
@@ -31,6 +31,7 @@
  */
 
 #include <linux/platform_device.h>
+#include <linux/interrupt.h>
 #include "hns_roce_common.h"
 #include "hns_roce_device.h"
 #include "hns_roce_eq.h"
@@ -292,7 +293,7 @@ static int hns_roce_aeq_int(struct hns_roce_dev *hr_dev, struct hns_roce_eq *eq)
 			dev_warn(dev, "Unhandled event %d on EQ %d at index %u\n",
 				 event_type, eq->eqn, eq->cons_index);
 			break;
-		};
+		}
 
 		eq->cons_index++;
 		aeqes_found = 1;
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index 2540b65..747efd1 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -2023,7 +2023,6 @@ int hns_roce_v1_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
 	struct hns_roce_cq *hr_cq = to_hr_cq(ibcq);
 	u32 notification_flag;
 	u32 doorbell[2];
-	int ret = 0;
 
 	notification_flag = (flags & IB_CQ_SOLICITED_MASK) ==
 			    IB_CQ_SOLICITED ? CQ_DB_REQ_NOT : CQ_DB_REQ_NOT_SOL;
@@ -2043,7 +2042,7 @@ int hns_roce_v1_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags)
 
 	hns_roce_write64_k(doorbell, hr_cq->cq_db_l);
 
-	return ret;
+	return 0;
 }
 
 static int hns_roce_v1_poll_one(struct hns_roce_cq *hr_cq,
diff --git a/drivers/infiniband/hw/hns/hns_roce_mr.c b/drivers/infiniband/hw/hns/hns_roce_mr.c
index 80fc01f..e387360 100644
--- a/drivers/infiniband/hw/hns/hns_roce_mr.c
+++ b/drivers/infiniband/hw/hns/hns_roce_mr.c
@@ -32,6 +32,7 @@
  */
 
 #include <linux/platform_device.h>
+#include <linux/vmalloc.h>
 #include <rdma/ib_umem.h>
 #include "hns_roce_device.h"
 #include "hns_roce_cmd.h"
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 054c526..f5dd21c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -799,7 +799,7 @@ bool hns_roce_wq_overflow(struct hns_roce_wq *hr_wq, int nreq,
 
 	cur = hr_wq->head - hr_wq->tail;
 	if (likely(cur + nreq < hr_wq->max_post))
-		return 0;
+		return false;
 
 	hr_cq = to_hr_cq(ib_cq);
 	spin_lock(&hr_cq->lock);
diff --git a/drivers/infiniband/hw/i40iw/i40iw_cm.c b/drivers/infiniband/hw/i40iw/i40iw_cm.c
index 5a2fa74..14f36ba 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_cm.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_cm.c
@@ -1582,15 +1582,14 @@ static enum i40iw_status_code i40iw_del_multiple_qhash(
 }
 
 /**
- * i40iw_netdev_vlan_ipv6 - Gets the netdev and mac
+ * i40iw_netdev_vlan_ipv6 - Gets the netdev and vlan
  * @addr: local IPv6 address
  * @vlan_id: vlan id for the given IPv6 address
- * @mac: mac address for the given IPv6 address
  *
  * Returns the net_device of the IPv6 address and also sets the
- * vlan id and mac for that address.
+ * vlan id for that address.
  */
-static struct net_device *i40iw_netdev_vlan_ipv6(u32 *addr, u16 *vlan_id, u8 *mac)
+static struct net_device *i40iw_netdev_vlan_ipv6(u32 *addr, u16 *vlan_id)
 {
 	struct net_device *ip_dev = NULL;
 	struct in6_addr laddr6;
@@ -1600,15 +1599,11 @@ static struct net_device *i40iw_netdev_vlan_ipv6(u32 *addr, u16 *vlan_id, u8 *ma
 	i40iw_copy_ip_htonl(laddr6.in6_u.u6_addr32, addr);
 	if (vlan_id)
 		*vlan_id = I40IW_NO_VLAN;
-	if (mac)
-		eth_zero_addr(mac);
 	rcu_read_lock();
 	for_each_netdev_rcu(&init_net, ip_dev) {
 		if (ipv6_chk_addr(&init_net, &laddr6, ip_dev, 1)) {
 			if (vlan_id)
 				*vlan_id = rdma_vlan_dev_vlan_id(ip_dev);
-			if (ip_dev->dev_addr && mac)
-				ether_addr_copy(mac, ip_dev->dev_addr);
 			break;
 		}
 	}
@@ -3588,7 +3583,7 @@ int i40iw_accept(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
 		cm_node->vlan_id = i40iw_get_vlan_ipv4(cm_node->loc_addr);
 	} else {
 		cm_node->ipv4 = false;
-		i40iw_netdev_vlan_ipv6(cm_node->loc_addr, &cm_node->vlan_id, NULL);
+		i40iw_netdev_vlan_ipv6(cm_node->loc_addr, &cm_node->vlan_id);
 	}
 	i40iw_debug(cm_node->dev,
 		    I40IW_DEBUG_CM,
@@ -3687,8 +3682,6 @@ int i40iw_accept(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
 
 	cm_node->accelerated = 1;
 	if (cm_node->accept_pend) {
-		if (!cm_node->listener)
-			i40iw_pr_err("cm_node->listener NULL for passive node\n");
 		atomic_dec(&cm_node->listener->pend_accepts_cnt);
 		cm_node->accept_pend = 0;
 	}
@@ -3789,7 +3782,7 @@ int i40iw_connect(struct iw_cm_id *cm_id, struct iw_cm_conn_param *conn_param)
 				    raddr6->sin6_addr.in6_u.u6_addr32);
 		cm_info.loc_port = ntohs(laddr6->sin6_port);
 		cm_info.rem_port = ntohs(raddr6->sin6_port);
-		i40iw_netdev_vlan_ipv6(cm_info.loc_addr, &cm_info.vlan_id, NULL);
+		i40iw_netdev_vlan_ipv6(cm_info.loc_addr, &cm_info.vlan_id);
 	}
 	cm_info.cm_id = cm_id;
 	cm_info.tos = cm_id->tos;
@@ -3931,8 +3924,7 @@ int i40iw_create_listen(struct iw_cm_id *cm_id, int backlog)
 		cm_info.loc_port = ntohs(laddr6->sin6_port);
 		if (ipv6_addr_type(&laddr6->sin6_addr) != IPV6_ADDR_ANY)
 			i40iw_netdev_vlan_ipv6(cm_info.loc_addr,
-					       &cm_info.vlan_id,
-					       NULL);
+					       &cm_info.vlan_id);
 		else
 			wildcard = true;
 	}
@@ -4056,12 +4048,7 @@ static void i40iw_cm_event_connected(struct i40iw_cm_event *event)
 	i40iw_modify_qp(&iwqp->ibqp, &attr, IB_QP_STATE, NULL);
 
 	cm_node->accelerated = 1;
-	if (cm_node->accept_pend) {
-		if (!cm_node->listener)
-			i40iw_pr_err("listener is null for passive node\n");
-		atomic_dec(&cm_node->listener->pend_accepts_cnt);
-		cm_node->accept_pend = 0;
-	}
+
 	return;
 
 error:
diff --git a/drivers/infiniband/hw/i40iw/i40iw_ctrl.c b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
index a49ff2e..d1f5345 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_ctrl.c
@@ -54,6 +54,17 @@ static inline void i40iw_insert_wqe_hdr(u64 *wqe, u64 header)
 	set_64bit_val(wqe, 24, header);
 }
 
+void i40iw_check_cqp_progress(struct i40iw_cqp_timeout *cqp_timeout, struct i40iw_sc_dev *dev)
+{
+	if (cqp_timeout->compl_cqp_cmds != dev->cqp_cmd_stats[OP_COMPLETED_COMMANDS]) {
+		cqp_timeout->compl_cqp_cmds = dev->cqp_cmd_stats[OP_COMPLETED_COMMANDS];
+		cqp_timeout->count = 0;
+	} else {
+		if (dev->cqp_cmd_stats[OP_REQUESTED_COMMANDS] != cqp_timeout->compl_cqp_cmds)
+			cqp_timeout->count++;
+	}
+}
+
 /**
  * i40iw_get_cqp_reg_info - get head and tail for cqp using registers
  * @cqp: struct for cqp hw
diff --git a/drivers/infiniband/hw/i40iw/i40iw_main.c b/drivers/infiniband/hw/i40iw/i40iw_main.c
index ae8463f..cc742c3 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_main.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_main.c
@@ -77,7 +77,6 @@ MODULE_PARM_DESC(mpa_version, "MPA version to be used in MPA Req/Resp 1 or 2");
 MODULE_AUTHOR("Intel Corporation, <e1000-rdma@lists.sourceforge.net>");
 MODULE_DESCRIPTION("Intel(R) Ethernet Connection X722 iWARP RDMA Driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 
 static struct i40e_client i40iw_client;
 static char i40iw_client_name[I40E_CLIENT_STR_LENGTH] = "i40iw";
diff --git a/drivers/infiniband/hw/i40iw/i40iw_p.h b/drivers/infiniband/hw/i40iw/i40iw_p.h
index 28a92fe..e217a12 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_p.h
+++ b/drivers/infiniband/hw/i40iw/i40iw_p.h
@@ -35,11 +35,13 @@
 #ifndef I40IW_P_H
 #define I40IW_P_H
 
-#define PAUSE_TIMER_VALUE  0xFFFF
-#define REFRESH_THRESHOLD  0x7FFF
-#define HIGH_THRESHOLD     0x800
-#define LOW_THRESHOLD      0x200
-#define ALL_TC2PFC         0xFF
+#define PAUSE_TIMER_VALUE       0xFFFF
+#define REFRESH_THRESHOLD       0x7FFF
+#define HIGH_THRESHOLD          0x800
+#define LOW_THRESHOLD           0x200
+#define ALL_TC2PFC              0xFF
+#define CQP_COMPL_WAIT_TIME     0x3E8
+#define CQP_TIMEOUT_THRESHOLD   5
 
 void i40iw_debug_buf(struct i40iw_sc_dev *dev, enum i40iw_debug_flag mask,
 		     char *desc, u64 *buf, u32 size);
@@ -51,6 +53,8 @@ void i40iw_sc_cqp_post_sq(struct i40iw_sc_cqp *cqp);
 
 u64 *i40iw_sc_cqp_get_next_send_wqe(struct i40iw_sc_cqp *cqp, u64 scratch);
 
+void i40iw_check_cqp_progress(struct i40iw_cqp_timeout *cqp_timeout, struct i40iw_sc_dev *dev);
+
 enum i40iw_status_code i40iw_sc_mr_fast_register(struct i40iw_sc_qp *qp,
 						 struct i40iw_fast_reg_stag_info *info,
 						 bool post_sq);
diff --git a/drivers/infiniband/hw/i40iw/i40iw_pble.c b/drivers/infiniband/hw/i40iw/i40iw_pble.c
index c87ba16..540aab5 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_pble.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_pble.c
@@ -269,10 +269,8 @@ static enum i40iw_status_code add_bp_pages(struct i40iw_sc_dev *dev,
 	status = i40iw_add_sd_table_entry(dev->hw, hmc_info,
 					  info->idx.sd_idx, I40IW_SD_TYPE_PAGED,
 					  I40IW_HMC_DIRECT_BP_SIZE);
-	if (status) {
-		i40iw_free_vmalloc_mem(dev->hw, chunk);
-		return status;
-	}
+	if (status)
+		goto error;
 	if (!dev->is_pf) {
 		status = i40iw_vchnl_vf_add_hmc_objs(dev, I40IW_HMC_IW_PBLE,
 						     fpm_to_idx(pble_rsrc,
@@ -280,8 +278,7 @@ static enum i40iw_status_code add_bp_pages(struct i40iw_sc_dev *dev,
 						     (info->pages << PBLE_512_SHIFT));
 		if (status) {
 			i40iw_pr_err("allocate PBLEs in the PF.  Error %i\n", status);
-			i40iw_free_vmalloc_mem(dev->hw, chunk);
-			return status;
+			goto error;
 		}
 	}
 	addr = chunk->vaddr;
diff --git a/drivers/infiniband/hw/i40iw/i40iw_puda.c b/drivers/infiniband/hw/i40iw/i40iw_puda.c
index 7f5583d..c2cab20 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_puda.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_puda.c
@@ -949,14 +949,16 @@ enum i40iw_status_code i40iw_puda_create_rsrc(struct i40iw_sc_vsi *vsi,
 		ret = i40iw_puda_qp_create(rsrc);
 	}
 	if (ret) {
-		i40iw_debug(dev, I40IW_DEBUG_PUDA, "[%s] error qp_create\n", __func__);
+		i40iw_debug(dev, I40IW_DEBUG_PUDA, "[%s] error qp_create\n",
+			    __func__);
 		goto error;
 	}
 	rsrc->completion = PUDA_QP_CREATED;
 
 	ret = i40iw_puda_allocbufs(rsrc, info->tx_buf_cnt + info->rq_size);
 	if (ret) {
-		i40iw_debug(dev, I40IW_DEBUG_PUDA, "[%s] error allloc_buf\n", __func__);
+		i40iw_debug(dev, I40IW_DEBUG_PUDA, "[%s] error alloc_buf\n",
+			    __func__);
 		goto error;
 	}
 
diff --git a/drivers/infiniband/hw/i40iw/i40iw_type.h b/drivers/infiniband/hw/i40iw/i40iw_type.h
index 959ec81..63118f6 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_type.h
+++ b/drivers/infiniband/hw/i40iw/i40iw_type.h
@@ -1345,4 +1345,9 @@ struct i40iw_virtchnl_work_info {
 	void *worker_vf_dev;
 };
 
+struct i40iw_cqp_timeout {
+	u64 compl_cqp_cmds;
+	u8 count;
+};
+
 #endif
diff --git a/drivers/infiniband/hw/i40iw/i40iw_uk.c b/drivers/infiniband/hw/i40iw/i40iw_uk.c
index 1060725..0aadb7a 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_uk.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_uk.c
@@ -912,7 +912,7 @@ enum i40iw_status_code i40iw_get_wqe_shift(u32 wqdepth, u32 sge, u32 inline_data
 	return 0;
 }
 
-static struct i40iw_qp_uk_ops iw_qp_uk_ops = {
+static const struct i40iw_qp_uk_ops iw_qp_uk_ops = {
 	.iw_qp_post_wr = i40iw_qp_post_wr,
 	.iw_qp_ring_push_db = i40iw_qp_ring_push_db,
 	.iw_rdma_write = i40iw_rdma_write,
@@ -926,14 +926,14 @@ static struct i40iw_qp_uk_ops iw_qp_uk_ops = {
 	.iw_post_nop = i40iw_nop
 };
 
-static struct i40iw_cq_ops iw_cq_ops = {
+static const struct i40iw_cq_ops iw_cq_ops = {
 	.iw_cq_request_notification = i40iw_cq_request_notification,
 	.iw_cq_poll_completion = i40iw_cq_poll_completion,
 	.iw_cq_post_entries = i40iw_cq_post_entries,
 	.iw_cq_clean = i40iw_clean_cq
 };
 
-static struct i40iw_device_uk_ops iw_device_uk_ops = {
+static const struct i40iw_device_uk_ops iw_device_uk_ops = {
 	.iwarp_cq_uk_init = i40iw_cq_uk_init,
 	.iwarp_qp_uk_init = i40iw_qp_uk_init,
 };
diff --git a/drivers/infiniband/hw/i40iw/i40iw_utils.c b/drivers/infiniband/hw/i40iw/i40iw_utils.c
index e311ec5..62f1f45 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_utils.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_utils.c
@@ -445,23 +445,29 @@ static int i40iw_wait_event(struct i40iw_device *iwdev,
 {
 	struct cqp_commands_info *info = &cqp_request->info;
 	struct i40iw_cqp *iwcqp = &iwdev->cqp;
+	struct i40iw_cqp_timeout cqp_timeout;
 	bool cqp_error = false;
 	int err_code = 0;
-	int timeout_ret = 0;
+	memset(&cqp_timeout, 0, sizeof(cqp_timeout));
+	cqp_timeout.compl_cqp_cmds = iwdev->sc_dev.cqp_cmd_stats[OP_COMPLETED_COMMANDS];
+	do {
+		if (wait_event_timeout(cqp_request->waitq,
+				       cqp_request->request_done, CQP_COMPL_WAIT_TIME))
+			break;
 
-	timeout_ret = wait_event_timeout(cqp_request->waitq,
-					 cqp_request->request_done,
-					 I40IW_EVENT_TIMEOUT);
-	if (!timeout_ret) {
-		i40iw_pr_err("error cqp command 0x%x timed out ret = %d\n",
-			     info->cqp_cmd, timeout_ret);
+		i40iw_check_cqp_progress(&cqp_timeout, &iwdev->sc_dev);
+
+		if (cqp_timeout.count < CQP_TIMEOUT_THRESHOLD)
+			continue;
+
+		i40iw_pr_err("error cqp command 0x%x timed out", info->cqp_cmd);
 		err_code = -ETIME;
 		if (!iwdev->reset) {
 			iwdev->reset = true;
 			i40iw_request_reset(iwdev);
 		}
 		goto done;
-	}
+	} while (1);
 	cqp_error = cqp_request->compl_info.error;
 	if (cqp_error) {
 		i40iw_pr_err("error cqp command 0x%x completion maj = 0x%x min=0x%x\n",
diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
index 02d871d..1aa4110 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_verbs.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
@@ -2584,13 +2584,12 @@ static const char * const i40iw_hw_stat_names[] = {
 		"iwRdmaInv"
 };
 
-static void i40iw_get_dev_fw_str(struct ib_device *dev, char *str,
-				 size_t str_len)
+static void i40iw_get_dev_fw_str(struct ib_device *dev, char *str)
 {
 	u32 firmware_version = I40IW_FW_VERSION;
 
-	snprintf(str, str_len, "%u.%u", firmware_version,
-		       (firmware_version & 0x000000ff));
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%u.%u", firmware_version,
+		 (firmware_version & 0x000000ff));
 }
 
 /**
diff --git a/drivers/infiniband/hw/mlx4/alias_GUID.c b/drivers/infiniband/hw/mlx4/alias_GUID.c
index ea24230..155b4df 100644
--- a/drivers/infiniband/hw/mlx4/alias_GUID.c
+++ b/drivers/infiniband/hw/mlx4/alias_GUID.c
@@ -528,7 +528,7 @@ static int set_guid_rec(struct ib_device *ibdev,
 
 	memset(&guid_info_rec, 0, sizeof (struct ib_sa_guidinfo_rec));
 
-	guid_info_rec.lid = cpu_to_be16(attr.lid);
+	guid_info_rec.lid = ib_lid_be16(attr.lid);
 	guid_info_rec.block_num = index;
 
 	memcpy(guid_info_rec.guid_info_list, rec_det->all_recs,
@@ -781,7 +781,7 @@ void mlx4_ib_init_alias_guid_work(struct mlx4_ib_dev *dev, int port)
 	spin_lock_irqsave(&dev->sriov.going_down_lock, flags);
 	spin_lock_irqsave(&dev->sriov.alias_guid.ag_work_lock, flags1);
 	if (!dev->sriov.is_going_down) {
-		/* If there is pending one should cancell then run, otherwise
+		/* If there is pending one should cancel then run, otherwise
 		  * won't run till previous one is ended as same work
 		  * struct is used.
 		  */
diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index ff931c5..cab7963 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -218,6 +218,7 @@ struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
 			goto err_mtt;
 
 		uar = &to_mucontext(context)->uar;
+		cq->mcq.usage = MLX4_RES_USAGE_USER_VERBS;
 	} else {
 		err = mlx4_db_alloc(dev->dev, &cq->db, 1);
 		if (err)
@@ -233,6 +234,7 @@ struct ib_cq *mlx4_ib_create_cq(struct ib_device *ibdev,
 			goto err_db;
 
 		uar = &dev->priv_uar;
+		cq->mcq.usage = MLX4_RES_USAGE_DRIVER;
 	}
 
 	if (dev->eq_table)
@@ -635,7 +637,7 @@ static void mlx4_ib_poll_sw_comp(struct mlx4_ib_cq *cq, int num_entries,
 	struct mlx4_ib_qp *qp;
 
 	*npolled = 0;
-	/* Find uncompleted WQEs belonging to that cq and retrun
+	/* Find uncompleted WQEs belonging to that cq and return
 	 * simulated FLUSH_ERR completions
 	 */
 	list_for_each_entry(qp, &cq->send_qp_list, cq_send_list) {
diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 21d31cb..0793a21 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -169,7 +169,7 @@ int mlx4_MAD_IFC(struct mlx4_ib_dev *dev, int mad_ifc_flags,
 
 		op_modifier |= 0x4;
 
-		in_modifier |= in_wc->slid << 16;
+		in_modifier |= ib_lid_cpu16(in_wc->slid) << 16;
 	}
 
 	err = mlx4_cmd_box(dev->dev, inmailbox->dma, outmailbox->dma, in_modifier,
@@ -625,7 +625,7 @@ int mlx4_ib_send_to_slave(struct mlx4_ib_dev *dev, int slave, u8 port,
 		memcpy((char *)&tun_mad->hdr.slid_mac_47_32, &(wc->smac[4]), 2);
 	} else {
 		tun_mad->hdr.sl_vid = cpu_to_be16(((u16)(wc->sl)) << 12);
-		tun_mad->hdr.slid_mac_47_32 = cpu_to_be16(wc->slid);
+		tun_mad->hdr.slid_mac_47_32 = ib_lid_be16(wc->slid);
 	}
 
 	ib_dma_sync_single_for_device(&dev->ib_dev,
@@ -826,7 +826,7 @@ static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 		}
 	}
 
-	slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+	slid = in_wc ? ib_lid_cpu16(in_wc->slid) : be16_to_cpu(IB_LID_PERMISSIVE);
 
 	if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP && slid == 0) {
 		forward_trap(to_mdev(ibdev), port_num, in_mad);
@@ -860,7 +860,7 @@ static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 	    in_mad->mad_hdr.method == IB_MGMT_METHOD_SET &&
 	    in_mad->mad_hdr.attr_id == IB_SMP_ATTR_PORT_INFO &&
 	    !ib_query_port(ibdev, port_num, &pattr))
-		prev_lid = pattr.lid;
+		prev_lid = ib_lid_cpu16(pattr.lid);
 
 	err = mlx4_MAD_IFC(to_mdev(ibdev),
 			   (mad_flags & IB_MAD_IGNORE_MKEY ? MLX4_MAD_IFC_IGNORE_MKEY : 0) |
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index d1b43cb..c636842 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -70,7 +70,6 @@
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("Mellanox ConnectX HCA InfiniBand driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 
 int mlx4_ib_sm_guid_assign = 0;
 module_param_named(sm_guid_assign, mlx4_ib_sm_guid_assign, int, 0444);
@@ -81,6 +80,8 @@ static const char mlx4_ib_version[] =
 	DRV_VERSION "\n";
 
 static void do_slave_init(struct mlx4_ib_dev *ibdev, int slave, int do_init);
+static enum rdma_link_layer mlx4_ib_port_link_layer(struct ib_device *device,
+						    u8 port_num);
 
 static struct workqueue_struct *wq;
 
@@ -552,6 +553,16 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
 	props->timestamp_mask = 0xFFFFFFFFFFFFULL;
 	props->max_ah = INT_MAX;
 
+	if ((dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS) &&
+	    (mlx4_ib_port_link_layer(ibdev, 1) == IB_LINK_LAYER_ETHERNET ||
+	     mlx4_ib_port_link_layer(ibdev, 2) == IB_LINK_LAYER_ETHERNET)) {
+		props->rss_caps.max_rwq_indirection_tables = props->max_qp;
+		props->rss_caps.max_rwq_indirection_table_size =
+			dev->dev->caps.max_rss_tbl_sz;
+		props->rss_caps.supported_qpts = 1 << IB_QPT_RAW_PACKET;
+		props->max_wq_type_rq = props->max_qp;
+	}
+
 	if (!mlx4_is_slave(dev->dev))
 		err = mlx4_get_internal_clock_params(dev->dev, &clock_params);
 
@@ -563,6 +574,13 @@ static int mlx4_ib_query_device(struct ib_device *ibdev,
 		}
 	}
 
+	if (uhw->outlen >= resp.response_length +
+	    sizeof(resp.max_inl_recv_sz)) {
+		resp.response_length += sizeof(resp.max_inl_recv_sz);
+		resp.max_inl_recv_sz  = dev->dev->caps.max_rq_sg *
+			sizeof(struct mlx4_wqe_data_seg);
+	}
+
 	if (uhw->outlen) {
 		err = ib_copy_to_udata(uhw, &resp, resp.response_length);
 		if (err)
@@ -1069,6 +1087,9 @@ static struct ib_ucontext *mlx4_ib_alloc_ucontext(struct ib_device *ibdev,
 	INIT_LIST_HEAD(&context->db_page_list);
 	mutex_init(&context->db_page_mutex);
 
+	INIT_LIST_HEAD(&context->wqn_ranges_list);
+	mutex_init(&context->wqn_ranges_mutex);
+
 	if (ibdev->uverbs_abi_ver == MLX4_IB_UVERBS_NO_DEV_CAPS_ABI_VERSION)
 		err = ib_copy_to_udata(udata, &resp_v3, sizeof(resp_v3));
 	else
@@ -2566,12 +2587,11 @@ static int mlx4_port_immutable(struct ib_device *ibdev, u8 port_num,
 	return 0;
 }
 
-static void get_fw_ver_str(struct ib_device *device, char *str,
-			   size_t str_len)
+static void get_fw_ver_str(struct ib_device *device, char *str)
 {
 	struct mlx4_ib_dev *dev =
 		container_of(device, struct mlx4_ib_dev, ib_dev);
-	snprintf(str, str_len, "%d.%d.%d",
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%d.%d.%d",
 		 (int) (dev->dev->caps.fw_ver >> 32),
 		 (int) (dev->dev->caps.fw_ver >> 16) & 0xffff,
 		 (int) dev->dev->caps.fw_ver & 0xffff);
@@ -2713,6 +2733,26 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 	ibdev->ib_dev.get_dev_fw_str    = get_fw_ver_str;
 	ibdev->ib_dev.disassociate_ucontext = mlx4_ib_disassociate_ucontext;
 
+	if ((dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS) &&
+	    ((mlx4_ib_port_link_layer(&ibdev->ib_dev, 1) ==
+	    IB_LINK_LAYER_ETHERNET) ||
+	    (mlx4_ib_port_link_layer(&ibdev->ib_dev, 2) ==
+	    IB_LINK_LAYER_ETHERNET))) {
+		ibdev->ib_dev.create_wq		= mlx4_ib_create_wq;
+		ibdev->ib_dev.modify_wq		= mlx4_ib_modify_wq;
+		ibdev->ib_dev.destroy_wq	= mlx4_ib_destroy_wq;
+		ibdev->ib_dev.create_rwq_ind_table  =
+			mlx4_ib_create_rwq_ind_table;
+		ibdev->ib_dev.destroy_rwq_ind_table =
+			mlx4_ib_destroy_rwq_ind_table;
+		ibdev->ib_dev.uverbs_ex_cmd_mask |=
+			(1ull << IB_USER_VERBS_EX_CMD_CREATE_WQ)	  |
+			(1ull << IB_USER_VERBS_EX_CMD_MODIFY_WQ)	  |
+			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_WQ)	  |
+			(1ull << IB_USER_VERBS_EX_CMD_CREATE_RWQ_IND_TBL) |
+			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_RWQ_IND_TBL);
+	}
+
 	if (!mlx4_is_slave(ibdev->dev)) {
 		ibdev->ib_dev.alloc_fmr		= mlx4_ib_fmr_alloc;
 		ibdev->ib_dev.map_phys_fmr	= mlx4_ib_map_phys_fmr;
@@ -2772,7 +2812,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 		allocated = 0;
 		if (mlx4_ib_port_link_layer(&ibdev->ib_dev, i + 1) ==
 						IB_LINK_LAYER_ETHERNET) {
-			err = mlx4_counter_alloc(ibdev->dev, &counter_index);
+			err = mlx4_counter_alloc(ibdev->dev, &counter_index,
+						 MLX4_RES_USAGE_DRIVER);
 			/* if failed to allocate a new counter, use default */
 			if (err)
 				counter_index =
@@ -2827,7 +2868,8 @@ static void *mlx4_ib_add(struct mlx4_dev *dev)
 		ibdev->steer_qpn_count = MLX4_IB_UC_MAX_NUM_QPS;
 		err = mlx4_qp_reserve_range(dev, ibdev->steer_qpn_count,
 					    MLX4_IB_UC_STEER_QPN_ALIGN,
-					    &ibdev->steer_qpn_base, 0);
+					    &ibdev->steer_qpn_base, 0,
+					    MLX4_RES_USAGE_DRIVER);
 		if (err)
 			goto err_counter;
 
diff --git a/drivers/infiniband/hw/mlx4/mcg.c b/drivers/infiniband/hw/mlx4/mcg.c
index b73f897..70eb9f9 100644
--- a/drivers/infiniband/hw/mlx4/mcg.c
+++ b/drivers/infiniband/hw/mlx4/mcg.c
@@ -808,8 +808,7 @@ static ssize_t sysfs_show_group(struct device *dev,
 		struct device_attribute *attr, char *buf);
 
 static struct mcast_group *acquire_group(struct mlx4_ib_demux_ctx *ctx,
-					 union ib_gid *mgid, int create,
-					 gfp_t gfp_mask)
+					 union ib_gid *mgid, int create)
 {
 	struct mcast_group *group, *cur_group;
 	int is_mgid0;
@@ -825,7 +824,7 @@ static struct mcast_group *acquire_group(struct mlx4_ib_demux_ctx *ctx,
 	if (!create)
 		return ERR_PTR(-ENOENT);
 
-	group = kzalloc(sizeof *group, gfp_mask);
+	group = kzalloc(sizeof(*group), GFP_KERNEL);
 	if (!group)
 		return ERR_PTR(-ENOMEM);
 
@@ -892,7 +891,7 @@ int mlx4_ib_mcg_demux_handler(struct ib_device *ibdev, int port, int slave,
 	case IB_MGMT_METHOD_GET_RESP:
 	case IB_SA_METHOD_DELETE_RESP:
 		mutex_lock(&ctx->mcg_table_lock);
-		group = acquire_group(ctx, &rec->mgid, 0, GFP_KERNEL);
+		group = acquire_group(ctx, &rec->mgid, 0);
 		mutex_unlock(&ctx->mcg_table_lock);
 		if (IS_ERR(group)) {
 			if (mad->mad_hdr.method == IB_MGMT_METHOD_GET_RESP) {
@@ -954,7 +953,7 @@ int mlx4_ib_mcg_multiplex_handler(struct ib_device *ibdev, int port,
 		req->sa_mad = *sa_mad;
 
 		mutex_lock(&ctx->mcg_table_lock);
-		group = acquire_group(ctx, &rec->mgid, may_create, GFP_KERNEL);
+		group = acquire_group(ctx, &rec->mgid, may_create);
 		mutex_unlock(&ctx->mcg_table_lock);
 		if (IS_ERR(group)) {
 			kfree(req);
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 9db82e6..1fa1982 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -46,6 +46,7 @@
 
 #include <linux/mlx4/device.h>
 #include <linux/mlx4/doorbell.h>
+#include <linux/mlx4/qp.h>
 
 #define MLX4_IB_DRV_NAME	"mlx4_ib"
 
@@ -88,6 +89,8 @@ struct mlx4_ib_ucontext {
 	struct list_head	db_page_list;
 	struct mutex		db_page_mutex;
 	struct mlx4_ib_vma_private_data hw_bar_info[HW_BAR_COUNT];
+	struct list_head	wqn_ranges_list;
+	struct mutex		wqn_ranges_mutex; /* protect wqn_ranges_list */
 };
 
 struct mlx4_ib_pd {
@@ -289,8 +292,25 @@ struct mlx4_roce_smac_vlan_info {
 	int update_vid;
 };
 
+struct mlx4_wqn_range {
+	int			base_wqn;
+	int			size;
+	int			refcount;
+	bool			dirty;
+	struct list_head	list;
+};
+
+struct mlx4_ib_rss {
+	unsigned int		base_qpn_tbl_sz;
+	u8			flags;
+	u8			rss_key[MLX4_EN_RSS_KEY_SIZE];
+};
+
 struct mlx4_ib_qp {
-	struct ib_qp		ibqp;
+	union {
+		struct ib_qp	ibqp;
+		struct ib_wq	ibwq;
+	};
 	struct mlx4_qp		mqp;
 	struct mlx4_buf		buf;
 
@@ -318,6 +338,7 @@ struct mlx4_ib_qp {
 	u8			sq_no_prefetch;
 	u8			state;
 	int			mlx_type;
+	u32			inl_recv_sz;
 	struct list_head	gid_list;
 	struct list_head	steering_rules;
 	struct mlx4_ib_buf	*sqp_proxy_rcv;
@@ -328,6 +349,10 @@ struct mlx4_ib_qp {
 	struct list_head	cq_recv_list;
 	struct list_head	cq_send_list;
 	struct counter_index	*counter_index;
+	struct mlx4_wqn_range	*wqn_range;
+	/* Number of RSS QP parents that use this WQ */
+	u32			rss_usecnt;
+	struct mlx4_ib_rss	*rss_ctx;
 };
 
 struct mlx4_ib_srq {
@@ -623,6 +648,8 @@ struct mlx4_uverbs_ex_query_device_resp {
 	__u32 comp_mask;
 	__u32 response_length;
 	__u64 hca_core_clock_offset;
+	__u32 max_inl_recv_sz;
+	__u32 reserved;
 };
 
 static inline struct mlx4_ib_dev *to_mdev(struct ib_device *ibdev)
@@ -890,4 +917,17 @@ void mlx4_sched_ib_sl2vl_update_work(struct mlx4_ib_dev *ibdev,
 
 void mlx4_ib_sl2vl_update(struct mlx4_ib_dev *mdev, int port);
 
+struct ib_wq *mlx4_ib_create_wq(struct ib_pd *pd,
+				struct ib_wq_init_attr *init_attr,
+				struct ib_udata *udata);
+int mlx4_ib_destroy_wq(struct ib_wq *wq);
+int mlx4_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
+		      u32 wq_attr_mask, struct ib_udata *udata);
+
+struct ib_rwq_ind_table
+*mlx4_ib_create_rwq_ind_table(struct ib_device *device,
+			      struct ib_rwq_ind_table_init_attr *init_attr,
+			      struct ib_udata *udata);
+int mlx4_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table);
+
 #endif /* MLX4_IB_H */
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 75c0e6c..2747abd 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -36,7 +36,6 @@
 #include <net/ip.h>
 #include <linux/slab.h>
 #include <linux/netdevice.h>
-#include <linux/vmalloc.h>
 
 #include <rdma/ib_cache.h>
 #include <rdma/ib_pack.h>
@@ -53,6 +52,7 @@ static void mlx4_ib_lock_cqs(struct mlx4_ib_cq *send_cq,
 			     struct mlx4_ib_cq *recv_cq);
 static void mlx4_ib_unlock_cqs(struct mlx4_ib_cq *send_cq,
 			       struct mlx4_ib_cq *recv_cq);
+static int _mlx4_ib_modify_wq(struct ib_wq *ibwq, enum ib_wq_state new_state);
 
 enum {
 	MLX4_IB_ACK_REQ_FREQ	= 8,
@@ -116,6 +116,11 @@ static const __be32 mlx4_ib_opcode[] = {
 	[IB_WR_MASKED_ATOMIC_FETCH_AND_ADD]	= cpu_to_be32(MLX4_OPCODE_MASKED_ATOMIC_FA),
 };
 
+enum mlx4_ib_source_type {
+	MLX4_IB_QP_SRC	= 0,
+	MLX4_IB_RWQ_SRC	= 1,
+};
+
 static struct mlx4_ib_sqp *to_msqp(struct mlx4_ib_qp *mqp)
 {
 	return container_of(mqp, struct mlx4_ib_sqp, qp);
@@ -330,6 +335,12 @@ static void mlx4_ib_qp_event(struct mlx4_qp *qp, enum mlx4_event type)
 	}
 }
 
+static void mlx4_ib_wq_event(struct mlx4_qp *qp, enum mlx4_event type)
+{
+	pr_warn_ratelimited("Unexpected event type %d on WQ 0x%06x. Events are not supported for WQs\n",
+			    type, qp->qpn);
+}
+
 static int send_wqe_overhead(enum mlx4_ib_qp_type type, u32 flags)
 {
 	/*
@@ -377,7 +388,8 @@ static int send_wqe_overhead(enum mlx4_ib_qp_type type, u32 flags)
 }
 
 static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
-		       int is_user, int has_rq, struct mlx4_ib_qp *qp)
+		       int is_user, int has_rq, struct mlx4_ib_qp *qp,
+		       u32 inl_recv_sz)
 {
 	/* Sanity check RQ size before proceeding */
 	if (cap->max_recv_wr > dev->dev->caps.max_wqes - MLX4_IB_SQ_MAX_SPARE ||
@@ -385,18 +397,24 @@ static int set_rq_size(struct mlx4_ib_dev *dev, struct ib_qp_cap *cap,
 		return -EINVAL;
 
 	if (!has_rq) {
-		if (cap->max_recv_wr)
+		if (cap->max_recv_wr || inl_recv_sz)
 			return -EINVAL;
 
 		qp->rq.wqe_cnt = qp->rq.max_gs = 0;
 	} else {
+		u32 max_inl_recv_sz = dev->dev->caps.max_rq_sg *
+			sizeof(struct mlx4_wqe_data_seg);
+		u32 wqe_size;
+
 		/* HW requires >= 1 RQ entry with >= 1 gather entry */
-		if (is_user && (!cap->max_recv_wr || !cap->max_recv_sge))
+		if (is_user && (!cap->max_recv_wr || !cap->max_recv_sge ||
+				inl_recv_sz > max_inl_recv_sz))
 			return -EINVAL;
 
 		qp->rq.wqe_cnt	 = roundup_pow_of_two(max(1U, cap->max_recv_wr));
 		qp->rq.max_gs	 = roundup_pow_of_two(max(1U, cap->max_recv_sge));
-		qp->rq.wqe_shift = ilog2(qp->rq.max_gs * sizeof (struct mlx4_wqe_data_seg));
+		wqe_size = qp->rq.max_gs * sizeof(struct mlx4_wqe_data_seg);
+		qp->rq.wqe_shift = ilog2(max_t(u32, wqe_size, inl_recv_sz));
 	}
 
 	/* leave userspace return values as they were, so as not to break ABI */
@@ -632,7 +650,300 @@ static void mlx4_ib_free_qp_counter(struct mlx4_ib_dev *dev,
 	qp->counter_index = NULL;
 }
 
+static int set_qp_rss(struct mlx4_ib_dev *dev, struct mlx4_ib_rss *rss_ctx,
+		      struct ib_qp_init_attr *init_attr,
+		      struct mlx4_ib_create_qp_rss *ucmd)
+{
+	rss_ctx->base_qpn_tbl_sz = init_attr->rwq_ind_tbl->ind_tbl[0]->wq_num |
+		(init_attr->rwq_ind_tbl->log_ind_tbl_size << 24);
+
+	if ((ucmd->rx_hash_function == MLX4_IB_RX_HASH_FUNC_TOEPLITZ) &&
+	    (dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_RSS_TOP)) {
+		memcpy(rss_ctx->rss_key, ucmd->rx_hash_key,
+		       MLX4_EN_RSS_KEY_SIZE);
+	} else {
+		pr_debug("RX Hash function is not supported\n");
+		return (-EOPNOTSUPP);
+	}
+
+	if ((ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_SRC_IPV4) &&
+	    (ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_DST_IPV4)) {
+		rss_ctx->flags = MLX4_RSS_IPV4;
+	} else if ((ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_SRC_IPV4) ||
+		   (ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_DST_IPV4)) {
+		pr_debug("RX Hash fields_mask is not supported - both IPv4 SRC and DST must be set\n");
+		return (-EOPNOTSUPP);
+	}
+
+	if ((ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_SRC_IPV6) &&
+	    (ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_DST_IPV6)) {
+		rss_ctx->flags |= MLX4_RSS_IPV6;
+	} else if ((ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_SRC_IPV6) ||
+		   (ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_DST_IPV6)) {
+		pr_debug("RX Hash fields_mask is not supported - both IPv6 SRC and DST must be set\n");
+		return (-EOPNOTSUPP);
+	}
+
+	if ((ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_SRC_PORT_UDP) &&
+	    (ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_DST_PORT_UDP)) {
+		if (!(dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_UDP_RSS)) {
+			pr_debug("RX Hash fields_mask for UDP is not supported\n");
+			return (-EOPNOTSUPP);
+		}
+
+		if (rss_ctx->flags & MLX4_RSS_IPV4) {
+			rss_ctx->flags |= MLX4_RSS_UDP_IPV4;
+		} else if (rss_ctx->flags & MLX4_RSS_IPV6) {
+			rss_ctx->flags |= MLX4_RSS_UDP_IPV6;
+		} else {
+			pr_debug("RX Hash fields_mask is not supported - UDP must be set with IPv4 or IPv6\n");
+			return (-EOPNOTSUPP);
+		}
+	} else if ((ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_SRC_PORT_UDP) ||
+		   (ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_DST_PORT_UDP)) {
+		pr_debug("RX Hash fields_mask is not supported - both UDP SRC and DST must be set\n");
+		return (-EOPNOTSUPP);
+	}
+
+	if ((ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_SRC_PORT_TCP) &&
+	    (ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_DST_PORT_TCP)) {
+		if (rss_ctx->flags & MLX4_RSS_IPV4) {
+			rss_ctx->flags |= MLX4_RSS_TCP_IPV4;
+		} else if (rss_ctx->flags & MLX4_RSS_IPV6) {
+			rss_ctx->flags |= MLX4_RSS_TCP_IPV6;
+		} else {
+			pr_debug("RX Hash fields_mask is not supported - TCP must be set with IPv4 or IPv6\n");
+			return (-EOPNOTSUPP);
+		}
+
+	} else if ((ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_SRC_PORT_TCP) ||
+		   (ucmd->rx_hash_fields_mask & MLX4_IB_RX_HASH_DST_PORT_TCP)) {
+		pr_debug("RX Hash fields_mask is not supported - both TCP SRC and DST must be set\n");
+		return (-EOPNOTSUPP);
+	}
+
+	return 0;
+}
+
+static int create_qp_rss(struct mlx4_ib_dev *dev, struct ib_pd *ibpd,
+			 struct ib_qp_init_attr *init_attr,
+			 struct mlx4_ib_create_qp_rss *ucmd,
+			 struct mlx4_ib_qp *qp)
+{
+	int qpn;
+	int err;
+
+	qp->mqp.usage = MLX4_RES_USAGE_USER_VERBS;
+
+	err = mlx4_qp_reserve_range(dev->dev, 1, 1, &qpn, 0, qp->mqp.usage);
+	if (err)
+		return err;
+
+	err = mlx4_qp_alloc(dev->dev, qpn, &qp->mqp);
+	if (err)
+		goto err_qpn;
+
+	mutex_init(&qp->mutex);
+
+	INIT_LIST_HEAD(&qp->gid_list);
+	INIT_LIST_HEAD(&qp->steering_rules);
+
+	qp->mlx4_ib_qp_type = MLX4_IB_QPT_RAW_PACKET;
+	qp->state = IB_QPS_RESET;
+
+	/* Set dummy send resources to be compatible with HV and PRM */
+	qp->sq_no_prefetch = 1;
+	qp->sq.wqe_cnt = 1;
+	qp->sq.wqe_shift = MLX4_IB_MIN_SQ_STRIDE;
+	qp->buf_size = qp->sq.wqe_cnt << MLX4_IB_MIN_SQ_STRIDE;
+	qp->mtt = (to_mqp(
+		   (struct ib_qp *)init_attr->rwq_ind_tbl->ind_tbl[0]))->mtt;
+
+	qp->rss_ctx = kzalloc(sizeof(*qp->rss_ctx), GFP_KERNEL);
+	if (!qp->rss_ctx) {
+		err = -ENOMEM;
+		goto err_qp_alloc;
+	}
+
+	err = set_qp_rss(dev, qp->rss_ctx, init_attr, ucmd);
+	if (err)
+		goto err;
+
+	return 0;
+
+err:
+	kfree(qp->rss_ctx);
+
+err_qp_alloc:
+	mlx4_qp_remove(dev->dev, &qp->mqp);
+	mlx4_qp_free(dev->dev, &qp->mqp);
+
+err_qpn:
+	mlx4_qp_release_range(dev->dev, qpn, 1);
+	return err;
+}
+
+static struct ib_qp *_mlx4_ib_create_qp_rss(struct ib_pd *pd,
+					    struct ib_qp_init_attr *init_attr,
+					    struct ib_udata *udata)
+{
+	struct mlx4_ib_qp *qp;
+	struct mlx4_ib_create_qp_rss ucmd = {};
+	size_t required_cmd_sz;
+	int err;
+
+	if (!udata) {
+		pr_debug("RSS QP with NULL udata\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (udata->outlen)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	required_cmd_sz = offsetof(typeof(ucmd), reserved1) +
+					sizeof(ucmd.reserved1);
+	if (udata->inlen < required_cmd_sz) {
+		pr_debug("invalid inlen\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen))) {
+		pr_debug("copy failed\n");
+		return ERR_PTR(-EFAULT);
+	}
+
+	if (memchr_inv(ucmd.reserved, 0, sizeof(ucmd.reserved)))
+		return ERR_PTR(-EOPNOTSUPP);
+
+	if (ucmd.comp_mask || ucmd.reserved1)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	if (udata->inlen > sizeof(ucmd) &&
+	    !ib_is_udata_cleared(udata, sizeof(ucmd),
+				 udata->inlen - sizeof(ucmd))) {
+		pr_debug("inlen is not supported\n");
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	if (init_attr->qp_type != IB_QPT_RAW_PACKET) {
+		pr_debug("RSS QP with unsupported QP type %d\n",
+			 init_attr->qp_type);
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	if (init_attr->create_flags) {
+		pr_debug("RSS QP doesn't support create flags\n");
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	if (init_attr->send_cq || init_attr->cap.max_send_wr) {
+		pr_debug("RSS QP with unsupported send attributes\n");
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	if (!qp)
+		return ERR_PTR(-ENOMEM);
+
+	qp->pri.vid = 0xFFFF;
+	qp->alt.vid = 0xFFFF;
+
+	err = create_qp_rss(to_mdev(pd->device), pd, init_attr, &ucmd, qp);
+	if (err) {
+		kfree(qp);
+		return ERR_PTR(err);
+	}
+
+	qp->ibqp.qp_num = qp->mqp.qpn;
+
+	return &qp->ibqp;
+}
+
+/*
+ * This function allocates a WQN from a range which is consecutive and aligned
+ * to its size. In case the range is full, then it creates a new range and
+ * allocates WQN from it. The new range will be used for following allocations.
+ */
+static int mlx4_ib_alloc_wqn(struct mlx4_ib_ucontext *context,
+			     struct mlx4_ib_qp *qp, int range_size, int *wqn)
+{
+	struct mlx4_ib_dev *dev = to_mdev(context->ibucontext.device);
+	struct mlx4_wqn_range *range;
+	int err = 0;
+
+	mutex_lock(&context->wqn_ranges_mutex);
+
+	range = list_first_entry_or_null(&context->wqn_ranges_list,
+					 struct mlx4_wqn_range, list);
+
+	if (!range || (range->refcount == range->size) || range->dirty) {
+		range = kzalloc(sizeof(*range), GFP_KERNEL);
+		if (!range) {
+			err = -ENOMEM;
+			goto out;
+		}
+
+		err = mlx4_qp_reserve_range(dev->dev, range_size,
+					    range_size, &range->base_wqn, 0,
+					    qp->mqp.usage);
+		if (err) {
+			kfree(range);
+			goto out;
+		}
+
+		range->size = range_size;
+		list_add(&range->list, &context->wqn_ranges_list);
+	} else if (range_size != 1) {
+		/*
+		 * Requesting a new range (>1) when last range is still open, is
+		 * not valid.
+		 */
+		err = -EINVAL;
+		goto out;
+	}
+
+	qp->wqn_range = range;
+
+	*wqn = range->base_wqn + range->refcount;
+
+	range->refcount++;
+
+out:
+	mutex_unlock(&context->wqn_ranges_mutex);
+
+	return err;
+}
+
+static void mlx4_ib_release_wqn(struct mlx4_ib_ucontext *context,
+				struct mlx4_ib_qp *qp, bool dirty_release)
+{
+	struct mlx4_ib_dev *dev = to_mdev(context->ibucontext.device);
+	struct mlx4_wqn_range *range;
+
+	mutex_lock(&context->wqn_ranges_mutex);
+
+	range = qp->wqn_range;
+
+	range->refcount--;
+	if (!range->refcount) {
+		mlx4_qp_release_range(dev->dev, range->base_wqn,
+				      range->size);
+		list_del(&range->list);
+		kfree(range);
+	} else if (dirty_release) {
+		/*
+		 * A range from which a WQN has been destroyed cannot be
+		 * reused for further WQN allocations.
+		 * The next created WQ will allocate a new range.
+		 */
+		range->dirty = 1;
+	}
+
+	mutex_unlock(&context->wqn_ranges_mutex);
+}
+
 static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
+			    enum mlx4_ib_source_type src,
 			    struct ib_qp_init_attr *init_attr,
 			    struct ib_udata *udata, int sqpn,
 			    struct mlx4_ib_qp **caller_qp)
@@ -645,6 +956,7 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 	enum mlx4_ib_qp_type qp_type = (enum mlx4_ib_qp_type) init_attr->qp_type;
 	struct mlx4_ib_cq *mcq;
 	unsigned long flags;
+	int range_size = 0;
 
 	/* When tunneling special qps, we use a plain UD qp */
 	if (sqpn) {
@@ -719,26 +1031,70 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 	if (init_attr->sq_sig_type == IB_SIGNAL_ALL_WR)
 		qp->sq_signal_bits = cpu_to_be32(MLX4_WQE_CTRL_CQ_UPDATE);
 
-	err = set_rq_size(dev, &init_attr->cap, !!pd->uobject, qp_has_rq(init_attr), qp);
-	if (err)
-		goto err;
 
 	if (pd->uobject) {
-		struct mlx4_ib_create_qp ucmd;
+		union {
+			struct mlx4_ib_create_qp qp;
+			struct mlx4_ib_create_wq wq;
+		} ucmd;
+		size_t copy_len;
 
-		if (ib_copy_from_udata(&ucmd, udata, sizeof ucmd)) {
+		copy_len = (src == MLX4_IB_QP_SRC) ?
+			   sizeof(struct mlx4_ib_create_qp) :
+			   min(sizeof(struct mlx4_ib_create_wq), udata->inlen);
+
+		if (ib_copy_from_udata(&ucmd, udata, copy_len)) {
 			err = -EFAULT;
 			goto err;
 		}
 
-		qp->sq_no_prefetch = ucmd.sq_no_prefetch;
+		if (src == MLX4_IB_RWQ_SRC) {
+			if (ucmd.wq.comp_mask || ucmd.wq.reserved[0] ||
+			    ucmd.wq.reserved[1] || ucmd.wq.reserved[2]) {
+				pr_debug("user command isn't supported\n");
+				err = -EOPNOTSUPP;
+				goto err;
+			}
 
-		err = set_user_sq_size(dev, qp, &ucmd);
+			if (ucmd.wq.log_range_size >
+			    ilog2(dev->dev->caps.max_rss_tbl_sz)) {
+				pr_debug("WQN range size must be equal or smaller than %d\n",
+					 dev->dev->caps.max_rss_tbl_sz);
+				err = -EOPNOTSUPP;
+				goto err;
+			}
+			range_size = 1 << ucmd.wq.log_range_size;
+		} else {
+			qp->inl_recv_sz = ucmd.qp.inl_recv_sz;
+		}
+
+		err = set_rq_size(dev, &init_attr->cap, !!pd->uobject,
+				  qp_has_rq(init_attr), qp, qp->inl_recv_sz);
 		if (err)
 			goto err;
 
-		qp->umem = ib_umem_get(pd->uobject->context, ucmd.buf_addr,
-				       qp->buf_size, 0, 0);
+		if (src == MLX4_IB_QP_SRC) {
+			qp->sq_no_prefetch = ucmd.qp.sq_no_prefetch;
+
+			err = set_user_sq_size(dev, qp,
+					       (struct mlx4_ib_create_qp *)
+					       &ucmd);
+			if (err)
+				goto err;
+		} else {
+			qp->sq_no_prefetch = 1;
+			qp->sq.wqe_cnt = 1;
+			qp->sq.wqe_shift = MLX4_IB_MIN_SQ_STRIDE;
+			/* Allocated buffer expects to have at least that SQ
+			 * size.
+			 */
+			qp->buf_size = (qp->rq.wqe_cnt << qp->rq.wqe_shift) +
+				(qp->sq.wqe_cnt << qp->sq.wqe_shift);
+		}
+
+		qp->umem = ib_umem_get(pd->uobject->context,
+				(src == MLX4_IB_QP_SRC) ? ucmd.qp.buf_addr :
+				ucmd.wq.buf_addr, qp->buf_size, 0, 0);
 		if (IS_ERR(qp->umem)) {
 			err = PTR_ERR(qp->umem);
 			goto err;
@@ -755,11 +1111,18 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 
 		if (qp_has_rq(init_attr)) {
 			err = mlx4_ib_db_map_user(to_mucontext(pd->uobject->context),
-						  ucmd.db_addr, &qp->db);
+				(src == MLX4_IB_QP_SRC) ? ucmd.qp.db_addr :
+				ucmd.wq.db_addr, &qp->db);
 			if (err)
 				goto err_mtt;
 		}
+		qp->mqp.usage = MLX4_RES_USAGE_USER_VERBS;
 	} else {
+		err = set_rq_size(dev, &init_attr->cap, !!pd->uobject,
+				  qp_has_rq(init_attr), qp, 0);
+		if (err)
+			goto err;
+
 		qp->sq_no_prefetch = 0;
 
 		if (init_attr->create_flags & IB_QP_CREATE_IPOIB_UD_LSO)
@@ -812,20 +1175,15 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 		if (err)
 			goto err_mtt;
 
-		qp->sq.wrid = kmalloc_array(qp->sq.wqe_cnt, sizeof(u64),
-					GFP_KERNEL | __GFP_NOWARN);
-		if (!qp->sq.wrid)
-			qp->sq.wrid = __vmalloc(qp->sq.wqe_cnt * sizeof(u64),
-						GFP_KERNEL, PAGE_KERNEL);
-		qp->rq.wrid = kmalloc_array(qp->rq.wqe_cnt, sizeof(u64),
-					GFP_KERNEL | __GFP_NOWARN);
-		if (!qp->rq.wrid)
-			qp->rq.wrid = __vmalloc(qp->rq.wqe_cnt * sizeof(u64),
-						GFP_KERNEL, PAGE_KERNEL);
+		qp->sq.wrid = kvmalloc_array(qp->sq.wqe_cnt,
+					     sizeof(u64), GFP_KERNEL);
+		qp->rq.wrid = kvmalloc_array(qp->rq.wqe_cnt,
+					     sizeof(u64), GFP_KERNEL);
 		if (!qp->sq.wrid || !qp->rq.wrid) {
 			err = -ENOMEM;
 			goto err_wrid;
 		}
+		qp->mqp.usage = MLX4_RES_USAGE_DRIVER;
 	}
 
 	if (sqpn) {
@@ -836,6 +1194,11 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 				goto err_wrid;
 			}
 		}
+	} else if (src == MLX4_IB_RWQ_SRC) {
+		err = mlx4_ib_alloc_wqn(to_mucontext(pd->uobject->context), qp,
+					range_size, &qpn);
+		if (err)
+			goto err_wrid;
 	} else {
 		/* Raw packet QPNs may not have bits 6,7 set in their qp_num;
 		 * otherwise, the WQE BlueFlame setup flow wrongly causes
@@ -845,13 +1208,14 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 						    (init_attr->cap.max_send_wr ?
 						     MLX4_RESERVE_ETH_BF_QP : 0) |
 						    (init_attr->cap.max_recv_wr ?
-						     MLX4_RESERVE_A0_QP : 0));
+						     MLX4_RESERVE_A0_QP : 0),
+						    qp->mqp.usage);
 		else
 			if (qp->flags & MLX4_IB_QP_NETIF)
 				err = mlx4_ib_steer_qp_alloc(dev, 1, &qpn);
 			else
 				err = mlx4_qp_reserve_range(dev->dev, 1, 1,
-							    &qpn, 0);
+							    &qpn, 0, qp->mqp.usage);
 		if (err)
 			goto err_proxy;
 	}
@@ -873,7 +1237,9 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 	 */
 	qp->doorbell_qpn = swab32(qp->mqp.qpn << 8);
 
-	qp->mqp.event = mlx4_ib_qp_event;
+	qp->mqp.event = (src == MLX4_IB_QP_SRC) ? mlx4_ib_qp_event :
+						  mlx4_ib_wq_event;
+
 	if (!*caller_qp)
 		*caller_qp = qp;
 
@@ -900,6 +1266,9 @@ static int create_qp_common(struct mlx4_ib_dev *dev, struct ib_pd *pd,
 	if (!sqpn) {
 		if (qp->flags & MLX4_IB_QP_NETIF)
 			mlx4_ib_steer_qp_free(dev, qpn, 1);
+		else if (src == MLX4_IB_RWQ_SRC)
+			mlx4_ib_release_wqn(to_mucontext(pd->uobject->context),
+					    qp, 0);
 		else
 			mlx4_qp_release_range(dev->dev, qpn, 1);
 	}
@@ -998,7 +1367,7 @@ static struct mlx4_ib_pd *get_pd(struct mlx4_ib_qp *qp)
 		return to_mpd(qp->ibqp.pd);
 }
 
-static void get_cqs(struct mlx4_ib_qp *qp,
+static void get_cqs(struct mlx4_ib_qp *qp, enum mlx4_ib_source_type src,
 		    struct mlx4_ib_cq **send_cq, struct mlx4_ib_cq **recv_cq)
 {
 	switch (qp->ibqp.qp_type) {
@@ -1011,14 +1380,46 @@ static void get_cqs(struct mlx4_ib_qp *qp,
 		*recv_cq = *send_cq;
 		break;
 	default:
-		*send_cq = to_mcq(qp->ibqp.send_cq);
-		*recv_cq = to_mcq(qp->ibqp.recv_cq);
+		*recv_cq = (src == MLX4_IB_QP_SRC) ? to_mcq(qp->ibqp.recv_cq) :
+						     to_mcq(qp->ibwq.cq);
+		*send_cq = (src == MLX4_IB_QP_SRC) ? to_mcq(qp->ibqp.send_cq) :
+						     *recv_cq;
 		break;
 	}
 }
 
+static void destroy_qp_rss(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp)
+{
+	if (qp->state != IB_QPS_RESET) {
+		int i;
+
+		for (i = 0; i < (1 << qp->ibqp.rwq_ind_tbl->log_ind_tbl_size);
+		     i++) {
+			struct ib_wq *ibwq = qp->ibqp.rwq_ind_tbl->ind_tbl[i];
+			struct mlx4_ib_qp *wq =	to_mqp((struct ib_qp *)ibwq);
+
+			mutex_lock(&wq->mutex);
+
+			wq->rss_usecnt--;
+
+			mutex_unlock(&wq->mutex);
+		}
+
+		if (mlx4_qp_modify(dev->dev, NULL, to_mlx4_state(qp->state),
+				   MLX4_QP_STATE_RST, NULL, 0, 0, &qp->mqp))
+			pr_warn("modify QP %06x to RESET failed.\n",
+				qp->mqp.qpn);
+	}
+
+	mlx4_qp_remove(dev->dev, &qp->mqp);
+	mlx4_qp_free(dev->dev, &qp->mqp);
+	mlx4_qp_release_range(dev->dev, qp->mqp.qpn, 1);
+	del_gid_entries(qp);
+	kfree(qp->rss_ctx);
+}
+
 static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp,
-			      int is_user)
+			      enum mlx4_ib_source_type src, int is_user)
 {
 	struct mlx4_ib_cq *send_cq, *recv_cq;
 	unsigned long flags;
@@ -1051,7 +1452,7 @@ static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp,
 		}
 	}
 
-	get_cqs(qp, &send_cq, &recv_cq);
+	get_cqs(qp, src, &send_cq, &recv_cq);
 
 	spin_lock_irqsave(&dev->reset_flow_resource_lock, flags);
 	mlx4_ib_lock_cqs(send_cq, recv_cq);
@@ -1077,6 +1478,9 @@ static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp,
 	if (!is_sqp(dev, qp) && !is_tunnel_qp(dev, qp)) {
 		if (qp->flags & MLX4_IB_QP_NETIF)
 			mlx4_ib_steer_qp_free(dev, qp->mqp.qpn, 1);
+		else if (src == MLX4_IB_RWQ_SRC)
+			mlx4_ib_release_wqn(to_mucontext(
+					    qp->ibwq.uobject->context), qp, 1);
 		else
 			mlx4_qp_release_range(dev->dev, qp->mqp.qpn, 1);
 	}
@@ -1084,9 +1488,12 @@ static void destroy_qp_common(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp,
 	mlx4_mtt_cleanup(dev->dev, &qp->mtt);
 
 	if (is_user) {
-		if (qp->rq.wqe_cnt)
-			mlx4_ib_db_unmap_user(to_mucontext(qp->ibqp.uobject->context),
-					      &qp->db);
+		if (qp->rq.wqe_cnt) {
+			struct mlx4_ib_ucontext *mcontext = !src ?
+				to_mucontext(qp->ibqp.uobject->context) :
+				to_mucontext(qp->ibwq.uobject->context);
+			mlx4_ib_db_unmap_user(mcontext, &qp->db);
+		}
 		ib_umem_release(qp->umem);
 	} else {
 		kvfree(qp->sq.wrid);
@@ -1128,6 +1535,9 @@ static struct ib_qp *_mlx4_ib_create_qp(struct ib_pd *pd,
 	int sup_u_create_flags = MLX4_IB_QP_BLOCK_MULTICAST_LOOPBACK;
 	u16 xrcdn = 0;
 
+	if (init_attr->rwq_ind_tbl)
+		return _mlx4_ib_create_qp_rss(pd, init_attr, udata);
+
 	/*
 	 * We only support LSO, vendor flag1, and multicast loopback blocking,
 	 * and only for kernel UD QPs.
@@ -1182,8 +1592,8 @@ static struct ib_qp *_mlx4_ib_create_qp(struct ib_pd *pd,
 		/* fall through */
 	case IB_QPT_UD:
 	{
-		err = create_qp_common(to_mdev(pd->device), pd, init_attr,
-				       udata, 0, &qp);
+		err = create_qp_common(to_mdev(pd->device), pd,	MLX4_IB_QP_SRC,
+				       init_attr, udata, 0, &qp);
 		if (err) {
 			kfree(qp);
 			return ERR_PTR(err);
@@ -1203,7 +1613,9 @@ static struct ib_qp *_mlx4_ib_create_qp(struct ib_pd *pd,
 		if (udata)
 			return ERR_PTR(-EINVAL);
 		if (init_attr->create_flags & MLX4_IB_QP_CREATE_ROCE_V2_GSI) {
-			int res = mlx4_qp_reserve_range(to_mdev(pd->device)->dev, 1, 1, &sqpn, 0);
+			int res = mlx4_qp_reserve_range(to_mdev(pd->device)->dev,
+							1, 1, &sqpn, 0,
+							MLX4_RES_USAGE_DRIVER);
 
 			if (res)
 				return ERR_PTR(res);
@@ -1211,8 +1623,8 @@ static struct ib_qp *_mlx4_ib_create_qp(struct ib_pd *pd,
 			sqpn = get_sqp_num(to_mdev(pd->device), init_attr);
 		}
 
-		err = create_qp_common(to_mdev(pd->device), pd, init_attr, udata,
-				       sqpn, &qp);
+		err = create_qp_common(to_mdev(pd->device), pd, MLX4_IB_QP_SRC,
+				       init_attr, udata, sqpn, &qp);
 		if (err)
 			return ERR_PTR(err);
 
@@ -1267,7 +1679,6 @@ static int _mlx4_ib_destroy_qp(struct ib_qp *qp)
 {
 	struct mlx4_ib_dev *dev = to_mdev(qp->device);
 	struct mlx4_ib_qp *mqp = to_mqp(qp);
-	struct mlx4_ib_pd *pd;
 
 	if (is_qp0(dev, mqp))
 		mlx4_CLOSE_PORT(dev->dev, mqp->port);
@@ -1282,8 +1693,14 @@ static int _mlx4_ib_destroy_qp(struct ib_qp *qp)
 	if (mqp->counter_index)
 		mlx4_ib_free_qp_counter(dev, mqp);
 
-	pd = get_pd(mqp);
-	destroy_qp_common(dev, mqp, !!pd->ibpd.uobject);
+	if (qp->rwq_ind_tbl) {
+		destroy_qp_rss(dev, mqp);
+	} else {
+		struct mlx4_ib_pd *pd;
+
+		pd = get_pd(mqp);
+		destroy_qp_common(dev, mqp, MLX4_IB_QP_SRC, !!pd->ibpd.uobject);
+	}
 
 	if (is_sqp(dev, mqp))
 		kfree(to_msqp(mqp));
@@ -1566,7 +1983,7 @@ static int create_qp_lb_counter(struct mlx4_ib_dev *dev, struct mlx4_ib_qp *qp)
 	    !(dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_LB_SRC_CHK))
 		return 0;
 
-	err = mlx4_counter_alloc(dev->dev, &tmp_idx);
+	err = mlx4_counter_alloc(dev->dev, &tmp_idx, MLX4_RES_USAGE_DRIVER);
 	if (err)
 		return err;
 
@@ -1606,12 +2023,119 @@ static u8 gid_type_to_qpc(enum ib_gid_type gid_type)
 	}
 }
 
-static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
+/*
+ * Go over all the RSS QP's children (WQs) and apply their HW state according
+ * to their logical state if this is the first RSS QP associated with the WQ.
+ */
+static int bringup_rss_rwqs(struct ib_rwq_ind_table *ind_tbl, u8 port_num)
+{
+	int err = 0;
+	int i;
+
+	for (i = 0; i < (1 << ind_tbl->log_ind_tbl_size); i++) {
+		struct ib_wq *ibwq = ind_tbl->ind_tbl[i];
+		struct mlx4_ib_qp *wq = to_mqp((struct ib_qp *)ibwq);
+
+		mutex_lock(&wq->mutex);
+
+		/* mlx4_ib restriction:
+		 * A WQ is associated with a port according to the RSS QP it
+		 * is associated with.
+		 * In case the WQ is already associated with a different port
+		 * by another RSS QP, return a failure.
+		 */
+		if ((wq->rss_usecnt > 0) && (wq->port != port_num)) {
+			err = -EINVAL;
+			mutex_unlock(&wq->mutex);
+			break;
+		}
+		wq->port = port_num;
+		if ((wq->rss_usecnt == 0) && (ibwq->state == IB_WQS_RDY)) {
+			err = _mlx4_ib_modify_wq(ibwq, IB_WQS_RDY);
+			if (err) {
+				mutex_unlock(&wq->mutex);
+				break;
+			}
+		}
+		wq->rss_usecnt++;
+
+		mutex_unlock(&wq->mutex);
+	}
+
+	if (i && err) {
+		int j;
+
+		for (j = (i - 1); j >= 0; j--) {
+			struct ib_wq *ibwq = ind_tbl->ind_tbl[j];
+			struct mlx4_ib_qp *wq = to_mqp((struct ib_qp *)ibwq);
+
+			mutex_lock(&wq->mutex);
+
+			if ((wq->rss_usecnt == 1) &&
+			    (ibwq->state == IB_WQS_RDY))
+				if (_mlx4_ib_modify_wq(ibwq, IB_WQS_RESET))
+					pr_warn("failed to reverse WQN=0x%06x\n",
+						ibwq->wq_num);
+			wq->rss_usecnt--;
+
+			mutex_unlock(&wq->mutex);
+		}
+	}
+
+	return err;
+}
+
+static void bring_down_rss_rwqs(struct ib_rwq_ind_table *ind_tbl)
+{
+	int i;
+
+	for (i = 0; i < (1 << ind_tbl->log_ind_tbl_size); i++) {
+		struct ib_wq *ibwq = ind_tbl->ind_tbl[i];
+		struct mlx4_ib_qp *wq = to_mqp((struct ib_qp *)ibwq);
+
+		mutex_lock(&wq->mutex);
+
+		if ((wq->rss_usecnt == 1) && (ibwq->state == IB_WQS_RDY))
+			if (_mlx4_ib_modify_wq(ibwq, IB_WQS_RESET))
+				pr_warn("failed to reverse WQN=%x\n",
+					ibwq->wq_num);
+		wq->rss_usecnt--;
+
+		mutex_unlock(&wq->mutex);
+	}
+}
+
+static void fill_qp_rss_context(struct mlx4_qp_context *context,
+				struct mlx4_ib_qp *qp)
+{
+	struct mlx4_rss_context *rss_context;
+
+	rss_context = (void *)context + offsetof(struct mlx4_qp_context,
+			pri_path) + MLX4_RSS_OFFSET_IN_QPC_PRI_PATH;
+
+	rss_context->base_qpn = cpu_to_be32(qp->rss_ctx->base_qpn_tbl_sz);
+	rss_context->default_qpn =
+		cpu_to_be32(qp->rss_ctx->base_qpn_tbl_sz & 0xffffff);
+	if (qp->rss_ctx->flags & (MLX4_RSS_UDP_IPV4 | MLX4_RSS_UDP_IPV6))
+		rss_context->base_qpn_udp = rss_context->default_qpn;
+	rss_context->flags = qp->rss_ctx->flags;
+	/* Currently only Toeplitz is supported */
+	rss_context->hash_fn = MLX4_RSS_HASH_TOP;
+
+	memcpy(rss_context->rss_key, qp->rss_ctx->rss_key,
+	       MLX4_EN_RSS_KEY_SIZE);
+}
+
+static int __mlx4_ib_modify_qp(void *src, enum mlx4_ib_source_type src_type,
 			       const struct ib_qp_attr *attr, int attr_mask,
 			       enum ib_qp_state cur_state, enum ib_qp_state new_state)
 {
-	struct mlx4_ib_dev *dev = to_mdev(ibqp->device);
-	struct mlx4_ib_qp *qp = to_mqp(ibqp);
+	struct ib_uobject *ibuobject;
+	struct ib_srq  *ibsrq;
+	struct ib_rwq_ind_table *rwq_ind_tbl;
+	enum ib_qp_type qp_type;
+	struct mlx4_ib_dev *dev;
+	struct mlx4_ib_qp *qp;
 	struct mlx4_ib_pd *pd;
 	struct mlx4_ib_cq *send_cq, *recv_cq;
 	struct mlx4_qp_context *context;
@@ -1621,6 +2145,30 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 	int err = -EINVAL;
 	int counter_index;
 
+	if (src_type == MLX4_IB_RWQ_SRC) {
+		struct ib_wq *ibwq;
+
+		ibwq	    = (struct ib_wq *)src;
+		ibuobject   = ibwq->uobject;
+		ibsrq	    = NULL;
+		rwq_ind_tbl = NULL;
+		qp_type     = IB_QPT_RAW_PACKET;
+		qp	    = to_mqp((struct ib_qp *)ibwq);
+		dev	    = to_mdev(ibwq->device);
+		pd	    = to_mpd(ibwq->pd);
+	} else {
+		struct ib_qp *ibqp;
+
+		ibqp	    = (struct ib_qp *)src;
+		ibuobject   = ibqp->uobject;
+		ibsrq	    = ibqp->srq;
+		rwq_ind_tbl = ibqp->rwq_ind_tbl;
+		qp_type     = ibqp->qp_type;
+		qp	    = to_mqp(ibqp);
+		dev	    = to_mdev(ibqp->device);
+		pd	    = get_pd(qp);
+	}
+
 	/* APM is not supported under RoCE */
 	if (attr_mask & IB_QP_ALT_PATH &&
 	    rdma_port_get_link_layer(&dev->ib_dev, qp->port) ==
@@ -1634,6 +2182,11 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 	context->flags = cpu_to_be32((to_mlx4_state(new_state) << 28) |
 				     (to_mlx4_st(dev, qp->mlx4_ib_qp_type) << 16));
 
+	if (rwq_ind_tbl) {
+		fill_qp_rss_context(context, qp);
+		context->flags |= cpu_to_be32(1 << MLX4_RSS_QPC_FLAG_OFFSET);
+	}
+
 	if (!(attr_mask & IB_QP_PATH_MIG_STATE))
 		context->flags |= cpu_to_be32(MLX4_QP_PM_MIGRATED << 11);
 	else {
@@ -1651,11 +2204,14 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		}
 	}
 
-	if (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI)
+	if (qp->inl_recv_sz)
+		context->param3 |= cpu_to_be32(1 << 25);
+
+	if (qp_type == IB_QPT_GSI || qp_type == IB_QPT_SMI)
 		context->mtu_msgmax = (IB_MTU_4096 << 5) | 11;
-	else if (ibqp->qp_type == IB_QPT_RAW_PACKET)
+	else if (qp_type == IB_QPT_RAW_PACKET)
 		context->mtu_msgmax = (MLX4_RAW_QP_MTU << 5) | MLX4_RAW_QP_MSGMAX;
-	else if (ibqp->qp_type == IB_QPT_UD) {
+	else if (qp_type == IB_QPT_UD) {
 		if (qp->flags & MLX4_IB_QP_LSO)
 			context->mtu_msgmax = (IB_MTU_4096 << 5) |
 					      ilog2(dev->dev->caps.max_gso_sz);
@@ -1671,9 +2227,11 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 			ilog2(dev->dev->caps.max_msg_sz);
 	}
 
-	if (qp->rq.wqe_cnt)
-		context->rq_size_stride = ilog2(qp->rq.wqe_cnt) << 3;
-	context->rq_size_stride |= qp->rq.wqe_shift - 4;
+	if (!rwq_ind_tbl) { /* PRM RSS receive side should be left zeros */
+		if (qp->rq.wqe_cnt)
+			context->rq_size_stride = ilog2(qp->rq.wqe_cnt) << 3;
+		context->rq_size_stride |= qp->rq.wqe_shift - 4;
+	}
 
 	if (qp->sq.wqe_cnt)
 		context->sq_size_stride = ilog2(qp->sq.wqe_cnt) << 3;
@@ -1685,14 +2243,15 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 	if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) {
 		context->sq_size_stride |= !!qp->sq_no_prefetch << 7;
 		context->xrcd = cpu_to_be32((u32) qp->xrcdn);
-		if (ibqp->qp_type == IB_QPT_RAW_PACKET)
+		if (qp_type == IB_QPT_RAW_PACKET)
 			context->param3 |= cpu_to_be32(1 << 30);
 	}
 
-	if (qp->ibqp.uobject)
+	if (ibuobject)
 		context->usr_page = cpu_to_be32(
 			mlx4_to_hw_uar_index(dev->dev,
-					     to_mucontext(ibqp->uobject->context)->uar.index));
+					     to_mucontext(ibuobject->context)
+					     ->uar.index));
 	else
 		context->usr_page = cpu_to_be32(
 			mlx4_to_hw_uar_index(dev->dev, dev->priv_uar.index));
@@ -1736,7 +2295,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 			steer_qp = 1;
 		}
 
-		if (ibqp->qp_type == IB_QPT_GSI) {
+		if (qp_type == IB_QPT_GSI) {
 			enum ib_gid_type gid_type = qp->flags & MLX4_IB_ROCE_V2_GSI_QP ?
 				IB_GID_TYPE_ROCE_UDP_ENCAP : IB_GID_TYPE_ROCE;
 			u8 qpc_roce_mode = gid_type_to_qpc(gid_type);
@@ -1753,7 +2312,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 	}
 
 	if (attr_mask & IB_QP_AV) {
-		u8 port_num = mlx4_is_bonded(to_mdev(ibqp->device)->dev) ? 1 :
+		u8 port_num = mlx4_is_bonded(dev->dev) ? 1 :
 			attr_mask & IB_QP_PORT ? attr->port_num : qp->port;
 		union ib_gid gid;
 		struct ib_gid_attr gid_attr = {.gid_type = IB_GID_TYPE_IB};
@@ -1768,7 +2327,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 			int index =
 				rdma_ah_read_grh(&attr->ah_attr)->sgid_index;
 
-			status = ib_get_cached_gid(ibqp->device, port_num,
+			status = ib_get_cached_gid(&dev->ib_dev, port_num,
 						   index, &gid, &gid_attr);
 			if (!status && !memcmp(&gid, &zgid, sizeof(gid)))
 				status = -ENOENT;
@@ -1825,15 +2384,20 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		optpar |= MLX4_QP_OPTPAR_ALT_ADDR_PATH;
 	}
 
-	pd = get_pd(qp);
-	get_cqs(qp, &send_cq, &recv_cq);
-	context->pd       = cpu_to_be32(pd->pdn);
+	context->pd = cpu_to_be32(pd->pdn);
+
+	if (!rwq_ind_tbl) {
+		get_cqs(qp, src_type, &send_cq, &recv_cq);
+	} else { /* Set dummy CQs to be compatible with HV and PRM */
+		send_cq = to_mcq(rwq_ind_tbl->ind_tbl[0]->cq);
+		recv_cq = send_cq;
+	}
 	context->cqn_send = cpu_to_be32(send_cq->mcq.cqn);
 	context->cqn_recv = cpu_to_be32(recv_cq->mcq.cqn);
 	context->params1  = cpu_to_be32(MLX4_IB_ACK_REQ_FREQ << 28);
 
 	/* Set "fast registration enabled" for all kernel QPs */
-	if (!qp->ibqp.uobject)
+	if (!ibuobject)
 		context->params1 |= cpu_to_be32(1 << 11);
 
 	if (attr_mask & IB_QP_RNR_RETRY) {
@@ -1868,7 +2432,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		optpar |= MLX4_QP_OPTPAR_RWE | MLX4_QP_OPTPAR_RRE | MLX4_QP_OPTPAR_RAE;
 	}
 
-	if (ibqp->srq)
+	if (ibsrq)
 		context->params2 |= cpu_to_be32(MLX4_QP_BIT_RIC);
 
 	if (attr_mask & IB_QP_MIN_RNR_TIMER) {
@@ -1899,17 +2463,19 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		optpar |= MLX4_QP_OPTPAR_Q_KEY;
 	}
 
-	if (ibqp->srq)
-		context->srqn = cpu_to_be32(1 << 24 | to_msrq(ibqp->srq)->msrq.srqn);
+	if (ibsrq)
+		context->srqn = cpu_to_be32(1 << 24 |
+					    to_msrq(ibsrq)->msrq.srqn);
 
-	if (qp->rq.wqe_cnt && cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT)
+	if (qp->rq.wqe_cnt &&
+	    cur_state == IB_QPS_RESET &&
+	    new_state == IB_QPS_INIT)
 		context->db_rec_addr = cpu_to_be64(qp->db.dma);
 
 	if (cur_state == IB_QPS_INIT &&
 	    new_state == IB_QPS_RTR  &&
-	    (ibqp->qp_type == IB_QPT_GSI || ibqp->qp_type == IB_QPT_SMI ||
-	     ibqp->qp_type == IB_QPT_UD ||
-	     ibqp->qp_type == IB_QPT_RAW_PACKET)) {
+	    (qp_type == IB_QPT_GSI || qp_type == IB_QPT_SMI ||
+	     qp_type == IB_QPT_UD || qp_type == IB_QPT_RAW_PACKET)) {
 		context->pri_path.sched_queue = (qp->port - 1) << 6;
 		if (qp->mlx4_ib_qp_type == MLX4_IB_QPT_SMI ||
 		    qp->mlx4_ib_qp_type &
@@ -1942,7 +2508,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		}
 	}
 
-	if (qp->ibqp.qp_type == IB_QPT_RAW_PACKET) {
+	if (qp_type == IB_QPT_RAW_PACKET) {
 		context->pri_path.ackto = (context->pri_path.ackto & 0xf8) |
 					MLX4_IB_LINK_TYPE_ETH;
 		if (dev->dev->caps.tunnel_offload_mode ==  MLX4_TUNNEL_OFFLOAD_MODE_VXLAN) {
@@ -1952,7 +2518,7 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		}
 	}
 
-	if (ibqp->qp_type == IB_QPT_UD && (new_state == IB_QPS_RTR)) {
+	if (qp_type == IB_QPT_UD && (new_state == IB_QPS_RTR)) {
 		int is_eth = rdma_port_get_link_layer(
 				&dev->ib_dev, qp->port) ==
 				IB_LINK_LAYER_ETHERNET;
@@ -1962,14 +2528,15 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 		}
 	}
 
-
 	if (cur_state == IB_QPS_RTS && new_state == IB_QPS_SQD	&&
 	    attr_mask & IB_QP_EN_SQD_ASYNC_NOTIFY && attr->en_sqd_async_notify)
 		sqd_event = 1;
 	else
 		sqd_event = 0;
 
-	if (!ibqp->uobject && cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT)
+	if (!ibuobject &&
+	    cur_state == IB_QPS_RESET &&
+	    new_state == IB_QPS_INIT)
 		context->rlkey_roce_mode |= (1 << 4);
 
 	/*
@@ -1978,7 +2545,9 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 	 * headroom is stamped so that the hardware doesn't start
 	 * processing stale work requests.
 	 */
-	if (!ibqp->uobject && cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) {
+	if (!ibuobject &&
+	    cur_state == IB_QPS_RESET &&
+	    new_state == IB_QPS_INIT) {
 		struct mlx4_wqe_ctrl_seg *ctrl;
 		int i;
 
@@ -2035,9 +2604,9 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 	 * entries and reinitialize the QP.
 	 */
 	if (new_state == IB_QPS_RESET) {
-		if (!ibqp->uobject) {
+		if (!ibuobject) {
 			mlx4_ib_cq_clean(recv_cq, qp->mqp.qpn,
-					 ibqp->srq ? to_msrq(ibqp->srq) : NULL);
+					 ibsrq ? to_msrq(ibsrq) : NULL);
 			if (send_cq != recv_cq)
 				mlx4_ib_cq_clean(send_cq, qp->mqp.qpn, NULL);
 
@@ -2148,22 +2717,25 @@ static int __mlx4_ib_modify_qp(struct ib_qp *ibqp,
 	return err;
 }
 
+enum {
+	MLX4_IB_MODIFY_QP_RSS_SUP_ATTR_MSK = (IB_QP_STATE	|
+					      IB_QP_PORT),
+};
+
 static int _mlx4_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 			      int attr_mask, struct ib_udata *udata)
 {
+	enum rdma_link_layer ll = IB_LINK_LAYER_UNSPECIFIED;
 	struct mlx4_ib_dev *dev = to_mdev(ibqp->device);
 	struct mlx4_ib_qp *qp = to_mqp(ibqp);
 	enum ib_qp_state cur_state, new_state;
 	int err = -EINVAL;
-	int ll;
 	mutex_lock(&qp->mutex);
 
 	cur_state = attr_mask & IB_QP_CUR_STATE ? attr->cur_qp_state : qp->state;
 	new_state = attr_mask & IB_QP_STATE ? attr->qp_state : cur_state;
 
-	if (cur_state == new_state && cur_state == IB_QPS_RESET) {
-		ll = IB_LINK_LAYER_UNSPECIFIED;
-	} else {
+	if (cur_state != new_state || cur_state != IB_QPS_RESET) {
 		int port = attr_mask & IB_QP_PORT ? attr->port_num : qp->port;
 		ll = rdma_port_get_link_layer(&dev->ib_dev, port);
 	}
@@ -2178,6 +2750,27 @@ static int _mlx4_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		goto out;
 	}
 
+	if (ibqp->rwq_ind_tbl) {
+		if (!(((cur_state == IB_QPS_RESET) &&
+		       (new_state == IB_QPS_INIT)) ||
+		      ((cur_state == IB_QPS_INIT)  &&
+		       (new_state == IB_QPS_RTR)))) {
+			pr_debug("qpn 0x%x: RSS QP unsupported transition %d to %d\n",
+				 ibqp->qp_num, cur_state, new_state);
+
+			err = -EOPNOTSUPP;
+			goto out;
+		}
+
+		if (attr_mask & ~MLX4_IB_MODIFY_QP_RSS_SUP_ATTR_MSK) {
+			pr_debug("qpn 0x%x: RSS QP unsupported attribute mask 0x%x for transition %d to %d\n",
+				 ibqp->qp_num, attr_mask, cur_state, new_state);
+
+			err = -EOPNOTSUPP;
+			goto out;
+		}
+	}
+
 	if (mlx4_is_bonded(dev->dev) && (attr_mask & IB_QP_PORT)) {
 		if ((cur_state == IB_QPS_RESET) && (new_state == IB_QPS_INIT)) {
 			if ((ibqp->qp_type == IB_QPT_RC) ||
@@ -2242,7 +2835,17 @@ static int _mlx4_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		goto out;
 	}
 
-	err = __mlx4_ib_modify_qp(ibqp, attr, attr_mask, cur_state, new_state);
+	if (ibqp->rwq_ind_tbl && (new_state == IB_QPS_INIT)) {
+		err = bringup_rss_rwqs(ibqp->rwq_ind_tbl, attr->port_num);
+		if (err)
+			goto out;
+	}
+
+	err = __mlx4_ib_modify_qp(ibqp, MLX4_IB_QP_SRC, attr, attr_mask,
+				  cur_state, new_state);
+
+	if (ibqp->rwq_ind_tbl && err)
+		bring_down_rss_rwqs(ibqp->rwq_ind_tbl);
 
 	if (mlx4_is_bonded(dev->dev) && (attr_mask & IB_QP_PORT))
 		attr->port_num = 1;
@@ -3432,6 +4035,9 @@ int mlx4_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, int qp_attr
 	int mlx4_state;
 	int err = 0;
 
+	if (ibqp->rwq_ind_tbl)
+		return -EOPNOTSUPP;
+
 	mutex_lock(&qp->mutex);
 
 	if (qp->state == IB_QPS_RESET) {
@@ -3527,3 +4133,285 @@ int mlx4_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr, int qp_attr
 	return err;
 }
 
+struct ib_wq *mlx4_ib_create_wq(struct ib_pd *pd,
+				struct ib_wq_init_attr *init_attr,
+				struct ib_udata *udata)
+{
+	struct mlx4_ib_dev *dev;
+	struct ib_qp_init_attr ib_qp_init_attr;
+	struct mlx4_ib_qp *qp;
+	struct mlx4_ib_create_wq ucmd;
+	int err, required_cmd_sz;
+
+	if (!(udata && pd->uobject))
+		return ERR_PTR(-EINVAL);
+
+	required_cmd_sz = offsetof(typeof(ucmd), comp_mask) +
+			  sizeof(ucmd.comp_mask);
+	if (udata->inlen < required_cmd_sz) {
+		pr_debug("invalid inlen\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (udata->inlen > sizeof(ucmd) &&
+	    !ib_is_udata_cleared(udata, sizeof(ucmd),
+				 udata->inlen - sizeof(ucmd))) {
+		pr_debug("inlen is not supported\n");
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	if (udata->outlen)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	dev = to_mdev(pd->device);
+
+	if (init_attr->wq_type != IB_WQT_RQ) {
+		pr_debug("unsupported wq type %d\n", init_attr->wq_type);
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	if (init_attr->create_flags) {
+		pr_debug("unsupported create_flags %u\n",
+			 init_attr->create_flags);
+		return ERR_PTR(-EOPNOTSUPP);
+	}
+
+	qp = kzalloc(sizeof(*qp), GFP_KERNEL);
+	if (!qp)
+		return ERR_PTR(-ENOMEM);
+
+	qp->pri.vid = 0xFFFF;
+	qp->alt.vid = 0xFFFF;
+
+	memset(&ib_qp_init_attr, 0, sizeof(ib_qp_init_attr));
+	ib_qp_init_attr.qp_context = init_attr->wq_context;
+	ib_qp_init_attr.qp_type = IB_QPT_RAW_PACKET;
+	ib_qp_init_attr.cap.max_recv_wr = init_attr->max_wr;
+	ib_qp_init_attr.cap.max_recv_sge = init_attr->max_sge;
+	ib_qp_init_attr.recv_cq = init_attr->cq;
+	ib_qp_init_attr.send_cq = ib_qp_init_attr.recv_cq; /* Dummy CQ */
+
+	err = create_qp_common(dev, pd, MLX4_IB_RWQ_SRC, &ib_qp_init_attr,
+			       udata, 0, &qp);
+	if (err) {
+		kfree(qp);
+		return ERR_PTR(err);
+	}
+
+	qp->ibwq.event_handler = init_attr->event_handler;
+	qp->ibwq.wq_num = qp->mqp.qpn;
+	qp->ibwq.state = IB_WQS_RESET;
+
+	return &qp->ibwq;
+}
+
+static int ib_wq2qp_state(enum ib_wq_state state)
+{
+	switch (state) {
+	case IB_WQS_RESET:
+		return IB_QPS_RESET;
+	case IB_WQS_RDY:
+		return IB_QPS_RTR;
+	default:
+		return IB_QPS_ERR;
+	}
+}
+
+static int _mlx4_ib_modify_wq(struct ib_wq *ibwq, enum ib_wq_state new_state)
+{
+	struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
+	enum ib_qp_state qp_cur_state;
+	enum ib_qp_state qp_new_state;
+	int attr_mask;
+	int err;
+
+	/* ib_qp.state represents the WQ HW state while ib_wq.state represents
+	 * the WQ logical state.
+	 */
+	qp_cur_state = qp->state;
+	qp_new_state = ib_wq2qp_state(new_state);
+
+	if (ib_wq2qp_state(new_state) == qp_cur_state)
+		return 0;
+
+	if (new_state == IB_WQS_RDY) {
+		struct ib_qp_attr attr = {};
+
+		attr.port_num = qp->port;
+		attr_mask = IB_QP_PORT;
+
+		err = __mlx4_ib_modify_qp(ibwq, MLX4_IB_RWQ_SRC, &attr,
+					  attr_mask, IB_QPS_RESET, IB_QPS_INIT);
+		if (err) {
+			pr_debug("WQN=0x%06x failed to apply RST->INIT on the HW QP\n",
+				 ibwq->wq_num);
+			return err;
+		}
+
+		qp_cur_state = IB_QPS_INIT;
+	}
+
+	attr_mask = 0;
+	err = __mlx4_ib_modify_qp(ibwq, MLX4_IB_RWQ_SRC, NULL, attr_mask,
+				  qp_cur_state,  qp_new_state);
+
+	if (err && (qp_cur_state == IB_QPS_INIT)) {
+		qp_new_state = IB_QPS_RESET;
+		if (__mlx4_ib_modify_qp(ibwq, MLX4_IB_RWQ_SRC, NULL,
+					attr_mask, IB_QPS_INIT, IB_QPS_RESET)) {
+			pr_warn("WQN=0x%06x failed with reverting HW's resources failure\n",
+				ibwq->wq_num);
+			qp_new_state = IB_QPS_INIT;
+		}
+	}
+
+	qp->state = qp_new_state;
+
+	return err;
+}
+
+int mlx4_ib_modify_wq(struct ib_wq *ibwq, struct ib_wq_attr *wq_attr,
+		      u32 wq_attr_mask, struct ib_udata *udata)
+{
+	struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
+	struct mlx4_ib_modify_wq ucmd = {};
+	size_t required_cmd_sz;
+	enum ib_wq_state cur_state, new_state;
+	int err = 0;
+
+	required_cmd_sz = offsetof(typeof(ucmd), reserved) +
+				   sizeof(ucmd.reserved);
+	if (udata->inlen < required_cmd_sz)
+		return -EINVAL;
+
+	if (udata->inlen > sizeof(ucmd) &&
+	    !ib_is_udata_cleared(udata, sizeof(ucmd),
+				 udata->inlen - sizeof(ucmd)))
+		return -EOPNOTSUPP;
+
+	if (ib_copy_from_udata(&ucmd, udata, min(sizeof(ucmd), udata->inlen)))
+		return -EFAULT;
+
+	if (ucmd.comp_mask || ucmd.reserved)
+		return -EOPNOTSUPP;
+
+	if (wq_attr_mask & IB_WQ_FLAGS)
+		return -EOPNOTSUPP;
+
+	cur_state = wq_attr_mask & IB_WQ_CUR_STATE ? wq_attr->curr_wq_state :
+						     ibwq->state;
+	new_state = wq_attr_mask & IB_WQ_STATE ? wq_attr->wq_state : cur_state;
+
+	if (cur_state  < IB_WQS_RESET || cur_state  > IB_WQS_ERR ||
+	    new_state < IB_WQS_RESET || new_state > IB_WQS_ERR)
+		return -EINVAL;
+
+	if ((new_state == IB_WQS_RDY) && (cur_state == IB_WQS_ERR))
+		return -EINVAL;
+
+	if ((new_state == IB_WQS_ERR) && (cur_state == IB_WQS_RESET))
+		return -EINVAL;
+
+	/* Protect against the parent RSS QP, which may also modify the WQ
+	 * state.
+	 */
+	mutex_lock(&qp->mutex);
+
+	/* The HW state can be updated only if an RSS QP is already associated
+	 * with this WQ, so that its port can be applied to the WQ.
+	 */
+	if (qp->rss_usecnt)
+		err = _mlx4_ib_modify_wq(ibwq, new_state);
+
+	if (!err)
+		ibwq->state = new_state;
+
+	mutex_unlock(&qp->mutex);
+
+	return err;
+}
+
+int mlx4_ib_destroy_wq(struct ib_wq *ibwq)
+{
+	struct mlx4_ib_dev *dev = to_mdev(ibwq->device);
+	struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
+
+	if (qp->counter_index)
+		mlx4_ib_free_qp_counter(dev, qp);
+
+	destroy_qp_common(dev, qp, MLX4_IB_RWQ_SRC, 1);
+
+	kfree(qp);
+
+	return 0;
+}
+
+struct ib_rwq_ind_table
+*mlx4_ib_create_rwq_ind_table(struct ib_device *device,
+			      struct ib_rwq_ind_table_init_attr *init_attr,
+			      struct ib_udata *udata)
+{
+	struct ib_rwq_ind_table *rwq_ind_table;
+	struct mlx4_ib_create_rwq_ind_tbl_resp resp = {};
+	unsigned int ind_tbl_size = 1 << init_attr->log_ind_tbl_size;
+	unsigned int base_wqn;
+	size_t min_resp_len;
+	int i;
+	int err;
+
+	if (udata->inlen > 0 &&
+	    !ib_is_udata_cleared(udata, 0,
+				 udata->inlen))
+		return ERR_PTR(-EOPNOTSUPP);
+
+	min_resp_len = offsetof(typeof(resp), reserved) + sizeof(resp.reserved);
+	if (udata->outlen && udata->outlen < min_resp_len)
+		return ERR_PTR(-EINVAL);
+
+	if (ind_tbl_size >
+	    device->attrs.rss_caps.max_rwq_indirection_table_size) {
+		pr_debug("log_ind_tbl_size = %d is bigger than supported = %d\n",
+			 ind_tbl_size,
+			 device->attrs.rss_caps.max_rwq_indirection_table_size);
+		return ERR_PTR(-EINVAL);
+	}
+
+	base_wqn = init_attr->ind_tbl[0]->wq_num;
+
+	if (base_wqn % ind_tbl_size) {
+		pr_debug("WQN=0x%x isn't aligned with indirection table size\n",
+			 base_wqn);
+		return ERR_PTR(-EINVAL);
+	}
+
+	for (i = 1; i < ind_tbl_size; i++) {
+		if (++base_wqn != init_attr->ind_tbl[i]->wq_num) {
+			pr_debug("indirection table's WQNs aren't consecutive\n");
+			return ERR_PTR(-EINVAL);
+		}
+	}
+
+	rwq_ind_table = kzalloc(sizeof(*rwq_ind_table), GFP_KERNEL);
+	if (!rwq_ind_table)
+		return ERR_PTR(-ENOMEM);
+
+	if (udata->outlen) {
+		resp.response_length = offsetof(typeof(resp), response_length) +
+					sizeof(resp.response_length);
+		err = ib_copy_to_udata(udata, &resp, resp.response_length);
+		if (err)
+			goto err;
+	}
+
+	return rwq_ind_table;
+
+err:
+	kfree(rwq_ind_table);
+	return ERR_PTR(err);
+}
+
+int mlx4_ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *ib_rwq_ind_tbl)
+{
+	kfree(ib_rwq_ind_tbl);
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mlx4/srq.c b/drivers/infiniband/hw/mlx4/srq.c
index 0facaf5..ebee56c 100644
--- a/drivers/infiniband/hw/mlx4/srq.c
+++ b/drivers/infiniband/hw/mlx4/srq.c
@@ -34,7 +34,6 @@
 #include <linux/mlx4/qp.h>
 #include <linux/mlx4/srq.h>
 #include <linux/slab.h>
-#include <linux/vmalloc.h>
 
 #include "mlx4_ib.h"
 #include <rdma/mlx4-abi.h>
@@ -171,20 +170,16 @@ struct ib_srq *mlx4_ib_create_srq(struct ib_pd *pd,
 		if (err)
 			goto err_mtt;
 
-		srq->wrid = kmalloc_array(srq->msrq.max, sizeof(u64),
-					GFP_KERNEL | __GFP_NOWARN);
+		srq->wrid = kvmalloc_array(srq->msrq.max,
+					   sizeof(u64), GFP_KERNEL);
 		if (!srq->wrid) {
-			srq->wrid = __vmalloc(srq->msrq.max * sizeof(u64),
-					      GFP_KERNEL, PAGE_KERNEL);
-			if (!srq->wrid) {
-				err = -ENOMEM;
-				goto err_mtt;
-			}
+			err = -ENOMEM;
+			goto err_mtt;
 		}
 	}
 
-	cqn = (init_attr->srq_type == IB_SRQT_XRC) ?
-		to_mcq(init_attr->ext.xrc.cq)->mcq.cqn : 0;
+	cqn = ib_srq_has_cq(init_attr->srq_type) ?
+		to_mcq(init_attr->ext.cq)->mcq.cqn : 0;
 	xrcdn = (init_attr->srq_type == IB_SRQT_XRC) ?
 		to_mxrcd(init_attr->ext.xrc.xrcd)->xrcdn :
 		(u16) dev->dev->caps.reserved_xrcds;
diff --git a/drivers/infiniband/hw/mlx5/Makefile b/drivers/infiniband/hw/mlx5/Makefile
index 90ad2ad..bc62996 100644
--- a/drivers/infiniband/hw/mlx5/Makefile
+++ b/drivers/infiniband/hw/mlx5/Makefile
@@ -1,4 +1,4 @@
 obj-$(CONFIG_MLX5_INFINIBAND)	+= mlx5_ib.o
 
-mlx5_ib-y :=	main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o gsi.o ib_virt.o cmd.o
+mlx5_ib-y :=	main.o cq.o doorbell.o qp.o mem.o srq.o mr.o ah.o mad.o gsi.o ib_virt.o cmd.o cong.o
 mlx5_ib-$(CONFIG_INFINIBAND_ON_DEMAND_PAGING) += odp.o
diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
index 18d5e1d..470995f 100644
--- a/drivers/infiniband/hw/mlx5/cmd.c
+++ b/drivers/infiniband/hw/mlx5/cmd.c
@@ -57,3 +57,23 @@ int mlx5_cmd_query_cong_counter(struct mlx5_core_dev *dev,
 	MLX5_SET(query_cong_statistics_in, in, clear, reset);
 	return mlx5_cmd_exec(dev, in, sizeof(in), out, out_size);
 }
+
+int mlx5_cmd_query_cong_params(struct mlx5_core_dev *dev, int cong_point,
+			       void *out, int out_size)
+{
+	u32 in[MLX5_ST_SZ_DW(query_cong_params_in)] = { };
+
+	MLX5_SET(query_cong_params_in, in, opcode,
+		 MLX5_CMD_OP_QUERY_CONG_PARAMS);
+	MLX5_SET(query_cong_params_in, in, cong_protocol, cong_point);
+
+	return mlx5_cmd_exec(dev, in, sizeof(in), out, out_size);
+}
+
+int mlx5_cmd_modify_cong_params(struct mlx5_core_dev *dev,
+				void *in, int in_size)
+{
+	u32 out[MLX5_ST_SZ_DW(modify_cong_params_out)] = { };
+
+	return mlx5_cmd_exec(dev, in, in_size, out, sizeof(out));
+}
diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h
index fa09228..af4c245 100644
--- a/drivers/infiniband/hw/mlx5/cmd.h
+++ b/drivers/infiniband/hw/mlx5/cmd.h
@@ -39,4 +39,8 @@
 int mlx5_cmd_null_mkey(struct mlx5_core_dev *dev, u32 *null_mkey);
 int mlx5_cmd_query_cong_counter(struct mlx5_core_dev *dev,
 				bool reset, void *out, int out_size);
+int mlx5_cmd_query_cong_params(struct mlx5_core_dev *dev, int cong_point,
+			       void *out, int out_size);
+int mlx5_cmd_modify_cong_params(struct mlx5_core_dev *mdev,
+				void *in, int in_size);
 #endif /* MLX5_IB_CMD_H */
diff --git a/drivers/infiniband/hw/mlx5/cong.c b/drivers/infiniband/hw/mlx5/cong.c
new file mode 100644
index 0000000..2d32b51
--- /dev/null
+++ b/drivers/infiniband/hw/mlx5/cong.c
@@ -0,0 +1,421 @@
+/*
+ * Copyright (c) 2013-2017, Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#include <linux/debugfs.h>
+
+#include "mlx5_ib.h"
+#include "cmd.h"
+
+enum mlx5_ib_cong_node_type {
+	MLX5_IB_RROCE_ECN_RP = 1,
+	MLX5_IB_RROCE_ECN_NP = 2,
+};
+
+static const char * const mlx5_ib_dbg_cc_name[] = {
+	"rp_clamp_tgt_rate",
+	"rp_clamp_tgt_rate_ati",
+	"rp_time_reset",
+	"rp_byte_reset",
+	"rp_threshold",
+	"rp_ai_rate",
+	"rp_hai_rate",
+	"rp_min_dec_fac",
+	"rp_min_rate",
+	"rp_rate_to_set_on_first_cnp",
+	"rp_dce_tcp_g",
+	"rp_dce_tcp_rtt",
+	"rp_rate_reduce_monitor_period",
+	"rp_initial_alpha_value",
+	"rp_gd",
+	"np_cnp_dscp",
+	"np_cnp_prio_mode",
+	"np_cnp_prio",
+};
+
+#define MLX5_IB_RP_CLAMP_TGT_RATE_ATTR			BIT(1)
+#define MLX5_IB_RP_CLAMP_TGT_RATE_ATI_ATTR		BIT(2)
+#define MLX5_IB_RP_TIME_RESET_ATTR			BIT(3)
+#define MLX5_IB_RP_BYTE_RESET_ATTR			BIT(4)
+#define MLX5_IB_RP_THRESHOLD_ATTR			BIT(5)
+#define MLX5_IB_RP_AI_RATE_ATTR				BIT(7)
+#define MLX5_IB_RP_HAI_RATE_ATTR			BIT(8)
+#define MLX5_IB_RP_MIN_DEC_FAC_ATTR			BIT(9)
+#define MLX5_IB_RP_MIN_RATE_ATTR			BIT(10)
+#define MLX5_IB_RP_RATE_TO_SET_ON_FIRST_CNP_ATTR	BIT(11)
+#define MLX5_IB_RP_DCE_TCP_G_ATTR			BIT(12)
+#define MLX5_IB_RP_DCE_TCP_RTT_ATTR			BIT(13)
+#define MLX5_IB_RP_RATE_REDUCE_MONITOR_PERIOD_ATTR	BIT(14)
+#define MLX5_IB_RP_INITIAL_ALPHA_VALUE_ATTR		BIT(15)
+#define MLX5_IB_RP_GD_ATTR				BIT(16)
+
+#define MLX5_IB_NP_CNP_DSCP_ATTR			BIT(3)
+#define MLX5_IB_NP_CNP_PRIO_MODE_ATTR			BIT(4)
+
+static enum mlx5_ib_cong_node_type
+mlx5_ib_param_to_node(enum mlx5_ib_dbg_cc_types param_offset)
+{
+	if (param_offset >= MLX5_IB_DBG_CC_RP_CLAMP_TGT_RATE &&
+	    param_offset <= MLX5_IB_DBG_CC_RP_GD)
+		return MLX5_IB_RROCE_ECN_RP;
+	else
+		return MLX5_IB_RROCE_ECN_NP;
+}
+
+static u32 mlx5_get_cc_param_val(void *field, int offset)
+{
+	switch (offset) {
+	case MLX5_IB_DBG_CC_RP_CLAMP_TGT_RATE:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				clamp_tgt_rate);
+	case MLX5_IB_DBG_CC_RP_CLAMP_TGT_RATE_ATI:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				clamp_tgt_rate_after_time_inc);
+	case MLX5_IB_DBG_CC_RP_TIME_RESET:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rpg_time_reset);
+	case MLX5_IB_DBG_CC_RP_BYTE_RESET:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rpg_byte_reset);
+	case MLX5_IB_DBG_CC_RP_THRESHOLD:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rpg_threshold);
+	case MLX5_IB_DBG_CC_RP_AI_RATE:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rpg_ai_rate);
+	case MLX5_IB_DBG_CC_RP_HAI_RATE:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rpg_hai_rate);
+	case MLX5_IB_DBG_CC_RP_MIN_DEC_FAC:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rpg_min_dec_fac);
+	case MLX5_IB_DBG_CC_RP_MIN_RATE:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rpg_min_rate);
+	case MLX5_IB_DBG_CC_RP_RATE_TO_SET_ON_FIRST_CNP:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rate_to_set_on_first_cnp);
+	case MLX5_IB_DBG_CC_RP_DCE_TCP_G:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				dce_tcp_g);
+	case MLX5_IB_DBG_CC_RP_DCE_TCP_RTT:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				dce_tcp_rtt);
+	case MLX5_IB_DBG_CC_RP_RATE_REDUCE_MONITOR_PERIOD:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rate_reduce_monitor_period);
+	case MLX5_IB_DBG_CC_RP_INITIAL_ALPHA_VALUE:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				initial_alpha_value);
+	case MLX5_IB_DBG_CC_RP_GD:
+		return MLX5_GET(cong_control_r_roce_ecn_rp, field,
+				rpg_gd);
+	case MLX5_IB_DBG_CC_NP_CNP_DSCP:
+		return MLX5_GET(cong_control_r_roce_ecn_np, field,
+				cnp_dscp);
+	case MLX5_IB_DBG_CC_NP_CNP_PRIO_MODE:
+		return MLX5_GET(cong_control_r_roce_ecn_np, field,
+				cnp_prio_mode);
+	case MLX5_IB_DBG_CC_NP_CNP_PRIO:
+		return MLX5_GET(cong_control_r_roce_ecn_np, field,
+				cnp_802p_prio);
+	default:
+		return 0;
+	}
+}
+
+static void mlx5_ib_set_cc_param_mask_val(void *field, int offset,
+					  u32 var, u32 *attr_mask)
+{
+	switch (offset) {
+	case MLX5_IB_DBG_CC_RP_CLAMP_TGT_RATE:
+		*attr_mask |= MLX5_IB_RP_CLAMP_TGT_RATE_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 clamp_tgt_rate, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_CLAMP_TGT_RATE_ATI:
+		*attr_mask |= MLX5_IB_RP_CLAMP_TGT_RATE_ATI_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 clamp_tgt_rate_after_time_inc, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_TIME_RESET:
+		*attr_mask |= MLX5_IB_RP_TIME_RESET_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rpg_time_reset, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_BYTE_RESET:
+		*attr_mask |= MLX5_IB_RP_BYTE_RESET_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rpg_byte_reset, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_THRESHOLD:
+		*attr_mask |= MLX5_IB_RP_THRESHOLD_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rpg_threshold, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_AI_RATE:
+		*attr_mask |= MLX5_IB_RP_AI_RATE_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rpg_ai_rate, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_HAI_RATE:
+		*attr_mask |= MLX5_IB_RP_HAI_RATE_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rpg_hai_rate, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_MIN_DEC_FAC:
+		*attr_mask |= MLX5_IB_RP_MIN_DEC_FAC_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rpg_min_dec_fac, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_MIN_RATE:
+		*attr_mask |= MLX5_IB_RP_MIN_RATE_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rpg_min_rate, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_RATE_TO_SET_ON_FIRST_CNP:
+		*attr_mask |= MLX5_IB_RP_RATE_TO_SET_ON_FIRST_CNP_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rate_to_set_on_first_cnp, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_DCE_TCP_G:
+		*attr_mask |= MLX5_IB_RP_DCE_TCP_G_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 dce_tcp_g, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_DCE_TCP_RTT:
+		*attr_mask |= MLX5_IB_RP_DCE_TCP_RTT_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 dce_tcp_rtt, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_RATE_REDUCE_MONITOR_PERIOD:
+		*attr_mask |= MLX5_IB_RP_RATE_REDUCE_MONITOR_PERIOD_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rate_reduce_monitor_period, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_INITIAL_ALPHA_VALUE:
+		*attr_mask |= MLX5_IB_RP_INITIAL_ALPHA_VALUE_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 initial_alpha_value, var);
+		break;
+	case MLX5_IB_DBG_CC_RP_GD:
+		*attr_mask |= MLX5_IB_RP_GD_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_rp, field,
+			 rpg_gd, var);
+		break;
+	case MLX5_IB_DBG_CC_NP_CNP_DSCP:
+		*attr_mask |= MLX5_IB_NP_CNP_DSCP_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_np, field, cnp_dscp, var);
+		break;
+	case MLX5_IB_DBG_CC_NP_CNP_PRIO_MODE:
+		*attr_mask |= MLX5_IB_NP_CNP_PRIO_MODE_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_np, field, cnp_prio_mode, var);
+		break;
+	case MLX5_IB_DBG_CC_NP_CNP_PRIO:
+		*attr_mask |= MLX5_IB_NP_CNP_PRIO_MODE_ATTR;
+		MLX5_SET(cong_control_r_roce_ecn_np, field, cnp_prio_mode, 0);
+		MLX5_SET(cong_control_r_roce_ecn_np, field, cnp_802p_prio, var);
+		break;
+	}
+}
+
+static int mlx5_ib_get_cc_params(struct mlx5_ib_dev *dev, int offset, u32 *var)
+{
+	int outlen = MLX5_ST_SZ_BYTES(query_cong_params_out);
+	void *out;
+	void *field;
+	int err;
+	enum mlx5_ib_cong_node_type node;
+
+	out = kvzalloc(outlen, GFP_KERNEL);
+	if (!out)
+		return -ENOMEM;
+
+	node = mlx5_ib_param_to_node(offset);
+
+	err = mlx5_cmd_query_cong_params(dev->mdev, node, out, outlen);
+	if (err)
+		goto free;
+
+	field = MLX5_ADDR_OF(query_cong_params_out, out, congestion_parameters);
+	*var = mlx5_get_cc_param_val(field, offset);
+
+free:
+	kvfree(out);
+	return err;
+}
+
+static int mlx5_ib_set_cc_params(struct mlx5_ib_dev *dev, int offset, u32 var)
+{
+	int inlen = MLX5_ST_SZ_BYTES(modify_cong_params_in);
+	void *in;
+	void *field;
+	enum mlx5_ib_cong_node_type node;
+	u32 attr_mask = 0;
+	int err;
+
+	in = kvzalloc(inlen, GFP_KERNEL);
+	if (!in)
+		return -ENOMEM;
+
+	MLX5_SET(modify_cong_params_in, in, opcode,
+		 MLX5_CMD_OP_MODIFY_CONG_PARAMS);
+
+	node = mlx5_ib_param_to_node(offset);
+	MLX5_SET(modify_cong_params_in, in, cong_protocol, node);
+
+	field = MLX5_ADDR_OF(modify_cong_params_in, in, congestion_parameters);
+	mlx5_ib_set_cc_param_mask_val(field, offset, var, &attr_mask);
+
+	field = MLX5_ADDR_OF(modify_cong_params_in, in, field_select);
+	MLX5_SET(field_select_r_roce_rp, field, field_select_r_roce_rp,
+		 attr_mask);
+
+	err = mlx5_cmd_modify_cong_params(dev->mdev, in, inlen);
+	kvfree(in);
+	return err;
+}
+
+static ssize_t set_param(struct file *filp, const char __user *buf,
+			 size_t count, loff_t *pos)
+{
+	struct mlx5_ib_dbg_param *param = filp->private_data;
+	int offset = param->offset;
+	char lbuf[11] = { };
+	u32 var;
+	int ret;
+
+	if (count > sizeof(lbuf))
+		return -EINVAL;
+
+	if (copy_from_user(lbuf, buf, count))
+		return -EFAULT;
+
+	lbuf[sizeof(lbuf) - 1] = '\0';
+
+	if (kstrtou32(lbuf, 0, &var))
+		return -EINVAL;
+
+	ret = mlx5_ib_set_cc_params(param->dev, offset, var);
+	return ret ? ret : count;
+}
+
+static ssize_t get_param(struct file *filp, char __user *buf, size_t count,
+			 loff_t *pos)
+{
+	struct mlx5_ib_dbg_param *param = filp->private_data;
+	int offset = param->offset;
+	u32 var = 0;
+	int ret;
+	char lbuf[12];
+
+	if (*pos)
+		return 0;
+
+	ret = mlx5_ib_get_cc_params(param->dev, offset, &var);
+	if (ret)
+		return ret;
+
+	ret = snprintf(lbuf, sizeof(lbuf), "%u\n", var);
+	if (ret < 0)
+		return ret;
+
+	if (copy_to_user(buf, lbuf, ret))
+		return -EFAULT;
+
+	*pos += ret;
+	return ret;
+}
+
+static const struct file_operations dbg_cc_fops = {
+	.owner	= THIS_MODULE,
+	.open	= simple_open,
+	.write	= set_param,
+	.read	= get_param,
+};
+
+void mlx5_ib_cleanup_cong_debugfs(struct mlx5_ib_dev *dev)
+{
+	if (!mlx5_debugfs_root ||
+	    !dev->dbg_cc_params ||
+	    !dev->dbg_cc_params->root)
+		return;
+
+	debugfs_remove_recursive(dev->dbg_cc_params->root);
+	kfree(dev->dbg_cc_params);
+	dev->dbg_cc_params = NULL;
+}
+
+int mlx5_ib_init_cong_debugfs(struct mlx5_ib_dev *dev)
+{
+	struct mlx5_ib_dbg_cc_params *dbg_cc_params;
+	int i;
+
+	if (!mlx5_debugfs_root)
+		goto out;
+
+	if (!MLX5_CAP_GEN(dev->mdev, cc_query_allowed) ||
+	    !MLX5_CAP_GEN(dev->mdev, cc_modify_allowed))
+		goto out;
+
+	dbg_cc_params = kzalloc(sizeof(*dbg_cc_params), GFP_KERNEL);
+	if (!dbg_cc_params)
+		goto out;
+
+	dev->dbg_cc_params = dbg_cc_params;
+
+	dbg_cc_params->root = debugfs_create_dir("cc_params",
+						 dev->mdev->priv.dbg_root);
+	if (!dbg_cc_params->root)
+		goto err;
+
+	for (i = 0; i < MLX5_IB_DBG_CC_MAX; i++) {
+		dbg_cc_params->params[i].offset = i;
+		dbg_cc_params->params[i].dev = dev;
+		dbg_cc_params->params[i].dentry =
+			debugfs_create_file(mlx5_ib_dbg_cc_name[i],
+					    0600, dbg_cc_params->root,
+					    &dbg_cc_params->params[i],
+					    &dbg_cc_fops);
+		if (!dbg_cc_params->params[i].dentry)
+			goto err;
+	}
+out:	return 0;
+
+err:
+	mlx5_ib_warn(dev, "cong debugfs failure\n");
+	mlx5_ib_cleanup_cong_debugfs(dev);
+	/*
+	 * We don't want to fail the driver if debugfs fails to initialize,
+	 * so the error is not forwarded to the caller.
+	 */
+	return 0;
+}
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index a384d72..2aa53f4 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -499,7 +499,7 @@ static void mlx5_ib_poll_sw_comp(struct mlx5_ib_cq *cq, int num_entries,
 	struct mlx5_ib_qp *qp;
 
 	*npolled = 0;
-	/* Find uncompleted WQEs belonging to that cq and retrun mmics ones */
+	/* Find uncompleted WQEs belonging to that cq and return mimics ones */
 	list_for_each_entry(qp, &cq->list_send_qp, cq_send_list) {
 		sw_send_comp(qp, num_entries, wc + *npolled, npolled);
 		if (*npolled >= num_entries)
@@ -751,10 +751,8 @@ static int create_cq_user(struct mlx5_ib_dev *dev, struct ib_udata *udata,
 	void *cqc;
 	int err;
 
-	ucmdlen =
-		(udata->inlen - sizeof(struct ib_uverbs_cmd_hdr) <
-		 sizeof(ucmd)) ? (sizeof(ucmd) -
-				  sizeof(ucmd.reserved)) : sizeof(ucmd);
+	ucmdlen = udata->inlen < sizeof(ucmd) ?
+		  (sizeof(ucmd) - sizeof(ucmd.reserved)) : sizeof(ucmd);
 
 	if (ib_copy_from_udata(&ucmd, udata, ucmdlen))
 		return -EFAULT;
diff --git a/drivers/infiniband/hw/mlx5/ib_virt.c b/drivers/infiniband/hw/mlx5/ib_virt.c
index c1b9de8..649a336 100644
--- a/drivers/infiniband/hw/mlx5/ib_virt.c
+++ b/drivers/infiniband/hw/mlx5/ib_virt.c
@@ -96,6 +96,7 @@ int mlx5_ib_set_vf_link_state(struct ib_device *device, int vf,
 	struct mlx5_ib_dev *dev = to_mdev(device);
 	struct mlx5_core_dev *mdev = dev->mdev;
 	struct mlx5_hca_vport_context *in;
+	struct mlx5_vf_context *vfs_ctx = mdev->priv.sriov.vfs_ctx;
 	int err;
 
 	in = kzalloc(sizeof(*in), GFP_KERNEL);
@@ -109,6 +110,8 @@ int mlx5_ib_set_vf_link_state(struct ib_device *device, int vf,
 	}
 	in->field_select = MLX5_HCA_VPORT_SEL_STATE_POLICY;
 	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
+	if (!err)
+		vfs_ctx[vf].policy = in->policy;
 
 out:
 	kfree(in);
@@ -151,6 +154,7 @@ static int set_vf_node_guid(struct ib_device *device, int vf, u8 port, u64 guid)
 	struct mlx5_ib_dev *dev = to_mdev(device);
 	struct mlx5_core_dev *mdev = dev->mdev;
 	struct mlx5_hca_vport_context *in;
+	struct mlx5_vf_context *vfs_ctx = mdev->priv.sriov.vfs_ctx;
 	int err;
 
 	in = kzalloc(sizeof(*in), GFP_KERNEL);
@@ -160,6 +164,8 @@ static int set_vf_node_guid(struct ib_device *device, int vf, u8 port, u64 guid)
 	in->field_select = MLX5_HCA_VPORT_SEL_NODE_GUID;
 	in->node_guid = guid;
 	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
+	if (!err)
+		vfs_ctx[vf].node_guid = guid;
 	kfree(in);
 	return err;
 }
@@ -169,6 +175,7 @@ static int set_vf_port_guid(struct ib_device *device, int vf, u8 port, u64 guid)
 	struct mlx5_ib_dev *dev = to_mdev(device);
 	struct mlx5_core_dev *mdev = dev->mdev;
 	struct mlx5_hca_vport_context *in;
+	struct mlx5_vf_context *vfs_ctx = mdev->priv.sriov.vfs_ctx;
 	int err;
 
 	in = kzalloc(sizeof(*in), GFP_KERNEL);
@@ -178,6 +185,8 @@ static int set_vf_port_guid(struct ib_device *device, int vf, u8 port, u64 guid)
 	in->field_select = MLX5_HCA_VPORT_SEL_PORT_GUID;
 	in->port_guid = guid;
 	err = mlx5_core_modify_hca_vport_context(mdev, 1, 1, vf + 1, in);
+	if (!err)
+		vfs_ctx[vf].port_guid = guid;
 	kfree(in);
 	return err;
 }
diff --git a/drivers/infiniband/hw/mlx5/mad.c b/drivers/infiniband/hw/mlx5/mad.c
index 95db929..1003b01 100644
--- a/drivers/infiniband/hw/mlx5/mad.c
+++ b/drivers/infiniband/hw/mlx5/mad.c
@@ -78,7 +78,7 @@ static int process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 	u16 slid;
 	int err;
 
-	slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+	slid = in_wc ? ib_lid_cpu16(in_wc->slid) : be16_to_cpu(IB_LID_PERMISSIVE);
 
 	if (in_mad->mad_hdr.method == IB_MGMT_METHOD_TRAP && slid == 0)
 		return IB_MAD_RESULT_SUCCESS | IB_MAD_RESULT_CONSUMED;
@@ -204,7 +204,7 @@ static int process_pma_cmd(struct ib_device *ibdev, u8 port_num,
 	int err;
 	void *out_cnt;
 
-	/* Decalring support of extended counters */
+	/* Declaring support of extended counters */
 	if (in_mad->mad_hdr.attr_id == IB_PMA_CLASS_PORT_INFO) {
 		struct ib_class_port_info cpi = {};
 
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index f7fcde1..ab3c562 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -30,6 +30,7 @@
  * SOFTWARE.
  */
 
+#include <linux/debugfs.h>
 #include <linux/highmem.h>
 #include <linux/module.h>
 #include <linux/init.h>
@@ -58,6 +59,7 @@
 #include <linux/mlx5/vport.h>
 #include "mlx5_ib.h"
 #include "cmd.h"
+#include <linux/mlx5/vport.h>
 
 #define DRIVER_NAME "mlx5_ib"
 #define DRIVER_VERSION "5.0-0"
@@ -65,7 +67,6 @@
 MODULE_AUTHOR("Eli Cohen <eli@mellanox.com>");
 MODULE_DESCRIPTION("Mellanox Connect-IB HCA IB driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRIVER_VERSION);
 
 static char mlx5_version[] =
 	DRIVER_NAME ": Mellanox Connect-IB Infiniband driver v"
@@ -97,6 +98,20 @@ mlx5_ib_port_link_layer(struct ib_device *device, u8 port_num)
 	return mlx5_port_type_cap_to_rdma_ll(port_type_cap);
 }
 
+static int get_port_state(struct ib_device *ibdev,
+			  u8 port_num,
+			  enum ib_port_state *state)
+{
+	struct ib_port_attr attr;
+	int ret;
+
+	memset(&attr, 0, sizeof(attr));
+	ret = mlx5_ib_query_port(ibdev, port_num, &attr);
+	if (!ret)
+		*state = attr.state;
+	return ret;
+}
+
 static int mlx5_netdev_event(struct notifier_block *this,
 			     unsigned long event, void *ptr)
 {
@@ -114,6 +129,7 @@ static int mlx5_netdev_event(struct notifier_block *this,
 		write_unlock(&ibdev->roce.netdev_lock);
 		break;
 
+	case NETDEV_CHANGE:
 	case NETDEV_UP:
 	case NETDEV_DOWN: {
 		struct net_device *lag_ndev = mlx5_lag_get_roce_netdev(ibdev->mdev);
@@ -127,10 +143,23 @@ static int mlx5_netdev_event(struct notifier_block *this,
 		if ((upper == ndev || (!upper && ndev == ibdev->roce.netdev))
 		    && ibdev->ib_active) {
 			struct ib_event ibev = { };
+			enum ib_port_state port_state;
 
+			if (get_port_state(&ibdev->ib_dev, 1, &port_state))
+				return NOTIFY_DONE;
+
+			if (ibdev->roce.last_port_state == port_state)
+				return NOTIFY_DONE;
+
+			ibdev->roce.last_port_state = port_state;
 			ibev.device = &ibdev->ib_dev;
-			ibev.event = (event == NETDEV_UP) ?
-				     IB_EVENT_PORT_ACTIVE : IB_EVENT_PORT_ERR;
+			if (port_state == IB_PORT_DOWN)
+				ibev.event = IB_EVENT_PORT_ERR;
+			else if (port_state == IB_PORT_ACTIVE)
+				ibev.event = IB_EVENT_PORT_ACTIVE;
+			else
+				return NOTIFY_DONE;
+
 			ibev.element.port_num = 1;
 			ib_dispatch_event(&ibev);
 		}
@@ -668,6 +697,14 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 		props->device_cap_flags |= IB_DEVICE_UD_TSO;
 	}
 
+	if (MLX5_CAP_GEN(dev->mdev, rq_delay_drop) &&
+	    MLX5_CAP_GEN(dev->mdev, general_notification_event))
+		props->raw_packet_caps |= IB_RAW_PACKET_CAP_DELAY_DROP;
+
+	if (MLX5_CAP_GEN(mdev, ipoib_enhanced_offloads) &&
+	    MLX5_CAP_IPOIB_ENHANCED(mdev, csum_cap))
+		props->device_cap_flags |= IB_DEVICE_UD_IP_CSUM;
+
 	if (MLX5_CAP_GEN(dev->mdev, eth_net_offloads) &&
 	    MLX5_CAP_ETH(dev->mdev, scatter_fcs)) {
 		/* Legacy bit to support old userspace libraries */
@@ -740,6 +777,16 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 			1 << MLX5_CAP_GEN(dev->mdev, log_max_rq);
 	}
 
+	if (MLX5_CAP_GEN(mdev, tag_matching)) {
+		props->xrq_caps.max_rndv_hdr_size = MLX5_TM_MAX_RNDV_MSG_SIZE;
+		props->xrq_caps.max_num_tags =
+			(1 << MLX5_CAP_GEN(mdev, log_tag_matching_list_sz)) - 1;
+		props->xrq_caps.flags = IB_TM_CAP_RC;
+		props->xrq_caps.max_ops =
+			1 << MLX5_CAP_GEN(mdev, log_max_qp_sz);
+		props->xrq_caps.max_sge = MLX5_TM_MAX_SGE;
+	}
+
 	if (field_avail(typeof(resp), cqe_comp_caps, uhw->outlen)) {
 		resp.cqe_comp_caps.max_num =
 			MLX5_CAP_GEN(dev->mdev, cqe_compression) ?
@@ -765,8 +812,14 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 
 	if (field_avail(typeof(resp), mlx5_ib_support_multi_pkt_send_wqes,
 			uhw->outlen)) {
-		resp.mlx5_ib_support_multi_pkt_send_wqes =
-			MLX5_CAP_ETH(mdev, multi_pkt_send_wqe);
+		if (MLX5_CAP_ETH(mdev, multi_pkt_send_wqe))
+			resp.mlx5_ib_support_multi_pkt_send_wqes =
+				MLX5_IB_ALLOW_MPW;
+
+		if (MLX5_CAP_ETH(mdev, enhanced_multi_pkt_send_wqe))
+			resp.mlx5_ib_support_multi_pkt_send_wqes |=
+				MLX5_IB_SUPPORT_EMPW;
+
 		resp.response_length +=
 			sizeof(resp.mlx5_ib_support_multi_pkt_send_wqes);
 	}
@@ -774,6 +827,27 @@ static int mlx5_ib_query_device(struct ib_device *ibdev,
 	if (field_avail(typeof(resp), reserved, uhw->outlen))
 		resp.response_length += sizeof(resp.reserved);
 
+	if (field_avail(typeof(resp), sw_parsing_caps,
+			uhw->outlen)) {
+		resp.response_length += sizeof(resp.sw_parsing_caps);
+		if (MLX5_CAP_ETH(mdev, swp)) {
+			resp.sw_parsing_caps.sw_parsing_offloads |=
+				MLX5_IB_SW_PARSING;
+
+			if (MLX5_CAP_ETH(mdev, swp_csum))
+				resp.sw_parsing_caps.sw_parsing_offloads |=
+					MLX5_IB_SW_PARSING_CSUM;
+
+			if (MLX5_CAP_ETH(mdev, swp_lso))
+				resp.sw_parsing_caps.sw_parsing_offloads |=
+					MLX5_IB_SW_PARSING_LSO;
+
+			if (resp.sw_parsing_caps.sw_parsing_offloads)
+				resp.sw_parsing_caps.supported_qpts =
+					BIT(IB_QPT_RAW_PACKET);
+		}
+	}
+
 	if (uhw->outlen) {
 		err = ib_copy_to_udata(uhw, &resp, resp.response_length);
 
@@ -1144,7 +1218,7 @@ static int calc_total_bfregs(struct mlx5_ib_dev *dev, bool lib_uar_4k,
 	if (req->num_low_latency_bfregs > req->total_num_bfregs - 1)
 		return -EINVAL;
 
-	mlx5_ib_dbg(dev, "uar_4k: fw support %s, lib support %s, user requested %d bfregs, alloated %d, using %d sys pages\n",
+	mlx5_ib_dbg(dev, "uar_4k: fw support %s, lib support %s, user requested %d bfregs, allocated %d, using %d sys pages\n",
 		    MLX5_CAP_GEN(dev->mdev, uar_4k) ? "yes" : "no",
 		    lib_uar_4k ? "yes" : "no", ref_bfregs,
 		    req->total_num_bfregs, *num_sys_pages);
@@ -1193,6 +1267,45 @@ static int deallocate_uars(struct mlx5_ib_dev *dev, struct mlx5_ib_ucontext *con
 	return 0;
 }
 
+static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn)
+{
+	int err;
+
+	err = mlx5_core_alloc_transport_domain(dev->mdev, tdn);
+	if (err)
+		return err;
+
+	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
+	    !MLX5_CAP_GEN(dev->mdev, disable_local_lb))
+		return err;
+
+	mutex_lock(&dev->lb_mutex);
+	dev->user_td++;
+
+	if (dev->user_td == 2)
+		err = mlx5_nic_vport_update_local_lb(dev->mdev, true);
+
+	mutex_unlock(&dev->lb_mutex);
+	return err;
+}
+
+static void mlx5_ib_dealloc_transport_domain(struct mlx5_ib_dev *dev, u32 tdn)
+{
+	mlx5_core_dealloc_transport_domain(dev->mdev, tdn);
+
+	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
+	    !MLX5_CAP_GEN(dev->mdev, disable_local_lb))
+		return;
+
+	mutex_lock(&dev->lb_mutex);
+	dev->user_td--;
+
+	if (dev->user_td < 2)
+		mlx5_nic_vport_update_local_lb(dev->mdev, false);
+
+	mutex_unlock(&dev->lb_mutex);
+}
+
 static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 						  struct ib_udata *udata)
 {
@@ -1203,7 +1316,6 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 	struct mlx5_bfreg_info *bfregi;
 	int ver;
 	int err;
-	size_t reqlen;
 	size_t min_req_v2 = offsetof(struct mlx5_ib_alloc_ucontext_req_v2,
 				     max_cqe_version);
 	bool lib_uar_4k;
@@ -1211,18 +1323,14 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 	if (!dev->ib_active)
 		return ERR_PTR(-EAGAIN);
 
-	if (udata->inlen < sizeof(struct ib_uverbs_cmd_hdr))
-		return ERR_PTR(-EINVAL);
-
-	reqlen = udata->inlen - sizeof(struct ib_uverbs_cmd_hdr);
-	if (reqlen == sizeof(struct mlx5_ib_alloc_ucontext_req))
+	if (udata->inlen == sizeof(struct mlx5_ib_alloc_ucontext_req))
 		ver = 0;
-	else if (reqlen >= min_req_v2)
+	else if (udata->inlen >= min_req_v2)
 		ver = 2;
 	else
 		return ERR_PTR(-EINVAL);
 
-	err = ib_copy_from_udata(&req, udata, min(reqlen, sizeof(req)));
+	err = ib_copy_from_udata(&req, udata, min(udata->inlen, sizeof(req)));
 	if (err)
 		return ERR_PTR(err);
 
@@ -1301,8 +1409,7 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 	mutex_init(&context->upd_xlt_page_mutex);
 
 	if (MLX5_CAP_GEN(dev->mdev, log_max_transport_domain)) {
-		err = mlx5_core_alloc_transport_domain(dev->mdev,
-						       &context->tdn);
+		err = mlx5_ib_alloc_transport_domain(dev, &context->tdn);
 		if (err)
 			goto out_page;
 	}
@@ -1368,7 +1475,7 @@ static struct ib_ucontext *mlx5_ib_alloc_ucontext(struct ib_device *ibdev,
 
 out_td:
 	if (MLX5_CAP_GEN(dev->mdev, log_max_transport_domain))
-		mlx5_core_dealloc_transport_domain(dev->mdev, context->tdn);
+		mlx5_ib_dealloc_transport_domain(dev, context->tdn);
 
 out_page:
 	free_page(context->upd_xlt_page);
@@ -1396,7 +1503,7 @@ static int mlx5_ib_dealloc_ucontext(struct ib_ucontext *ibcontext)
 
 	bfregi = &context->bfregi;
 	if (MLX5_CAP_GEN(dev->mdev, log_max_transport_domain))
-		mlx5_core_dealloc_transport_domain(dev->mdev, context->tdn);
+		mlx5_ib_dealloc_transport_domain(dev, context->tdn);
 
 	free_page(context->upd_xlt_page);
 	deallocate_uars(dev, context);
@@ -2034,23 +2141,34 @@ static int parse_flow_attr(struct mlx5_core_dev *mdev, u32 *match_c,
  * it won't fall into the multicast flow steering table and this rule
  * could steal other multicast packets.
  */
-static bool flow_is_multicast_only(struct ib_flow_attr *ib_attr)
+static bool flow_is_multicast_only(const struct ib_flow_attr *ib_attr)
 {
-	struct ib_flow_spec_eth *eth_spec;
+	union ib_flow_spec *flow_spec;
 
 	if (ib_attr->type != IB_FLOW_ATTR_NORMAL ||
-	    ib_attr->size < sizeof(struct ib_flow_attr) +
-	    sizeof(struct ib_flow_spec_eth) ||
 	    ib_attr->num_of_specs < 1)
 		return false;
 
-	eth_spec = (struct ib_flow_spec_eth *)(ib_attr + 1);
-	if (eth_spec->type != IB_FLOW_SPEC_ETH ||
-	    eth_spec->size != sizeof(*eth_spec))
-		return false;
+	flow_spec = (union ib_flow_spec *)(ib_attr + 1);
+	if (flow_spec->type == IB_FLOW_SPEC_IPV4) {
+		struct ib_flow_spec_ipv4 *ipv4_spec;
 
-	return is_multicast_ether_addr(eth_spec->mask.dst_mac) &&
-	       is_multicast_ether_addr(eth_spec->val.dst_mac);
+		ipv4_spec = (struct ib_flow_spec_ipv4 *)flow_spec;
+		if (ipv4_is_multicast(ipv4_spec->val.dst_ip))
+			return true;
+
+		return false;
+	}
+
+	if (flow_spec->type == IB_FLOW_SPEC_ETH) {
+		struct ib_flow_spec_eth *eth_spec;
+
+		eth_spec = (struct ib_flow_spec_eth *)flow_spec;
+		return is_multicast_ether_addr(eth_spec->mask.dst_mac) &&
+		       is_multicast_ether_addr(eth_spec->val.dst_mac);
+	}
+
+	return false;
 }
 
 static bool is_valid_ethertype(struct mlx5_core_dev *mdev,
@@ -2235,10 +2353,31 @@ static struct mlx5_ib_flow_prio *get_flow_table(struct mlx5_ib_dev *dev,
 	return err ? ERR_PTR(err) : prio;
 }
 
-static struct mlx5_ib_flow_handler *create_flow_rule(struct mlx5_ib_dev *dev,
-						     struct mlx5_ib_flow_prio *ft_prio,
-						     const struct ib_flow_attr *flow_attr,
-						     struct mlx5_flow_destination *dst)
+static void set_underlay_qp(struct mlx5_ib_dev *dev,
+			    struct mlx5_flow_spec *spec,
+			    u32 underlay_qpn)
+{
+	void *misc_params_c = MLX5_ADDR_OF(fte_match_param,
+					   spec->match_criteria,
+					   misc_parameters);
+	void *misc_params_v = MLX5_ADDR_OF(fte_match_param, spec->match_value,
+					   misc_parameters);
+
+	if (underlay_qpn &&
+	    MLX5_CAP_FLOWTABLE_NIC_RX(dev->mdev,
+				      ft_field_support.bth_dst_qp)) {
+		MLX5_SET(fte_match_set_misc,
+			 misc_params_v, bth_dst_qp, underlay_qpn);
+		MLX5_SET(fte_match_set_misc,
+			 misc_params_c, bth_dst_qp, 0xffffff);
+	}
+}
+
+static struct mlx5_ib_flow_handler *_create_flow_rule(struct mlx5_ib_dev *dev,
+						      struct mlx5_ib_flow_prio *ft_prio,
+						      const struct ib_flow_attr *flow_attr,
+						      struct mlx5_flow_destination *dst,
+						      u32 underlay_qpn)
 {
 	struct mlx5_flow_table	*ft = ft_prio->flow_table;
 	struct mlx5_ib_flow_handler *handler;
@@ -2274,6 +2413,9 @@ static struct mlx5_ib_flow_handler *create_flow_rule(struct mlx5_ib_dev *dev,
 		ib_flow += ((union ib_flow_spec *)ib_flow)->size;
 	}
 
+	if (!flow_is_multicast_only(flow_attr))
+		set_underlay_qp(dev, spec, underlay_qpn);
+
 	spec->match_criteria_enable = get_match_criteria_enable(spec->match_criteria);
 	if (is_drop) {
 		flow_act.action = MLX5_FLOW_CONTEXT_ACTION_DROP;
@@ -2313,6 +2455,14 @@ static struct mlx5_ib_flow_handler *create_flow_rule(struct mlx5_ib_dev *dev,
 	return err ? ERR_PTR(err) : handler;
 }
 
+static struct mlx5_ib_flow_handler *create_flow_rule(struct mlx5_ib_dev *dev,
+						     struct mlx5_ib_flow_prio *ft_prio,
+						     const struct ib_flow_attr *flow_attr,
+						     struct mlx5_flow_destination *dst)
+{
+	return _create_flow_rule(dev, ft_prio, flow_attr, dst, 0);
+}
+
 static struct mlx5_ib_flow_handler *create_dont_trap_rule(struct mlx5_ib_dev *dev,
 							  struct mlx5_ib_flow_prio *ft_prio,
 							  struct ib_flow_attr *flow_attr,
@@ -2449,6 +2599,7 @@ static struct ib_flow *mlx5_ib_create_flow(struct ib_qp *qp,
 	struct mlx5_ib_flow_prio *ft_prio_tx = NULL;
 	struct mlx5_ib_flow_prio *ft_prio;
 	int err;
+	u32 underlay_qpn;
 
 	if (flow_attr->priority > MLX5_IB_FLOW_LAST_PRIO)
 		return ERR_PTR(-ENOMEM);
@@ -2489,8 +2640,10 @@ static struct ib_flow *mlx5_ib_create_flow(struct ib_qp *qp,
 			handler = create_dont_trap_rule(dev, ft_prio,
 							flow_attr, dst);
 		} else {
-			handler = create_flow_rule(dev, ft_prio, flow_attr,
-						   dst);
+			underlay_qpn = (mqp->flags & MLX5_IB_QP_UNDERLAY) ?
+					mqp->underlay_qpn : 0;
+			handler = _create_flow_rule(dev, ft_prio, flow_attr,
+						    dst, underlay_qpn);
 		}
 	} else if (flow_attr->type == IB_FLOW_ATTR_ALL_DEFAULT ||
 		   flow_attr->type == IB_FLOW_ATTR_MC_DEFAULT) {
@@ -2528,8 +2681,14 @@ static struct ib_flow *mlx5_ib_create_flow(struct ib_qp *qp,
 static int mlx5_ib_mcg_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
 {
 	struct mlx5_ib_dev *dev = to_mdev(ibqp->device);
+	struct mlx5_ib_qp *mqp = to_mqp(ibqp);
 	int err;
 
+	if (mqp->flags & MLX5_IB_QP_UNDERLAY) {
+		mlx5_ib_dbg(dev, "Attaching a multicast group to underlay QP is not supported\n");
+		return -EOPNOTSUPP;
+	}
+
 	err = mlx5_core_attach_mcg(dev->mdev, gid, ibqp->qp_num);
 	if (err)
 		mlx5_ib_warn(dev, "failed attaching QPN 0x%x, MGID %pI6\n",
@@ -2691,6 +2850,26 @@ static void mlx5_ib_handle_internal_error(struct mlx5_ib_dev *ibdev)
 	spin_unlock_irqrestore(&ibdev->reset_flow_resource_lock, flags);
 }
 
+static void delay_drop_handler(struct work_struct *work)
+{
+	int err;
+	struct mlx5_ib_delay_drop *delay_drop =
+		container_of(work, struct mlx5_ib_delay_drop,
+			     delay_drop_work);
+
+	atomic_inc(&delay_drop->events_cnt);
+
+	mutex_lock(&delay_drop->lock);
+	err = mlx5_core_set_delay_drop(delay_drop->dev->mdev,
+				       delay_drop->timeout);
+	if (err) {
+		mlx5_ib_warn(delay_drop->dev, "Failed to set delay drop, timeout=%u\n",
+			     delay_drop->timeout);
+		delay_drop->activate = false;
+	}
+	mutex_unlock(&delay_drop->lock);
+}
+
 static void mlx5_ib_event(struct mlx5_core_dev *dev, void *context,
 			  enum mlx5_dev_event event, unsigned long param)
 {
@@ -2743,8 +2922,11 @@ static void mlx5_ib_event(struct mlx5_core_dev *dev, void *context,
 		ibev.event = IB_EVENT_CLIENT_REREGISTER;
 		port = (u8)param;
 		break;
+	case MLX5_DEV_EVENT_DELAY_DROP_TIMEOUT:
+		schedule_work(&ibdev->delay_drop.delay_drop_work);
+		goto out;
 	default:
-		return;
+		goto out;
 	}
 
 	ibev.device	      = &ibdev->ib_dev;
@@ -2752,7 +2934,7 @@ static void mlx5_ib_event(struct mlx5_core_dev *dev, void *context,
 
 	if (port < 1 || port > ibdev->num_ports) {
 		mlx5_ib_warn(ibdev, "warning: event on port %d\n", port);
-		return;
+		goto out;
 	}
 
 	if (ibdev->ib_active)
@@ -2760,6 +2942,9 @@ static void mlx5_ib_event(struct mlx5_core_dev *dev, void *context,
 
 	if (fatal)
 		ibdev->ib_active = false;
+
+out:
+	return;
 }
 
 static int set_has_smi_cap(struct mlx5_ib_dev *dev)
@@ -3042,7 +3227,7 @@ static int create_dev_resources(struct mlx5_ib_resources *devr)
 	attr.attr.max_sge = 1;
 	attr.attr.max_wr = 1;
 	attr.srq_type = IB_SRQT_XRC;
-	attr.ext.xrc.cq = devr->c0;
+	attr.ext.cq = devr->c0;
 	attr.ext.xrc.xrcd = devr->x0;
 
 	devr->s0 = mlx5_ib_create_srq(devr->p0, &attr, NULL);
@@ -3057,9 +3242,9 @@ static int create_dev_resources(struct mlx5_ib_resources *devr)
 	devr->s0->srq_context   = NULL;
 	devr->s0->srq_type      = IB_SRQT_XRC;
 	devr->s0->ext.xrc.xrcd	= devr->x0;
-	devr->s0->ext.xrc.cq	= devr->c0;
+	devr->s0->ext.cq	= devr->c0;
 	atomic_inc(&devr->s0->ext.xrc.xrcd->usecnt);
-	atomic_inc(&devr->s0->ext.xrc.cq->usecnt);
+	atomic_inc(&devr->s0->ext.cq->usecnt);
 	atomic_inc(&devr->p0->usecnt);
 	atomic_set(&devr->s0->usecnt, 0);
 
@@ -3078,9 +3263,9 @@ static int create_dev_resources(struct mlx5_ib_resources *devr)
 	devr->s1->event_handler = NULL;
 	devr->s1->srq_context   = NULL;
 	devr->s1->srq_type      = IB_SRQT_BASIC;
-	devr->s1->ext.xrc.cq	= devr->c0;
+	devr->s1->ext.cq	= devr->c0;
 	atomic_inc(&devr->p0->usecnt);
-	atomic_set(&devr->s0->usecnt, 0);
+	atomic_set(&devr->s1->usecnt, 0);
 
 	for (port = 0; port < ARRAY_SIZE(devr->ports); ++port) {
 		INIT_WORK(&devr->ports[port].pkey_change_work,
@@ -3173,13 +3358,13 @@ static int mlx5_port_immutable(struct ib_device *ibdev, u8 port_num,
 	return 0;
 }
 
-static void get_dev_fw_str(struct ib_device *ibdev, char *str,
-			   size_t str_len)
+static void get_dev_fw_str(struct ib_device *ibdev, char *str)
 {
 	struct mlx5_ib_dev *dev =
 		container_of(ibdev, struct mlx5_ib_dev, ib_dev);
-	snprintf(str, str_len, "%d.%d.%04d", fw_rev_maj(dev->mdev),
-		       fw_rev_min(dev->mdev), fw_rev_sub(dev->mdev));
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%d.%d.%04d",
+		 fw_rev_maj(dev->mdev), fw_rev_min(dev->mdev),
+		 fw_rev_sub(dev->mdev));
 }
 
 static int mlx5_eth_lag_init(struct mlx5_ib_dev *dev)
@@ -3319,6 +3504,17 @@ static const struct mlx5_ib_counter cong_cnts[] = {
 	INIT_CONG_COUNTER(np_cnp_sent),
 };
 
+static const struct mlx5_ib_counter extended_err_cnts[] = {
+	INIT_Q_COUNTER(resp_local_length_error),
+	INIT_Q_COUNTER(resp_cqe_error),
+	INIT_Q_COUNTER(req_cqe_error),
+	INIT_Q_COUNTER(req_remote_invalid_request),
+	INIT_Q_COUNTER(req_remote_access_errors),
+	INIT_Q_COUNTER(resp_remote_access_errors),
+	INIT_Q_COUNTER(resp_cqe_flush_error),
+	INIT_Q_COUNTER(req_cqe_flush_error),
+};
+
 static void mlx5_ib_dealloc_counters(struct mlx5_ib_dev *dev)
 {
 	unsigned int i;
@@ -3343,6 +3539,10 @@ static int __mlx5_ib_alloc_counters(struct mlx5_ib_dev *dev,
 
 	if (MLX5_CAP_GEN(dev->mdev, retransmission_q_counters))
 		num_counters += ARRAY_SIZE(retrans_q_cnts);
+
+	if (MLX5_CAP_GEN(dev->mdev, enhanced_error_q_counters))
+		num_counters += ARRAY_SIZE(extended_err_cnts);
+
 	cnts->num_q_counters = num_counters;
 
 	if (MLX5_CAP_GEN(dev->mdev, cc_query_allowed)) {
@@ -3392,6 +3592,13 @@ static void mlx5_ib_fill_counters(struct mlx5_ib_dev *dev,
 		}
 	}
 
+	if (MLX5_CAP_GEN(dev->mdev, enhanced_error_q_counters)) {
+		for (i = 0; i < ARRAY_SIZE(extended_err_cnts); i++, j++) {
+			names[j] = extended_err_cnts[i].name;
+			offsets[j] = extended_err_cnts[i].offset;
+		}
+	}
+
 	if (MLX5_CAP_GEN(dev->mdev, cc_query_allowed)) {
 		for (i = 0; i < ARRAY_SIZE(cong_cnts); i++, j++) {
 			names[j] = cong_cnts[i].name;
@@ -3562,6 +3769,136 @@ mlx5_ib_alloc_rdma_netdev(struct ib_device *hca,
 	return netdev;
 }
 
+static void delay_drop_debugfs_cleanup(struct mlx5_ib_dev *dev)
+{
+	if (!dev->delay_drop.dbg)
+		return;
+	debugfs_remove_recursive(dev->delay_drop.dbg->dir_debugfs);
+	kfree(dev->delay_drop.dbg);
+	dev->delay_drop.dbg = NULL;
+}
+
+static void cancel_delay_drop(struct mlx5_ib_dev *dev)
+{
+	if (!(dev->ib_dev.attrs.raw_packet_caps & IB_RAW_PACKET_CAP_DELAY_DROP))
+		return;
+
+	cancel_work_sync(&dev->delay_drop.delay_drop_work);
+	delay_drop_debugfs_cleanup(dev);
+}
+
+static ssize_t delay_drop_timeout_read(struct file *filp, char __user *buf,
+				       size_t count, loff_t *pos)
+{
+	struct mlx5_ib_delay_drop *delay_drop = filp->private_data;
+	char lbuf[20];
+	int len;
+
+	len = snprintf(lbuf, sizeof(lbuf), "%u\n", delay_drop->timeout);
+	return simple_read_from_buffer(buf, count, pos, lbuf, len);
+}
+
+static ssize_t delay_drop_timeout_write(struct file *filp, const char __user *buf,
+					size_t count, loff_t *pos)
+{
+	struct mlx5_ib_delay_drop *delay_drop = filp->private_data;
+	u32 timeout;
+	u32 var;
+
+	if (kstrtouint_from_user(buf, count, 0, &var))
+		return -EFAULT;
+
+	timeout = min_t(u32, roundup(var, 100), MLX5_MAX_DELAY_DROP_TIMEOUT_MS *
+			1000);
+	if (timeout != var)
+		mlx5_ib_dbg(delay_drop->dev, "Round delay drop timeout to %u usec\n",
+			    timeout);
+
+	delay_drop->timeout = timeout;
+
+	return count;
+}
+
+static const struct file_operations fops_delay_drop_timeout = {
+	.owner	= THIS_MODULE,
+	.open	= simple_open,
+	.write	= delay_drop_timeout_write,
+	.read	= delay_drop_timeout_read,
+};
+
+static int delay_drop_debugfs_init(struct mlx5_ib_dev *dev)
+{
+	struct mlx5_ib_dbg_delay_drop *dbg;
+
+	if (!mlx5_debugfs_root)
+		return 0;
+
+	dbg = kzalloc(sizeof(*dbg), GFP_KERNEL);
+	if (!dbg)
+		return -ENOMEM;
+
+	dev->delay_drop.dbg = dbg;
+
+	dbg->dir_debugfs =
+		debugfs_create_dir("delay_drop",
+				   dev->mdev->priv.dbg_root);
+	if (!dbg->dir_debugfs)
+		goto out_debugfs;
+
+	dbg->events_cnt_debugfs =
+		debugfs_create_atomic_t("num_timeout_events", 0400,
+					dbg->dir_debugfs,
+					&dev->delay_drop.events_cnt);
+	if (!dbg->events_cnt_debugfs)
+		goto out_debugfs;
+
+	dbg->rqs_cnt_debugfs =
+		debugfs_create_atomic_t("num_rqs", 0400,
+					dbg->dir_debugfs,
+					&dev->delay_drop.rqs_cnt);
+	if (!dbg->rqs_cnt_debugfs)
+		goto out_debugfs;
+
+	dbg->timeout_debugfs =
+		debugfs_create_file("timeout", 0600,
+				    dbg->dir_debugfs,
+				    &dev->delay_drop,
+				    &fops_delay_drop_timeout);
+	if (!dbg->timeout_debugfs)
+		goto out_debugfs;
+
+	return 0;
+
+out_debugfs:
+	delay_drop_debugfs_cleanup(dev);
+	return -ENOMEM;
+}
+
+static void init_delay_drop(struct mlx5_ib_dev *dev)
+{
+	if (!(dev->ib_dev.attrs.raw_packet_caps & IB_RAW_PACKET_CAP_DELAY_DROP))
+		return;
+
+	mutex_init(&dev->delay_drop.lock);
+	dev->delay_drop.dev = dev;
+	dev->delay_drop.activate = false;
+	dev->delay_drop.timeout = MLX5_MAX_DELAY_DROP_TIMEOUT_MS * 1000;
+	INIT_WORK(&dev->delay_drop.delay_drop_work, delay_drop_handler);
+	atomic_set(&dev->delay_drop.rqs_cnt, 0);
+	atomic_set(&dev->delay_drop.events_cnt, 0);
+
+	if (delay_drop_debugfs_init(dev))
+		mlx5_ib_warn(dev, "Failed to init delay drop debugfs\n");
+}
+
+static const struct cpumask *
+mlx5_ib_get_vector_affinity(struct ib_device *ibdev, int comp_vector)
+{
+	struct mlx5_ib_dev *dev = to_mdev(ibdev);
+
+	return mlx5_get_vector_affinity(dev->mdev, comp_vector);
+}
+
 static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 {
 	struct mlx5_ib_dev *dev;
@@ -3692,6 +4029,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	dev->ib_dev.check_mr_status	= mlx5_ib_check_mr_status;
 	dev->ib_dev.get_port_immutable  = mlx5_port_immutable;
 	dev->ib_dev.get_dev_fw_str      = get_dev_fw_str;
+	dev->ib_dev.get_vector_affinity	= mlx5_ib_get_vector_affinity;
 	if (MLX5_CAP_GEN(mdev, ipoib_enhanced_offloads))
 		dev->ib_dev.alloc_rdma_netdev	= mlx5_ib_alloc_rdma_netdev;
 
@@ -3729,18 +4067,20 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 			(1ull << IB_USER_VERBS_CMD_CLOSE_XRCD);
 	}
 
+	dev->ib_dev.create_flow	= mlx5_ib_create_flow;
+	dev->ib_dev.destroy_flow = mlx5_ib_destroy_flow;
+	dev->ib_dev.uverbs_ex_cmd_mask |=
+			(1ull << IB_USER_VERBS_EX_CMD_CREATE_FLOW) |
+			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_FLOW);
+
 	if (mlx5_ib_port_link_layer(&dev->ib_dev, 1) ==
 	    IB_LINK_LAYER_ETHERNET) {
-		dev->ib_dev.create_flow	= mlx5_ib_create_flow;
-		dev->ib_dev.destroy_flow = mlx5_ib_destroy_flow;
 		dev->ib_dev.create_wq	 = mlx5_ib_create_wq;
 		dev->ib_dev.modify_wq	 = mlx5_ib_modify_wq;
 		dev->ib_dev.destroy_wq	 = mlx5_ib_destroy_wq;
 		dev->ib_dev.create_rwq_ind_table = mlx5_ib_create_rwq_ind_table;
 		dev->ib_dev.destroy_rwq_ind_table = mlx5_ib_destroy_rwq_ind_table;
 		dev->ib_dev.uverbs_ex_cmd_mask |=
-			(1ull << IB_USER_VERBS_EX_CMD_CREATE_FLOW) |
-			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_FLOW) |
 			(1ull << IB_USER_VERBS_EX_CMD_CREATE_WQ) |
 			(1ull << IB_USER_VERBS_EX_CMD_MODIFY_WQ) |
 			(1ull << IB_USER_VERBS_EX_CMD_DESTROY_WQ) |
@@ -3760,6 +4100,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 		err = mlx5_enable_eth(dev);
 		if (err)
 			goto err_free_port;
+		dev->roce.last_port_state = IB_PORT_DOWN;
 	}
 
 	err = create_dev_resources(&dev->devr);
@@ -3776,9 +4117,13 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 			goto err_odp;
 	}
 
+	err = mlx5_ib_init_cong_debugfs(dev);
+	if (err)
+		goto err_cnt;
+
 	dev->mdev->priv.uar = mlx5_get_uars_page(dev->mdev);
 	if (!dev->mdev->priv.uar)
-		goto err_cnt;
+		goto err_cong;
 
 	err = mlx5_alloc_bfreg(dev->mdev, &dev->bfreg, false, false);
 	if (err)
@@ -3796,18 +4141,25 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	if (err)
 		goto err_dev;
 
+	init_delay_drop(dev);
+
 	for (i = 0; i < ARRAY_SIZE(mlx5_class_attributes); i++) {
 		err = device_create_file(&dev->ib_dev.dev,
 					 mlx5_class_attributes[i]);
 		if (err)
-			goto err_umrc;
+			goto err_delay_drop;
 	}
 
+	if ((MLX5_CAP_GEN(mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
+	    MLX5_CAP_GEN(mdev, disable_local_lb))
+		mutex_init(&dev->lb_mutex);
+
 	dev->ib_active = true;
 
 	return dev;
 
-err_umrc:
+err_delay_drop:
+	cancel_delay_drop(dev);
 	destroy_umrc_res(dev);
 
 err_dev:
@@ -3823,6 +4175,8 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	mlx5_put_uars_page(dev->mdev, dev->mdev->priv.uar);
 
 err_cnt:
+	mlx5_ib_cleanup_cong_debugfs(dev);
+err_cong:
 	if (MLX5_CAP_GEN(dev->mdev, max_qp_cnt))
 		mlx5_ib_dealloc_counters(dev);
 
@@ -3852,11 +4206,13 @@ static void mlx5_ib_remove(struct mlx5_core_dev *mdev, void *context)
 	struct mlx5_ib_dev *dev = context;
 	enum rdma_link_layer ll = mlx5_ib_port_link_layer(&dev->ib_dev, 1);
 
+	cancel_delay_drop(dev);
 	mlx5_remove_netdev_notifier(dev);
 	ib_unregister_device(&dev->ib_dev);
 	mlx5_free_bfreg(dev->mdev, &dev->fp_bfreg);
 	mlx5_free_bfreg(dev->mdev, &dev->bfreg);
 	mlx5_put_uars_page(dev->mdev, mdev->priv.uar);
+	mlx5_ib_cleanup_cong_debugfs(dev);
 	if (MLX5_CAP_GEN(dev->mdev, max_qp_cnt))
 		mlx5_ib_dealloc_counters(dev);
 	destroy_umrc_res(dev);
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index bdcf254..189e80c 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -107,6 +107,11 @@ enum {
 	MLX5_CQE_VERSION_V1,
 };
 
+enum {
+	MLX5_TM_MAX_RNDV_MSG_SIZE	= 64,
+	MLX5_TM_MAX_SGE			= 1,
+};
+
 struct mlx5_ib_vma_private_data {
 	struct list_head list;
 	struct vm_area_struct *vma;
@@ -247,6 +252,10 @@ struct mlx5_ib_wq {
 	void		       *qend;
 };
 
+enum mlx5_ib_wq_flags {
+	MLX5_IB_WQ_FLAGS_DELAY_DROP = 0x1,
+};
+
 struct mlx5_ib_rwq {
 	struct ib_wq		ibwq;
 	struct mlx5_core_qp	core_qp;
@@ -264,6 +273,7 @@ struct mlx5_ib_rwq {
 	u32			wqe_count;
 	u32			wqe_shift;
 	int			wq_sig;
+	u32			create_flags; /* Use enum mlx5_ib_wq_flags */
 };
 
 enum {
@@ -378,6 +388,7 @@ struct mlx5_ib_qp {
 	struct list_head	cq_recv_list;
 	struct list_head	cq_send_list;
 	u32			rate_limit;
+	u32			underlay_qpn;
 };
 
 struct mlx5_ib_cq_buf {
@@ -399,6 +410,7 @@ enum mlx5_ib_qp_flags {
 	MLX5_IB_QP_CAP_SCATTER_FCS		= 1 << 7,
 	MLX5_IB_QP_RSS				= 1 << 8,
 	MLX5_IB_QP_CVLAN_STRIPPING		= 1 << 9,
+	MLX5_IB_QP_UNDERLAY			= 1 << 10,
 };
 
 struct mlx5_umr_wr {
@@ -496,7 +508,7 @@ struct mlx5_ib_mr {
 	struct mlx5_shared_mr_info	*smr_info;
 	struct list_head	list;
 	int			order;
-	int			umred;
+	bool			allocated_from_cache;
 	int			npages;
 	struct mlx5_ib_dev     *dev;
 	u32 out[MLX5_ST_SZ_DW(create_mkey_out)];
@@ -616,6 +628,63 @@ struct mlx5_roce {
 	struct net_device	*netdev;
 	struct notifier_block	nb;
 	atomic_t		next_port;
+	enum ib_port_state last_port_state;
+};
+
+struct mlx5_ib_dbg_param {
+	int			offset;
+	struct mlx5_ib_dev	*dev;
+	struct dentry		*dentry;
+};
+
+enum mlx5_ib_dbg_cc_types {
+	MLX5_IB_DBG_CC_RP_CLAMP_TGT_RATE,
+	MLX5_IB_DBG_CC_RP_CLAMP_TGT_RATE_ATI,
+	MLX5_IB_DBG_CC_RP_TIME_RESET,
+	MLX5_IB_DBG_CC_RP_BYTE_RESET,
+	MLX5_IB_DBG_CC_RP_THRESHOLD,
+	MLX5_IB_DBG_CC_RP_AI_RATE,
+	MLX5_IB_DBG_CC_RP_HAI_RATE,
+	MLX5_IB_DBG_CC_RP_MIN_DEC_FAC,
+	MLX5_IB_DBG_CC_RP_MIN_RATE,
+	MLX5_IB_DBG_CC_RP_RATE_TO_SET_ON_FIRST_CNP,
+	MLX5_IB_DBG_CC_RP_DCE_TCP_G,
+	MLX5_IB_DBG_CC_RP_DCE_TCP_RTT,
+	MLX5_IB_DBG_CC_RP_RATE_REDUCE_MONITOR_PERIOD,
+	MLX5_IB_DBG_CC_RP_INITIAL_ALPHA_VALUE,
+	MLX5_IB_DBG_CC_RP_GD,
+	MLX5_IB_DBG_CC_NP_CNP_DSCP,
+	MLX5_IB_DBG_CC_NP_CNP_PRIO_MODE,
+	MLX5_IB_DBG_CC_NP_CNP_PRIO,
+	MLX5_IB_DBG_CC_MAX,
+};
+
+struct mlx5_ib_dbg_cc_params {
+	struct dentry			*root;
+	struct mlx5_ib_dbg_param	params[MLX5_IB_DBG_CC_MAX];
+};
+
+enum {
+	MLX5_MAX_DELAY_DROP_TIMEOUT_MS = 100,
+};
+
+struct mlx5_ib_dbg_delay_drop {
+	struct dentry		*dir_debugfs;
+	struct dentry		*rqs_cnt_debugfs;
+	struct dentry		*events_cnt_debugfs;
+	struct dentry		*timeout_debugfs;
+};
+
+struct mlx5_ib_delay_drop {
+	struct mlx5_ib_dev     *dev;
+	struct work_struct	delay_drop_work;
+	/* serialize setting of delay drop */
+	struct mutex		lock;
+	u32			timeout;
+	bool			activate;
+	atomic_t		events_cnt;
+	atomic_t		rqs_cnt;
+	struct mlx5_ib_dbg_delay_drop *dbg;
 };
 
 struct mlx5_ib_dev {
@@ -652,9 +721,15 @@ struct mlx5_ib_dev {
 	struct list_head	qp_list;
 	/* Array with num_ports elements */
 	struct mlx5_ib_port	*port;
-	struct mlx5_sq_bfreg     bfreg;
-	struct mlx5_sq_bfreg     fp_bfreg;
-	u8				umr_fence;
+	struct mlx5_sq_bfreg	bfreg;
+	struct mlx5_sq_bfreg	fp_bfreg;
+	struct mlx5_ib_delay_drop	delay_drop;
+	struct mlx5_ib_dbg_cc_params	*dbg_cc_params;
+
+	/* protect the user_td */
+	struct mutex		lb_mutex;
+	u32			user_td;
+	u8			umr_fence;
 };
 
 static inline struct mlx5_ib_cq *to_mibcq(struct mlx5_core_cq *mcq)
@@ -904,6 +979,9 @@ __be16 mlx5_get_roce_udp_sport(struct mlx5_ib_dev *dev, u8 port_num,
 int mlx5_get_roce_gid_type(struct mlx5_ib_dev *dev, u8 port_num,
 			   int index, enum ib_gid_type *gid_type);
 
+void mlx5_ib_cleanup_cong_debugfs(struct mlx5_ib_dev *dev);
+int mlx5_ib_init_cong_debugfs(struct mlx5_ib_dev *dev);
+
 /* GSI QP helper functions */
 struct ib_qp *mlx5_ib_gsi_create_qp(struct ib_pd *pd,
 				    struct ib_qp_init_attr *init_attr);
diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 2c40a2e..0e2789d 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -48,7 +48,7 @@ enum {
 #define MLX5_UMR_ALIGN 2048
 
 static int clean_mr(struct mlx5_ib_mr *mr);
-static int use_umr(struct mlx5_ib_dev *dev, int order);
+static int mr_cache_max_order(struct mlx5_ib_dev *dev);
 static int unreg_umr(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr);
 
 static int destroy_mkey(struct mlx5_ib_dev *dev, struct mlx5_ib_mr *mr)
@@ -183,7 +183,7 @@ static int add_keys(struct mlx5_ib_dev *dev, int c, int num)
 			break;
 		}
 		mr->order = ent->order;
-		mr->umred = 1;
+		mr->allocated_from_cache = true;
 		mr->dev = dev;
 
 		MLX5_SET(mkc, mkc, free, 1);
@@ -491,16 +491,18 @@ static struct mlx5_ib_mr *alloc_cached_mr(struct mlx5_ib_dev *dev, int order)
 	struct mlx5_mr_cache *cache = &dev->cache;
 	struct mlx5_ib_mr *mr = NULL;
 	struct mlx5_cache_ent *ent;
+	int last_umr_cache_entry;
 	int c;
 	int i;
 
 	c = order2idx(dev, order);
-	if (c < 0 || c > MAX_UMR_CACHE_ENTRY) {
+	last_umr_cache_entry = order2idx(dev, mr_cache_max_order(dev));
+	if (c < 0 || c > last_umr_cache_entry) {
 		mlx5_ib_warn(dev, "order %d, cache index %d\n", order, c);
 		return NULL;
 	}
 
-	for (i = c; i < MAX_UMR_CACHE_ENTRY; i++) {
+	for (i = c; i <= last_umr_cache_entry; i++) {
 		ent = &cache->ent[i];
 
 		mlx5_ib_dbg(dev, "order %d, cache index %d\n", ent->order, i);
@@ -674,12 +676,12 @@ int mlx5_mr_cache_init(struct mlx5_ib_dev *dev)
 		INIT_DELAYED_WORK(&ent->dwork, delayed_cache_work_func);
 		queue_work(cache->wq, &ent->work);
 
-		if (i > MAX_UMR_CACHE_ENTRY) {
+		if (i > MR_CACHE_LAST_STD_ENTRY) {
 			mlx5_odp_init_mr_cache_entry(ent);
 			continue;
 		}
 
-		if (!use_umr(dev, ent->order))
+		if (ent->order > mr_cache_max_order(dev))
 			continue;
 
 		ent->page = PAGE_SHIFT;
@@ -806,21 +808,22 @@ struct ib_mr *mlx5_ib_get_dma_mr(struct ib_pd *pd, int acc)
 	return ERR_PTR(err);
 }
 
-static int get_octo_len(u64 addr, u64 len, int page_size)
+static int get_octo_len(u64 addr, u64 len, int page_shift)
 {
+	u64 page_size = 1ULL << page_shift;
 	u64 offset;
 	int npages;
 
 	offset = addr & (page_size - 1);
-	npages = ALIGN(len + offset, page_size) >> ilog2(page_size);
+	npages = ALIGN(len + offset, page_size) >> page_shift;
 	return (npages + 1) / 2;
 }
 
-static int use_umr(struct mlx5_ib_dev *dev, int order)
+static int mr_cache_max_order(struct mlx5_ib_dev *dev)
 {
 	if (MLX5_CAP_GEN(dev->mdev, umr_extended_translation_offset))
-		return order <= MAX_UMR_CACHE_ENTRY + 2;
-	return order <= MLX5_MAX_UMR_SHIFT;
+		return MR_CACHE_LAST_STD_ENTRY + 2;
+	return MLX5_MAX_UMR_SHIFT;
 }
 
 static int mr_umem_get(struct ib_pd *pd, u64 start, u64 length,
@@ -896,7 +899,8 @@ static int mlx5_ib_post_send_wait(struct mlx5_ib_dev *dev,
 	return err;
 }
 
-static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
+static struct mlx5_ib_mr *alloc_mr_from_cache(
+				  struct ib_pd *pd, struct ib_umem *umem,
 				  u64 virt_addr, u64 len, int npages,
 				  int page_shift, int order, int access_flags)
 {
@@ -928,16 +932,6 @@ static struct mlx5_ib_mr *reg_umr(struct ib_pd *pd, struct ib_umem *umem,
 	mr->mmkey.size = len;
 	mr->mmkey.pd = to_mpd(pd)->pdn;
 
-	err = mlx5_ib_update_xlt(mr, 0, npages, page_shift,
-				 MLX5_IB_UPD_XLT_ENABLE);
-
-	if (err) {
-		mlx5_mr_cache_free(dev, mr);
-		return ERR_PTR(err);
-	}
-
-	mr->live = 1;
-
 	return mr;
 }
 
@@ -1103,7 +1097,8 @@ int mlx5_ib_update_xlt(struct mlx5_ib_mr *mr, u64 idx, int npages,
 static struct mlx5_ib_mr *reg_create(struct ib_mr *ibmr, struct ib_pd *pd,
 				     u64 virt_addr, u64 length,
 				     struct ib_umem *umem, int npages,
-				     int page_shift, int access_flags)
+				     int page_shift, int access_flags,
+				     bool populate)
 {
 	struct mlx5_ib_dev *dev = to_mdev(pd->device);
 	struct mlx5_ib_mr *mr;
@@ -1118,15 +1113,19 @@ static struct mlx5_ib_mr *reg_create(struct ib_mr *ibmr, struct ib_pd *pd,
 	if (!mr)
 		return ERR_PTR(-ENOMEM);
 
-	inlen = MLX5_ST_SZ_BYTES(create_mkey_in) +
-		sizeof(*pas) * ((npages + 1) / 2) * 2;
+	mr->ibmr.pd = pd;
+	mr->access_flags = access_flags;
+
+	inlen = MLX5_ST_SZ_BYTES(create_mkey_in);
+	if (populate)
+		inlen += sizeof(*pas) * roundup(npages, 2);
 	in = kvzalloc(inlen, GFP_KERNEL);
 	if (!in) {
 		err = -ENOMEM;
 		goto err_1;
 	}
 	pas = (__be64 *)MLX5_ADDR_OF(create_mkey_in, in, klm_pas_mtt);
-	if (!(access_flags & IB_ACCESS_ON_DEMAND))
+	if (populate && !(access_flags & IB_ACCESS_ON_DEMAND))
 		mlx5_ib_populate_pas(dev, umem, page_shift, pas,
 				     pg_cap ? MLX5_IB_MTT_PRESENT : 0);
 
@@ -1135,23 +1134,27 @@ static struct mlx5_ib_mr *reg_create(struct ib_mr *ibmr, struct ib_pd *pd,
 	MLX5_SET(create_mkey_in, in, pg_access, !!(pg_cap));
 
 	mkc = MLX5_ADDR_OF(create_mkey_in, in, memory_key_mkey_entry);
+	MLX5_SET(mkc, mkc, free, !populate);
 	MLX5_SET(mkc, mkc, access_mode, MLX5_MKC_ACCESS_MODE_MTT);
 	MLX5_SET(mkc, mkc, a, !!(access_flags & IB_ACCESS_REMOTE_ATOMIC));
 	MLX5_SET(mkc, mkc, rw, !!(access_flags & IB_ACCESS_REMOTE_WRITE));
 	MLX5_SET(mkc, mkc, rr, !!(access_flags & IB_ACCESS_REMOTE_READ));
 	MLX5_SET(mkc, mkc, lw, !!(access_flags & IB_ACCESS_LOCAL_WRITE));
 	MLX5_SET(mkc, mkc, lr, 1);
+	MLX5_SET(mkc, mkc, umr_en, 1);
 
 	MLX5_SET64(mkc, mkc, start_addr, virt_addr);
 	MLX5_SET64(mkc, mkc, len, length);
 	MLX5_SET(mkc, mkc, pd, to_mpd(pd)->pdn);
 	MLX5_SET(mkc, mkc, bsf_octword_size, 0);
 	MLX5_SET(mkc, mkc, translations_octword_size,
-		 get_octo_len(virt_addr, length, 1 << page_shift));
+		 get_octo_len(virt_addr, length, page_shift));
 	MLX5_SET(mkc, mkc, log_page_size, page_shift);
 	MLX5_SET(mkc, mkc, qpn, 0xffffff);
-	MLX5_SET(create_mkey_in, in, translations_octword_actual_size,
-		 get_octo_len(virt_addr, length, 1 << page_shift));
+	if (populate) {
+		MLX5_SET(create_mkey_in, in, translations_octword_actual_size,
+			 get_octo_len(virt_addr, length, page_shift));
+	}
 
 	err = mlx5_core_create_mkey(dev->mdev, &mr->mmkey, in, inlen);
 	if (err) {
@@ -1160,9 +1163,7 @@ static struct mlx5_ib_mr *reg_create(struct ib_mr *ibmr, struct ib_pd *pd,
 	}
 	mr->mmkey.type = MLX5_MKEY_MR;
 	mr->desc_size = sizeof(struct mlx5_mtt);
-	mr->umem = umem;
 	mr->dev = dev;
-	mr->live = 1;
 	kvfree(in);
 
 	mlx5_ib_dbg(dev, "mkey = 0x%x\n", mr->mmkey.key);
@@ -1202,6 +1203,7 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	int ncont;
 	int order;
 	int err;
+	bool use_umr = true;
 
 	mlx5_ib_dbg(dev, "start 0x%llx, virt_addr 0x%llx, length 0x%llx, access_flags 0x%x\n",
 		    start, virt_addr, length, access_flags);
@@ -1220,27 +1222,29 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	err = mr_umem_get(pd, start, length, access_flags, &umem, &npages,
 			   &page_shift, &ncont, &order);
 
-        if (err < 0)
+	if (err < 0)
 		return ERR_PTR(err);
 
-	if (use_umr(dev, order)) {
-		mr = reg_umr(pd, umem, virt_addr, length, ncont, page_shift,
-			     order, access_flags);
+	if (order <= mr_cache_max_order(dev)) {
+		mr = alloc_mr_from_cache(pd, umem, virt_addr, length, ncont,
+					 page_shift, order, access_flags);
 		if (PTR_ERR(mr) == -EAGAIN) {
 			mlx5_ib_dbg(dev, "cache empty for order %d", order);
 			mr = NULL;
 		}
-	} else if (access_flags & IB_ACCESS_ON_DEMAND &&
-		   !MLX5_CAP_GEN(dev->mdev, umr_extended_translation_offset)) {
-		err = -EINVAL;
-		pr_err("Got MR registration for ODP MR > 512MB, not supported for Connect-IB");
-		goto error;
+	} else if (!MLX5_CAP_GEN(dev->mdev, umr_extended_translation_offset)) {
+		if (access_flags & IB_ACCESS_ON_DEMAND) {
+			err = -EINVAL;
+			pr_err("Got MR registration for ODP MR > 512MB, not supported for Connect-IB\n");
+			goto error;
+		}
+		use_umr = false;
 	}
 
 	if (!mr) {
 		mutex_lock(&dev->slow_path_mutex);
 		mr = reg_create(NULL, pd, virt_addr, length, umem, ncont,
-				page_shift, access_flags);
+				page_shift, access_flags, !use_umr);
 		mutex_unlock(&dev->slow_path_mutex);
 	}
 
@@ -1258,8 +1262,22 @@ struct ib_mr *mlx5_ib_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	update_odp_mr(mr);
 #endif
 
-	return &mr->ibmr;
+	if (use_umr) {
+		int update_xlt_flags = MLX5_IB_UPD_XLT_ENABLE;
 
+		if (access_flags & IB_ACCESS_ON_DEMAND)
+			update_xlt_flags |= MLX5_IB_UPD_XLT_ZAP;
+
+		err = mlx5_ib_update_xlt(mr, 0, ncont, page_shift,
+					 update_xlt_flags);
+		if (err) {
+			mlx5_ib_dereg_mr(&mr->ibmr);
+			return ERR_PTR(err);
+		}
+	}
+
+	mr->live = 1;
+	return &mr->ibmr;
 error:
 	ib_umem_release(umem);
 	return ERR_PTR(err);
@@ -1347,7 +1365,7 @@ int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
 		/*
 		 * UMR can't be used - MKey needs to be replaced.
 		 */
-		if (mr->umred) {
+		if (mr->allocated_from_cache) {
 			err = unreg_umr(dev, mr);
 			if (err)
 				mlx5_ib_warn(dev, "Failed to unregister MR\n");
@@ -1360,12 +1378,13 @@ int mlx5_ib_rereg_user_mr(struct ib_mr *ib_mr, int flags, u64 start,
 			return err;
 
 		mr = reg_create(ib_mr, pd, addr, len, mr->umem, ncont,
-				page_shift, access_flags);
+				page_shift, access_flags, true);
 
 		if (IS_ERR(mr))
 			return PTR_ERR(mr);
 
-		mr->umred = 0;
+		mr->allocated_from_cache = false;
+		mr->live = 1;
 	} else {
 		/*
 		 * Send a UMR WQE
@@ -1453,7 +1472,7 @@ mlx5_free_priv_descs(struct mlx5_ib_mr *mr)
 static int clean_mr(struct mlx5_ib_mr *mr)
 {
 	struct mlx5_ib_dev *dev = to_mdev(mr->ibmr.device);
-	int umred = mr->umred;
+	bool allocated_from_cache = mr->allocated_from_cache;
 	int err;
 
 	if (mr->sig) {
@@ -1471,20 +1490,20 @@ static int clean_mr(struct mlx5_ib_mr *mr)
 
 	mlx5_free_priv_descs(mr);
 
-	if (!umred) {
+	if (!allocated_from_cache) {
+		u32 key = mr->mmkey.key;
+
 		err = destroy_mkey(dev, mr);
+		kfree(mr);
 		if (err) {
 			mlx5_ib_warn(dev, "failed to destroy mkey 0x%x (%d)\n",
-				     mr->mmkey.key, err);
+				     key, err);
 			return err;
 		}
 	} else {
 		mlx5_mr_cache_free(dev, mr);
 	}
 
-	if (!umred)
-		kfree(mr);
-
 	return 0;
 }
 
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index f58f8f5..acb79d3 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -34,6 +34,7 @@
 #include <rdma/ib_umem.h>
 #include <rdma/ib_cache.h>
 #include <rdma/ib_user_verbs.h>
+#include <linux/mlx5/fs.h>
 #include "mlx5_ib.h"
 
 /* not supported currently */
@@ -453,7 +454,8 @@ static int set_user_buf_size(struct mlx5_ib_dev *dev,
 		return -EINVAL;
 	}
 
-	if (attr->qp_type == IB_QPT_RAW_PACKET) {
+	if (attr->qp_type == IB_QPT_RAW_PACKET ||
+	    qp->flags & MLX5_IB_QP_UNDERLAY) {
 		base->ubuffer.buf_size = qp->rq.wqe_cnt << qp->rq.wqe_shift;
 		qp->raw_packet_qp.sq.ubuffer.buf_size = qp->sq.wqe_cnt << 6;
 	} else {
@@ -675,10 +677,14 @@ static int mlx5_ib_umem_get(struct mlx5_ib_dev *dev,
 	return err;
 }
 
-static void destroy_user_rq(struct ib_pd *pd, struct mlx5_ib_rwq *rwq)
+static void destroy_user_rq(struct mlx5_ib_dev *dev, struct ib_pd *pd,
+			    struct mlx5_ib_rwq *rwq)
 {
 	struct mlx5_ib_ucontext *context;
 
+	if (rwq->create_flags & MLX5_IB_WQ_FLAGS_DELAY_DROP)
+		atomic_dec(&dev->delay_drop.rqs_cnt);
+
 	context = to_mucontext(pd->uobject->context);
 	mlx5_ib_db_unmap_user(context, &rwq->db);
 	if (rwq->umem)
@@ -959,11 +965,16 @@ static int create_kernel_qp(struct mlx5_ib_dev *dev,
 		goto err_free;
 	}
 
-	qp->sq.wrid = kmalloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wrid), GFP_KERNEL);
-	qp->sq.wr_data = kmalloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wr_data), GFP_KERNEL);
-	qp->rq.wrid = kmalloc(qp->rq.wqe_cnt * sizeof(*qp->rq.wrid), GFP_KERNEL);
-	qp->sq.w_list = kmalloc(qp->sq.wqe_cnt * sizeof(*qp->sq.w_list), GFP_KERNEL);
-	qp->sq.wqe_head = kmalloc(qp->sq.wqe_cnt * sizeof(*qp->sq.wqe_head), GFP_KERNEL);
+	qp->sq.wrid = kvmalloc_array(qp->sq.wqe_cnt,
+				     sizeof(*qp->sq.wrid), GFP_KERNEL);
+	qp->sq.wr_data = kvmalloc_array(qp->sq.wqe_cnt,
+					sizeof(*qp->sq.wr_data), GFP_KERNEL);
+	qp->rq.wrid = kvmalloc_array(qp->rq.wqe_cnt,
+				     sizeof(*qp->rq.wrid), GFP_KERNEL);
+	qp->sq.w_list = kvmalloc_array(qp->sq.wqe_cnt,
+				       sizeof(*qp->sq.w_list), GFP_KERNEL);
+	qp->sq.wqe_head = kvmalloc_array(qp->sq.wqe_cnt,
+					 sizeof(*qp->sq.wqe_head), GFP_KERNEL);
 
 	if (!qp->sq.wrid || !qp->sq.wr_data || !qp->rq.wrid ||
 	    !qp->sq.w_list || !qp->sq.wqe_head) {
@@ -975,11 +986,11 @@ static int create_kernel_qp(struct mlx5_ib_dev *dev,
 	return 0;
 
 err_wrid:
-	kfree(qp->sq.wqe_head);
-	kfree(qp->sq.w_list);
-	kfree(qp->sq.wrid);
-	kfree(qp->sq.wr_data);
-	kfree(qp->rq.wrid);
+	kvfree(qp->sq.wqe_head);
+	kvfree(qp->sq.w_list);
+	kvfree(qp->sq.wrid);
+	kvfree(qp->sq.wr_data);
+	kvfree(qp->rq.wrid);
 	mlx5_db_free(dev->mdev, &qp->db);
 
 err_free:
@@ -992,11 +1003,11 @@ static int create_kernel_qp(struct mlx5_ib_dev *dev,
 
 static void destroy_qp_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp)
 {
-	kfree(qp->sq.wqe_head);
-	kfree(qp->sq.w_list);
-	kfree(qp->sq.wrid);
-	kfree(qp->sq.wr_data);
-	kfree(qp->rq.wrid);
+	kvfree(qp->sq.wqe_head);
+	kvfree(qp->sq.w_list);
+	kvfree(qp->sq.wrid);
+	kvfree(qp->sq.wr_data);
+	kvfree(qp->rq.wrid);
 	mlx5_db_free(dev->mdev, &qp->db);
 	mlx5_buf_free(dev->mdev, &qp->buf);
 }
@@ -1021,12 +1032,16 @@ static int is_connected(enum ib_qp_type qp_type)
 }
 
 static int create_raw_packet_qp_tis(struct mlx5_ib_dev *dev,
+				    struct mlx5_ib_qp *qp,
 				    struct mlx5_ib_sq *sq, u32 tdn)
 {
 	u32 in[MLX5_ST_SZ_DW(create_tis_in)] = {0};
 	void *tisc = MLX5_ADDR_OF(create_tis_in, in, ctx);
 
 	MLX5_SET(tisc, tisc, transport_domain, tdn);
+	if (qp->flags & MLX5_IB_QP_UNDERLAY)
+		MLX5_SET(tisc, tisc, underlay_qpn, qp->underlay_qpn);
+
 	return mlx5_core_create_tis(dev->mdev, in, sizeof(in), &sq->tisn);
 }
 
@@ -1068,11 +1083,16 @@ static int create_raw_packet_qp_sq(struct mlx5_ib_dev *dev,
 
 	sqc = MLX5_ADDR_OF(create_sq_in, in, ctx);
 	MLX5_SET(sqc, sqc, flush_in_error_en, 1);
+	if (MLX5_CAP_ETH(dev->mdev, multi_pkt_send_wqe))
+		MLX5_SET(sqc, sqc, allow_multi_pkt_send_wqe, 1);
 	MLX5_SET(sqc, sqc, state, MLX5_SQC_STATE_RST);
 	MLX5_SET(sqc, sqc, user_index, MLX5_GET(qpc, qpc, user_index));
 	MLX5_SET(sqc, sqc, cqn, MLX5_GET(qpc, qpc, cqn_snd));
 	MLX5_SET(sqc, sqc, tis_lst_sz, 1);
 	MLX5_SET(sqc, sqc, tis_num_0, sq->tisn);
+	if (MLX5_CAP_GEN(dev->mdev, eth_net_offloads) &&
+	    MLX5_CAP_ETH(dev->mdev, swp))
+		MLX5_SET(sqc, sqc, allow_swp, 1);
 
 	wq = MLX5_ADDR_OF(sqc, sqc, wq);
 	MLX5_SET(wq, wq, wq_type, MLX5_WQ_TYPE_CYCLIC);
@@ -1229,7 +1249,7 @@ static int create_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 	u32 tdn = mucontext->tdn;
 
 	if (qp->sq.wqe_cnt) {
-		err = create_raw_packet_qp_tis(dev, sq, tdn);
+		err = create_raw_packet_qp_tis(dev, qp, sq, tdn);
 		if (err)
 			return err;
 
@@ -1503,10 +1523,6 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 	u32 *in;
 	int err;
 
-	base = init_attr->qp_type == IB_QPT_RAW_PACKET ?
-	       &qp->raw_packet_qp.rq.base :
-	       &qp->trans_qp.base;
-
 	mutex_init(&qp->mutex);
 	spin_lock_init(&qp->sq.lock);
 	spin_lock_init(&qp->rq.lock);
@@ -1588,10 +1604,28 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 
 		qp->wq_sig = !!(ucmd.flags & MLX5_QP_FLAG_SIGNATURE);
 		qp->scat_cqe = !!(ucmd.flags & MLX5_QP_FLAG_SCATTER_CQE);
+
+		if (init_attr->create_flags & IB_QP_CREATE_SOURCE_QPN) {
+			if (init_attr->qp_type != IB_QPT_UD ||
+			    (MLX5_CAP_GEN(dev->mdev, port_type) !=
+			     MLX5_CAP_PORT_TYPE_IB) ||
+			    !mlx5_get_flow_namespace(dev->mdev, MLX5_FLOW_NAMESPACE_BYPASS)) {
+				mlx5_ib_dbg(dev, "Source QP option isn't supported\n");
+				return -EOPNOTSUPP;
+			}
+
+			qp->flags |= MLX5_IB_QP_UNDERLAY;
+			qp->underlay_qpn = init_attr->source_qpn;
+		}
 	} else {
 		qp->wq_sig = !!wq_signature;
 	}
 
+	base = (init_attr->qp_type == IB_QPT_RAW_PACKET ||
+		qp->flags & MLX5_IB_QP_UNDERLAY) ?
+	       &qp->raw_packet_qp.rq.base :
+	       &qp->trans_qp.base;
+
 	qp->has_rq = qp_has_rq(init_attr);
 	err = set_rq_size(dev, &init_attr->cap, qp->has_rq,
 			  qp, (pd && pd->uobject) ? &ucmd : NULL);
@@ -1695,10 +1729,15 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 
 	MLX5_SET(qpc, qpc, rq_type, get_rx_type(qp, init_attr));
 
-	if (qp->sq.wqe_cnt)
+	if (qp->sq.wqe_cnt) {
 		MLX5_SET(qpc, qpc, log_sq_size, ilog2(qp->sq.wqe_cnt));
-	else
+	} else {
 		MLX5_SET(qpc, qpc, no_sq, 1);
+		if (init_attr->srq &&
+		    init_attr->srq->srq_type == IB_SRQT_TM)
+			MLX5_SET(qpc, qpc, offload_type,
+				 MLX5_QPC_OFFLOAD_TYPE_RNDV);
+	}
 
 	/* Set default resources */
 	switch (init_attr->qp_type) {
@@ -1742,7 +1781,8 @@ static int create_qp_common(struct mlx5_ib_dev *dev, struct ib_pd *pd,
 		qp->flags |= MLX5_IB_QP_LSO;
 	}
 
-	if (init_attr->qp_type == IB_QPT_RAW_PACKET) {
+	if (init_attr->qp_type == IB_QPT_RAW_PACKET ||
+	    qp->flags & MLX5_IB_QP_UNDERLAY) {
 		qp->raw_packet_qp.sq.ubuffer.buf_addr = ucmd.sq_buf_addr;
 		raw_packet_qp_copy_info(qp, &qp->raw_packet_qp);
 		err = create_raw_packet_qp(dev, qp, in, pd);
@@ -1894,7 +1934,7 @@ static int modify_raw_packet_qp(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp,
 static void destroy_qp_common(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp)
 {
 	struct mlx5_ib_cq *send_cq, *recv_cq;
-	struct mlx5_ib_qp_base *base = &qp->trans_qp.base;
+	struct mlx5_ib_qp_base *base;
 	unsigned long flags;
 	int err;
 
@@ -1903,12 +1943,14 @@ static void destroy_qp_common(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp)
 		return;
 	}
 
-	base = qp->ibqp.qp_type == IB_QPT_RAW_PACKET ?
+	base = (qp->ibqp.qp_type == IB_QPT_RAW_PACKET ||
+		qp->flags & MLX5_IB_QP_UNDERLAY) ?
 	       &qp->raw_packet_qp.rq.base :
 	       &qp->trans_qp.base;
 
 	if (qp->state != IB_QPS_RESET) {
-		if (qp->ibqp.qp_type != IB_QPT_RAW_PACKET) {
+		if (qp->ibqp.qp_type != IB_QPT_RAW_PACKET &&
+		    !(qp->flags & MLX5_IB_QP_UNDERLAY)) {
 			err = mlx5_core_qp_modify(dev->mdev,
 						  MLX5_CMD_OP_2RST_QP, 0,
 						  NULL, &base->mqp);
@@ -1947,7 +1989,8 @@ static void destroy_qp_common(struct mlx5_ib_dev *dev, struct mlx5_ib_qp *qp)
 	mlx5_ib_unlock_cqs(send_cq, recv_cq);
 	spin_unlock_irqrestore(&dev->reset_flow_resource_lock, flags);
 
-	if (qp->ibqp.qp_type == IB_QPT_RAW_PACKET) {
+	if (qp->ibqp.qp_type == IB_QPT_RAW_PACKET ||
+	    qp->flags & MLX5_IB_QP_UNDERLAY) {
 		destroy_raw_packet_qp(dev, qp);
 	} else {
 		err = mlx5_core_destroy_qp(dev->mdev, &base->mqp);
@@ -2703,7 +2746,8 @@ static int __mlx5_ib_modify_qp(struct ib_qp *ibqp,
 
 	if (is_sqp(ibqp->qp_type)) {
 		context->mtu_msgmax = (IB_MTU_256 << 5) | 8;
-	} else if (ibqp->qp_type == IB_QPT_UD ||
+	} else if ((ibqp->qp_type == IB_QPT_UD &&
+		    !(qp->flags & MLX5_IB_QP_UNDERLAY)) ||
 		   ibqp->qp_type == MLX5_IB_QPT_REG_UMR) {
 		context->mtu_msgmax = (IB_MTU_4096 << 5) | 12;
 	} else if (attr_mask & IB_QP_PATH_MTU) {
@@ -2800,6 +2844,11 @@ static int __mlx5_ib_modify_qp(struct ib_qp *ibqp,
 	if (cur_state == IB_QPS_RESET && new_state == IB_QPS_INIT) {
 		u8 port_num = (attr_mask & IB_QP_PORT ? attr->port_num :
 			       qp->port) - 1;
+
+		/* Underlay port should be used - index 0 function per port */
+		if (qp->flags & MLX5_IB_QP_UNDERLAY)
+			port_num = 0;
+
 		mibport = &dev->port[port_num];
 		context->qp_counter_set_usr_page |=
 			cpu_to_be32((u32)(mibport->cnts.set_id) << 24);
@@ -2825,7 +2874,8 @@ static int __mlx5_ib_modify_qp(struct ib_qp *ibqp,
 	optpar = ib_mask_to_mlx5_opt(attr_mask);
 	optpar &= opt_mask[mlx5_cur][mlx5_new][mlx5_st];
 
-	if (qp->ibqp.qp_type == IB_QPT_RAW_PACKET) {
+	if (qp->ibqp.qp_type == IB_QPT_RAW_PACKET ||
+	    qp->flags & MLX5_IB_QP_UNDERLAY) {
 		struct mlx5_modify_raw_qp_param raw_qp_param = {};
 
 		raw_qp_param.operation = op;
@@ -2914,7 +2964,13 @@ int mlx5_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		ll = dev->ib_dev.get_link_layer(&dev->ib_dev, port);
 	}
 
-	if (qp_type != MLX5_IB_QPT_REG_UMR &&
+	if (qp->flags & MLX5_IB_QP_UNDERLAY) {
+		if (attr_mask & ~(IB_QP_STATE | IB_QP_CUR_STATE)) {
+			mlx5_ib_dbg(dev, "invalid attr_mask 0x%x when underlay QP is used\n",
+				    attr_mask);
+			goto out;
+		}
+	} else if (qp_type != MLX5_IB_QPT_REG_UMR &&
 	    !ib_modify_qp_is_ok(cur_state, new_state, qp_type, attr_mask, ll)) {
 		mlx5_ib_dbg(dev, "invalid QP state transition from %d to %d, qp_type %d, attr_mask 0x%x\n",
 			    cur_state, new_state, ibqp->qp_type, attr_mask);
@@ -4478,9 +4534,14 @@ int mlx5_ib_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 		return mlx5_ib_gsi_query_qp(ibqp, qp_attr, qp_attr_mask,
 					    qp_init_attr);
 
+	/* Not all output fields are applicable; make sure to zero them */
+	memset(qp_init_attr, 0, sizeof(*qp_init_attr));
+	memset(qp_attr, 0, sizeof(*qp_attr));
+
 	mutex_lock(&qp->mutex);
 
-	if (qp->ibqp.qp_type == IB_QPT_RAW_PACKET) {
+	if (qp->ibqp.qp_type == IB_QPT_RAW_PACKET ||
+	    qp->flags & MLX5_IB_QP_UNDERLAY) {
 		err = query_raw_packet_qp_state(dev, qp, &raw_packet_qp_state);
 		if (err)
 			goto out;
@@ -4598,6 +4659,27 @@ static void mlx5_ib_wq_event(struct mlx5_core_qp *core_qp, int type)
 	}
 }
 
+static int set_delay_drop(struct mlx5_ib_dev *dev)
+{
+	int err = 0;
+
+	mutex_lock(&dev->delay_drop.lock);
+	if (dev->delay_drop.activate)
+		goto out;
+
+	err = mlx5_core_set_delay_drop(dev->mdev, dev->delay_drop.timeout);
+	if (err)
+		goto out;
+
+	dev->delay_drop.activate = true;
+out:
+	mutex_unlock(&dev->delay_drop.lock);
+
+	if (!err)
+		atomic_inc(&dev->delay_drop.rqs_cnt);
+	return err;
+}
+
 static int  create_rq(struct mlx5_ib_rwq *rwq, struct ib_pd *pd,
 		      struct ib_wq_init_attr *init_attr)
 {
@@ -4652,9 +4734,28 @@ static int  create_rq(struct mlx5_ib_rwq *rwq, struct ib_pd *pd,
 		}
 		MLX5_SET(rqc, rqc, scatter_fcs, 1);
 	}
+	if (init_attr->create_flags & IB_WQ_FLAGS_DELAY_DROP) {
+		if (!(dev->ib_dev.attrs.raw_packet_caps &
+		      IB_RAW_PACKET_CAP_DELAY_DROP)) {
+			mlx5_ib_dbg(dev, "Delay drop is not supported\n");
+			err = -EOPNOTSUPP;
+			goto out;
+		}
+		MLX5_SET(rqc, rqc, delay_drop_en, 1);
+	}
 	rq_pas0 = (__be64 *)MLX5_ADDR_OF(wq, wq, pas);
 	mlx5_ib_populate_pas(dev, rwq->umem, rwq->page_shift, rq_pas0, 0);
 	err = mlx5_core_create_rq_tracked(dev->mdev, in, inlen, &rwq->core_qp);
+	if (!err && init_attr->create_flags & IB_WQ_FLAGS_DELAY_DROP) {
+		err = set_delay_drop(dev);
+		if (err) {
+			mlx5_ib_warn(dev, "Failed to enable delay drop err=%d\n",
+				     err);
+			mlx5_core_destroy_rq_tracked(dev->mdev, &rwq->core_qp);
+		} else {
+			rwq->create_flags |= MLX5_IB_WQ_FLAGS_DELAY_DROP;
+		}
+	}
 out:
 	kvfree(in);
 	return err;
@@ -4788,7 +4889,7 @@ struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
 err_copy:
 	mlx5_core_destroy_rq_tracked(dev->mdev, &rwq->core_qp);
 err_user_rq:
-	destroy_user_rq(pd, rwq);
+	destroy_user_rq(dev, pd, rwq);
 err:
 	kfree(rwq);
 	return ERR_PTR(err);
@@ -4800,7 +4901,7 @@ int mlx5_ib_destroy_wq(struct ib_wq *wq)
 	struct mlx5_ib_rwq *rwq = to_mrwq(wq);
 
 	mlx5_core_destroy_rq_tracked(dev->mdev, &rwq->core_qp);
-	destroy_user_rq(wq->pd, rwq);
+	destroy_user_rq(dev, wq->pd, rwq);
 	kfree(rwq);
 
 	return 0;
diff --git a/drivers/infiniband/hw/mlx5/srq.c b/drivers/infiniband/hw/mlx5/srq.c
index 43707b1..6d5fada 100644
--- a/drivers/infiniband/hw/mlx5/srq.c
+++ b/drivers/infiniband/hw/mlx5/srq.c
@@ -101,7 +101,7 @@ static int create_srq_user(struct ib_pd *pd, struct mlx5_ib_srq *srq,
 				 udata->inlen - sizeof(ucmd)))
 		return -EINVAL;
 
-	if (in->type == IB_SRQT_XRC) {
+	if (in->type != IB_SRQT_BASIC) {
 		err = get_srq_user_index(to_mucontext(pd->uobject->context),
 					 &ucmd, udata->inlen, &uidx);
 		if (err)
@@ -145,7 +145,7 @@ static int create_srq_user(struct ib_pd *pd, struct mlx5_ib_srq *srq,
 	in->log_page_size = page_shift - MLX5_ADAPTER_PAGE_SHIFT;
 	in->page_offset = offset;
 	if (MLX5_CAP_GEN(dev->mdev, cqe_version) == MLX5_CQE_VERSION_V1 &&
-	    in->type == IB_SRQT_XRC)
+	    in->type != IB_SRQT_BASIC)
 		in->user_index = uidx;
 
 	return 0;
@@ -196,7 +196,7 @@ static int create_srq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_srq *srq,
 	}
 	mlx5_fill_page_array(&srq->buf, in->pas);
 
-	srq->wrid = kmalloc(srq->msrq.max * sizeof(u64), GFP_KERNEL);
+	srq->wrid = kvmalloc_array(srq->msrq.max, sizeof(u64), GFP_KERNEL);
 	if (!srq->wrid) {
 		err = -ENOMEM;
 		goto err_in;
@@ -205,7 +205,7 @@ static int create_srq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_srq *srq,
 
 	in->log_page_size = srq->buf.page_shift - MLX5_ADAPTER_PAGE_SHIFT;
 	if (MLX5_CAP_GEN(dev->mdev, cqe_version) == MLX5_CQE_VERSION_V1 &&
-	    in->type == IB_SRQT_XRC)
+	    in->type != IB_SRQT_BASIC)
 		in->user_index = MLX5_IB_DEFAULT_UIDX;
 
 	return 0;
@@ -230,7 +230,7 @@ static void destroy_srq_user(struct ib_pd *pd, struct mlx5_ib_srq *srq)
 
 static void destroy_srq_kernel(struct mlx5_ib_dev *dev, struct mlx5_ib_srq *srq)
 {
-	kfree(srq->wrid);
+	kvfree(srq->wrid);
 	mlx5_buf_free(dev->mdev, &srq->buf);
 	mlx5_db_free(dev->mdev, &srq->db);
 }
@@ -292,14 +292,29 @@ struct ib_srq *mlx5_ib_create_srq(struct ib_pd *pd,
 	in.wqe_shift = srq->msrq.wqe_shift - 4;
 	if (srq->wq_sig)
 		in.flags |= MLX5_SRQ_FLAG_WQ_SIG;
-	if (init_attr->srq_type == IB_SRQT_XRC) {
+
+	if (init_attr->srq_type == IB_SRQT_XRC)
 		in.xrcd = to_mxrcd(init_attr->ext.xrc.xrcd)->xrcdn;
-		in.cqn = to_mcq(init_attr->ext.xrc.cq)->mcq.cqn;
-	} else if (init_attr->srq_type == IB_SRQT_BASIC) {
+	else
 		in.xrcd = to_mxrcd(dev->devr.x0)->xrcdn;
-		in.cqn = to_mcq(dev->devr.c0)->mcq.cqn;
+
+	if (init_attr->srq_type == IB_SRQT_TM) {
+		in.tm_log_list_size =
+			ilog2(init_attr->ext.tag_matching.max_num_tags) + 1;
+		if (in.tm_log_list_size >
+		    MLX5_CAP_GEN(dev->mdev, log_tag_matching_list_sz)) {
+			mlx5_ib_dbg(dev, "TM SRQ max_num_tags exceeding limit\n");
+			err = -EINVAL;
+			goto err_usr_kern_srq;
+		}
+		in.flags |= MLX5_SRQ_FLAG_RNDV;
 	}
 
+	if (ib_srq_has_cq(init_attr->srq_type))
+		in.cqn = to_mcq(init_attr->ext.cq)->mcq.cqn;
+	else
+		in.cqn = to_mcq(dev->devr.c0)->mcq.cqn;
+
 	in.pd = to_mpd(pd)->pdn;
 	in.db_record = srq->db.dma;
 	err = mlx5_core_create_srq(dev->mdev, &srq->msrq, &in);
diff --git a/drivers/infiniband/hw/mthca/mthca_av.c b/drivers/infiniband/hw/mthca/mthca_av.c
index 2aec990..e7f6223 100644
--- a/drivers/infiniband/hw/mthca/mthca_av.c
+++ b/drivers/infiniband/hw/mthca/mthca_av.c
@@ -186,7 +186,7 @@ int mthca_create_ah(struct mthca_dev *dev,
 
 on_hca_fail:
 	if (ah->type == MTHCA_AH_PCI_POOL) {
-		ah->av = pci_pool_zalloc(dev->av_table.pool,
+		ah->av = dma_pool_zalloc(dev->av_table.pool,
 					 GFP_ATOMIC, &ah->avdma);
 		if (!ah->av)
 			return -ENOMEM;
@@ -250,7 +250,7 @@ int mthca_destroy_ah(struct mthca_dev *dev, struct mthca_ah *ah)
 		break;
 
 	case MTHCA_AH_PCI_POOL:
-		pci_pool_free(dev->av_table.pool, ah->av, ah->avdma);
+		dma_pool_free(dev->av_table.pool, ah->av, ah->avdma);
 		break;
 
 	case MTHCA_AH_KMALLOC:
@@ -340,7 +340,7 @@ int mthca_init_av_table(struct mthca_dev *dev)
 	if (err)
 		return err;
 
-	dev->av_table.pool = pci_pool_create("mthca_av", dev->pdev,
+	dev->av_table.pool = dma_pool_create("mthca_av", &dev->pdev->dev,
 					     MTHCA_AV_SIZE,
 					     MTHCA_AV_SIZE, 0);
 	if (!dev->av_table.pool)
@@ -360,7 +360,7 @@ int mthca_init_av_table(struct mthca_dev *dev)
 	return 0;
 
  out_free_pool:
-	pci_pool_destroy(dev->av_table.pool);
+	dma_pool_destroy(dev->av_table.pool);
 
  out_free_alloc:
 	mthca_alloc_cleanup(&dev->av_table.alloc);
@@ -374,6 +374,6 @@ void mthca_cleanup_av_table(struct mthca_dev *dev)
 
 	if (dev->av_table.av_map)
 		iounmap(dev->av_table.av_map);
-	pci_pool_destroy(dev->av_table.pool);
+	dma_pool_destroy(dev->av_table.pool);
 	mthca_alloc_cleanup(&dev->av_table.alloc);
 }
diff --git a/drivers/infiniband/hw/mthca/mthca_cmd.c b/drivers/infiniband/hw/mthca/mthca_cmd.c
index 9d83a53..419a2a2 100644
--- a/drivers/infiniband/hw/mthca/mthca_cmd.c
+++ b/drivers/infiniband/hw/mthca/mthca_cmd.c
@@ -538,7 +538,7 @@ int mthca_cmd_init(struct mthca_dev *dev)
 		return -ENOMEM;
 	}
 
-	dev->cmd.pool = pci_pool_create("mthca_cmd", dev->pdev,
+	dev->cmd.pool = dma_pool_create("mthca_cmd", &dev->pdev->dev,
 					MTHCA_MAILBOX_SIZE,
 					MTHCA_MAILBOX_SIZE, 0);
 	if (!dev->cmd.pool) {
@@ -551,7 +551,7 @@ int mthca_cmd_init(struct mthca_dev *dev)
 
 void mthca_cmd_cleanup(struct mthca_dev *dev)
 {
-	pci_pool_destroy(dev->cmd.pool);
+	dma_pool_destroy(dev->cmd.pool);
 	iounmap(dev->hcr);
 	if (dev->cmd.flags & MTHCA_CMD_POST_DOORBELLS)
 		iounmap(dev->cmd.dbell_map);
@@ -621,7 +621,7 @@ struct mthca_mailbox *mthca_alloc_mailbox(struct mthca_dev *dev,
 	if (!mailbox)
 		return ERR_PTR(-ENOMEM);
 
-	mailbox->buf = pci_pool_alloc(dev->cmd.pool, gfp_mask, &mailbox->dma);
+	mailbox->buf = dma_pool_alloc(dev->cmd.pool, gfp_mask, &mailbox->dma);
 	if (!mailbox->buf) {
 		kfree(mailbox);
 		return ERR_PTR(-ENOMEM);
@@ -635,7 +635,7 @@ void mthca_free_mailbox(struct mthca_dev *dev, struct mthca_mailbox *mailbox)
 	if (!mailbox)
 		return;
 
-	pci_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
+	dma_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
 	kfree(mailbox);
 }
 
@@ -698,7 +698,7 @@ static int mthca_map_cmd(struct mthca_dev *dev, u16 op, struct mthca_icm *icm,
 		for (i = 0; i < mthca_icm_size(&iter) >> lg; ++i) {
 			if (virt != -1) {
 				pages[nent * 2] = cpu_to_be64(virt);
-				virt += 1 << lg;
+				virt += 1ULL << lg;
 			}
 
 			pages[nent * 2 + 1] =
@@ -1921,7 +1921,7 @@ int mthca_MAD_IFC(struct mthca_dev *dev, int ignore_mkey, int ignore_bkey,
 			(in_wc->wc_flags & IB_WC_GRH ? 0x80 : 0);
 		MTHCA_PUT(inbox, val,               MAD_IFC_G_PATH_OFFSET);
 
-		MTHCA_PUT(inbox, in_wc->slid,       MAD_IFC_RLID_OFFSET);
+		MTHCA_PUT(inbox, ib_lid_cpu16(in_wc->slid), MAD_IFC_RLID_OFFSET);
 		MTHCA_PUT(inbox, in_wc->pkey_index, MAD_IFC_PKEY_OFFSET);
 
 		if (in_grh)
@@ -1929,7 +1929,7 @@ int mthca_MAD_IFC(struct mthca_dev *dev, int ignore_mkey, int ignore_bkey,
 
 		op_modifier |= 0x4;
 
-		in_modifier |= in_wc->slid << 16;
+		in_modifier |= ib_lid_cpu16(in_wc->slid) << 16;
 	}
 
 	err = mthca_cmd_box(dev, inmailbox->dma, outmailbox->dma,
diff --git a/drivers/infiniband/hw/mthca/mthca_dev.h b/drivers/infiniband/hw/mthca/mthca_dev.h
index ec7da9a..5508afb 100644
--- a/drivers/infiniband/hw/mthca/mthca_dev.h
+++ b/drivers/infiniband/hw/mthca/mthca_dev.h
@@ -118,7 +118,7 @@ enum {
 };
 
 struct mthca_cmd {
-	struct pci_pool          *pool;
+	struct dma_pool          *pool;
 	struct mutex              hcr_mutex;
 	struct semaphore 	  poll_sem;
 	struct semaphore 	  event_sem;
@@ -263,7 +263,7 @@ struct mthca_qp_table {
 };
 
 struct mthca_av_table {
-	struct pci_pool   *pool;
+	struct dma_pool   *pool;
 	int                num_ddr_avs;
 	u64                ddr_av_base;
 	void __iomem      *av_map;
diff --git a/drivers/infiniband/hw/mthca/mthca_mad.c b/drivers/infiniband/hw/mthca/mthca_mad.c
index 7df3db7..093f775 100644
--- a/drivers/infiniband/hw/mthca/mthca_mad.c
+++ b/drivers/infiniband/hw/mthca/mthca_mad.c
@@ -205,7 +205,7 @@ int mthca_process_mad(struct ib_device *ibdev,
 		      u16 *out_mad_pkey_index)
 {
 	int err;
-	u16 slid = in_wc ? in_wc->slid : be16_to_cpu(IB_LID_PERMISSIVE);
+	u16 slid = in_wc ? ib_lid_cpu16(in_wc->slid) : be16_to_cpu(IB_LID_PERMISSIVE);
 	u16 prev_lid = 0;
 	struct ib_port_attr pattr;
 	const struct ib_mad *in_mad = (const struct ib_mad *)in;
@@ -256,7 +256,7 @@ int mthca_process_mad(struct ib_device *ibdev,
 	    in_mad->mad_hdr.method == IB_MGMT_METHOD_SET &&
 	    in_mad->mad_hdr.attr_id == IB_SMP_ATTR_PORT_INFO &&
 	    !ib_query_port(ibdev, port_num, &pattr))
-		prev_lid = pattr.lid;
+		prev_lid = ib_lid_cpu16(pattr.lid);
 
 	err = mthca_MAD_IFC(to_mdev(ibdev),
 			    mad_flags & IB_MAD_IGNORE_MKEY,
diff --git a/drivers/infiniband/hw/mthca/mthca_main.c b/drivers/infiniband/hw/mthca/mthca_main.c
index c309e5c..e36a9bc 100644
--- a/drivers/infiniband/hw/mthca/mthca_main.c
+++ b/drivers/infiniband/hw/mthca/mthca_main.c
@@ -49,7 +49,6 @@
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("Mellanox InfiniBand HCA low-level driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 
 #ifdef CONFIG_INFINIBAND_MTHCA_DEBUG
 
@@ -1162,7 +1161,7 @@ static void mthca_remove_one(struct pci_dev *pdev)
 	mutex_unlock(&mthca_device_mutex);
 }
 
-static struct pci_device_id mthca_pci_table[] = {
+static const struct pci_device_id mthca_pci_table[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_MELLANOX, PCI_DEVICE_ID_MELLANOX_TAVOR),
 	  .driver_data = TAVOR },
 	{ PCI_DEVICE(PCI_VENDOR_ID_TOPSPIN, PCI_DEVICE_ID_MELLANOX_TAVOR),
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index c197cd9..6fee779 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -914,7 +914,7 @@ static struct ib_mr *mthca_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 	int err = 0;
 	int write_mtt_size;
 
-	if (udata->inlen - sizeof (struct ib_uverbs_cmd_hdr) < sizeof ucmd) {
+	if (udata->inlen < sizeof ucmd) {
 		if (!to_mucontext(pd->uobject->context)->reg_mr_warned) {
 			mthca_warn(dev, "Process '%s' did not pass in MR attrs.\n",
 				   current->comm);
@@ -1178,12 +1178,11 @@ static int mthca_port_immutable(struct ib_device *ibdev, u8 port_num,
 	return 0;
 }
 
-static void get_dev_fw_str(struct ib_device *device, char *str,
-			   size_t str_len)
+static void get_dev_fw_str(struct ib_device *device, char *str)
 {
 	struct mthca_dev *dev =
 		container_of(device, struct mthca_dev, ib_dev);
-	snprintf(str, str_len, "%d.%d.%d",
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%d.%d.%d",
 		 (int) (dev->fw_ver >> 32),
 		 (int) (dev->fw_ver >> 16) & 0xffff,
 		 (int) dev->fw_ver & 0xffff);
diff --git a/drivers/infiniband/hw/nes/nes.c b/drivers/infiniband/hw/nes/nes.c
index a30aa65..942ca84 100644
--- a/drivers/infiniband/hw/nes/nes.c
+++ b/drivers/infiniband/hw/nes/nes.c
@@ -63,7 +63,6 @@
 MODULE_AUTHOR("NetEffect");
 MODULE_DESCRIPTION("NetEffect RNIC Low-level iWARP Driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 
 int interrupt_mod_interval = 0;
 
@@ -102,7 +101,7 @@ static unsigned int ee_flsh_adapter;
 static unsigned int sysfs_nonidx_addr;
 static unsigned int sysfs_idx_addr;
 
-static struct pci_device_id nes_pci_table[] = {
+static const struct pci_device_id nes_pci_table[] = {
 	{ PCI_VDEVICE(NETEFFECT, PCI_DEVICE_ID_NETEFFECT_NE020), },
 	{ PCI_VDEVICE(NETEFFECT, PCI_DEVICE_ID_NETEFFECT_NE020_KR), },
 	{0}
@@ -808,13 +807,6 @@ static void nes_remove(struct pci_dev *pcidev)
 }
 
 
-static struct pci_driver nes_pci_driver = {
-	.name = DRV_NAME,
-	.id_table = nes_pci_table,
-	.probe = nes_probe,
-	.remove = nes_remove,
-};
-
 static ssize_t adapter_show(struct device_driver *ddp, char *buf)
 {
 	unsigned int  devfn = 0xffffffff;
@@ -1156,35 +1148,29 @@ static DRIVER_ATTR_RW(idx_addr);
 static DRIVER_ATTR_RW(idx_data);
 static DRIVER_ATTR_RW(wqm_quanta);
 
-static int nes_create_driver_sysfs(struct pci_driver *drv)
-{
-	int error;
-	error  = driver_create_file(&drv->driver, &driver_attr_adapter);
-	error |= driver_create_file(&drv->driver, &driver_attr_eeprom_cmd);
-	error |= driver_create_file(&drv->driver, &driver_attr_eeprom_data);
-	error |= driver_create_file(&drv->driver, &driver_attr_flash_cmd);
-	error |= driver_create_file(&drv->driver, &driver_attr_flash_data);
-	error |= driver_create_file(&drv->driver, &driver_attr_nonidx_addr);
-	error |= driver_create_file(&drv->driver, &driver_attr_nonidx_data);
-	error |= driver_create_file(&drv->driver, &driver_attr_idx_addr);
-	error |= driver_create_file(&drv->driver, &driver_attr_idx_data);
-	error |= driver_create_file(&drv->driver, &driver_attr_wqm_quanta);
-	return error;
-}
+static struct attribute *nes_attrs[] = {
+	&driver_attr_adapter.attr,
+	&driver_attr_eeprom_cmd.attr,
+	&driver_attr_eeprom_data.attr,
+	&driver_attr_flash_cmd.attr,
+	&driver_attr_flash_data.attr,
+	&driver_attr_nonidx_addr.attr,
+	&driver_attr_nonidx_data.attr,
+	&driver_attr_idx_addr.attr,
+	&driver_attr_idx_data.attr,
+	&driver_attr_wqm_quanta.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(nes);
 
-static void nes_remove_driver_sysfs(struct pci_driver *drv)
-{
-	driver_remove_file(&drv->driver, &driver_attr_adapter);
-	driver_remove_file(&drv->driver, &driver_attr_eeprom_cmd);
-	driver_remove_file(&drv->driver, &driver_attr_eeprom_data);
-	driver_remove_file(&drv->driver, &driver_attr_flash_cmd);
-	driver_remove_file(&drv->driver, &driver_attr_flash_data);
-	driver_remove_file(&drv->driver, &driver_attr_nonidx_addr);
-	driver_remove_file(&drv->driver, &driver_attr_nonidx_data);
-	driver_remove_file(&drv->driver, &driver_attr_idx_addr);
-	driver_remove_file(&drv->driver, &driver_attr_idx_data);
-	driver_remove_file(&drv->driver, &driver_attr_wqm_quanta);
-}
+static struct pci_driver nes_pci_driver = {
+	.name = DRV_NAME,
+	.id_table = nes_pci_table,
+	.probe = nes_probe,
+	.remove = nes_remove,
+	.groups = nes_groups,
+};
+
 
 /**
  * nes_init_module - module initialization entry point
@@ -1192,20 +1178,13 @@ static void nes_remove_driver_sysfs(struct pci_driver *drv)
 static int __init nes_init_module(void)
 {
 	int retval;
-	int retval1;
 
 	retval = nes_cm_start();
 	if (retval) {
 		printk(KERN_ERR PFX "Unable to start NetEffect iWARP CM.\n");
 		return retval;
 	}
-	retval = pci_register_driver(&nes_pci_driver);
-	if (retval >= 0) {
-		retval1 = nes_create_driver_sysfs(&nes_pci_driver);
-		if (retval1 < 0)
-			printk(KERN_ERR PFX "Unable to create NetEffect sys files.\n");
-	}
-	return retval;
+	return pci_register_driver(&nes_pci_driver);
 }
 
 
@@ -1215,7 +1194,6 @@ static int __init nes_init_module(void)
 static void __exit nes_exit_module(void)
 {
 	nes_cm_stop();
-	nes_remove_driver_sysfs(&nes_pci_driver);
 
 	pci_unregister_driver(&nes_pci_driver);
 }
diff --git a/drivers/infiniband/hw/nes/nes_verbs.c b/drivers/infiniband/hw/nes/nes_verbs.c
index 25dcd75..f0dc5f4 100644
--- a/drivers/infiniband/hw/nes/nes_verbs.c
+++ b/drivers/infiniband/hw/nes/nes_verbs.c
@@ -481,21 +481,16 @@ static int nes_query_port(struct ib_device *ibdev, u8 port, struct ib_port_attr
 	props->active_mtu = ib_mtu_int_to_enum(netdev->mtu);
 
 	props->lid = 1;
-	props->lmc = 0;
-	props->sm_lid = 0;
-	props->sm_sl = 0;
 	if (netif_queue_stopped(netdev))
 		props->state = IB_PORT_DOWN;
 	else if (nesvnic->linkup)
 		props->state = IB_PORT_ACTIVE;
 	else
 		props->state = IB_PORT_DOWN;
-	props->phys_state = 0;
 	props->port_cap_flags = IB_PORT_CM_SUP | IB_PORT_REINIT_SUP |
 			IB_PORT_VENDOR_CLASS_SUP | IB_PORT_BOOT_MGMT_SUP;
 	props->gid_tbl_len = 1;
 	props->pkey_tbl_len = 1;
-	props->qkey_viol_cntr = 0;
 	props->active_width = IB_WIDTH_4X;
 	props->active_speed = IB_SPEED_SDR;
 	props->max_msg_sz = 0x80000000;
@@ -3672,15 +3667,14 @@ static int nes_port_immutable(struct ib_device *ibdev, u8 port_num,
 	return 0;
 }
 
-static void get_dev_fw_str(struct ib_device *dev, char *str,
-			   size_t str_len)
+static void get_dev_fw_str(struct ib_device *dev, char *str)
 {
 	struct nes_ib_device *nesibdev =
 			container_of(dev, struct nes_ib_device, ibdev);
 	struct nes_vnic *nesvnic = nesibdev->nesvnic;
 
 	nes_debug(NES_DBG_INIT, "\n");
-	snprintf(str, str_len, "%u.%u",
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%u.%u",
 		 (nesvnic->nesdev->nesadapter->firmware_version >> 16),
 		 (nesvnic->nesdev->nesadapter->firmware_version & 0x000000ff));
 }
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_main.c b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
index 57c9a2a..fbfbd9e 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_main.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_main.c
@@ -58,7 +58,6 @@
 #include "ocrdma_stats.h"
 #include <rdma/ocrdma-abi.h>
 
-MODULE_VERSION(OCRDMA_ROCE_DRV_VERSION);
 MODULE_DESCRIPTION(OCRDMA_ROCE_DRV_DESC " " OCRDMA_ROCE_DRV_VERSION);
 MODULE_AUTHOR("Emulex Corporation");
 MODULE_LICENSE("Dual BSD/GPL");
@@ -108,12 +107,11 @@ static int ocrdma_port_immutable(struct ib_device *ibdev, u8 port_num,
 	return 0;
 }
 
-static void get_dev_fw_str(struct ib_device *device, char *str,
-			   size_t str_len)
+static void get_dev_fw_str(struct ib_device *device, char *str)
 {
 	struct ocrdma_dev *dev = get_ocrdma_dev(device);
 
-	snprintf(str, str_len, "%s", &dev->attr.fw_ver[0]);
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%s", &dev->attr.fw_ver[0]);
 }
 
 static int ocrdma_register_device(struct ocrdma_dev *dev)
diff --git a/drivers/infiniband/hw/qedr/main.c b/drivers/infiniband/hw/qedr/main.c
index b5851fd..97d033f 100644
--- a/drivers/infiniband/hw/qedr/main.c
+++ b/drivers/infiniband/hw/qedr/main.c
@@ -47,7 +47,6 @@
 MODULE_DESCRIPTION("QLogic 40G/100G ROCE Driver");
 MODULE_AUTHOR("QLogic Corporation");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(QEDR_MODULE_VERSION);
 
 #define QEDR_WQ_MULTIPLIER_DFT	(3)
 
@@ -69,13 +68,12 @@ static enum rdma_link_layer qedr_link_layer(struct ib_device *device,
 	return IB_LINK_LAYER_ETHERNET;
 }
 
-static void qedr_get_dev_fw_str(struct ib_device *ibdev, char *str,
-				size_t str_len)
+static void qedr_get_dev_fw_str(struct ib_device *ibdev, char *str)
 {
 	struct qedr_dev *qedr = get_qedr_dev(ibdev);
 	u32 fw_ver = (u32)qedr->attr.fw_ver;
 
-	snprintf(str, str_len, "%d. %d. %d. %d",
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%d. %d. %d. %d",
 		 (fw_ver >> 24) & 0xFF, (fw_ver >> 16) & 0xFF,
 		 (fw_ver >> 8) & 0xFF, fw_ver & 0xFF);
 }
@@ -778,6 +776,7 @@ static struct qedr_dev *qedr_add(struct qed_dev *cdev, struct pci_dev *pdev,
 	if (rc)
 		goto init_err;
 
+	dev->user_dpm_enabled = dev_info.user_dpm_enabled;
 	dev->num_hwfns = dev_info.common.num_hwfns;
 	dev->rdma_ctx = dev->ops->rdma_get_rdma_ctx(cdev);
 
diff --git a/drivers/infiniband/hw/qedr/qedr.h b/drivers/infiniband/hw/qedr/qedr.h
index ab7784b..b2bb42e 100644
--- a/drivers/infiniband/hw/qedr/qedr.h
+++ b/drivers/infiniband/hw/qedr/qedr.h
@@ -41,7 +41,6 @@
 #include <linux/qed/roce_common.h>
 #include "qedr_hsi_rdma.h"
 
-#define QEDR_MODULE_VERSION	"8.10.10.0"
 #define QEDR_NODE_DESC "QLogic 579xx RoCE HCA"
 #define DP_NAME(dev) ((dev)->ibdev.name)
 
@@ -163,6 +162,8 @@ struct qedr_dev {
 	struct qedr_qp		*gsi_qp;
 
 	unsigned long enet_state;
+
+	u8 user_dpm_enabled;
 };
 
 #define QEDR_MAX_SQ_PBL			(0x8000)
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 2ae71b8..769ac07 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -376,6 +376,9 @@ struct ib_ucontext *qedr_alloc_ucontext(struct ib_device *ibdev,
 
 	memset(&uresp, 0, sizeof(uresp));
 
+	uresp.dpm_enabled = dev->user_dpm_enabled;
+	uresp.wids_enabled = 1;
+	uresp.wid_count = oparams.wid_count;
 	uresp.db_pa = ctx->dpi_phys_addr;
 	uresp.db_size = ctx->dpi_size;
 	uresp.max_send_wr = dev->attr.max_sqe;
@@ -488,7 +491,7 @@ struct ib_pd *qedr_alloc_pd(struct ib_device *ibdev,
 		 (udata && context) ? "User Lib" : "Kernel");
 
 	if (!dev->rdma_ctx) {
-		DP_ERR(dev, "invlaid RDMA context\n");
+		DP_ERR(dev, "invalid RDMA context\n");
 		return ERR_PTR(-EINVAL);
 	}
 
diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h
index a3e21a2..f9e1c69 100644
--- a/drivers/infiniband/hw/qib/qib.h
+++ b/drivers/infiniband/hw/qib/qib.h
@@ -1,7 +1,7 @@
 #ifndef _QIB_KERNEL_H
 #define _QIB_KERNEL_H
 /*
- * Copyright (c) 2012, 2013 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2012 - 2017 Intel Corporation.  All rights reserved.
  * Copyright (c) 2006 - 2012 QLogic Corporation. All rights reserved.
  * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
  *
@@ -443,7 +443,7 @@ struct qib_irq_notify;
 #endif
 
 struct qib_msix_entry {
-	struct msix_entry msix;
+	int irq;
 	void *arg;
 #ifdef CONFIG_INFINIBAND_QIB_DCA
 	int dca;
@@ -1433,9 +1433,9 @@ int qib_pcie_init(struct pci_dev *, const struct pci_device_id *);
 int qib_pcie_ddinit(struct qib_devdata *, struct pci_dev *,
 		    const struct pci_device_id *);
 void qib_pcie_ddcleanup(struct qib_devdata *);
-int qib_pcie_params(struct qib_devdata *, u32, u32 *, struct qib_msix_entry *);
+int qib_pcie_params(struct qib_devdata *dd, u32 minw, u32 *nent);
 int qib_reinit_intr(struct qib_devdata *);
-void qib_enable_intx(struct pci_dev *);
+void qib_enable_intx(struct qib_devdata *dd);
 void qib_nomsi(struct qib_devdata *);
 void qib_nomsix(struct qib_devdata *);
 void qib_pcie_getcmd(struct qib_devdata *, u16 *, u8 *, u8 *);
diff --git a/drivers/infiniband/hw/qib/qib_debugfs.c b/drivers/infiniband/hw/qib/qib_debugfs.c
index 5bad8e3..5ed1ed9 100644
--- a/drivers/infiniband/hw/qib/qib_debugfs.c
+++ b/drivers/infiniband/hw/qib/qib_debugfs.c
@@ -1,6 +1,5 @@
-#ifdef CONFIG_DEBUG_FS
 /*
- * Copyright (c) 2013 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2013 - 2017 Intel Corporation.  All rights reserved.
  *
  * This software is available to you under a choice of one of two
  * licenses.  You may choose to be licensed under the terms of the GNU
@@ -191,10 +190,10 @@ DEBUGFS_FILE(ctx_stats)
 static void *_qp_stats_seq_start(struct seq_file *s, loff_t *pos)
 	__acquires(RCU)
 {
-	struct qib_qp_iter *iter;
+	struct rvt_qp_iter *iter;
 	loff_t n = *pos;
 
-	iter = qib_qp_iter_init(s->private);
+	iter = rvt_qp_iter_init(s->private, 0, NULL);
 
 	/* stop calls rcu_read_unlock */
 	rcu_read_lock();
@@ -203,7 +202,7 @@ static void *_qp_stats_seq_start(struct seq_file *s, loff_t *pos)
 		return NULL;
 
 	do {
-		if (qib_qp_iter_next(iter)) {
+		if (rvt_qp_iter_next(iter)) {
 			kfree(iter);
 			return NULL;
 		}
@@ -216,11 +215,11 @@ static void *_qp_stats_seq_next(struct seq_file *s, void *iter_ptr,
 				   loff_t *pos)
 	__must_hold(RCU)
 {
-	struct qib_qp_iter *iter = iter_ptr;
+	struct rvt_qp_iter *iter = iter_ptr;
 
 	(*pos)++;
 
-	if (qib_qp_iter_next(iter)) {
+	if (rvt_qp_iter_next(iter)) {
 		kfree(iter);
 		return NULL;
 	}
@@ -236,7 +235,7 @@ static void _qp_stats_seq_stop(struct seq_file *s, void *iter_ptr)
 
 static int _qp_stats_seq_show(struct seq_file *s, void *iter_ptr)
 {
-	struct qib_qp_iter *iter = iter_ptr;
+	struct rvt_qp_iter *iter = iter_ptr;
 
 	if (!iter)
 		return 0;
@@ -284,6 +283,3 @@ void qib_dbg_exit(void)
 	debugfs_remove_recursive(qib_dbg_root);
 	qib_dbg_root = NULL;
 }
-
-#endif
-
diff --git a/drivers/infiniband/hw/qib/qib_driver.c b/drivers/infiniband/hw/qib/qib_driver.c
index 2b5982f..719906a 100644
--- a/drivers/infiniband/hw/qib/qib_driver.c
+++ b/drivers/infiniband/hw/qib/qib_driver.c
@@ -66,7 +66,6 @@ MODULE_PARM_DESC(compat_ddr_negotiate,
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Intel <ibsupport@intel.com>");
 MODULE_DESCRIPTION("Intel IB driver");
-MODULE_VERSION(QIB_DRIVER_VERSION);
 
 /*
  * QIB_PIO_MAXIBHDR is the max IB header size allowed for in our
diff --git a/drivers/infiniband/hw/qib/qib_iba6120.c b/drivers/infiniband/hw/qib/qib_iba6120.c
index e423b71..3259a60 100644
--- a/drivers/infiniband/hw/qib/qib_iba6120.c
+++ b/drivers/infiniband/hw/qib/qib_iba6120.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2013 Intel Corporation. All rights reserved.
+ * Copyright (c) 2013 - 2017 Intel Corporation. All rights reserved.
  * Copyright (c) 2006, 2007, 2008, 2009, 2010 QLogic Corporation.
  * All rights reserved.
  * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
@@ -1742,38 +1742,32 @@ static void qib_setup_6120_interrupt(struct qib_devdata *dd)
  */
 static void pe_boardname(struct qib_devdata *dd)
 {
-	char *n;
-	u32 boardid, namelen;
+	u32 boardid;
 
 	boardid = SYM_FIELD(dd->revision, Revision,
 			    BoardID);
 
 	switch (boardid) {
 	case 2:
-		n = "InfiniPath_QLE7140";
+		dd->boardname = "InfiniPath_QLE7140";
 		break;
 	default:
 		qib_dev_err(dd, "Unknown 6120 board with ID %u\n", boardid);
-		n = "Unknown_InfiniPath_6120";
+		dd->boardname = "Unknown_InfiniPath_6120";
 		break;
 	}
-	namelen = strlen(n) + 1;
-	dd->boardname = kmalloc(namelen, GFP_KERNEL);
-	if (dd->boardname)
-		snprintf(dd->boardname, namelen, "%s", n);
 
 	if (dd->majrev != 4 || !dd->minrev || dd->minrev > 2)
 		qib_dev_err(dd,
-			"Unsupported InfiniPath hardware revision %u.%u!\n",
-			dd->majrev, dd->minrev);
+			    "Unsupported InfiniPath hardware revision %u.%u!\n",
+			    dd->majrev, dd->minrev);
 
 	snprintf(dd->boardversion, sizeof(dd->boardversion),
 		 "ChipABI %u.%u, %s, InfiniPath%u %u.%u, SW Compat %u\n",
 		 QIB_CHIP_VERS_MAJ, QIB_CHIP_VERS_MIN, dd->boardname,
-		 (unsigned)SYM_FIELD(dd->revision, Revision_R, Arch),
+		 (unsigned int)SYM_FIELD(dd->revision, Revision_R, Arch),
 		 dd->majrev, dd->minrev,
-		 (unsigned)SYM_FIELD(dd->revision, Revision_R, SW));
-
+		 (unsigned int)SYM_FIELD(dd->revision, Revision_R, SW));
 }
 
 /*
@@ -1838,7 +1832,7 @@ static int qib_6120_setup_reset(struct qib_devdata *dd)
 
 bail:
 	if (ret) {
-		if (qib_pcie_params(dd, dd->lbus_width, NULL, NULL))
+		if (qib_pcie_params(dd, dd->lbus_width, NULL))
 			qib_dev_err(dd,
 				"Reset failed to setup PCIe or interrupts; continuing anyway\n");
 		/* clear the reset error, init error/hwerror mask */
@@ -3562,7 +3556,7 @@ struct qib_devdata *qib_init_iba6120_funcs(struct pci_dev *pdev,
 	if (qib_mini_init)
 		goto bail;
 
-	if (qib_pcie_params(dd, 8, NULL, NULL))
+	if (qib_pcie_params(dd, 8, NULL))
 		qib_dev_err(dd,
 			"Failed to setup PCIe or interrupts; continuing anyway\n");
 	dd->cspec->irq = pdev->irq; /* save IRQ */
diff --git a/drivers/infiniband/hw/qib/qib_iba7220.c b/drivers/infiniband/hw/qib/qib_iba7220.c
index c3679c4..04bdd3d 100644
--- a/drivers/infiniband/hw/qib/qib_iba7220.c
+++ b/drivers/infiniband/hw/qib/qib_iba7220.c
@@ -1,4 +1,5 @@
 /*
+ * Copyright (c) 2011 - 2017 Intel Corporation.  All rights reserved.
  * Copyright (c) 2006, 2007, 2008, 2009, 2010 QLogic Corporation.
  * All rights reserved.
  * Copyright (c) 2003, 2004, 2005, 2006 PathScale, Inc. All rights reserved.
@@ -2049,41 +2050,35 @@ static void qib_setup_7220_interrupt(struct qib_devdata *dd)
  */
 static void qib_7220_boardname(struct qib_devdata *dd)
 {
-	char *n;
-	u32 boardid, namelen;
+	u32 boardid;
 
 	boardid = SYM_FIELD(dd->revision, Revision,
 			    BoardID);
 
 	switch (boardid) {
 	case 1:
-		n = "InfiniPath_QLE7240";
+		dd->boardname = "InfiniPath_QLE7240";
 		break;
 	case 2:
-		n = "InfiniPath_QLE7280";
+		dd->boardname = "InfiniPath_QLE7280";
 		break;
 	default:
 		qib_dev_err(dd, "Unknown 7220 board with ID %u\n", boardid);
-		n = "Unknown_InfiniPath_7220";
+		dd->boardname = "Unknown_InfiniPath_7220";
 		break;
 	}
 
-	namelen = strlen(n) + 1;
-	dd->boardname = kmalloc(namelen, GFP_KERNEL);
-	if (dd->boardname)
-		snprintf(dd->boardname, namelen, "%s", n);
-
 	if (dd->majrev != 5 || !dd->minrev || dd->minrev > 2)
 		qib_dev_err(dd,
-			"Unsupported InfiniPath hardware revision %u.%u!\n",
-			dd->majrev, dd->minrev);
+			    "Unsupported InfiniPath hardware revision %u.%u!\n",
+			    dd->majrev, dd->minrev);
 
 	snprintf(dd->boardversion, sizeof(dd->boardversion),
 		 "ChipABI %u.%u, %s, InfiniPath%u %u.%u, SW Compat %u\n",
 		 QIB_CHIP_VERS_MAJ, QIB_CHIP_VERS_MIN, dd->boardname,
-		 (unsigned)SYM_FIELD(dd->revision, Revision_R, Arch),
+		 (unsigned int)SYM_FIELD(dd->revision, Revision_R, Arch),
 		 dd->majrev, dd->minrev,
-		 (unsigned)SYM_FIELD(dd->revision, Revision_R, SW));
+		 (unsigned int)SYM_FIELD(dd->revision, Revision_R, SW));
 }
 
 /*
@@ -2148,7 +2143,7 @@ static int qib_setup_7220_reset(struct qib_devdata *dd)
 
 bail:
 	if (ret) {
-		if (qib_pcie_params(dd, dd->lbus_width, NULL, NULL))
+		if (qib_pcie_params(dd, dd->lbus_width, NULL))
 			qib_dev_err(dd,
 				"Reset failed to setup PCIe or interrupts; continuing anyway\n");
 
@@ -3309,7 +3304,7 @@ static int qib_7220_intr_fallback(struct qib_devdata *dd)
 	qib_devinfo(dd->pcidev,
 		"MSI interrupt not detected, trying INTx interrupts\n");
 	qib_7220_free_irq(dd);
-	qib_enable_intx(dd->pcidev);
+	qib_enable_intx(dd);
 	/*
 	 * Some newer kernels require free_irq before disable_msi,
 	 * and irq can be changed during disable and INTx enable
@@ -4619,7 +4614,7 @@ struct qib_devdata *qib_init_iba7220_funcs(struct pci_dev *pdev,
 		minwidth = 8; /* x8 capable boards */
 		break;
 	}
-	if (qib_pcie_params(dd, minwidth, NULL, NULL))
+	if (qib_pcie_params(dd, minwidth, NULL))
 		qib_dev_err(dd,
 			"Failed to setup PCIe or interrupts; continuing anyway\n");
 
diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c b/drivers/infiniband/hw/qib/qib_iba7322.c
index bb2439f..14cadf6 100644
--- a/drivers/infiniband/hw/qib/qib_iba7322.c
+++ b/drivers/infiniband/hw/qib/qib_iba7322.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2012 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2012 - 2017 Intel Corporation.  All rights reserved.
  * Copyright (c) 2008 - 2012 QLogic Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
@@ -2841,10 +2841,10 @@ static void qib_7322_nomsix(struct qib_devdata *dd)
 			reset_dca_notifier(dd, &dd->cspec->msix_entries[i]);
 #endif
 			irq_set_affinity_hint(
-			  dd->cspec->msix_entries[i].msix.vector, NULL);
+				dd->cspec->msix_entries[i].irq, NULL);
 			free_cpumask_var(dd->cspec->msix_entries[i].mask);
-			free_irq(dd->cspec->msix_entries[i].msix.vector,
-			   dd->cspec->msix_entries[i].arg);
+			free_irq(dd->cspec->msix_entries[i].irq,
+				 dd->cspec->msix_entries[i].arg);
 		}
 		qib_nomsix(dd);
 	}
@@ -3336,9 +3336,9 @@ static void reset_dca_notifier(struct qib_devdata *dd, struct qib_msix_entry *m)
 	qib_devinfo(dd->pcidev,
 		"Disabling notifier on HCA %d irq %d\n",
 		dd->unit,
-		m->msix.vector);
+		m->irq);
 	irq_set_affinity_notifier(
-		m->msix.vector,
+		m->irq,
 		NULL);
 	m->notifier = NULL;
 }
@@ -3354,7 +3354,7 @@ static void setup_dca_notifier(struct qib_devdata *dd, struct qib_msix_entry *m)
 		int ret;
 
 		m->notifier = n;
-		n->notify.irq = m->msix.vector;
+		n->notify.irq = m->irq;
 		n->notify.notify = qib_irq_notifier_notify;
 		n->notify.release = qib_irq_notifier_release;
 		n->arg = m->arg;
@@ -3500,10 +3500,21 @@ static void qib_setup_7322_interrupt(struct qib_devdata *dd, int clearpend)
 				 - 1,
 				QIB_DRV_NAME "%d (kctx)", dd->unit);
 		}
-		ret = request_irq(
-			dd->cspec->msix_entries[msixnum].msix.vector,
-			handler, 0, dd->cspec->msix_entries[msixnum].name,
-			arg);
+
+		dd->cspec->msix_entries[msixnum].irq = pci_irq_vector(
+			dd->pcidev, msixnum);
+		if (dd->cspec->msix_entries[msixnum].irq < 0) {
+			qib_dev_err(dd,
+				    "Couldn't get MSIx irq (vec=%d): %d\n",
+				    msixnum,
+				    dd->cspec->msix_entries[msixnum].irq);
+			qib_7322_nomsix(dd);
+			goto try_intx;
+		}
+		ret = request_irq(dd->cspec->msix_entries[msixnum].irq,
+				  handler, 0,
+				  dd->cspec->msix_entries[msixnum].name,
+				  arg);
 		if (ret) {
 			/*
 			 * Shouldn't happen since the enable said we could
@@ -3512,7 +3523,7 @@ static void qib_setup_7322_interrupt(struct qib_devdata *dd, int clearpend)
 			qib_dev_err(dd,
 				"Couldn't setup MSIx interrupt (vec=%d, irq=%d): %d\n",
 				msixnum,
-				dd->cspec->msix_entries[msixnum].msix.vector,
+				dd->cspec->msix_entries[msixnum].irq,
 				ret);
 			qib_7322_nomsix(dd);
 			goto try_intx;
@@ -3548,7 +3559,7 @@ static void qib_setup_7322_interrupt(struct qib_devdata *dd, int clearpend)
 					dd->cspec->msix_entries[msixnum].mask);
 			}
 			irq_set_affinity_hint(
-				dd->cspec->msix_entries[msixnum].msix.vector,
+				dd->cspec->msix_entries[msixnum].irq,
 				dd->cspec->msix_entries[msixnum].mask);
 		}
 		msixnum++;
@@ -3571,75 +3582,69 @@ bail:;
 static unsigned qib_7322_boardname(struct qib_devdata *dd)
 {
 	/* Will need enumeration of board-types here */
-	char *n;
-	u32 boardid, namelen;
-	unsigned features = DUAL_PORT_CAP;
+	u32 boardid;
+	unsigned int features = DUAL_PORT_CAP;
 
 	boardid = SYM_FIELD(dd->revision, Revision, BoardID);
 
 	switch (boardid) {
 	case 0:
-		n = "InfiniPath_QLE7342_Emulation";
+		dd->boardname = "InfiniPath_QLE7342_Emulation";
 		break;
 	case 1:
-		n = "InfiniPath_QLE7340";
+		dd->boardname = "InfiniPath_QLE7340";
 		dd->flags |= QIB_HAS_QSFP;
 		features = PORT_SPD_CAP;
 		break;
 	case 2:
-		n = "InfiniPath_QLE7342";
+		dd->boardname = "InfiniPath_QLE7342";
 		dd->flags |= QIB_HAS_QSFP;
 		break;
 	case 3:
-		n = "InfiniPath_QMI7342";
+		dd->boardname = "InfiniPath_QMI7342";
 		break;
 	case 4:
-		n = "InfiniPath_Unsupported7342";
+		dd->boardname = "InfiniPath_Unsupported7342";
 		qib_dev_err(dd, "Unsupported version of QMH7342\n");
 		features = 0;
 		break;
 	case BOARD_QMH7342:
-		n = "InfiniPath_QMH7342";
+		dd->boardname = "InfiniPath_QMH7342";
 		features = 0x24;
 		break;
 	case BOARD_QME7342:
-		n = "InfiniPath_QME7342";
+		dd->boardname = "InfiniPath_QME7342";
 		break;
 	case 8:
-		n = "InfiniPath_QME7362";
+		dd->boardname = "InfiniPath_QME7362";
 		dd->flags |= QIB_HAS_QSFP;
 		break;
 	case BOARD_QMH7360:
-		n = "Intel IB QDR 1P FLR-QSFP Adptr";
+		dd->boardname = "Intel IB QDR 1P FLR-QSFP Adptr";
 		dd->flags |= QIB_HAS_QSFP;
 		break;
 	case 15:
-		n = "InfiniPath_QLE7342_TEST";
+		dd->boardname = "InfiniPath_QLE7342_TEST";
 		dd->flags |= QIB_HAS_QSFP;
 		break;
 	default:
-		n = "InfiniPath_QLE73xy_UNKNOWN";
+		dd->boardname = "InfiniPath_QLE73xy_UNKNOWN";
 		qib_dev_err(dd, "Unknown 7322 board type %u\n", boardid);
 		break;
 	}
 	dd->board_atten = 1; /* index into txdds_Xdr */
 
-	namelen = strlen(n) + 1;
-	dd->boardname = kmalloc(namelen, GFP_KERNEL);
-	if (dd->boardname)
-		snprintf(dd->boardname, namelen, "%s", n);
-
 	snprintf(dd->boardversion, sizeof(dd->boardversion),
 		 "ChipABI %u.%u, %s, InfiniPath%u %u.%u, SW Compat %u\n",
 		 QIB_CHIP_VERS_MAJ, QIB_CHIP_VERS_MIN, dd->boardname,
-		 (unsigned)SYM_FIELD(dd->revision, Revision_R, Arch),
+		 (unsigned int)SYM_FIELD(dd->revision, Revision_R, Arch),
 		 dd->majrev, dd->minrev,
-		 (unsigned)SYM_FIELD(dd->revision, Revision_R, SW));
+		 (unsigned int)SYM_FIELD(dd->revision, Revision_R, SW));
 
 	if (qib_singleport && (features >> PORT_SPD_CAP_SHIFT) & PORT_SPD_CAP) {
 		qib_devinfo(dd->pcidev,
-			"IB%u: Forced to single port mode by module parameter\n",
-			dd->unit);
+			    "IB%u: Forced to single port mode by module parameter\n",
+			    dd->unit);
 		features &= PORT_SPD_CAP;
 	}
 
@@ -3744,7 +3749,6 @@ static int qib_do_7322_reset(struct qib_devdata *dd)
 	if (msix_entries) {
 		/* restore the MSIx vector address and data if saved above */
 		for (i = 0; i < msix_entries; i++) {
-			dd->cspec->msix_entries[i].msix.entry = i;
 			if (!msix_vecsave || !msix_vecsave[2 * i])
 				continue;
 			qib_write_kreg(dd, 2 * i +
@@ -3762,8 +3766,7 @@ static int qib_do_7322_reset(struct qib_devdata *dd)
 	write_7322_initregs(dd);
 
 	if (qib_pcie_params(dd, dd->lbus_width,
-			    &dd->cspec->num_msix_entries,
-			    dd->cspec->msix_entries))
+			    &dd->cspec->num_msix_entries))
 		qib_dev_err(dd,
 			"Reset failed to setup PCIe or interrupts; continuing anyway\n");
 
@@ -5195,7 +5198,7 @@ static int qib_7322_intr_fallback(struct qib_devdata *dd)
 	qib_devinfo(dd->pcidev,
 		"MSIx interrupt not detected, trying INTx interrupts\n");
 	qib_7322_nomsix(dd);
-	qib_enable_intx(dd->pcidev);
+	qib_enable_intx(dd);
 	qib_setup_7322_interrupt(dd, 0);
 	return 1;
 }
@@ -6172,7 +6175,7 @@ static int setup_txselect(const char *str, struct kernel_param *kp)
 	unsigned long val;
 	char *n;
 
-	if (strlen(str) >= MAX_ATTEN_LEN) {
+	if (strlen(str) >= ARRAY_SIZE(txselect_list)) {
 		pr_info("txselect_values string too long\n");
 		return -ENOSPC;
 	}
@@ -6183,7 +6186,7 @@ static int setup_txselect(const char *str, struct kernel_param *kp)
 			TXDDS_TABLE_SZ + TXDDS_EXTRA_SZ + TXDDS_MFG_SZ);
 		return -EINVAL;
 	}
-	strcpy(txselect_list, str);
+	strncpy(txselect_list, str, ARRAY_SIZE(txselect_list) - 1);
 
 	list_for_each_entry(dd, &qib_dev_list, list)
 		if (dd->deviceid == PCI_DEVICE_ID_QLOGIC_IB_7322)
@@ -7327,10 +7330,7 @@ struct qib_devdata *qib_init_iba7322_funcs(struct pci_dev *pdev,
 	if (!dd->cspec->msix_entries)
 		tabsize = 0;
 
-	for (i = 0; i < tabsize; i++)
-		dd->cspec->msix_entries[i].msix.entry = i;
-
-	if (qib_pcie_params(dd, 8, &tabsize, dd->cspec->msix_entries))
+	if (qib_pcie_params(dd, 8, &tabsize))
 		qib_dev_err(dd,
 			"Failed to setup PCIe or interrupts; continuing anyway\n");
 	/* may be less than we wanted, if not enough available */
diff --git a/drivers/infiniband/hw/qib/qib_init.c b/drivers/infiniband/hw/qib/qib_init.c
index 6c16ba1..c5a4c65 100644
--- a/drivers/infiniband/hw/qib/qib_init.c
+++ b/drivers/infiniband/hw/qib/qib_init.c
@@ -399,7 +399,7 @@ static int loadtime_init(struct qib_devdata *dd)
 	if (((dd->revision >> QLOGIC_IB_R_SOFTWARE_SHIFT) &
 	     QLOGIC_IB_R_SOFTWARE_MASK) != QIB_CHIP_SWVERSION) {
 		qib_dev_err(dd,
-			"Driver only handles version %d, chip swversion is %d (%llx), failng\n",
+			"Driver only handles version %d, chip swversion is %d (%llx), failing\n",
 			QIB_CHIP_SWVERSION,
 			(int)(dd->revision >>
 				QLOGIC_IB_R_SOFTWARE_SHIFT) &
@@ -1398,7 +1398,6 @@ static void cleanup_device_data(struct qib_devdata *dd)
 		qib_free_ctxtdata(dd, rcd);
 	}
 	kfree(tmp);
-	kfree(dd->boardname);
 }
 
 /*
diff --git a/drivers/infiniband/hw/qib/qib_mad.c b/drivers/infiniband/hw/qib/qib_mad.c
index da295e0..82d9da9 100644
--- a/drivers/infiniband/hw/qib/qib_mad.c
+++ b/drivers/infiniband/hw/qib/qib_mad.c
@@ -105,7 +105,7 @@ static void qib_send_trap(struct qib_ibport *ibp, void *data, unsigned len)
 		if (ibp->rvp.sm_lid != be16_to_cpu(IB_LID_PERMISSIVE)) {
 			struct ib_ah *ah;
 
-			ah = qib_create_qp0_ah(ibp, ibp->rvp.sm_lid);
+			ah = qib_create_qp0_ah(ibp, (u16)ibp->rvp.sm_lid);
 			if (IS_ERR(ah))
 				ret = PTR_ERR(ah);
 			else {
@@ -134,24 +134,21 @@ static void qib_send_trap(struct qib_ibport *ibp, void *data, unsigned len)
 }
 
 /*
- * Send a bad [PQ]_Key trap (ch. 14.3.8).
+ * Send a bad P_Key trap (ch. 14.3.8).
  */
-void qib_bad_pqkey(struct qib_ibport *ibp, __be16 trap_num, u32 key, u32 sl,
-		   u32 qp1, u32 qp2, __be16 lid1, __be16 lid2)
+void qib_bad_pkey(struct qib_ibport *ibp, u32 key, u32 sl,
+		  u32 qp1, u32 qp2, __be16 lid1, __be16 lid2)
 {
 	struct ib_mad_notice_attr data;
 
-	if (trap_num == IB_NOTICE_TRAP_BAD_PKEY)
-		ibp->rvp.pkey_violations++;
-	else
-		ibp->rvp.qkey_violations++;
 	ibp->rvp.n_pkt_drops++;
+	ibp->rvp.pkey_violations++;
 
 	/* Send violation trap */
 	data.generic_type = IB_NOTICE_TYPE_SECURITY;
 	data.prod_type_msb = 0;
 	data.prod_type_lsb = IB_NOTICE_PROD_CA;
-	data.trap_num = trap_num;
+	data.trap_num = IB_NOTICE_TRAP_BAD_PKEY;
 	data.issuer_lid = cpu_to_be16(ppd_from_ibp(ibp)->lid);
 	data.toggle_count = 0;
 	memset(&data.details, 0, sizeof(data.details));
@@ -499,7 +496,7 @@ static int subn_get_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
 		pip->mkey = ibp->rvp.mkey;
 	pip->gid_prefix = ibp->rvp.gid_prefix;
 	pip->lid = cpu_to_be16(ppd->lid);
-	pip->sm_lid = cpu_to_be16(ibp->rvp.sm_lid);
+	pip->sm_lid = cpu_to_be16((u16)ibp->rvp.sm_lid);
 	pip->cap_mask = cpu_to_be32(ibp->rvp.port_cap_flags);
 	/* pip->diag_code; */
 	pip->mkey_lease_period = cpu_to_be16(ibp->rvp.mkey_lease_period);
@@ -874,8 +871,6 @@ static int subn_set_portinfo(struct ib_smp *smp, struct ib_device *ibdev,
 		ib_dispatch_event(&event);
 	}
 
-	ret = subn_get_portinfo(smp, ibdev, port);
-
 	/* restore re-reg bit per o14-12.2.1 */
 	pip->clientrereg_resv_subnetto |= clientrereg;
 
@@ -1578,8 +1573,8 @@ static int pma_get_portcounters_cong(struct ib_pma_mad *pmp,
 	cntrs.port_xmit_packets -= ibp->z_port_xmit_packets;
 	cntrs.port_rcv_packets -= ibp->z_port_rcv_packets;
 
-	memset(pmp->reserved, 0, sizeof(pmp->reserved) +
-	       sizeof(pmp->data));
+	memset(pmp->reserved, 0, sizeof(pmp->reserved));
+	memset(pmp->data, 0, sizeof(pmp->data));
 
 	/*
 	 * Set top 3 bits to indicate interval in picoseconds in
diff --git a/drivers/infiniband/hw/qib/qib_pcie.c b/drivers/infiniband/hw/qib/qib_pcie.c
index c379b83..d90403e 100644
--- a/drivers/infiniband/hw/qib/qib_pcie.c
+++ b/drivers/infiniband/hw/qib/qib_pcie.c
@@ -1,4 +1,5 @@
 /*
+ * Copyright (c) 2010 - 2017 Intel Corporation.  All rights reserved.
  * Copyright (c) 2008, 2009 QLogic Corporation. All rights reserved.
  *
  * This software is available to you under a choice of one of two
@@ -187,112 +188,84 @@ void qib_pcie_ddcleanup(struct qib_devdata *dd)
 	pci_set_drvdata(dd->pcidev, NULL);
 }
 
-static void qib_msix_setup(struct qib_devdata *dd, int pos, u32 *msixcnt,
-			   struct qib_msix_entry *qib_msix_entry)
-{
-	int ret;
-	int nvec = *msixcnt;
-	struct msix_entry *msix_entry;
-	int i;
-
-	ret = pci_msix_vec_count(dd->pcidev);
-	if (ret < 0)
-		goto do_intx;
-
-	nvec = min(nvec, ret);
-
-	/* We can't pass qib_msix_entry array to qib_msix_setup
-	 * so use a dummy msix_entry array and copy the allocated
-	 * irq back to the qib_msix_entry array. */
-	msix_entry = kcalloc(nvec, sizeof(*msix_entry), GFP_KERNEL);
-	if (!msix_entry)
-		goto do_intx;
-
-	for (i = 0; i < nvec; i++)
-		msix_entry[i] = qib_msix_entry[i].msix;
-
-	ret = pci_enable_msix_range(dd->pcidev, msix_entry, 1, nvec);
-	if (ret < 0)
-		goto free_msix_entry;
-	else
-		nvec = ret;
-
-	for (i = 0; i < nvec; i++)
-		qib_msix_entry[i].msix = msix_entry[i];
-
-	kfree(msix_entry);
-	*msixcnt = nvec;
-	return;
-
-free_msix_entry:
-	kfree(msix_entry);
-
-do_intx:
-	qib_dev_err(
-		dd,
-		"pci_enable_msix_range %d vectors failed: %d, falling back to INTx\n",
-		nvec, ret);
-	*msixcnt = 0;
-	qib_enable_intx(dd->pcidev);
-}
-
 /**
  * We save the msi lo and hi values, so we can restore them after
  * chip reset (the kernel PCI infrastructure doesn't yet handle that
  * correctly.
  */
-static int qib_msi_setup(struct qib_devdata *dd, int pos)
+static void qib_msi_setup(struct qib_devdata *dd, int pos)
 {
 	struct pci_dev *pdev = dd->pcidev;
 	u16 control;
-	int ret;
 
-	ret = pci_enable_msi(pdev);
-	if (ret)
-		qib_dev_err(dd,
-			"pci_enable_msi failed: %d, interrupts may not work\n",
-			ret);
-	/* continue even if it fails, we may still be OK... */
-
-	pci_read_config_dword(pdev, pos + PCI_MSI_ADDRESS_LO,
-			      &dd->msi_lo);
-	pci_read_config_dword(pdev, pos + PCI_MSI_ADDRESS_HI,
-			      &dd->msi_hi);
+	pci_read_config_dword(pdev, pos + PCI_MSI_ADDRESS_LO, &dd->msi_lo);
+	pci_read_config_dword(pdev, pos + PCI_MSI_ADDRESS_HI, &dd->msi_hi);
 	pci_read_config_word(pdev, pos + PCI_MSI_FLAGS, &control);
+
 	/* now save the data (vector) info */
-	pci_read_config_word(pdev, pos + ((control & PCI_MSI_FLAGS_64BIT)
-				    ? 12 : 8),
+	pci_read_config_word(pdev,
+			     pos + ((control & PCI_MSI_FLAGS_64BIT) ? 12 : 8),
 			     &dd->msi_data);
-	return ret;
 }
 
-int qib_pcie_params(struct qib_devdata *dd, u32 minw, u32 *nent,
-		    struct qib_msix_entry *entry)
+static int qib_allocate_irqs(struct qib_devdata *dd, u32 maxvec)
+{
+	unsigned int flags = PCI_IRQ_LEGACY;
+
+	/* Check our capabilities */
+	if (dd->pcidev->msix_cap) {
+		flags |= PCI_IRQ_MSIX;
+	} else {
+		if (dd->pcidev->msi_cap) {
+			flags |= PCI_IRQ_MSI;
+			/* Get msi_lo and msi_hi */
+			qib_msi_setup(dd, dd->pcidev->msi_cap);
+		}
+	}
+
+	if (!(flags & (PCI_IRQ_MSIX | PCI_IRQ_MSI)))
+		qib_dev_err(dd, "No PCI MSI or MSIx capability!\n");
+
+	return pci_alloc_irq_vectors(dd->pcidev, 1, maxvec, flags);
+}
+
+int qib_pcie_params(struct qib_devdata *dd, u32 minw, u32 *nent)
 {
 	u16 linkstat, speed;
-	int pos = 0, ret = 1;
+	int nvec;
+	int maxvec;
+	int ret = 0;
 
 	if (!pci_is_pcie(dd->pcidev)) {
 		qib_dev_err(dd, "Can't find PCI Express capability!\n");
 		/* set up something... */
 		dd->lbus_width = 1;
 		dd->lbus_speed = 2500; /* Gen1, 2.5GHz */
+		ret = -1;
 		goto bail;
 	}
 
-	pos = dd->pcidev->msix_cap;
-	if (nent && *nent && pos) {
-		qib_msix_setup(dd, pos, nent, entry);
-		ret = 0; /* did it, either MSIx or INTx */
-	} else {
-		pos = dd->pcidev->msi_cap;
-		if (pos)
-			ret = qib_msi_setup(dd, pos);
-		else
-			qib_dev_err(dd, "No PCI MSI or MSIx capability!\n");
+	maxvec = (nent && *nent) ? *nent : 1;
+	nvec = qib_allocate_irqs(dd, maxvec);
+	if (nvec < 0) {
+		ret = nvec;
+		goto bail;
 	}
-	if (!pos)
-		qib_enable_intx(dd->pcidev);
+
+	/*
+	 * If nent exists, make sure to record how many vectors were allocated
+	 */
+	if (nent) {
+		*nent = nvec;
+
+		/*
+		 * If we requested (nent) MSIX, but msix_enabled is not set,
+		 * pci_alloc_irq_vectors() enabled INTx.
+		 */
+		if (!dd->pcidev->msix_enabled)
+			qib_dev_err(dd,
+				    "no msix vectors allocated, using INTx\n");
+	}
 
 	pcie_capability_read_word(dd->pcidev, PCI_EXP_LNKSTA, &linkstat);
 	/*
@@ -379,7 +352,7 @@ int qib_reinit_intr(struct qib_devdata *dd)
 	ret = 1;
 bail:
 	if (!ret && (dd->flags & QIB_HAS_INTX)) {
-		qib_enable_intx(dd->pcidev);
+		qib_enable_intx(dd);
 		ret = 1;
 	}
 
@@ -397,7 +370,7 @@ int qib_reinit_intr(struct qib_devdata *dd)
 void qib_nomsi(struct qib_devdata *dd)
 {
 	dd->msi_lo = 0;
-	pci_disable_msi(dd->pcidev);
+	pci_free_irq_vectors(dd->pcidev);
 }
 
 /*
@@ -405,23 +378,21 @@ void qib_nomsi(struct qib_devdata *dd)
  */
 void qib_nomsix(struct qib_devdata *dd)
 {
-	pci_disable_msix(dd->pcidev);
+	pci_free_irq_vectors(dd->pcidev);
 }
 
 /*
  * Similar to pci_intx(pdev, 1), except that we make sure
  * msi(x) is off.
  */
-void qib_enable_intx(struct pci_dev *pdev)
+void qib_enable_intx(struct qib_devdata *dd)
 {
 	u16 cw, new;
 	int pos;
+	struct pci_dev *pdev = dd->pcidev;
 
-	/* first, turn on INTx */
-	pci_read_config_word(pdev, PCI_COMMAND, &cw);
-	new = cw & ~PCI_COMMAND_INTX_DISABLE;
-	if (new != cw)
-		pci_write_config_word(pdev, PCI_COMMAND, new);
+	if (pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_LEGACY) < 0)
+		qib_dev_err(dd, "Failed to enable INTx\n");
 
 	pos = pdev->msi_cap;
 	if (pos) {
diff --git a/drivers/infiniband/hw/qib/qib_qp.c b/drivers/infiniband/hw/qib/qib_qp.c
index a343e3b..344e401 100644
--- a/drivers/infiniband/hw/qib/qib_qp.c
+++ b/drivers/infiniband/hw/qib/qib_qp.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2012, 2013 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2012 - 2017 Intel Corporation.  All rights reserved.
  * Copyright (c) 2006 - 2012 QLogic Corporation.  * All rights reserved.
  * Copyright (c) 2005, 2006 PathScale, Inc. All rights reserved.
  *
@@ -415,53 +415,16 @@ int qib_check_send_wqe(struct rvt_qp *qp,
 
 #ifdef CONFIG_DEBUG_FS
 
-struct qib_qp_iter {
-	struct qib_ibdev *dev;
-	struct rvt_qp *qp;
-	int n;
-};
-
-struct qib_qp_iter *qib_qp_iter_init(struct qib_ibdev *dev)
-{
-	struct qib_qp_iter *iter;
-
-	iter = kzalloc(sizeof(*iter), GFP_KERNEL);
-	if (!iter)
-		return NULL;
-
-	iter->dev = dev;
-
-	return iter;
-}
-
-int qib_qp_iter_next(struct qib_qp_iter *iter)
-{
-	struct qib_ibdev *dev = iter->dev;
-	int n = iter->n;
-	int ret = 1;
-	struct rvt_qp *pqp = iter->qp;
-	struct rvt_qp *qp;
-
-	for (; n < dev->rdi.qp_dev->qp_table_size; n++) {
-		if (pqp)
-			qp = rcu_dereference(pqp->next);
-		else
-			qp = rcu_dereference(dev->rdi.qp_dev->qp_table[n]);
-		pqp = qp;
-		if (qp) {
-			iter->qp = qp;
-			iter->n = n;
-			return 0;
-		}
-	}
-	return ret;
-}
-
 static const char * const qp_type_str[] = {
 	"SMI", "GSI", "RC", "UC", "UD",
 };
 
-void qib_qp_iter_print(struct seq_file *s, struct qib_qp_iter *iter)
+/**
+ * qib_qp_iter_print - print information to seq_file
+ * @s: the seq_file
+ * @iter: the iterator
+ */
+void qib_qp_iter_print(struct seq_file *s, struct rvt_qp_iter *iter)
 {
 	struct rvt_swqe *wqe;
 	struct rvt_qp *qp = iter->qp;
diff --git a/drivers/infiniband/hw/qib/qib_rc.c b/drivers/infiniband/hw/qib/qib_rc.c
index 4ddbcac..e9a9173 100644
--- a/drivers/infiniband/hw/qib/qib_rc.c
+++ b/drivers/infiniband/hw/qib/qib_rc.c
@@ -348,7 +348,7 @@ int qib_make_rc_req(struct rvt_qp *qp, unsigned long *flags)
 		case IB_WR_RDMA_WRITE:
 			if (newreq && !(qp->s_flags & RVT_S_UNLIMITED_CREDIT))
 				qp->s_lsn++;
-			/* FALLTHROUGH */
+			goto no_flow_control;
 		case IB_WR_RDMA_WRITE_WITH_IMM:
 			/* If no credit, return. */
 			if (!(qp->s_flags & RVT_S_UNLIMITED_CREDIT) &&
@@ -356,7 +356,7 @@ int qib_make_rc_req(struct rvt_qp *qp, unsigned long *flags)
 				qp->s_flags |= RVT_S_WAIT_SSN_CREDIT;
 				goto bail;
 			}
-
+no_flow_control:
 			ohdr->u.rc.reth.vaddr =
 				cpu_to_be64(wqe->rdma_wr.remote_addr);
 			ohdr->u.rc.reth.rkey =
diff --git a/drivers/infiniband/hw/qib/qib_ruc.c b/drivers/infiniband/hw/qib/qib_ruc.c
index bd09de7..53efbb0 100644
--- a/drivers/infiniband/hw/qib/qib_ruc.c
+++ b/drivers/infiniband/hw/qib/qib_ruc.c
@@ -58,8 +58,10 @@ static int qib_init_sge(struct rvt_qp *qp, struct rvt_rwqe *wqe)
 		if (wqe->sg_list[i].length == 0)
 			continue;
 		/* Check LKEY */
-		if (!rvt_lkey_ok(rkt, pd, j ? &ss->sg_list[j - 1] : &ss->sge,
-				 &wqe->sg_list[i], IB_ACCESS_LOCAL_WRITE))
+		ret = rvt_lkey_ok(rkt, pd, j ? &ss->sg_list[j - 1] : &ss->sge,
+				  NULL, &wqe->sg_list[i],
+				  IB_ACCESS_LOCAL_WRITE);
+		if (unlikely(ret <= 0))
 			goto bad_lkey;
 		qp->r_len += wqe->sg_list[i].length;
 		j++;
@@ -256,11 +258,11 @@ int qib_ruc_check_hdr(struct qib_ibport *ibp, struct ib_header *hdr,
 		}
 		if (!qib_pkey_ok((u16)bth0,
 				 qib_get_pkey(ibp, qp->s_alt_pkey_index))) {
-			qib_bad_pqkey(ibp, IB_NOTICE_TRAP_BAD_PKEY,
-				      (u16)bth0,
-				      (be16_to_cpu(hdr->lrh[0]) >> 4) & 0xF,
-				      0, qp->ibqp.qp_num,
-				      hdr->lrh[3], hdr->lrh[1]);
+			qib_bad_pkey(ibp,
+				     (u16)bth0,
+				     (be16_to_cpu(hdr->lrh[0]) >> 4) & 0xF,
+				     0, qp->ibqp.qp_num,
+				     hdr->lrh[3], hdr->lrh[1]);
 			goto err;
 		}
 		/* Validate the SLID. See Ch. 9.6.1.5 and 17.2.8 */
@@ -295,11 +297,11 @@ int qib_ruc_check_hdr(struct qib_ibport *ibp, struct ib_header *hdr,
 		}
 		if (!qib_pkey_ok((u16)bth0,
 				 qib_get_pkey(ibp, qp->s_pkey_index))) {
-			qib_bad_pqkey(ibp, IB_NOTICE_TRAP_BAD_PKEY,
-				      (u16)bth0,
-				      (be16_to_cpu(hdr->lrh[0]) >> 4) & 0xF,
-				      0, qp->ibqp.qp_num,
-				      hdr->lrh[3], hdr->lrh[1]);
+			qib_bad_pkey(ibp,
+				     (u16)bth0,
+				     (be16_to_cpu(hdr->lrh[0]) >> 4) & 0xF,
+				     0, qp->ibqp.qp_num,
+				     hdr->lrh[3], hdr->lrh[1]);
 			goto err;
 		}
 		/* Validate the SLID. See Ch. 9.6.1.5 */
@@ -643,8 +645,10 @@ u32 qib_make_grh(struct qib_ibport *ibp, struct ib_grh *hdr,
 	hdr->hop_limit = grh->hop_limit;
 	/* The SGID is 32-bit aligned. */
 	hdr->sgid.global.subnet_prefix = ibp->rvp.gid_prefix;
-	hdr->sgid.global.interface_id = grh->sgid_index ?
-		ibp->guids[grh->sgid_index - 1] : ppd_from_ibp(ibp)->guid;
+	if (!grh->sgid_index)
+		hdr->sgid.global.interface_id = ppd_from_ibp(ibp)->guid;
+	else if (grh->sgid_index < QIB_GUIDS_PER_PORT)
+		hdr->sgid.global.interface_id = ibp->guids[grh->sgid_index - 1];
 	hdr->dgid = grh->dgid;
 
 	/* GRH header size in 32-bit words. */
diff --git a/drivers/infiniband/hw/qib/qib_sysfs.c b/drivers/infiniband/hw/qib/qib_sysfs.c
index fe4cf5e..ca2638d 100644
--- a/drivers/infiniband/hw/qib/qib_sysfs.c
+++ b/drivers/infiniband/hw/qib/qib_sysfs.c
@@ -247,7 +247,7 @@ static struct kobj_type qib_port_cc_ktype = {
 	.release = qib_port_release,
 };
 
-static struct bin_attribute cc_table_bin_attr = {
+static const struct bin_attribute cc_table_bin_attr = {
 	.attr = {.name = "cc_table_bin", .mode = 0444},
 	.read = read_cc_table_bin,
 	.size = PAGE_SIZE,
@@ -286,7 +286,7 @@ static ssize_t read_cc_setting_bin(struct file *filp, struct kobject *kobj,
 	return count;
 }
 
-static struct bin_attribute cc_setting_bin_attr = {
+static const struct bin_attribute cc_setting_bin_attr = {
 	.attr = {.name = "cc_settings_bin", .mode = 0444},
 	.read = read_cc_setting_bin,
 	.size = PAGE_SIZE,
diff --git a/drivers/infiniband/hw/qib/qib_ud.c b/drivers/infiniband/hw/qib/qib_ud.c
index 341a123..be49074 100644
--- a/drivers/infiniband/hw/qib/qib_ud.c
+++ b/drivers/infiniband/hw/qib/qib_ud.c
@@ -66,8 +66,7 @@ static void qib_ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 	qp = rvt_lookup_qpn(rdi, &ibp->rvp, swqe->ud_wr.remote_qpn);
 	if (!qp) {
 		ibp->rvp.n_pkt_drops++;
-		rcu_read_unlock();
-		return;
+		goto drop;
 	}
 
 	sqptype = sqp->ibqp.qp_type == IB_QPT_GSI ?
@@ -94,11 +93,11 @@ static void qib_ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 		if (unlikely(!qib_pkey_ok(pkey1, pkey2))) {
 			lid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
 					  ((1 << ppd->lmc) - 1));
-			qib_bad_pqkey(ibp, IB_NOTICE_TRAP_BAD_PKEY, pkey1,
-				      rdma_ah_get_sl(ah_attr),
-				      sqp->ibqp.qp_num, qp->ibqp.qp_num,
-				      cpu_to_be16(lid),
-				      cpu_to_be16(rdma_ah_get_dlid(ah_attr)));
+			qib_bad_pkey(ibp, pkey1,
+				     rdma_ah_get_sl(ah_attr),
+				     sqp->ibqp.qp_num, qp->ibqp.qp_num,
+				     cpu_to_be16(lid),
+				     cpu_to_be16(rdma_ah_get_dlid(ah_attr)));
 			goto drop;
 		}
 	}
@@ -113,18 +112,8 @@ static void qib_ud_loopback(struct rvt_qp *sqp, struct rvt_swqe *swqe)
 
 		qkey = (int)swqe->ud_wr.remote_qkey < 0 ?
 			sqp->qkey : swqe->ud_wr.remote_qkey;
-		if (unlikely(qkey != qp->qkey)) {
-			u16 lid;
-
-			lid = ppd->lid | (rdma_ah_get_path_bits(ah_attr) &
-					  ((1 << ppd->lmc) - 1));
-			qib_bad_pqkey(ibp, IB_NOTICE_TRAP_BAD_QKEY, qkey,
-				      rdma_ah_get_sl(ah_attr),
-				      sqp->ibqp.qp_num, qp->ibqp.qp_num,
-				      cpu_to_be16(lid),
-				      cpu_to_be16(rdma_ah_get_dlid(ah_attr)));
+		if (unlikely(qkey != qp->qkey))
 			goto drop;
-		}
 	}
 
 	/*
@@ -487,22 +476,18 @@ void qib_ud_rcv(struct qib_ibport *ibp, struct ib_header *hdr,
 			pkey1 = be32_to_cpu(ohdr->bth[0]);
 			pkey2 = qib_get_pkey(ibp, qp->s_pkey_index);
 			if (unlikely(!qib_pkey_ok(pkey1, pkey2))) {
-				qib_bad_pqkey(ibp, IB_NOTICE_TRAP_BAD_PKEY,
-					      pkey1,
-					      (be16_to_cpu(hdr->lrh[0]) >> 4) &
+				qib_bad_pkey(ibp,
+					     pkey1,
+					     (be16_to_cpu(hdr->lrh[0]) >> 4) &
 						0xF,
-					      src_qp, qp->ibqp.qp_num,
-					      hdr->lrh[3], hdr->lrh[1]);
+					     src_qp, qp->ibqp.qp_num,
+					     hdr->lrh[3], hdr->lrh[1]);
 				return;
 			}
 		}
-		if (unlikely(qkey != qp->qkey)) {
-			qib_bad_pqkey(ibp, IB_NOTICE_TRAP_BAD_QKEY, qkey,
-				      (be16_to_cpu(hdr->lrh[0]) >> 4) & 0xF,
-				      src_qp, qp->ibqp.qp_num,
-				      hdr->lrh[3], hdr->lrh[1]);
+		if (unlikely(qkey != qp->qkey))
 			return;
-		}
+
 		/* Drop invalid MAD packets (see 13.5.3.1). */
 		if (unlikely(qp->ibqp.qp_num == 1 &&
 			     (tlen != 256 ||
diff --git a/drivers/infiniband/hw/qib/qib_verbs.c b/drivers/infiniband/hw/qib/qib_verbs.c
index ac42dce..9d92aeb 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.c
+++ b/drivers/infiniband/hw/qib/qib_verbs.c
@@ -1341,6 +1341,15 @@ int qib_check_ah(struct ib_device *ibdev, struct rdma_ah_attr *ah_attr)
 	if (rdma_ah_get_sl(ah_attr) > 15)
 		return -EINVAL;
 
+	if (rdma_ah_get_dlid(ah_attr) == 0)
+		return -EINVAL;
+	if (rdma_ah_get_dlid(ah_attr) >=
+		be16_to_cpu(IB_MULTICAST_LID_BASE) &&
+	    rdma_ah_get_dlid(ah_attr) !=
+		be16_to_cpu(IB_LID_PERMISSIVE) &&
+	    !(rdma_ah_get_ah_flags(ah_attr) & IB_AH_GRH))
+		return -EINVAL;
+
 	return 0;
 }
 
diff --git a/drivers/infiniband/hw/qib/qib_verbs.h b/drivers/infiniband/hw/qib/qib_verbs.h
index a52fc67..f887737 100644
--- a/drivers/infiniband/hw/qib/qib_verbs.h
+++ b/drivers/infiniband/hw/qib/qib_verbs.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2012, 2013 Intel Corporation.  All rights reserved.
+ * Copyright (c) 2012 - 2017 Intel Corporation.  All rights reserved.
  * Copyright (c) 2006 - 2012 QLogic Corporation. All rights reserved.
  * Copyright (c) 2005, 2006 PathScale, Inc. All rights reserved.
  *
@@ -241,8 +241,8 @@ static inline int qib_pkey_ok(u16 pkey1, u16 pkey2)
 	return p1 && p1 == p2 && ((__s16)pkey1 < 0 || (__s16)pkey2 < 0);
 }
 
-void qib_bad_pqkey(struct qib_ibport *ibp, __be16 trap_num, u32 key, u32 sl,
-		   u32 qp1, u32 qp2, __be16 lid1, __be16 lid2);
+void qib_bad_pkey(struct qib_ibport *ibp, u32 key, u32 sl,
+		  u32 qp1, u32 qp2, __be16 lid1, __be16 lid2);
 void qib_cap_mask_chg(struct rvt_dev_info *rdi, u8 port_num);
 void qib_sys_guid_chg(struct qib_ibport *ibp);
 void qib_node_desc_chg(struct qib_ibport *ibp);
@@ -282,13 +282,7 @@ int qib_alloc_qpn(struct rvt_dev_info *rdi, struct rvt_qpn_table *qpt,
 void qib_restart_rc(struct rvt_qp *qp, u32 psn, int wait);
 #ifdef CONFIG_DEBUG_FS
 
-struct qib_qp_iter;
-
-struct qib_qp_iter *qib_qp_iter_init(struct qib_ibdev *dev);
-
-int qib_qp_iter_next(struct qib_qp_iter *iter);
-
-void qib_qp_iter_print(struct seq_file *s, struct qib_qp_iter *iter);
+void qib_qp_iter_print(struct seq_file *s, struct rvt_qp_iter *iter);
 
 #endif
 
diff --git a/drivers/infiniband/hw/usnic/usnic_fwd.c b/drivers/infiniband/hw/usnic/usnic_fwd.c
index 3c37dd5..995a26b 100644
--- a/drivers/infiniband/hw/usnic/usnic_fwd.c
+++ b/drivers/infiniband/hw/usnic/usnic_fwd.c
@@ -110,20 +110,12 @@ void usnic_fwd_set_mac(struct usnic_fwd_dev *ufdev, char mac[ETH_ALEN])
 	spin_unlock(&ufdev->lock);
 }
 
-int usnic_fwd_add_ipaddr(struct usnic_fwd_dev *ufdev, __be32 inaddr)
+void usnic_fwd_add_ipaddr(struct usnic_fwd_dev *ufdev, __be32 inaddr)
 {
-	int status;
-
 	spin_lock(&ufdev->lock);
-	if (ufdev->inaddr == 0) {
+	if (!ufdev->inaddr)
 		ufdev->inaddr = inaddr;
-		status = 0;
-	} else {
-		status = -EFAULT;
-	}
 	spin_unlock(&ufdev->lock);
-
-	return status;
 }
 
 void usnic_fwd_del_ipaddr(struct usnic_fwd_dev *ufdev)
diff --git a/drivers/infiniband/hw/usnic/usnic_fwd.h b/drivers/infiniband/hw/usnic/usnic_fwd.h
index b2ac22b..0b2cc4e 100644
--- a/drivers/infiniband/hw/usnic/usnic_fwd.h
+++ b/drivers/infiniband/hw/usnic/usnic_fwd.h
@@ -75,7 +75,7 @@ struct usnic_fwd_dev *usnic_fwd_dev_alloc(struct pci_dev *pdev);
 void usnic_fwd_dev_free(struct usnic_fwd_dev *ufdev);
 
 void usnic_fwd_set_mac(struct usnic_fwd_dev *ufdev, char mac[ETH_ALEN]);
-int usnic_fwd_add_ipaddr(struct usnic_fwd_dev *ufdev, __be32 inaddr);
+void usnic_fwd_add_ipaddr(struct usnic_fwd_dev *ufdev, __be32 inaddr);
 void usnic_fwd_del_ipaddr(struct usnic_fwd_dev *ufdev);
 void usnic_fwd_carrier_up(struct usnic_fwd_dev *ufdev);
 void usnic_fwd_carrier_down(struct usnic_fwd_dev *ufdev);
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_main.c b/drivers/infiniband/hw/usnic/usnic_ib_main.c
index c0c1e8b..f45e99a 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_main.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_main.c
@@ -333,9 +333,7 @@ static int usnic_port_immutable(struct ib_device *ibdev, u8 port_num,
 	return 0;
 }
 
-static void usnic_get_dev_fw_str(struct ib_device *device,
-				 char *str,
-				 size_t str_len)
+static void usnic_get_dev_fw_str(struct ib_device *device, char *str)
 {
 	struct usnic_ib_dev *us_ibdev =
 		container_of(device, struct usnic_ib_dev, ib_dev);
@@ -345,7 +343,7 @@ static void usnic_get_dev_fw_str(struct ib_device *device,
 	us_ibdev->netdev->ethtool_ops->get_drvinfo(us_ibdev->netdev, &info);
 	mutex_unlock(&us_ibdev->usdev_lock);
 
-	snprintf(str, str_len, "%s", info.fw_version);
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%s", info.fw_version);
 }
 
 /* Start of PF discovery section */
@@ -353,7 +351,7 @@ static void *usnic_ib_device_add(struct pci_dev *dev)
 {
 	struct usnic_ib_dev *us_ibdev;
 	union ib_gid gid;
-	struct in_ifaddr *in;
+	struct in_device *ind;
 	struct net_device *netdev;
 
 	usnic_dbg("\n");
@@ -409,6 +407,7 @@ static void *usnic_ib_device_add(struct pci_dev *dev)
 	us_ibdev->ib_dev.query_port = usnic_ib_query_port;
 	us_ibdev->ib_dev.query_pkey = usnic_ib_query_pkey;
 	us_ibdev->ib_dev.query_gid = usnic_ib_query_gid;
+	us_ibdev->ib_dev.get_netdev = usnic_get_netdev;
 	us_ibdev->ib_dev.get_link_layer = usnic_ib_port_link_layer;
 	us_ibdev->ib_dev.alloc_pd = usnic_ib_alloc_pd;
 	us_ibdev->ib_dev.dealloc_pd = usnic_ib_dealloc_pd;
@@ -442,9 +441,11 @@ static void *usnic_ib_device_add(struct pci_dev *dev)
 	if (netif_carrier_ok(us_ibdev->netdev))
 		usnic_fwd_carrier_up(us_ibdev->ufdev);
 
-	in = ((struct in_device *)(netdev->ip_ptr))->ifa_list;
-	if (in != NULL)
-		usnic_fwd_add_ipaddr(us_ibdev->ufdev, in->ifa_address);
+	ind = in_dev_get(netdev);
+	if (ind->ifa_list)
+		usnic_fwd_add_ipaddr(us_ibdev->ufdev,
+				     ind->ifa_list->ifa_address);
+	in_dev_put(ind);
 
 	usnic_mac_ip_to_gid(us_ibdev->netdev->perm_addr,
 				us_ibdev->ufdev->inaddr, &gid.raw[0]);
@@ -720,7 +721,6 @@ static void __exit usnic_ib_destroy(void)
 MODULE_DESCRIPTION("Cisco VIC (usNIC) Verbs Driver");
 MODULE_AUTHOR("Upinder Malhi <umalhi@cisco.com>");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 module_param(usnic_log_lvl, uint, S_IRUGO | S_IWUSR);
 module_param(usnic_ib_share_vf, uint, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(usnic_log_lvl, " Off=0, Err=1, Info=2, Debug=3");
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 4996984..e4113ef 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -164,6 +164,8 @@ find_free_vf_and_create_qp_grp(struct usnic_ib_dev *us_ibdev,
 	if (usnic_ib_share_vf) {
 		/* Try to find resouces on a used vf which is in pd */
 		dev_list = usnic_uiom_get_dev_list(pd->umem_pd);
+		if (IS_ERR(dev_list))
+			return ERR_CAST(dev_list);
 		for (i = 0; dev_list[i]; i++) {
 			dev = dev_list[i];
 			vf = pci_get_drvdata(to_pci_dev(dev));
@@ -226,27 +228,6 @@ static void qp_grp_destroy(struct usnic_ib_qp_grp *qp_grp)
 	spin_unlock(&vf->lock);
 }
 
-static void eth_speed_to_ib_speed(int speed, u8 *active_speed,
-					u8 *active_width)
-{
-	if (speed <= 10000) {
-		*active_width = IB_WIDTH_1X;
-		*active_speed = IB_SPEED_FDR10;
-	} else if (speed <= 20000) {
-		*active_width = IB_WIDTH_4X;
-		*active_speed = IB_SPEED_DDR;
-	} else if (speed <= 30000) {
-		*active_width = IB_WIDTH_4X;
-		*active_speed = IB_SPEED_QDR;
-	} else if (speed <= 40000) {
-		*active_width = IB_WIDTH_4X;
-		*active_speed = IB_SPEED_FDR10;
-	} else {
-		*active_width = IB_WIDTH_4X;
-		*active_speed = IB_SPEED_EDR;
-	}
-}
-
 static int create_qp_validate_user_data(struct usnic_ib_create_qp_cmd cmd)
 {
 	if (cmd.spec.trans_type <= USNIC_TRANSPORT_UNKNOWN ||
@@ -326,12 +307,16 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
 				struct ib_port_attr *props)
 {
 	struct usnic_ib_dev *us_ibdev = to_usdev(ibdev);
-	struct ethtool_link_ksettings cmd;
 
 	usnic_dbg("\n");
 
 	mutex_lock(&us_ibdev->usdev_lock);
-	__ethtool_get_link_ksettings(us_ibdev->netdev, &cmd);
+	if (ib_get_eth_speed(ibdev, port, &props->active_speed,
+			     &props->active_width)) {
+		mutex_unlock(&us_ibdev->usdev_lock);
+		return -EINVAL;
+	}
+
 	/* props being zeroed by the caller, avoid zeroing it here */
 
 	props->lid = 0;
@@ -355,8 +340,6 @@ int usnic_ib_query_port(struct ib_device *ibdev, u8 port,
 	props->pkey_tbl_len = 1;
 	props->bad_pkey_cntr = 0;
 	props->qkey_viol_cntr = 0;
-	eth_speed_to_ib_speed(cmd.base.speed, &props->active_speed,
-			      &props->active_width);
 	props->max_mtu = IB_MTU_4096;
 	props->active_mtu = iboe_get_mtu(us_ibdev->ufdev->mtu);
 	/* Userspace will adjust for hdrs */
@@ -424,6 +407,16 @@ int usnic_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
 	return 0;
 }
 
+struct net_device *usnic_get_netdev(struct ib_device *device, u8 port_num)
+{
+	struct usnic_ib_dev *us_ibdev = to_usdev(device);
+
+	if (us_ibdev->netdev)
+		dev_hold(us_ibdev->netdev);
+
+	return us_ibdev->netdev;
+}
+
 int usnic_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
 				u16 *pkey)
 {
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
index 172e43b..1fda944 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
@@ -48,6 +48,7 @@ int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
 				struct ib_qp_init_attr *qp_init_attr);
 int usnic_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
 				union ib_gid *gid);
+struct net_device *usnic_get_netdev(struct ib_device *device, u8 port_num);
 int usnic_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
 				u16 *pkey);
 struct ib_pd *usnic_ib_alloc_pd(struct ib_device *ibdev,
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma.h
index 8e2f0a1..663a0c3 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma.h
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma.h
@@ -194,6 +194,7 @@ struct pvrdma_dev {
 	void *resp_slot;
 	unsigned long flags;
 	struct list_head device_link;
+	unsigned int dsr_version;
 
 	/* Locking and interrupt information. */
 	spinlock_t cmd_lock; /* Command lock. */
@@ -444,6 +445,7 @@ void pvrdma_ah_attr_to_rdma(struct rdma_ah_attr *dst,
 			    const struct pvrdma_ah_attr *src);
 void rdma_ah_attr_to_pvrdma(struct pvrdma_ah_attr *dst,
 			    const struct rdma_ah_attr *src);
+u8 ib_gid_type_to_pvrdma(enum ib_gid_type gid_type);
 
 int pvrdma_uar_table_init(struct pvrdma_dev *dev);
 void pvrdma_uar_table_cleanup(struct pvrdma_dev *dev);
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
index 90aa326..3562c0c 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
@@ -299,7 +299,7 @@ static inline struct pvrdma_cqe *get_cqe(struct pvrdma_cq *cq, int i)
 
 void _pvrdma_flush_cqe(struct pvrdma_qp *qp, struct pvrdma_cq *cq)
 {
-	int head;
+	unsigned int head;
 	int has_data;
 
 	if (!cq->is_kernel)
@@ -389,6 +389,7 @@ static int pvrdma_poll_one(struct pvrdma_cq *cq, struct pvrdma_qp **cur_qp,
 	wc->dlid_path_bits = cqe->dlid_path_bits;
 	wc->port_num = cqe->port_num;
 	wc->vendor_err = cqe->vendor_err;
+	wc->network_hdr_type = cqe->network_hdr_type;
 
 	/* Update shared ring state */
 	pvrdma_idx_ring_inc(&cq->ring_state->rx.cons_head, cq->ibcq.cqe);
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h
index 09078cc..df0a6b5 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_dev_api.h
@@ -50,7 +50,15 @@
 
 #include "pvrdma_verbs.h"
 
-#define PVRDMA_VERSION			17
+/*
+ * PVRDMA version macros. Some new features require updates to PVRDMA_VERSION.
+ * These macros allow us to check for different features if necessary.
+ */
+
+#define PVRDMA_ROCEV1_VERSION		17
+#define PVRDMA_ROCEV2_VERSION		18
+#define PVRDMA_VERSION			PVRDMA_ROCEV2_VERSION
+
 #define PVRDMA_BOARD_ID			1
 #define PVRDMA_REV_ID			1
 
@@ -123,6 +131,31 @@
 #define PVRDMA_GID_TYPE_FLAG_ROCE_V1	BIT(0)
 #define PVRDMA_GID_TYPE_FLAG_ROCE_V2	BIT(1)
 
+/*
+ * Version checks. This checks whether each version supports specific
+ * capabilities from the device.
+ */
+
+#define PVRDMA_IS_VERSION17(_dev)					\
+	(_dev->dsr_version == PVRDMA_ROCEV1_VERSION &&			\
+	 _dev->dsr->caps.gid_types == PVRDMA_GID_TYPE_FLAG_ROCE_V1)
+
+#define PVRDMA_IS_VERSION18(_dev)					\
+	(_dev->dsr_version >= PVRDMA_ROCEV2_VERSION &&			\
+	 (_dev->dsr->caps.gid_types == PVRDMA_GID_TYPE_FLAG_ROCE_V1 ||  \
+	  _dev->dsr->caps.gid_types == PVRDMA_GID_TYPE_FLAG_ROCE_V2))	\
+
+#define PVRDMA_SUPPORTED(_dev)						\
+	((_dev->dsr->caps.mode == PVRDMA_DEVICE_MODE_ROCE) &&		\
+	 (PVRDMA_IS_VERSION17(_dev) || PVRDMA_IS_VERSION18(_dev)))
+
+/*
+ * Get capability values based on device version.
+ */
+
+#define PVRDMA_GET_CAP(_dev, _old_val, _val) \
+	((PVRDMA_IS_VERSION18(_dev)) ? _val : _old_val)
+
 enum pvrdma_pci_resource {
 	PVRDMA_PCI_RESOURCE_MSIX,	/* BAR0: MSI-X, MMIO. */
 	PVRDMA_PCI_RESOURCE_REG,	/* BAR1: Registers, MMIO. */
@@ -225,7 +258,7 @@ struct pvrdma_device_caps {
 	u8  atomic_ops;				/* PVRDMA_ATOMIC_OP_* bits */
 	u8  bmme_flags;				/* FRWR Mem Mgmt Extensions */
 	u8  gid_types;				/* PVRDMA_GID_TYPE_FLAG_ */
-	u8  reserved[4];
+	u32 max_fast_reg_page_list_len;
 };
 
 struct pvrdma_ring_page_info {
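Note on the version-gated capability lookup added to pvrdma_dev_api.h above: the following is a minimal userspace sketch, not driver code, of how PVRDMA_GET_CAP() falls back to an older value when the device does not report the RoCE v2 interface version. The struct and function names (caps_model, dev_model, is_version18, get_cap, query_max_sge_rd) are hypothetical stand-ins; only the comparison logic mirrors the macros in the hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the hunk above; the enum names are illustrative. */
enum { ROCEV1_VERSION = 17, ROCEV2_VERSION = 18 };
enum { GID_FLAG_ROCE_V1 = 1u << 0, GID_FLAG_ROCE_V2 = 1u << 1 };

/* Hypothetical model of the few capability fields used here. */
struct caps_model {
	uint8_t gid_types;
	uint32_t max_sge;
	uint32_t max_sge_rd;
};

/* Hypothetical model of the device: version register plus caps. */
struct dev_model {
	unsigned int dsr_version;
	struct caps_model caps;
};

/* Mirrors PVRDMA_IS_VERSION18(): new enough and exactly one GID type. */
static int is_version18(const struct dev_model *dev)
{
	return dev->dsr_version >= ROCEV2_VERSION &&
	       (dev->caps.gid_types == GID_FLAG_ROCE_V1 ||
		dev->caps.gid_types == GID_FLAG_ROCE_V2);
}

/* Mirrors PVRDMA_GET_CAP(): pick the new field only on a v18 device. */
static uint32_t get_cap(const struct dev_model *dev,
			uint32_t old_val, uint32_t val)
{
	return is_version18(dev) ? val : old_val;
}

/* Same shape as the max_sge_rd assignment in pvrdma_query_device(). */
static uint32_t query_max_sge_rd(const struct dev_model *dev)
{
	return get_cap(dev, dev->caps.max_sge, dev->caps.max_sge_rd);
}
```

With this model, a version 17 device reports max_sge as its read SGE limit, while a version 18 device reports the dedicated max_sge_rd field.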
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
index 34ebc76..6ce709a 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_main.c
@@ -102,12 +102,11 @@ static struct device_attribute *pvrdma_class_attributes[] = {
 	&dev_attr_board_id
 };
 
-static void pvrdma_get_fw_ver_str(struct ib_device *device, char *str,
-				  size_t str_len)
+static void pvrdma_get_fw_ver_str(struct ib_device *device, char *str)
 {
 	struct pvrdma_dev *dev =
 		container_of(device, struct pvrdma_dev, ib_dev);
-	snprintf(str, str_len, "%d.%d.%d\n",
+	snprintf(str, IB_FW_VERSION_NAME_MAX, "%d.%d.%d\n",
 		 (int) (dev->dsr->caps.fw_ver >> 32),
 		 (int) (dev->dsr->caps.fw_ver >> 16) & 0xffff,
 		 (int) dev->dsr->caps.fw_ver & 0xffff);
@@ -129,10 +128,14 @@ static int pvrdma_init_device(struct pvrdma_dev *dev)
 static int pvrdma_port_immutable(struct ib_device *ibdev, u8 port_num,
 				 struct ib_port_immutable *immutable)
 {
+	struct pvrdma_dev *dev = to_vdev(ibdev);
 	struct ib_port_attr attr;
 	int err;
 
-	immutable->core_cap_flags = RDMA_CORE_PORT_IBA_ROCE;
+	if (dev->dsr->caps.gid_types == PVRDMA_GID_TYPE_FLAG_ROCE_V1)
+		immutable->core_cap_flags |= RDMA_CORE_PORT_IBA_ROCE;
+	else if (dev->dsr->caps.gid_types == PVRDMA_GID_TYPE_FLAG_ROCE_V2)
+		immutable->core_cap_flags |= RDMA_CORE_PORT_IBA_ROCE_UDP_ENCAP;
 
 	err = ib_query_port(ibdev, port_num, &attr);
 	if (err)
@@ -570,6 +573,7 @@ static void pvrdma_free_slots(struct pvrdma_dev *dev)
 
 static int pvrdma_add_gid_at_index(struct pvrdma_dev *dev,
 				   const union ib_gid *gid,
+				   u8 gid_type,
 				   int index)
 {
 	int ret;
@@ -587,7 +591,7 @@ static int pvrdma_add_gid_at_index(struct pvrdma_dev *dev,
 	cmd_bind->mtu = ib_mtu_enum_to_int(IB_MTU_1024);
 	cmd_bind->vlan = 0xfff;
 	cmd_bind->index = index;
-	cmd_bind->gid_type = PVRDMA_GID_TYPE_FLAG_ROCE_V1;
+	cmd_bind->gid_type = gid_type;
 
 	ret = pvrdma_cmd_post(dev, &req, NULL, 0);
 	if (ret < 0) {
@@ -608,7 +612,9 @@ static int pvrdma_add_gid(struct ib_device *ibdev,
 {
 	struct pvrdma_dev *dev = to_vdev(ibdev);
 
-	return pvrdma_add_gid_at_index(dev, gid, index);
+	return pvrdma_add_gid_at_index(dev, gid,
+				       ib_gid_type_to_pvrdma(attr->gid_type),
+				       index);
 }
 
 static int pvrdma_del_gid_at_index(struct pvrdma_dev *dev, int index)
@@ -723,7 +729,6 @@ static int pvrdma_pci_probe(struct pci_dev *pdev,
 	int ret;
 	unsigned long start;
 	unsigned long len;
-	unsigned int version;
 	dma_addr_t slot_dma = 0;
 
 	dev_dbg(&pdev->dev, "initializing driver %s\n", pci_name(pdev));
@@ -820,13 +825,9 @@ static int pvrdma_pci_probe(struct pci_dev *pdev,
 		goto err_unmap_regs;
 	}
 
-	version = pvrdma_read_reg(dev, PVRDMA_REG_VERSION);
+	dev->dsr_version = pvrdma_read_reg(dev, PVRDMA_REG_VERSION);
 	dev_info(&pdev->dev, "device version %d, driver version %d\n",
-		 version, PVRDMA_VERSION);
-	if (version < PVRDMA_VERSION) {
-		dev_err(&pdev->dev, "incompatible device version\n");
-		goto err_uar_unmap;
-	}
+		 dev->dsr_version, PVRDMA_VERSION);
 
 	dev->dsr = dma_alloc_coherent(&pdev->dev, sizeof(*dev->dsr),
 				      &dev->dsrbase, GFP_KERNEL);
@@ -897,17 +898,9 @@ static int pvrdma_pci_probe(struct pci_dev *pdev,
 	/* Make sure the write is complete before reading status. */
 	mb();
 
-	/* Currently, the driver only supports RoCE mode. */
-	if (dev->dsr->caps.mode != PVRDMA_DEVICE_MODE_ROCE) {
-		dev_err(&pdev->dev, "unsupported transport %d\n",
-			dev->dsr->caps.mode);
-		ret = -EFAULT;
-		goto err_free_cq_ring;
-	}
-
-	/* Currently, the driver only supports RoCE V1. */
-	if (!(dev->dsr->caps.gid_types & PVRDMA_GID_TYPE_FLAG_ROCE_V1)) {
-		dev_err(&pdev->dev, "driver needs RoCE v1 support\n");
+	/* The driver supports RoCE V1 and V2. */
+	if (!PVRDMA_SUPPORTED(dev)) {
+		dev_err(&pdev->dev, "driver needs RoCE v1 or v2 support\n");
 		ret = -EFAULT;
 		goto err_free_cq_ring;
 	}
@@ -1078,7 +1071,7 @@ static void pvrdma_pci_remove(struct pci_dev *pdev)
 	pci_set_drvdata(pdev, NULL);
 }
 
-static struct pci_device_id pvrdma_pci_table[] = {
+static const struct pci_device_id pvrdma_pci_table[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_VMWARE, PCI_DEVICE_ID_VMWARE_PVRDMA), },
 	{ 0 },
 };
@@ -1119,5 +1112,4 @@ module_exit(pvrdma_cleanup);
 
 MODULE_AUTHOR("VMware, Inc");
 MODULE_DESCRIPTION("VMware Paravirtual RDMA driver");
-MODULE_VERSION(DRV_VERSION);
 MODULE_LICENSE("Dual BSD/GPL");
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_misc.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_misc.c
index ec6a4ca..fb0c5c0 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_misc.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_misc.c
@@ -303,3 +303,10 @@ void rdma_ah_attr_to_pvrdma(struct pvrdma_ah_attr *dst,
 	dst->port_num = rdma_ah_get_port_num(src);
 	memcpy(&dst->dmac, src->roce.dmac, sizeof(dst->dmac));
 }
+
+u8 ib_gid_type_to_pvrdma(enum ib_gid_type gid_type)
+{
+	return (gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP) ?
+		PVRDMA_GID_TYPE_FLAG_ROCE_V2 :
+		PVRDMA_GID_TYPE_FLAG_ROCE_V1;
+}
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_ring.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_ring.h
index ed9022a..8b558ae 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_ring.h
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_ring.h
@@ -111,21 +111,4 @@ static inline __s32 pvrdma_idx_ring_has_data(const struct pvrdma_ring *r,
 	return PVRDMA_INVALID_IDX;
 }
 
-static inline bool pvrdma_idx_ring_is_valid_idx(const struct pvrdma_ring *r,
-						__u32 max_elems, __u32 *idx)
-{
-	const __u32 tail = atomic_read(&r->prod_tail);
-	const __u32 head = atomic_read(&r->cons_head);
-
-	if (pvrdma_idx_valid(tail, max_elems) &&
-	    pvrdma_idx_valid(head, max_elems) &&
-	    pvrdma_idx_valid(*idx, max_elems)) {
-		if (tail > head && (*idx < tail && *idx >= head))
-			return true;
-		else if (head > tail && (*idx >= head || *idx < tail))
-			return true;
-	}
-	return false;
-}
-
 #endif /* __PVRDMA_RING_H__ */
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
index 2851704..48776f5 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
@@ -83,6 +83,8 @@ int pvrdma_query_device(struct ib_device *ibdev,
 	props->max_qp_wr = dev->dsr->caps.max_qp_wr;
 	props->device_cap_flags = dev->dsr->caps.device_cap_flags;
 	props->max_sge = dev->dsr->caps.max_sge;
+	props->max_sge_rd = PVRDMA_GET_CAP(dev, dev->dsr->caps.max_sge,
+					   dev->dsr->caps.max_sge_rd);
 	props->max_cq = dev->dsr->caps.max_cq;
 	props->max_cqe = dev->dsr->caps.max_cqe;
 	props->max_mr = dev->dsr->caps.max_mr;
@@ -101,8 +103,14 @@ int pvrdma_query_device(struct ib_device *ibdev,
 	    (dev->dsr->caps.bmme_flags & PVRDMA_BMME_FLAG_REMOTE_INV) &&
 	    (dev->dsr->caps.bmme_flags & PVRDMA_BMME_FLAG_FAST_REG_WR)) {
 		props->device_cap_flags |= IB_DEVICE_MEM_MGT_EXTENSIONS;
+		props->max_fast_reg_page_list_len = PVRDMA_GET_CAP(dev,
+				PVRDMA_MAX_FAST_REG_PAGES,
+				dev->dsr->caps.max_fast_reg_page_list_len);
 	}
 
+	props->device_cap_flags |= IB_DEVICE_PORT_ACTIVE_EVENT |
+				   IB_DEVICE_RC_RNR_NAK_GEN;
+
 	return 0;
 }
 
@@ -143,6 +151,7 @@ int pvrdma_query_port(struct ib_device *ibdev, u8 port,
 	props->gid_tbl_len = resp->attrs.gid_tbl_len;
 	props->port_cap_flags =
 		pvrdma_port_cap_flags_to_ib(resp->attrs.port_cap_flags);
+	props->port_cap_flags |= IB_PORT_CM_SUP | IB_PORT_IP_BASED_GIDS;
 	props->max_msg_sz = resp->attrs.max_msg_sz;
 	props->bad_pkey_cntr = resp->attrs.bad_pkey_cntr;
 	props->qkey_viol_cntr = resp->attrs.qkey_viol_cntr;
diff --git a/drivers/infiniband/sw/rdmavt/ah.c b/drivers/infiniband/sw/rdmavt/ah.c
index a96d4aa..ba3639a 100644
--- a/drivers/infiniband/sw/rdmavt/ah.c
+++ b/drivers/infiniband/sw/rdmavt/ah.c
@@ -66,8 +66,6 @@ int rvt_check_ah(struct ib_device *ibdev,
 	int port_num = rdma_ah_get_port_num(ah_attr);
 	struct ib_port_attr port_attr;
 	struct rvt_dev_info *rdi = ib_to_rvt(ibdev);
-	enum rdma_link_layer link = rdma_port_get_link_layer(ibdev, port_num);
-	u32 dlid = rdma_ah_get_dlid(ah_attr);
 	u8 ah_flags = rdma_ah_get_ah_flags(ah_attr);
 	u8 static_rate = rdma_ah_get_static_rate(ah_attr);
 
@@ -83,14 +81,6 @@ int rvt_check_ah(struct ib_device *ibdev,
 	if ((ah_flags & IB_AH_GRH) &&
 	    rdma_ah_read_grh(ah_attr)->sgid_index >= port_attr.gid_tbl_len)
 		return -EINVAL;
-	if (link != IB_LINK_LAYER_ETHERNET) {
-		if (dlid == 0)
-			return -EINVAL;
-		if (dlid >= be16_to_cpu(IB_MULTICAST_LID_BASE) &&
-		    dlid != be16_to_cpu(IB_LID_PERMISSIVE) &&
-		    !(ah_flags & IB_AH_GRH))
-			return -EINVAL;
-	}
 	if (rdi->driver_f.check_ah)
 		return rdi->driver_f.check_ah(ibdev, ah_attr);
 	return 0;
diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 0ae2ff8..97d71e4 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -107,7 +107,7 @@ void rvt_cq_enter(struct rvt_cq *cq, struct ib_wc *entry, bool solicited)
 		wc->uqueue[head].src_qp = entry->src_qp;
 		wc->uqueue[head].wc_flags = entry->wc_flags;
 		wc->uqueue[head].pkey_index = entry->pkey_index;
-		wc->uqueue[head].slid = entry->slid;
+		wc->uqueue[head].slid = ib_lid_cpu16(entry->slid);
 		wc->uqueue[head].sl = entry->sl;
 		wc->uqueue[head].dlid_path_bits = entry->dlid_path_bits;
 		wc->uqueue[head].port_num = entry->port_num;
diff --git a/drivers/infiniband/sw/rdmavt/mr.c b/drivers/infiniband/sw/rdmavt/mr.c
index aa5f9ea3..4271351 100644
--- a/drivers/infiniband/sw/rdmavt/mr.c
+++ b/drivers/infiniband/sw/rdmavt/mr.c
@@ -441,6 +441,105 @@ struct ib_mr *rvt_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 }
 
 /**
+ * rvt_dereg_clean_qp_cb - callback from iterator
+ * @qp - the qp
+ * @v - the mregion (as u64)
+ *
+ * This routine fields the callback for all QPs and
+ * for QPs in the same PD as the MR will call the
+ * rvt_qp_mr_clean() to potentially cleanup references.
+ */
+static void rvt_dereg_clean_qp_cb(struct rvt_qp *qp, u64 v)
+{
+	struct rvt_mregion *mr = (struct rvt_mregion *)v;
+
+	/* skip PDs that are not ours */
+	if (mr->pd != qp->ibqp.pd)
+		return;
+	rvt_qp_mr_clean(qp, mr->lkey);
+}
+
+/**
+ * rvt_dereg_clean_qps - find QPs for reference cleanup
+ * @mr - the MR that is being deregistered
+ *
+ * This routine iterates RC QPs looking for references
+ * to the lkey noted in mr.
+ */
+static void rvt_dereg_clean_qps(struct rvt_mregion *mr)
+{
+	struct rvt_dev_info *rdi = ib_to_rvt(mr->pd->device);
+
+	rvt_qp_iter(rdi, (u64)mr, rvt_dereg_clean_qp_cb);
+}
+
+/**
+ * rvt_check_refs - check references
+ * @mr - the mregion
+ * @t - the caller identification
+ *
+ * This routine checks whether an MR still holds references
+ * while it is being de-registered.
+ *
+ * If the count is non-zero, the code calls a clean routine then
+ * waits up to the timeout for the count to reach zero.
+ */
+static int rvt_check_refs(struct rvt_mregion *mr, const char *t)
+{
+	unsigned long timeout;
+	struct rvt_dev_info *rdi = ib_to_rvt(mr->pd->device);
+
+	if (percpu_ref_is_zero(&mr->refcount))
+		return 0;
+	/* avoid dma mr */
+	if (mr->lkey)
+		rvt_dereg_clean_qps(mr);
+	timeout = wait_for_completion_timeout(&mr->comp, 5 * HZ);
+	if (!timeout) {
+		rvt_pr_err(rdi,
+			   "%s timeout mr %p pd %p lkey %x refcount %ld\n",
+			   t, mr, mr->pd, mr->lkey,
+			   atomic_long_read(&mr->refcount.count));
+		rvt_get_mr(mr);
+		return -EBUSY;
+	}
+	return 0;
+}
+
+/**
+ * rvt_mr_has_lkey - test whether the MR uses lkey
+ * @mr - the mregion
+ * @lkey - the lkey
+ */
+bool rvt_mr_has_lkey(struct rvt_mregion *mr, u32 lkey)
+{
+	return mr && lkey == mr->lkey;
+}
+
+/**
+ * rvt_ss_has_lkey - test whether the sge state references lkey
+ * @ss - the sge state
+ * @lkey - the lkey
+ *
+ * This code tests for an MR in the indicated
+ * sge state.
+ */
+bool rvt_ss_has_lkey(struct rvt_sge_state *ss, u32 lkey)
+{
+	int i;
+	bool rval = false;
+
+	if (!ss->num_sge)
+		return rval;
+	/* first one */
+	rval = rvt_mr_has_lkey(ss->sge.mr, lkey);
+	/* any others */
+	for (i = 0; !rval && i < ss->num_sge - 1; i++)
+		rval = rvt_mr_has_lkey(ss->sg_list[i].mr, lkey);
+	return rval;
+}
+
+/**
  * rvt_dereg_mr - unregister and free a memory region
  * @ibmr: the memory region to free
  *
@@ -453,22 +552,14 @@ struct ib_mr *rvt_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 int rvt_dereg_mr(struct ib_mr *ibmr)
 {
 	struct rvt_mr *mr = to_imr(ibmr);
-	struct rvt_dev_info *rdi = ib_to_rvt(ibmr->pd->device);
-	int ret = 0;
-	unsigned long timeout;
+	int ret;
 
 	rvt_free_lkey(&mr->mr);
 
 	rvt_put_mr(&mr->mr); /* will set completion if last */
-	timeout = wait_for_completion_timeout(&mr->mr.comp, 5 * HZ);
-	if (!timeout) {
-		rvt_pr_err(rdi,
-			   "rvt_dereg_mr timeout mr %p pd %p\n",
-			   mr, mr->mr.pd);
-		rvt_get_mr(&mr->mr);
-		ret = -EBUSY;
+	ret = rvt_check_refs(&mr->mr, __func__);
+	if (ret)
 		goto out;
-	}
 	rvt_deinit_mregion(&mr->mr);
 	if (mr->umem)
 		ib_umem_release(mr->umem);
@@ -761,16 +852,12 @@ int rvt_dealloc_fmr(struct ib_fmr *ibfmr)
 {
 	struct rvt_fmr *fmr = to_ifmr(ibfmr);
 	int ret = 0;
-	unsigned long timeout;
 
 	rvt_free_lkey(&fmr->mr);
 	rvt_put_mr(&fmr->mr); /* will set completion if last */
-	timeout = wait_for_completion_timeout(&fmr->mr.comp, 5 * HZ);
-	if (!timeout) {
-		rvt_get_mr(&fmr->mr);
-		ret = -EBUSY;
+	ret = rvt_check_refs(&fmr->mr, __func__);
+	if (ret)
 		goto out;
-	}
 	rvt_deinit_mregion(&fmr->mr);
 	kfree(fmr);
 out:
@@ -778,23 +865,52 @@ int rvt_dealloc_fmr(struct ib_fmr *ibfmr)
 }
 
 /**
+ * rvt_sge_adjacent - is isge compressible
+ * @last_sge: last outgoing SGE written
+ * @sge: SGE to check
+ *
+ * If adjacent, updates last_sge to add the length.
+ *
+ * Return: true if isge is adjacent to last sge
+ */
+static inline bool rvt_sge_adjacent(struct rvt_sge *last_sge,
+				    struct ib_sge *sge)
+{
+	if (last_sge && sge->lkey == last_sge->mr->lkey &&
+	    ((uint64_t)(last_sge->vaddr + last_sge->length) == sge->addr)) {
+		if (sge->lkey) {
+			if (unlikely((sge->addr - last_sge->mr->user_base +
+			      sge->length > last_sge->mr->length)))
+				return false; /* overrun, caller will catch */
+		} else {
+			last_sge->length += sge->length;
+		}
+		last_sge->sge_length += sge->length;
+		trace_rvt_sge_adjacent(last_sge, sge);
+		return true;
+	}
+	return false;
+}
+
+/**
  * rvt_lkey_ok - check IB SGE for validity and initialize
  * @rkt: table containing lkey to check SGE against
  * @pd: protection domain
  * @isge: outgoing internal SGE
+ * @last_sge: last outgoing SGE written
  * @sge: SGE to check
  * @acc: access flags
  *
  * Check the IB SGE for validity and initialize our internal version
  * of it.
  *
- * Return: 1 if valid and successful, otherwise returns 0.
+ * Increments the reference count when a new sge is stored.
  *
- * increments the reference count upon success
- *
+ * Return: 0 if compressed, 1 if added, otherwise returns -errno.
  */
 int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
-		struct rvt_sge *isge, struct ib_sge *sge, int acc)
+		struct rvt_sge *isge, struct rvt_sge *last_sge,
+		struct ib_sge *sge, int acc)
 {
 	struct rvt_mregion *mr;
 	unsigned n, m;
@@ -804,12 +920,14 @@ int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
 	 * We use LKEY == zero for kernel virtual addresses
 	 * (see rvt_get_dma_mr() and dma_virt_ops).
 	 */
-	rcu_read_lock();
 	if (sge->lkey == 0) {
 		struct rvt_dev_info *dev = ib_to_rvt(pd->ibpd.device);
 
 		if (pd->user)
-			goto bail;
+			return -EINVAL;
+		if (rvt_sge_adjacent(last_sge, sge))
+			return 0;
+		rcu_read_lock();
 		mr = rcu_dereference(dev->dma_mr);
 		if (!mr)
 			goto bail;
@@ -824,6 +942,9 @@ int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
 		isge->n = 0;
 		goto ok;
 	}
+	if (rvt_sge_adjacent(last_sge, sge))
+		return 0;
+	rcu_read_lock();
 	mr = rcu_dereference(rkt->table[sge->lkey >> rkt->shift]);
 	if (!mr)
 		goto bail;
@@ -874,12 +995,13 @@ int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
 	isge->m = m;
 	isge->n = n;
 ok:
+	trace_rvt_sge_new(isge, sge);
 	return 1;
 bail_unref:
 	rvt_put_mr(mr);
 bail:
 	rcu_read_unlock();
-	return 0;
+	return -EINVAL;
 }
 EXPORT_SYMBOL(rvt_lkey_ok);
 
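Note on the new rvt_lkey_ok() return convention in mr.c above (0 if compressed, 1 if added, negative errno on failure): the sketch below is a simplified userspace model of how a caller can coalesce adjacent SGEs by advancing its output index by the return value, in the way the updated rvt_post_one_wr() loop does. All names (in_sge, out_sge, store_or_merge, build_sg_list) are hypothetical, and the address wraparound test is only a stand-in for the real MR lookup and bounds checks.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical caller-supplied SGE and internal SGE models. */
struct in_sge  { uint64_t addr; uint32_t length; uint32_t lkey; };
struct out_sge { uint64_t addr; uint32_t length; uint32_t lkey; };

/*
 * Returns 0 if sge was merged into *last, 1 if it was stored in *slot,
 * or -EINVAL if it fails the (stand-in) validity check.
 */
static int store_or_merge(struct out_sge *slot, struct out_sge *last,
			  const struct in_sge *sge)
{
	/* Stand-in validity check: reject ranges that wrap around. */
	if (sge->addr + sge->length < sge->addr)
		return -EINVAL;

	/* Adjacent to the previous entry with the same key: extend it. */
	if (last && last->lkey == sge->lkey &&
	    last->addr + last->length == sge->addr) {
		last->length += sge->length;
		return 0;
	}

	slot->addr = sge->addr;
	slot->length = sge->length;
	slot->lkey = sge->lkey;
	return 1;
}

/* Returns the number of entries written to out, or a negative errno. */
static int build_sg_list(struct out_sge *out, const struct in_sge *in,
			 int num_sge)
{
	struct out_sge *last = NULL;
	int i, j = 0;

	for (i = 0; i < num_sge; i++) {
		int ret;

		if (in[i].length == 0)
			continue;	/* zero-length entries are skipped */
		ret = store_or_merge(&out[j], last, &in[i]);
		if (ret < 0)
			return ret;
		if (ret)
			last = &out[j];	/* remember the newest stored entry */
		j += ret;		/* advance only when a slot was used */
	}
	return j;
}
```

Because the index advances by the return value, a merged entry consumes no output slot, so the resulting list can be shorter than the caller's list.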
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index 8876ee7..22df09a 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -52,6 +52,7 @@
 #include <linux/slab.h>
 #include <rdma/ib_verbs.h>
 #include <rdma/ib_hdrs.h>
+#include <rdma/opa_addr.h>
 #include "qp.h"
 #include "vt.h"
 #include "trace.h"
@@ -421,15 +422,6 @@ static int alloc_qpn(struct rvt_dev_info *rdi, struct rvt_qpn_table *qpt,
 	return ret;
 }
 
-static void free_qpn(struct rvt_qpn_table *qpt, u32 qpn)
-{
-	struct rvt_qpn_map *map;
-
-	map = qpt->map + qpn / RVT_BITS_PER_PAGE;
-	if (map->page)
-		clear_bit(qpn & RVT_BITS_PER_PAGE_MASK, map->page);
-}
-
 /**
  * rvt_clear_mr_refs - Drop help mr refs
  * @qp: rvt qp data structure
@@ -448,13 +440,9 @@ static void rvt_clear_mr_refs(struct rvt_qp *qp, int clr_sends)
 	if (clr_sends) {
 		while (qp->s_last != qp->s_head) {
 			struct rvt_swqe *wqe = rvt_get_swqe_ptr(qp, qp->s_last);
-			unsigned i;
 
-			for (i = 0; i < wqe->wr.num_sge; i++) {
-				struct rvt_sge *sge = &wqe->sg_list[i];
+			rvt_put_swqe(wqe);
 
-				rvt_put_mr(sge->mr);
-			}
 			if (qp->ibqp.qp_type == IB_QPT_UD ||
 			    qp->ibqp.qp_type == IB_QPT_SMI ||
 			    qp->ibqp.qp_type == IB_QPT_GSI)
@@ -470,10 +458,7 @@ static void rvt_clear_mr_refs(struct rvt_qp *qp, int clr_sends)
 		}
 	}
 
-	if (qp->ibqp.qp_type != IB_QPT_RC)
-		return;
-
-	for (n = 0; n < rvt_max_atomic(rdi); n++) {
+	for (n = 0; qp->s_ack_queue && n < rvt_max_atomic(rdi); n++) {
 		struct rvt_ack_entry *e = &qp->s_ack_queue[n];
 
 		if (e->rdma_sge.mr) {
@@ -484,6 +469,113 @@ static void rvt_clear_mr_refs(struct rvt_qp *qp, int clr_sends)
 }
 
 /**
+ * rvt_swqe_has_lkey - return true if lkey is used by swqe
+ * @wqe - the send wqe
+ * @lkey - the lkey
+ *
+ * Test the swqe for using lkey
+ */
+static bool rvt_swqe_has_lkey(struct rvt_swqe *wqe, u32 lkey)
+{
+	int i;
+
+	for (i = 0; i < wqe->wr.num_sge; i++) {
+		struct rvt_sge *sge = &wqe->sg_list[i];
+
+		if (rvt_mr_has_lkey(sge->mr, lkey))
+			return true;
+	}
+	return false;
+}
+
+/**
+ * rvt_qp_sends_has_lkey - return true if qp sends use lkey
+ * @qp - the rvt_qp
+ * @lkey - the lkey
+ */
+static bool rvt_qp_sends_has_lkey(struct rvt_qp *qp, u32 lkey)
+{
+	u32 s_last = qp->s_last;
+
+	while (s_last != qp->s_head) {
+		struct rvt_swqe *wqe = rvt_get_swqe_ptr(qp, s_last);
+
+		if (rvt_swqe_has_lkey(wqe, lkey))
+			return true;
+
+		if (++s_last >= qp->s_size)
+			s_last = 0;
+	}
+	if (qp->s_rdma_mr)
+		if (rvt_mr_has_lkey(qp->s_rdma_mr, lkey))
+			return true;
+	return false;
+}
+
+/**
+ * rvt_qp_acks_has_lkey - return true if acks have lkey
+ * @qp - the qp
+ * @lkey - the lkey
+ */
+static bool rvt_qp_acks_has_lkey(struct rvt_qp *qp, u32 lkey)
+{
+	int i;
+	struct rvt_dev_info *rdi = ib_to_rvt(qp->ibqp.device);
+
+	for (i = 0; qp->s_ack_queue && i < rvt_max_atomic(rdi); i++) {
+		struct rvt_ack_entry *e = &qp->s_ack_queue[i];
+
+		if (rvt_mr_has_lkey(e->rdma_sge.mr, lkey))
+			return true;
+	}
+	return false;
+}
+
+/**
+ * rvt_qp_mr_clean - clean up remote ops for lkey
+ * @qp - the qp
+ * @lkey - the lkey that is being de-registered
+ *
+ * This routine checks if the lkey is being used by
+ * the qp.
+ *
+ * If so, the qp is put into an error state to elminate
+ * If so, the qp is put into an error state to eliminate
+ */
+void rvt_qp_mr_clean(struct rvt_qp *qp, u32 lkey)
+{
+	bool lastwqe = false;
+
+	if (qp->ibqp.qp_type == IB_QPT_SMI ||
+	    qp->ibqp.qp_type == IB_QPT_GSI)
+		/* avoid special QPs */
+		return;
+	spin_lock_irq(&qp->r_lock);
+	spin_lock(&qp->s_hlock);
+	spin_lock(&qp->s_lock);
+
+	if (qp->state == IB_QPS_ERR || qp->state == IB_QPS_RESET)
+		goto check_lwqe;
+
+	if (rvt_ss_has_lkey(&qp->r_sge, lkey) ||
+	    rvt_qp_sends_has_lkey(qp, lkey) ||
+	    rvt_qp_acks_has_lkey(qp, lkey))
+		lastwqe = rvt_error_qp(qp, IB_WC_LOC_PROT_ERR);
+check_lwqe:
+	spin_unlock(&qp->s_lock);
+	spin_unlock(&qp->s_hlock);
+	spin_unlock_irq(&qp->r_lock);
+	if (lastwqe) {
+		struct ib_event ev;
+
+		ev.device = qp->ibqp.device;
+		ev.element.qp = &qp->ibqp;
+		ev.event = IB_EVENT_QP_LAST_WQE_REACHED;
+		qp->ibqp.event_handler(&ev, qp->ibqp.qp_context);
+	}
+}
+
+/**
  * rvt_remove_qp - remove qp form table
  * @rdi: rvt dev struct
  * @qp: qp to remove
@@ -645,6 +737,19 @@ static void rvt_reset_qp(struct rvt_dev_info *rdi, struct rvt_qp *qp,
 	lockdep_assert_held(&qp->s_lock);
 }
 
+/** rvt_free_qpn - Free a qpn from the bit map
+ * @qpt: QP table
+ * @qpn: queue pair number to free
+ */
+static void rvt_free_qpn(struct rvt_qpn_table *qpt, u32 qpn)
+{
+	struct rvt_qpn_map *map;
+
+	map = qpt->map + (qpn & RVT_QPN_MASK) / RVT_BITS_PER_PAGE;
+	if (map->page)
+		clear_bit(qpn & RVT_BITS_PER_PAGE_MASK, map->page);
+}
+
 /**
  * rvt_create_qp - create a queue pair for a device
  * @ibpd: the protection domain who's device we create the queue pair for
@@ -914,7 +1019,7 @@ struct ib_qp *rvt_create_qp(struct ib_pd *ibpd,
 		kref_put(&qp->ip->ref, rvt_release_mmap_info);
 
 bail_qpn:
-	free_qpn(&rdi->qp_dev->qpn_table, qp->ibqp.qp_num);
+	rvt_free_qpn(&rdi->qp_dev->qpn_table, qp->ibqp.qp_num);
 
 bail_rq_wq:
 	if (!qp->ip)
@@ -1062,6 +1167,7 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	int mig = 0;
 	int pmtu = 0; /* for gcc warning only */
 	enum rdma_link_layer link;
+	int opa_ah;
 
 	link = rdma_port_get_link_layer(ibqp->device, qp->port_num);
 
@@ -1072,6 +1178,7 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	cur_state = attr_mask & IB_QP_CUR_STATE ?
 		attr->cur_qp_state : qp->state;
 	new_state = attr_mask & IB_QP_STATE ? attr->qp_state : cur_state;
+	opa_ah = rdma_cap_opa_ah(ibqp->device, qp->port_num);
 
 	if (!ib_modify_qp_is_ok(cur_state, new_state, ibqp->qp_type,
 				attr_mask, link))
@@ -1082,17 +1189,31 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 		goto inval;
 
 	if (attr_mask & IB_QP_AV) {
-		if (rdma_ah_get_dlid(&attr->ah_attr) >=
-		    be16_to_cpu(IB_MULTICAST_LID_BASE))
-			goto inval;
+		if (opa_ah) {
+			if (rdma_ah_get_dlid(&attr->ah_attr) >=
+				opa_get_mcast_base(OPA_MCAST_NR))
+				goto inval;
+		} else {
+			if (rdma_ah_get_dlid(&attr->ah_attr) >=
+				be16_to_cpu(IB_MULTICAST_LID_BASE))
+				goto inval;
+		}
+
 		if (rvt_check_ah(qp->ibqp.device, &attr->ah_attr))
 			goto inval;
 	}
 
 	if (attr_mask & IB_QP_ALT_PATH) {
-		if (rdma_ah_get_dlid(&attr->alt_ah_attr) >=
-		    be16_to_cpu(IB_MULTICAST_LID_BASE))
-			goto inval;
+		if (opa_ah) {
+			if (rdma_ah_get_dlid(&attr->alt_ah_attr) >=
+				opa_get_mcast_base(OPA_MCAST_NR))
+				goto inval;
+		} else {
+			if (rdma_ah_get_dlid(&attr->alt_ah_attr) >=
+				be16_to_cpu(IB_MULTICAST_LID_BASE))
+				goto inval;
+		}
+
 		if (rvt_check_ah(qp->ibqp.device, &attr->alt_ah_attr))
 			goto inval;
 		if (attr->alt_pkey_index >= rvt_get_npkeys(rdi))
@@ -1239,7 +1360,6 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 
 	if (attr_mask & IB_QP_PATH_MTU) {
 		qp->pmtu = rdi->driver_f.mtu_from_qp(rdi, qp, pmtu);
-		qp->path_mtu = rdi->driver_f.mtu_to_path_mtu(qp->pmtu);
 		qp->log_pmtu = ilog2(qp->pmtu);
 	}
 
@@ -1301,19 +1421,6 @@ int rvt_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 	return -EINVAL;
 }
 
-/** rvt_free_qpn - Free a qpn from the bit map
- * @qpt: QP table
- * @qpn: queue pair number to free
- */
-static void rvt_free_qpn(struct rvt_qpn_table *qpt, u32 qpn)
-{
-	struct rvt_qpn_map *map;
-
-	map = qpt->map + qpn / RVT_BITS_PER_PAGE;
-	if (map->page)
-		clear_bit(qpn & RVT_BITS_PER_PAGE_MASK, map->page);
-}
-
 /**
  * rvt_destroy_qp - destroy a queue pair
  * @ibqp: the queue pair to destroy
@@ -1375,7 +1482,7 @@ int rvt_query_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 
 	attr->qp_state = qp->state;
 	attr->cur_qp_state = attr->qp_state;
-	attr->path_mtu = qp->path_mtu;
+	attr->path_mtu = rdi->driver_f.mtu_to_path_mtu(qp->pmtu);
 	attr->path_mig_state = qp->s_mig_state;
 	attr->qkey = qp->qkey;
 	attr->rq_psn = qp->r_psn & rdi->dparms.psn_mask;
@@ -1695,22 +1802,23 @@ static int rvt_post_one_wr(struct rvt_qp *qp,
 	wqe->length = 0;
 	j = 0;
 	if (wr->num_sge) {
+		struct rvt_sge *last_sge = NULL;
+
 		acc = wr->opcode >= IB_WR_RDMA_READ ?
 			IB_ACCESS_LOCAL_WRITE : 0;
 		for (i = 0; i < wr->num_sge; i++) {
 			u32 length = wr->sg_list[i].length;
-			int ok;
 
 			if (length == 0)
 				continue;
-			ok = rvt_lkey_ok(rkt, pd, &wqe->sg_list[j],
-					 &wr->sg_list[i], acc);
-			if (!ok) {
-				ret = -EINVAL;
+			ret = rvt_lkey_ok(rkt, pd, &wqe->sg_list[j], last_sge,
+					  &wr->sg_list[i], acc);
+			if (unlikely(ret < 0))
 				goto bail_inval_free;
-			}
 			wqe->length += length;
-			j++;
+			if (ret)
+				last_sge = &wqe->sg_list[j];
+			j += ret;
 		}
 		wqe->wr.num_sge = j;
 	}
@@ -1757,7 +1865,7 @@ static int rvt_post_one_wr(struct rvt_qp *qp,
 		wqe->wr.send_flags &= ~RVT_SEND_RESERVE_USED;
 		qp->s_avail--;
 	}
-	trace_rvt_post_one_wr(qp, wqe);
+	trace_rvt_post_one_wr(qp, wqe, wr->num_sge);
 	smp_wmb(); /* see request builders */
 	qp->s_head = next;
 
@@ -2065,3 +2173,147 @@ enum hrtimer_restart rvt_rc_rnr_retry(struct hrtimer *t)
 	return HRTIMER_NORESTART;
 }
 EXPORT_SYMBOL(rvt_rc_rnr_retry);
+
+/**
+ * rvt_qp_iter_init - initialize for QP iteration
+ * @rdi - rvt devinfo
+ * @v - u64 value
+ *
+ * This returns an iterator suitable for iterating QPs
+ * in the system.
+ *
+ * The @cb is a user defined callback and @v is a 64
+ * bit value passed to and relevant for processing in the
+ * @cb.  An example use case would be to alter QP processing
+ * based on criteria not part of the rvt_qp.
+ *
+ * Use cases that require memory allocation to succeed
+ * must preallocate appropriately.
+ *
+ * Return: a pointer to an rvt_qp_iter or NULL
+ */
+struct rvt_qp_iter *rvt_qp_iter_init(struct rvt_dev_info *rdi,
+				     u64 v,
+				     void (*cb)(struct rvt_qp *qp, u64 v))
+{
+	struct rvt_qp_iter *i;
+
+	i = kzalloc(sizeof(*i), GFP_KERNEL);
+	if (!i)
+		return NULL;
+
+	i->rdi = rdi;
+	/* number of special QPs (SMI/GSI) for device */
+	i->specials = rdi->ibdev.phys_port_cnt * 2;
+	i->v = v;
+	i->cb = cb;
+
+	return i;
+}
+EXPORT_SYMBOL(rvt_qp_iter_init);
+
+/**
+ * rvt_qp_iter_next - return the next QP in iter
+ * @iter - the iterator
+ *
+ * Fine grained QP iterator suitable for use
+ * with debugfs seq_file mechanisms.
+ *
+ * Updates iter->qp with the current QP when the return
+ * value is 0.
+ *
+ * Return: 0 - iter->qp is valid, 1 - no more QPs
+ */
+int rvt_qp_iter_next(struct rvt_qp_iter *iter)
+	__must_hold(RCU)
+{
+	int n = iter->n;
+	int ret = 1;
+	struct rvt_qp *pqp = iter->qp;
+	struct rvt_qp *qp;
+	struct rvt_dev_info *rdi = iter->rdi;
+
+	/*
+	 * The approach is to consider the special qps
+	 * as additional table entries before the
+	 * real hash table.  Since the qp code sets
+	 * the qp->next hash link to NULL, this works just fine.
+	 *
+	 * iter->specials is 2 * # ports
+	 *
+	 * n = 0..iter->specials - 1 are the special qp indices
+	 *
+	 * n = iter->specials..rdi->qp_dev->qp_table_size + iter->specials - 1
+	 * are the potential hash bucket entries
+	 *
+	 */
+	for (; n < rdi->qp_dev->qp_table_size + iter->specials; n++) {
+		if (pqp) {
+			qp = rcu_dereference(pqp->next);
+		} else {
+			if (n < iter->specials) {
+				struct rvt_ibport *rvp;
+				int pidx;
+
+				pidx = n % rdi->ibdev.phys_port_cnt;
+				rvp = rdi->ports[pidx];
+				qp = rcu_dereference(rvp->qp[n & 1]);
+			} else {
+				qp = rcu_dereference(
+					rdi->qp_dev->qp_table[
+						(n - iter->specials)]);
+			}
+		}
+		pqp = qp;
+		if (qp) {
+			iter->qp = qp;
+			iter->n = n;
+			return 0;
+		}
+	}
+	return ret;
+}
+EXPORT_SYMBOL(rvt_qp_iter_next);
+
+/**
+ * rvt_qp_iter - iterate all QPs
+ * @rdi: rvt devinfo
+ * @v: a 64 bit value
+ * @cb: a callback
+ *
+ * This provides a way for iterating all QPs.
+ *
+ * The @cb is a user defined callback and @v is a 64
+ * bit value passed to and relevant for processing in the
+ * cb.  An example use case would be to alter QP processing
+ * based on criteria not part of the rvt_qp.
+ *
+ * The code has an internal iterator to simplify
+ * non seq_file use cases.
+ */
+void rvt_qp_iter(struct rvt_dev_info *rdi,
+		 u64 v,
+		 void (*cb)(struct rvt_qp *qp, u64 v))
+{
+	int ret;
+	struct rvt_qp_iter i = {
+		.rdi = rdi,
+		.specials = rdi->ibdev.phys_port_cnt * 2,
+		.v = v,
+		.cb = cb
+	};
+
+	rcu_read_lock();
+	do {
+		ret = rvt_qp_iter_next(&i);
+		if (!ret) {
+			rvt_get_qp(i.qp);
+			rcu_read_unlock();
+			i.cb(i.qp, i.v);
+			rcu_read_lock();
+			rvt_put_qp(i.qp);
+		}
+	} while (!ret);
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL(rvt_qp_iter);
diff --git a/drivers/infiniband/sw/rdmavt/trace_mr.h b/drivers/infiniband/sw/rdmavt/trace_mr.h
index 3318a6c..976e482 100644
--- a/drivers/infiniband/sw/rdmavt/trace_mr.h
+++ b/drivers/infiniband/sw/rdmavt/trace_mr.h
@@ -103,6 +103,68 @@ DEFINE_EVENT(
 	TP_PROTO(struct rvt_mregion *mr, u16 m, u16 n, void *v, size_t len),
 	TP_ARGS(mr, m, n, v, len));
 
+DECLARE_EVENT_CLASS(
+	rvt_sge_template,
+	TP_PROTO(struct rvt_sge *sge, struct ib_sge *isge),
+	TP_ARGS(sge, isge),
+	TP_STRUCT__entry(
+		RDI_DEV_ENTRY(ib_to_rvt(sge->mr->pd->device))
+		__field(struct rvt_mregion *, mr)
+		__field(struct rvt_sge *, sge)
+		__field(struct ib_sge *, isge)
+		__field(void *, vaddr)
+		__field(u64, ivaddr)
+		__field(u32, lkey)
+		__field(u32, sge_length)
+		__field(u32, length)
+		__field(u32, ilength)
+		__field(int, user)
+		__field(u16, m)
+		__field(u16, n)
+	),
+	TP_fast_assign(
+		RDI_DEV_ASSIGN(ib_to_rvt(sge->mr->pd->device));
+		__entry->mr = sge->mr;
+		__entry->sge = sge;
+		__entry->isge = isge;
+		__entry->vaddr = sge->vaddr;
+		__entry->ivaddr = isge->addr;
+		__entry->lkey = sge->mr->lkey;
+		__entry->sge_length = sge->sge_length;
+		__entry->length = sge->length;
+		__entry->ilength = isge->length;
+		__entry->m = sge->m;
+		__entry->n = sge->n;
+		__entry->user = ibpd_to_rvtpd(sge->mr->pd)->user;
+	),
+	TP_printk(
+		"[%s] mr %p sge %p isge %p vaddr %p ivaddr %llx lkey %x sge_length %u length %u ilength %u m %u n %u user %u",
+		__get_str(dev),
+		__entry->mr,
+		__entry->sge,
+		__entry->isge,
+		__entry->vaddr,
+		__entry->ivaddr,
+		__entry->lkey,
+		__entry->sge_length,
+		__entry->length,
+		__entry->ilength,
+		__entry->m,
+		__entry->n,
+		__entry->user
+	)
+);
+
+DEFINE_EVENT(
+	rvt_sge_template, rvt_sge_adjacent,
+	TP_PROTO(struct rvt_sge *sge, struct ib_sge *isge),
+	TP_ARGS(sge, isge));
+
+DEFINE_EVENT(
+	rvt_sge_template, rvt_sge_new,
+	TP_PROTO(struct rvt_sge *sge, struct ib_sge *isge),
+	TP_ARGS(sge, isge));
+
 #endif /* __RVT_TRACE_MR_H */
 
 #undef TRACE_INCLUDE_PATH
diff --git a/drivers/infiniband/sw/rdmavt/trace_tx.h b/drivers/infiniband/sw/rdmavt/trace_tx.h
index a613a22..0ef25fc 100644
--- a/drivers/infiniband/sw/rdmavt/trace_tx.h
+++ b/drivers/infiniband/sw/rdmavt/trace_tx.h
@@ -84,12 +84,12 @@ __print_symbolic(opcode,                                   \
 	wr_opcode_name(RESERVED10))
 
 #define POS_PRN \
-"[%s] wqe %p wr_id %llx send_flags %x qpn %x qpt %u psn %x lpsn %x ssn %x length %u opcode 0x%.2x,%s size %u avail %u head %u last %u pid %u num_sge %u"
+"[%s] wqe %p wr_id %llx send_flags %x qpn %x qpt %u psn %x lpsn %x ssn %x length %u opcode 0x%.2x,%s size %u avail %u head %u last %u pid %u num_sge %u wr_num_sge %u"
 
 TRACE_EVENT(
 	rvt_post_one_wr,
-	TP_PROTO(struct rvt_qp *qp, struct rvt_swqe *wqe),
-	TP_ARGS(qp, wqe),
+	TP_PROTO(struct rvt_qp *qp, struct rvt_swqe *wqe, int wr_num_sge),
+	TP_ARGS(qp, wqe, wr_num_sge),
 	TP_STRUCT__entry(
 		RDI_DEV_ENTRY(ib_to_rvt(qp->ibqp.device))
 		__field(u64, wr_id)
@@ -108,6 +108,7 @@ TRACE_EVENT(
 		__field(int, send_flags)
 		__field(pid_t, pid)
 		__field(int, num_sge)
+		__field(int, wr_num_sge)
 	),
 	TP_fast_assign(
 		RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
@@ -127,6 +128,7 @@ TRACE_EVENT(
 		__entry->ssn = wqe->ssn;
 		__entry->send_flags = wqe->wr.send_flags;
 		__entry->num_sge = wqe->wr.num_sge;
+		__entry->wr_num_sge = wr_num_sge;
 	),
 	TP_printk(
 		POS_PRN,
@@ -146,7 +148,8 @@ TRACE_EVENT(
 		__entry->head,
 		__entry->last,
 		__entry->pid,
-		__entry->num_sge
+		__entry->num_sge,
+		__entry->wr_num_sge
 	)
 );
 
diff --git a/drivers/infiniband/sw/rdmavt/vt.c b/drivers/infiniband/sw/rdmavt/vt.c
index 0d7c6bb..64bdd44 100644
--- a/drivers/infiniband/sw/rdmavt/vt.c
+++ b/drivers/infiniband/sw/rdmavt/vt.c
@@ -202,8 +202,13 @@ static int rvt_modify_port(struct ib_device *ibdev, u8 port_num,
 		return -EINVAL;
 
 	rvp = rdi->ports[port_index];
-	rvp->port_cap_flags |= props->set_port_cap_mask;
-	rvp->port_cap_flags &= ~props->clr_port_cap_mask;
+	if (port_modify_mask & IB_PORT_OPA_MASK_CHG) {
+		rvp->port_cap3_flags |= props->set_port_cap_mask;
+		rvp->port_cap3_flags &= ~props->clr_port_cap_mask;
+	} else {
+		rvp->port_cap_flags |= props->set_port_cap_mask;
+		rvp->port_cap_flags &= ~props->clr_port_cap_mask;
+	}
 
 	if (props->set_port_cap_mask || props->clr_port_cap_mask)
 		rdi->driver_f.cap_mask_chg(rdi, port_num);
diff --git a/drivers/infiniband/sw/rxe/rxe.c b/drivers/infiniband/sw/rxe/rxe.c
index c21c913..8c3d30b 100644
--- a/drivers/infiniband/sw/rxe/rxe.c
+++ b/drivers/infiniband/sw/rxe/rxe.c
@@ -38,7 +38,6 @@
 MODULE_AUTHOR("Bob Pearson, Frank Zago, John Groves, Kamal Heib");
 MODULE_DESCRIPTION("Soft RDMA transport");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION("0.2");
 
 /* free resources for all ports on a device */
 static void rxe_cleanup_ports(struct rxe_dev *rxe)
diff --git a/drivers/infiniband/sw/rxe/rxe.h b/drivers/infiniband/sw/rxe/rxe.h
index 1ac5b85..6447d73 100644
--- a/drivers/infiniband/sw/rxe/rxe.h
+++ b/drivers/infiniband/sw/rxe/rxe.h
@@ -97,7 +97,7 @@ int rxe_rcv(struct sk_buff *skb);
 
 void rxe_dev_put(struct rxe_dev *rxe);
 struct rxe_dev *net_to_rxe(struct net_device *ndev);
-struct rxe_dev *get_rxe_by_name(const char* name);
+struct rxe_dev *get_rxe_by_name(const char *name);
 
 void rxe_port_up(struct rxe_dev *rxe);
 void rxe_port_down(struct rxe_dev *rxe);
diff --git a/drivers/infiniband/sw/rxe/rxe_av.c b/drivers/infiniband/sw/rxe/rxe_av.c
index 5bddf46..1cc9e2e 100644
--- a/drivers/infiniband/sw/rxe/rxe_av.c
+++ b/drivers/infiniband/sw/rxe/rxe_av.c
@@ -38,18 +38,13 @@ int rxe_av_chk_attr(struct rxe_dev *rxe, struct rdma_ah_attr *attr)
 {
 	struct rxe_port *port;
 
-	if (rdma_ah_get_port_num(attr) != 1) {
-		pr_info("invalid port_num = %d\n", rdma_ah_get_port_num(attr));
-		return -EINVAL;
-	}
-
 	port = &rxe->port;
 
 	if (rdma_ah_get_ah_flags(attr) & IB_AH_GRH) {
 		u8 sgid_index = rdma_ah_read_grh(attr)->sgid_index;
 
 		if (sgid_index > port->attr.gid_tbl_len) {
-			pr_info("invalid sgid index = %d\n", sgid_index);
+			pr_warn("invalid sgid index = %d\n", sgid_index);
 			return -EINVAL;
 		}
 	}
diff --git a/drivers/infiniband/sw/rxe/rxe_cq.c b/drivers/infiniband/sw/rxe/rxe_cq.c
index 49fe42c..c4aabf7 100644
--- a/drivers/infiniband/sw/rxe/rxe_cq.c
+++ b/drivers/infiniband/sw/rxe/rxe_cq.c
@@ -69,6 +69,14 @@ int rxe_cq_chk_attr(struct rxe_dev *rxe, struct rxe_cq *cq,
 static void rxe_send_complete(unsigned long data)
 {
 	struct rxe_cq *cq = (struct rxe_cq *)data;
+	unsigned long flags;
+
+	spin_lock_irqsave(&cq->cq_lock, flags);
+	if (cq->is_dying) {
+		spin_unlock_irqrestore(&cq->cq_lock, flags);
+		return;
+	}
+	spin_unlock_irqrestore(&cq->cq_lock, flags);
 
 	cq->ibcq.comp_handler(&cq->ibcq, cq->ibcq.cq_context);
 }
@@ -97,6 +105,8 @@ int rxe_cq_from_init(struct rxe_dev *rxe, struct rxe_cq *cq, int cqe,
 	if (udata)
 		cq->is_user = 1;
 
+	cq->is_dying = false;
+
 	tasklet_init(&cq->comp_task, rxe_send_complete, (unsigned long)cq);
 
 	spin_lock_init(&cq->cq_lock);
@@ -156,6 +166,15 @@ int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited)
 	return 0;
 }
 
+void rxe_cq_disable(struct rxe_cq *cq)
+{
+	unsigned long flags;
+
+	spin_lock_irqsave(&cq->cq_lock, flags);
+	cq->is_dying = true;
+	spin_unlock_irqrestore(&cq->cq_lock, flags);
+}
+
 void rxe_cq_cleanup(struct rxe_pool_entry *arg)
 {
 	struct rxe_cq *cq = container_of(arg, typeof(*cq), pelem);
diff --git a/drivers/infiniband/sw/rxe/rxe_hw_counters.c b/drivers/infiniband/sw/rxe/rxe_hw_counters.c
index 7ef90aa..6aeb7a1 100644
--- a/drivers/infiniband/sw/rxe/rxe_hw_counters.c
+++ b/drivers/infiniband/sw/rxe/rxe_hw_counters.c
@@ -33,7 +33,7 @@
 #include "rxe.h"
 #include "rxe_hw_counters.h"
 
-const char * const rxe_counter_name[] = {
+static const char * const rxe_counter_name[] = {
 	[RXE_CNT_SENT_PKTS]           =  "sent_pkts",
 	[RXE_CNT_RCVD_PKTS]           =  "rcvd_pkts",
 	[RXE_CNT_DUP_REQ]             =  "duplicate_request",
diff --git a/drivers/infiniband/sw/rxe/rxe_loc.h b/drivers/infiniband/sw/rxe/rxe_loc.h
index d6299ed..77b3ed0 100644
--- a/drivers/infiniband/sw/rxe/rxe_loc.h
+++ b/drivers/infiniband/sw/rxe/rxe_loc.h
@@ -64,6 +64,8 @@ int rxe_cq_resize_queue(struct rxe_cq *cq, int new_cqe, struct ib_udata *udata);
 
 int rxe_cq_post(struct rxe_cq *cq, struct rxe_cqe *cqe, int solicited);
 
+void rxe_cq_disable(struct rxe_cq *cq);
+
 void rxe_cq_cleanup(struct rxe_pool_entry *arg);
 
 /* rxe_mcast.c */
@@ -219,8 +221,6 @@ static inline void rxe_advance_resp_resource(struct rxe_qp *qp)
 void retransmit_timer(unsigned long data);
 void rnr_nak_timer(unsigned long data);
 
-void dump_qp(struct rxe_qp *qp);
-
 /* rxe_srq.c */
 #define IB_SRQ_INIT_MASK (~IB_SRQ_LIMIT)
 
@@ -250,7 +250,7 @@ void rxe_resp_queue_pkt(struct rxe_dev *rxe,
 void rxe_comp_queue_pkt(struct rxe_dev *rxe,
 			struct rxe_qp *qp, struct sk_buff *skb);
 
-static inline unsigned wr_opcode_mask(int opcode, struct rxe_qp *qp)
+static inline unsigned int wr_opcode_mask(int opcode, struct rxe_qp *qp)
 {
 	return rxe_wr_opcode_info[opcode].mask[qp->ibqp.qp_type];
 }
diff --git a/drivers/infiniband/sw/rxe/rxe_mmap.c b/drivers/infiniband/sw/rxe/rxe_mmap.c
index bd812e0..d22431e 100644
--- a/drivers/infiniband/sw/rxe/rxe_mmap.c
+++ b/drivers/infiniband/sw/rxe/rxe_mmap.c
@@ -76,7 +76,7 @@ static void rxe_vma_close(struct vm_area_struct *vma)
 	kref_put(&ip->ref, rxe_mmap_release);
 }
 
-static struct vm_operations_struct rxe_vm_ops = {
+static const struct vm_operations_struct rxe_vm_ops = {
 	.open = rxe_vma_open,
 	.close = rxe_vma_close,
 };
diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index e37cc89..5c2684b 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -367,11 +367,11 @@ int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length,
 		dest = (dir == to_mem_obj) ?
 			((void *)(uintptr_t)iova) : addr;
 
+		memcpy(dest, src, length);
+
 		if (crcp)
 			*crcp = rxe_crc32(to_rdev(mem->pd->ibpd.device),
-					*crcp, src, length);
-
-		memcpy(dest, src, length);
+					*crcp, dest, length);
 
 		return 0;
 	}
@@ -401,11 +401,11 @@ int rxe_mem_copy(struct rxe_mem *mem, u64 iova, void *addr, int length,
 		if (bytes > length)
 			bytes = length;
 
+		memcpy(dest, src, bytes);
+
 		if (crcp)
 			crc = rxe_crc32(to_rdev(mem->pd->ibpd.device),
-					crc, src, bytes);
-
-		memcpy(dest, src, bytes);
+					crc, dest, bytes);
 
 		length	-= bytes;
 		addr	+= bytes;
diff --git a/drivers/infiniband/sw/rxe/rxe_net.c b/drivers/infiniband/sw/rxe/rxe_net.c
index 08f3f90..59dee10 100644
--- a/drivers/infiniband/sw/rxe/rxe_net.c
+++ b/drivers/infiniband/sw/rxe/rxe_net.c
@@ -191,7 +191,7 @@ static struct dst_entry *rxe_find_route(struct rxe_dev *rxe,
 	if (qp_type(qp) == IB_QPT_RC)
 		dst = sk_dst_get(qp->sk->sk);
 
-	if (!dst || !(dst->obsolete && dst->ops->check(dst, 0))) {
+	if (!dst || !dst_check(dst, qp->dst_cookie)) {
 		if (dst)
 			dst_release(dst);
 
@@ -209,6 +209,11 @@ static struct dst_entry *rxe_find_route(struct rxe_dev *rxe,
 			saddr6 = &av->sgid_addr._sockaddr_in6.sin6_addr;
 			daddr6 = &av->dgid_addr._sockaddr_in6.sin6_addr;
 			dst = rxe_find_route6(rxe->ndev, saddr6, daddr6);
+#if IS_ENABLED(CONFIG_IPV6)
+			if (dst)
+				qp->dst_cookie =
+					rt6_get_cookie((struct rt6_info *)dst);
+#endif
 		}
 	}
 
@@ -337,7 +342,7 @@ static void prepare_ipv6_hdr(struct dst_entry *dst, struct sk_buff *skb,
 	memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
 	IPCB(skb)->flags &= ~(IPSKB_XFRM_TUNNEL_SIZE | IPSKB_XFRM_TRANSFORMED
 			    | IPSKB_REROUTED);
-	skb_dst_set(skb, dst);
+	skb_dst_set(skb, dst_clone(dst));
 
 	__skb_push(skb, sizeof(*ip6h));
 	skb_reset_network_header(skb);
@@ -388,7 +393,7 @@ static int prepare6(struct rxe_dev *rxe, struct rxe_pkt_info *pkt,
 		    struct sk_buff *skb, struct rxe_av *av)
 {
 	struct rxe_qp *qp = pkt->qp;
-	struct dst_entry *dst = NULL;
+	struct dst_entry *dst;
 	struct in6_addr *saddr = &av->sgid_addr._sockaddr_in6.sin6_addr;
 	struct in6_addr *daddr = &av->dgid_addr._sockaddr_in6.sin6_addr;
 
@@ -460,12 +465,17 @@ int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb)
 	nskb->destructor = rxe_skb_tx_dtor;
 	nskb->sk = pkt->qp->sk->sk;
 
+	rxe_add_ref(pkt->qp);
+	atomic_inc(&pkt->qp->skb_out);
+
 	if (av->network_type == RDMA_NETWORK_IPV4) {
 		err = ip_local_out(dev_net(skb_dst(skb)->dev), nskb->sk, nskb);
 	} else if (av->network_type == RDMA_NETWORK_IPV6) {
 		err = ip6_local_out(dev_net(skb_dst(skb)->dev), nskb->sk, nskb);
 	} else {
 		pr_err("Unknown layer 3 protocol: %d\n", av->network_type);
+		atomic_dec(&pkt->qp->skb_out);
+		rxe_drop_ref(pkt->qp);
 		kfree_skb(nskb);
 		return -EINVAL;
 	}
@@ -475,10 +485,7 @@ int rxe_send(struct rxe_dev *rxe, struct rxe_pkt_info *pkt, struct sk_buff *skb)
 		return -EAGAIN;
 	}
 
-	rxe_add_ref(pkt->qp);
-	atomic_inc(&pkt->qp->skb_out);
 	kfree_skb(skb);
-
 	return 0;
 }
 
@@ -644,8 +651,13 @@ static int rxe_notify(struct notifier_block *not_blk,
 		pr_info("%s changed mtu to %d\n", ndev->name, ndev->mtu);
 		rxe_set_mtu(rxe, ndev->mtu);
 		break;
-	case NETDEV_REBOOT:
 	case NETDEV_CHANGE:
+		if (netif_running(ndev) && netif_carrier_ok(ndev))
+			rxe_port_up(rxe);
+		else
+			rxe_port_down(rxe);
+		break;
+	case NETDEV_REBOOT:
 	case NETDEV_GOING_DOWN:
 	case NETDEV_CHANGEADDR:
 	case NETDEV_CHANGENAME:
diff --git a/drivers/infiniband/sw/rxe/rxe_pool.c b/drivers/infiniband/sw/rxe/rxe_pool.c
index 75d11ee..c1b5f38 100644
--- a/drivers/infiniband/sw/rxe/rxe_pool.c
+++ b/drivers/infiniband/sw/rxe/rxe_pool.c
@@ -188,7 +188,7 @@ int rxe_pool_init(
 	struct rxe_dev		*rxe,
 	struct rxe_pool		*pool,
 	enum rxe_elem_type	type,
-	unsigned		max_elem)
+	unsigned int		max_elem)
 {
 	int			err = 0;
 	size_t			size = rxe_type_info[type].size;
diff --git a/drivers/infiniband/sw/rxe/rxe_qp.c b/drivers/infiniband/sw/rxe/rxe_qp.c
index 80ccc7c..00bda93 100644
--- a/drivers/infiniband/sw/rxe/rxe_qp.c
+++ b/drivers/infiniband/sw/rxe/rxe_qp.c
@@ -851,13 +851,8 @@ void rxe_qp_cleanup(struct rxe_pool_entry *arg)
 		qp->resp.mr = NULL;
 	}
 
-	if (qp_type(qp) == IB_QPT_RC) {
-		struct dst_entry *dst = NULL;
-
-		dst = sk_dst_get(qp->sk->sk);
-		if (dst)
-			dst_release(dst);
-	}
+	if (qp_type(qp) == IB_QPT_RC)
+		sk_dst_reset(qp->sk->sk);
 
 	free_rd_atomic_resources(qp);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_req.c b/drivers/infiniband/sw/rxe/rxe_req.c
index 7ee465d..d84222f 100644
--- a/drivers/infiniband/sw/rxe/rxe_req.c
+++ b/drivers/infiniband/sw/rxe/rxe_req.c
@@ -43,7 +43,7 @@ static int next_opcode(struct rxe_qp *qp, struct rxe_send_wqe *wqe,
 
 static inline void retry_first_write_send(struct rxe_qp *qp,
 					  struct rxe_send_wqe *wqe,
-					  unsigned mask, int npsn)
+					  unsigned int mask, int npsn)
 {
 	int i;
 
@@ -594,8 +594,10 @@ int rxe_requester(void *arg)
 	rxe_add_ref(qp);
 
 next_wqe:
-	if (unlikely(!qp->valid))
+	if (unlikely(!qp->valid)) {
+		rxe_drain_req_pkts(qp, true);
 		goto exit;
+	}
 
 	if (unlikely(qp->req.state == QP_STATE_ERROR)) {
 		rxe_drain_req_pkts(qp, true);
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index a958ee9..4240866 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -1055,7 +1055,7 @@ static struct resp_res *find_resource(struct rxe_qp *qp, u32 psn)
 {
 	int i;
 
-	for (i = 0; i < qp->attr.max_rd_atomic; i++) {
+	for (i = 0; i < qp->attr.max_dest_rd_atomic; i++) {
 		struct resp_res *res = &qp->resp.resources[i];
 
 		if (res->type == 0)
diff --git a/drivers/infiniband/sw/rxe/rxe_task.c b/drivers/infiniband/sw/rxe/rxe_task.c
index d2a14a1..ea3810b 100644
--- a/drivers/infiniband/sw/rxe/rxe_task.c
+++ b/drivers/infiniband/sw/rxe/rxe_task.c
@@ -78,7 +78,7 @@ void rxe_do_task(unsigned long data)
 
 	default:
 		spin_unlock_irqrestore(&task->state_lock, flags);
-		pr_warn("bad state = %d in rxe_do_task\n", task->state);
+		pr_warn("%s failed with bad state %d\n", __func__, task->state);
 		return;
 	}
 
@@ -105,7 +105,7 @@ void rxe_do_task(unsigned long data)
 			break;
 
 		default:
-			pr_warn("bad state = %d in rxe_do_task\n",
+			pr_warn("%s failed with bad state %d\n", __func__,
 				task->state);
 		}
 		spin_unlock_irqrestore(&task->state_lock, flags);
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index af90a7d..0b362f4 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -51,40 +51,16 @@ static int rxe_query_device(struct ib_device *dev,
 	return 0;
 }
 
-static void rxe_eth_speed_to_ib_speed(int speed, u8 *active_speed,
-				      u8 *active_width)
-{
-	if (speed <= 1000) {
-		*active_width = IB_WIDTH_1X;
-		*active_speed = IB_SPEED_SDR;
-	} else if (speed <= 10000) {
-		*active_width = IB_WIDTH_1X;
-		*active_speed = IB_SPEED_FDR10;
-	} else if (speed <= 20000) {
-		*active_width = IB_WIDTH_4X;
-		*active_speed = IB_SPEED_DDR;
-	} else if (speed <= 30000) {
-		*active_width = IB_WIDTH_4X;
-		*active_speed = IB_SPEED_QDR;
-	} else if (speed <= 40000) {
-		*active_width = IB_WIDTH_4X;
-		*active_speed = IB_SPEED_FDR10;
-	} else {
-		*active_width = IB_WIDTH_4X;
-		*active_speed = IB_SPEED_EDR;
-	}
-}
-
 static int rxe_query_port(struct ib_device *dev,
 			  u8 port_num, struct ib_port_attr *attr)
 {
 	struct rxe_dev *rxe = to_rdev(dev);
 	struct rxe_port *port;
-	u32 speed;
+	int rc = -EINVAL;
 
 	if (unlikely(port_num != 1)) {
 		pr_warn("invalid port_number %d\n", port_num);
-		goto err1;
+		goto out;
 	}
 
 	port = &rxe->port;
@@ -93,29 +69,12 @@ static int rxe_query_port(struct ib_device *dev,
 	*attr = port->attr;
 
 	mutex_lock(&rxe->usdev_lock);
-	if (rxe->ndev->ethtool_ops->get_link_ksettings) {
-		struct ethtool_link_ksettings ks;
-
-		rxe->ndev->ethtool_ops->get_link_ksettings(rxe->ndev, &ks);
-		speed = ks.base.speed;
-	} else if (rxe->ndev->ethtool_ops->get_settings) {
-		struct ethtool_cmd cmd;
-
-		rxe->ndev->ethtool_ops->get_settings(rxe->ndev, &cmd);
-		speed = cmd.speed;
-	} else {
-		pr_warn("%s speed is unknown, defaulting to 1000\n",
-			rxe->ndev->name);
-		speed = 1000;
-	}
-	rxe_eth_speed_to_ib_speed(speed, &attr->active_speed,
-				  &attr->active_width);
+	rc = ib_get_eth_speed(dev, port_num, &attr->active_speed,
+			      &attr->active_width);
 	mutex_unlock(&rxe->usdev_lock);
 
-	return 0;
-
-err1:
-	return -EINVAL;
+out:
+	return rc;
 }
 
 static int rxe_query_gid(struct ib_device *device,
@@ -960,6 +919,8 @@ static int rxe_destroy_cq(struct ib_cq *ibcq)
 {
 	struct rxe_cq *cq = to_rcq(ibcq);
 
+	rxe_cq_disable(cq);
+
 	rxe_drop_ref(cq);
 	return 0;
 }
@@ -1210,8 +1171,8 @@ static int rxe_detach_mcast(struct ib_qp *ibqp, union ib_gid *mgid, u16 mlid)
 	return rxe_mcast_drop_grp_elem(rxe, qp, mgid);
 }
 
-static ssize_t rxe_show_parent(struct device *device,
-			       struct device_attribute *attr, char *buf)
+static ssize_t parent_show(struct device *device,
+			   struct device_attribute *attr, char *buf)
 {
 	struct rxe_dev *rxe = container_of(device, struct rxe_dev,
 					   ib_dev.dev);
@@ -1219,7 +1180,7 @@ static ssize_t rxe_show_parent(struct device *device,
 	return snprintf(buf, 16, "%s\n", rxe_parent_name(rxe, 1));
 }
 
-static DEVICE_ATTR(parent, S_IRUGO, rxe_show_parent, NULL);
+static DEVICE_ATTR_RO(parent);
 
 static struct device_attribute *rxe_dev_attributes[] = {
 	&dev_attr_parent,
@@ -1336,15 +1297,15 @@ int rxe_register_device(struct rxe_dev *rxe)
 
 	err = ib_register_device(dev, NULL);
 	if (err) {
-		pr_warn("rxe_register_device failed, err = %d\n", err);
+		pr_warn("%s failed with error %d\n", __func__, err);
 		goto err1;
 	}
 
 	for (i = 0; i < ARRAY_SIZE(rxe_dev_attributes); ++i) {
 		err = device_create_file(&dev->dev, rxe_dev_attributes[i]);
 		if (err) {
-			pr_warn("device_create_file failed, i = %d, err = %d\n",
-				i, err);
+			pr_warn("%s failed with error %d for attr number %d\n",
+				__func__, err, i);
 			goto err2;
 		}
 	}
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.h b/drivers/infiniband/sw/rxe/rxe_verbs.h
index 5a180fb..0c2dbe4 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.h
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.h
@@ -89,6 +89,7 @@ struct rxe_cq {
 	struct rxe_queue	*queue;
 	spinlock_t		cq_lock;
 	u8			notify;
+	bool			is_dying;
 	int			is_user;
 	struct tasklet_struct	comp_task;
 };
@@ -247,6 +248,7 @@ struct rxe_qp {
 	struct rxe_rq		rq;
 
 	struct socket		*sk;
+	u32			dst_cookie;
 
 	struct rxe_av		pri_av;
 	struct rxe_av		alt_av;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib.h b/drivers/infiniband/ulp/ipoib/ipoib.h
index 7ac2505..4a5c7a0 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib.h
+++ b/drivers/infiniband/ulp/ipoib/ipoib.h
@@ -337,6 +337,7 @@ struct ipoib_dev_priv {
 
 	struct rw_semaphore vlan_rwsem;
 	struct mutex mcast_mutex;
+	struct mutex sysfs_mutex;
 
 	struct rb_root  path_tree;
 	struct list_head path_list;
@@ -367,7 +368,7 @@ struct ipoib_dev_priv {
 	u32		  qkey;
 
 	union ib_gid local_gid;
-	u16	     local_lid;
+	u32	     local_lid;
 
 	unsigned int admin_mtu;
 	unsigned int mcast_mtu;
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index d69410c..14b62f7 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1506,9 +1506,14 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr,
 	if (test_bit(IPOIB_FLAG_GOING_DOWN, &priv->flags))
 		return -EPERM;
 
-	if (!rtnl_trylock())
+	if (!mutex_trylock(&priv->sysfs_mutex))
 		return restart_syscall();
 
+	if (!rtnl_trylock()) {
+		mutex_unlock(&priv->sysfs_mutex);
+		return restart_syscall();
+	}
+
 	ret = ipoib_set_mode(dev, buf);
 
 	/* The assumption is that the function ipoib_set_mode returned
@@ -1517,6 +1522,7 @@ static ssize_t set_mode(struct device *d, struct device_attribute *attr,
 	 */
 	if (ret != -EBUSY)
 		rtnl_unlock();
+	mutex_unlock(&priv->sysfs_mutex);
 
 	return (!ret || ret == -EBUSY) ? count : ret;
 }
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
index 184a22f..8dc1e62 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ethtool.c
@@ -63,8 +63,7 @@ static void ipoib_get_drvinfo(struct net_device *netdev,
 {
 	struct ipoib_dev_priv *priv = ipoib_priv(netdev);
 
-	ib_get_device_fw_str(priv->ca, drvinfo->fw_version,
-			     sizeof(drvinfo->fw_version));
+	ib_get_device_fw_str(priv->ca, drvinfo->fw_version);
 
 	strlcpy(drvinfo->bus_info, dev_name(priv->ca->dev.parent),
 		sizeof(drvinfo->bus_info));
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 6c77df34..bac95b5 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -60,7 +60,6 @@ const char ipoib_driver_version[] = DRV_VERSION;
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("IP-over-InfiniBand net driver");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 
 int ipoib_sendq_size __read_mostly = IPOIB_TX_RING_SIZE;
 int ipoib_recvq_size __read_mostly = IPOIB_RX_RING_SIZE;
@@ -100,6 +99,8 @@ static struct net_device *ipoib_get_net_dev_by_params(
 		const union ib_gid *gid, const struct sockaddr *addr,
 		void *client_data);
 static int ipoib_set_mac(struct net_device *dev, void *addr);
+static int ipoib_ioctl(struct net_device *dev, struct ifreq *ifr,
+		       int cmd);
 
 static struct ib_client ipoib_client = {
 	.name   = "ipoib",
@@ -1681,6 +1682,17 @@ static int ipoib_dev_init_default(struct net_device *dev)
 	return -ENOMEM;
 }
 
+static int ipoib_ioctl(struct net_device *dev, struct ifreq *ifr,
+		       int cmd)
+{
+	struct ipoib_dev_priv *priv = ipoib_priv(dev);
+
+	if (!priv->rn_ops->ndo_do_ioctl)
+		return -EOPNOTSUPP;
+
+	return priv->rn_ops->ndo_do_ioctl(dev, ifr, cmd);
+}
+
 int ipoib_dev_init(struct net_device *dev, struct ib_device *ca, int port)
 {
 	struct ipoib_dev_priv *priv = ipoib_priv(dev);
@@ -1835,6 +1847,7 @@ static const struct net_device_ops ipoib_netdev_ops_pf = {
 	.ndo_set_vf_guid	 = ipoib_set_vf_guid,
 	.ndo_set_mac_address	 = ipoib_set_mac,
 	.ndo_get_stats64	 = ipoib_get_stats,
+	.ndo_do_ioctl		 = ipoib_ioctl,
 };
 
 static const struct net_device_ops ipoib_netdev_ops_vf = {
@@ -1848,6 +1861,7 @@ static const struct net_device_ops ipoib_netdev_ops_vf = {
 	.ndo_set_rx_mode	 = ipoib_set_mcast_list,
 	.ndo_get_iflink		 = ipoib_get_iflink,
 	.ndo_get_stats64	 = ipoib_get_stats,
+	.ndo_do_ioctl		 = ipoib_ioctl,
 };
 
 void ipoib_setup_common(struct net_device *dev)
@@ -1879,6 +1893,7 @@ static void ipoib_build_priv(struct net_device *dev)
 	spin_lock_init(&priv->lock);
 	init_rwsem(&priv->vlan_rwsem);
 	mutex_init(&priv->mcast_mutex);
+	mutex_init(&priv->sysfs_mutex);
 
 	INIT_LIST_HEAD(&priv->path_list);
 	INIT_LIST_HEAD(&priv->child_intfs);
@@ -2228,13 +2243,7 @@ static struct net_device *ipoib_add_port(const char *format,
 
 	INIT_IB_EVENT_HANDLER(&priv->event_handler,
 			      priv->ca, ipoib_event);
-	result = ib_register_event_handler(&priv->event_handler);
-	if (result < 0) {
-		printk(KERN_WARNING "%s: ib_register_event_handler failed for "
-		       "port %d (ret = %d)\n",
-		       hca->name, port, result);
-		goto event_failed;
-	}
+	ib_register_event_handler(&priv->event_handler);
 
 	result = register_netdev(priv->dev);
 	if (result) {
@@ -2267,8 +2276,6 @@ static struct net_device *ipoib_add_port(const char *format,
 	set_bit(IPOIB_STOP_NEIGH_GC, &priv->flags);
 	cancel_delayed_work(&priv->neigh_reap_task);
 	flush_workqueue(priv->wq);
-
-event_failed:
 	ipoib_dev_cleanup(priv->dev);
 
 device_init_failed:
@@ -2338,7 +2345,11 @@ static void ipoib_remove_one(struct ib_device *device, void *client_data)
 		cancel_delayed_work(&priv->neigh_reap_task);
 		flush_workqueue(priv->wq);
 
+		/* Wrap rtnl_lock/unlock with mutex to protect sysfs calls */
+		mutex_lock(&priv->sysfs_mutex);
 		unregister_netdev(priv->dev);
+		mutex_unlock(&priv->sysfs_mutex);
+
 		rn->free_rdma_netdev(priv->dev);
 
 		list_for_each_entry_safe(cpriv, tcpriv, &priv->child_intfs, list)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
index 081b33d..9927cd6 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_vlan.c
@@ -133,12 +133,20 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
 	snprintf(intf_name, sizeof intf_name, "%s.%04x",
 		 ppriv->dev->name, pkey);
 
-	if (!rtnl_trylock())
+	if (!mutex_trylock(&ppriv->sysfs_mutex))
 		return restart_syscall();
 
+	if (!rtnl_trylock()) {
+		mutex_unlock(&ppriv->sysfs_mutex);
+		return restart_syscall();
+	}
+
 	priv = ipoib_intf_alloc(ppriv->ca, ppriv->port, intf_name);
-	if (!priv)
+	if (!priv) {
+		rtnl_unlock();
+		mutex_unlock(&ppriv->sysfs_mutex);
 		return -ENOMEM;
+	}
 
 	down_write(&ppriv->vlan_rwsem);
 
@@ -164,8 +172,8 @@ int ipoib_vlan_add(struct net_device *pdev, unsigned short pkey)
 
 out:
 	up_write(&ppriv->vlan_rwsem);
-
 	rtnl_unlock();
+	mutex_unlock(&ppriv->sysfs_mutex);
 
 	if (result) {
 		free_netdev(priv->dev);
@@ -188,9 +196,14 @@ int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey)
 	if (test_bit(IPOIB_FLAG_GOING_DOWN, &ppriv->flags))
 		return -EPERM;
 
-	if (!rtnl_trylock())
+	if (!mutex_trylock(&ppriv->sysfs_mutex))
 		return restart_syscall();
 
+	if (!rtnl_trylock()) {
+		mutex_unlock(&ppriv->sysfs_mutex);
+		return restart_syscall();
+	}
+
 	down_write(&ppriv->vlan_rwsem);
 	list_for_each_entry_safe(priv, tpriv, &ppriv->child_intfs, list) {
 		if (priv->pkey == pkey &&
@@ -208,6 +221,7 @@ int ipoib_vlan_delete(struct net_device *pdev, unsigned short pkey)
 	}
 
 	rtnl_unlock();
+	mutex_unlock(&ppriv->sysfs_mutex);
 
 	if (dev) {
 		free_netdev(dev);
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index 37b33d7..19624e0 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -77,7 +77,6 @@
 MODULE_DESCRIPTION("iSER (iSCSI Extensions for RDMA) Datamover");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Alex Nezhinsky, Dan Bar Dov, Or Gerlitz");
-MODULE_VERSION(DRV_VER);
 
 static struct scsi_host_template iscsi_iser_sht;
 static struct iscsi_transport iscsi_iser_transport;
diff --git a/drivers/infiniband/ulp/iser/iser_verbs.c b/drivers/infiniband/ulp/iser/iser_verbs.c
index 26a004e..55a73b0 100644
--- a/drivers/infiniband/ulp/iser/iser_verbs.c
+++ b/drivers/infiniband/ulp/iser/iser_verbs.c
@@ -106,9 +106,7 @@ static int iser_create_device_ib_res(struct iser_device *device)
 
 	INIT_IB_EVENT_HANDLER(&device->event_handler, ib_dev,
 			      iser_event_handler);
-	if (ib_register_event_handler(&device->event_handler))
-		goto cq_err;
-
+	ib_register_event_handler(&device->event_handler);
 	return 0;
 
 cq_err:
@@ -141,7 +139,7 @@ static void iser_free_device_ib_res(struct iser_device *device)
 		comp->cq = NULL;
 	}
 
-	(void)ib_unregister_event_handler(&device->event_handler);
+	ib_unregister_event_handler(&device->event_handler);
 	ib_dealloc_pd(device->pd);
 
 	kfree(device->comps);
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 0e662656..ceabdb8 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -2710,7 +2710,6 @@ static void __exit isert_exit(void)
 }
 
 MODULE_DESCRIPTION("iSER-Target for mainline target infrastructure");
-MODULE_VERSION("1.0");
 MODULE_AUTHOR("nab@Linux-iSCSI.org");
 MODULE_LICENSE("GPL");
 
diff --git a/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema.c b/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema.c
index cf768dd..21f0b48 100644
--- a/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema.c
+++ b/drivers/infiniband/ulp/opa_vnic/opa_vnic_vema.c
@@ -52,7 +52,9 @@
 
 #include <linux/module.h>
 #include <rdma/ib_addr.h>
-#include <rdma/ib_smi.h>
+#include <rdma/ib_verbs.h>
+#include <rdma/opa_smi.h>
+#include <rdma/opa_port_info.h>
 
 #include "opa_vnic_internal.h"
 
@@ -952,12 +954,7 @@ static int vema_register(struct opa_vnic_ctrl_port *cport)
 
 		INIT_IB_EVENT_HANDLER(&port->event_handler,
 				      cport->ibdev, opa_vnic_event);
-		ret = ib_register_event_handler(&port->event_handler);
-		if (ret) {
-			c_err("port %d: event handler register failed\n", i);
-			vema_unregister(cport);
-			return ret;
-		}
+		ib_register_event_handler(&port->event_handler);
 
 		idr_init(&port->vport_idr);
 		mutex_init(&port->lock);
@@ -980,6 +977,27 @@ static int vema_register(struct opa_vnic_ctrl_port *cport)
 }
 
 /**
+ * opa_vnic_ctrl_config_dev -- This function sends a trap to the EM
+ * by way of ib_modify_port to indicate support for ethernet on the
+ * fabric.
+ * @cport: pointer to control port
+ * @en: enable or disable ethernet on fabric support
+ */
+static void opa_vnic_ctrl_config_dev(struct opa_vnic_ctrl_port *cport, bool en)
+{
+	struct ib_port_modify pm = { 0 };
+	int i;
+
+	if (en)
+		pm.set_port_cap_mask = OPA_CAP_MASK3_IsEthOnFabricSupported;
+	else
+		pm.clr_port_cap_mask = OPA_CAP_MASK3_IsEthOnFabricSupported;
+
+	for (i = 1; i <= cport->num_ports; i++)
+		ib_modify_port(cport->ibdev, i, IB_PORT_OPA_MASK_CHG, &pm);
+}
+
+/**
  * opa_vnic_vema_add_one -- Handle new ib device
  * @device: ib device pointer
  *
@@ -1007,6 +1025,7 @@ static void opa_vnic_vema_add_one(struct ib_device *device)
 		c_info("VNIC client initialized\n");
 
 	ib_set_client_data(device, &opa_vnic_client, cport);
+	opa_vnic_ctrl_config_dev(cport, true);
 }
 
 /**
@@ -1025,6 +1044,7 @@ static void opa_vnic_vema_rem_one(struct ib_device *device,
 		return;
 
 	c_info("removing VNIC client\n");
+	opa_vnic_ctrl_config_dev(cport, false);
 	vema_unregister(cport);
 	kfree(cport);
 }
@@ -1053,4 +1073,3 @@ module_exit(opa_vnic_deinit);
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_AUTHOR("Intel Corporation");
 MODULE_DESCRIPTION("Intel OPA Virtual Network driver");
-MODULE_VERSION(DRV_VERSION);
diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c
index 2354c74..fa5ccdb 100644
--- a/drivers/infiniband/ulp/srp/ib_srp.c
+++ b/drivers/infiniband/ulp/srp/ib_srp.c
@@ -62,7 +62,6 @@
 MODULE_AUTHOR("Roland Dreier");
 MODULE_DESCRIPTION("InfiniBand SCSI RDMA Protocol initiator");
 MODULE_LICENSE("Dual BSD/GPL");
-MODULE_VERSION(DRV_VERSION);
 MODULE_INFO(release_date, DRV_RELDATE);
 
 #if !defined(CONFIG_DYNAMIC_DEBUG)
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 402275b..9e8e922 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -2238,7 +2238,7 @@ static int srpt_write_pending(struct se_cmd *se_cmd)
 				cqe, first_wr);
 		cqe = NULL;
 	}
-	
+
 	ret = ib_post_send(ch->qp, first_wr, &bad_wr);
 	if (ret) {
 		pr_err("%s: ib_post_send() returned %d for %d (avail: %d)\n",
@@ -2530,8 +2530,7 @@ static void srpt_add_one(struct ib_device *device)
 
 	INIT_IB_EVENT_HANDLER(&sdev->event_handler, sdev->device,
 			      srpt_event_handler);
-	if (ib_register_event_handler(&sdev->event_handler))
-		goto err_cm;
+	ib_register_event_handler(&sdev->event_handler);
 
 	sdev->ioctx_ring = (struct srpt_recv_ioctx **)
 		srpt_alloc_ioctx_ring(sdev, sdev->srq_size,
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.h b/drivers/infiniband/ulp/srpt/ib_srpt.h
index cc118385..1b817e5 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.h
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.h
@@ -328,8 +328,8 @@ struct srpt_port {
 	u8			port_guid[24];
 	u8			port_gid[64];
 	u8			port;
-	u16			sm_lid;
-	u16			lid;
+	u32			sm_lid;
+	u32			lid;
 	union ib_gid		gid;
 	struct work_struct	work;
 	struct se_portal_group	port_guid_tpg;
diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c
index 298a6ba..ca0e19a 100644
--- a/drivers/input/joystick/xpad.c
+++ b/drivers/input/joystick/xpad.c
@@ -476,10 +476,21 @@ static const u8 xboxone_hori_init[] = {
 };
 
 /*
- * A rumble packet is required for some PowerA pads to start
+ * A specific rumble packet is required for some PowerA pads to start
  * sending input reports. One of those pads is (0x24c6:0x543a).
  */
-static const u8 xboxone_zerorumble_init[] = {
+static const u8 xboxone_rumblebegin_init[] = {
+	0x09, 0x00, 0x00, 0x09, 0x00, 0x0F, 0x00, 0x00,
+	0x1D, 0x1D, 0xFF, 0x00, 0x00
+};
+
+/*
+ * A rumble packet with zero FF intensity will immediately
+ * terminate the rumbling required to init PowerA pads.
+ * This should happen fast enough that the motors don't
+ * spin up to enough speed to actually vibrate the gamepad.
+ */
+static const u8 xboxone_rumbleend_init[] = {
 	0x09, 0x00, 0x00, 0x09, 0x00, 0x0F, 0x00, 0x00,
 	0x00, 0x00, 0x00, 0x00, 0x00
 };
@@ -494,9 +505,12 @@ static const struct xboxone_init_packet xboxone_init_packets[] = {
 	XBOXONE_INIT_PKT(0x0e6f, 0x0165, xboxone_hori_init),
 	XBOXONE_INIT_PKT(0x0f0d, 0x0067, xboxone_hori_init),
 	XBOXONE_INIT_PKT(0x0000, 0x0000, xboxone_fw2015_init),
-	XBOXONE_INIT_PKT(0x24c6, 0x541a, xboxone_zerorumble_init),
-	XBOXONE_INIT_PKT(0x24c6, 0x542a, xboxone_zerorumble_init),
-	XBOXONE_INIT_PKT(0x24c6, 0x543a, xboxone_zerorumble_init),
+	XBOXONE_INIT_PKT(0x24c6, 0x541a, xboxone_rumblebegin_init),
+	XBOXONE_INIT_PKT(0x24c6, 0x542a, xboxone_rumblebegin_init),
+	XBOXONE_INIT_PKT(0x24c6, 0x543a, xboxone_rumblebegin_init),
+	XBOXONE_INIT_PKT(0x24c6, 0x541a, xboxone_rumbleend_init),
+	XBOXONE_INIT_PKT(0x24c6, 0x542a, xboxone_rumbleend_init),
+	XBOXONE_INIT_PKT(0x24c6, 0x543a, xboxone_rumbleend_init),
 };
 
 struct xpad_output_packet {
diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c
index 16c3046..5af0b7d 100644
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -535,16 +535,17 @@ static void synaptics_apply_quirks(struct psmouse *psmouse,
 	}
 }
 
+static bool synaptics_has_agm(struct synaptics_data *priv)
+{
+	return (SYN_CAP_ADV_GESTURE(priv->info.ext_cap_0c) ||
+		SYN_CAP_IMAGE_SENSOR(priv->info.ext_cap_0c));
+}
+
 static int synaptics_set_advanced_gesture_mode(struct psmouse *psmouse)
 {
 	static u8 param = 0xc8;
-	struct synaptics_data *priv = psmouse->private;
 	int error;
 
-	if (!(SYN_CAP_ADV_GESTURE(priv->info.ext_cap_0c) ||
-	      SYN_CAP_IMAGE_SENSOR(priv->info.ext_cap_0c)))
-		return 0;
-
 	error = psmouse_sliced_command(psmouse, SYN_QUE_MODEL);
 	if (error)
 		return error;
@@ -553,9 +554,6 @@ static int synaptics_set_advanced_gesture_mode(struct psmouse *psmouse)
 	if (error)
 		return error;
 
-	/* Advanced gesture mode also sends multi finger data */
-	priv->info.capabilities |= BIT(1);
-
 	return 0;
 }
 
@@ -578,7 +576,7 @@ static int synaptics_set_mode(struct psmouse *psmouse)
 	if (error)
 		return error;
 
-	if (priv->absolute_mode) {
+	if (priv->absolute_mode && synaptics_has_agm(priv)) {
 		error = synaptics_set_advanced_gesture_mode(psmouse);
 		if (error) {
 			psmouse_err(psmouse,
@@ -766,9 +764,7 @@ static int synaptics_parse_hw_state(const u8 buf[],
 			 ((buf[0] & 0x04) >> 1) |
 			 ((buf[3] & 0x04) >> 2));
 
-		if ((SYN_CAP_ADV_GESTURE(priv->info.ext_cap_0c) ||
-			SYN_CAP_IMAGE_SENSOR(priv->info.ext_cap_0c)) &&
-		    hw->w == 2) {
+		if (synaptics_has_agm(priv) && hw->w == 2) {
 			synaptics_parse_agm(buf, priv, hw);
 			return 1;
 		}
@@ -1033,6 +1029,15 @@ static void synaptics_image_sensor_process(struct psmouse *psmouse,
 	synaptics_report_mt_data(psmouse, sgm, num_fingers);
 }
 
+static bool synaptics_has_multifinger(struct synaptics_data *priv)
+{
+	if (SYN_CAP_MULTIFINGER(priv->info.capabilities))
+		return true;
+
+	/* Advanced gesture mode also sends multi finger data */
+	return synaptics_has_agm(priv);
+}
+
 /*
  *  called for each full received packet from the touchpad
  */
@@ -1079,7 +1084,7 @@ static void synaptics_process_packet(struct psmouse *psmouse)
 		if (SYN_CAP_EXTENDED(info->capabilities)) {
 			switch (hw.w) {
 			case 0 ... 1:
-				if (SYN_CAP_MULTIFINGER(info->capabilities))
+				if (synaptics_has_multifinger(priv))
 					num_fingers = hw.w + 2;
 				break;
 			case 2:
@@ -1123,7 +1128,7 @@ static void synaptics_process_packet(struct psmouse *psmouse)
 		input_report_abs(dev, ABS_TOOL_WIDTH, finger_width);
 
 	input_report_key(dev, BTN_TOOL_FINGER, num_fingers == 1);
-	if (SYN_CAP_MULTIFINGER(info->capabilities)) {
+	if (synaptics_has_multifinger(priv)) {
 		input_report_key(dev, BTN_TOOL_DOUBLETAP, num_fingers == 2);
 		input_report_key(dev, BTN_TOOL_TRIPLETAP, num_fingers == 3);
 	}
@@ -1283,7 +1288,7 @@ static void set_input_params(struct psmouse *psmouse,
 	__set_bit(BTN_TOUCH, dev->keybit);
 	__set_bit(BTN_TOOL_FINGER, dev->keybit);
 
-	if (SYN_CAP_MULTIFINGER(info->capabilities)) {
+	if (synaptics_has_multifinger(priv)) {
 		__set_bit(BTN_TOOL_DOUBLETAP, dev->keybit);
 		__set_bit(BTN_TOOL_TRIPLETAP, dev->keybit);
 	}
diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 6629c472..dccf5b7 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -391,13 +391,6 @@ static int mn_clear_flush_young(struct mmu_notifier *mn,
 	return 0;
 }
 
-static void mn_invalidate_page(struct mmu_notifier *mn,
-			       struct mm_struct *mm,
-			       unsigned long address)
-{
-	__mn_flush_page(mn, address);
-}
-
 static void mn_invalidate_range(struct mmu_notifier *mn,
 				struct mm_struct *mm,
 				unsigned long start, unsigned long end)
@@ -436,7 +429,6 @@ static void mn_release(struct mmu_notifier *mn, struct mm_struct *mm)
 static const struct mmu_notifier_ops iommu_mn = {
 	.release		= mn_release,
 	.clear_flush_young      = mn_clear_flush_young,
-	.invalidate_page        = mn_invalidate_page,
 	.invalidate_range       = mn_invalidate_range,
 };
 
diff --git a/drivers/iommu/intel-svm.c b/drivers/iommu/intel-svm.c
index f167c0d..f620dcc 100644
--- a/drivers/iommu/intel-svm.c
+++ b/drivers/iommu/intel-svm.c
@@ -223,14 +223,6 @@ static void intel_change_pte(struct mmu_notifier *mn, struct mm_struct *mm,
 	intel_flush_svm_range(svm, address, 1, 1, 0);
 }
 
-static void intel_invalidate_page(struct mmu_notifier *mn, struct mm_struct *mm,
-				  unsigned long address)
-{
-	struct intel_svm *svm = container_of(mn, struct intel_svm, notifier);
-
-	intel_flush_svm_range(svm, address, 1, 1, 0);
-}
-
 /* Pages have been freed at this point */
 static void intel_invalidate_range(struct mmu_notifier *mn,
 				   struct mm_struct *mm,
@@ -285,7 +277,6 @@ static void intel_mm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 static const struct mmu_notifier_ops intel_mmuops = {
 	.release = intel_mm_release,
 	.change_pte = intel_change_pte,
-	.invalidate_page = intel_invalidate_page,
 	.invalidate_range = intel_invalidate_range,
 };
 
diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
index 6ab1d3a..48ee1ba 100644
--- a/drivers/irqchip/irq-mips-gic.c
+++ b/drivers/irqchip/irq-mips-gic.c
@@ -1020,8 +1020,11 @@ static int __init gic_of_init(struct device_node *node,
 		gic_len = resource_size(&res);
 	}
 
-	if (mips_cm_present())
+	if (mips_cm_present()) {
 		write_gcr_gic_base(gic_base | CM_GCR_GIC_BASE_GICEN_MSK);
+		/* Ensure GIC region is enabled before trying to access it */
+		__sync();
+	}
 	gic_present = true;
 
 	__gic_init(gic_base, gic_len, cpu_vec, 0, node);
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 0e8ab5b..d24e4b0 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -504,7 +504,6 @@ static int multipath_clone_and_map(struct dm_target *ti, struct request *rq,
 		if (queue_dying) {
 			atomic_inc(&m->pg_init_in_progress);
 			activate_or_offline_path(pgpath);
-			return DM_MAPIO_REQUEUE;
 		}
 		return DM_MAPIO_DELAY_REQUEUE;
 	}
@@ -1458,7 +1457,6 @@ static int noretry_error(blk_status_t error)
 	case BLK_STS_TARGET:
 	case BLK_STS_NEXUS:
 	case BLK_STS_MEDIUM:
-	case BLK_STS_RESOURCE:
 		return 1;
 	}
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 2edbcc2..d669fdd 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -27,16 +27,6 @@
 
 #define DM_MSG_PREFIX "core"
 
-#ifdef CONFIG_PRINTK
-/*
- * ratelimit state to be used in DMXXX_LIMIT().
- */
-DEFINE_RATELIMIT_STATE(dm_ratelimit_state,
-		       DEFAULT_RATELIMIT_INTERVAL,
-		       DEFAULT_RATELIMIT_BURST);
-EXPORT_SYMBOL(dm_ratelimit_state);
-#endif
-
 /*
  * Cookies are numeric values sent with CHANGE and REMOVE
  * uevents while resuming, removing or renaming the device.
@@ -1523,7 +1513,7 @@ static void __split_and_process_bio(struct mapped_device *md,
 	}
 
 	/* drop the extra reference count */
-	dec_pending(ci.io, error);
+	dec_pending(ci.io, errno_to_blk_status(error));
 }
 /*-----------------------------------------------------------------
  * CRUD END
diff --git a/drivers/media/platform/vsp1/vsp1.h b/drivers/media/platform/vsp1/vsp1.h
index 847963b..78ef838 100644
--- a/drivers/media/platform/vsp1/vsp1.h
+++ b/drivers/media/platform/vsp1/vsp1.h
@@ -41,11 +41,11 @@ struct vsp1_rwpf;
 struct vsp1_sru;
 struct vsp1_uds;
 
+#define VSP1_MAX_LIF		2
 #define VSP1_MAX_RPF		5
 #define VSP1_MAX_UDS		3
 #define VSP1_MAX_WPF		4
 
-#define VSP1_HAS_LIF		(1 << 0)
 #define VSP1_HAS_LUT		(1 << 1)
 #define VSP1_HAS_SRU		(1 << 2)
 #define VSP1_HAS_BRU		(1 << 3)
@@ -54,12 +54,14 @@ struct vsp1_uds;
 #define VSP1_HAS_WPF_HFLIP	(1 << 6)
 #define VSP1_HAS_HGO		(1 << 7)
 #define VSP1_HAS_HGT		(1 << 8)
+#define VSP1_HAS_BRS		(1 << 9)
 
 struct vsp1_device_info {
 	u32 version;
 	const char *model;
 	unsigned int gen;
 	unsigned int features;
+	unsigned int lif_count;
 	unsigned int rpf_count;
 	unsigned int uds_count;
 	unsigned int wpf_count;
@@ -76,13 +78,14 @@ struct vsp1_device {
 	struct rcar_fcp_device *fcp;
 	struct device *bus_master;
 
+	struct vsp1_bru *brs;
 	struct vsp1_bru *bru;
 	struct vsp1_clu *clu;
 	struct vsp1_hgo *hgo;
 	struct vsp1_hgt *hgt;
 	struct vsp1_hsit *hsi;
 	struct vsp1_hsit *hst;
-	struct vsp1_lif *lif;
+	struct vsp1_lif *lif[VSP1_MAX_LIF];
 	struct vsp1_lut *lut;
 	struct vsp1_rwpf *rpf[VSP1_MAX_RPF];
 	struct vsp1_sru *sru;
diff --git a/drivers/media/platform/vsp1/vsp1_bru.c b/drivers/media/platform/vsp1/vsp1_bru.c
index 85362c5..e8fd2ae 100644
--- a/drivers/media/platform/vsp1/vsp1_bru.c
+++ b/drivers/media/platform/vsp1/vsp1_bru.c
@@ -33,7 +33,7 @@
 static inline void vsp1_bru_write(struct vsp1_bru *bru, struct vsp1_dl_list *dl,
 				  u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg, data);
+	vsp1_dl_list_write(dl, bru->base + reg, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -332,11 +332,14 @@ static void bru_configure(struct vsp1_entity *entity,
 	/*
 	 * Route BRU input 1 as SRC input to the ROP unit and configure the ROP
 	 * unit with a NOP operation to make BRU input 1 available as the
-	 * Blend/ROP unit B SRC input.
+	 * Blend/ROP unit B SRC input. Only needed for BRU, the BRS has no ROP
+	 * unit.
 	 */
-	vsp1_bru_write(bru, dl, VI6_BRU_ROP, VI6_BRU_ROP_DSTSEL_BRUIN(1) |
-		       VI6_BRU_ROP_CROP(VI6_ROP_NOP) |
-		       VI6_BRU_ROP_AROP(VI6_ROP_NOP));
+	if (entity->type == VSP1_ENTITY_BRU)
+		vsp1_bru_write(bru, dl, VI6_BRU_ROP,
+			       VI6_BRU_ROP_DSTSEL_BRUIN(1) |
+			       VI6_BRU_ROP_CROP(VI6_ROP_NOP) |
+			       VI6_BRU_ROP_AROP(VI6_ROP_NOP));
 
 	for (i = 0; i < bru->entity.source_pad; ++i) {
 		bool premultiplied = false;
@@ -366,12 +369,13 @@ static void bru_configure(struct vsp1_entity *entity,
 			ctrl |= VI6_BRU_CTRL_DSTSEL_VRPF;
 
 		/*
-		 * Route BRU inputs 0 to 3 as SRC inputs to Blend/ROP units A to
-		 * D in that order. The Blend/ROP unit B SRC is hardwired to the
-		 * ROP unit output, the corresponding register bits must be set
-		 * to 0.
+		 * Route inputs 0 to 3 as SRC inputs to Blend/ROP units A to D
+		 * in that order. In the BRU the Blend/ROP unit B SRC is
+		 * hardwired to the ROP unit output, the corresponding register
+		 * bits must be set to 0. The BRS has no ROP unit and doesn't
+		 * need any special processing.
 		 */
-		if (i != 1)
+		if (!(entity->type == VSP1_ENTITY_BRU && i == 1))
 			ctrl |= VI6_BRU_CTRL_SRCSEL_BRUIN(i);
 
 		vsp1_bru_write(bru, dl, VI6_BRU_CTRL(i), ctrl);
@@ -407,20 +411,31 @@ static const struct vsp1_entity_operations bru_entity_ops = {
  * Initialization and Cleanup
  */
 
-struct vsp1_bru *vsp1_bru_create(struct vsp1_device *vsp1)
+struct vsp1_bru *vsp1_bru_create(struct vsp1_device *vsp1,
+				 enum vsp1_entity_type type)
 {
 	struct vsp1_bru *bru;
+	unsigned int num_pads;
+	const char *name;
 	int ret;
 
 	bru = devm_kzalloc(vsp1->dev, sizeof(*bru), GFP_KERNEL);
 	if (bru == NULL)
 		return ERR_PTR(-ENOMEM);
 
+	bru->base = type == VSP1_ENTITY_BRU ? VI6_BRU_BASE : VI6_BRS_BASE;
 	bru->entity.ops = &bru_entity_ops;
-	bru->entity.type = VSP1_ENTITY_BRU;
+	bru->entity.type = type;
 
-	ret = vsp1_entity_init(vsp1, &bru->entity, "bru",
-			       vsp1->info->num_bru_inputs + 1, &bru_ops,
+	if (type == VSP1_ENTITY_BRU) {
+		num_pads = vsp1->info->num_bru_inputs + 1;
+		name = "bru";
+	} else {
+		num_pads = 3;
+		name = "brs";
+	}
+
+	ret = vsp1_entity_init(vsp1, &bru->entity, name, num_pads, &bru_ops,
 			       MEDIA_ENT_F_PROC_VIDEO_COMPOSER);
 	if (ret < 0)
 		return ERR_PTR(ret);
@@ -435,7 +450,7 @@ struct vsp1_bru *vsp1_bru_create(struct vsp1_device *vsp1)
 	bru->entity.subdev.ctrl_handler = &bru->ctrls;
 
 	if (bru->ctrls.error) {
-		dev_err(vsp1->dev, "bru: failed to initialize controls\n");
+		dev_err(vsp1->dev, "%s: failed to initialize controls\n", name);
 		ret = bru->ctrls.error;
 		vsp1_entity_destroy(&bru->entity);
 		return ERR_PTR(ret);
diff --git a/drivers/media/platform/vsp1/vsp1_bru.h b/drivers/media/platform/vsp1/vsp1_bru.h
index 828a3fc..c98ed96 100644
--- a/drivers/media/platform/vsp1/vsp1_bru.h
+++ b/drivers/media/platform/vsp1/vsp1_bru.h
@@ -26,6 +26,7 @@ struct vsp1_rwpf;
 
 struct vsp1_bru {
 	struct vsp1_entity entity;
+	unsigned int base;
 
 	struct v4l2_ctrl_handler ctrls;
 
@@ -41,6 +42,7 @@ static inline struct vsp1_bru *to_bru(struct v4l2_subdev *subdev)
 	return container_of(subdev, struct vsp1_bru, entity.subdev);
 }
 
-struct vsp1_bru *vsp1_bru_create(struct vsp1_device *vsp1);
+struct vsp1_bru *vsp1_bru_create(struct vsp1_device *vsp1,
+				 enum vsp1_entity_type type);
 
 #endif /* __VSP1_BRU_H__ */
diff --git a/drivers/media/platform/vsp1/vsp1_dl.c b/drivers/media/platform/vsp1/vsp1_dl.c
index aaf17b1..8b5cbb6 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.c
+++ b/drivers/media/platform/vsp1/vsp1_dl.c
@@ -95,6 +95,7 @@ enum vsp1_dl_mode {
  * struct vsp1_dl_manager - Display List manager
  * @index: index of the related WPF
  * @mode: display list operation mode (header or headerless)
+ * @singleshot: execute the display list in single-shot mode
  * @vsp1: the VSP1 device
  * @lock: protects the free, active, queued, pending and gc_fragments lists
  * @free: array of all free display lists
@@ -107,6 +108,7 @@ enum vsp1_dl_mode {
 struct vsp1_dl_manager {
 	unsigned int index;
 	enum vsp1_dl_mode mode;
+	bool singleshot;
 	struct vsp1_device *vsp1;
 
 	spinlock_t lock;
@@ -437,6 +439,7 @@ int vsp1_dl_list_add_chain(struct vsp1_dl_list *head,
 
 static void vsp1_dl_list_fill_header(struct vsp1_dl_list *dl, bool is_last)
 {
+	struct vsp1_dl_manager *dlm = dl->dlm;
 	struct vsp1_dl_header_list *hdr = dl->header->lists;
 	struct vsp1_dl_body *dlb;
 	unsigned int num_lists = 0;
@@ -461,38 +464,128 @@ static void vsp1_dl_list_fill_header(struct vsp1_dl_list *dl, bool is_last)
 
 	dl->header->num_lists = num_lists;
 
-	/*
-	 * If this display list's chain is not empty, we are on a list, where
-	 * the next item in the list is the display list entity which should be
-	 * automatically queued by the hardware.
-	 */
 	if (!list_empty(&dl->chain) && !is_last) {
+		/*
+		 * If this display list's chain is not empty, we are on a list,
+		 * and the next item is the display list that we must queue for
+		 * automatic processing by the hardware.
+		 */
 		struct vsp1_dl_list *next = list_next_entry(dl, chain);
 
 		dl->header->next_header = next->dma;
 		dl->header->flags = VSP1_DLH_AUTO_START;
+	} else if (!dlm->singleshot) {
+		/*
+		 * If the display list manager works in continuous mode, the VSP
+		 * should loop over the display list continuously until
+		 * instructed to do otherwise.
+		 */
+		dl->header->next_header = dl->dma;
+		dl->header->flags = VSP1_DLH_INT_ENABLE | VSP1_DLH_AUTO_START;
 	} else {
+		/*
+		 * Otherwise, in mem-to-mem mode, we work in single-shot mode
+		 * and the next display list must not be started automatically.
+		 */
 		dl->header->flags = VSP1_DLH_INT_ENABLE;
 	}
 }
 
+static bool vsp1_dl_list_hw_update_pending(struct vsp1_dl_manager *dlm)
+{
+	struct vsp1_device *vsp1 = dlm->vsp1;
+
+	if (!dlm->queued)
+		return false;
+
+	/*
+	 * Check whether the VSP1 has taken the update. In headerless mode the
+	 * hardware indicates this by clearing the UPD bit in the DL_BODY_SIZE
+	 * register, and in header mode by clearing the UPDHDR bit in the CMD
+	 * register.
+	 */
+	if (dlm->mode == VSP1_DL_MODE_HEADERLESS)
+		return !!(vsp1_read(vsp1, VI6_DL_BODY_SIZE)
+			  & VI6_DL_BODY_SIZE_UPD);
+	else
+		return !!(vsp1_read(vsp1, VI6_CMD(dlm->index)) & VI6_CMD_UPDHDR);
+}
+
+static void vsp1_dl_list_hw_enqueue(struct vsp1_dl_list *dl)
+{
+	struct vsp1_dl_manager *dlm = dl->dlm;
+	struct vsp1_device *vsp1 = dlm->vsp1;
+
+	if (dlm->mode == VSP1_DL_MODE_HEADERLESS) {
+		/*
+		 * In headerless mode, program the hardware directly with the
+		 * display list body address and size and set the UPD bit. The
+		 * bit will be cleared by the hardware when the display list
+		 * processing starts.
+		 */
+		vsp1_write(vsp1, VI6_DL_HDR_ADDR(0), dl->body0.dma);
+		vsp1_write(vsp1, VI6_DL_BODY_SIZE, VI6_DL_BODY_SIZE_UPD |
+			   (dl->body0.num_entries * sizeof(*dl->header->lists)));
+	} else {
+		/*
+		 * In header mode, program the display list header address. If
+		 * the hardware is idle (single-shot mode or first frame in
+		 * continuous mode) it will then be started independently. If
+		 * the hardware is operating, the VI6_DL_HDR_REF_ADDR register
+		 * will be updated with the display list address.
+		 */
+		vsp1_write(vsp1, VI6_DL_HDR_ADDR(dlm->index), dl->dma);
+	}
+}
+
+static void vsp1_dl_list_commit_continuous(struct vsp1_dl_list *dl)
+{
+	struct vsp1_dl_manager *dlm = dl->dlm;
+
+	/*
+	 * If a previous display list has been queued to the hardware but not
+	 * processed yet, the VSP can start processing it at any time. In that
+	 * case we can't replace the queued list by the new one, as we could
+	 * race with the hardware. We thus mark the update as pending, it will
+	 * be queued up to the hardware by the frame end interrupt handler.
+	 */
+	if (vsp1_dl_list_hw_update_pending(dlm)) {
+		__vsp1_dl_list_put(dlm->pending);
+		dlm->pending = dl;
+		return;
+	}
+
+	/*
+	 * Pass the new display list to the hardware and mark it as queued. It
+	 * will become active when the hardware starts processing it.
+	 */
+	vsp1_dl_list_hw_enqueue(dl);
+
+	__vsp1_dl_list_put(dlm->queued);
+	dlm->queued = dl;
+}
+
+static void vsp1_dl_list_commit_singleshot(struct vsp1_dl_list *dl)
+{
+	struct vsp1_dl_manager *dlm = dl->dlm;
+
+	/*
+	 * When working in single-shot mode, the caller guarantees that the
+	 * hardware is idle at this point. Just commit the head display list
+	 * to hardware. Chained lists will be started automatically.
+	 */
+	vsp1_dl_list_hw_enqueue(dl);
+
+	dlm->active = dl;
+}
+
 void vsp1_dl_list_commit(struct vsp1_dl_list *dl)
 {
 	struct vsp1_dl_manager *dlm = dl->dlm;
-	struct vsp1_device *vsp1 = dlm->vsp1;
+	struct vsp1_dl_list *dl_child;
 	unsigned long flags;
-	bool update;
 
-	spin_lock_irqsave(&dlm->lock, flags);
-
-	if (dl->dlm->mode == VSP1_DL_MODE_HEADER) {
-		struct vsp1_dl_list *dl_child;
-
-		/*
-		 * In header mode the caller guarantees that the hardware is
-		 * idle at this point.
-		 */
-
+	if (dlm->mode == VSP1_DL_MODE_HEADER) {
 		/* Fill the header for the head and chained display lists. */
 		vsp1_dl_list_fill_header(dl, list_empty(&dl->chain));
 
@@ -501,43 +594,15 @@ void vsp1_dl_list_commit(struct vsp1_dl_list *dl)
 
 			vsp1_dl_list_fill_header(dl_child, last);
 		}
-
-		/*
-		 * Commit the head display list to hardware. Chained headers
-		 * will auto-start.
-		 */
-		vsp1_write(vsp1, VI6_DL_HDR_ADDR(dlm->index), dl->dma);
-
-		dlm->active = dl;
-		goto done;
 	}
 
-	/*
-	 * Once the UPD bit has been set the hardware can start processing the
-	 * display list at any time and we can't touch the address and size
-	 * registers. In that case mark the update as pending, it will be
-	 * queued up to the hardware by the frame end interrupt handler.
-	 */
-	update = !!(vsp1_read(vsp1, VI6_DL_BODY_SIZE) & VI6_DL_BODY_SIZE_UPD);
-	if (update) {
-		__vsp1_dl_list_put(dlm->pending);
-		dlm->pending = dl;
-		goto done;
-	}
+	spin_lock_irqsave(&dlm->lock, flags);
 
-	/*
-	 * Program the hardware with the display list body address and size.
-	 * The UPD bit will be cleared by the device when the display list is
-	 * processed.
-	 */
-	vsp1_write(vsp1, VI6_DL_HDR_ADDR(0), dl->body0.dma);
-	vsp1_write(vsp1, VI6_DL_BODY_SIZE, VI6_DL_BODY_SIZE_UPD |
-		   (dl->body0.num_entries * sizeof(*dl->header->lists)));
+	if (dlm->singleshot)
+		vsp1_dl_list_commit_singleshot(dl);
+	else
+		vsp1_dl_list_commit_continuous(dl);
 
-	__vsp1_dl_list_put(dlm->queued);
-	dlm->queued = dl;
-
-done:
 	spin_unlock_irqrestore(&dlm->lock, flags);
 }
 
@@ -545,22 +610,6 @@ void vsp1_dl_list_commit(struct vsp1_dl_list *dl)
  * Display List Manager
  */
 
-/* Interrupt Handling */
-void vsp1_dlm_irq_display_start(struct vsp1_dl_manager *dlm)
-{
-	spin_lock(&dlm->lock);
-
-	/*
-	 * The display start interrupt signals the end of the display list
-	 * processing by the device. The active display list, if any, won't be
-	 * accessed anymore and can be reused.
-	 */
-	__vsp1_dl_list_put(dlm->active);
-	dlm->active = NULL;
-
-	spin_unlock(&dlm->lock);
-}
-
 /**
  * vsp1_dlm_irq_frame_end - Display list handler for the frame end interrupt
  * @dlm: the display list manager
@@ -572,31 +621,28 @@ void vsp1_dlm_irq_display_start(struct vsp1_dl_manager *dlm)
  */
 bool vsp1_dlm_irq_frame_end(struct vsp1_dl_manager *dlm)
 {
-	struct vsp1_device *vsp1 = dlm->vsp1;
 	bool completed = false;
 
 	spin_lock(&dlm->lock);
 
-	__vsp1_dl_list_put(dlm->active);
-	dlm->active = NULL;
-
 	/*
-	 * Header mode is used for mem-to-mem pipelines only. We don't need to
-	 * perform any operation as there can't be any new display list queued
-	 * in that case.
+	 * The mem-to-mem pipelines work in single-shot mode. No new display
+	 * list can be queued, we don't have to do anything.
 	 */
-	if (dlm->mode == VSP1_DL_MODE_HEADER) {
+	if (dlm->singleshot) {
+		__vsp1_dl_list_put(dlm->active);
+		dlm->active = NULL;
 		completed = true;
 		goto done;
 	}
 
 	/*
-	 * The UPD bit set indicates that the commit operation raced with the
-	 * interrupt and occurred after the frame end event and UPD clear but
-	 * before interrupt processing. The hardware hasn't taken the update
-	 * into account yet, we'll thus skip one frame and retry.
+	 * If the commit operation raced with the interrupt and occurred after
+	 * the frame end event but before interrupt processing, the hardware
+	 * hasn't taken the update into account yet. We have to skip one frame
+	 * and retry.
 	 */
-	if (vsp1_read(vsp1, VI6_DL_BODY_SIZE) & VI6_DL_BODY_SIZE_UPD)
+	if (vsp1_dl_list_hw_update_pending(dlm))
 		goto done;
 
 	/*
@@ -604,24 +650,20 @@ bool vsp1_dlm_irq_frame_end(struct vsp1_dl_manager *dlm)
 	 * frame end interrupt. The display list thus becomes active.
 	 */
 	if (dlm->queued) {
+		__vsp1_dl_list_put(dlm->active);
 		dlm->active = dlm->queued;
 		dlm->queued = NULL;
 		completed = true;
 	}
 
 	/*
-	 * Now that the UPD bit has been cleared we can queue the next display
-	 * list to the hardware if one has been prepared.
+	 * Now that the VSP has started processing the queued display list, we
+	 * can queue the pending display list to the hardware if one has been
+	 * prepared.
 	 */
 	if (dlm->pending) {
-		struct vsp1_dl_list *dl = dlm->pending;
-
-		vsp1_write(vsp1, VI6_DL_HDR_ADDR(0), dl->body0.dma);
-		vsp1_write(vsp1, VI6_DL_BODY_SIZE, VI6_DL_BODY_SIZE_UPD |
-			   (dl->body0.num_entries *
-			    sizeof(*dl->header->lists)));
-
-		dlm->queued = dl;
+		vsp1_dl_list_hw_enqueue(dlm->pending);
+		dlm->queued = dlm->pending;
 		dlm->pending = NULL;
 	}
 
@@ -714,6 +756,7 @@ struct vsp1_dl_manager *vsp1_dlm_create(struct vsp1_device *vsp1,
 	dlm->index = index;
 	dlm->mode = index == 0 && !vsp1->info->uapi
 		  ? VSP1_DL_MODE_HEADERLESS : VSP1_DL_MODE_HEADER;
+	dlm->singleshot = vsp1->info->uapi;
 	dlm->vsp1 = vsp1;
 
 	spin_lock_init(&dlm->lock);
diff --git a/drivers/media/platform/vsp1/vsp1_dl.h b/drivers/media/platform/vsp1/vsp1_dl.h
index 6ec1380..ee35081 100644
--- a/drivers/media/platform/vsp1/vsp1_dl.h
+++ b/drivers/media/platform/vsp1/vsp1_dl.h
@@ -27,7 +27,6 @@ struct vsp1_dl_manager *vsp1_dlm_create(struct vsp1_device *vsp1,
 					unsigned int prealloc);
 void vsp1_dlm_destroy(struct vsp1_dl_manager *dlm);
 void vsp1_dlm_reset(struct vsp1_dl_manager *dlm);
-void vsp1_dlm_irq_display_start(struct vsp1_dl_manager *dlm);
 bool vsp1_dlm_irq_frame_end(struct vsp1_dl_manager *dlm);
 
 struct vsp1_dl_list *vsp1_dl_list_get(struct vsp1_dl_manager *dlm);
diff --git a/drivers/media/platform/vsp1/vsp1_drm.c b/drivers/media/platform/vsp1/vsp1_drm.c
index 9377aaf..4dfbeac 100644
--- a/drivers/media/platform/vsp1/vsp1_drm.c
+++ b/drivers/media/platform/vsp1/vsp1_drm.c
@@ -32,17 +32,13 @@
  * Interrupt Handling
  */
 
-void vsp1_drm_display_start(struct vsp1_device *vsp1)
+static void vsp1_du_pipeline_frame_end(struct vsp1_pipeline *pipe,
+				       bool completed)
 {
-	vsp1_dlm_irq_display_start(vsp1->drm->pipe.output->dlm);
-}
+	struct vsp1_drm_pipeline *drm_pipe = to_vsp1_drm_pipeline(pipe);
 
-static void vsp1_du_pipeline_frame_end(struct vsp1_pipeline *pipe)
-{
-	struct vsp1_drm *drm = to_vsp1_drm(pipe);
-
-	if (drm->du_complete)
-		drm->du_complete(drm->du_private);
+	if (drm_pipe->du_complete)
+		drm_pipe->du_complete(drm_pipe->du_private, completed);
 }
 
 /* -----------------------------------------------------------------------------
@@ -63,29 +59,44 @@ EXPORT_SYMBOL_GPL(vsp1_du_init);
 /**
  * vsp1_du_setup_lif - Setup the output part of the VSP pipeline
  * @dev: the VSP device
+ * @pipe_index: the DRM pipeline index
  * @cfg: the LIF configuration
  *
  * Configure the output part of VSP DRM pipeline for the given frame @cfg.width
- * and @cfg.height. This sets up formats on the BRU source pad, the WPF0 sink
- * and source pads, and the LIF sink pad.
+ * and @cfg.height. This sets up formats on the blend unit (BRU or BRS) source
+ * pad, the WPF sink and source pads, and the LIF sink pad.
  *
- * As the media bus code on the BRU source pad is conditioned by the
- * configuration of the BRU sink 0 pad, we also set up the formats on all BRU
+ * The @pipe_index argument selects which DRM pipeline to set up. The number
+ * of available pipelines depends on the VSP instance.
+ *
+ * As the media bus code on the blend unit source pad is conditioned by the
+ * configuration of its sink 0 pad, we also set up the formats on all blend unit
  * sinks, even if the configuration will be overwritten later by
- * vsp1_du_setup_rpf(). This ensures that the BRU configuration is set to a well
- * defined state.
+ * vsp1_du_setup_rpf(). This ensures that the blend unit configuration is set to
+ * a well defined state.
  *
  * Return 0 on success or a negative error code on failure.
  */
-int vsp1_du_setup_lif(struct device *dev, const struct vsp1_du_lif_config *cfg)
+int vsp1_du_setup_lif(struct device *dev, unsigned int pipe_index,
+		      const struct vsp1_du_lif_config *cfg)
 {
 	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
-	struct vsp1_pipeline *pipe = &vsp1->drm->pipe;
-	struct vsp1_bru *bru = vsp1->bru;
+	struct vsp1_drm_pipeline *drm_pipe;
+	struct vsp1_pipeline *pipe;
+	struct vsp1_bru *bru;
 	struct v4l2_subdev_format format;
+	const char *bru_name;
 	unsigned int i;
 	int ret;
 
+	if (pipe_index >= vsp1->info->lif_count)
+		return -EINVAL;
+
+	drm_pipe = &vsp1->drm->pipe[pipe_index];
+	pipe = &drm_pipe->pipe;
+	bru = to_bru(&pipe->bru->subdev);
+	bru_name = pipe->bru->type == VSP1_ENTITY_BRU ? "BRU" : "BRS";
+
 	if (!cfg) {
 		/*
 		 * NULL configuration means the CRTC is being disabled, stop
@@ -97,14 +108,25 @@ int vsp1_du_setup_lif(struct device *dev, const struct vsp1_du_lif_config *cfg)
 
 		media_pipeline_stop(&pipe->output->entity.subdev.entity);
 
-		for (i = 0; i < bru->entity.source_pad; ++i) {
-			vsp1->drm->inputs[i].enabled = false;
-			bru->inputs[i].rpf = NULL;
+		for (i = 0; i < ARRAY_SIZE(pipe->inputs); ++i) {
+			struct vsp1_rwpf *rpf = pipe->inputs[i];
+
+			if (!rpf)
+				continue;
+
+			/*
+			 * Remove the RPF from the pipe and the list of BRU
+			 * inputs.
+			 */
+			WARN_ON(list_empty(&rpf->entity.list_pipe));
+			list_del_init(&rpf->entity.list_pipe);
 			pipe->inputs[i] = NULL;
+
+			bru->inputs[rpf->bru_input].rpf = NULL;
 		}
 
+		drm_pipe->du_complete = NULL;
 		pipe->num_inputs = 0;
-		vsp1->drm->du_complete = NULL;
 
 		vsp1_dlm_reset(pipe->output->dlm);
 		vsp1_device_put(vsp1);
@@ -114,8 +136,8 @@ int vsp1_du_setup_lif(struct device *dev, const struct vsp1_du_lif_config *cfg)
 		return 0;
 	}
 
-	dev_dbg(vsp1->dev, "%s: configuring LIF with format %ux%u\n",
-		__func__, cfg->width, cfg->height);
+	dev_dbg(vsp1->dev, "%s: configuring LIF%u with format %ux%u\n",
+		__func__, pipe_index, cfg->width, cfg->height);
 
 	/*
 	 * Configure the format at the BRU sinks and propagate it through the
@@ -124,7 +146,7 @@ int vsp1_du_setup_lif(struct device *dev, const struct vsp1_du_lif_config *cfg)
 	memset(&format, 0, sizeof(format));
 	format.which = V4L2_SUBDEV_FORMAT_ACTIVE;
 
-	for (i = 0; i < bru->entity.source_pad; ++i) {
+	for (i = 0; i < pipe->bru->source_pad; ++i) {
 		format.pad = i;
 
 		format.format.width = cfg->width;
@@ -132,60 +154,60 @@ int vsp1_du_setup_lif(struct device *dev, const struct vsp1_du_lif_config *cfg)
 		format.format.code = MEDIA_BUS_FMT_ARGB8888_1X32;
 		format.format.field = V4L2_FIELD_NONE;
 
-		ret = v4l2_subdev_call(&bru->entity.subdev, pad,
+		ret = v4l2_subdev_call(&pipe->bru->subdev, pad,
 				       set_fmt, NULL, &format);
 		if (ret < 0)
 			return ret;
 
-		dev_dbg(vsp1->dev, "%s: set format %ux%u (%x) on BRU pad %u\n",
+		dev_dbg(vsp1->dev, "%s: set format %ux%u (%x) on %s pad %u\n",
 			__func__, format.format.width, format.format.height,
-			format.format.code, i);
+			format.format.code, bru_name, i);
 	}
 
-	format.pad = bru->entity.source_pad;
+	format.pad = pipe->bru->source_pad;
 	format.format.width = cfg->width;
 	format.format.height = cfg->height;
 	format.format.code = MEDIA_BUS_FMT_ARGB8888_1X32;
 	format.format.field = V4L2_FIELD_NONE;
 
-	ret = v4l2_subdev_call(&bru->entity.subdev, pad, set_fmt, NULL,
+	ret = v4l2_subdev_call(&pipe->bru->subdev, pad, set_fmt, NULL,
 			       &format);
 	if (ret < 0)
 		return ret;
 
-	dev_dbg(vsp1->dev, "%s: set format %ux%u (%x) on BRU pad %u\n",
+	dev_dbg(vsp1->dev, "%s: set format %ux%u (%x) on %s pad %u\n",
 		__func__, format.format.width, format.format.height,
-		format.format.code, i);
+		format.format.code, bru_name, i);
 
 	format.pad = RWPF_PAD_SINK;
-	ret = v4l2_subdev_call(&vsp1->wpf[0]->entity.subdev, pad, set_fmt, NULL,
+	ret = v4l2_subdev_call(&pipe->output->entity.subdev, pad, set_fmt, NULL,
 			       &format);
 	if (ret < 0)
 		return ret;
 
-	dev_dbg(vsp1->dev, "%s: set format %ux%u (%x) on WPF0 sink\n",
+	dev_dbg(vsp1->dev, "%s: set format %ux%u (%x) on WPF%u sink\n",
 		__func__, format.format.width, format.format.height,
-		format.format.code);
+		format.format.code, pipe->output->entity.index);
 
 	format.pad = RWPF_PAD_SOURCE;
-	ret = v4l2_subdev_call(&vsp1->wpf[0]->entity.subdev, pad, get_fmt, NULL,
+	ret = v4l2_subdev_call(&pipe->output->entity.subdev, pad, get_fmt, NULL,
 			       &format);
 	if (ret < 0)
 		return ret;
 
-	dev_dbg(vsp1->dev, "%s: got format %ux%u (%x) on WPF0 source\n",
+	dev_dbg(vsp1->dev, "%s: got format %ux%u (%x) on WPF%u source\n",
 		__func__, format.format.width, format.format.height,
-		format.format.code);
+		format.format.code, pipe->output->entity.index);
 
 	format.pad = LIF_PAD_SINK;
-	ret = v4l2_subdev_call(&vsp1->lif->entity.subdev, pad, set_fmt, NULL,
+	ret = v4l2_subdev_call(&pipe->lif->subdev, pad, set_fmt, NULL,
 			       &format);
 	if (ret < 0)
 		return ret;
 
-	dev_dbg(vsp1->dev, "%s: set format %ux%u (%x) on LIF sink\n",
+	dev_dbg(vsp1->dev, "%s: set format %ux%u (%x) on LIF%u sink\n",
 		__func__, format.format.width, format.format.height,
-		format.format.code);
+		format.format.code, pipe_index);
 
 	/*
 	 * Verify that the format at the output of the pipeline matches the
@@ -213,8 +235,8 @@ int vsp1_du_setup_lif(struct device *dev, const struct vsp1_du_lif_config *cfg)
 	 * Register a callback to allow us to notify the DRM driver of frame
 	 * completion events.
 	 */
-	vsp1->drm->du_complete = cfg->callback;
-	vsp1->drm->du_private = cfg->callback_data;
+	drm_pipe->du_complete = cfg->callback;
+	drm_pipe->du_private = cfg->callback_data;
 
 	ret = media_pipeline_start(&pipe->output->entity.subdev.entity,
 					  &pipe->pipe);
@@ -224,6 +246,10 @@ int vsp1_du_setup_lif(struct device *dev, const struct vsp1_du_lif_config *cfg)
 		return ret;
 	}
 
+	/* Disable the display interrupts. */
+	vsp1_write(vsp1, VI6_DISP_IRQ_STA, 0);
+	vsp1_write(vsp1, VI6_DISP_IRQ_ENB, 0);
+
 	dev_dbg(vsp1->dev, "%s: pipeline enabled\n", __func__);
 
 	return 0;
@@ -233,19 +259,21 @@ EXPORT_SYMBOL_GPL(vsp1_du_setup_lif);
 /**
  * vsp1_du_atomic_begin - Prepare for an atomic update
  * @dev: the VSP device
+ * @pipe_index: the DRM pipeline index
  */
-void vsp1_du_atomic_begin(struct device *dev)
+void vsp1_du_atomic_begin(struct device *dev, unsigned int pipe_index)
 {
 	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
-	struct vsp1_pipeline *pipe = &vsp1->drm->pipe;
+	struct vsp1_drm_pipeline *drm_pipe = &vsp1->drm->pipe[pipe_index];
 
-	vsp1->drm->num_inputs = pipe->num_inputs;
+	drm_pipe->enabled = drm_pipe->pipe.num_inputs != 0;
 }
 EXPORT_SYMBOL_GPL(vsp1_du_atomic_begin);
 
 /**
  * vsp1_du_atomic_update - Setup one RPF input of the VSP pipeline
  * @dev: the VSP device
+ * @pipe_index: the DRM pipeline index
  * @rpf_index: index of the RPF to setup (0-based)
  * @cfg: the RPF configuration
  *
@@ -272,10 +300,12 @@ EXPORT_SYMBOL_GPL(vsp1_du_atomic_begin);
  *
  * Return 0 on success or a negative error code on failure.
  */
-int vsp1_du_atomic_update(struct device *dev, unsigned int rpf_index,
+int vsp1_du_atomic_update(struct device *dev, unsigned int pipe_index,
+			  unsigned int rpf_index,
 			  const struct vsp1_du_atomic_config *cfg)
 {
 	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
+	struct vsp1_drm_pipeline *drm_pipe = &vsp1->drm->pipe[pipe_index];
 	const struct vsp1_format_info *fmtinfo;
 	struct vsp1_rwpf *rpf;
 
@@ -288,7 +318,12 @@ int vsp1_du_atomic_update(struct device *dev, unsigned int rpf_index,
 		dev_dbg(vsp1->dev, "%s: RPF%u: disable requested\n", __func__,
 			rpf_index);
 
-		vsp1->drm->inputs[rpf_index].enabled = false;
+		/*
+		 * Remove the RPF from the pipe's inputs. The atomic flush
+		 * handler will disable the input and remove the entity from the
+		 * pipe's entities list.
+		 */
+		drm_pipe->pipe.inputs[rpf_index] = NULL;
 		return 0;
 	}
 
@@ -324,13 +359,15 @@ int vsp1_du_atomic_update(struct device *dev, unsigned int rpf_index,
 	vsp1->drm->inputs[rpf_index].crop = cfg->src;
 	vsp1->drm->inputs[rpf_index].compose = cfg->dst;
 	vsp1->drm->inputs[rpf_index].zpos = cfg->zpos;
-	vsp1->drm->inputs[rpf_index].enabled = true;
+
+	drm_pipe->pipe.inputs[rpf_index] = rpf;
 
 	return 0;
 }
 EXPORT_SYMBOL_GPL(vsp1_du_atomic_update);
 
 static int vsp1_du_setup_rpf_pipe(struct vsp1_device *vsp1,
+				  struct vsp1_pipeline *pipe,
 				  struct vsp1_rwpf *rpf, unsigned int bru_input)
 {
 	struct v4l2_subdev_selection sel;
@@ -404,7 +441,7 @@ static int vsp1_du_setup_rpf_pipe(struct vsp1_device *vsp1,
 	/* BRU sink, propagate the format from the RPF source. */
 	format.pad = bru_input;
 
-	ret = v4l2_subdev_call(&vsp1->bru->entity.subdev, pad, set_fmt, NULL,
+	ret = v4l2_subdev_call(&pipe->bru->subdev, pad, set_fmt, NULL,
 			       &format);
 	if (ret < 0)
 		return ret;
@@ -417,8 +454,8 @@ static int vsp1_du_setup_rpf_pipe(struct vsp1_device *vsp1,
 	sel.target = V4L2_SEL_TGT_COMPOSE;
 	sel.r = vsp1->drm->inputs[rpf->entity.index].compose;
 
-	ret = v4l2_subdev_call(&vsp1->bru->entity.subdev, pad, set_selection,
-			       NULL, &sel);
+	ret = v4l2_subdev_call(&pipe->bru->subdev, pad, set_selection, NULL,
+			       &sel);
 	if (ret < 0)
 		return ret;
 
@@ -438,18 +475,25 @@ static unsigned int rpf_zpos(struct vsp1_device *vsp1, struct vsp1_rwpf *rpf)
 /**
  * vsp1_du_atomic_flush - Commit an atomic update
  * @dev: the VSP device
+ * @pipe_index: the DRM pipeline index
  */
-void vsp1_du_atomic_flush(struct device *dev)
+void vsp1_du_atomic_flush(struct device *dev, unsigned int pipe_index)
 {
 	struct vsp1_device *vsp1 = dev_get_drvdata(dev);
-	struct vsp1_pipeline *pipe = &vsp1->drm->pipe;
+	struct vsp1_drm_pipeline *drm_pipe = &vsp1->drm->pipe[pipe_index];
+	struct vsp1_pipeline *pipe = &drm_pipe->pipe;
 	struct vsp1_rwpf *inputs[VSP1_MAX_RPF] = { NULL, };
+	struct vsp1_bru *bru = to_bru(&pipe->bru->subdev);
 	struct vsp1_entity *entity;
+	struct vsp1_entity *next;
 	struct vsp1_dl_list *dl;
+	const char *bru_name;
 	unsigned long flags;
 	unsigned int i;
 	int ret;
 
+	bru_name = pipe->bru->type == VSP1_ENTITY_BRU ? "BRU" : "BRS";
+
 	/* Prepare the display list. */
 	dl = vsp1_dl_list_get(pipe->output->dlm);
 
@@ -460,12 +504,8 @@ void vsp1_du_atomic_flush(struct device *dev)
 		struct vsp1_rwpf *rpf = vsp1->rpf[i];
 		unsigned int j;
 
-		if (!vsp1->drm->inputs[i].enabled) {
-			pipe->inputs[i] = NULL;
+		if (!pipe->inputs[i])
 			continue;
-		}
-
-		pipe->inputs[i] = rpf;
 
 		/* Insert the RPF in the sorted RPFs array. */
 		for (j = pipe->num_inputs++; j > 0; --j) {
@@ -478,22 +518,26 @@ void vsp1_du_atomic_flush(struct device *dev)
 	}
 
 	/* Setup the RPF input pipeline for every enabled input. */
-	for (i = 0; i < vsp1->info->num_bru_inputs; ++i) {
+	for (i = 0; i < pipe->bru->source_pad; ++i) {
 		struct vsp1_rwpf *rpf = inputs[i];
 
 		if (!rpf) {
-			vsp1->bru->inputs[i].rpf = NULL;
+			bru->inputs[i].rpf = NULL;
 			continue;
 		}
 
-		vsp1->bru->inputs[i].rpf = rpf;
+		if (list_empty(&rpf->entity.list_pipe))
+			list_add_tail(&rpf->entity.list_pipe, &pipe->entities);
+
+		bru->inputs[i].rpf = rpf;
 		rpf->bru_input = i;
+		rpf->entity.sink = pipe->bru;
 		rpf->entity.sink_pad = i;
 
-		dev_dbg(vsp1->dev, "%s: connecting RPF.%u to BRU:%u\n",
-			__func__, rpf->entity.index, i);
+		dev_dbg(vsp1->dev, "%s: connecting RPF.%u to %s:%u\n",
+			__func__, rpf->entity.index, bru_name, i);
 
-		ret = vsp1_du_setup_rpf_pipe(vsp1, rpf, i);
+		ret = vsp1_du_setup_rpf_pipe(vsp1, pipe, rpf, i);
 		if (ret < 0)
 			dev_err(vsp1->dev,
 				"%s: failed to setup RPF.%u\n",
@@ -501,16 +545,16 @@ void vsp1_du_atomic_flush(struct device *dev)
 	}
 
 	/* Configure all entities in the pipeline. */
-	list_for_each_entry(entity, &pipe->entities, list_pipe) {
+	list_for_each_entry_safe(entity, next, &pipe->entities, list_pipe) {
 		/* Disconnect unused RPFs from the pipeline. */
-		if (entity->type == VSP1_ENTITY_RPF) {
-			struct vsp1_rwpf *rpf = to_rwpf(&entity->subdev);
+		if (entity->type == VSP1_ENTITY_RPF &&
+		    !pipe->inputs[entity->index]) {
+			vsp1_dl_list_write(dl, entity->route->reg,
+					   VI6_DPR_NODE_UNUSED);
 
-			if (!pipe->inputs[rpf->entity.index]) {
-				vsp1_dl_list_write(dl, entity->route->reg,
-						   VI6_DPR_NODE_UNUSED);
-				continue;
-			}
+			list_del_init(&entity->list_pipe);
+
+			continue;
 		}
 
 		vsp1_entity_route_setup(entity, pipe, dl);
@@ -528,14 +572,11 @@ void vsp1_du_atomic_flush(struct device *dev)
 	vsp1_dl_list_commit(dl);
 
 	/* Start or stop the pipeline if needed. */
-	if (!vsp1->drm->num_inputs && pipe->num_inputs) {
-		vsp1_write(vsp1, VI6_DISP_IRQ_STA, 0);
-		vsp1_write(vsp1, VI6_DISP_IRQ_ENB, VI6_DISP_IRQ_ENB_DSTE);
+	if (!drm_pipe->enabled && pipe->num_inputs) {
 		spin_lock_irqsave(&pipe->irqlock, flags);
 		vsp1_pipeline_run(pipe);
 		spin_unlock_irqrestore(&pipe->irqlock, flags);
-	} else if (vsp1->drm->num_inputs && !pipe->num_inputs) {
-		vsp1_write(vsp1, VI6_DISP_IRQ_ENB, 0);
+	} else if (drm_pipe->enabled && !pipe->num_inputs) {
 		vsp1_pipeline_stop(pipe);
 	}
 }
@@ -568,83 +609,48 @@ EXPORT_SYMBOL_GPL(vsp1_du_unmap_sg);
  * Initialization
  */
 
-int vsp1_drm_create_links(struct vsp1_device *vsp1)
-{
-	const u32 flags = MEDIA_LNK_FL_ENABLED | MEDIA_LNK_FL_IMMUTABLE;
-	unsigned int i;
-	int ret;
-
-	/*
-	 * VSPD instances require a BRU to perform composition and a LIF to
-	 * output to the DU.
-	 */
-	if (!vsp1->bru || !vsp1->lif)
-		return -ENXIO;
-
-	for (i = 0; i < vsp1->info->rpf_count; ++i) {
-		struct vsp1_rwpf *rpf = vsp1->rpf[i];
-
-		ret = media_create_pad_link(&rpf->entity.subdev.entity,
-					    RWPF_PAD_SOURCE,
-					    &vsp1->bru->entity.subdev.entity,
-					    i, flags);
-		if (ret < 0)
-			return ret;
-
-		rpf->entity.sink = &vsp1->bru->entity.subdev.entity;
-		rpf->entity.sink_pad = i;
-	}
-
-	ret = media_create_pad_link(&vsp1->bru->entity.subdev.entity,
-				    vsp1->bru->entity.source_pad,
-				    &vsp1->wpf[0]->entity.subdev.entity,
-				    RWPF_PAD_SINK, flags);
-	if (ret < 0)
-		return ret;
-
-	vsp1->bru->entity.sink = &vsp1->wpf[0]->entity.subdev.entity;
-	vsp1->bru->entity.sink_pad = RWPF_PAD_SINK;
-
-	ret = media_create_pad_link(&vsp1->wpf[0]->entity.subdev.entity,
-				    RWPF_PAD_SOURCE,
-				    &vsp1->lif->entity.subdev.entity,
-				    LIF_PAD_SINK, flags);
-	if (ret < 0)
-		return ret;
-
-	return 0;
-}
-
 int vsp1_drm_init(struct vsp1_device *vsp1)
 {
-	struct vsp1_pipeline *pipe;
 	unsigned int i;
 
 	vsp1->drm = devm_kzalloc(vsp1->dev, sizeof(*vsp1->drm), GFP_KERNEL);
 	if (!vsp1->drm)
 		return -ENOMEM;
 
-	pipe = &vsp1->drm->pipe;
+	/* Create one DRM pipeline per LIF. */
+	for (i = 0; i < vsp1->info->lif_count; ++i) {
+		struct vsp1_drm_pipeline *drm_pipe = &vsp1->drm->pipe[i];
+		struct vsp1_pipeline *pipe = &drm_pipe->pipe;
 
-	vsp1_pipeline_init(pipe);
+		vsp1_pipeline_init(pipe);
 
-	/* The DRM pipeline is static, add entities manually. */
+		/*
+		 * The DRM pipeline is static, add entities manually. The first
+		 * pipeline uses the BRU and the second pipeline the BRS.
+		 */
+		pipe->bru = i == 0 ? &vsp1->bru->entity : &vsp1->brs->entity;
+		pipe->lif = &vsp1->lif[i]->entity;
+		pipe->output = vsp1->wpf[i];
+		pipe->output->pipe = pipe;
+		pipe->frame_end = vsp1_du_pipeline_frame_end;
+
+		pipe->bru->sink = &pipe->output->entity;
+		pipe->bru->sink_pad = 0;
+		pipe->output->entity.sink = pipe->lif;
+		pipe->output->entity.sink_pad = 0;
+
+		list_add_tail(&pipe->bru->list_pipe, &pipe->entities);
+		list_add_tail(&pipe->lif->list_pipe, &pipe->entities);
+		list_add_tail(&pipe->output->entity.list_pipe, &pipe->entities);
+	}
+
+	/* Disable all RPFs initially. */
 	for (i = 0; i < vsp1->info->rpf_count; ++i) {
 		struct vsp1_rwpf *input = vsp1->rpf[i];
 
-		list_add_tail(&input->entity.list_pipe, &pipe->entities);
+		INIT_LIST_HEAD(&input->entity.list_pipe);
 	}
 
-	list_add_tail(&vsp1->bru->entity.list_pipe, &pipe->entities);
-	list_add_tail(&vsp1->wpf[0]->entity.list_pipe, &pipe->entities);
-	list_add_tail(&vsp1->lif->entity.list_pipe, &pipe->entities);
-
-	pipe->bru = &vsp1->bru->entity;
-	pipe->lif = &vsp1->lif->entity;
-	pipe->output = vsp1->wpf[0];
-	pipe->output->pipe = pipe;
-	pipe->frame_end = vsp1_du_pipeline_frame_end;
-
 	return 0;
 }
 
diff --git a/drivers/media/platform/vsp1/vsp1_drm.h b/drivers/media/platform/vsp1/vsp1_drm.h
index e9f8072..1cd9db7 100644
--- a/drivers/media/platform/vsp1/vsp1_drm.h
+++ b/drivers/media/platform/vsp1/vsp1_drm.h
@@ -18,38 +18,44 @@
 #include "vsp1_pipe.h"
 
 /**
- * vsp1_drm - State for the API exposed to the DRM driver
+ * struct vsp1_drm_pipeline - State for the API exposed to the DRM driver
  * @pipe: the VSP1 pipeline used for display
- * @num_inputs: number of active pipeline inputs at the beginning of an update
- * @inputs: source crop rectangle, destination compose rectangle and z-order
- *	position for every input
+ * @enabled: pipeline state at the beginning of an update
  * @du_complete: frame completion callback for the DU driver (optional)
  * @du_private: data to be passed to the du_complete callback
  */
-struct vsp1_drm {
+struct vsp1_drm_pipeline {
 	struct vsp1_pipeline pipe;
-	unsigned int num_inputs;
+	bool enabled;
+
+	/* Frame synchronisation */
+	void (*du_complete)(void *, bool);
+	void *du_private;
+};
+
+/**
+ * struct vsp1_drm - State for the API exposed to the DRM driver
+ * @pipe: the VSP1 DRM pipelines used for display, one per LIF
+ * @inputs: source crop rectangle, destination compose rectangle and z-order
+ *	position for every input (indexed by RPF index)
+ */
+struct vsp1_drm {
+	struct vsp1_drm_pipeline pipe[VSP1_MAX_LIF];
+
 	struct {
-		bool enabled;
 		struct v4l2_rect crop;
 		struct v4l2_rect compose;
 		unsigned int zpos;
 	} inputs[VSP1_MAX_RPF];
-
-	/* Frame synchronisation */
-	void (*du_complete)(void *);
-	void *du_private;
 };
 
-static inline struct vsp1_drm *to_vsp1_drm(struct vsp1_pipeline *pipe)
+static inline struct vsp1_drm_pipeline *
+to_vsp1_drm_pipeline(struct vsp1_pipeline *pipe)
 {
-	return container_of(pipe, struct vsp1_drm, pipe);
+	return container_of(pipe, struct vsp1_drm_pipeline, pipe);
 }
 
 int vsp1_drm_init(struct vsp1_device *vsp1);
 void vsp1_drm_cleanup(struct vsp1_device *vsp1);
-int vsp1_drm_create_links(struct vsp1_device *vsp1);
-
-void vsp1_drm_display_start(struct vsp1_device *vsp1);
 
 #endif /* __VSP1_DRM_H__ */
diff --git a/drivers/media/platform/vsp1/vsp1_drv.c b/drivers/media/platform/vsp1/vsp1_drv.c
index 95c26ed..962e4c3 100644
--- a/drivers/media/platform/vsp1/vsp1_drv.c
+++ b/drivers/media/platform/vsp1/vsp1_drv.c
@@ -68,14 +68,6 @@ static irqreturn_t vsp1_irq_handler(int irq, void *data)
 		}
 	}
 
-	status = vsp1_read(vsp1, VI6_DISP_IRQ_STA);
-	vsp1_write(vsp1, VI6_DISP_IRQ_STA, ~status & VI6_DISP_IRQ_STA_DST);
-
-	if (status & VI6_DISP_IRQ_STA_DST) {
-		vsp1_drm_display_start(vsp1);
-		ret = IRQ_HANDLED;
-	}
-
 	return ret;
 }
 
@@ -92,6 +84,10 @@ static irqreturn_t vsp1_irq_handler(int irq, void *data)
  *
  * - from a UDS to a UDS (UDS entities can't be chained)
  * - from an entity to itself (no loops are allowed)
+ *
+ * Furthermore, the BRS can't be connected to histogram generators, but no
+ * special check is currently needed as all VSP instances that include a BRS
+ * have no histogram generator.
  */
 static int vsp1_create_sink_links(struct vsp1_device *vsp1,
 				  struct vsp1_entity *sink)
@@ -129,7 +125,7 @@ static int vsp1_create_sink_links(struct vsp1_device *vsp1,
 				return ret;
 
 			if (flags & MEDIA_LNK_FL_ENABLED)
-				source->sink = entity;
+				source->sink = sink;
 		}
 	}
 
@@ -172,10 +168,13 @@ static int vsp1_uapi_create_links(struct vsp1_device *vsp1)
 			return ret;
 	}
 
-	if (vsp1->lif) {
-		ret = media_create_pad_link(&vsp1->wpf[0]->entity.subdev.entity,
+	for (i = 0; i < vsp1->info->lif_count; ++i) {
+		if (!vsp1->lif[i])
+			continue;
+
+		ret = media_create_pad_link(&vsp1->wpf[i]->entity.subdev.entity,
 					    RWPF_PAD_SOURCE,
-					    &vsp1->lif->entity.subdev.entity,
+					    &vsp1->lif[i]->entity.subdev.entity,
 					    LIF_PAD_SINK, 0);
 		if (ret < 0)
 			return ret;
@@ -269,8 +268,18 @@ static int vsp1_create_entities(struct vsp1_device *vsp1)
 	}
 
 	/* Instantiate all the entities. */
+	if (vsp1->info->features & VSP1_HAS_BRS) {
+		vsp1->brs = vsp1_bru_create(vsp1, VSP1_ENTITY_BRS);
+		if (IS_ERR(vsp1->brs)) {
+			ret = PTR_ERR(vsp1->brs);
+			goto done;
+		}
+
+		list_add_tail(&vsp1->brs->entity.list_dev, &vsp1->entities);
+	}
+
 	if (vsp1->info->features & VSP1_HAS_BRU) {
-		vsp1->bru = vsp1_bru_create(vsp1);
+		vsp1->bru = vsp1_bru_create(vsp1, VSP1_ENTITY_BRU);
 		if (IS_ERR(vsp1->bru)) {
 			ret = PTR_ERR(vsp1->bru);
 			goto done;
@@ -328,18 +337,23 @@ static int vsp1_create_entities(struct vsp1_device *vsp1)
 	}
 
 	/*
-	 * The LIF is only supported when used in conjunction with the DU, in
+	 * The LIFs are only supported when used in conjunction with the DU, in
 	 * which case the userspace API is disabled. If the userspace API is
-	 * enabled skip the LIF, even when present.
+	 * enabled skip the LIFs, even when present.
 	 */
-	if (vsp1->info->features & VSP1_HAS_LIF && !vsp1->info->uapi) {
-		vsp1->lif = vsp1_lif_create(vsp1);
-		if (IS_ERR(vsp1->lif)) {
-			ret = PTR_ERR(vsp1->lif);
-			goto done;
-		}
+	if (!vsp1->info->uapi) {
+		for (i = 0; i < vsp1->info->lif_count; ++i) {
+			struct vsp1_lif *lif;
 
-		list_add_tail(&vsp1->lif->entity.list_dev, &vsp1->entities);
+			lif = vsp1_lif_create(vsp1, i);
+			if (IS_ERR(lif)) {
+				ret = PTR_ERR(lif);
+				goto done;
+			}
+
+			vsp1->lif[i] = lif;
+			list_add_tail(&lif->entity.list_dev, &vsp1->entities);
+		}
 	}
 
 	if (vsp1->info->features & VSP1_HAS_LUT) {
@@ -420,7 +434,6 @@ static int vsp1_create_entities(struct vsp1_device *vsp1)
 			}
 
 			list_add_tail(&video->list, &vsp1->videos);
-			wpf->entity.sink = &video->video.entity;
 		}
 	}
 
@@ -432,19 +445,15 @@ static int vsp1_create_entities(struct vsp1_device *vsp1)
 			goto done;
 	}
 
-	/* Create links. */
-	if (vsp1->info->uapi)
-		ret = vsp1_uapi_create_links(vsp1);
-	else
-		ret = vsp1_drm_create_links(vsp1);
-	if (ret < 0)
-		goto done;
-
 	/*
-	 * Register subdev nodes if the userspace API is enabled or initialize
-	 * the DRM pipeline otherwise.
+	 * Create links and register subdev nodes if the userspace API is
+	 * enabled or initialize the DRM pipeline otherwise.
 	 */
 	if (vsp1->info->uapi) {
+		ret = vsp1_uapi_create_links(vsp1);
+		if (ret < 0)
+			goto done;
+
 		ret = v4l2_device_register_subdev_nodes(&vsp1->v4l2_dev);
 		if (ret < 0)
 			goto done;
@@ -515,6 +524,9 @@ static int vsp1_device_init(struct vsp1_device *vsp1)
 	vsp1_write(vsp1, VI6_DPR_HSI_ROUTE, VI6_DPR_NODE_UNUSED);
 	vsp1_write(vsp1, VI6_DPR_BRU_ROUTE, VI6_DPR_NODE_UNUSED);
 
+	if (vsp1->info->features & VSP1_HAS_BRS)
+		vsp1_write(vsp1, VI6_DPR_ILV_BRS_ROUTE, VI6_DPR_NODE_UNUSED);
+
 	vsp1_write(vsp1, VI6_DPR_HGO_SMPPT, (7 << VI6_DPR_SMPPT_TGW_SHIFT) |
 		   (VI6_DPR_NODE_UNUSED << VI6_DPR_SMPPT_PT_SHIFT));
 	vsp1_write(vsp1, VI6_DPR_HGT_SMPPT, (7 << VI6_DPR_SMPPT_TGW_SHIFT) |
@@ -634,8 +646,8 @@ static const struct vsp1_device_info vsp1_device_infos[] = {
 		.version = VI6_IP_VERSION_MODEL_VSPD_GEN2,
 		.model = "VSP1-D",
 		.gen = 2,
-		.features = VSP1_HAS_BRU | VSP1_HAS_HGO | VSP1_HAS_LIF
-			  | VSP1_HAS_LUT,
+		.features = VSP1_HAS_BRU | VSP1_HAS_HGO | VSP1_HAS_LUT,
+		.lif_count = 1,
 		.rpf_count = 4,
 		.uds_count = 1,
 		.wpf_count = 1,
@@ -668,8 +680,8 @@ static const struct vsp1_device_info vsp1_device_infos[] = {
 		.version = VI6_IP_VERSION_MODEL_VSPD_V2H,
 		.model = "VSP1V-D",
 		.gen = 2,
-		.features = VSP1_HAS_BRU | VSP1_HAS_CLU | VSP1_HAS_LUT
-			  | VSP1_HAS_LIF,
+		.features = VSP1_HAS_BRU | VSP1_HAS_CLU | VSP1_HAS_LUT,
+		.lif_count = 1,
 		.rpf_count = 4,
 		.uds_count = 1,
 		.wpf_count = 1,
@@ -706,10 +718,37 @@ static const struct vsp1_device_info vsp1_device_infos[] = {
 		.num_bru_inputs = 5,
 		.uapi = true,
 	}, {
+		.version = VI6_IP_VERSION_MODEL_VSPBS_GEN3,
+		.model = "VSP2-BS",
+		.gen = 3,
+		.features = VSP1_HAS_BRS | VSP1_HAS_WPF_VFLIP,
+		.rpf_count = 2,
+		.wpf_count = 1,
+		.uapi = true,
+	}, {
 		.version = VI6_IP_VERSION_MODEL_VSPD_GEN3,
 		.model = "VSP2-D",
 		.gen = 3,
-		.features = VSP1_HAS_BRU | VSP1_HAS_LIF | VSP1_HAS_WPF_VFLIP,
+		.features = VSP1_HAS_BRU | VSP1_HAS_WPF_VFLIP,
+		.lif_count = 1,
+		.rpf_count = 5,
+		.wpf_count = 2,
+		.num_bru_inputs = 5,
+	}, {
+		.version = VI6_IP_VERSION_MODEL_VSPD_V3,
+		.model = "VSP2-D",
+		.gen = 3,
+		.features = VSP1_HAS_BRS | VSP1_HAS_BRU,
+		.lif_count = 1,
+		.rpf_count = 5,
+		.wpf_count = 1,
+		.num_bru_inputs = 5,
+	}, {
+		.version = VI6_IP_VERSION_MODEL_VSPDL_GEN3,
+		.model = "VSP2-DL",
+		.gen = 3,
+		.features = VSP1_HAS_BRS | VSP1_HAS_BRU,
+		.lif_count = 2,
 		.rpf_count = 5,
 		.wpf_count = 2,
 		.num_bru_inputs = 5,
diff --git a/drivers/media/platform/vsp1/vsp1_entity.c b/drivers/media/platform/vsp1/vsp1_entity.c
index 4bdb3b1..54de150 100644
--- a/drivers/media/platform/vsp1/vsp1_entity.c
+++ b/drivers/media/platform/vsp1/vsp1_entity.c
@@ -24,18 +24,12 @@
 #include "vsp1_pipe.h"
 #include "vsp1_rwpf.h"
 
-static inline struct vsp1_entity *
-media_entity_to_vsp1_entity(struct media_entity *entity)
-{
-	return container_of(entity, struct vsp1_entity, subdev.entity);
-}
-
 void vsp1_entity_route_setup(struct vsp1_entity *entity,
 			     struct vsp1_pipeline *pipe,
 			     struct vsp1_dl_list *dl)
 {
 	struct vsp1_entity *source;
-	struct vsp1_entity *sink;
+	u32 route;
 
 	if (entity->type == VSP1_ENTITY_HGO) {
 		u32 smppt;
@@ -44,7 +38,7 @@ void vsp1_entity_route_setup(struct vsp1_entity *entity,
 		 * The HGO is a special case, its routing is configured on the
 		 * sink pad.
 		 */
-		source = media_entity_to_vsp1_entity(entity->sources[0]);
+		source = entity->sources[0];
 		smppt = (pipe->output->entity.index << VI6_DPR_SMPPT_TGW_SHIFT)
 		      | (source->route->output << VI6_DPR_SMPPT_PT_SHIFT);
 
@@ -57,7 +51,7 @@ void vsp1_entity_route_setup(struct vsp1_entity *entity,
 		 * The HGT is a special case, its routing is configured on the
 		 * sink pad.
 		 */
-		source = media_entity_to_vsp1_entity(entity->sources[0]);
+		source = entity->sources[0];
 		smppt = (pipe->output->entity.index << VI6_DPR_SMPPT_TGW_SHIFT)
 		      | (source->route->output << VI6_DPR_SMPPT_PT_SHIFT);
 
@@ -69,9 +63,14 @@ void vsp1_entity_route_setup(struct vsp1_entity *entity,
 	if (source->route->reg == 0)
 		return;
 
-	sink = media_entity_to_vsp1_entity(source->sink);
-	vsp1_dl_list_write(dl, source->route->reg,
-			   sink->route->inputs[source->sink_pad]);
+	route = source->sink->route->inputs[source->sink_pad];
+	/*
+	 * The ILV and BRS share the same data path route. The extra BRSSEL bit
+	 * selects between the ILV and BRS.
+	 */
+	if (source->type == VSP1_ENTITY_BRS)
+		route |= VI6_DPR_ROUTE_BRSSEL;
+	vsp1_dl_list_write(dl, source->route->reg, route);
 }
 
 /* -----------------------------------------------------------------------------
@@ -316,6 +315,12 @@ int vsp1_subdev_enum_frame_size(struct v4l2_subdev *subdev,
  * Media Operations
  */
 
+static inline struct vsp1_entity *
+media_entity_to_vsp1_entity(struct media_entity *entity)
+{
+	return container_of(entity, struct vsp1_entity, subdev.entity);
+}
+
 static int vsp1_entity_link_setup_source(const struct media_pad *source_pad,
 					 const struct media_pad *sink_pad,
 					 u32 flags)
@@ -339,7 +344,7 @@ static int vsp1_entity_link_setup_source(const struct media_pad *source_pad,
 		    sink->type != VSP1_ENTITY_HGT) {
 			if (source->sink)
 				return -EBUSY;
-			source->sink = sink_pad->entity;
+			source->sink = sink;
 			source->sink_pad = sink_pad->index;
 		}
 	} else {
@@ -355,15 +360,17 @@ static int vsp1_entity_link_setup_sink(const struct media_pad *source_pad,
 				       u32 flags)
 {
 	struct vsp1_entity *sink;
+	struct vsp1_entity *source;
 
 	sink = media_entity_to_vsp1_entity(sink_pad->entity);
+	source = media_entity_to_vsp1_entity(source_pad->entity);
 
 	if (flags & MEDIA_LNK_FL_ENABLED) {
 		/* Fan-in is limited to one. */
 		if (sink->sources[sink_pad->index])
 			return -EBUSY;
 
-		sink->sources[sink_pad->index] = source_pad->entity;
+		sink->sources[sink_pad->index] = source;
 	} else {
 		sink->sources[sink_pad->index] = NULL;
 	}
@@ -450,6 +457,8 @@ struct media_pad *vsp1_entity_remote_pad(struct media_pad *pad)
 	  { VI6_DPR_NODE_WPF(idx) }, VI6_DPR_NODE_WPF(idx) }
 
 static const struct vsp1_route vsp1_routes[] = {
+	{ VSP1_ENTITY_BRS, 0, VI6_DPR_ILV_BRS_ROUTE,
+	  { VI6_DPR_NODE_BRS_IN(0), VI6_DPR_NODE_BRS_IN(1) }, 0 },
 	{ VSP1_ENTITY_BRU, 0, VI6_DPR_BRU_ROUTE,
 	  { VI6_DPR_NODE_BRU_IN(0), VI6_DPR_NODE_BRU_IN(1),
 	    VI6_DPR_NODE_BRU_IN(2), VI6_DPR_NODE_BRU_IN(3),
@@ -459,7 +468,8 @@ static const struct vsp1_route vsp1_routes[] = {
 	{ VSP1_ENTITY_HGT, 0, 0, { 0, }, 0 },
 	VSP1_ENTITY_ROUTE(HSI),
 	VSP1_ENTITY_ROUTE(HST),
-	{ VSP1_ENTITY_LIF, 0, 0, { VI6_DPR_NODE_LIF, }, VI6_DPR_NODE_LIF },
+	{ VSP1_ENTITY_LIF, 0, 0, { 0, }, 0 },
+	{ VSP1_ENTITY_LIF, 1, 0, { 0, }, 0 },
 	VSP1_ENTITY_ROUTE(LUT),
 	VSP1_ENTITY_ROUTE_RPF(0),
 	VSP1_ENTITY_ROUTE_RPF(1),
diff --git a/drivers/media/platform/vsp1/vsp1_entity.h b/drivers/media/platform/vsp1/vsp1_entity.h
index c169a06..11f8363 100644
--- a/drivers/media/platform/vsp1/vsp1_entity.h
+++ b/drivers/media/platform/vsp1/vsp1_entity.h
@@ -23,6 +23,7 @@ struct vsp1_dl_list;
 struct vsp1_pipeline;
 
 enum vsp1_entity_type {
+	VSP1_ENTITY_BRS,
 	VSP1_ENTITY_BRU,
 	VSP1_ENTITY_CLU,
 	VSP1_ENTITY_HGO,
@@ -104,8 +105,8 @@ struct vsp1_entity {
 	struct media_pad *pads;
 	unsigned int source_pad;
 
-	struct media_entity **sources;
-	struct media_entity *sink;
+	struct vsp1_entity **sources;
+	struct vsp1_entity *sink;
 	unsigned int sink_pad;
 
 	struct v4l2_subdev subdev;
diff --git a/drivers/media/platform/vsp1/vsp1_lif.c b/drivers/media/platform/vsp1/vsp1_lif.c
index 702487f..e6fa16d 100644
--- a/drivers/media/platform/vsp1/vsp1_lif.c
+++ b/drivers/media/platform/vsp1/vsp1_lif.c
@@ -30,7 +30,7 @@
 static inline void vsp1_lif_write(struct vsp1_lif *lif, struct vsp1_dl_list *dl,
 				  u32 reg, u32 data)
 {
-	vsp1_dl_list_write(dl, reg, data);
+	vsp1_dl_list_write(dl, reg + lif->entity.index * VI6_LIF_OFFSET, data);
 }
 
 /* -----------------------------------------------------------------------------
@@ -165,7 +165,7 @@ static const struct vsp1_entity_operations lif_entity_ops = {
  * Initialization and Cleanup
  */
 
-struct vsp1_lif *vsp1_lif_create(struct vsp1_device *vsp1)
+struct vsp1_lif *vsp1_lif_create(struct vsp1_device *vsp1, unsigned int index)
 {
 	struct vsp1_lif *lif;
 	int ret;
@@ -176,6 +176,7 @@ struct vsp1_lif *vsp1_lif_create(struct vsp1_device *vsp1)
 
 	lif->entity.ops = &lif_entity_ops;
 	lif->entity.type = VSP1_ENTITY_LIF;
+	lif->entity.index = index;
 
 	/*
 	 * The LIF is never exposed to userspace, but media entity registration
diff --git a/drivers/media/platform/vsp1/vsp1_lif.h b/drivers/media/platform/vsp1/vsp1_lif.h
index 7b35879..3417339 100644
--- a/drivers/media/platform/vsp1/vsp1_lif.h
+++ b/drivers/media/platform/vsp1/vsp1_lif.h
@@ -32,6 +32,6 @@ static inline struct vsp1_lif *to_lif(struct v4l2_subdev *subdev)
 	return container_of(subdev, struct vsp1_lif, entity.subdev);
 }
 
-struct vsp1_lif *vsp1_lif_create(struct vsp1_device *vsp1);
+struct vsp1_lif *vsp1_lif_create(struct vsp1_device *vsp1, unsigned int index);
 
 #endif /* __VSP1_LIF_H__ */
diff --git a/drivers/media/platform/vsp1/vsp1_pipe.c b/drivers/media/platform/vsp1/vsp1_pipe.c
index e817623..4f4b732 100644
--- a/drivers/media/platform/vsp1/vsp1_pipe.c
+++ b/drivers/media/platform/vsp1/vsp1_pipe.c
@@ -335,16 +335,12 @@ void vsp1_pipeline_frame_end(struct vsp1_pipeline *pipe)
 	if (pipe == NULL)
 		return;
 
+	/*
+	 * If the DL commit raced with the frame end interrupt, the commit ends
+	 * up being postponed by one frame. @completed represents whether the
+	 * active frame was finished or postponed.
+	 */
 	completed = vsp1_dlm_irq_frame_end(pipe->output->dlm);
-	if (!completed) {
-		/*
-		 * If the DL commit raced with the frame end interrupt, the
-		 * commit ends up being postponed by one frame. Return
-		 * immediately without calling the pipeline's frame end handler
-		 * or incrementing the sequence number.
-		 */
-		return;
-	}
 
 	if (pipe->hgo)
 		vsp1_hgo_frame_end(pipe->hgo);
@@ -352,8 +348,12 @@ void vsp1_pipeline_frame_end(struct vsp1_pipeline *pipe)
 	if (pipe->hgt)
 		vsp1_hgt_frame_end(pipe->hgt);
 
+	/*
+	 * Regardless of frame completion we still need to notify the pipe
+	 * frame_end to account for vblank events.
+	 */
 	if (pipe->frame_end)
-		pipe->frame_end(pipe);
+		pipe->frame_end(pipe, completed);
 
 	pipe->sequence++;
 }
@@ -373,10 +373,11 @@ void vsp1_pipeline_propagate_alpha(struct vsp1_pipeline *pipe,
 		return;
 
 	/*
-	 * The BRU background color has a fixed alpha value set to 255, the
-	 * output alpha value is thus always equal to 255.
+	 * The BRU and BRS background colors have a fixed alpha value of 255,
+	 * so the output alpha value is always equal to 255.
 	 */
-	if (pipe->uds_input->type == VSP1_ENTITY_BRU)
+	if (pipe->uds_input->type == VSP1_ENTITY_BRU ||
+	    pipe->uds_input->type == VSP1_ENTITY_BRS)
 		alpha = 255;
 
 	vsp1_uds_set_alpha(pipe->uds, dl, alpha);
diff --git a/drivers/media/platform/vsp1/vsp1_pipe.h b/drivers/media/platform/vsp1/vsp1_pipe.h
index 91a784a..c5d01a36 100644
--- a/drivers/media/platform/vsp1/vsp1_pipe.h
+++ b/drivers/media/platform/vsp1/vsp1_pipe.h
@@ -91,7 +91,7 @@ struct vsp1_pipeline {
 	enum vsp1_pipeline_state state;
 	wait_queue_head_t wq;
 
-	void (*frame_end)(struct vsp1_pipeline *pipe);
+	void (*frame_end)(struct vsp1_pipeline *pipe, bool completed);
 
 	struct mutex lock;
 	struct kref kref;
diff --git a/drivers/media/platform/vsp1/vsp1_regs.h b/drivers/media/platform/vsp1/vsp1_regs.h
index cd3e32a..58d0bea 100644
--- a/drivers/media/platform/vsp1/vsp1_regs.h
+++ b/drivers/media/platform/vsp1/vsp1_regs.h
@@ -18,6 +18,7 @@
  */
 
 #define VI6_CMD(n)			(0x0000 + (n) * 4)
+#define VI6_CMD_UPDHDR			(1 << 4)
 #define VI6_CMD_STRCMD			(1 << 0)
 
 #define VI6_CLK_DCSWT			0x0018
@@ -238,6 +239,10 @@
 #define VI6_WPF_SRCRPF_VIRACT_SUB	(1 << 28)
 #define VI6_WPF_SRCRPF_VIRACT_MST	(2 << 28)
 #define VI6_WPF_SRCRPF_VIRACT_MASK	(3 << 28)
+#define VI6_WPF_SRCRPF_VIRACT2_DIS	(0 << 24)
+#define VI6_WPF_SRCRPF_VIRACT2_SUB	(1 << 24)
+#define VI6_WPF_SRCRPF_VIRACT2_MST	(2 << 24)
+#define VI6_WPF_SRCRPF_VIRACT2_MASK	(3 << 24)
 #define VI6_WPF_SRCRPF_RPF_ACT_DIS(n)	(0 << ((n) * 2))
 #define VI6_WPF_SRCRPF_RPF_ACT_SUB(n)	(1 << ((n) * 2))
 #define VI6_WPF_SRCRPF_RPF_ACT_MST(n)	(2 << ((n) * 2))
@@ -321,6 +326,8 @@
 #define VI6_DPR_HST_ROUTE		0x2044
 #define VI6_DPR_HSI_ROUTE		0x2048
 #define VI6_DPR_BRU_ROUTE		0x204c
+#define VI6_DPR_ILV_BRS_ROUTE		0x2050
+#define VI6_DPR_ROUTE_BRSSEL		(1 << 28)
 #define VI6_DPR_ROUTE_FXA_MASK		(0xff << 16)
 #define VI6_DPR_ROUTE_FXA_SHIFT		16
 #define VI6_DPR_ROUTE_FP_MASK		(0x3f << 8)
@@ -344,7 +351,8 @@
 #define VI6_DPR_NODE_CLU		29
 #define VI6_DPR_NODE_HST		30
 #define VI6_DPR_NODE_HSI		31
-#define VI6_DPR_NODE_LIF		55
+#define VI6_DPR_NODE_BRS_IN(n)		(38 + (n))
+#define VI6_DPR_NODE_LIF		55		/* Gen2 only */
 #define VI6_DPR_NODE_WPF(n)		(56 + (n))
 #define VI6_DPR_NODE_UNUSED		63
 
@@ -476,7 +484,7 @@
 #define VI6_HSI_CTRL_EN			(1 << 0)
 
 /* -----------------------------------------------------------------------------
- * BRU Control Registers
+ * BRS and BRU Control Registers
  */
 
 #define VI6_ROP_NOP			0
@@ -496,7 +504,10 @@
 #define VI6_ROP_NAND			14
 #define VI6_ROP_SET			15
 
-#define VI6_BRU_INCTRL			0x2c00
+#define VI6_BRU_BASE			0x2c00
+#define VI6_BRS_BASE			0x3900
+
+#define VI6_BRU_INCTRL			0x0000
 #define VI6_BRU_INCTRL_NRM		(1 << 28)
 #define VI6_BRU_INCTRL_DnON		(1 << (16 + (n)))
 #define VI6_BRU_INCTRL_DITHn_OFF	(0 << ((n) * 4))
@@ -508,19 +519,19 @@
 #define VI6_BRU_INCTRL_DITHn_MASK	(7 << ((n) * 4))
 #define VI6_BRU_INCTRL_DITHn_SHIFT	((n) * 4)
 
-#define VI6_BRU_VIRRPF_SIZE		0x2c04
+#define VI6_BRU_VIRRPF_SIZE		0x0004
 #define VI6_BRU_VIRRPF_SIZE_HSIZE_MASK	(0x1fff << 16)
 #define VI6_BRU_VIRRPF_SIZE_HSIZE_SHIFT	16
 #define VI6_BRU_VIRRPF_SIZE_VSIZE_MASK	(0x1fff << 0)
 #define VI6_BRU_VIRRPF_SIZE_VSIZE_SHIFT	0
 
-#define VI6_BRU_VIRRPF_LOC		0x2c08
+#define VI6_BRU_VIRRPF_LOC		0x0008
 #define VI6_BRU_VIRRPF_LOC_HCOORD_MASK	(0x1fff << 16)
 #define VI6_BRU_VIRRPF_LOC_HCOORD_SHIFT	16
 #define VI6_BRU_VIRRPF_LOC_VCOORD_MASK	(0x1fff << 0)
 #define VI6_BRU_VIRRPF_LOC_VCOORD_SHIFT	0
 
-#define VI6_BRU_VIRRPF_COL		0x2c0c
+#define VI6_BRU_VIRRPF_COL		0x000c
 #define VI6_BRU_VIRRPF_COL_A_MASK	(0xff << 24)
 #define VI6_BRU_VIRRPF_COL_A_SHIFT	24
 #define VI6_BRU_VIRRPF_COL_RCR_MASK	(0xff << 16)
@@ -530,7 +541,7 @@
 #define VI6_BRU_VIRRPF_COL_BCB_MASK	(0xff << 0)
 #define VI6_BRU_VIRRPF_COL_BCB_SHIFT	0
 
-#define VI6_BRU_CTRL(n)			(0x2c10 + (n) * 8 + ((n) <= 3 ? 0 : 4))
+#define VI6_BRU_CTRL(n)			(0x0010 + (n) * 8 + ((n) <= 3 ? 0 : 4))
 #define VI6_BRU_CTRL_RBC		(1 << 31)
 #define VI6_BRU_CTRL_DSTSEL_BRUIN(n)	(((n) <= 3 ? (n) : (n)+1) << 20)
 #define VI6_BRU_CTRL_DSTSEL_VRPF	(4 << 20)
@@ -543,7 +554,7 @@
 #define VI6_BRU_CTRL_AROP(rop)		((rop) << 0)
 #define VI6_BRU_CTRL_AROP_MASK		(0xf << 0)
 
-#define VI6_BRU_BLD(n)			(0x2c14 + (n) * 8 + ((n) <= 3 ? 0 : 4))
+#define VI6_BRU_BLD(n)			(0x0014 + (n) * 8 + ((n) <= 3 ? 0 : 4))
 #define VI6_BRU_BLD_CBES		(1 << 31)
 #define VI6_BRU_BLD_CCMDX_DST_A		(0 << 28)
 #define VI6_BRU_BLD_CCMDX_255_DST_A	(1 << 28)
@@ -576,7 +587,7 @@
 #define VI6_BRU_BLD_COEFY_MASK		(0xff << 0)
 #define VI6_BRU_BLD_COEFY_SHIFT		0
 
-#define VI6_BRU_ROP			0x2c30
+#define VI6_BRU_ROP			0x0030	/* Only available on BRU */
 #define VI6_BRU_ROP_DSTSEL_BRUIN(n)	(((n) <= 3 ? (n) : (n)+1) << 20)
 #define VI6_BRU_ROP_DSTSEL_VRPF		(4 << 20)
 #define VI6_BRU_ROP_DSTSEL_MASK		(7 << 20)
@@ -653,6 +664,8 @@
  * LIF Control Registers
  */
 
+#define VI6_LIF_OFFSET			(-0x100)
+
 #define VI6_LIF_CTRL			0x3b00
 #define VI6_LIF_CTRL_OBTH_MASK		(0x7ff << 16)
 #define VI6_LIF_CTRL_OBTH_SHIFT		16
@@ -689,9 +702,20 @@
 #define VI6_IP_VERSION_MODEL_VSPBD_GEN3	(0x15 << 8)
 #define VI6_IP_VERSION_MODEL_VSPBC_GEN3	(0x16 << 8)
 #define VI6_IP_VERSION_MODEL_VSPD_GEN3	(0x17 << 8)
+#define VI6_IP_VERSION_MODEL_VSPD_V3	(0x18 << 8)
+#define VI6_IP_VERSION_MODEL_VSPDL_GEN3	(0x19 << 8)
+#define VI6_IP_VERSION_MODEL_VSPBS_GEN3	(0x1a << 8)
 #define VI6_IP_VERSION_SOC_MASK		(0xff << 0)
-#define VI6_IP_VERSION_SOC_H		(0x01 << 0)
-#define VI6_IP_VERSION_SOC_M		(0x02 << 0)
+#define VI6_IP_VERSION_SOC_H2		(0x01 << 0)
+#define VI6_IP_VERSION_SOC_V2H		(0x01 << 0)
+#define VI6_IP_VERSION_SOC_V3M		(0x01 << 0)
+#define VI6_IP_VERSION_SOC_M2		(0x02 << 0)
+#define VI6_IP_VERSION_SOC_M3W		(0x02 << 0)
+#define VI6_IP_VERSION_SOC_V3H		(0x02 << 0)
+#define VI6_IP_VERSION_SOC_H3		(0x03 << 0)
+#define VI6_IP_VERSION_SOC_D3		(0x04 << 0)
+#define VI6_IP_VERSION_SOC_M3N		(0x04 << 0)
+#define VI6_IP_VERSION_SOC_E3		(0x04 << 0)
 
 /* -----------------------------------------------------------------------------
  * RPF CLUT Registers
diff --git a/drivers/media/platform/vsp1/vsp1_video.c b/drivers/media/platform/vsp1/vsp1_video.c
index 5af3486..e9f5dcb 100644
--- a/drivers/media/platform/vsp1/vsp1_video.c
+++ b/drivers/media/platform/vsp1/vsp1_video.c
@@ -440,13 +440,17 @@ static void vsp1_video_pipeline_run(struct vsp1_pipeline *pipe)
 	vsp1_pipeline_run(pipe);
 }
 
-static void vsp1_video_pipeline_frame_end(struct vsp1_pipeline *pipe)
+static void vsp1_video_pipeline_frame_end(struct vsp1_pipeline *pipe,
+					  bool completed)
 {
 	struct vsp1_device *vsp1 = pipe->output->entity.vsp1;
 	enum vsp1_pipeline_state state;
 	unsigned long flags;
 	unsigned int i;
 
+	/* M2M Pipelines should never call here with an incomplete frame. */
+	WARN_ON_ONCE(!completed);
+
 	spin_lock_irqsave(&pipe->irqlock, flags);
 
 	/* Complete buffers on all video nodes. */
@@ -481,7 +485,7 @@ static int vsp1_video_pipeline_build_branch(struct vsp1_pipeline *pipe,
 	struct media_entity_enum ent_enum;
 	struct vsp1_entity *entity;
 	struct media_pad *pad;
-	bool bru_found = false;
+	struct vsp1_bru *bru = NULL;
 	int ret;
 
 	ret = media_entity_enum_init(&ent_enum, &input->entity.vsp1->media_dev);
@@ -511,16 +515,20 @@ static int vsp1_video_pipeline_build_branch(struct vsp1_pipeline *pipe,
 			media_entity_to_v4l2_subdev(pad->entity));
 
 		/*
-		 * A BRU is present in the pipeline, store the BRU input pad
+		 * A BRU or BRS is present in the pipeline, store its input pad
 		 * number in the input RPF for use when configuring the RPF.
 		 */
-		if (entity->type == VSP1_ENTITY_BRU) {
-			struct vsp1_bru *bru = to_bru(&entity->subdev);
+		if (entity->type == VSP1_ENTITY_BRU ||
+		    entity->type == VSP1_ENTITY_BRS) {
+			/* BRU and BRS can't be chained. */
+			if (bru) {
+				ret = -EPIPE;
+				goto out;
+			}
 
+			bru = to_bru(&entity->subdev);
 			bru->inputs[pad->index].rpf = input;
 			input->bru_input = pad->index;
-
-			bru_found = true;
 		}
 
 		/* We've reached the WPF, we're done. */
@@ -542,8 +550,7 @@ static int vsp1_video_pipeline_build_branch(struct vsp1_pipeline *pipe,
 			}
 
 			pipe->uds = entity;
-			pipe->uds_input = bru_found ? pipe->bru
-					: &input->entity;
+			pipe->uds_input = bru ? &bru->entity : &input->entity;
 		}
 
 		/* Follow the source link, ignoring any HGO or HGT. */
@@ -589,30 +596,42 @@ static int vsp1_video_pipeline_build(struct vsp1_pipeline *pipe,
 		e = to_vsp1_entity(subdev);
 		list_add_tail(&e->list_pipe, &pipe->entities);
 
-		if (e->type == VSP1_ENTITY_RPF) {
+		switch (e->type) {
+		case VSP1_ENTITY_RPF:
 			rwpf = to_rwpf(subdev);
 			pipe->inputs[rwpf->entity.index] = rwpf;
 			rwpf->video->pipe_index = ++pipe->num_inputs;
 			rwpf->pipe = pipe;
-		} else if (e->type == VSP1_ENTITY_WPF) {
+			break;
+
+		case VSP1_ENTITY_WPF:
 			rwpf = to_rwpf(subdev);
 			pipe->output = rwpf;
 			rwpf->video->pipe_index = 0;
 			rwpf->pipe = pipe;
-		} else if (e->type == VSP1_ENTITY_LIF) {
+			break;
+
+		case VSP1_ENTITY_LIF:
 			pipe->lif = e;
-		} else if (e->type == VSP1_ENTITY_BRU) {
+			break;
+
+		case VSP1_ENTITY_BRU:
+		case VSP1_ENTITY_BRS:
 			pipe->bru = e;
-		} else if (e->type == VSP1_ENTITY_HGO) {
-			struct vsp1_hgo *hgo = to_hgo(subdev);
+			break;
 
+		case VSP1_ENTITY_HGO:
 			pipe->hgo = e;
-			hgo->histo.pipe = pipe;
-		} else if (e->type == VSP1_ENTITY_HGT) {
-			struct vsp1_hgt *hgt = to_hgt(subdev);
+			to_hgo(subdev)->histo.pipe = pipe;
+			break;
 
+		case VSP1_ENTITY_HGT:
 			pipe->hgt = e;
-			hgt->histo.pipe = pipe;
+			to_hgt(subdev)->histo.pipe = pipe;
+			break;
+
+		default:
+			break;
 		}
 	}
 
@@ -796,12 +815,14 @@ static int vsp1_video_setup_pipeline(struct vsp1_pipeline *pipe)
 		struct vsp1_uds *uds = to_uds(&pipe->uds->subdev);
 
 		/*
-		 * If a BRU is present in the pipeline before the UDS, the alpha
-		 * component doesn't need to be scaled as the BRU output alpha
-		 * value is fixed to 255. Otherwise we need to scale the alpha
-		 * component only when available at the input RPF.
+		 * If a BRU or BRS is present in the pipeline before the UDS,
+		 * the alpha component doesn't need to be scaled as the BRU and
+		 * BRS output alpha value is fixed to 255. Otherwise we need to
+		 * scale the alpha component only when available at the input
+		 * RPF.
 		 */
-		if (pipe->uds_input->type == VSP1_ENTITY_BRU) {
+		if (pipe->uds_input->type == VSP1_ENTITY_BRU ||
+		    pipe->uds_input->type == VSP1_ENTITY_BRS) {
 			uds->scale_alpha = false;
 		} else {
 			struct vsp1_rwpf *rpf =
diff --git a/drivers/media/platform/vsp1/vsp1_wpf.c b/drivers/media/platform/vsp1/vsp1_wpf.c
index 32df109..b6c902b 100644
--- a/drivers/media/platform/vsp1/vsp1_wpf.c
+++ b/drivers/media/platform/vsp1/vsp1_wpf.c
@@ -453,7 +453,9 @@ static void wpf_configure(struct vsp1_entity *entity,
 	}
 
 	if (pipe->bru || pipe->num_inputs > 1)
-		srcrpf |= VI6_WPF_SRCRPF_VIRACT_MST;
+		srcrpf |= pipe->bru->type == VSP1_ENTITY_BRU
+			? VI6_WPF_SRCRPF_VIRACT_MST
+			: VI6_WPF_SRCRPF_VIRACT2_MST;
 
 	vsp1_wpf_write(wpf, dl, VI6_WPF_SRCRPF, srcrpf);
 
diff --git a/drivers/mfd/da9052-core.c b/drivers/mfd/da9052-core.c
index a88c206..a23a3a1 100644
--- a/drivers/mfd/da9052-core.c
+++ b/drivers/mfd/da9052-core.c
@@ -18,6 +18,7 @@
 #include <linux/mfd/core.h>
 #include <linux/slab.h>
 #include <linux/module.h>
+#include <linux/property.h>
 
 #include <linux/mfd/da9052/da9052.h>
 #include <linux/mfd/da9052/pdata.h>
@@ -519,9 +520,6 @@ static const struct mfd_cell da9052_subdev_info[] = {
 		.name = "da9052-wled3",
 	},
 	{
-		.name = "da9052-tsi",
-	},
-	{
 		.name = "da9052-bat",
 	},
 	{
@@ -529,6 +527,10 @@ static const struct mfd_cell da9052_subdev_info[] = {
 	},
 };
 
+static const struct mfd_cell da9052_tsi_subdev_info[] = {
+	{ .name = "da9052-tsi" },
+};
+
 const struct regmap_config da9052_regmap_config = {
 	.reg_bits = 8,
 	.val_bits = 8,
@@ -619,9 +621,27 @@ int da9052_device_init(struct da9052 *da9052, u8 chip_id)
 		goto err;
 	}
 
+	/*
+	 * Check if touchscreen pins are used as analogue input instead
+	 * of having a touchscreen connected to them. The analogue input
+	 * functionality will be provided by the hwmon driver (if enabled).
+	 */
+	if (!device_property_read_bool(da9052->dev, "dlg,tsi-as-adc")) {
+		ret = mfd_add_devices(da9052->dev, PLATFORM_DEVID_AUTO,
+				      da9052_tsi_subdev_info,
+				      ARRAY_SIZE(da9052_tsi_subdev_info),
+				      NULL, 0, NULL);
+		if (ret) {
+			dev_err(da9052->dev, "failed to add TSI subdev: %d\n",
+				ret);
+			goto err;
+		}
+	}
+
 	return 0;
 
 err:
+	mfd_remove_devices(da9052->dev);
 	da9052_irq_exit(da9052);
 
 	return ret;
diff --git a/drivers/misc/mic/scif/scif_dma.c b/drivers/misc/mic/scif/scif_dma.c
index 64d5760..63d6246 100644
--- a/drivers/misc/mic/scif/scif_dma.c
+++ b/drivers/misc/mic/scif/scif_dma.c
@@ -200,16 +200,6 @@ static void scif_mmu_notifier_release(struct mmu_notifier *mn,
 	schedule_work(&scif_info.misc_work);
 }
 
-static void scif_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
-					      struct mm_struct *mm,
-					      unsigned long address)
-{
-	struct scif_mmu_notif	*mmn;
-
-	mmn = container_of(mn, struct scif_mmu_notif, ep_mmu_notifier);
-	scif_rma_destroy_tcw(mmn, address, PAGE_SIZE);
-}
-
 static void scif_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 						     struct mm_struct *mm,
 						     unsigned long start,
@@ -235,7 +225,6 @@ static void scif_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
 static const struct mmu_notifier_ops scif_mmu_notifier_ops = {
 	.release = scif_mmu_notifier_release,
 	.clear_flush_young = NULL,
-	.invalidate_page = scif_mmu_notifier_invalidate_page,
 	.invalidate_range_start = scif_mmu_notifier_invalidate_range_start,
 	.invalidate_range_end = scif_mmu_notifier_invalidate_range_end};
 
diff --git a/drivers/misc/sgi-gru/grutlbpurge.c b/drivers/misc/sgi-gru/grutlbpurge.c
index e936d43..9918eda 100644
--- a/drivers/misc/sgi-gru/grutlbpurge.c
+++ b/drivers/misc/sgi-gru/grutlbpurge.c
@@ -247,17 +247,6 @@ static void gru_invalidate_range_end(struct mmu_notifier *mn,
 	gru_dbg(grudev, "gms %p, start 0x%lx, end 0x%lx\n", gms, start, end);
 }
 
-static void gru_invalidate_page(struct mmu_notifier *mn, struct mm_struct *mm,
-				unsigned long address)
-{
-	struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct,
-						 ms_notifier);
-
-	STAT(mmu_invalidate_page);
-	gru_flush_tlb_range(gms, address, PAGE_SIZE);
-	gru_dbg(grudev, "gms %p, address 0x%lx\n", gms, address);
-}
-
 static void gru_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct,
@@ -269,7 +258,6 @@ static void gru_release(struct mmu_notifier *mn, struct mm_struct *mm)
 
 
 static const struct mmu_notifier_ops gru_mmuops = {
-	.invalidate_page	= gru_invalidate_page,
 	.invalidate_range_start	= gru_invalidate_range_start,
 	.invalidate_range_end	= gru_invalidate_range_end,
 	.release		= gru_release,
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index 80d1ec6..8bd7aba 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -1213,7 +1213,7 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
 		break;
 	}
 	mq_rq->drv_op_result = ret;
-	blk_end_request_all(req, ret);
+	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
@@ -1718,9 +1718,9 @@ static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
 		if (err)
 			req_pending = old_req_pending;
 		else
-			req_pending = blk_end_request(req, 0, blocks << 9);
+			req_pending = blk_end_request(req, BLK_STS_OK, blocks << 9);
 	} else {
-		req_pending = blk_end_request(req, 0, brq->data.bytes_xfered);
+		req_pending = blk_end_request(req, BLK_STS_OK, brq->data.bytes_xfered);
 	}
 	return req_pending;
 }
diff --git a/drivers/mmc/host/sdhci-xenon.c b/drivers/mmc/host/sdhci-xenon.c
index bc1781b..c580af0 100644
--- a/drivers/mmc/host/sdhci-xenon.c
+++ b/drivers/mmc/host/sdhci-xenon.c
@@ -210,8 +210,27 @@ static void xenon_set_uhs_signaling(struct sdhci_host *host,
 	sdhci_writew(host, ctrl_2, SDHCI_HOST_CONTROL2);
 }
 
+static void xenon_set_power(struct sdhci_host *host, unsigned char mode,
+		unsigned short vdd)
+{
+	struct mmc_host *mmc = host->mmc;
+	u8 pwr = host->pwr;
+
+	sdhci_set_power_noreg(host, mode, vdd);
+
+	if (host->pwr == pwr)
+		return;
+
+	if (host->pwr == 0)
+		vdd = 0;
+
+	if (!IS_ERR(mmc->supply.vmmc))
+		mmc_regulator_set_ocr(mmc, mmc->supply.vmmc, vdd);
+}
+
 static const struct sdhci_ops sdhci_xenon_ops = {
 	.set_clock		= sdhci_set_clock,
+	.set_power		= xenon_set_power,
 	.set_bus_width		= sdhci_set_bus_width,
 	.reset			= xenon_reset,
 	.set_uhs_signaling	= xenon_set_uhs_signaling,
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 648f91b..9b6ce7c 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -1048,6 +1048,7 @@ struct bcm_sf2_of_data {
 	u32 type;
 	const u16 *reg_offsets;
 	unsigned int core_reg_align;
+	unsigned int num_cfp_rules;
 };
 
 /* Register offsets for the SWITCH_REG_* block */
@@ -1071,6 +1072,7 @@ static const struct bcm_sf2_of_data bcm_sf2_7445_data = {
 	.type		= BCM7445_DEVICE_ID,
 	.core_reg_align	= 0,
 	.reg_offsets	= bcm_sf2_7445_reg_offsets,
+	.num_cfp_rules	= 256,
 };
 
 static const u16 bcm_sf2_7278_reg_offsets[] = {
@@ -1093,6 +1095,7 @@ static const struct bcm_sf2_of_data bcm_sf2_7278_data = {
 	.type		= BCM7278_DEVICE_ID,
 	.core_reg_align	= 1,
 	.reg_offsets	= bcm_sf2_7278_reg_offsets,
+	.num_cfp_rules	= 128,
 };
 
 static const struct of_device_id bcm_sf2_of_match[] = {
@@ -1149,6 +1152,7 @@ static int bcm_sf2_sw_probe(struct platform_device *pdev)
 	priv->type = data->type;
 	priv->reg_offsets = data->reg_offsets;
 	priv->core_reg_align = data->core_reg_align;
+	priv->num_cfp_rules = data->num_cfp_rules;
 
 	/* Auto-detection using standard registers will not work, so
 	 * provide an indication of what kind of device we are for
diff --git a/drivers/net/dsa/bcm_sf2.h b/drivers/net/dsa/bcm_sf2.h
index 7d3030e..7f9125e 100644
--- a/drivers/net/dsa/bcm_sf2.h
+++ b/drivers/net/dsa/bcm_sf2.h
@@ -72,6 +72,7 @@ struct bcm_sf2_priv {
 	u32 				type;
 	const u16			*reg_offsets;
 	unsigned int			core_reg_align;
+	unsigned int			num_cfp_rules;
 
 	/* spinlock protecting access to the indirect registers */
 	spinlock_t			indir_lock;
diff --git a/drivers/net/dsa/bcm_sf2_cfp.c b/drivers/net/dsa/bcm_sf2_cfp.c
index 2fb32d6..8a1da7e 100644
--- a/drivers/net/dsa/bcm_sf2_cfp.c
+++ b/drivers/net/dsa/bcm_sf2_cfp.c
@@ -98,7 +98,7 @@ static inline void bcm_sf2_cfp_rule_addr_set(struct bcm_sf2_priv *priv,
 {
 	u32 reg;
 
-	WARN_ON(addr >= CFP_NUM_RULES);
+	WARN_ON(addr >= priv->num_cfp_rules);
 
 	reg = core_readl(priv, CORE_CFP_ACC);
 	reg &= ~(XCESS_ADDR_MASK << XCESS_ADDR_SHIFT);
@@ -109,7 +109,7 @@ static inline void bcm_sf2_cfp_rule_addr_set(struct bcm_sf2_priv *priv,
 static inline unsigned int bcm_sf2_cfp_rule_size(struct bcm_sf2_priv *priv)
 {
 	/* Entry #0 is reserved */
-	return CFP_NUM_RULES - 1;
+	return priv->num_cfp_rules - 1;
 }
 
 static int bcm_sf2_cfp_rule_set(struct dsa_switch *ds, int port,
@@ -523,7 +523,7 @@ static int bcm_sf2_cfp_rule_get_all(struct bcm_sf2_priv *priv,
 		if (!(reg & OP_STR_DONE))
 			break;
 
-	} while (index < CFP_NUM_RULES);
+	} while (index < priv->num_cfp_rules);
 
 	/* Put the TCAM size here */
 	nfc->data = bcm_sf2_cfp_rule_size(priv);
@@ -544,7 +544,7 @@ int bcm_sf2_get_rxnfc(struct dsa_switch *ds, int port,
 	case ETHTOOL_GRXCLSRLCNT:
 		/* Subtract the default, unusable rule */
 		nfc->rule_cnt = bitmap_weight(priv->cfp.used,
-					      CFP_NUM_RULES) - 1;
+					      priv->num_cfp_rules) - 1;
 		/* We support specifying rule locations */
 		nfc->data |= RX_CLS_LOC_SPECIAL;
 		break;
diff --git a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
index 1d307f2..6e253d9 100644
--- a/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
+++ b/drivers/net/ethernet/apm/xgene/xgene_enet_main.c
@@ -1661,21 +1661,21 @@ static int xgene_enet_get_irqs(struct xgene_enet_pdata *pdata)
 	return 0;
 }
 
-static int xgene_enet_check_phy_handle(struct xgene_enet_pdata *pdata)
+static void xgene_enet_check_phy_handle(struct xgene_enet_pdata *pdata)
 {
 	int ret;
 
 	if (pdata->phy_mode == PHY_INTERFACE_MODE_XGMII)
-		return 0;
+		return;
 
 	if (!IS_ENABLED(CONFIG_MDIO_XGENE))
-		return 0;
+		return;
 
 	ret = xgene_enet_phy_connect(pdata->ndev);
 	if (!ret)
 		pdata->mdio_driver = true;
 
-	return 0;
+	return;
 }
 
 static void xgene_enet_gpiod_get(struct xgene_enet_pdata *pdata)
@@ -1779,10 +1779,6 @@ static int xgene_enet_get_resources(struct xgene_enet_pdata *pdata)
 	if (ret)
 		return ret;
 
-	ret = xgene_enet_check_phy_handle(pdata);
-	if (ret)
-		return ret;
-
 	xgene_enet_gpiod_get(pdata);
 
 	pdata->clk = devm_clk_get(&pdev->dev, NULL);
@@ -2097,9 +2093,11 @@ static int xgene_enet_probe(struct platform_device *pdev)
 		goto err;
 	}
 
+	xgene_enet_check_phy_handle(pdata);
+
 	ret = xgene_enet_init_hw(pdata);
 	if (ret)
-		goto err;
+		goto err2;
 
 	link_state = pdata->mac_ops->link_state;
 	if (pdata->phy_mode == PHY_INTERFACE_MODE_XGMII) {
@@ -2117,29 +2115,30 @@ static int xgene_enet_probe(struct platform_device *pdev)
 	spin_lock_init(&pdata->stats_lock);
 	ret = xgene_extd_stats_init(pdata);
 	if (ret)
-		goto err2;
+		goto err1;
 
 	xgene_enet_napi_add(pdata);
 	ret = register_netdev(ndev);
 	if (ret) {
 		netdev_err(ndev, "Failed to register netdev\n");
-		goto err2;
+		goto err1;
 	}
 
 	return 0;
 
-err2:
+err1:
 	/*
 	 * If necessary, free_netdev() will call netif_napi_del() and undo
 	 * the effects of xgene_enet_napi_add()'s calls to netif_napi_add().
 	 */
 
+	xgene_enet_delete_desc_rings(pdata);
+
+err2:
 	if (pdata->mdio_driver)
 		xgene_enet_phy_disconnect(pdata);
 	else if (phy_interface_mode_is_rgmii(pdata->phy_mode))
 		xgene_enet_mdio_remove(pdata);
-err1:
-	xgene_enet_delete_desc_rings(pdata);
 err:
 	free_netdev(ndev);
 	return ret;
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
index fce0fd3..bf9b3f0 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_hw.h
@@ -105,8 +105,7 @@ struct aq_hw_ops {
 
 	int (*hw_set_mac_address)(struct aq_hw_s *self, u8 *mac_addr);
 
-	int (*hw_get_link_status)(struct aq_hw_s *self,
-				  struct aq_hw_link_status_s *link_status);
+	int (*hw_get_link_status)(struct aq_hw_s *self);
 
 	int (*hw_set_link_speed)(struct aq_hw_s *self, u32 speed);
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
index 9ee1c50..6ac9e26 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_nic.c
@@ -103,6 +103,8 @@ int aq_nic_cfg_start(struct aq_nic_s *self)
 	else
 		cfg->vecs = 1U;
 
+	cfg->num_rss_queues = min(cfg->vecs, AQ_CFG_NUM_RSS_QUEUES_DEF);
+
 	cfg->irq_type = aq_pci_func_get_irq_type(self->aq_pci_func);
 
 	if ((cfg->irq_type == AQ_HW_IRQ_LEGACY) ||
@@ -123,33 +125,30 @@ static void aq_nic_service_timer_cb(unsigned long param)
 	struct net_device *ndev = aq_nic_get_ndev(self);
 	int err = 0;
 	unsigned int i = 0U;
-	struct aq_hw_link_status_s link_status;
 	struct aq_ring_stats_rx_s stats_rx;
 	struct aq_ring_stats_tx_s stats_tx;
 
 	if (aq_utils_obj_test(&self->header.flags, AQ_NIC_FLAGS_IS_NOT_READY))
 		goto err_exit;
 
-	err = self->aq_hw_ops.hw_get_link_status(self->aq_hw, &link_status);
+	err = self->aq_hw_ops.hw_get_link_status(self->aq_hw);
 	if (err < 0)
 		goto err_exit;
 
+	self->link_status = self->aq_hw->aq_link_status;
+
 	self->aq_hw_ops.hw_interrupt_moderation_set(self->aq_hw,
-			    self->aq_nic_cfg.is_interrupt_moderation);
+		    self->aq_nic_cfg.is_interrupt_moderation);
 
-	if (memcmp(&link_status, &self->link_status, sizeof(link_status))) {
-		if (link_status.mbps) {
-			aq_utils_obj_set(&self->header.flags,
-					 AQ_NIC_FLAG_STARTED);
-			aq_utils_obj_clear(&self->header.flags,
-					   AQ_NIC_LINK_DOWN);
-			netif_carrier_on(self->ndev);
-		} else {
-			netif_carrier_off(self->ndev);
-			aq_utils_obj_set(&self->header.flags, AQ_NIC_LINK_DOWN);
-		}
-
-		self->link_status = link_status;
+	if (self->link_status.mbps) {
+		aq_utils_obj_set(&self->header.flags,
+				 AQ_NIC_FLAG_STARTED);
+		aq_utils_obj_clear(&self->header.flags,
+				   AQ_NIC_LINK_DOWN);
+		netif_carrier_on(self->ndev);
+	} else {
+		netif_carrier_off(self->ndev);
+		aq_utils_obj_set(&self->header.flags, AQ_NIC_LINK_DOWN);
 	}
 
 	memset(&stats_rx, 0U, sizeof(struct aq_ring_stats_rx_s));
@@ -597,14 +596,11 @@ static unsigned int aq_nic_map_skb(struct aq_nic_s *self,
 }
 
 int aq_nic_xmit(struct aq_nic_s *self, struct sk_buff *skb)
-__releases(&ring->lock)
-__acquires(&ring->lock)
 {
 	struct aq_ring_s *ring = NULL;
 	unsigned int frags = 0U;
 	unsigned int vec = skb->queue_mapping % self->aq_nic_cfg.vecs;
 	unsigned int tc = 0U;
-	unsigned int trys = AQ_CFG_LOCK_TRYS;
 	int err = NETDEV_TX_OK;
 	bool is_nic_in_bad_state;
 
@@ -628,36 +624,21 @@ __acquires(&ring->lock)
 		goto err_exit;
 	}
 
-	do {
-		if (spin_trylock(&ring->header.lock)) {
-			frags = aq_nic_map_skb(self, skb, ring);
+	frags = aq_nic_map_skb(self, skb, ring);
 
-			if (likely(frags)) {
-				err = self->aq_hw_ops.hw_ring_tx_xmit(
-								self->aq_hw,
-								ring, frags);
-				if (err >= 0) {
-					if (aq_ring_avail_dx(ring) <
-					    AQ_CFG_SKB_FRAGS_MAX + 1)
-						aq_nic_ndev_queue_stop(
-								self,
-								ring->idx);
+	if (likely(frags)) {
+		err = self->aq_hw_ops.hw_ring_tx_xmit(self->aq_hw,
+						      ring,
+						      frags);
+		if (err >= 0) {
+			if (aq_ring_avail_dx(ring) < AQ_CFG_SKB_FRAGS_MAX + 1)
+				aq_nic_ndev_queue_stop(self, ring->idx);
 
-					++ring->stats.tx.packets;
-					ring->stats.tx.bytes += skb->len;
-				}
-			} else {
-				err = NETDEV_TX_BUSY;
-			}
-
-			spin_unlock(&ring->header.lock);
-			break;
+			++ring->stats.tx.packets;
+			ring->stats.tx.bytes += skb->len;
 		}
-	} while (--trys);
-
-	if (!trys) {
+	} else {
 		err = NETDEV_TX_BUSY;
-		goto err_exit;
 	}
 
 err_exit:
@@ -688,11 +669,26 @@ int aq_nic_set_multicast_list(struct aq_nic_s *self, struct net_device *ndev)
 	netdev_for_each_mc_addr(ha, ndev) {
 		ether_addr_copy(self->mc_list.ar[i++], ha->addr);
 		++self->mc_list.count;
+
+		if (i >= AQ_CFG_MULTICAST_ADDRESS_MAX)
+			break;
 	}
 
-	return self->aq_hw_ops.hw_multicast_list_set(self->aq_hw,
+	if (i >= AQ_CFG_MULTICAST_ADDRESS_MAX) {
+		/* Too many multicast addresses: atlantic does not support
+		 * this many filters. Force all-multicast mode instead:
+		 * disable all UC filters and set up an "all pass"
+		 * multicast mask.
+		 */
+		self->packet_filter |= IFF_ALLMULTI;
+		self->aq_hw->aq_nic_cfg->mc_list_count = 0;
+		return self->aq_hw_ops.hw_packet_filter_set(self->aq_hw,
+							self->packet_filter);
+	} else {
+		return self->aq_hw_ops.hw_multicast_list_set(self->aq_hw,
 						    self->mc_list.ar,
 						    self->mc_list.count);
+	}
 }
 
 int aq_nic_set_mtu(struct aq_nic_s *self, int new_mtu)
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
index 9a08179..ec5579f 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_ring.c
@@ -101,7 +101,6 @@ int aq_ring_init(struct aq_ring_s *self)
 	self->hw_head = 0;
 	self->sw_head = 0;
 	self->sw_tail = 0;
-	spin_lock_init(&self->header.lock);
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_utils.h b/drivers/net/ethernet/aquantia/atlantic/aq_utils.h
index f6012b3..e12bcdf 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_utils.h
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_utils.h
@@ -17,7 +17,6 @@
 #define AQ_DIMOF(_ARY_)  ARRAY_SIZE(_ARY_)
 
 struct aq_obj_s {
-	spinlock_t lock; /* spinlock for nic/rings processing */
 	atomic_t flags;
 };
 
diff --git a/drivers/net/ethernet/aquantia/atlantic/aq_vec.c b/drivers/net/ethernet/aquantia/atlantic/aq_vec.c
index ad5b4d4d..fee446a 100644
--- a/drivers/net/ethernet/aquantia/atlantic/aq_vec.c
+++ b/drivers/net/ethernet/aquantia/atlantic/aq_vec.c
@@ -34,8 +34,6 @@ struct aq_vec_s {
 #define AQ_VEC_RX_ID 1
 
 static int aq_vec_poll(struct napi_struct *napi, int budget)
-__releases(&self->lock)
-__acquires(&self->lock)
 {
 	struct aq_vec_s *self = container_of(napi, struct aq_vec_s, napi);
 	struct aq_ring_s *ring = NULL;
@@ -47,7 +45,7 @@ __acquires(&self->lock)
 
 	if (!self) {
 		err = -EINVAL;
-	} else if (spin_trylock(&self->header.lock)) {
+	} else {
 		for (i = 0U, ring = self->ring[0];
 			self->tx_rings > i; ++i, ring = self->ring[i]) {
 			if (self->aq_hw_ops->hw_ring_tx_head_update) {
@@ -105,11 +103,8 @@ __acquires(&self->lock)
 			self->aq_hw_ops->hw_irq_enable(self->aq_hw,
 					1U << self->aq_ring_param.vec_idx);
 		}
-
-err_exit:
-		spin_unlock(&self->header.lock);
 	}
-
+err_exit:
 	return work_done;
 }
 
@@ -185,8 +180,6 @@ int aq_vec_init(struct aq_vec_s *self, struct aq_hw_ops *aq_hw_ops,
 	self->aq_hw_ops = aq_hw_ops;
 	self->aq_hw = aq_hw;
 
-	spin_lock_init(&self->header.lock);
-
 	for (i = 0U, ring = self->ring[0];
 		self->tx_rings > i; ++i, ring = self->ring[i]) {
 		err = aq_ring_init(&ring[AQ_VEC_TX_ID]);
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
index faeb493..c5a02df 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_a0.c
@@ -629,6 +629,12 @@ static int hw_atl_a0_hw_ring_rx_receive(struct aq_hw_s *self,
 				buff->is_udp_cso = (is_err & 0x10U) ? 0 : 1;
 			else if (0x0U == (pkt_type & 0x1CU))
 				buff->is_tcp_cso = (is_err & 0x10U) ? 0 : 1;
+
+			/* Checksum offload workaround for small packets */
+			if (rxd_wb->pkt_len <= 60) {
+				buff->is_ip_cso = 0U;
+				buff->is_cso_err = 0U;
+			}
 		}
 
 		is_err &= ~0x18U;
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
index 1bceb73..21784cc 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_b0.c
@@ -645,6 +645,12 @@ static int hw_atl_b0_hw_ring_rx_receive(struct aq_hw_s *self,
 				buff->is_udp_cso = buff->is_cso_err ? 0U : 1U;
 			else if (0x0U == (pkt_type & 0x1CU))
 				buff->is_tcp_cso = buff->is_cso_err ? 0U : 1U;
+
+			/* Checksum offload workaround for small packets */
+			if (rxd_wb->pkt_len <= 60) {
+				buff->is_ip_cso = 0U;
+				buff->is_cso_err = 0U;
+			}
 		}
 
 		is_err &= ~0x18U;
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
index 8d6d8f5..4f5ec9a 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.c
@@ -141,6 +141,12 @@ static int hw_atl_utils_init_ucp(struct aq_hw_s *self,
 
 	err = hw_atl_utils_ver_match(aq_hw_caps->fw_ver_expected,
 				     aq_hw_read_reg(self, 0x18U));
+
+	if (err < 0)
+		pr_err("%s: Bad FW version detected: expected=%x, actual=%x\n",
+		       AQ_CFG_DRV_NAME,
+		       aq_hw_caps->fw_ver_expected,
+		       aq_hw_read_reg(self, 0x18U));
 	return err;
 }
 
@@ -313,11 +319,11 @@ void hw_atl_utils_mpi_set(struct aq_hw_s *self,
 err_exit:;
 }
 
-int hw_atl_utils_mpi_get_link_status(struct aq_hw_s *self,
-				     struct aq_hw_link_status_s *link_status)
+int hw_atl_utils_mpi_get_link_status(struct aq_hw_s *self)
 {
 	u32 cp0x036C = aq_hw_read_reg(self, HW_ATL_MPI_STATE_ADR);
 	u32 link_speed_mask = cp0x036C >> HW_ATL_MPI_SPEED_SHIFT;
+	struct aq_hw_link_status_s *link_status = &self->aq_link_status;
 
 	if (!link_speed_mask) {
 		link_status->mbps = 0U;
diff --git a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.h b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.h
index a66aee5..e0360a6 100644
--- a/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.h
+++ b/drivers/net/ethernet/aquantia/atlantic/hw_atl/hw_atl_utils.h
@@ -180,8 +180,7 @@ void hw_atl_utils_mpi_set(struct aq_hw_s *self,
 int hw_atl_utils_mpi_set_speed(struct aq_hw_s *self, u32 speed,
 			       enum hal_atl_utils_fw_state_e state);
 
-int hw_atl_utils_mpi_get_link_status(struct aq_hw_s *self,
-				     struct aq_hw_link_status_s *link_status);
+int hw_atl_utils_mpi_get_link_status(struct aq_hw_s *self);
 
 int hw_atl_utils_get_mac_permanent(struct aq_hw_s *self,
 				   struct aq_hw_caps_s *aq_hw_caps,
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index dc30527..c28fa5a 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -597,7 +597,7 @@ static int bcm_sysport_set_coalesce(struct net_device *dev,
 
 static void bcm_sysport_free_cb(struct bcm_sysport_cb *cb)
 {
-	dev_kfree_skb_any(cb->skb);
+	dev_consume_skb_any(cb->skb);
 	cb->skb = NULL;
 	dma_unmap_addr_set(cb, dma_addr, 0);
 }
@@ -1346,6 +1346,8 @@ static int bcm_sysport_init_tx_ring(struct bcm_sysport_priv *priv,
 
 	ring->cbs = kcalloc(size, sizeof(struct bcm_sysport_cb), GFP_KERNEL);
 	if (!ring->cbs) {
+		dma_free_coherent(kdev, sizeof(struct dma_desc),
+				  ring->desc_cpu, ring->desc_dma);
 		netif_err(priv, hw, priv->netdev, "CB allocation failed\n");
 		return -ENOMEM;
 	}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e7c8539..f20b3d2 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -4647,7 +4647,6 @@ static int bnxt_hwrm_func_qcaps(struct bnxt *bp)
 		pf->port_id = le16_to_cpu(resp->port_id);
 		bp->dev->dev_port = pf->port_id;
 		memcpy(pf->mac_addr, resp->mac_address, ETH_ALEN);
-		memcpy(bp->dev->dev_addr, pf->mac_addr, ETH_ALEN);
 		pf->max_rsscos_ctxs = le16_to_cpu(resp->max_rsscos_ctx);
 		pf->max_cp_rings = le16_to_cpu(resp->max_cmpl_rings);
 		pf->max_tx_rings = le16_to_cpu(resp->max_tx_rings);
@@ -4687,16 +4686,6 @@ static int bnxt_hwrm_func_qcaps(struct bnxt *bp)
 		vf->max_stat_ctxs = le16_to_cpu(resp->max_stat_ctx);
 
 		memcpy(vf->mac_addr, resp->mac_address, ETH_ALEN);
-		mutex_unlock(&bp->hwrm_cmd_lock);
-
-		if (is_valid_ether_addr(vf->mac_addr)) {
-			/* overwrite netdev dev_adr with admin VF MAC */
-			memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
-		} else {
-			eth_hw_addr_random(bp->dev);
-			rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
-		}
-		return rc;
 #endif
 	}
 
@@ -7152,6 +7141,7 @@ int bnxt_setup_mq_tc(struct net_device *dev, u8 tc)
 		bp->tx_nr_rings = bp->tx_nr_rings_per_tc;
 		netdev_reset_tc(dev);
 	}
+	bp->tx_nr_rings += bp->tx_nr_rings_xdp;
 	bp->cp_nr_rings = sh ? max_t(int, bp->tx_nr_rings, bp->rx_nr_rings) :
 			       bp->tx_nr_rings + bp->rx_nr_rings;
 	bp->num_stat_ctxs = bp->cp_nr_rings;
@@ -7661,6 +7651,28 @@ void bnxt_restore_pf_fw_resources(struct bnxt *bp)
 	bnxt_subtract_ulp_resources(bp, BNXT_ROCE_ULP);
 }
 
+static int bnxt_init_mac_addr(struct bnxt *bp)
+{
+	int rc = 0;
+
+	if (BNXT_PF(bp)) {
+		memcpy(bp->dev->dev_addr, bp->pf.mac_addr, ETH_ALEN);
+	} else {
+#ifdef CONFIG_BNXT_SRIOV
+		struct bnxt_vf_info *vf = &bp->vf;
+
+		if (is_valid_ether_addr(vf->mac_addr)) {
+			/* overwrite netdev dev_adr with admin VF MAC */
+			memcpy(bp->dev->dev_addr, vf->mac_addr, ETH_ALEN);
+		} else {
+			eth_hw_addr_random(bp->dev);
+			rc = bnxt_approve_mac(bp, bp->dev->dev_addr);
+		}
+#endif
+	}
+	return rc;
+}
+
 static void bnxt_parse_log_pcie_link(struct bnxt *bp)
 {
 	enum pcie_link_width width = PCIE_LNK_WIDTH_UNKNOWN;
@@ -7789,7 +7801,12 @@ static int bnxt_init_one(struct pci_dev *pdev, const struct pci_device_id *ent)
 		rc = -1;
 		goto init_err_pci_clean;
 	}
-
+	rc = bnxt_init_mac_addr(bp);
+	if (rc) {
+		dev_err(&pdev->dev, "Unable to initialize mac address.\n");
+		rc = -EADDRNOTAVAIL;
+		goto init_err_pci_clean;
+	}
 	rc = bnxt_hwrm_queue_qportcfg(bp);
 	if (rc) {
 		netdev_err(bp->dev, "hwrm query qportcfg failure rc: %x\n",
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
index 77da75a..997e10e 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c
@@ -84,6 +84,8 @@ static int bnxt_unregister_dev(struct bnxt_en_dev *edev, int ulp_id)
 
 		max_stat_ctxs = bnxt_get_max_func_stat_ctxs(bp);
 		bnxt_set_max_func_stat_ctxs(bp, max_stat_ctxs + 1);
+		if (ulp->msix_requested)
+			edev->en_ops->bnxt_free_msix(edev, ulp_id);
 	}
 	if (ulp->max_async_event_id)
 		bnxt_hwrm_func_rgtr_async_events(bp, NULL, 0);
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index a981c4e..fea3f9a 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1360,7 +1360,7 @@ static unsigned int __bcmgenet_tx_reclaim(struct net_device *dev,
 		if (skb) {
 			pkts_compl++;
 			bytes_compl += GENET_CB(skb)->bytes_sent;
-			dev_kfree_skb_any(skb);
+			dev_consume_skb_any(skb);
 		}
 
 		txbds_processed++;
@@ -1875,7 +1875,7 @@ static int bcmgenet_alloc_rx_buffers(struct bcmgenet_priv *priv,
 		cb = ring->cbs + i;
 		skb = bcmgenet_rx_refill(priv, cb);
 		if (skb)
-			dev_kfree_skb_any(skb);
+			dev_consume_skb_any(skb);
 		if (!cb->skb)
 			return -ENOMEM;
 	}
@@ -1894,7 +1894,7 @@ static void bcmgenet_free_rx_buffers(struct bcmgenet_priv *priv)
 
 		skb = bcmgenet_free_rx_cb(&priv->pdev->dev, cb);
 		if (skb)
-			dev_kfree_skb_any(skb);
+			dev_consume_skb_any(skb);
 	}
 }
 
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index 82bf7aa..0293b41 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -369,12 +369,12 @@ int t4_wr_mbox_meat_timeout(struct adapter *adap, int mbox, const void *cmd,
 		list_del(&entry.list);
 		spin_unlock(&adap->mbox_lock);
 		ret = (v == MBOX_OWNER_FW) ? -EBUSY : -ETIMEDOUT;
-		t4_record_mbox(adap, cmd, MBOX_LEN, access, ret);
+		t4_record_mbox(adap, cmd, size, access, ret);
 		return ret;
 	}
 
 	/* Copy in the new mailbox command and send it on its way ... */
-	t4_record_mbox(adap, cmd, MBOX_LEN, access, 0);
+	t4_record_mbox(adap, cmd, size, access, 0);
 	for (i = 0; i < size; i += 8)
 		t4_write_reg64(adap, data_reg + i, be64_to_cpu(*p++));
 
@@ -426,7 +426,7 @@ int t4_wr_mbox_meat_timeout(struct adapter *adap, int mbox, const void *cmd,
 	}
 
 	ret = (pcie_fw & PCIE_FW_ERR_F) ? -ENXIO : -ETIMEDOUT;
-	t4_record_mbox(adap, cmd, MBOX_LEN, access, ret);
+	t4_record_mbox(adap, cmd, size, access, ret);
 	dev_err(adap->pdev_dev, "command %#x in mailbox %d timed out\n",
 		*(const u8 *)cmd, mbox);
 	t4_report_fw_error(adap);
diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c
index 34dae51..59da7ac 100644
--- a/drivers/net/ethernet/faraday/ftgmac100.c
+++ b/drivers/net/ethernet/faraday/ftgmac100.c
@@ -1863,7 +1863,6 @@ static int ftgmac100_probe(struct platform_device *pdev)
 err_ioremap:
 	release_resource(priv->res);
 err_req_mem:
-	netif_napi_del(&priv->napi);
 	free_netdev(netdev);
 err_alloc_etherdev:
 	return err;
diff --git a/drivers/net/ethernet/freescale/fman/mac.c b/drivers/net/ethernet/freescale/fman/mac.c
index 6e67d22..1c7da16 100644
--- a/drivers/net/ethernet/freescale/fman/mac.c
+++ b/drivers/net/ethernet/freescale/fman/mac.c
@@ -623,6 +623,8 @@ static struct platform_device *dpaa_eth_add_device(int fman_id,
 		goto no_mem;
 	}
 
+	pdev->dev.of_node = node;
+	pdev->dev.parent = priv->dev;
 	set_dma_ops(&pdev->dev, get_dma_ops(priv->dev));
 
 	ret = platform_device_add_data(pdev, &data, sizeof(data));
diff --git a/drivers/net/ethernet/marvell/mvpp2.c b/drivers/net/ethernet/marvell/mvpp2.c
index 48d21c1..4d598ca 100644
--- a/drivers/net/ethernet/marvell/mvpp2.c
+++ b/drivers/net/ethernet/marvell/mvpp2.c
@@ -6504,7 +6504,7 @@ static int mvpp2_port_probe(struct platform_device *pdev,
 	struct resource *res;
 	const char *dt_mac_addr;
 	const char *mac_from;
-	char hw_mac_addr[ETH_ALEN];
+	char hw_mac_addr[ETH_ALEN] = {0};
 	u32 id;
 	int features;
 	int phy_mode;
diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
index 674773b..78b89ce 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
@@ -2535,8 +2535,8 @@ int mlx4_cmd_init(struct mlx4_dev *dev)
 	}
 
 	if (!priv->cmd.pool) {
-		priv->cmd.pool = pci_pool_create("mlx4_cmd",
-						 dev->persist->pdev,
+		priv->cmd.pool = dma_pool_create("mlx4_cmd",
+						 &dev->persist->pdev->dev,
 						 MLX4_MAILBOX_SIZE,
 						 MLX4_MAILBOX_SIZE, 0);
 		if (!priv->cmd.pool)
@@ -2607,7 +2607,7 @@ void mlx4_cmd_cleanup(struct mlx4_dev *dev, int cleanup_mask)
 	struct mlx4_priv *priv = mlx4_priv(dev);
 
 	if (priv->cmd.pool && (cleanup_mask & MLX4_CMD_CLEANUP_POOL)) {
-		pci_pool_destroy(priv->cmd.pool);
+		dma_pool_destroy(priv->cmd.pool);
 		priv->cmd.pool = NULL;
 	}
 
@@ -2699,7 +2699,7 @@ struct mlx4_cmd_mailbox *mlx4_alloc_cmd_mailbox(struct mlx4_dev *dev)
 	if (!mailbox)
 		return ERR_PTR(-ENOMEM);
 
-	mailbox->buf = pci_pool_zalloc(mlx4_priv(dev)->cmd.pool, GFP_KERNEL,
+	mailbox->buf = dma_pool_zalloc(mlx4_priv(dev)->cmd.pool, GFP_KERNEL,
 				       &mailbox->dma);
 	if (!mailbox->buf) {
 		kfree(mailbox);
@@ -2716,7 +2716,7 @@ void mlx4_free_cmd_mailbox(struct mlx4_dev *dev,
 	if (!mailbox)
 		return;
 
-	pci_pool_free(mlx4_priv(dev)->cmd.pool, mailbox->buf, mailbox->dma);
+	dma_pool_free(mlx4_priv(dev)->cmd.pool, mailbox->buf, mailbox->dma);
 	kfree(mailbox);
 }
 EXPORT_SYMBOL_GPL(mlx4_free_cmd_mailbox);
diff --git a/drivers/net/ethernet/mellanox/mlx4/cq.c b/drivers/net/ethernet/mellanox/mlx4/cq.c
index c56a511..72eb50c 100644
--- a/drivers/net/ethernet/mellanox/mlx4/cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/cq.c
@@ -241,13 +241,14 @@ int __mlx4_cq_alloc_icm(struct mlx4_dev *dev, int *cqn)
 	return err;
 }
 
-static int mlx4_cq_alloc_icm(struct mlx4_dev *dev, int *cqn)
+static int mlx4_cq_alloc_icm(struct mlx4_dev *dev, int *cqn, u8 usage)
 {
+	u32 in_modifier = RES_CQ | (((u32)usage & 3) << 30);
 	u64 out_param;
 	int err;
 
 	if (mlx4_is_mfunc(dev)) {
-		err = mlx4_cmd_imm(dev, 0, &out_param, RES_CQ,
+		err = mlx4_cmd_imm(dev, 0, &out_param, in_modifier,
 				   RES_OP_RESERVE_AND_MAP, MLX4_CMD_ALLOC_RES,
 				   MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
 		if (err)
@@ -303,7 +304,7 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent,
 
 	cq->vector = vector;
 
-	err = mlx4_cq_alloc_icm(dev, &cq->cqn);
+	err = mlx4_cq_alloc_icm(dev, &cq->cqn, cq->usage);
 	if (err)
 		return err;
 
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_cq.c b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
index 85fe17e..f849eec 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_cq.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_cq.c
@@ -140,6 +140,7 @@ int mlx4_en_activate_cq(struct mlx4_en_priv *priv, struct mlx4_en_cq *cq,
 	    (cq->type == RX && priv->hwtstamp_config.rx_filter))
 		timestamp_en = 1;
 
+	cq->mcq.usage = MLX4_RES_USAGE_DRIVER;
 	err = mlx4_cq_alloc(mdev->dev, cq->size, &cq->wqres.mtt,
 			    &mdev->priv_uar, cq->wqres.db.dma, &cq->mcq,
 			    cq->vector, 0, timestamp_en);
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 3a291fc..e3e6d9f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -651,7 +651,8 @@ static int mlx4_en_get_qp(struct mlx4_en_priv *priv)
 		return 0;
 	}
 
-	err = mlx4_qp_reserve_range(dev, 1, 1, qpn, MLX4_RESERVE_A0_QP);
+	err = mlx4_qp_reserve_range(dev, 1, 1, qpn, MLX4_RESERVE_A0_QP,
+				    MLX4_RES_USAGE_DRIVER);
 	en_dbg(DRV, priv, "Reserved qp %d\n", *qpn);
 	if (err) {
 		en_err(priv, "Failed to reserve qp for mac registration\n");
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index bf16380..ec24c40 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -1088,7 +1088,8 @@ int mlx4_en_create_drop_qp(struct mlx4_en_priv *priv)
 	u32 qpn;
 
 	err = mlx4_qp_reserve_range(priv->mdev->dev, 1, 1, &qpn,
-				    MLX4_RESERVE_A0_QP);
+				    MLX4_RESERVE_A0_QP,
+				    MLX4_RES_USAGE_DRIVER);
 	if (err) {
 		en_err(priv, "Failed reserving drop qpn\n");
 		return err;
@@ -1134,7 +1135,8 @@ int mlx4_en_config_rss_steer(struct mlx4_en_priv *priv)
 	flags = priv->rx_ring_num == 1 ? MLX4_RESERVE_A0_QP : 0;
 	err = mlx4_qp_reserve_range(mdev->dev, priv->rx_ring_num,
 				    priv->rx_ring_num,
-				    &rss_map->base_qpn, flags);
+				    &rss_map->base_qpn, flags,
+				    MLX4_RES_USAGE_DRIVER);
 	if (err) {
 		en_err(priv, "Failed reserving %d qps\n", priv->rx_ring_num);
 		return err;
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_tx.c b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
index 73faa3d..a81db25 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_tx.c
@@ -105,7 +105,8 @@ int mlx4_en_create_tx_ring(struct mlx4_en_priv *priv,
 	       (unsigned long long) ring->sp_wqres.buf.direct.map);
 
 	err = mlx4_qp_reserve_range(mdev->dev, 1, 1, &ring->qpn,
-				    MLX4_RESERVE_ETH_BF_QP);
+				    MLX4_RESERVE_ETH_BF_QP,
+				    MLX4_RES_USAGE_DRIVER);
 	if (err) {
 		en_err(priv, "failed reserving qp for TX ring\n");
 		goto err_hwq_res;
diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
index 5fe5cdc..a594bfd 100644
--- a/drivers/net/ethernet/mellanox/mlx4/main.c
+++ b/drivers/net/ethernet/mellanox/mlx4/main.c
@@ -2477,7 +2477,7 @@ static int mlx4_allocate_default_counters(struct mlx4_dev *dev)
 		priv->def_counter[port] = -1;
 
 	for (port = 0; port < dev->caps.num_ports; port++) {
-		err = mlx4_counter_alloc(dev, &idx);
+		err = mlx4_counter_alloc(dev, &idx, MLX4_RES_USAGE_DRIVER);
 
 		if (!err || err == -ENOSPC) {
 			priv->def_counter[port] = idx;
@@ -2519,13 +2519,14 @@ int __mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx)
 	return 0;
 }
 
-int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx)
+int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx, u8 usage)
 {
+	u32 in_modifier = RES_COUNTER | (((u32)usage & 3) << 30);
 	u64 out_param;
 	int err;
 
 	if (mlx4_is_mfunc(dev)) {
-		err = mlx4_cmd_imm(dev, 0, &out_param, RES_COUNTER,
+		err = mlx4_cmd_imm(dev, 0, &out_param, in_modifier,
 				   RES_OP_RESERVE, MLX4_CMD_ALLOC_RES,
 				   MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
 		if (!err)
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4.h b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
index 706d7f2..852d00a 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4.h
@@ -626,7 +626,7 @@ struct mlx4_mgm {
 };
 
 struct mlx4_cmd {
-	struct pci_pool	       *pool;
+	struct dma_pool	       *pool;
 	void __iomem	       *hcr;
 	struct mutex		slave_cmd_mutex;
 	struct semaphore	poll_sem;
diff --git a/drivers/net/ethernet/mellanox/mlx4/qp.c b/drivers/net/ethernet/mellanox/mlx4/qp.c
index 26747212..5e5b447 100644
--- a/drivers/net/ethernet/mellanox/mlx4/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx4/qp.c
@@ -245,8 +245,9 @@ int __mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align,
 }
 
 int mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align,
-			  int *base, u8 flags)
+			  int *base, u8 flags, u8 usage)
 {
+	u32 in_modifier = RES_QP | (((u32)usage & 3) << 30);
 	u64 in_param = 0;
 	u64 out_param;
 	int err;
@@ -258,7 +259,7 @@ int mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align,
 		set_param_l(&in_param, (((u32)flags) << 24) | (u32)cnt);
 		set_param_h(&in_param, align);
 		err = mlx4_cmd_imm(dev, in_param, &out_param,
-				   RES_QP, RES_OP_RESERVE,
+				   in_modifier, RES_OP_RESERVE,
 				   MLX4_CMD_ALLOC_RES,
 				   MLX4_CMD_TIME_CLASS_A, MLX4_CMD_WRAPPED);
 		if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index 31cbe5e..1acbb72 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -1110,7 +1110,7 @@ static struct mlx5_cmd_mailbox *alloc_cmd_box(struct mlx5_core_dev *dev,
 	if (!mailbox)
 		return ERR_PTR(-ENOMEM);
 
-	mailbox->buf = pci_pool_zalloc(dev->cmd.pool, flags,
+	mailbox->buf = dma_pool_zalloc(dev->cmd.pool, flags,
 				       &mailbox->dma);
 	if (!mailbox->buf) {
 		mlx5_core_dbg(dev, "failed allocation\n");
@@ -1125,7 +1125,7 @@ static struct mlx5_cmd_mailbox *alloc_cmd_box(struct mlx5_core_dev *dev,
 static void free_cmd_box(struct mlx5_core_dev *dev,
 			 struct mlx5_cmd_mailbox *mailbox)
 {
-	pci_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
+	dma_pool_free(dev->cmd.pool, mailbox->buf, mailbox->dma);
 	kfree(mailbox);
 }
 
@@ -1776,7 +1776,8 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 		return -EINVAL;
 	}
 
-	cmd->pool = pci_pool_create("mlx5_cmd", dev->pdev, size, align, 0);
+	cmd->pool = dma_pool_create("mlx5_cmd", &dev->pdev->dev, size, align,
+				    0);
 	if (!cmd->pool)
 		return -ENOMEM;
 
@@ -1866,7 +1867,7 @@ int mlx5_cmd_init(struct mlx5_core_dev *dev)
 	free_cmd_page(dev, cmd);
 
 err_free_pool:
-	pci_pool_destroy(cmd->pool);
+	dma_pool_destroy(cmd->pool);
 
 	return err;
 }
@@ -1880,6 +1881,6 @@ void mlx5_cmd_cleanup(struct mlx5_core_dev *dev)
 	destroy_workqueue(cmd->wq);
 	destroy_msg_cache(dev);
 	free_cmd_page(dev, cmd);
-	pci_pool_destroy(cmd->pool);
+	dma_pool_destroy(cmd->pool);
 }
 EXPORT_SYMBOL(mlx5_cmd_cleanup);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 0039b47..040d1af 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -263,6 +263,7 @@ struct mlx5e_dcbx {
 
 	/* The only setting that cannot be read from FW */
 	u8                         tc_tsa[IEEE_8021QAZ_MAX_TCS];
+	u8                         cap;
 };
 #endif
 
@@ -595,7 +596,6 @@ struct mlx5e_channel {
 	struct mlx5_core_dev      *mdev;
 	struct mlx5e_tstamp       *tstamp;
 	int                        ix;
-	int                        cpu;
 };
 
 struct mlx5e_channels {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index 2eb54d3..c1d384f 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -288,13 +288,8 @@ static int mlx5e_dcbnl_ieee_setpfc(struct net_device *dev,
 static u8 mlx5e_dcbnl_getdcbx(struct net_device *dev)
 {
 	struct mlx5e_priv *priv = netdev_priv(dev);
-	struct mlx5e_dcbx *dcbx = &priv->dcbx;
-	u8 mode = DCB_CAP_DCBX_VER_IEEE | DCB_CAP_DCBX_VER_CEE;
 
-	if (dcbx->mode == MLX5E_DCBX_PARAM_VER_OPER_HOST)
-		mode |= DCB_CAP_DCBX_HOST;
-
-	return mode;
+	return priv->dcbx.cap;
 }
 
 static u8 mlx5e_dcbnl_setdcbx(struct net_device *dev, u8 mode)
@@ -312,6 +307,7 @@ static u8 mlx5e_dcbnl_setdcbx(struct net_device *dev, u8 mode)
 		/* set dcbx to fw controlled */
 		if (!mlx5e_dcbnl_set_dcbx_mode(priv, MLX5E_DCBX_PARAM_VER_OPER_AUTO)) {
 			dcbx->mode = MLX5E_DCBX_PARAM_VER_OPER_AUTO;
+			dcbx->cap &= ~DCB_CAP_DCBX_HOST;
 			return 0;
 		}
 
@@ -324,6 +320,8 @@ static u8 mlx5e_dcbnl_setdcbx(struct net_device *dev, u8 mode)
 	if (mlx5e_dcbnl_switch_to_host_mode(netdev_priv(dev)))
 		return 1;
 
+	dcbx->cap = mode;
+
 	return 0;
 }
 
@@ -628,9 +626,9 @@ static u8 mlx5e_dcbnl_getcap(struct net_device *netdev,
 		*cap = false;
 		break;
 	case DCB_CAP_ATTR_DCBX:
-		*cap = (DCB_CAP_DCBX_LLD_MANAGED |
-			DCB_CAP_DCBX_VER_CEE |
-			DCB_CAP_DCBX_STATIC);
+		*cap = priv->dcbx.cap |
+		       DCB_CAP_DCBX_VER_CEE |
+		       DCB_CAP_DCBX_VER_IEEE;
 		break;
 	default:
 		*cap = 0;
@@ -754,8 +752,16 @@ void mlx5e_dcbnl_initialize(struct mlx5e_priv *priv)
 {
 	struct mlx5e_dcbx *dcbx = &priv->dcbx;
 
+	if (!MLX5_CAP_GEN(priv->mdev, qos))
+		return;
+
 	if (MLX5_CAP_GEN(priv->mdev, dcbx))
 		mlx5e_dcbnl_query_dcbx_mode(priv, &dcbx->mode);
 
+	priv->dcbx.cap = DCB_CAP_DCBX_VER_CEE |
+			 DCB_CAP_DCBX_VER_IEEE;
+	if (priv->dcbx.mode == MLX5E_DCBX_PARAM_VER_OPER_HOST)
+		priv->dcbx.cap |= DCB_CAP_DCBX_HOST;
+
 	mlx5e_ets_init(priv);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 917fade..f559401 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -641,8 +641,10 @@ int mlx5e_ethtool_set_channels(struct mlx5e_priv *priv,
 
 	new_channels.params = priv->channels.params;
 	new_channels.params.num_channels = count;
-	mlx5e_build_default_indir_rqt(priv->mdev, new_channels.params.indirection_rqt,
-				      MLX5E_INDIR_RQT_SIZE, count);
+	if (!netif_is_rxfh_configured(priv->netdev))
+		mlx5e_build_default_indir_rqt(priv->mdev,
+					      new_channels.params.indirection_rqt,
+					      MLX5E_INDIR_RQT_SIZE, count);
 
 	if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
 		priv->channels.params = new_channels.params;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index 57f31fa..d75f309 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -71,6 +71,11 @@ struct mlx5e_channel_param {
 	struct mlx5e_cq_param      icosq_cq;
 };
 
+static int mlx5e_get_node(struct mlx5e_priv *priv, int ix)
+{
+	return pci_irq_get_node(priv->mdev->pdev, MLX5_EQ_VEC_COMP_BASE + ix);
+}
+
 static bool mlx5e_check_fragmented_striding_rq_cap(struct mlx5_core_dev *mdev)
 {
 	return MLX5_CAP_GEN(mdev, striding_rq) &&
@@ -396,7 +401,7 @@ static void mlx5e_enable_async_events(struct mlx5e_priv *priv)
 static void mlx5e_disable_async_events(struct mlx5e_priv *priv)
 {
 	clear_bit(MLX5E_STATE_ASYNC_EVENTS_ENABLED, &priv->state);
-	synchronize_irq(mlx5_get_msix_vec(priv->mdev, MLX5_EQ_VEC_ASYNC));
+	synchronize_irq(pci_irq_vector(priv->mdev->pdev, MLX5_EQ_VEC_ASYNC));
 }
 
 static inline int mlx5e_get_wqe_mtt_sz(void)
@@ -443,16 +448,17 @@ static int mlx5e_rq_alloc_mpwqe_info(struct mlx5e_rq *rq,
 	int wq_sz = mlx5_wq_ll_get_size(&rq->wq);
 	int mtt_sz = mlx5e_get_wqe_mtt_sz();
 	int mtt_alloc = mtt_sz + MLX5_UMR_ALIGN - 1;
+	int node = mlx5e_get_node(c->priv, c->ix);
 	int i;
 
 	rq->mpwqe.info = kzalloc_node(wq_sz * sizeof(*rq->mpwqe.info),
-				      GFP_KERNEL, cpu_to_node(c->cpu));
+					GFP_KERNEL, node);
 	if (!rq->mpwqe.info)
 		goto err_out;
 
 	/* We allocate more than mtt_sz as we will align the pointer */
-	rq->mpwqe.mtt_no_align = kzalloc_node(mtt_alloc * wq_sz, GFP_KERNEL,
-					cpu_to_node(c->cpu));
+	rq->mpwqe.mtt_no_align = kzalloc_node(mtt_alloc * wq_sz,
+					GFP_KERNEL, node);
 	if (unlikely(!rq->mpwqe.mtt_no_align))
 		goto err_free_wqe_info;
 
@@ -560,7 +566,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 	int err;
 	int i;
 
-	rqp->wq.db_numa_node = cpu_to_node(c->cpu);
+	rqp->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
 
 	err = mlx5_wq_ll_create(mdev, &rqp->wq, rqc_wq, &rq->wq,
 				&rq->wq_ctrl);
@@ -627,7 +633,8 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c,
 	default: /* MLX5_WQ_TYPE_LINKED_LIST */
 		rq->wqe.frag_info =
 			kzalloc_node(wq_sz * sizeof(*rq->wqe.frag_info),
-				     GFP_KERNEL, cpu_to_node(c->cpu));
+				     GFP_KERNEL,
+				     mlx5e_get_node(c->priv, c->ix));
 		if (!rq->wqe.frag_info) {
 			err = -ENOMEM;
 			goto err_rq_wq_destroy;
@@ -992,13 +999,13 @@ static int mlx5e_alloc_xdpsq(struct mlx5e_channel *c,
 	sq->uar_map   = mdev->mlx5e_res.bfreg.map;
 	sq->min_inline_mode = params->tx_min_inline_mode;
 
-	param->wq.db_numa_node = cpu_to_node(c->cpu);
+	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
 	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
 	if (err)
 		return err;
 	sq->wq.db = &sq->wq.db[MLX5_SND_DBR];
 
-	err = mlx5e_alloc_xdpsq_db(sq, cpu_to_node(c->cpu));
+	err = mlx5e_alloc_xdpsq_db(sq, mlx5e_get_node(c->priv, c->ix));
 	if (err)
 		goto err_sq_wq_destroy;
 
@@ -1046,13 +1053,13 @@ static int mlx5e_alloc_icosq(struct mlx5e_channel *c,
 	sq->channel   = c;
 	sq->uar_map   = mdev->mlx5e_res.bfreg.map;
 
-	param->wq.db_numa_node = cpu_to_node(c->cpu);
+	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
 	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
 	if (err)
 		return err;
 	sq->wq.db = &sq->wq.db[MLX5_SND_DBR];
 
-	err = mlx5e_alloc_icosq_db(sq, cpu_to_node(c->cpu));
+	err = mlx5e_alloc_icosq_db(sq, mlx5e_get_node(c->priv, c->ix));
 	if (err)
 		goto err_sq_wq_destroy;
 
@@ -1118,13 +1125,13 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
 	if (MLX5_IPSEC_DEV(c->priv->mdev))
 		set_bit(MLX5E_SQ_STATE_IPSEC, &sq->state);
 
-	param->wq.db_numa_node = cpu_to_node(c->cpu);
+	param->wq.db_numa_node = mlx5e_get_node(c->priv, c->ix);
 	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, &sq->wq, &sq->wq_ctrl);
 	if (err)
 		return err;
 	sq->wq.db    = &sq->wq.db[MLX5_SND_DBR];
 
-	err = mlx5e_alloc_txqsq_db(sq, cpu_to_node(c->cpu));
+	err = mlx5e_alloc_txqsq_db(sq, mlx5e_get_node(c->priv, c->ix));
 	if (err)
 		goto err_sq_wq_destroy;
 
@@ -1496,8 +1503,8 @@ static int mlx5e_alloc_cq(struct mlx5e_channel *c,
 	struct mlx5_core_dev *mdev = c->priv->mdev;
 	int err;
 
-	param->wq.buf_numa_node = cpu_to_node(c->cpu);
-	param->wq.db_numa_node  = cpu_to_node(c->cpu);
+	param->wq.buf_numa_node = mlx5e_get_node(c->priv, c->ix);
+	param->wq.db_numa_node  = mlx5e_get_node(c->priv, c->ix);
 	param->eq_ix   = c->ix;
 
 	err = mlx5e_alloc_cq_common(mdev, param, cq);
@@ -1596,11 +1603,6 @@ static void mlx5e_close_cq(struct mlx5e_cq *cq)
 	mlx5e_free_cq(cq);
 }
 
-static int mlx5e_get_cpu(struct mlx5e_priv *priv, int ix)
-{
-	return cpumask_first(priv->mdev->priv.irq_info[ix].mask);
-}
-
 static int mlx5e_open_tx_cqs(struct mlx5e_channel *c,
 			     struct mlx5e_params *params,
 			     struct mlx5e_channel_param *cparam)
@@ -1749,11 +1751,10 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 {
 	struct mlx5e_cq_moder icocq_moder = {0, 0};
 	struct net_device *netdev = priv->netdev;
-	int cpu = mlx5e_get_cpu(priv, ix);
 	struct mlx5e_channel *c;
 	int err;
 
-	c = kzalloc_node(sizeof(*c), GFP_KERNEL, cpu_to_node(cpu));
+	c = kzalloc_node(sizeof(*c), GFP_KERNEL, mlx5e_get_node(priv, ix));
 	if (!c)
 		return -ENOMEM;
 
@@ -1761,7 +1762,6 @@ static int mlx5e_open_channel(struct mlx5e_priv *priv, int ix,
 	c->mdev     = priv->mdev;
 	c->tstamp   = &priv->tstamp;
 	c->ix       = ix;
-	c->cpu      = cpu;
 	c->pdev     = &priv->mdev->pdev->dev;
 	c->netdev   = priv->netdev;
 	c->mkey_be  = cpu_to_be32(priv->mdev->mlx5e_res.mkey.key);
@@ -1847,7 +1847,8 @@ static void mlx5e_activate_channel(struct mlx5e_channel *c)
 	for (tc = 0; tc < c->num_tc; tc++)
 		mlx5e_activate_txqsq(&c->sq[tc]);
 	mlx5e_activate_rq(&c->rq);
-	netif_set_xps_queue(c->netdev, get_cpu_mask(c->cpu), c->ix);
+	netif_set_xps_queue(c->netdev,
+		mlx5_get_vector_affinity(c->priv->mdev, c->ix), c->ix);
 }
 
 static void mlx5e_deactivate_channel(struct mlx5e_channel *c)
@@ -1969,6 +1970,7 @@ static void mlx5e_build_rx_cq_param(struct mlx5e_priv *priv,
 	}
 
 	mlx5e_build_common_cq_param(priv, param);
+	param->cq_period_mode = params->rx_cq_period_mode;
 }
 
 static void mlx5e_build_tx_cq_param(struct mlx5e_priv *priv,
@@ -3792,18 +3794,8 @@ void mlx5e_build_default_indir_rqt(struct mlx5_core_dev *mdev,
 				   u32 *indirection_rqt, int len,
 				   int num_channels)
 {
-	int node = mdev->priv.numa_node;
-	int node_num_of_cores;
 	int i;
 
-	if (node == -1)
-		node = first_online_node;
-
-	node_num_of_cores = cpumask_weight(cpumask_of_node(node));
-
-	if (node_num_of_cores)
-		num_channels = min_t(int, num_channels, node_num_of_cores);
-
 	for (i = 0; i < len; i++)
 		indirection_rqt[i] = i % num_channels;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
index 325b2c8..7344433 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx.c
@@ -222,13 +222,13 @@ static inline int mlx5e_page_alloc_mapped(struct mlx5e_rq *rq,
 	if (unlikely(!page))
 		return -ENOMEM;
 
-	dma_info->page = page;
 	dma_info->addr = dma_map_page(rq->pdev, page, 0,
 				      RQ_PAGE_SIZE(rq), rq->buff.map_dir);
 	if (unlikely(dma_mapping_error(rq->pdev, dma_info->addr))) {
 		put_page(page);
 		return -ENOMEM;
 	}
+	dma_info->page = page;
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
index 898759f..1f1f8af8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -189,6 +189,7 @@ struct mlx5e_lbt_priv {
 	struct packet_type pt;
 	struct completion comp;
 	bool loopback_ok;
+	bool local_lb;
 };
 
 static int
@@ -236,6 +237,13 @@ static int mlx5e_test_loopback_setup(struct mlx5e_priv *priv,
 {
 	int err = 0;
 
+	/* Temporarily enable local_lb */
+	if (MLX5_CAP_GEN(priv->mdev, disable_local_lb)) {
+		mlx5_nic_vport_query_local_lb(priv->mdev, &lbtp->local_lb);
+		if (!lbtp->local_lb)
+			mlx5_nic_vport_update_local_lb(priv->mdev, true);
+	}
+
 	err = mlx5e_refresh_tirs(priv, true);
 	if (err)
 		return err;
@@ -254,6 +262,11 @@ static int mlx5e_test_loopback_setup(struct mlx5e_priv *priv,
 static void mlx5e_test_loopback_cleanup(struct mlx5e_priv *priv,
 					struct mlx5e_lbt_priv *lbtp)
 {
+	if (MLX5_CAP_GEN(priv->mdev, disable_local_lb)) {
+		if (!lbtp->local_lb)
+			mlx5_nic_vport_update_local_lb(priv->mdev, false);
+	}
+
 	dev_remove_pack(&lbtp->pt);
 	mlx5e_refresh_tirs(priv, false);
 }
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index 3c536f5..7f282e8 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -1443,12 +1443,10 @@ static int mlx5e_route_lookup_ipv6(struct mlx5e_priv *priv,
 	struct mlx5_eswitch *esw = priv->mdev->priv.eswitch;
 	int ret;
 
-	dst = ip6_route_output(dev_net(mirred_dev), NULL, fl6);
-	ret = dst->error;
-	if (ret) {
-		dst_release(dst);
+	ret = ipv6_stub->ipv6_dst_lookup(dev_net(mirred_dev), NULL, &dst,
+					 fl6);
+	if (ret < 0)
 		return ret;
-	}
 
 	*out_ttl = ip6_dst_hoplimit(dst);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index aaa0f4e..31353e5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -128,10 +128,10 @@ static inline int mlx5e_skb_l3_header_offset(struct sk_buff *skb)
 		return mlx5e_skb_l2_header_offset(skb);
 }
 
-static inline unsigned int mlx5e_calc_min_inline(enum mlx5_inline_modes mode,
-						 struct sk_buff *skb)
+static inline u16 mlx5e_calc_min_inline(enum mlx5_inline_modes mode,
+					struct sk_buff *skb)
 {
-	int hlen;
+	u16 hlen;
 
 	switch (mode) {
 	case MLX5_INLINE_MODE_NONE:
@@ -140,19 +140,22 @@ static inline unsigned int mlx5e_calc_min_inline(enum mlx5_inline_modes mode,
 		hlen = eth_get_headlen(skb->data, skb_headlen(skb));
 		if (hlen == ETH_HLEN && !skb_vlan_tag_present(skb))
 			hlen += VLAN_HLEN;
-		return hlen;
+		break;
 	case MLX5_INLINE_MODE_IP:
 		/* When transport header is set to zero, it means no transport
 		 * header. When transport header is set to 0xff's, it means
 		 * transport header wasn't set.
 		 */
-		if (skb_transport_offset(skb))
-			return mlx5e_skb_l3_header_offset(skb);
+		if (skb_transport_offset(skb)) {
+			hlen = mlx5e_skb_l3_header_offset(skb);
+			break;
+		}
 		/* fall through */
 	case MLX5_INLINE_MODE_L2:
 	default:
-		return mlx5e_skb_l2_header_offset(skb);
+		hlen = mlx5e_skb_l2_header_offset(skb);
 	}
+	return min_t(u16, hlen, skb->len);
 }
 
 static inline void mlx5e_tx_skb_pull_inline(unsigned char **skb_data,
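
Note on the mlx5e_calc_min_inline() change above: every switch case now stores its result in `hlen`, and the function returns `min_t(u16, hlen, skb->len)`, so the reported inline header length can no longer exceed the packet length. The following is a userspace sketch of that clamp under stated assumptions, not driver code. `min_u16` and `clamp_inline_len` are hypothetical names; `min_u16` stands in for the kernel's `min_t(u16, ...)`, which casts both operands to the given type before comparing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for min_t(u16, a, b). */
static inline uint16_t min_u16(uint16_t a, uint16_t b)
{
	return a < b ? a : b;
}

/*
 * Sketch of the clamp at the end of mlx5e_calc_min_inline():
 * never report more inline header bytes than the packet holds.
 * pkt_len plays the role of skb->len.
 */
static inline uint16_t clamp_inline_len(uint16_t hlen, uint32_t pkt_len)
{
	/* Both operands are narrowed to 16 bits before the comparison. */
	return min_u16(hlen, (uint16_t)pkt_len);
}
```

For example, a computed header length of 18 bytes (ETH_HLEN plus VLAN_HLEN) on a 14-byte packet is clamped to 14, while the same length on a 60-byte packet is returned unchanged.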
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eq.c b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
index 52b9a64..edd11eb 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -161,6 +161,8 @@ static const char *eqe_type_str(u8 type)
 		return "MLX5_EVENT_TYPE_NIC_VPORT_CHANGE";
 	case MLX5_EVENT_TYPE_FPGA_ERROR:
 		return "MLX5_EVENT_TYPE_FPGA_ERROR";
+	case MLX5_EVENT_TYPE_GENERAL_EVENT:
+		return "MLX5_EVENT_TYPE_GENERAL_EVENT";
 	default:
 		return "Unrecognized event";
 	}
@@ -378,6 +380,20 @@ int mlx5_core_page_fault_resume(struct mlx5_core_dev *dev, u32 token,
 EXPORT_SYMBOL_GPL(mlx5_core_page_fault_resume);
 #endif
 
+static void general_event_handler(struct mlx5_core_dev *dev,
+				  struct mlx5_eqe *eqe)
+{
+	switch (eqe->sub_type) {
+	case MLX5_GENERAL_SUBTYPE_DELAY_DROP_TIMEOUT:
+		if (dev->event)
+			dev->event(dev, MLX5_DEV_EVENT_DELAY_DROP_TIMEOUT, 0);
+		break;
+	default:
+		mlx5_core_dbg(dev, "General event with unrecognized subtype: sub_type %d\n",
+			      eqe->sub_type);
+	}
+}
+
 static irqreturn_t mlx5_eq_int(int irq, void *eq_ptr)
 {
 	struct mlx5_eq *eq = eq_ptr;
@@ -486,6 +502,9 @@ static irqreturn_t mlx5_eq_int(int irq, void *eq_ptr)
 			mlx5_fpga_event(dev, eqe->type, &eqe->data.raw);
 			break;
 
+		case MLX5_EVENT_TYPE_GENERAL_EVENT:
+			general_event_handler(dev, eqe);
+			break;
 		default:
 			mlx5_core_warn(dev, "Unhandled event 0x%x on EQ 0x%x\n",
 				       eqe->type, eq->eqn);
@@ -585,7 +604,7 @@ int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx,
 		 name, pci_name(dev->pdev));
 
 	eq->eqn = MLX5_GET(create_eq_out, out, eq_number);
-	eq->irqn = priv->msix_arr[vecidx].vector;
+	eq->irqn = pci_irq_vector(dev->pdev, vecidx);
 	eq->dev = dev;
 	eq->doorbell = priv->uar->map + MLX5_EQ_DOORBEL_OFFSET;
 	err = request_irq(eq->irqn, handler, 0,
@@ -620,7 +639,7 @@ int mlx5_create_map_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq, u8 vecidx,
 	return 0;
 
 err_irq:
-	free_irq(priv->msix_arr[vecidx].vector, eq);
+	free_irq(eq->irqn, eq);
 
 err_eq:
 	mlx5_cmd_destroy_eq(dev, eq->eqn);
@@ -661,11 +680,6 @@ int mlx5_destroy_unmap_eq(struct mlx5_core_dev *dev, struct mlx5_eq *eq)
 }
 EXPORT_SYMBOL_GPL(mlx5_destroy_unmap_eq);
 
-u32 mlx5_get_msix_vec(struct mlx5_core_dev *dev, int vecidx)
-{
-	return dev->priv.msix_arr[MLX5_EQ_VEC_ASYNC].vector;
-}
-
 int mlx5_eq_init(struct mlx5_core_dev *dev)
 {
 	int err;
@@ -693,6 +707,10 @@ int mlx5_start_eqs(struct mlx5_core_dev *dev)
 	    mlx5_core_is_pf(dev))
 		async_event_mask |= (1ull << MLX5_EVENT_TYPE_NIC_VPORT_CHANGE);
 
+	if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH &&
+	    MLX5_CAP_GEN(dev, general_notification_event))
+		async_event_mask |= (1ull << MLX5_EVENT_TYPE_GENERAL_EVENT);
+
 	if (MLX5_CAP_GEN(dev, port_module_event))
 		async_event_mask |= (1ull << MLX5_EVENT_TYPE_PORT_MODULE_EVENT);
 	else
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
index 8b18cc9..5b41f69 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
@@ -1585,7 +1585,7 @@ static void esw_disable_vport(struct mlx5_eswitch *esw, int vport_num)
 	/* Mark this vport as disabled to discard new events */
 	vport->enabled = false;
 
-	synchronize_irq(mlx5_get_msix_vec(esw->dev, MLX5_EQ_VEC_ASYNC));
+	synchronize_irq(pci_irq_vector(esw->dev->pdev, MLX5_EQ_VEC_ASYNC));
 	/* Wait for current already scheduled events to complete */
 	flush_workqueue(esw->work_queue);
 	/* Disable events from this vport */
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
index 95b6402..5bc0593 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
@@ -815,7 +815,7 @@ void esw_offloads_cleanup(struct mlx5_eswitch *esw, int nvports)
 	struct mlx5_eswitch_rep *rep;
 	int vport;
 
-	for (vport = 0; vport < nvports; vport++) {
+	for (vport = nvports - 1; vport >= 0; vport--) {
 		rep = &esw->offloads.vport_reps[vport];
 		if (!rep->valid)
 			continue;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fw.c b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
index fa33d59..2c71557 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/fw.c
@@ -120,6 +120,12 @@ int mlx5_query_hca_caps(struct mlx5_core_dev *dev)
 			return err;
 	}
 
+	if (MLX5_CAP_GEN(dev, ipoib_enhanced_offloads)) {
+		err = mlx5_core_get_caps(dev, MLX5_CAP_IPOIB_ENHANCED_OFFLOADS);
+		if (err)
+			return err;
+	}
+
 	if (MLX5_CAP_GEN(dev, pg)) {
 		err = mlx5_core_get_caps(dev, MLX5_CAP_ODP);
 		if (err)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/health.c b/drivers/net/ethernet/mellanox/mlx5/core/health.c
index 4b6b03d..8aea0a0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/health.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/health.c
@@ -81,7 +81,7 @@ static void trigger_cmd_completions(struct mlx5_core_dev *dev)
 	u64 vector;
 
 	/* wait for pending handlers to complete */
-	synchronize_irq(dev->priv.msix_arr[MLX5_EQ_VEC_CMD].vector);
+	synchronize_irq(pci_irq_vector(dev->pdev, MLX5_EQ_VEC_CMD));
 	spin_lock_irqsave(&dev->cmd.alloc_lock, flags);
 	vector = ~dev->cmd.bitmask & ((1ul << (1 << dev->cmd.log_sz)) - 1);
 	if (!vector)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index c065132..8c4b45ef5 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -47,6 +47,7 @@
 #include <linux/debugfs.h>
 #include <linux/kmod.h>
 #include <linux/mlx5/mlx5_ifc.h>
+#include <linux/mlx5/vport.h>
 #ifdef CONFIG_RFS_ACCEL
 #include <linux/cpu_rmap.h>
 #endif
@@ -312,13 +313,15 @@ static void release_bar(struct pci_dev *pdev)
 	pci_release_regions(pdev);
 }
 
-static int mlx5_enable_msix(struct mlx5_core_dev *dev)
+static int mlx5_alloc_irq_vectors(struct mlx5_core_dev *dev)
 {
 	struct mlx5_priv *priv = &dev->priv;
 	struct mlx5_eq_table *table = &priv->eq_table;
+	struct irq_affinity irqdesc = {
+		.pre_vectors = MLX5_EQ_VEC_COMP_BASE,
+	};
 	int num_eqs = 1 << MLX5_CAP_GEN(dev, log_max_eq);
 	int nvec;
-	int i;
 
 	nvec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() +
 	       MLX5_EQ_VEC_COMP_BASE;
@@ -326,17 +329,14 @@ static int mlx5_enable_msix(struct mlx5_core_dev *dev)
 	if (nvec <= MLX5_EQ_VEC_COMP_BASE)
 		return -ENOMEM;
 
-	priv->msix_arr = kcalloc(nvec, sizeof(*priv->msix_arr), GFP_KERNEL);
-
 	priv->irq_info = kcalloc(nvec, sizeof(*priv->irq_info), GFP_KERNEL);
-	if (!priv->msix_arr || !priv->irq_info)
+	if (!priv->irq_info)
 		goto err_free_msix;
 
-	for (i = 0; i < nvec; i++)
-		priv->msix_arr[i].entry = i;
-
-	nvec = pci_enable_msix_range(dev->pdev, priv->msix_arr,
-				     MLX5_EQ_VEC_COMP_BASE + 1, nvec);
+	nvec = pci_alloc_irq_vectors_affinity(dev->pdev,
+			MLX5_EQ_VEC_COMP_BASE + 1, nvec,
+			PCI_IRQ_MSIX | PCI_IRQ_AFFINITY,
+			&irqdesc);
 	if (nvec < 0)
 		return nvec;
 
@@ -346,17 +346,15 @@ static int mlx5_enable_msix(struct mlx5_core_dev *dev)
 
 err_free_msix:
 	kfree(priv->irq_info);
-	kfree(priv->msix_arr);
 	return -ENOMEM;
 }
 
-static void mlx5_disable_msix(struct mlx5_core_dev *dev)
+static void mlx5_free_irq_vectors(struct mlx5_core_dev *dev)
 {
 	struct mlx5_priv *priv = &dev->priv;
 
-	pci_disable_msix(dev->pdev);
+	pci_free_irq_vectors(dev->pdev);
 	kfree(priv->irq_info);
-	kfree(priv->msix_arr);
 }
 
 struct mlx5_reg_host_endianness {
@@ -579,6 +577,18 @@ static int set_hca_ctrl(struct mlx5_core_dev *dev)
 	return err;
 }
 
+static int mlx5_core_set_hca_defaults(struct mlx5_core_dev *dev)
+{
+	int ret = 0;
+
+	/* Disable local_lb by default */
+	if ((MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
+	    MLX5_CAP_GEN(dev, disable_local_lb))
+		ret = mlx5_nic_vport_update_local_lb(dev, false);
+
+	return ret;
+}
+
 int mlx5_core_enable_hca(struct mlx5_core_dev *dev, u16 func_id)
 {
 	u32 out[MLX5_ST_SZ_DW(enable_hca_out)] = {0};
@@ -612,65 +622,6 @@ u64 mlx5_read_internal_timer(struct mlx5_core_dev *dev)
 	return (u64)timer_l | (u64)timer_h1 << 32;
 }
 
-static int mlx5_irq_set_affinity_hint(struct mlx5_core_dev *mdev, int i)
-{
-	struct mlx5_priv *priv  = &mdev->priv;
-	struct msix_entry *msix = priv->msix_arr;
-	int irq                 = msix[i + MLX5_EQ_VEC_COMP_BASE].vector;
-
-	if (!zalloc_cpumask_var(&priv->irq_info[i].mask, GFP_KERNEL)) {
-		mlx5_core_warn(mdev, "zalloc_cpumask_var failed");
-		return -ENOMEM;
-	}
-
-	cpumask_set_cpu(cpumask_local_spread(i, priv->numa_node),
-			priv->irq_info[i].mask);
-
-	if (IS_ENABLED(CONFIG_SMP) &&
-	    irq_set_affinity_hint(irq, priv->irq_info[i].mask))
-		mlx5_core_warn(mdev, "irq_set_affinity_hint failed, irq 0x%.4x", irq);
-
-	return 0;
-}
-
-static void mlx5_irq_clear_affinity_hint(struct mlx5_core_dev *mdev, int i)
-{
-	struct mlx5_priv *priv  = &mdev->priv;
-	struct msix_entry *msix = priv->msix_arr;
-	int irq                 = msix[i + MLX5_EQ_VEC_COMP_BASE].vector;
-
-	irq_set_affinity_hint(irq, NULL);
-	free_cpumask_var(priv->irq_info[i].mask);
-}
-
-static int mlx5_irq_set_affinity_hints(struct mlx5_core_dev *mdev)
-{
-	int err;
-	int i;
-
-	for (i = 0; i < mdev->priv.eq_table.num_comp_vectors; i++) {
-		err = mlx5_irq_set_affinity_hint(mdev, i);
-		if (err)
-			goto err_out;
-	}
-
-	return 0;
-
-err_out:
-	for (i--; i >= 0; i--)
-		mlx5_irq_clear_affinity_hint(mdev, i);
-
-	return err;
-}
-
-static void mlx5_irq_clear_affinity_hints(struct mlx5_core_dev *mdev)
-{
-	int i;
-
-	for (i = 0; i < mdev->priv.eq_table.num_comp_vectors; i++)
-		mlx5_irq_clear_affinity_hint(mdev, i);
-}
-
 int mlx5_vector2eqn(struct mlx5_core_dev *dev, int vector, int *eqn,
 		    unsigned int *irqn)
 {
@@ -760,8 +711,8 @@ static int alloc_comp_eqs(struct mlx5_core_dev *dev)
 		}
 
 #ifdef CONFIG_RFS_ACCEL
-		irq_cpu_rmap_add(dev->rmap,
-				 dev->priv.msix_arr[i + MLX5_EQ_VEC_COMP_BASE].vector);
+		irq_cpu_rmap_add(dev->rmap, pci_irq_vector(dev->pdev,
+				 MLX5_EQ_VEC_COMP_BASE + i));
 #endif
 		snprintf(name, MLX5_MAX_IRQ_NAME, "mlx5_comp%d", i);
 		err = mlx5_create_map_eq(dev, eq,
@@ -1119,9 +1070,9 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 		goto err_stop_poll;
 	}
 
-	err = mlx5_enable_msix(dev);
+	err = mlx5_alloc_irq_vectors(dev);
 	if (err) {
-		dev_err(&pdev->dev, "enable msix failed\n");
+		dev_err(&pdev->dev, "alloc irq vectors failed\n");
 		goto err_cleanup_once;
 	}
 
@@ -1143,18 +1094,18 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 		goto err_stop_eqs;
 	}
 
-	err = mlx5_irq_set_affinity_hints(dev);
-	if (err) {
-		dev_err(&pdev->dev, "Failed to alloc affinity hint cpumask\n");
-		goto err_affinity_hints;
-	}
-
 	err = mlx5_init_fs(dev);
 	if (err) {
 		dev_err(&pdev->dev, "Failed to init flow steering\n");
 		goto err_fs;
 	}
 
+	err = mlx5_core_set_hca_defaults(dev);
+	if (err) {
+		dev_err(&pdev->dev, "Failed to set hca defaults\n");
+		goto err_fs;
+	}
+
 #ifdef CONFIG_MLX5_CORE_EN
 	mlx5_eswitch_attach(dev->priv.eswitch);
 #endif
@@ -1186,7 +1137,6 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 		}
 	}
 
-	clear_bit(MLX5_INTERFACE_STATE_DOWN, &dev->intf_state);
 	set_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state);
 out:
 	mutex_unlock(&dev->intf_state_mutex);
@@ -1208,9 +1158,6 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 	mlx5_cleanup_fs(dev);
 
 err_fs:
-	mlx5_irq_clear_affinity_hints(dev);
-
-err_affinity_hints:
 	free_comp_eqs(dev);
 
 err_stop_eqs:
@@ -1220,7 +1167,7 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 	mlx5_put_uars_page(dev, priv->uar);
 
 err_disable_msix:
-	mlx5_disable_msix(dev);
+	mlx5_free_irq_vectors(dev);
 
 err_cleanup_once:
 	if (boot)
@@ -1261,7 +1208,7 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 		mlx5_drain_health_recovery(dev);
 
 	mutex_lock(&dev->intf_state_mutex);
-	if (test_bit(MLX5_INTERFACE_STATE_DOWN, &dev->intf_state)) {
+	if (!test_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state)) {
 		dev_warn(&dev->pdev->dev, "%s: interface is down, NOP\n",
 			 __func__);
 		if (cleanup)
@@ -1270,7 +1217,6 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 	}
 
 	clear_bit(MLX5_INTERFACE_STATE_UP, &dev->intf_state);
-	set_bit(MLX5_INTERFACE_STATE_DOWN, &dev->intf_state);
 
 	if (mlx5_device_registered(dev))
 		mlx5_detach_device(dev);
@@ -1283,11 +1229,10 @@ static int mlx5_unload_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 	mlx5_eswitch_detach(dev->priv.eswitch);
 #endif
 	mlx5_cleanup_fs(dev);
-	mlx5_irq_clear_affinity_hints(dev);
 	free_comp_eqs(dev);
 	mlx5_stop_eqs(dev);
 	mlx5_put_uars_page(dev, priv->uar);
-	mlx5_disable_msix(dev);
+	mlx5_free_irq_vectors(dev);
 	if (cleanup)
 		mlx5_cleanup_once(dev);
 	mlx5_stop_health_poll(dev);
@@ -1565,8 +1510,6 @@ static void shutdown(struct pci_dev *pdev)
 	int err;
 
 	dev_info(&pdev->dev, "Shutdown was called\n");
-	/* Notify mlx5 clients that the kernel is being shut down */
-	set_bit(MLX5_INTERFACE_STATE_SHUTDOWN, &dev->intf_state);
 	err = mlx5_try_fast_unload(dev);
 	if (err)
 		mlx5_unload_one(dev, priv, false);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
index 6a263e8..01d637d 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/mlx5_core.h
@@ -110,7 +110,6 @@ int mlx5_destroy_scheduling_element_cmd(struct mlx5_core_dev *dev, u8 hierarchy,
 					u32 element_id);
 int mlx5_wait_for_vf_pages(struct mlx5_core_dev *dev);
 u64 mlx5_read_internal_timer(struct mlx5_core_dev *dev);
-u32 mlx5_get_msix_vec(struct mlx5_core_dev *dev, int vecidx);
 struct mlx5_eq *mlx5_eqn2eq(struct mlx5_core_dev *dev, int eqn);
 void mlx5_cq_tasklet_cb(unsigned long data);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/qp.c b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
index 340f281..db9e665 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/qp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/qp.c
@@ -242,6 +242,20 @@ int mlx5_core_destroy_qp(struct mlx5_core_dev *dev,
 }
 EXPORT_SYMBOL_GPL(mlx5_core_destroy_qp);
 
+int mlx5_core_set_delay_drop(struct mlx5_core_dev *dev,
+			     u32 timeout_usec)
+{
+	u32 out[MLX5_ST_SZ_DW(set_delay_drop_params_out)] = {0};
+	u32 in[MLX5_ST_SZ_DW(set_delay_drop_params_in)]   = {0};
+
+	MLX5_SET(set_delay_drop_params_in, in, opcode,
+		 MLX5_CMD_OP_SET_DELAY_DROP_PARAMS);
+	MLX5_SET(set_delay_drop_params_in, in, delay_drop_timeout,
+		 timeout_usec / 100);
+	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+}
+EXPORT_SYMBOL_GPL(mlx5_core_set_delay_drop);
+
 struct mbox_info {
 	u32 *in;
 	u32 *out;
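
Note on mlx5_core_set_delay_drop() above: the caller passes a timeout in microseconds and the function writes `timeout_usec / 100` into the `delay_drop_timeout` field. That suggests the field counts units of 100 usec, which is an inference from the division and is not stated in the patch. The following is a small sketch of the conversion; `example_delay_drop_units` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch of the conversion done before MLX5_SET(): integer division
 * by 100, so any remainder below 100 usec is dropped.
 */
static inline uint32_t example_delay_drop_units(uint32_t timeout_usec)
{
	return timeout_usec / 100;
}
```

Because the division truncates, a request below 100 usec ends up as 0 in the command field.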
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
index bf99d40..28d8472 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/sriov.c
@@ -32,6 +32,7 @@
 
 #include <linux/pci.h>
 #include <linux/mlx5/driver.h>
+#include <linux/mlx5/vport.h>
 #include "mlx5_core.h"
 #ifdef CONFIG_MLX5_CORE_EN
 #include "eswitch.h"
@@ -44,6 +45,38 @@ bool mlx5_sriov_is_enabled(struct mlx5_core_dev *dev)
 	return !!sriov->num_vfs;
 }
 
+static int sriov_restore_guids(struct mlx5_core_dev *dev, int vf)
+{
+	struct mlx5_core_sriov *sriov = &dev->priv.sriov;
+	struct mlx5_hca_vport_context *in;
+	int err = 0;
+
+	/* Restore sriov guid and policy settings */
+	if (sriov->vfs_ctx[vf].node_guid ||
+	    sriov->vfs_ctx[vf].port_guid ||
+	    sriov->vfs_ctx[vf].policy != MLX5_POLICY_INVALID) {
+		in = kzalloc(sizeof(*in), GFP_KERNEL);
+		if (!in)
+			return -ENOMEM;
+
+		in->node_guid = sriov->vfs_ctx[vf].node_guid;
+		in->port_guid = sriov->vfs_ctx[vf].port_guid;
+		in->policy = sriov->vfs_ctx[vf].policy;
+		in->field_select =
+			!!(in->port_guid) * MLX5_HCA_VPORT_SEL_PORT_GUID |
+			!!(in->node_guid) * MLX5_HCA_VPORT_SEL_NODE_GUID |
+			!!(in->policy) * MLX5_HCA_VPORT_SEL_STATE_POLICY;
+
+		err = mlx5_core_modify_hca_vport_context(dev, 1, 1, vf + 1, in);
+		if (err)
+			mlx5_core_warn(dev, "modify vport context failed, unable to restore VF %d settings\n", vf);
+
+		kfree(in);
+	}
+
+	return err;
+}
+
 static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
 {
 	struct mlx5_core_sriov *sriov = &dev->priv.sriov;
@@ -74,6 +107,15 @@ static int mlx5_device_enable_sriov(struct mlx5_core_dev *dev, int num_vfs)
 		}
 		sriov->vfs_ctx[vf].enabled = 1;
 		sriov->enabled_vfs++;
+		if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_IB) {
+			err = sriov_restore_guids(dev, vf);
+			if (err) {
+				mlx5_core_warn(dev,
+					       "failed to restore VF %d settings, err %d\n",
+					       vf, err);
+				continue;
+			}
+		}
 		mlx5_core_dbg(dev, "successfully enabled VF* %d\n", vf);
 
 	}
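
Note on sriov_restore_guids() above: `field_select` is built with the `!!(x) * FLAG` idiom, where `!!` collapses any non-zero value to 1, so each term contributes either the flag or 0. The following is a userspace sketch of that idiom. The `EXAMPLE_SEL_*` values and `example_field_select` are hypothetical; the real `MLX5_HCA_VPORT_SEL_*` constants are defined in the mlx5 headers and may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag values, chosen only for illustration. */
enum {
	EXAMPLE_SEL_PORT_GUID    = 1 << 0,
	EXAMPLE_SEL_NODE_GUID    = 1 << 1,
	EXAMPLE_SEL_STATE_POLICY = 1 << 2,
};

/* Select a field for modification only when its stored value is non-zero. */
static inline uint32_t example_field_select(uint64_t port_guid,
					    uint64_t node_guid,
					    uint32_t policy)
{
	return (uint32_t)(!!port_guid * EXAMPLE_SEL_PORT_GUID |
			  !!node_guid * EXAMPLE_SEL_NODE_GUID |
			  !!policy * EXAMPLE_SEL_STATE_POLICY);
}
```

Unary `!` binds tighter than `*`, and `*` binds tighter than `|`, so the expression needs no extra parentheses around each term.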
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/srq.c b/drivers/net/ethernet/mellanox/mlx5/core/srq.c
index f774de6..23cc337 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/srq.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/srq.c
@@ -201,13 +201,13 @@ static int destroy_srq_cmd(struct mlx5_core_dev *dev,
 static int arm_srq_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
 		       u16 lwm, int is_srq)
 {
-	/* arm_srq structs missing using identical xrc ones */
-	u32 srq_in[MLX5_ST_SZ_DW(arm_xrc_srq_in)] = {0};
-	u32 srq_out[MLX5_ST_SZ_DW(arm_xrc_srq_out)] = {0};
+	u32 srq_in[MLX5_ST_SZ_DW(arm_rq_in)] = {0};
+	u32 srq_out[MLX5_ST_SZ_DW(arm_rq_out)] = {0};
 
-	MLX5_SET(arm_xrc_srq_in, srq_in, opcode,   MLX5_CMD_OP_ARM_XRC_SRQ);
-	MLX5_SET(arm_xrc_srq_in, srq_in, xrc_srqn, srq->srqn);
-	MLX5_SET(arm_xrc_srq_in, srq_in, lwm,      lwm);
+	MLX5_SET(arm_rq_in, srq_in, opcode, MLX5_CMD_OP_ARM_RQ);
+	MLX5_SET(arm_rq_in, srq_in, op_mod, MLX5_ARM_RQ_IN_OP_MOD_SRQ);
+	MLX5_SET(arm_rq_in, srq_in, srq_number, srq->srqn);
+	MLX5_SET(arm_rq_in, srq_in, lwm,      lwm);
 
 	return  mlx5_cmd_exec(dev, srq_in, sizeof(srq_in),
 			      srq_out, sizeof(srq_out));
@@ -435,16 +435,128 @@ static int query_rmp_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
 	return err;
 }
 
+static int create_xrq_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+			  struct mlx5_srq_attr *in)
+{
+	u32 create_out[MLX5_ST_SZ_DW(create_xrq_out)] = {0};
+	void *create_in;
+	void *xrqc;
+	void *wq;
+	int pas_size;
+	int inlen;
+	int err;
+
+	pas_size = get_pas_size(in);
+	inlen = MLX5_ST_SZ_BYTES(create_xrq_in) + pas_size;
+	create_in = kvzalloc(inlen, GFP_KERNEL);
+	if (!create_in)
+		return -ENOMEM;
+
+	xrqc = MLX5_ADDR_OF(create_xrq_in, create_in, xrq_context);
+	wq = MLX5_ADDR_OF(xrqc, xrqc, wq);
+
+	set_wq(wq, in);
+	memcpy(MLX5_ADDR_OF(xrqc, xrqc, wq.pas), in->pas, pas_size);
+
+	if (in->type == IB_SRQT_TM) {
+		MLX5_SET(xrqc, xrqc, topology, MLX5_XRQC_TOPOLOGY_TAG_MATCHING);
+		if (in->flags & MLX5_SRQ_FLAG_RNDV)
+			MLX5_SET(xrqc, xrqc, offload, MLX5_XRQC_OFFLOAD_RNDV);
+		MLX5_SET(xrqc, xrqc,
+			 tag_matching_topology_context.log_matching_list_sz,
+			 in->tm_log_list_size);
+	}
+	MLX5_SET(xrqc, xrqc, user_index, in->user_index);
+	MLX5_SET(xrqc, xrqc, cqn, in->cqn);
+	MLX5_SET(create_xrq_in, create_in, opcode, MLX5_CMD_OP_CREATE_XRQ);
+	err = mlx5_cmd_exec(dev, create_in, inlen, create_out,
+			    sizeof(create_out));
+	kvfree(create_in);
+	if (!err)
+		srq->srqn = MLX5_GET(create_xrq_out, create_out, xrqn);
+
+	return err;
+}
+
+static int destroy_xrq_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq)
+{
+	u32 in[MLX5_ST_SZ_DW(destroy_xrq_in)] = {0};
+	u32 out[MLX5_ST_SZ_DW(destroy_xrq_out)] = {0};
+
+	MLX5_SET(destroy_xrq_in, in, opcode, MLX5_CMD_OP_DESTROY_XRQ);
+	MLX5_SET(destroy_xrq_in, in, xrqn,   srq->srqn);
+
+	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+}
+
+static int arm_xrq_cmd(struct mlx5_core_dev *dev,
+		       struct mlx5_core_srq *srq,
+		       u16 lwm)
+{
+	u32 out[MLX5_ST_SZ_DW(arm_rq_out)] = {0};
+	u32 in[MLX5_ST_SZ_DW(arm_rq_in)] = {0};
+
+	MLX5_SET(arm_rq_in, in, opcode,     MLX5_CMD_OP_ARM_RQ);
+	MLX5_SET(arm_rq_in, in, op_mod,     MLX5_ARM_RQ_IN_OP_MOD_XRQ);
+	MLX5_SET(arm_rq_in, in, srq_number, srq->srqn);
+	MLX5_SET(arm_rq_in, in, lwm,	    lwm);
+
+	return mlx5_cmd_exec(dev, in, sizeof(in), out, sizeof(out));
+}
+
+static int query_xrq_cmd(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
+			 struct mlx5_srq_attr *out)
+{
+	u32 in[MLX5_ST_SZ_DW(query_xrq_in)] = {0};
+	u32 *xrq_out;
+	int outlen = MLX5_ST_SZ_BYTES(query_xrq_out);
+	void *xrqc;
+	int err;
+
+	xrq_out = kvzalloc(outlen, GFP_KERNEL);
+	if (!xrq_out)
+		return -ENOMEM;
+
+	MLX5_SET(query_xrq_in, in, opcode, MLX5_CMD_OP_QUERY_XRQ);
+	MLX5_SET(query_xrq_in, in, xrqn, srq->srqn);
+
+	err = mlx5_cmd_exec(dev, in, sizeof(in), xrq_out, outlen);
+	if (err)
+		goto out;
+
+	xrqc = MLX5_ADDR_OF(query_xrq_out, xrq_out, xrq_context);
+	get_wq(MLX5_ADDR_OF(xrqc, xrqc, wq), out);
+	if (MLX5_GET(xrqc, xrqc, state) != MLX5_XRQC_STATE_GOOD)
+		out->flags |= MLX5_SRQ_FLAG_ERR;
+	out->tm_next_tag =
+		MLX5_GET(xrqc, xrqc,
+			 tag_matching_topology_context.append_next_index);
+	out->tm_hw_phase_cnt =
+		MLX5_GET(xrqc, xrqc,
+			 tag_matching_topology_context.hw_phase_cnt);
+	out->tm_sw_phase_cnt =
+		MLX5_GET(xrqc, xrqc,
+			 tag_matching_topology_context.sw_phase_cnt);
+
+out:
+	kvfree(xrq_out);
+	return err;
+}
+
 static int create_srq_split(struct mlx5_core_dev *dev,
 			    struct mlx5_core_srq *srq,
 			    struct mlx5_srq_attr *in)
 {
 	if (!dev->issi)
 		return create_srq_cmd(dev, srq, in);
-	else if (srq->common.res == MLX5_RES_XSRQ)
+	switch (srq->common.res) {
+	case MLX5_RES_XSRQ:
 		return create_xrc_srq_cmd(dev, srq, in);
-	else
+	case MLX5_RES_XRQ:
+		return create_xrq_cmd(dev, srq, in);
+	default:
 		return create_rmp_cmd(dev, srq, in);
+	}
 }
 
 static int destroy_srq_split(struct mlx5_core_dev *dev,
@@ -452,10 +564,14 @@ static int destroy_srq_split(struct mlx5_core_dev *dev,
 {
 	if (!dev->issi)
 		return destroy_srq_cmd(dev, srq);
-	else if (srq->common.res == MLX5_RES_XSRQ)
+	switch (srq->common.res) {
+	case MLX5_RES_XSRQ:
 		return destroy_xrc_srq_cmd(dev, srq);
-	else
+	case MLX5_RES_XRQ:
+		return destroy_xrq_cmd(dev, srq);
+	default:
 		return destroy_rmp_cmd(dev, srq);
+	}
 }
 
 int mlx5_core_create_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
@@ -464,10 +580,16 @@ int mlx5_core_create_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
 	int err;
 	struct mlx5_srq_table *table = &dev->priv.srq_table;
 
-	if (in->type == IB_SRQT_XRC)
+	switch (in->type) {
+	case IB_SRQT_XRC:
 		srq->common.res = MLX5_RES_XSRQ;
-	else
+		break;
+	case IB_SRQT_TM:
+		srq->common.res = MLX5_RES_XRQ;
+		break;
+	default:
 		srq->common.res = MLX5_RES_SRQ;
+	}
 
 	err = create_srq_split(dev, srq, in);
 	if (err)
@@ -528,10 +650,14 @@ int mlx5_core_query_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
 {
 	if (!dev->issi)
 		return query_srq_cmd(dev, srq, out);
-	else if (srq->common.res == MLX5_RES_XSRQ)
+	switch (srq->common.res) {
+	case MLX5_RES_XSRQ:
 		return query_xrc_srq_cmd(dev, srq, out);
-	else
+	case MLX5_RES_XRQ:
+		return query_xrq_cmd(dev, srq, out);
+	default:
 		return query_rmp_cmd(dev, srq, out);
+	}
 }
 EXPORT_SYMBOL(mlx5_core_query_srq);
 
@@ -540,10 +666,14 @@ int mlx5_core_arm_srq(struct mlx5_core_dev *dev, struct mlx5_core_srq *srq,
 {
 	if (!dev->issi)
 		return arm_srq_cmd(dev, srq, lwm, is_srq);
-	else if (srq->common.res == MLX5_RES_XSRQ)
+	switch (srq->common.res) {
+	case MLX5_RES_XSRQ:
 		return arm_xrc_srq_cmd(dev, srq, lwm);
-	else
+	case MLX5_RES_XRQ:
+		return arm_xrq_cmd(dev, srq, lwm);
+	default:
 		return arm_rmp_cmd(dev, srq, lwm);
+	}
 }
 EXPORT_SYMBOL(mlx5_core_arm_srq);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
index 5abfec1..d653b00 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -897,6 +897,68 @@ int mlx5_modify_nic_vport_promisc(struct mlx5_core_dev *mdev,
 }
 EXPORT_SYMBOL_GPL(mlx5_modify_nic_vport_promisc);
 
+enum {
+	UC_LOCAL_LB,
+	MC_LOCAL_LB
+};
+
+int mlx5_nic_vport_update_local_lb(struct mlx5_core_dev *mdev, bool enable)
+{
+	int inlen = MLX5_ST_SZ_BYTES(modify_nic_vport_context_in);
+	void *in;
+	int err;
+
+	mlx5_core_dbg(mdev, "%s local_lb\n", enable ? "enable" : "disable");
+	in = kvzalloc(inlen, GFP_KERNEL);
+	if (!in)
+		return -ENOMEM;
+
+	MLX5_SET(modify_nic_vport_context_in, in,
+		 field_select.disable_mc_local_lb, 1);
+	MLX5_SET(modify_nic_vport_context_in, in,
+		 nic_vport_context.disable_mc_local_lb, !enable);
+
+	MLX5_SET(modify_nic_vport_context_in, in,
+		 field_select.disable_uc_local_lb, 1);
+	MLX5_SET(modify_nic_vport_context_in, in,
+		 nic_vport_context.disable_uc_local_lb, !enable);
+
+	err = mlx5_modify_nic_vport_context(mdev, in, inlen);
+
+	kvfree(in);
+	return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_nic_vport_update_local_lb);
+
+int mlx5_nic_vport_query_local_lb(struct mlx5_core_dev *mdev, bool *status)
+{
+	int outlen = MLX5_ST_SZ_BYTES(query_nic_vport_context_out);
+	u32 *out;
+	int value;
+	int err;
+
+	out = kzalloc(outlen, GFP_KERNEL);
+	if (!out)
+		return -ENOMEM;
+
+	err = mlx5_query_nic_vport_context(mdev, 0, out, outlen);
+	if (err)
+		goto out;
+
+	value = MLX5_GET(query_nic_vport_context_out, out,
+			 nic_vport_context.disable_mc_local_lb) << MC_LOCAL_LB;
+
+	value |= MLX5_GET(query_nic_vport_context_out, out,
+			  nic_vport_context.disable_uc_local_lb) << UC_LOCAL_LB;
+
+	*status = !value;
+
+out:
+	kfree(out);
+	return err;
+}
+EXPORT_SYMBOL_GPL(mlx5_nic_vport_query_local_lb);
+
 enum mlx5_vport_roce_state {
 	MLX5_VPORT_ROCE_DISABLED = 0,
 	MLX5_VPORT_ROCE_ENABLED  = 1,
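
Note on mlx5_nic_vport_query_local_lb() above: the function packs the two `disable_*_local_lb` bits into one integer and reports local loopback as enabled only when both bits are clear. The following is a userspace sketch of that logic, not driver code. `example_local_lb_enabled` is a hypothetical name, and its two parameters stand in for the values read with MLX5_GET().

```c
#include <assert.h>
#include <stdbool.h>

/* Same bit positions as the anonymous enum added in vport.c. */
enum {
	UC_LOCAL_LB,
	MC_LOCAL_LB
};

/*
 * disable_mc and disable_uc are the raw single-bit fields from the
 * vport context. The result is true only if neither is set.
 */
static inline bool example_local_lb_enabled(unsigned int disable_mc,
					    unsigned int disable_uc)
{
	unsigned int value;

	value = disable_mc << MC_LOCAL_LB;
	value |= disable_uc << UC_LOCAL_LB;

	return !value;
}
```

With this definition a port where only multicast loopback is disabled is still reported as "not enabled", which matches how the self-test setup then re-enables both kinds at once.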
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 60bf8f2..c6a3e61b 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4139,6 +4139,8 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
 			return -EINVAL;
 		if (!info->linking)
 			break;
+		if (netdev_has_any_upper_dev(upper_dev))
+			return -EINVAL;
 		if (netif_is_lag_master(upper_dev) &&
 		    !mlxsw_sp_master_lag_check(mlxsw_sp, upper_dev,
 					       info->upper_info))
@@ -4258,6 +4260,10 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
 		upper_dev = info->upper_dev;
 		if (!netif_is_bridge_master(upper_dev))
 			return -EINVAL;
+		if (!info->linking)
+			break;
+		if (netdev_has_any_upper_dev(upper_dev))
+			return -EINVAL;
 		break;
 	case NETDEV_CHANGEUPPER:
 		upper_dev = info->upper_dev;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 5eb1606..d39ffbf 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -705,6 +705,7 @@ static int mlxsw_sp_port_attr_mc_router_set(struct mlxsw_sp_port *mlxsw_sp_port,
 					    bool is_port_mc_router)
 {
 	struct mlxsw_sp_bridge_port *bridge_port;
+	int err;
 
 	if (switchdev_trans_ph_prepare(trans))
 		return 0;
@@ -715,11 +716,17 @@ static int mlxsw_sp_port_attr_mc_router_set(struct mlxsw_sp_port *mlxsw_sp_port,
 		return 0;
 
 	if (!bridge_port->bridge_device->multicast_enabled)
-		return 0;
+		goto out;
 
-	return mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, bridge_port,
-						    MLXSW_SP_FLOOD_TYPE_MC,
-						    is_port_mc_router);
+	err = mlxsw_sp_bridge_port_flood_table_set(mlxsw_sp_port, bridge_port,
+						   MLXSW_SP_FLOOD_TYPE_MC,
+						   is_port_mc_router);
+	if (err)
+		return err;
+
+out:
+	bridge_port->mrouter = is_port_mc_router;
+	return 0;
 }
 
 static int mlxsw_sp_port_mc_disabled_set(struct mlxsw_sp_port *mlxsw_sp_port,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/match.c b/drivers/net/ethernet/netronome/nfp/flower/match.c
index 0e08404..d25b503 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/match.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/match.c
@@ -42,33 +42,29 @@ nfp_flower_compile_meta_tci(struct nfp_flower_meta_two *frame,
 			    struct tc_cls_flower_offload *flow, u8 key_type,
 			    bool mask_version)
 {
+	struct fl_flow_key *target = mask_version ? flow->mask : flow->key;
 	struct flow_dissector_key_vlan *flow_vlan;
 	u16 tmp_tci;
 
+	memset(frame, 0, sizeof(struct nfp_flower_meta_two));
 	/* Populate the metadata frame. */
 	frame->nfp_flow_key_layer = key_type;
 	frame->mask_id = ~0;
 
-	if (mask_version) {
-		frame->tci = cpu_to_be16(~0);
-		return;
+	if (dissector_uses_key(flow->dissector, FLOW_DISSECTOR_KEY_VLAN)) {
+		flow_vlan = skb_flow_dissector_target(flow->dissector,
+						      FLOW_DISSECTOR_KEY_VLAN,
+						      target);
+		/* Populate the tci field. */
+		if (flow_vlan->vlan_id) {
+			tmp_tci = FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO,
+					     flow_vlan->vlan_priority) |
+				  FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID,
+					     flow_vlan->vlan_id) |
+				  NFP_FLOWER_MASK_VLAN_CFI;
+			frame->tci = cpu_to_be16(tmp_tci);
+		}
 	}
-
-	flow_vlan = skb_flow_dissector_target(flow->dissector,
-					      FLOW_DISSECTOR_KEY_VLAN,
-					      flow->key);
-
-	/* Populate the tci field. */
-	if (!flow_vlan->vlan_id) {
-		tmp_tci = 0;
-	} else {
-		tmp_tci = FIELD_PREP(NFP_FLOWER_MASK_VLAN_PRIO,
-				     flow_vlan->vlan_priority) |
-			  FIELD_PREP(NFP_FLOWER_MASK_VLAN_VID,
-				     flow_vlan->vlan_id) |
-			  NFP_FLOWER_MASK_VLAN_CFI;
-	}
-	frame->tci = cpu_to_be16(tmp_tci);
 }
 
 static void
@@ -99,17 +95,18 @@ nfp_flower_compile_mac(struct nfp_flower_mac_mpls *frame,
 		       bool mask_version)
 {
 	struct fl_flow_key *target = mask_version ? flow->mask : flow->key;
-	struct flow_dissector_key_eth_addrs *flow_mac;
-
-	flow_mac = skb_flow_dissector_target(flow->dissector,
-					     FLOW_DISSECTOR_KEY_ETH_ADDRS,
-					     target);
+	struct flow_dissector_key_eth_addrs *addr;
 
 	memset(frame, 0, sizeof(struct nfp_flower_mac_mpls));
 
-	/* Populate mac frame. */
-	ether_addr_copy(frame->mac_dst, &flow_mac->dst[0]);
-	ether_addr_copy(frame->mac_src, &flow_mac->src[0]);
+	if (dissector_uses_key(flow->dissector, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
+		addr = skb_flow_dissector_target(flow->dissector,
+						 FLOW_DISSECTOR_KEY_ETH_ADDRS,
+						 target);
+		/* Populate mac frame. */
+		ether_addr_copy(frame->mac_dst, &addr->dst[0]);
+		ether_addr_copy(frame->mac_src, &addr->src[0]);
+	}
 
 	if (mask_version)
 		frame->mpls_lse = cpu_to_be32(~0);
@@ -121,14 +118,17 @@ nfp_flower_compile_tport(struct nfp_flower_tp_ports *frame,
 			 bool mask_version)
 {
 	struct fl_flow_key *target = mask_version ? flow->mask : flow->key;
-	struct flow_dissector_key_ports *flow_tp;
+	struct flow_dissector_key_ports *tp;
 
-	flow_tp = skb_flow_dissector_target(flow->dissector,
-					    FLOW_DISSECTOR_KEY_PORTS,
-					    target);
+	memset(frame, 0, sizeof(struct nfp_flower_tp_ports));
 
-	frame->port_src = flow_tp->src;
-	frame->port_dst = flow_tp->dst;
+	if (dissector_uses_key(flow->dissector, FLOW_DISSECTOR_KEY_PORTS)) {
+		tp = skb_flow_dissector_target(flow->dissector,
+					       FLOW_DISSECTOR_KEY_PORTS,
+					       target);
+		frame->port_src = tp->src;
+		frame->port_dst = tp->dst;
+	}
 }
 
 static void
@@ -137,25 +137,27 @@ nfp_flower_compile_ipv4(struct nfp_flower_ipv4 *frame,
 			bool mask_version)
 {
 	struct fl_flow_key *target = mask_version ? flow->mask : flow->key;
-	struct flow_dissector_key_ipv4_addrs *flow_ipv4;
-	struct flow_dissector_key_basic *flow_basic;
+	struct flow_dissector_key_ipv4_addrs *addr;
+	struct flow_dissector_key_basic *basic;
 
-	flow_ipv4 = skb_flow_dissector_target(flow->dissector,
-					      FLOW_DISSECTOR_KEY_IPV4_ADDRS,
-					      target);
-
-	flow_basic = skb_flow_dissector_target(flow->dissector,
-					       FLOW_DISSECTOR_KEY_BASIC,
-					       target);
-
-	/* Populate IPv4 frame. */
-	frame->reserved = 0;
-	frame->ipv4_src = flow_ipv4->src;
-	frame->ipv4_dst = flow_ipv4->dst;
-	frame->proto = flow_basic->ip_proto;
 	/* Wildcard TOS/TTL for now. */
-	frame->tos = 0;
-	frame->ttl = 0;
+	memset(frame, 0, sizeof(struct nfp_flower_ipv4));
+
+	if (dissector_uses_key(flow->dissector,
+			       FLOW_DISSECTOR_KEY_IPV4_ADDRS)) {
+		addr = skb_flow_dissector_target(flow->dissector,
+						 FLOW_DISSECTOR_KEY_IPV4_ADDRS,
+						 target);
+		frame->ipv4_src = addr->src;
+		frame->ipv4_dst = addr->dst;
+	}
+
+	if (dissector_uses_key(flow->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
+		basic = skb_flow_dissector_target(flow->dissector,
+						  FLOW_DISSECTOR_KEY_BASIC,
+						  target);
+		frame->proto = basic->ip_proto;
+	}
 }
 
 static void
@@ -164,26 +166,27 @@ nfp_flower_compile_ipv6(struct nfp_flower_ipv6 *frame,
 			bool mask_version)
 {
 	struct fl_flow_key *target = mask_version ? flow->mask : flow->key;
-	struct flow_dissector_key_ipv6_addrs *flow_ipv6;
-	struct flow_dissector_key_basic *flow_basic;
+	struct flow_dissector_key_ipv6_addrs *addr;
+	struct flow_dissector_key_basic *basic;
 
-	flow_ipv6 = skb_flow_dissector_target(flow->dissector,
-					      FLOW_DISSECTOR_KEY_IPV6_ADDRS,
-					      target);
-
-	flow_basic = skb_flow_dissector_target(flow->dissector,
-					       FLOW_DISSECTOR_KEY_BASIC,
-					       target);
-
-	/* Populate IPv6 frame. */
-	frame->reserved = 0;
-	frame->ipv6_src = flow_ipv6->src;
-	frame->ipv6_dst = flow_ipv6->dst;
-	frame->proto = flow_basic->ip_proto;
 	/* Wildcard LABEL/TOS/TTL for now. */
-	frame->ipv6_flow_label_exthdr = 0;
-	frame->tos = 0;
-	frame->ttl = 0;
+	memset(frame, 0, sizeof(struct nfp_flower_ipv6));
+
+	if (dissector_uses_key(flow->dissector,
+			       FLOW_DISSECTOR_KEY_IPV6_ADDRS)) {
+		addr = skb_flow_dissector_target(flow->dissector,
+						 FLOW_DISSECTOR_KEY_IPV6_ADDRS,
+						 target);
+		frame->ipv6_src = addr->src;
+		frame->ipv6_dst = addr->dst;
+	}
+
+	if (dissector_uses_key(flow->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
+		basic = skb_flow_dissector_target(flow->dissector,
+						  FLOW_DISSECTOR_KEY_BASIC,
+						  target);
+		frame->proto = basic->ip_proto;
+	}
 }
 
 int nfp_flower_compile_flow_match(struct tc_cls_flower_offload *flow,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 4ad10bd..74a96d6 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -105,43 +105,62 @@ static int
 nfp_flower_calculate_key_layers(struct nfp_fl_key_ls *ret_key_ls,
 				struct tc_cls_flower_offload *flow)
 {
-	struct flow_dissector_key_control *mask_enc_ctl;
-	struct flow_dissector_key_basic *mask_basic;
-	struct flow_dissector_key_basic *key_basic;
+	struct flow_dissector_key_basic *mask_basic = NULL;
+	struct flow_dissector_key_basic *key_basic = NULL;
+	struct flow_dissector_key_ip *mask_ip = NULL;
 	u32 key_layer_two;
 	u8 key_layer;
 	int key_size;
 
-	mask_enc_ctl = skb_flow_dissector_target(flow->dissector,
-						 FLOW_DISSECTOR_KEY_ENC_CONTROL,
-						 flow->mask);
+	if (dissector_uses_key(flow->dissector,
+			       FLOW_DISSECTOR_KEY_ENC_CONTROL)) {
+		struct flow_dissector_key_control *mask_enc_ctl =
+			skb_flow_dissector_target(flow->dissector,
+						  FLOW_DISSECTOR_KEY_ENC_CONTROL,
+						  flow->mask);
+		/* We are expecting a tunnel. For now we ignore offloading. */
+		if (mask_enc_ctl->addr_type)
+			return -EOPNOTSUPP;
+	}
 
-	mask_basic = skb_flow_dissector_target(flow->dissector,
-					       FLOW_DISSECTOR_KEY_BASIC,
-					       flow->mask);
+	if (dissector_uses_key(flow->dissector, FLOW_DISSECTOR_KEY_BASIC)) {
+		mask_basic = skb_flow_dissector_target(flow->dissector,
+						       FLOW_DISSECTOR_KEY_BASIC,
+						       flow->mask);
 
-	key_basic = skb_flow_dissector_target(flow->dissector,
-					      FLOW_DISSECTOR_KEY_BASIC,
-					      flow->key);
+		key_basic = skb_flow_dissector_target(flow->dissector,
+						      FLOW_DISSECTOR_KEY_BASIC,
+						      flow->key);
+	}
+
+	if (dissector_uses_key(flow->dissector, FLOW_DISSECTOR_KEY_IP))
+		mask_ip = skb_flow_dissector_target(flow->dissector,
+						    FLOW_DISSECTOR_KEY_IP,
+						    flow->mask);
+
 	key_layer_two = 0;
 	key_layer = NFP_FLOWER_LAYER_PORT | NFP_FLOWER_LAYER_MAC;
 	key_size = sizeof(struct nfp_flower_meta_one) +
 		   sizeof(struct nfp_flower_in_port) +
 		   sizeof(struct nfp_flower_mac_mpls);
 
-	/* We are expecting a tunnel. For now we ignore offloading. */
-	if (mask_enc_ctl->addr_type)
-		return -EOPNOTSUPP;
-
-	if (mask_basic->n_proto) {
+	if (mask_basic && mask_basic->n_proto) {
 		/* Ethernet type is present in the key. */
 		switch (key_basic->n_proto) {
 		case cpu_to_be16(ETH_P_IP):
+			if (mask_ip && mask_ip->tos)
+				return -EOPNOTSUPP;
+			if (mask_ip && mask_ip->ttl)
+				return -EOPNOTSUPP;
 			key_layer |= NFP_FLOWER_LAYER_IPV4;
 			key_size += sizeof(struct nfp_flower_ipv4);
 			break;
 
 		case cpu_to_be16(ETH_P_IPV6):
+			if (mask_ip && mask_ip->tos)
+				return -EOPNOTSUPP;
+			if (mask_ip && mask_ip->ttl)
+				return -EOPNOTSUPP;
 			key_layer |= NFP_FLOWER_LAYER_IPV6;
 			key_size += sizeof(struct nfp_flower_ipv6);
 			break;
@@ -152,6 +171,11 @@ nfp_flower_calculate_key_layers(struct nfp_fl_key_ls *ret_key_ls,
 		case cpu_to_be16(ETH_P_ARP):
 			return -EOPNOTSUPP;
 
+		/* Currently we do not offload MPLS. */
+		case cpu_to_be16(ETH_P_MPLS_UC):
+		case cpu_to_be16(ETH_P_MPLS_MC):
+			return -EOPNOTSUPP;
+
 		/* Will be included in layer 2. */
 		case cpu_to_be16(ETH_P_8021Q):
 			break;
@@ -166,7 +190,7 @@ nfp_flower_calculate_key_layers(struct nfp_fl_key_ls *ret_key_ls,
 		}
 	}
 
-	if (mask_basic->ip_proto) {
+	if (mask_basic && mask_basic->ip_proto) {
 		/* Ethernet type is present in the key. */
 		switch (key_basic->ip_proto) {
 		case IPPROTO_TCP:
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_main.c b/drivers/net/ethernet/netronome/nfp/nfp_main.c
index d67969d..3f199db2 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_main.c
@@ -98,21 +98,20 @@ static int nfp_pcie_sriov_enable(struct pci_dev *pdev, int num_vfs)
 	struct nfp_pf *pf = pci_get_drvdata(pdev);
 	int err;
 
-	mutex_lock(&pf->lock);
-
 	if (num_vfs > pf->limit_vfs) {
 		nfp_info(pf->cpp, "Firmware limits number of VFs to %u\n",
 			 pf->limit_vfs);
-		err = -EINVAL;
-		goto err_unlock;
+		return -EINVAL;
 	}
 
 	err = pci_enable_sriov(pdev, num_vfs);
 	if (err) {
 		dev_warn(&pdev->dev, "Failed to enable PCI SR-IOV: %d\n", err);
-		goto err_unlock;
+		return err;
 	}
 
+	mutex_lock(&pf->lock);
+
 	err = nfp_app_sriov_enable(pf->app, num_vfs);
 	if (err) {
 		dev_warn(&pdev->dev,
@@ -129,9 +128,8 @@ static int nfp_pcie_sriov_enable(struct pci_dev *pdev, int num_vfs)
 	return num_vfs;
 
 err_sriov_disable:
-	pci_disable_sriov(pdev);
-err_unlock:
 	mutex_unlock(&pf->lock);
+	pci_disable_sriov(pdev);
 	return err;
 #endif
 	return 0;
@@ -158,10 +156,10 @@ static int nfp_pcie_sriov_disable(struct pci_dev *pdev)
 
 	pf->num_vfs = 0;
 
+	mutex_unlock(&pf->lock);
+
 	pci_disable_sriov(pdev);
 	dev_dbg(&pdev->dev, "Removed VFs.\n");
-
-	mutex_unlock(&pf->lock);
 #endif
 	return 0;
 }
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 9f77ce0..66a09e4 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -895,6 +895,8 @@ static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev)
 
 	netdev_tx_sent_queue(nd_q, txbuf->real_len);
 
+	skb_tx_timestamp(skb);
+
 	tx_ring->wr_p += nr_frags + 1;
 	if (nfp_net_tx_ring_should_stop(tx_ring))
 		nfp_net_tx_ring_stop(nd_q, tx_ring);
@@ -903,8 +905,6 @@ static int nfp_net_tx(struct sk_buff *skb, struct net_device *netdev)
 	if (!skb->xmit_more || netif_xmit_stopped(nd_q))
 		nfp_net_tx_xmit_more_flush(tx_ring);
 
-	skb_tx_timestamp(skb);
-
 	return NETDEV_TX_OK;
 
 err_unmap:
@@ -1751,6 +1751,10 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 			continue;
 		}
 
+		nfp_net_dma_unmap_rx(dp, rxbuf->dma_addr);
+
+		nfp_net_rx_give_one(dp, rx_ring, new_frag, new_dma_addr);
+
 		if (likely(!meta.portid)) {
 			netdev = dp->netdev;
 		} else {
@@ -1759,16 +1763,12 @@ static int nfp_net_rx(struct nfp_net_rx_ring *rx_ring, int budget)
 			nn = netdev_priv(dp->netdev);
 			netdev = nfp_app_repr_get(nn->app, meta.portid);
 			if (unlikely(!netdev)) {
-				nfp_net_rx_drop(dp, r_vec, rx_ring, rxbuf, skb);
+				nfp_net_rx_drop(dp, r_vec, rx_ring, NULL, skb);
 				continue;
 			}
 			nfp_repr_inc_rx_stats(netdev, pkt_len);
 		}
 
-		nfp_net_dma_unmap_rx(dp, rxbuf->dma_addr);
-
-		nfp_net_rx_give_one(dp, rx_ring, new_frag, new_dma_addr);
-
 		skb_reserve(skb, pkt_off);
 		skb_put(skb, pkt_len);
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
index 5797dbf..34b9853 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_main.c
@@ -456,13 +456,9 @@ static int nfp_net_pf_app_start(struct nfp_pf *pf)
 {
 	int err;
 
-	err = nfp_net_pf_app_start_ctrl(pf);
-	if (err)
-		return err;
-
 	err = nfp_app_start(pf->app, pf->ctrl_vnic);
 	if (err)
-		goto err_ctrl_stop;
+		return err;
 
 	if (pf->num_vfs) {
 		err = nfp_app_sriov_enable(pf->app, pf->num_vfs);
@@ -474,8 +470,6 @@ static int nfp_net_pf_app_start(struct nfp_pf *pf)
 
 err_app_stop:
 	nfp_app_stop(pf->app);
-err_ctrl_stop:
-	nfp_net_pf_app_stop_ctrl(pf);
 	return err;
 }
 
@@ -484,7 +478,6 @@ static void nfp_net_pf_app_stop(struct nfp_pf *pf)
 	if (pf->num_vfs)
 		nfp_app_sriov_disable(pf->app);
 	nfp_app_stop(pf->app);
-	nfp_net_pf_app_stop_ctrl(pf);
 }
 
 static void nfp_net_pci_unmap_mem(struct nfp_pf *pf)
@@ -559,7 +552,7 @@ static int nfp_net_pci_map_mem(struct nfp_pf *pf)
 
 static void nfp_net_pci_remove_finish(struct nfp_pf *pf)
 {
-	nfp_net_pf_app_stop(pf);
+	nfp_net_pf_app_stop_ctrl(pf);
 	/* stop app first, to avoid double free of ctrl vNIC's ddir */
 	nfp_net_debugfs_dir_clean(&pf->ddir);
 
@@ -690,6 +683,7 @@ int nfp_net_pci_probe(struct nfp_pf *pf)
 {
 	struct nfp_net_fw_version fw_ver;
 	u8 __iomem *ctrl_bar, *qc_bar;
+	struct nfp_net *nn;
 	int stride;
 	int err;
 
@@ -766,7 +760,7 @@ int nfp_net_pci_probe(struct nfp_pf *pf)
 	if (err)
 		goto err_free_vnics;
 
-	err = nfp_net_pf_app_start(pf);
+	err = nfp_net_pf_app_start_ctrl(pf);
 	if (err)
 		goto err_free_irqs;
 
@@ -774,12 +768,20 @@ int nfp_net_pci_probe(struct nfp_pf *pf)
 	if (err)
 		goto err_stop_app;
 
+	err = nfp_net_pf_app_start(pf);
+	if (err)
+		goto err_clean_vnics;
+
 	mutex_unlock(&pf->lock);
 
 	return 0;
 
+err_clean_vnics:
+	list_for_each_entry(nn, &pf->vnics, vnic_list)
+		if (nfp_net_is_data_vnic(nn))
+			nfp_net_pf_clean_vnic(pf, nn);
 err_stop_app:
-	nfp_net_pf_app_stop(pf);
+	nfp_net_pf_app_stop_ctrl(pf);
 err_free_irqs:
 	nfp_net_pf_free_irqs(pf);
 err_free_vnics:
@@ -803,6 +805,8 @@ void nfp_net_pci_remove(struct nfp_pf *pf)
 	if (list_empty(&pf->vnics))
 		goto out;
 
+	nfp_net_pf_app_stop(pf);
+
 	list_for_each_entry(nn, &pf->vnics, vnic_list)
 		if (nfp_net_is_data_vnic(nn))
 			nfp_net_pf_clean_vnic(pf, nn);
diff --git a/drivers/net/ethernet/qlogic/qlge/qlge_dbg.c b/drivers/net/ethernet/qlogic/qlge/qlge_dbg.c
index 28ea0af..e3223f2 100644
--- a/drivers/net/ethernet/qlogic/qlge/qlge_dbg.c
+++ b/drivers/net/ethernet/qlogic/qlge/qlge_dbg.c
@@ -724,7 +724,7 @@ static void ql_build_coredump_seg_header(
 	seg_hdr->cookie = MPI_COREDUMP_COOKIE;
 	seg_hdr->segNum = seg_number;
 	seg_hdr->segSize = seg_size;
-	memcpy(seg_hdr->description, desc, (sizeof(seg_hdr->description)) - 1);
+	strncpy(seg_hdr->description, desc, (sizeof(seg_hdr->description)) - 1);
 }
 
 /*
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index bd07a15..e03fcf9 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -6863,8 +6863,7 @@ static void rtl8169_tx_clear_range(struct rtl8169_private *tp, u32 start,
 			rtl8169_unmap_tx_skb(&tp->pci_dev->dev, tx_skb,
 					     tp->TxDescArray + entry);
 			if (skb) {
-				tp->dev->stats.tx_dropped++;
-				dev_kfree_skb_any(skb);
+				dev_consume_skb_any(skb);
 				tx_skb->skb = NULL;
 			}
 		}
@@ -7319,7 +7318,7 @@ static void rtl_tx(struct net_device *dev, struct rtl8169_private *tp)
 			tp->tx_stats.packets++;
 			tp->tx_stats.bytes += tx_skb->skb->len;
 			u64_stats_update_end(&tp->tx_stats.syncp);
-			dev_kfree_skb_any(tx_skb->skb);
+			dev_consume_skb_any(tx_skb->skb);
 			tx_skb->skb = NULL;
 		}
 		dirty_tx++;
diff --git a/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c b/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
index 73427e2..fbd00cb 100644
--- a/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
+++ b/drivers/net/ethernet/samsung/sxgbe/sxgbe_platform.c
@@ -47,6 +47,8 @@ static int sxgbe_probe_config_dt(struct platform_device *pdev,
 	plat->mdio_bus_data = devm_kzalloc(&pdev->dev,
 					   sizeof(*plat->mdio_bus_data),
 					   GFP_KERNEL);
+	if (!plat->mdio_bus_data)
+		return -ENOMEM;
 
 	dma_cfg = devm_kzalloc(&pdev->dev, sizeof(*dma_cfg), GFP_KERNEL);
 	if (!dma_cfg)
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
index 17d4bba..6e35957 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c
@@ -269,7 +269,10 @@ static int socfpga_dwmac_set_phy_mode(struct socfpga_dwmac *dwmac)
 	ctrl &= ~(SYSMGR_EMACGRP_CTRL_PHYSEL_MASK << reg_shift);
 	ctrl |= val << reg_shift;
 
-	if (dwmac->f2h_ptp_ref_clk) {
+	if (dwmac->f2h_ptp_ref_clk ||
+	    phymode == PHY_INTERFACE_MODE_MII ||
+	    phymode == PHY_INTERFACE_MODE_GMII ||
+	    phymode == PHY_INTERFACE_MODE_SGMII) {
 		ctrl |= SYSMGR_EMACGRP_CTRL_PTP_REF_CLK_MASK << (reg_shift / 2);
 		regmap_read(sys_mgr_base_addr, SYSMGR_FPGAGRP_MODULE_REG,
 			    &module);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
index fffd6d5..39c2122 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-sun8i.c
@@ -979,14 +979,6 @@ static int sun8i_dwmac_probe(struct platform_device *pdev)
 }
 
 static const struct of_device_id sun8i_dwmac_match[] = {
-	{ .compatible = "allwinner,sun8i-h3-emac",
-		.data = &emac_variant_h3 },
-	{ .compatible = "allwinner,sun8i-v3s-emac",
-		.data = &emac_variant_v3s },
-	{ .compatible = "allwinner,sun8i-a83t-emac",
-		.data = &emac_variant_a83t },
-	{ .compatible = "allwinner,sun50i-a64-emac",
-		.data = &emac_variant_a64 },
 	{ }
 };
 MODULE_DEVICE_TABLE(of, sun8i_dwmac_match);
diff --git a/drivers/net/ethernet/ti/cpsw-common.c b/drivers/net/ethernet/ti/cpsw-common.c
index 56ba411..38d1cc5 100644
--- a/drivers/net/ethernet/ti/cpsw-common.c
+++ b/drivers/net/ethernet/ti/cpsw-common.c
@@ -96,7 +96,7 @@ int ti_cm_get_macid(struct device *dev, int slave, u8 *mac_addr)
 	if (of_machine_is_compatible("ti,dra7"))
 		return davinci_emac_3517_get_macid(dev, 0x514, slave, mac_addr);
 
-	dev_err(dev, "incompatible machine/device type for reading mac address\n");
+	dev_info(dev, "incompatible machine/device type for reading mac address\n");
 	return -ENOENT;
 }
 EXPORT_SYMBOL_GPL(ti_cm_get_macid);
diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index 0d78727..d91cbc6 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -1269,7 +1269,12 @@ static void netvsc_link_change(struct work_struct *w)
 	bool notify = false, reschedule = false;
 	unsigned long flags, next_reconfig, delay;
 
-	rtnl_lock();
+	/* if changes are happening, come back later */
+	if (!rtnl_trylock()) {
+		schedule_delayed_work(&ndev_ctx->dwork, LINKCHANGE_INT);
+		return;
+	}
+
 	net_device = rtnl_dereference(ndev_ctx->nvdev);
 	if (!net_device)
 		goto out_unlock;
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index 5e1ab11..98e4dea 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -3521,6 +3521,7 @@ module_init(macsec_init);
 module_exit(macsec_exit);
 
 MODULE_ALIAS_RTNL_LINK("macsec");
+MODULE_ALIAS_GENL_FAMILY("macsec");
 
 MODULE_DESCRIPTION("MACsec IEEE 802.1AE");
 MODULE_LICENSE("GPL v2");
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index 5068c58..d0626bf 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -749,9 +749,6 @@ void phy_stop_machine(struct phy_device *phydev)
 	if (phydev->state > PHY_UP && phydev->state != PHY_HALTED)
 		phydev->state = PHY_UP;
 	mutex_unlock(&phydev->lock);
-
-	/* Now we can run the state machine synchronously */
-	phy_state_machine(&phydev->state_queue.work);
 }
 
 /**
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 1790f7f..2f742ae 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -864,15 +864,17 @@ EXPORT_SYMBOL(phy_attached_info);
 #define ATTACHED_FMT "attached PHY driver [%s] (mii_bus:phy_addr=%s, irq=%d)"
 void phy_attached_print(struct phy_device *phydev, const char *fmt, ...)
 {
+	const char *drv_name = phydev->drv ? phydev->drv->name : "unbound";
+
 	if (!fmt) {
 		dev_info(&phydev->mdio.dev, ATTACHED_FMT "\n",
-			 phydev->drv->name, phydev_name(phydev),
+			 drv_name, phydev_name(phydev),
 			 phydev->irq);
 	} else {
 		va_list ap;
 
 		dev_info(&phydev->mdio.dev, ATTACHED_FMT,
-			 phydev->drv->name, phydev_name(phydev),
+			 drv_name, phydev_name(phydev),
 			 phydev->irq);
 
 		va_start(ap, fmt);
diff --git a/drivers/net/usb/cdc_ncm.c b/drivers/net/usb/cdc_ncm.c
index 8f572b9..9c80e80 100644
--- a/drivers/net/usb/cdc_ncm.c
+++ b/drivers/net/usb/cdc_ncm.c
@@ -1758,6 +1758,13 @@ static const struct usb_device_id cdc_devs[] = {
 	  .driver_info = (unsigned long)&wwan_noarp_info,
 	},
 
+	/* u-blox TOBY-L4 */
+	{ USB_DEVICE_AND_INTERFACE_INFO(0x1546, 0x1010,
+		USB_CLASS_COMM,
+		USB_CDC_SUBCLASS_NCM, USB_CDC_PROTO_NONE),
+	  .driver_info = (unsigned long)&wwan_info,
+	},
+
 	/* Generic CDC-NCM devices */
 	{ USB_INTERFACE_INFO(USB_CLASS_COMM,
 		USB_CDC_SUBCLASS_NCM, USB_CDC_PROTO_NONE),
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 98f17b0..b06169e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1058,7 +1058,7 @@ static void free_old_xmit_skbs(struct send_queue *sq)
 		bytes += skb->len;
 		packets++;
 
-		dev_kfree_skb_any(skb);
+		dev_consume_skb_any(skb);
 	}
 
 	/* Avoid overhead when no packets have been processed
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
index fa315d8..a1ea9ef 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
@@ -787,6 +787,8 @@ int iwl_pci_fw_enter_d0i3(struct iwl_trans *trans);
 
 void iwl_pcie_enable_rx_wake(struct iwl_trans *trans, bool enable);
 
+void iwl_pcie_rx_allocator_work(struct work_struct *data);
+
 /* common functions that are used by gen2 transport */
 void iwl_pcie_apm_config(struct iwl_trans *trans);
 int iwl_pcie_prepare_card_hw(struct iwl_trans *trans);
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
index 351c442..942736d 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/rx.c
@@ -597,7 +597,7 @@ static void iwl_pcie_rx_allocator_get(struct iwl_trans *trans,
 	rxq->free_count += RX_CLAIM_REQ_ALLOC;
 }
 
-static void iwl_pcie_rx_allocator_work(struct work_struct *data)
+void iwl_pcie_rx_allocator_work(struct work_struct *data)
 {
 	struct iwl_rb_allocator *rba_p =
 		container_of(data, struct iwl_rb_allocator, rx_alloc);
@@ -900,10 +900,6 @@ static int _iwl_pcie_rx_init(struct iwl_trans *trans)
 			return err;
 	}
 	def_rxq = trans_pcie->rxq;
-	if (!rba->alloc_wq)
-		rba->alloc_wq = alloc_workqueue("rb_allocator",
-						WQ_HIGHPRI | WQ_UNBOUND, 1);
-	INIT_WORK(&rba->rx_alloc, iwl_pcie_rx_allocator_work);
 
 	spin_lock(&rba->lock);
 	atomic_set(&rba->req_pending, 0);
@@ -1017,10 +1013,6 @@ void iwl_pcie_rx_free(struct iwl_trans *trans)
 	}
 
 	cancel_work_sync(&rba->rx_alloc);
-	if (rba->alloc_wq) {
-		destroy_workqueue(rba->alloc_wq);
-		rba->alloc_wq = NULL;
-	}
 
 	iwl_pcie_free_rbs_pool(trans);
 
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
index f95eec5..3927bbf 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
@@ -1786,6 +1786,11 @@ void iwl_trans_pcie_free(struct iwl_trans *trans)
 		iwl_pcie_tx_free(trans);
 	iwl_pcie_rx_free(trans);
 
+	if (trans_pcie->rba.alloc_wq) {
+		destroy_workqueue(trans_pcie->rba.alloc_wq);
+		trans_pcie->rba.alloc_wq = NULL;
+	}
+
 	if (trans_pcie->msix_enabled) {
 		for (i = 0; i < trans_pcie->alloc_vecs; i++) {
 			irq_set_affinity_hint(
@@ -3169,6 +3174,10 @@ struct iwl_trans *iwl_trans_pcie_alloc(struct pci_dev *pdev,
 		trans_pcie->inta_mask = CSR_INI_SET_MASK;
 	 }
 
+	trans_pcie->rba.alloc_wq = alloc_workqueue("rb_allocator",
+						   WQ_HIGHPRI | WQ_UNBOUND, 1);
+	INIT_WORK(&trans_pcie->rba.rx_alloc, iwl_pcie_rx_allocator_work);
+
 #ifdef CONFIG_IWLWIFI_PCIE_RTPM
 	trans->runtime_pm_mode = IWL_PLAT_PM_MODE_D0I3;
 #else
diff --git a/drivers/net/wireless/ti/wl1251/main.c b/drivers/net/wireless/ti/wl1251/main.c
index 08f0477..9915d83 100644
--- a/drivers/net/wireless/ti/wl1251/main.c
+++ b/drivers/net/wireless/ti/wl1251/main.c
@@ -1571,6 +1571,7 @@ struct ieee80211_hw *wl1251_alloc_hw(void)
 
 	wl->state = WL1251_STATE_OFF;
 	mutex_init(&wl->mutex);
+	spin_lock_init(&wl->wl_lock);
 
 	wl->tx_mgmt_frm_rate = DEFAULT_HW_GEN_TX_RATE;
 	wl->tx_mgmt_frm_mod = DEFAULT_HW_GEN_MODULATION_TYPE;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 925467b..ea892e7 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -109,6 +109,7 @@ struct nvme_dev {
 	/* host memory buffer support: */
 	u64 host_mem_size;
 	u32 nr_host_mem_descs;
+	dma_addr_t host_mem_descs_dma;
 	struct nvme_host_mem_buf_desc *host_mem_descs;
 	void **host_mem_desc_bufs;
 };
@@ -1565,16 +1566,10 @@ static inline void nvme_release_cmb(struct nvme_dev *dev)
 
 static int nvme_set_host_mem(struct nvme_dev *dev, u32 bits)
 {
-	size_t len = dev->nr_host_mem_descs * sizeof(*dev->host_mem_descs);
+	u64 dma_addr = dev->host_mem_descs_dma;
 	struct nvme_command c;
-	u64 dma_addr;
 	int ret;
 
-	dma_addr = dma_map_single(dev->dev, dev->host_mem_descs, len,
-			DMA_TO_DEVICE);
-	if (dma_mapping_error(dev->dev, dma_addr))
-		return -ENOMEM;
-
 	memset(&c, 0, sizeof(c));
 	c.features.opcode	= nvme_admin_set_features;
 	c.features.fid		= cpu_to_le32(NVME_FEAT_HOST_MEM_BUF);
@@ -1591,7 +1586,6 @@ static int nvme_set_host_mem(struct nvme_dev *dev, u32 bits)
 			 "failed to set host mem (err %d, flags %#x).\n",
 			 ret, bits);
 	}
-	dma_unmap_single(dev->dev, dma_addr, len, DMA_TO_DEVICE);
 	return ret;
 }
 
@@ -1609,7 +1603,9 @@ static void nvme_free_host_mem(struct nvme_dev *dev)
 
 	kfree(dev->host_mem_desc_bufs);
 	dev->host_mem_desc_bufs = NULL;
-	kfree(dev->host_mem_descs);
+	dma_free_coherent(dev->dev,
+			dev->nr_host_mem_descs * sizeof(*dev->host_mem_descs),
+			dev->host_mem_descs, dev->host_mem_descs_dma);
 	dev->host_mem_descs = NULL;
 }
 
@@ -1617,6 +1613,7 @@ static int nvme_alloc_host_mem(struct nvme_dev *dev, u64 min, u64 preferred)
 {
 	struct nvme_host_mem_buf_desc *descs;
 	u32 chunk_size, max_entries, len;
+	dma_addr_t descs_dma;
 	int i = 0;
 	void **bufs;
 	u64 size = 0, tmp;
@@ -1627,7 +1624,8 @@ static int nvme_alloc_host_mem(struct nvme_dev *dev, u64 min, u64 preferred)
 	tmp = (preferred + chunk_size - 1);
 	do_div(tmp, chunk_size);
 	max_entries = tmp;
-	descs = kcalloc(max_entries, sizeof(*descs), GFP_KERNEL);
+	descs = dma_zalloc_coherent(dev->dev, max_entries * sizeof(*descs),
+			&descs_dma, GFP_KERNEL);
 	if (!descs)
 		goto out;
 
@@ -1661,6 +1659,7 @@ static int nvme_alloc_host_mem(struct nvme_dev *dev, u64 min, u64 preferred)
 	dev->nr_host_mem_descs = i;
 	dev->host_mem_size = size;
 	dev->host_mem_descs = descs;
+	dev->host_mem_descs_dma = descs_dma;
 	dev->host_mem_desc_bufs = bufs;
 	return 0;
 
@@ -1674,7 +1673,8 @@ static int nvme_alloc_host_mem(struct nvme_dev *dev, u64 min, u64 preferred)
 
 	kfree(bufs);
 out_free_descs:
-	kfree(descs);
+	dma_free_coherent(dev->dev, max_entries * sizeof(*descs), descs,
+			descs_dma);
 out:
 	/* try a smaller chunk size if we failed early */
 	if (chunk_size >= PAGE_SIZE * 2 && (i == 0 || size < min)) {
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index da04df1..a7f7d0a 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -19,6 +19,7 @@
 #include <linux/string.h>
 #include <linux/atomic.h>
 #include <linux/blk-mq.h>
+#include <linux/blk-mq-rdma.h>
 #include <linux/types.h>
 #include <linux/list.h>
 #include <linux/mutex.h>
@@ -463,14 +464,10 @@ static int nvme_rdma_create_queue_ib(struct nvme_rdma_queue *queue)
 	ibdev = queue->device->dev;
 
 	/*
-	 * The admin queue is barely used once the controller is live, so don't
-	 * bother to spread it out.
+	 * Spread I/O queue completion vectors according to their queue index.
+	 * Admin queues can always go on completion vector 0.
 	 */
-	if (idx == 0)
-		comp_vector = 0;
-	else
-		comp_vector = idx % ibdev->num_comp_vectors;
-
+	comp_vector = idx == 0 ? idx : idx - 1;
 
 	/* +1 for ib_stop_cq */
 	queue->ib_cq = ib_alloc_cq(ibdev, queue,
@@ -611,10 +608,20 @@ static int nvme_rdma_connect_io_queues(struct nvme_rdma_ctrl *ctrl)
 static int nvme_rdma_init_io_queues(struct nvme_rdma_ctrl *ctrl)
 {
 	struct nvmf_ctrl_options *opts = ctrl->ctrl.opts;
+	struct ib_device *ibdev = ctrl->device->dev;
 	unsigned int nr_io_queues;
 	int i, ret;
 
 	nr_io_queues = min(opts->nr_io_queues, num_online_cpus());
+
+	/*
+	 * we map queues according to the device irq vectors for
+	 * optimal locality so we don't need more queues than
+	 * completion vectors.
+	 */
+	nr_io_queues = min_t(unsigned int, nr_io_queues,
+				ibdev->num_comp_vectors);
+
 	ret = nvme_set_queue_count(&ctrl->ctrl, &nr_io_queues);
 	if (ret)
 		return ret;
@@ -920,7 +927,11 @@ static int nvme_rdma_map_sg_fr(struct nvme_rdma_queue *queue,
 	struct nvme_keyed_sgl_desc *sg = &c->common.dptr.ksgl;
 	int nr;
 
-	nr = ib_map_mr_sg(req->mr, req->sg_table.sgl, count, NULL, PAGE_SIZE);
+	/*
+	 * Align the MR to a 4K page size to match the ctrl page size and
+	 * the block virtual boundary.
+	 */
+	nr = ib_map_mr_sg(req->mr, req->sg_table.sgl, count, NULL, SZ_4K);
 	if (nr < count) {
 		if (nr < 0)
 			return nr;
@@ -1498,6 +1509,13 @@ static void nvme_rdma_complete_rq(struct request *rq)
 	nvme_complete_rq(rq);
 }
 
+static int nvme_rdma_map_queues(struct blk_mq_tag_set *set)
+{
+	struct nvme_rdma_ctrl *ctrl = set->driver_data;
+
+	return blk_mq_rdma_map_queues(set, ctrl->device->dev, 0);
+}
+
 static const struct blk_mq_ops nvme_rdma_mq_ops = {
 	.queue_rq	= nvme_rdma_queue_rq,
 	.complete	= nvme_rdma_complete_rq,
@@ -1507,6 +1525,7 @@ static const struct blk_mq_ops nvme_rdma_mq_ops = {
 	.init_hctx	= nvme_rdma_init_hctx,
 	.poll		= nvme_rdma_poll,
 	.timeout	= nvme_rdma_timeout,
+	.map_queues	= nvme_rdma_map_queues,
 };
 
 static const struct blk_mq_ops nvme_rdma_admin_mq_ops = {
@@ -1583,7 +1602,7 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl)
 		goto out_cleanup_queue;
 
 	ctrl->ctrl.max_hw_sectors =
-		(ctrl->max_fr_pages - 1) << (PAGE_SHIFT - 9);
+		(ctrl->max_fr_pages - 1) << (ilog2(SZ_4K) - 9);
 
 	error = nvme_init_identify(&ctrl->ctrl);
 	if (error)
@@ -1946,10 +1965,6 @@ static struct nvmf_transport_ops nvme_rdma_transport = {
 	.create_ctrl	= nvme_rdma_create_ctrl,
 };
 
-static void nvme_rdma_add_one(struct ib_device *ib_device)
-{
-}
-
 static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 {
 	struct nvme_rdma_ctrl *ctrl;
@@ -1971,7 +1986,6 @@ static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 
 static struct ib_client nvme_rdma_ib_client = {
 	.name   = "nvme_rdma",
-	.add = nvme_rdma_add_one,
 	.remove = nvme_rdma_remove_one
 };
 
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 56a4cba..76d2bb7 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -1510,10 +1510,6 @@ static struct nvmet_fabrics_ops nvmet_rdma_ops = {
 	.delete_ctrl		= nvmet_rdma_delete_ctrl,
 };
 
-static void nvmet_rdma_add_one(struct ib_device *ib_device)
-{
-}
-
 static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 {
 	struct nvmet_rdma_queue *queue;
@@ -1534,7 +1530,6 @@ static void nvmet_rdma_remove_one(struct ib_device *ib_device, void *client_data
 
 static struct ib_client nvmet_rdma_ib_client = {
 	.name   = "nvmet_rdma",
-	.add = nvmet_rdma_add_one,
 	.remove = nvmet_rdma_remove_one
 };
 
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index d51e873..4450fea 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -1307,6 +1307,7 @@ int __pci_register_driver(struct pci_driver *drv, struct module *owner,
 	drv->driver.bus = &pci_bus_type;
 	drv->driver.owner = owner;
 	drv->driver.mod_name = mod_name;
+	drv->driver.groups = drv->groups;
 
 	spin_lock_init(&drv->dynids.lock);
 	INIT_LIST_HEAD(&drv->dynids.list);
diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c
index ba6ac83..5ccfdc8 100644
--- a/drivers/s390/cio/vfio_ccw_cp.c
+++ b/drivers/s390/cio/vfio_ccw_cp.c
@@ -481,7 +481,7 @@ static int ccwchain_fetch_tic(struct ccwchain *chain,
 		ccw_tail = ccw_head + (iter->ch_len - 1) * sizeof(struct ccw1);
 
 		if ((ccw_head <= ccw->cda) && (ccw->cda <= ccw_tail)) {
-			ccw->cda = (__u32) (addr_t) (iter->ch_ccw +
+			ccw->cda = (__u32) (addr_t) (((char *)iter->ch_ccw) +
 						     (ccw->cda - ccw_head));
 			return 0;
 		}
diff --git a/drivers/scsi/ipr.c b/drivers/scsi/ipr.c
index da5bdbd..f838bd7 100644
--- a/drivers/scsi/ipr.c
+++ b/drivers/scsi/ipr.c
@@ -4945,6 +4945,7 @@ static int ipr_slave_configure(struct scsi_device *sdev)
 		}
 		if (ipr_is_vset_device(res)) {
 			sdev->scsi_level = SCSI_SPC_3;
+			sdev->no_report_opcodes = 1;
 			blk_queue_rq_timeout(sdev->request_queue,
 					     IPR_VSET_RW_TIMEOUT);
 			blk_queue_max_hw_sectors(sdev->request_queue, IPR_VSET_MAX_SECTORS);
diff --git a/drivers/scsi/qedf/qedf_els.c b/drivers/scsi/qedf/qedf_els.c
index eb07f1d..59c18ca 100644
--- a/drivers/scsi/qedf/qedf_els.c
+++ b/drivers/scsi/qedf/qedf_els.c
@@ -489,7 +489,7 @@ static void qedf_srr_compl(struct qedf_els_cb_arg *cb_arg)
 
 	/* If a SRR times out, simply free resources */
 	if (srr_req->event == QEDF_IOREQ_EV_ELS_TMO)
-		goto out_free;
+		goto out_put;
 
 	/* Normalize response data into struct fc_frame */
 	mp_req = &(srr_req->mp_req);
@@ -501,7 +501,7 @@ static void qedf_srr_compl(struct qedf_els_cb_arg *cb_arg)
 	if (!fp) {
 		QEDF_ERR(&(qedf->dbg_ctx),
 		    "fc_frame_alloc failure.\n");
-		goto out_free;
+		goto out_put;
 	}
 
 	/* Copy frame header from firmware into fp */
@@ -526,9 +526,10 @@ static void qedf_srr_compl(struct qedf_els_cb_arg *cb_arg)
 	}
 
 	fc_frame_free(fp);
-out_free:
+out_put:
 	/* Put reference for original command since SRR completed */
 	kref_put(&orig_io_req->refcount, qedf_release_cmd);
+out_free:
 	kfree(cb_arg);
 }
 
@@ -780,7 +781,7 @@ static void qedf_rec_compl(struct qedf_els_cb_arg *cb_arg)
 
 	/* If a REC times out, free resources */
 	if (rec_req->event == QEDF_IOREQ_EV_ELS_TMO)
-		goto out_free;
+		goto out_put;
 
 	/* Normalize response data into struct fc_frame */
 	mp_req = &(rec_req->mp_req);
@@ -792,7 +793,7 @@ static void qedf_rec_compl(struct qedf_els_cb_arg *cb_arg)
 	if (!fp) {
 		QEDF_ERR(&(qedf->dbg_ctx),
 		    "fc_frame_alloc failure.\n");
-		goto out_free;
+		goto out_put;
 	}
 
 	/* Copy frame header from firmware into fp */
@@ -884,9 +885,10 @@ static void qedf_rec_compl(struct qedf_els_cb_arg *cb_arg)
 
 out_free_frame:
 	fc_frame_free(fp);
-out_free:
+out_put:
 	/* Put reference for original command since REC completed */
 	kref_put(&orig_io_req->refcount, qedf_release_cmd);
+out_free:
 	kfree(cb_arg);
 }
 
diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
index d7ff71e..84e782d 100644
--- a/drivers/scsi/sg.c
+++ b/drivers/scsi/sg.c
@@ -1021,7 +1021,7 @@ sg_ioctl(struct file *filp, unsigned int cmd_in, unsigned long arg)
 			read_lock_irqsave(&sfp->rq_list_lock, iflags);
 			val = 0;
 			list_for_each_entry(srp, &sfp->rq_list, entry) {
-				if (val > SG_MAX_QUEUE)
+				if (val >= SG_MAX_QUEUE)
 					break;
 				memset(&rinfo[val], 0, SZ_SG_REQ_INFO);
 				rinfo[val].req_state = srp->done + 1;
diff --git a/drivers/staging/vboxvideo/vbox_drv.c b/drivers/staging/vboxvideo/vbox_drv.c
index 92ae156..6d0600c 100644
--- a/drivers/staging/vboxvideo/vbox_drv.c
+++ b/drivers/staging/vboxvideo/vbox_drv.c
@@ -232,7 +232,6 @@ static struct drm_driver driver = {
 	.lastclose = vbox_driver_lastclose,
 	.master_set = vbox_master_set,
 	.master_drop = vbox_master_drop,
-	.set_busid = drm_pci_set_busid,
 
 	.fops = &vbox_fops,
 	.irq_handler = vbox_irq_handler,
@@ -270,12 +269,12 @@ static int __init vbox_init(void)
 	if (vbox_modeset == 0)
 		return -EINVAL;
 
-	return drm_pci_init(&driver, &vbox_pci_driver);
+	return pci_register_driver(&vbox_pci_driver);
 }
 
 static void __exit vbox_exit(void)
 {
-	drm_pci_exit(&driver, &vbox_pci_driver);
+	pci_unregister_driver(&vbox_pci_driver);
 }
 
 module_init(vbox_init);
diff --git a/drivers/staging/vboxvideo/vbox_fb.c b/drivers/staging/vboxvideo/vbox_fb.c
index 35f6d9f..c157284 100644
--- a/drivers/staging/vboxvideo/vbox_fb.c
+++ b/drivers/staging/vboxvideo/vbox_fb.c
@@ -317,22 +317,7 @@ static int vboxfb_create(struct drm_fb_helper *helper,
 	return 0;
 }
 
-static void vbox_fb_gamma_set(struct drm_crtc *crtc, u16 red, u16 green,
-			      u16 blue, int regno)
-{
-}
-
-static void vbox_fb_gamma_get(struct drm_crtc *crtc, u16 *red, u16 *green,
-			      u16 *blue, int regno)
-{
-	*red = regno;
-	*green = regno;
-	*blue = regno;
-}
-
 static struct drm_fb_helper_funcs vbox_fb_helper_funcs = {
-	.gamma_set = vbox_fb_gamma_set,
-	.gamma_get = vbox_fb_gamma_get,
 	.fb_probe = vboxfb_create,
 };
 
@@ -358,7 +343,7 @@ void vbox_fbdev_fini(struct drm_device *dev)
 				vbox_bo_unpin(bo);
 			vbox_bo_unreserve(bo);
 		}
-		drm_gem_object_unreference_unlocked(afb->obj);
+		drm_gem_object_put_unlocked(afb->obj);
 		afb->obj = NULL;
 	}
 	drm_fb_helper_fini(&fbdev->helper);
diff --git a/drivers/staging/vboxvideo/vbox_main.c b/drivers/staging/vboxvideo/vbox_main.c
index d0c6ec7..80bd039 100644
--- a/drivers/staging/vboxvideo/vbox_main.c
+++ b/drivers/staging/vboxvideo/vbox_main.c
@@ -40,7 +40,7 @@ static void vbox_user_framebuffer_destroy(struct drm_framebuffer *fb)
 	struct vbox_framebuffer *vbox_fb = to_vbox_framebuffer(fb);
 
 	if (vbox_fb->obj)
-		drm_gem_object_unreference_unlocked(vbox_fb->obj);
+		drm_gem_object_put_unlocked(vbox_fb->obj);
 
 	drm_framebuffer_cleanup(fb);
 	kfree(fb);
@@ -198,7 +198,7 @@ static struct drm_framebuffer *vbox_user_framebuffer_create(
 err_free_vbox_fb:
 	kfree(vbox_fb);
 err_unref_obj:
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 	return ERR_PTR(ret);
 }
 
@@ -472,7 +472,7 @@ int vbox_dumb_create(struct drm_file *file,
 		return ret;
 
 	ret = drm_gem_handle_create(file, gobj, &handle);
-	drm_gem_object_unreference_unlocked(gobj);
+	drm_gem_object_put_unlocked(gobj);
 	if (ret)
 		return ret;
 
@@ -525,7 +525,7 @@ vbox_dumb_mmap_offset(struct drm_file *file,
 	bo = gem_to_vbox_bo(obj);
 	*offset = vbox_bo_mmap_offset(bo);
 
-	drm_gem_object_unreference(obj);
+	drm_gem_object_put(obj);
 	ret = 0;
 
 out_unlock:
diff --git a/drivers/staging/vboxvideo/vbox_mode.c b/drivers/staging/vboxvideo/vbox_mode.c
index f2b85f3..e5b6383 100644
--- a/drivers/staging/vboxvideo/vbox_mode.c
+++ b/drivers/staging/vboxvideo/vbox_mode.c
@@ -134,10 +134,6 @@ static int vbox_set_view(struct drm_crtc *crtc)
 	return 0;
 }
 
-static void vbox_crtc_load_lut(struct drm_crtc *crtc)
-{
-}
-
 static void vbox_crtc_dpms(struct drm_crtc *crtc, int mode)
 {
 	struct vbox_crtc *vbox_crtc = to_vbox_crtc(crtc);
@@ -330,7 +326,6 @@ static const struct drm_crtc_helper_funcs vbox_crtc_helper_funcs = {
 	.mode_set = vbox_crtc_mode_set,
 	/* .mode_set_base = vbox_crtc_mode_set_base, */
 	.disable = vbox_crtc_disable,
-	.load_lut = vbox_crtc_load_lut,
 	.prepare = vbox_crtc_prepare,
 	.commit = vbox_crtc_commit,
 };
@@ -817,7 +812,7 @@ static int vbox_cursor_set2(struct drm_crtc *crtc, struct drm_file *file_priv,
 out_unreserve_bo:
 	vbox_bo_unreserve(bo);
 out_unref_obj:
-	drm_gem_object_unreference_unlocked(obj);
+	drm_gem_object_put_unlocked(obj);
 
 	return ret;
 }
diff --git a/drivers/video/fbdev/omap2/omapfb/dss/hdmi4.c b/drivers/video/fbdev/omap2/omapfb/dss/hdmi4.c
index 156a254..ec78d61 100644
--- a/drivers/video/fbdev/omap2/omapfb/dss/hdmi4.c
+++ b/drivers/video/fbdev/omap2/omapfb/dss/hdmi4.c
@@ -664,7 +664,7 @@ static int hdmi_audio_register(struct device *dev)
 {
 	struct omap_hdmi_audio_pdata pdata = {
 		.dev = dev,
-		.dss_version = omapdss_get_version(),
+		.version = 4,
 		.audio_dma_addr = hdmi_wp_get_audio_dma_addr(&hdmi.wp),
 		.ops = &hdmi_audio_ops,
 	};
diff --git a/drivers/video/fbdev/omap2/omapfb/dss/hdmi5.c b/drivers/video/fbdev/omap2/omapfb/dss/hdmi5.c
index 4da36bc..2e2fcc3 100644
--- a/drivers/video/fbdev/omap2/omapfb/dss/hdmi5.c
+++ b/drivers/video/fbdev/omap2/omapfb/dss/hdmi5.c
@@ -695,7 +695,7 @@ static int hdmi_audio_register(struct device *dev)
 {
 	struct omap_hdmi_audio_pdata pdata = {
 		.dev = dev,
-		.dss_version = omapdss_get_version(),
+		.version = 5,
 		.audio_dma_addr = hdmi_wp_get_audio_dma_addr(&hdmi.wp),
 		.ops = &hdmi_audio_ops,
 	};
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index f3bf8f4..8236059 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -484,13 +484,6 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
 	mutex_unlock(&priv->lock);
 }
 
-static void mn_invl_page(struct mmu_notifier *mn,
-			 struct mm_struct *mm,
-			 unsigned long address)
-{
-	mn_invl_range_start(mn, mm, address, address + PAGE_SIZE);
-}
-
 static void mn_release(struct mmu_notifier *mn,
 		       struct mm_struct *mm)
 {
@@ -522,7 +515,6 @@ static void mn_release(struct mmu_notifier *mn,
 
 static const struct mmu_notifier_ops gntdev_mmu_ops = {
 	.release                = mn_release,
-	.invalidate_page        = mn_invl_page,
 	.invalidate_range_start = mn_invl_range_start,
 };
 
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 5083628..1bc709f 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -189,7 +189,7 @@ static int ceph_releasepage(struct page *page, gfp_t g)
 /*
  * read a single page, without unlocking it.
  */
-static int readpage_nounlock(struct file *filp, struct page *page)
+static int ceph_do_readpage(struct file *filp, struct page *page)
 {
 	struct inode *inode = file_inode(filp);
 	struct ceph_inode_info *ci = ceph_inode(inode);
@@ -219,7 +219,7 @@ static int readpage_nounlock(struct file *filp, struct page *page)
 
 	err = ceph_readpage_from_fscache(inode, page);
 	if (err == 0)
-		goto out;
+		return -EINPROGRESS;
 
 	dout("readpage inode %p file %p page %p index %lu\n",
 	     inode, filp, page, page->index);
@@ -249,8 +249,11 @@ static int readpage_nounlock(struct file *filp, struct page *page)
 
 static int ceph_readpage(struct file *filp, struct page *page)
 {
-	int r = readpage_nounlock(filp, page);
-	unlock_page(page);
+	int r = ceph_do_readpage(filp, page);
+	if (r != -EINPROGRESS)
+		unlock_page(page);
+	else
+		r = 0;
 	return r;
 }
 
@@ -1237,7 +1240,7 @@ static int ceph_update_writeable_page(struct file *file,
 			goto retry_locked;
 		r = writepage_nounlock(page, NULL);
 		if (r < 0)
-			goto fail_nosnap;
+			goto fail_unlock;
 		goto retry_locked;
 	}
 
@@ -1265,11 +1268,14 @@ static int ceph_update_writeable_page(struct file *file,
 	}
 
 	/* we need to read it. */
-	r = readpage_nounlock(file, page);
-	if (r < 0)
-		goto fail_nosnap;
+	r = ceph_do_readpage(file, page);
+	if (r < 0) {
+		if (r == -EINPROGRESS)
+			return -EAGAIN;
+		goto fail_unlock;
+	}
 	goto retry_locked;
-fail_nosnap:
+fail_unlock:
 	unlock_page(page);
 	return r;
 }
diff --git a/fs/ceph/cache.c b/fs/ceph/cache.c
index fd11728..337f886 100644
--- a/fs/ceph/cache.c
+++ b/fs/ceph/cache.c
@@ -297,13 +297,7 @@ void ceph_fscache_file_set_cookie(struct inode *inode, struct file *filp)
 	}
 }
 
-static void ceph_vfs_readpage_complete(struct page *page, void *data, int error)
-{
-	if (!error)
-		SetPageUptodate(page);
-}
-
-static void ceph_vfs_readpage_complete_unlock(struct page *page, void *data, int error)
+static void ceph_readpage_from_fscache_complete(struct page *page, void *data, int error)
 {
 	if (!error)
 		SetPageUptodate(page);
@@ -331,7 +325,7 @@ int ceph_readpage_from_fscache(struct inode *inode, struct page *page)
 		return -ENOBUFS;
 
 	ret = fscache_read_or_alloc_page(ci->fscache, page,
-					 ceph_vfs_readpage_complete, NULL,
+					 ceph_readpage_from_fscache_complete, NULL,
 					 GFP_KERNEL);
 
 	switch (ret) {
@@ -360,7 +354,7 @@ int ceph_readpages_from_fscache(struct inode *inode,
 		return -ENOBUFS;
 
 	ret = fscache_read_or_alloc_pages(ci->fscache, mapping, pages, nr_pages,
-					  ceph_vfs_readpage_complete_unlock,
+					  ceph_readpage_from_fscache_complete,
 					  NULL, mapping_gfp_mask(mapping));
 
 	switch (ret) {
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 59647eb..83a8f52 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -1223,6 +1223,7 @@ cifs_parse_mount_options(const char *mountdata, const char *devname,
 	char *tmp_end, *value;
 	char delim;
 	bool got_ip = false;
+	bool got_version = false;
 	unsigned short port = 0;
 	struct sockaddr *dstaddr = (struct sockaddr *)&vol->dstaddr;
 
@@ -1874,24 +1875,35 @@ cifs_parse_mount_options(const char *mountdata, const char *devname,
 				pr_warn("CIFS: server netbiosname longer than 15 truncated.\n");
 			break;
 		case Opt_ver:
+			/* version of mount userspace tools, not dialect */
 			string = match_strdup(args);
 			if (string == NULL)
 				goto out_nomem;
 
+			/* If interface changes in mount.cifs bump to new ver */
 			if (strncasecmp(string, "1", 1) == 0) {
+				if (strlen(string) > 1) {
+					pr_warn("Bad mount helper ver=%s. Did "
+						"you want SMB1 (CIFS) dialect "
+						"and mean to type vers=1.0 "
+						"instead?\n", string);
+					goto cifs_parse_mount_err;
+				}
 				/* This is the default */
 				break;
 			}
 			/* For all other value, error */
-			pr_warn("CIFS: Invalid version specified\n");
+			pr_warn("CIFS: Invalid mount helper version specified\n");
 			goto cifs_parse_mount_err;
 		case Opt_vers:
+			/* protocol version (dialect) */
 			string = match_strdup(args);
 			if (string == NULL)
 				goto out_nomem;
 
 			if (cifs_parse_smb_version(string, vol) != 0)
 				goto cifs_parse_mount_err;
+			got_version = true;
 			break;
 		case Opt_sec:
 			string = match_strdup(args);
@@ -1973,6 +1985,14 @@ cifs_parse_mount_options(const char *mountdata, const char *devname,
 	else if (override_gid == 1)
 		pr_notice("CIFS: ignoring forcegid mount option specified with no gid= option.\n");
 
+	if (got_version == false)
+		pr_warn("No dialect specified on mount. Default has changed to "
+			"a more secure dialect, SMB3 (vers=3.0), from CIFS "
+			"(SMB1). To use the less secure SMB1 dialect to access "
+			"old servers which do not support SMB3 specify vers=1.0"
+			" on mount. For somewhat newer servers such as Windows "
+			"7 try vers=2.1.\n");
+
 	kfree(mountdata_copy);
 	return 0;
 
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 569d3fb..e702d48 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -205,7 +205,7 @@ check_name(struct dentry *direntry, struct cifs_tcon *tcon)
 	int i;
 
 	if (unlikely(direntry->d_name.len >
-		     tcon->fsAttrInfo.MaxPathNameComponentLength))
+		     le32_to_cpu(tcon->fsAttrInfo.MaxPathNameComponentLength)))
 		return -ENAMETOOLONG;
 
 	if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_POSIX_PATHS)) {
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 97edb4d3..7aa6720 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -514,7 +514,12 @@ SMB2_negotiate(const unsigned int xid, struct cifs_ses *ses)
 	 * No tcon so can't do
 	 * cifs_stats_inc(&tcon->stats.smb2_stats.smb2_com_fail[SMB2...]);
 	 */
-	if (rc != 0)
+	if (rc == -EOPNOTSUPP) {
+		cifs_dbg(VFS, "Dialect not supported by server. Consider "
+			"specifying vers=1.0 or vers=2.1 on mount for accessing"
+			" older servers\n");
+		goto neg_exit;
+	} else if (rc != 0)
 		goto neg_exit;
 
 	cifs_dbg(FYI, "mode 0x%x\n", rsp->SecurityMode);
diff --git a/fs/cifs/smb2pdu.h b/fs/cifs/smb2pdu.h
index 18700fd..2826882 100644
--- a/fs/cifs/smb2pdu.h
+++ b/fs/cifs/smb2pdu.h
@@ -84,8 +84,8 @@
 
 #define NUMBER_OF_SMB2_COMMANDS	0x0013
 
-/* BB FIXME - analyze following length BB */
-#define MAX_SMB2_HDR_SIZE 0x78 /* 4 len + 64 hdr + (2*24 wct) + 2 bct + 2 pad */
+/* 4 len + 52 transform hdr + 64 hdr + 56 create rsp */
+#define MAX_SMB2_HDR_SIZE 0x00b0
 
 #define SMB2_PROTO_NUMBER cpu_to_le32(0x424d53fe)
 #define SMB2_TRANSFORM_PROTO_NUM cpu_to_le32(0x424d53fd)
diff --git a/fs/dax.c b/fs/dax.c
index 865d42c..ab925dc 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -646,11 +646,10 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 	pte_t pte, *ptep = NULL;
 	pmd_t *pmdp = NULL;
 	spinlock_t *ptl;
-	bool changed;
 
 	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
-		unsigned long address;
+		unsigned long address, start, end;
 
 		cond_resched();
 
@@ -658,8 +657,13 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 			continue;
 
 		address = pgoff_address(index, vma);
-		changed = false;
-		if (follow_pte_pmd(vma->vm_mm, address, &ptep, &pmdp, &ptl))
+
+		/*
+		 * Note because we provide start/end to follow_pte_pmd it will
+		 * call mmu_notifier_invalidate_range_start() on our behalf
+		 * before taking any lock.
+		 */
+		if (follow_pte_pmd(vma->vm_mm, address, &start, &end, &ptep, &pmdp, &ptl))
 			continue;
 
 		if (pmdp) {
@@ -676,7 +680,7 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 			pmd = pmd_wrprotect(pmd);
 			pmd = pmd_mkclean(pmd);
 			set_pmd_at(vma->vm_mm, address, pmdp, pmd);
-			changed = true;
+			mmu_notifier_invalidate_range(vma->vm_mm, start, end);
 unlock_pmd:
 			spin_unlock(ptl);
 #endif
@@ -691,13 +695,12 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 			pte = pte_wrprotect(pte);
 			pte = pte_mkclean(pte);
 			set_pte_at(vma->vm_mm, address, ptep, pte);
-			changed = true;
+			mmu_notifier_invalidate_range(vma->vm_mm, start, end);
 unlock_pte:
 			pte_unmap_unlock(ptep, ptl);
 		}
 
-		if (changed)
-			mmu_notifier_invalidate_page(vma->vm_mm, address);
+		mmu_notifier_invalidate_range_end(vma->vm_mm, start, end);
 	}
 	i_mmap_unlock_read(mapping);
 }
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index e767e43..adbe328 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -600,8 +600,13 @@ static void ep_remove_wait_queue(struct eppoll_entry *pwq)
 	wait_queue_head_t *whead;
 
 	rcu_read_lock();
-	/* If it is cleared by POLLFREE, it should be rcu-safe */
-	whead = rcu_dereference(pwq->whead);
+	/*
+	 * If it is cleared by POLLFREE, it should be rcu-safe.
+	 * If we read NULL we need a barrier paired with
+	 * smp_store_release() in ep_poll_callback(), otherwise
+	 * we rely on whead->lock.
+	 */
+	whead = smp_load_acquire(&pwq->whead);
 	if (whead)
 		remove_wait_queue(whead, &pwq->wait);
 	rcu_read_unlock();
@@ -1134,17 +1139,6 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
 	struct eventpoll *ep = epi->ep;
 	int ewake = 0;
 
-	if ((unsigned long)key & POLLFREE) {
-		ep_pwq_from_wait(wait)->whead = NULL;
-		/*
-		 * whead = NULL above can race with ep_remove_wait_queue()
-		 * which can do another remove_wait_queue() after us, so we
-		 * can't use __remove_wait_queue(). whead->lock is held by
-		 * the caller.
-		 */
-		list_del_init(&wait->entry);
-	}
-
 	spin_lock_irqsave(&ep->lock, flags);
 
 	ep_set_busy_poll_napi_id(epi);
@@ -1228,10 +1222,26 @@ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, v
 	if (pwake)
 		ep_poll_safewake(&ep->poll_wait);
 
-	if (epi->event.events & EPOLLEXCLUSIVE)
-		return ewake;
+	if (!(epi->event.events & EPOLLEXCLUSIVE))
+		ewake = 1;
 
-	return 1;
+	if ((unsigned long)key & POLLFREE) {
+		/*
+		 * If we race with ep_remove_wait_queue() it can miss
+		 * ->whead = NULL and do another remove_wait_queue() after
+		 * us, so we can't use __remove_wait_queue().
+		 */
+		list_del_init(&wait->entry);
+		/*
+		 * ->whead != NULL protects us from the race with ep_free()
+		 * or ep_remove(), ep_remove_wait_queue() takes whead->lock
+		 * held by the caller. Once we nullify it, nothing protects
+		 * ep/epi or even wait.
+		 */
+		smp_store_release(&ep_pwq_from_wait(wait)->whead, NULL);
+	}
+
+	return ewake;
 }
 
 /*
diff --git a/fs/jfs/super.c b/fs/jfs/super.c
index 78b41e1..60726ae 100644
--- a/fs/jfs/super.c
+++ b/fs/jfs/super.c
@@ -619,16 +619,10 @@ static int jfs_fill_super(struct super_block *sb, void *data, int silent)
 	if (!sb->s_root)
 		goto out_no_root;
 
-	/* logical blocks are represented by 40 bits in pxd_t, etc. */
-	sb->s_maxbytes = ((u64) sb->s_blocksize) << 40;
-#if BITS_PER_LONG == 32
-	/*
-	 * Page cache is indexed by long.
-	 * I would use MAX_LFS_FILESIZE, but it's only half as big
+	/* logical blocks are represented by 40 bits in pxd_t, etc.
+	 * and page cache is indexed by long
 	 */
-	sb->s_maxbytes = min(((u64) PAGE_SIZE << 32) - 1,
-			     (u64)sb->s_maxbytes);
-#endif
+	sb->s_maxbytes = min(((loff_t)sb->s_blocksize) << 40, MAX_LFS_FILESIZE);
 	sb->s_time_gran = 1;
 	return 0;
 
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index 38d0383..bc69d40 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -969,7 +969,7 @@ nfsd_vfs_write(struct svc_rqst *rqstp, struct svc_fh *fhp, struct file *file,
 	int			use_wgather;
 	loff_t			pos = offset;
 	unsigned int		pflags = current->flags;
-	int			flags = 0;
+	rwf_t			flags = 0;
 
 	if (test_bit(RQ_LOCAL, &rqstp->rq_flags))
 		/*
diff --git a/fs/read_write.c b/fs/read_write.c
index 0cc7033..61b58c7 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -33,7 +33,7 @@ const struct file_operations generic_ro_fops = {
 
 EXPORT_SYMBOL(generic_ro_fops);
 
-static inline int unsigned_offsets(struct file *file)
+static inline bool unsigned_offsets(struct file *file)
 {
 	return file->f_mode & FMODE_UNSIGNED_OFFSET;
 }
@@ -633,7 +633,7 @@ unsigned long iov_shorten(struct iovec *iov, unsigned long nr_segs, size_t to)
 EXPORT_SYMBOL(iov_shorten);
 
 static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
-		loff_t *ppos, int type, int flags)
+		loff_t *ppos, int type, rwf_t flags)
 {
 	struct kiocb kiocb;
 	ssize_t ret;
@@ -655,7 +655,7 @@ static ssize_t do_iter_readv_writev(struct file *filp, struct iov_iter *iter,
 
 /* Do it by hand, with file-ops */
 static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
-		loff_t *ppos, int type, int flags)
+		loff_t *ppos, int type, rwf_t flags)
 {
 	ssize_t ret = 0;
 
@@ -871,7 +871,7 @@ ssize_t compat_rw_copy_check_uvector(int type,
 #endif
 
 static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
-		loff_t *pos, int flags)
+		loff_t *pos, rwf_t flags)
 {
 	size_t tot_len;
 	ssize_t ret = 0;
@@ -899,7 +899,7 @@ static ssize_t do_iter_read(struct file *file, struct iov_iter *iter,
 }
 
 ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
-		int flags)
+		rwf_t flags)
 {
 	if (!file->f_op->read_iter)
 		return -EINVAL;
@@ -908,7 +908,7 @@ ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
 EXPORT_SYMBOL(vfs_iter_read);
 
 static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
-		loff_t *pos, int flags)
+		loff_t *pos, rwf_t flags)
 {
 	size_t tot_len;
 	ssize_t ret = 0;
@@ -935,7 +935,7 @@ static ssize_t do_iter_write(struct file *file, struct iov_iter *iter,
 }
 
 ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
-		int flags)
+		rwf_t flags)
 {
 	if (!file->f_op->write_iter)
 		return -EINVAL;
@@ -944,7 +944,7 @@ ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
 EXPORT_SYMBOL(vfs_iter_write);
 
 ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
-		  unsigned long vlen, loff_t *pos, int flags)
+		  unsigned long vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
 	struct iovec *iov = iovstack;
@@ -962,7 +962,7 @@ ssize_t vfs_readv(struct file *file, const struct iovec __user *vec,
 EXPORT_SYMBOL(vfs_readv);
 
 ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
-		   unsigned long vlen, loff_t *pos, int flags)
+		   unsigned long vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
 	struct iovec *iov = iovstack;
@@ -981,7 +981,7 @@ ssize_t vfs_writev(struct file *file, const struct iovec __user *vec,
 EXPORT_SYMBOL(vfs_writev);
 
 static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
-			unsigned long vlen, int flags)
+			unsigned long vlen, rwf_t flags)
 {
 	struct fd f = fdget_pos(fd);
 	ssize_t ret = -EBADF;
@@ -1001,7 +1001,7 @@ static ssize_t do_readv(unsigned long fd, const struct iovec __user *vec,
 }
 
 static ssize_t do_writev(unsigned long fd, const struct iovec __user *vec,
-			 unsigned long vlen, int flags)
+			 unsigned long vlen, rwf_t flags)
 {
 	struct fd f = fdget_pos(fd);
 	ssize_t ret = -EBADF;
@@ -1027,7 +1027,7 @@ static inline loff_t pos_from_hilo(unsigned long high, unsigned long low)
 }
 
 static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
-			 unsigned long vlen, loff_t pos, int flags)
+			 unsigned long vlen, loff_t pos, rwf_t flags)
 {
 	struct fd f;
 	ssize_t ret = -EBADF;
@@ -1050,7 +1050,7 @@ static ssize_t do_preadv(unsigned long fd, const struct iovec __user *vec,
 }
 
 static ssize_t do_pwritev(unsigned long fd, const struct iovec __user *vec,
-			  unsigned long vlen, loff_t pos, int flags)
+			  unsigned long vlen, loff_t pos, rwf_t flags)
 {
 	struct fd f;
 	ssize_t ret = -EBADF;
@@ -1094,7 +1094,7 @@ SYSCALL_DEFINE5(preadv, unsigned long, fd, const struct iovec __user *, vec,
 
 SYSCALL_DEFINE6(preadv2, unsigned long, fd, const struct iovec __user *, vec,
 		unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h,
-		int, flags)
+		rwf_t, flags)
 {
 	loff_t pos = pos_from_hilo(pos_h, pos_l);
 
@@ -1114,7 +1114,7 @@ SYSCALL_DEFINE5(pwritev, unsigned long, fd, const struct iovec __user *, vec,
 
 SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
 		unsigned long, vlen, unsigned long, pos_l, unsigned long, pos_h,
-		int, flags)
+		rwf_t, flags)
 {
 	loff_t pos = pos_from_hilo(pos_h, pos_l);
 
@@ -1127,7 +1127,7 @@ SYSCALL_DEFINE6(pwritev2, unsigned long, fd, const struct iovec __user *, vec,
 #ifdef CONFIG_COMPAT
 static size_t compat_readv(struct file *file,
 			   const struct compat_iovec __user *vec,
-			   unsigned long vlen, loff_t *pos, int flags)
+			   unsigned long vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
 	struct iovec *iov = iovstack;
@@ -1147,7 +1147,7 @@ static size_t compat_readv(struct file *file,
 
 static size_t do_compat_readv(compat_ulong_t fd,
 				 const struct compat_iovec __user *vec,
-				 compat_ulong_t vlen, int flags)
+				 compat_ulong_t vlen, rwf_t flags)
 {
 	struct fd f = fdget_pos(fd);
 	ssize_t ret;
@@ -1173,7 +1173,7 @@ COMPAT_SYSCALL_DEFINE3(readv, compat_ulong_t, fd,
 
 static long do_compat_preadv64(unsigned long fd,
 				  const struct compat_iovec __user *vec,
-				  unsigned long vlen, loff_t pos, int flags)
+				  unsigned long vlen, loff_t pos, rwf_t flags)
 {
 	struct fd f;
 	ssize_t ret;
@@ -1211,7 +1211,7 @@ COMPAT_SYSCALL_DEFINE5(preadv, compat_ulong_t, fd,
 #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
 COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
 		const struct compat_iovec __user *,vec,
-		unsigned long, vlen, loff_t, pos, int, flags)
+		unsigned long, vlen, loff_t, pos, rwf_t, flags)
 {
 	return do_compat_preadv64(fd, vec, vlen, pos, flags);
 }
@@ -1220,7 +1220,7 @@ COMPAT_SYSCALL_DEFINE5(preadv64v2, unsigned long, fd,
 COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
 		const struct compat_iovec __user *,vec,
 		compat_ulong_t, vlen, u32, pos_low, u32, pos_high,
-		int, flags)
+		rwf_t, flags)
 {
 	loff_t pos = ((loff_t)pos_high << 32) | pos_low;
 
@@ -1232,7 +1232,7 @@ COMPAT_SYSCALL_DEFINE6(preadv2, compat_ulong_t, fd,
 
 static size_t compat_writev(struct file *file,
 			    const struct compat_iovec __user *vec,
-			    unsigned long vlen, loff_t *pos, int flags)
+			    unsigned long vlen, loff_t *pos, rwf_t flags)
 {
 	struct iovec iovstack[UIO_FASTIOV];
 	struct iovec *iov = iovstack;
@@ -1254,7 +1254,7 @@ static size_t compat_writev(struct file *file,
 
 static size_t do_compat_writev(compat_ulong_t fd,
 				  const struct compat_iovec __user* vec,
-				  compat_ulong_t vlen, int flags)
+				  compat_ulong_t vlen, rwf_t flags)
 {
 	struct fd f = fdget_pos(fd);
 	ssize_t ret;
@@ -1279,7 +1279,7 @@ COMPAT_SYSCALL_DEFINE3(writev, compat_ulong_t, fd,
 
 static long do_compat_pwritev64(unsigned long fd,
 				   const struct compat_iovec __user *vec,
-				   unsigned long vlen, loff_t pos, int flags)
+				   unsigned long vlen, loff_t pos, rwf_t flags)
 {
 	struct fd f;
 	ssize_t ret;
@@ -1317,7 +1317,7 @@ COMPAT_SYSCALL_DEFINE5(pwritev, compat_ulong_t, fd,
 #ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
 COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
 		const struct compat_iovec __user *,vec,
-		unsigned long, vlen, loff_t, pos, int, flags)
+		unsigned long, vlen, loff_t, pos, rwf_t, flags)
 {
 	return do_compat_pwritev64(fd, vec, vlen, pos, flags);
 }
@@ -1325,7 +1325,7 @@ COMPAT_SYSCALL_DEFINE5(pwritev64v2, unsigned long, fd,
 
 COMPAT_SYSCALL_DEFINE6(pwritev2, compat_ulong_t, fd,
 		const struct compat_iovec __user *,vec,
-		compat_ulong_t, vlen, u32, pos_low, u32, pos_high, int, flags)
+		compat_ulong_t, vlen, u32, pos_low, u32, pos_high, rwf_t, flags)
 {
 	loff_t pos = ((loff_t)pos_high << 32) | pos_low;
 
diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h
index 9f0681b..6626077 100644
--- a/include/asm-generic/qspinlock.h
+++ b/include/asm-generic/qspinlock.h
@@ -22,17 +22,6 @@
 #include <asm-generic/qspinlock_types.h>
 
 /**
- * queued_spin_unlock_wait - wait until the _current_ lock holder releases the lock
- * @lock : Pointer to queued spinlock structure
- *
- * There is a very slight possibility of live-lock if the lockers keep coming
- * and the waiter is just unfortunate enough to not see any unlock state.
- */
-#ifndef queued_spin_unlock_wait
-extern void queued_spin_unlock_wait(struct qspinlock *lock);
-#endif
-
-/**
  * queued_spin_is_locked - is the spinlock locked?
  * @lock: Pointer to queued spinlock structure
  * Return: 1 if it is locked, 0 otherwise
@@ -41,8 +30,6 @@ extern void queued_spin_unlock_wait(struct qspinlock *lock);
 static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
 {
 	/*
-	 * See queued_spin_unlock_wait().
-	 *
 	 * Any !0 state indicates it is locked, even if _Q_LOCKED_VAL
 	 * isn't immediately observable.
 	 */
@@ -135,6 +122,5 @@ static __always_inline bool virt_spin_lock(struct qspinlock *lock)
 #define arch_spin_trylock(l)		queued_spin_trylock(l)
 #define arch_spin_unlock(l)		queued_spin_unlock(l)
 #define arch_spin_lock_flags(l, f)	queued_spin_lock(l)
-#define arch_spin_unlock_wait(l)	queued_spin_unlock_wait(l)
 
 #endif /* __ASM_GENERIC_QSPINLOCK_H */
diff --git a/include/drm/bridge/dw_mipi_dsi.h b/include/drm/bridge/dw_mipi_dsi.h
new file mode 100644
index 0000000..9b30fec
--- /dev/null
+++ b/include/drm/bridge/dw_mipi_dsi.h
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) STMicroelectronics SA 2017
+ *
+ * Authors: Philippe Cornu <philippe.cornu@st.com>
+ *          Yannick Fertre <yannick.fertre@st.com>
+ *
+ * License terms:  GNU General Public License (GPL), version 2
+ */
+
+#ifndef __DW_MIPI_DSI__
+#define __DW_MIPI_DSI__
+
+struct dw_mipi_dsi_phy_ops {
+	int (*init)(void *priv_data);
+	int (*get_lane_mbps)(void *priv_data, struct drm_display_mode *mode,
+			     unsigned long mode_flags, u32 lanes, u32 format,
+			     unsigned int *lane_mbps);
+};
+
+struct dw_mipi_dsi_plat_data {
+	void __iomem *base;
+	unsigned int max_data_lanes;
+
+	enum drm_mode_status (*mode_valid)(void *priv_data,
+					   const struct drm_display_mode *mode);
+
+	const struct dw_mipi_dsi_phy_ops *phy_ops;
+
+	void *priv_data;
+};
+
+int dw_mipi_dsi_probe(struct platform_device *pdev,
+		      const struct dw_mipi_dsi_plat_data *plat_data);
+void dw_mipi_dsi_remove(struct platform_device *pdev);
+int dw_mipi_dsi_bind(struct platform_device *pdev, struct drm_encoder *encoder,
+		     const struct dw_mipi_dsi_plat_data *plat_data);
+void dw_mipi_dsi_unbind(struct device *dev);
+
+#endif /* __DW_MIPI_DSI__ */
diff --git a/include/drm/drmP.h b/include/drm/drmP.h
index 39df16a..7277783a 100644
--- a/include/drm/drmP.h
+++ b/include/drm/drmP.h
@@ -82,19 +82,10 @@
 #include <drm/drm_sysfs.h>
 #include <drm/drm_vblank.h>
 #include <drm/drm_irq.h>
-
+#include <drm/drm_device.h>
 
 struct module;
 
-struct drm_device;
-struct drm_agp_head;
-struct drm_local_map;
-struct drm_device_dma;
-struct drm_gem_object;
-struct drm_master;
-struct drm_vblank_crtc;
-struct drm_vma_offset_manager;
-
 struct device_node;
 struct videomode;
 struct reservation_object;
@@ -306,143 +297,6 @@ struct pci_controller;
 
 
 /**
- * DRM device structure. This structure represent a complete card that
- * may contain multiple heads.
- */
-struct drm_device {
-	struct list_head legacy_dev_list;/**< list of devices per driver for stealth attach cleanup */
-	int if_version;			/**< Highest interface version set */
-
-	/** \name Lifetime Management */
-	/*@{ */
-	struct kref ref;		/**< Object ref-count */
-	struct device *dev;		/**< Device structure of bus-device */
-	struct drm_driver *driver;	/**< DRM driver managing the device */
-	void *dev_private;		/**< DRM driver private data */
-	struct drm_minor *control;		/**< Control node */
-	struct drm_minor *primary;		/**< Primary node */
-	struct drm_minor *render;		/**< Render node */
-	bool registered;
-
-	/* currently active master for this device. Protected by master_mutex */
-	struct drm_master *master;
-
-	atomic_t unplugged;			/**< Flag whether dev is dead */
-	struct inode *anon_inode;		/**< inode for private address-space */
-	char *unique;				/**< unique name of the device */
-	/*@} */
-
-	/** \name Locks */
-	/*@{ */
-	struct mutex struct_mutex;	/**< For others */
-	struct mutex master_mutex;      /**< For drm_minor::master and drm_file::is_master */
-	/*@} */
-
-	/** \name Usage Counters */
-	/*@{ */
-	int open_count;			/**< Outstanding files open, protected by drm_global_mutex. */
-	spinlock_t buf_lock;		/**< For drm_device::buf_use and a few other things. */
-	int buf_use;			/**< Buffers in use -- cannot alloc */
-	atomic_t buf_alloc;		/**< Buffer allocation in progress */
-	/*@} */
-
-	struct mutex filelist_mutex;
-	struct list_head filelist;
-
-	/** \name Memory management */
-	/*@{ */
-	struct list_head maplist;	/**< Linked list of regions */
-	struct drm_open_hash map_hash;	/**< User token hash table for maps */
-
-	/** \name Context handle management */
-	/*@{ */
-	struct list_head ctxlist;	/**< Linked list of context handles */
-	struct mutex ctxlist_mutex;	/**< For ctxlist */
-
-	struct idr ctx_idr;
-
-	struct list_head vmalist;	/**< List of vmas (for debugging) */
-
-	/*@} */
-
-	/** \name DMA support */
-	/*@{ */
-	struct drm_device_dma *dma;		/**< Optional pointer for DMA support */
-	/*@} */
-
-	/** \name Context support */
-	/*@{ */
-
-	__volatile__ long context_flag;	/**< Context swapping flag */
-	int last_context;		/**< Last current context */
-	/*@} */
-
-	/**
-	 * @irq_enabled:
-	 *
-	 * Indicates that interrupt handling is enabled, specifically vblank
-	 * handling. Drivers which don't use drm_irq_install() need to set this
-	 * to true manually.
-	 */
-	bool irq_enabled;
-	int irq;
-
-	/*
-	 * If true, vblank interrupt will be disabled immediately when the
-	 * refcount drops to zero, as opposed to via the vblank disable
-	 * timer.
-	 * This can be set to true it the hardware has a working vblank
-	 * counter and the driver uses drm_vblank_on() and drm_vblank_off()
-	 * appropriately.
-	 */
-	bool vblank_disable_immediate;
-
-	/* array of size num_crtcs */
-	struct drm_vblank_crtc *vblank;
-
-	spinlock_t vblank_time_lock;    /**< Protects vblank count and time updates during vblank enable/disable */
-	spinlock_t vbl_lock;
-
-	u32 max_vblank_count;           /**< size of vblank counter register */
-
-	/**
-	 * List of events
-	 */
-	struct list_head vblank_event_list;
-	spinlock_t event_lock;
-
-	/*@} */
-
-	struct drm_agp_head *agp;	/**< AGP data */
-
-	struct pci_dev *pdev;		/**< PCI device structure */
-#ifdef __alpha__
-	struct pci_controller *hose;
-#endif
-
-	struct drm_sg_mem *sg;	/**< Scatter gather memory */
-	unsigned int num_crtcs;                  /**< Number of CRTCs on this device */
-
-	struct {
-		int context;
-		struct drm_hw_lock *lock;
-	} sigdata;
-
-	struct drm_local_map *agp_buffer_map;
-	unsigned int agp_buffer_token;
-
-	struct drm_mode_config mode_config;	/**< Current mode config */
-
-	/** \name GEM information */
-	/*@{ */
-	struct mutex object_name_lock;
-	struct idr object_name_idr;
-	struct drm_vma_offset_manager *vma_offset_manager;
-	/*@} */
-	int switch_power_state;
-};
-
-/**
  * drm_drv_uses_atomic_modeset - check if the driver implements
  * atomic_commit()
  * @dev: DRM device
@@ -466,19 +320,6 @@ static __inline__ int drm_core_check_feature(struct drm_device *dev,
 	return ((dev->driver->driver_features & feature) ? 1 : 0);
 }
 
-static inline void drm_device_set_unplugged(struct drm_device *dev)
-{
-	smp_wmb();
-	atomic_set(&dev->unplugged, 1);
-}
-
-static inline int drm_device_is_unplugged(struct drm_device *dev)
-{
-	int ret = atomic_read(&dev->unplugged);
-	smp_rmb();
-	return ret;
-}
-
 /******************************************************************/
 /** \name Internal function definitions */
 /*@{*/
diff --git a/include/drm/drm_atomic.h b/include/drm/drm_atomic.h
index 0196f26..8a5808e 100644
--- a/include/drm/drm_atomic.h
+++ b/include/drm/drm_atomic.h
@@ -154,6 +154,9 @@ struct __drm_connnectors_state {
 	struct drm_connector_state *state, *old_state, *new_state;
 };
 
+struct drm_private_obj;
+struct drm_private_state;
+
 /**
  * struct drm_private_state_funcs - atomic state functions for private objects
  *
@@ -166,7 +169,7 @@ struct __drm_connnectors_state {
  */
 struct drm_private_state_funcs {
 	/**
-	 * @duplicate_state:
+	 * @atomic_duplicate_state:
 	 *
 	 * Duplicate the current state of the private object and return it. It
 	 * is an error to call this before obj->state has been initialized.
@@ -176,29 +179,30 @@ struct drm_private_state_funcs {
 	 * Duplicated atomic state or NULL when obj->state is not
 	 * initialized or allocation failed.
 	 */
-	void *(*duplicate_state)(struct drm_atomic_state *state, void *obj);
+	struct drm_private_state *(*atomic_duplicate_state)(struct drm_private_obj *obj);
 
 	/**
-	 * @swap_state:
+	 * @atomic_destroy_state:
 	 *
-	 * This function swaps the existing state of a private object @obj with
-	 * it's newly created state, the pointer to which is passed as
-	 * @obj_state_ptr.
+	 * Frees the private object state created with @atomic_duplicate_state.
 	 */
-	void (*swap_state)(void *obj, void **obj_state_ptr);
+	void (*atomic_destroy_state)(struct drm_private_obj *obj,
+				     struct drm_private_state *state);
+};
 
-	/**
-	 * @destroy_state:
-	 *
-	 * Frees the private object state created with @duplicate_state.
-	 */
-	void (*destroy_state)(void *obj_state);
+struct drm_private_obj {
+	struct drm_private_state *state;
+
+	const struct drm_private_state_funcs *funcs;
+};
+
+struct drm_private_state {
+	struct drm_atomic_state *state;
 };
 
 struct __drm_private_objs_state {
-	void *obj;
-	void *obj_state;
-	const struct drm_private_state_funcs *funcs;
+	struct drm_private_obj *ptr;
+	struct drm_private_state *state, *old_state, *new_state;
 };
 
 /**
@@ -207,6 +211,7 @@ struct __drm_private_objs_state {
  * @dev: parent DRM device
  * @allow_modeset: allow full modeset
  * @legacy_cursor_update: hint to enforce legacy cursor IOCTL semantics
+ * @async_update: hint for asynchronous plane update
  * @planes: pointer to array of structures with per-plane data
  * @crtcs: pointer to array of CRTC pointers
  * @num_connector: size of the @connectors and @connector_states arrays
@@ -221,6 +226,7 @@ struct drm_atomic_state {
 	struct drm_device *dev;
 	bool allow_modeset : 1;
 	bool legacy_cursor_update : 1;
+	bool async_update : 1;
 	struct __drm_planes_state *planes;
 	struct __drm_crtcs_state *crtcs;
 	int num_connector;
@@ -309,20 +315,18 @@ int drm_atomic_crtc_set_property(struct drm_crtc *crtc,
 struct drm_plane_state * __must_check
 drm_atomic_get_plane_state(struct drm_atomic_state *state,
 			   struct drm_plane *plane);
-int drm_atomic_plane_set_property(struct drm_plane *plane,
-		struct drm_plane_state *state, struct drm_property *property,
-		uint64_t val);
 struct drm_connector_state * __must_check
 drm_atomic_get_connector_state(struct drm_atomic_state *state,
 			       struct drm_connector *connector);
-int drm_atomic_connector_set_property(struct drm_connector *connector,
-		struct drm_connector_state *state, struct drm_property *property,
-		uint64_t val);
 
-void * __must_check
+void drm_atomic_private_obj_init(struct drm_private_obj *obj,
+				 struct drm_private_state *state,
+				 const struct drm_private_state_funcs *funcs);
+void drm_atomic_private_obj_fini(struct drm_private_obj *obj);
+
+struct drm_private_state * __must_check
 drm_atomic_get_private_obj_state(struct drm_atomic_state *state,
-			      void *obj,
-			      const struct drm_private_state_funcs *funcs);
+				 struct drm_private_obj *obj);
 
 /**
  * drm_atomic_get_existing_crtc_state - get crtc state, if it exists
@@ -541,8 +545,6 @@ int __must_check
 drm_atomic_add_affected_planes(struct drm_atomic_state *state,
 			       struct drm_crtc *crtc);
 
-void drm_atomic_legacy_backoff(struct drm_atomic_state *state);
-
 void
 drm_atomic_clean_old_fb(struct drm_device *dev, unsigned plane_mask, int ret);
 
@@ -809,43 +811,63 @@ void drm_state_dump(struct drm_device *dev, struct drm_printer *p);
 		for_each_if (plane)
 
 /**
- * __for_each_private_obj - iterate over all private objects
+ * for_each_oldnew_private_obj_in_state - iterate over all private objects in an atomic update
  * @__state: &struct drm_atomic_state pointer
- * @obj: private object iteration cursor
- * @obj_state: private object state iteration cursor
+ * @obj: &struct drm_private_obj iteration cursor
+ * @old_obj_state: &struct drm_private_state iteration cursor for the old state
+ * @new_obj_state: &struct drm_private_state iteration cursor for the new state
  * @__i: int iteration cursor, for macro-internal use
- * @__funcs: &struct drm_private_state_funcs iteration cursor
  *
- * This macro iterates over the array containing private object data in atomic
- * state
+ * This iterates over all private objects in an atomic update, tracking both
+ * old and new state. This is useful in places where the state delta needs
+ * to be considered, for example in atomic check functions.
  */
-#define __for_each_private_obj(__state, obj, obj_state, __i, __funcs)	\
-	for ((__i) = 0;							\
-	     (__i) < (__state)->num_private_objs &&			\
-	     ((obj) = (__state)->private_objs[__i].obj,			\
-	      (__funcs) = (__state)->private_objs[__i].funcs,		\
-	      (obj_state) = (__state)->private_objs[__i].obj_state,	\
-	      1);							\
-	     (__i)++)							\
+#define for_each_oldnew_private_obj_in_state(__state, obj, old_obj_state, new_obj_state, __i) \
+	for ((__i) = 0; \
+	     (__i) < (__state)->num_private_objs && \
+		     ((obj) = (__state)->private_objs[__i].ptr, \
+		      (old_obj_state) = (__state)->private_objs[__i].old_state,	\
+		      (new_obj_state) = (__state)->private_objs[__i].new_state, 1); \
+	     (__i)++) \
+		for_each_if (obj)
 
 /**
- * for_each_private_obj - iterate over a specify type of private object
+ * for_each_old_private_obj_in_state - iterate over all private objects in an atomic update
  * @__state: &struct drm_atomic_state pointer
- * @obj_funcs: &struct drm_private_state_funcs function table to filter
- * 	private objects
- * @obj: private object iteration cursor
- * @obj_state: private object state iteration cursor
+ * @obj: &struct drm_private_obj iteration cursor
+ * @old_obj_state: &struct drm_private_state iteration cursor for the old state
  * @__i: int iteration cursor, for macro-internal use
- * @__funcs: &struct drm_private_state_funcs iteration cursor
  *
- * This macro iterates over the private objects state array while filtering the
- * objects based on the vfunc table that is passed as @obj_funcs. New macros
- * can be created by passing in the vfunc table associated with a specific
- * private object.
+ * This iterates over all private objects in an atomic update, tracking only
+ * the old state. This is useful in disable functions, where we need the old
+ * state the hardware is still in.
  */
-#define for_each_private_obj(__state, obj_funcs, obj, obj_state, __i, __funcs)	\
-	__for_each_private_obj(__state, obj, obj_state, __i, __funcs)		\
-		for_each_if (__funcs == obj_funcs)
+#define for_each_old_private_obj_in_state(__state, obj, old_obj_state, __i) \
+	for ((__i) = 0; \
+	     (__i) < (__state)->num_private_objs && \
+		     ((obj) = (__state)->private_objs[__i].ptr, \
+		      (old_obj_state) = (__state)->private_objs[__i].old_state, 1); \
+	     (__i)++) \
+		for_each_if (obj)
+
+/**
+ * for_each_new_private_obj_in_state - iterate over all private objects in an atomic update
+ * @__state: &struct drm_atomic_state pointer
+ * @obj: &struct drm_private_obj iteration cursor
+ * @new_obj_state: &struct drm_private_state iteration cursor for the new state
+ * @__i: int iteration cursor, for macro-internal use
+ *
+ * This iterates over all private objects in an atomic update, tracking only
+ * the new state. This is useful in enable functions, where we need the new state the
+ * hardware should be in when the atomic commit operation has completed.
+ */
+#define for_each_new_private_obj_in_state(__state, obj, new_obj_state, __i) \
+	for ((__i) = 0; \
+	     (__i) < (__state)->num_private_objs && \
+		     ((obj) = (__state)->private_objs[__i].ptr, \
+		      (new_obj_state) = (__state)->private_objs[__i].new_state, 1); \
+	     (__i)++) \
+		for_each_if (obj)
 
 /**
  * drm_atomic_crtc_needs_modeset - compute combined modeset need
@@ -853,7 +875,7 @@ void drm_state_dump(struct drm_device *dev, struct drm_printer *p);
  *
  * To give drivers flexibility &struct drm_crtc_state has 3 booleans to track
  * whether the state CRTC changed enough to need a full modeset cycle:
- * planes_changed, mode_changed and active_changed. This helper simply
+ * mode_changed, active_changed and connectors_changed. This helper simply
  * combines these three to compute the overall need for a modeset for @state.
  *
  * The atomic helper code sets these booleans, but drivers can and should
diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h
index f0a8678..d2b56cc 100644
--- a/include/drm/drm_atomic_helper.h
+++ b/include/drm/drm_atomic_helper.h
@@ -33,6 +33,8 @@
 #include <drm/drm_modeset_helper.h>
 
 struct drm_atomic_state;
+struct drm_private_obj;
+struct drm_private_state;
 
 int drm_atomic_helper_check_modeset(struct drm_device *dev,
 				struct drm_atomic_state *state);
@@ -41,9 +43,14 @@ int drm_atomic_helper_check_planes(struct drm_device *dev,
 int drm_atomic_helper_check(struct drm_device *dev,
 			    struct drm_atomic_state *state);
 void drm_atomic_helper_commit_tail(struct drm_atomic_state *state);
+void drm_atomic_helper_commit_tail_rpm(struct drm_atomic_state *state);
 int drm_atomic_helper_commit(struct drm_device *dev,
 			     struct drm_atomic_state *state,
 			     bool nonblock);
+int drm_atomic_helper_async_check(struct drm_device *dev,
+				  struct drm_atomic_state *state);
+void drm_atomic_helper_async_commit(struct drm_device *dev,
+				    struct drm_atomic_state *state);
 
 int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
 					struct drm_atomic_state *state,
@@ -52,6 +59,9 @@ int drm_atomic_helper_wait_for_fences(struct drm_device *dev,
 void drm_atomic_helper_wait_for_vblanks(struct drm_device *dev,
 					struct drm_atomic_state *old_state);
 
+void drm_atomic_helper_wait_for_flip_done(struct drm_device *dev,
+					  struct drm_atomic_state *old_state);
+
 void
 drm_atomic_helper_update_legacy_modeset_state(struct drm_device *dev,
 					      struct drm_atomic_state *old_state);
@@ -77,8 +87,8 @@ void
 drm_atomic_helper_disable_planes_on_crtc(struct drm_crtc_state *old_crtc_state,
 					 bool atomic);
 
-void drm_atomic_helper_swap_state(struct drm_atomic_state *state,
-				  bool stall);
+int __must_check drm_atomic_helper_swap_state(struct drm_atomic_state *state,
+					      bool stall);
 
 /* nonblocking commit helpers */
 int drm_atomic_helper_setup_commit(struct drm_atomic_state *state,
@@ -114,15 +124,6 @@ int drm_atomic_helper_commit_duplicated_state(struct drm_atomic_state *state,
 int drm_atomic_helper_resume(struct drm_device *dev,
 			     struct drm_atomic_state *state);
 
-int drm_atomic_helper_crtc_set_property(struct drm_crtc *crtc,
-					struct drm_property *property,
-					uint64_t val);
-int drm_atomic_helper_plane_set_property(struct drm_plane *plane,
-					struct drm_property *property,
-					uint64_t val);
-int drm_atomic_helper_connector_set_property(struct drm_connector *connector,
-					struct drm_property *property,
-					uint64_t val);
 int drm_atomic_helper_page_flip(struct drm_crtc *crtc,
 				struct drm_framebuffer *fb,
 				struct drm_pending_vblank_event *event,
@@ -135,8 +136,6 @@ int drm_atomic_helper_page_flip_target(
 				uint32_t flags,
 				uint32_t target,
 				struct drm_modeset_acquire_ctx *ctx);
-int drm_atomic_helper_connector_dpms(struct drm_connector *connector,
-				     int mode);
 struct drm_encoder *
 drm_atomic_helper_best_encoder(struct drm_connector *connector);
 
@@ -178,6 +177,8 @@ int drm_atomic_helper_legacy_gamma_set(struct drm_crtc *crtc,
 				       u16 *red, u16 *green, u16 *blue,
 				       uint32_t size,
 				       struct drm_modeset_acquire_ctx *ctx);
+void __drm_atomic_helper_private_obj_duplicate_state(struct drm_private_obj *obj,
+						     struct drm_private_state *state);
 
 /**
  * drm_atomic_crtc_for_each_plane - iterate over planes currently attached to CRTC
diff --git a/include/drm/drm_bridge.h b/include/drm/drm_bridge.h
index 1dc94d5..6522d4cb 100644
--- a/include/drm/drm_bridge.h
+++ b/include/drm/drm_bridge.h
@@ -268,6 +268,9 @@ void drm_bridge_enable(struct drm_bridge *bridge);
 struct drm_bridge *drm_panel_bridge_add(struct drm_panel *panel,
 					u32 connector_type);
 void drm_panel_bridge_remove(struct drm_bridge *bridge);
+struct drm_bridge *devm_drm_panel_bridge_add(struct device *dev,
+					     struct drm_panel *panel,
+					     u32 connector_type);
 #endif
 
 #endif
diff --git a/include/drm/drm_connector.h b/include/drm/drm_connector.h
index ae5b7dc..ea8da401 100644
--- a/include/drm/drm_connector.h
+++ b/include/drm/drm_connector.h
@@ -135,6 +135,28 @@ struct drm_scdc {
 struct drm_hdmi_info {
 	/** @scdc: sink's scdc support and capabilities */
 	struct drm_scdc scdc;
+
+	/**
+	 * @y420_vdb_modes: bitmap of modes which can support ycbcr420
+	 * output only (not normal RGB/YCBCR444/422 outputs). The CEA-861-F
+	 * spec defines a total of 107 VICs, so the size is 128 bits to map
+	 * up to 128 VICs.
+	 */
+	unsigned long y420_vdb_modes[BITS_TO_LONGS(128)];
+
+	/**
+	 * @y420_cmdb_modes: bitmap of modes which can support ycbcr420
+	 * output also, along with normal HDMI outputs. The CEA-861-F spec
+	 * defines a total of 107 VICs, so the size is 128 bits to map up to
+	 * 128 VICs.
+	 */
+	unsigned long y420_cmdb_modes[BITS_TO_LONGS(128)];
+
+	/** @y420_cmdb_map: bitmap of SVD index, to extract vcb modes */
+	u64 y420_cmdb_map;
+
+	/** @y420_dc_modes: bitmap of deep color support index */
+	u8 y420_dc_modes;
 };
 
 /**
@@ -198,6 +220,7 @@ struct drm_display_info {
 #define DRM_COLOR_FORMAT_RGB444		(1<<0)
 #define DRM_COLOR_FORMAT_YCRCB444	(1<<1)
 #define DRM_COLOR_FORMAT_YCRCB422	(1<<2)
+#define DRM_COLOR_FORMAT_YCRCB420	(1<<3)
 
 	/**
 	 * @color_formats: HDMI Color formats, selects between RGB and YCrCb
@@ -359,8 +382,8 @@ struct drm_connector_funcs {
 	 * implement the 4 level DPMS support on the connector any more, but
 	 * instead only have an on/off "ACTIVE" property on the CRTC object.
 	 *
-	 * Drivers implementing atomic modeset should use
-	 * drm_atomic_helper_connector_dpms() to implement this hook.
+	 * This hook is not used by atomic drivers, remapping of the legacy DPMS
+	 * property is entirely handled in the DRM core.
 	 *
 	 * RETURNS:
 	 *
@@ -457,11 +480,9 @@ struct drm_connector_funcs {
 	 * This is the legacy entry point to update a property attached to the
 	 * connector.
 	 *
-	 * Drivers implementing atomic modeset should use
-	 * drm_atomic_helper_connector_set_property() to implement this hook.
-	 *
 	 * This callback is optional if the driver does not support any legacy
-	 * driver-private properties.
+	 * driver-private properties. For atomic drivers it is not used because
+	 * property handling is done entirely in the DRM core.
 	 *
 	 * RETURNS:
 	 *
@@ -726,6 +747,15 @@ struct drm_connector {
 	bool interlace_allowed;
 	bool doublescan_allowed;
 	bool stereo_allowed;
+
+	/**
+	 * @ycbcr_420_allowed: This bool indicates if this connector is
+	 * capable of handling YCBCR 420 output. While parsing the EDID
+	 * blocks, it's very helpful to know if the source is capable of
+	 * handling YCBCR 420 outputs.
+	 */
+	bool ycbcr_420_allowed;
+
 	/**
 	 * @registered: Is this connector exposed (registered) with userspace?
 	 * Protected by @mutex.
diff --git a/include/drm/drm_crtc.h b/include/drm/drm_crtc.h
index 629a5fe..1a64202 100644
--- a/include/drm/drm_crtc.h
+++ b/include/drm/drm_crtc.h
@@ -358,14 +358,6 @@ struct drm_crtc_funcs {
 	 * drm_crtc_enable_color_mgmt(), which then supports the legacy gamma
 	 * interface through the drm_atomic_helper_legacy_gamma_set()
 	 * compatibility implementation.
-	 *
-	 * NOTE:
-	 *
-	 * Drivers that support gamma tables and also fbdev emulation through
-	 * the provided helper library need to take care to fill out the gamma
-	 * hooks for both. Currently there's a bit an unfortunate duplication
-	 * going on, which should eventually be unified to just one set of
-	 * hooks.
 	 */
 	int (*gamma_set)(struct drm_crtc *crtc, u16 *r, u16 *g, u16 *b,
 			 uint32_t size,
@@ -481,11 +473,9 @@ struct drm_crtc_funcs {
 	 * This is the legacy entry point to update a property attached to the
 	 * CRTC.
 	 *
-	 * Drivers implementing atomic modeset should use
-	 * drm_atomic_helper_crtc_set_property() to implement this hook.
-	 *
 	 * This callback is optional if the driver does not support any legacy
-	 * driver-private properties.
+	 * driver-private properties. For atomic drivers it is not used because
+	 * property handling is done entirely in the DRM core.
 	 *
 	 * RETURNS:
 	 *
@@ -685,6 +675,9 @@ struct drm_crtc_funcs {
 	 * drm_crtc_vblank_off() and drm_crtc_vblank_on() when disabling or
 	 * enabling a CRTC.
 	 *
+	 * See also &drm_device.vblank_disable_immediate and
+	 * &drm_device.max_vblank_count.
+	 *
 	 * Returns:
 	 *
 	 * Raw vblank counter value.
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
new file mode 100644
index 0000000..e21af87
--- /dev/null
+++ b/include/drm/drm_device.h
@@ -0,0 +1,190 @@
+#ifndef _DRM_DEVICE_H_
+#define _DRM_DEVICE_H_
+
+#include <linux/list.h>
+#include <linux/kref.h>
+#include <linux/mutex.h>
+#include <linux/idr.h>
+
+#include <drm/drm_hashtab.h>
+#include <drm/drm_mode_config.h>
+
+struct drm_driver;
+struct drm_minor;
+struct drm_master;
+struct drm_device_dma;
+struct drm_vblank_crtc;
+struct drm_sg_mem;
+struct drm_local_map;
+struct drm_vma_offset_manager;
+
+struct inode;
+
+struct pci_dev;
+struct pci_controller;
+
+/**
+ * DRM device structure. This structure represents a complete card that
+ * may contain multiple heads.
+ */
+struct drm_device {
+	struct list_head legacy_dev_list;/**< list of devices per driver for stealth attach cleanup */
+	int if_version;			/**< Highest interface version set */
+
+	/** \name Lifetime Management */
+	/*@{ */
+	struct kref ref;		/**< Object ref-count */
+	struct device *dev;		/**< Device structure of bus-device */
+	struct drm_driver *driver;	/**< DRM driver managing the device */
+	void *dev_private;		/**< DRM driver private data */
+	struct drm_minor *control;		/**< Control node */
+	struct drm_minor *primary;		/**< Primary node */
+	struct drm_minor *render;		/**< Render node */
+	bool registered;
+
+	/* currently active master for this device. Protected by master_mutex */
+	struct drm_master *master;
+
+	atomic_t unplugged;			/**< Flag whether dev is dead */
+	struct inode *anon_inode;		/**< inode for private address-space */
+	char *unique;				/**< unique name of the device */
+	/*@} */
+
+	/** \name Locks */
+	/*@{ */
+	struct mutex struct_mutex;	/**< For others */
+	struct mutex master_mutex;      /**< For drm_minor::master and drm_file::is_master */
+	/*@} */
+
+	/** \name Usage Counters */
+	/*@{ */
+	int open_count;			/**< Outstanding files open, protected by drm_global_mutex. */
+	spinlock_t buf_lock;		/**< For drm_device::buf_use and a few other things. */
+	int buf_use;			/**< Buffers in use -- cannot alloc */
+	atomic_t buf_alloc;		/**< Buffer allocation in progress */
+	/*@} */
+
+	struct mutex filelist_mutex;
+	struct list_head filelist;
+
+	/** \name Memory management */
+	/*@{ */
+	struct list_head maplist;	/**< Linked list of regions */
+	struct drm_open_hash map_hash;	/**< User token hash table for maps */
+
+	/** \name Context handle management */
+	/*@{ */
+	struct list_head ctxlist;	/**< Linked list of context handles */
+	struct mutex ctxlist_mutex;	/**< For ctxlist */
+
+	struct idr ctx_idr;
+
+	struct list_head vmalist;	/**< List of vmas (for debugging) */
+
+	/*@} */
+
+	/** \name DMA support */
+	/*@{ */
+	struct drm_device_dma *dma;		/**< Optional pointer for DMA support */
+	/*@} */
+
+	/** \name Context support */
+	/*@{ */
+
+	__volatile__ long context_flag;	/**< Context swapping flag */
+	int last_context;		/**< Last current context */
+	/*@} */
+
+	/**
+	 * @irq_enabled:
+	 *
+	 * Indicates that interrupt handling is enabled, specifically vblank
+	 * handling. Drivers which don't use drm_irq_install() need to set this
+	 * to true manually.
+	 */
+	bool irq_enabled;
+	int irq;
+
+	/**
+	 * @vblank_disable_immediate:
+	 *
+	 * If true, vblank interrupt will be disabled immediately when the
+	 * refcount drops to zero, as opposed to via the vblank disable
+	 * timer.
+	 *
+	 * This can be set to true if the hardware has a working vblank counter
+	 * with high-precision timestamping (otherwise there are races) and the
+	 * driver uses drm_crtc_vblank_on() and drm_crtc_vblank_off()
+	 * appropriately. See also @max_vblank_count and
+	 * &drm_crtc_funcs.get_vblank_counter.
+	 */
+	bool vblank_disable_immediate;
+
+	/**
+	 * @vblank:
+	 *
+	 * Array of vblank tracking structures, one per &struct drm_crtc. For
+	 * historical reasons (vblank support predates kernel modesetting) this
+	 * is free-standing and not part of &struct drm_crtc itself. It must be
+	 * initialized explicitly by calling drm_vblank_init().
+	 */
+	struct drm_vblank_crtc *vblank;
+
+	spinlock_t vblank_time_lock;    /**< Protects vblank count and time updates during vblank enable/disable */
+	spinlock_t vbl_lock;
+
+	/**
+	 * @max_vblank_count:
+	 *
+	 * Maximum value of the vblank registers. This value +1 will result in a
+	 * wrap-around of the vblank register. It is used by the vblank core to
+	 * handle wrap-arounds.
+	 *
+	 * If set to zero the vblank core will try to guess the elapsed vblanks
+	 * between times when the vblank interrupt is disabled through
+	 * high-precision timestamps. That approach suffers from small
+	 * races and imprecision over longer time periods, hence exposing a
+	 * hardware vblank counter is always recommended.
+	 *
+	 * If non-zero, &drm_crtc_funcs.get_vblank_counter must be set.
+	 */
+	u32 max_vblank_count;           /**< size of vblank counter register */
+
+	/**
+	 * List of events
+	 */
+	struct list_head vblank_event_list;
+	spinlock_t event_lock;
+
+	/*@} */
+
+	struct drm_agp_head *agp;	/**< AGP data */
+
+	struct pci_dev *pdev;		/**< PCI device structure */
+#ifdef __alpha__
+	struct pci_controller *hose;
+#endif
+
+	struct drm_sg_mem *sg;	/**< Scatter gather memory */
+	unsigned int num_crtcs;                  /**< Number of CRTCs on this device */
+
+	struct {
+		int context;
+		struct drm_hw_lock *lock;
+	} sigdata;
+
+	struct drm_local_map *agp_buffer_map;
+	unsigned int agp_buffer_token;
+
+	struct drm_mode_config mode_config;	/**< Current mode config */
+
+	/** \name GEM information */
+	/*@{ */
+	struct mutex object_name_lock;
+	struct idr object_name_idr;
+	struct drm_vma_offset_manager *vma_offset_manager;
+	/*@} */
+	int switch_power_state;
+};
+
+#endif
diff --git a/include/drm/drm_dp_mst_helper.h b/include/drm/drm_dp_mst_helper.h
index 177ab6f..d55abb7 100644
--- a/include/drm/drm_dp_mst_helper.h
+++ b/include/drm/drm_dp_mst_helper.h
@@ -404,12 +404,17 @@ struct drm_dp_payload {
 	int vcpi;
 };
 
+#define to_dp_mst_topology_state(x) container_of(x, struct drm_dp_mst_topology_state, base)
+
 struct drm_dp_mst_topology_state {
+	struct drm_private_state base;
 	int avail_slots;
 	struct drm_atomic_state *state;
 	struct drm_dp_mst_topology_mgr *mgr;
 };
 
+#define to_dp_mst_topology_mgr(x) container_of(x, struct drm_dp_mst_topology_mgr, base)
+
 /**
  * struct drm_dp_mst_topology_mgr - DisplayPort MST manager
  *
@@ -419,6 +424,11 @@ struct drm_dp_mst_topology_state {
  */
 struct drm_dp_mst_topology_mgr {
 	/**
+	 * @base: Base private object for atomic
+	 */
+	struct drm_private_obj base;
+
+	/**
 	 * @dev: device pointer for adding i2c devices etc.
 	 */
 	struct drm_device *dev;
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index d855f9a..71bbaae 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -30,7 +30,8 @@
 #include <linux/list.h>
 #include <linux/irqreturn.h>
 
-struct drm_device;
+#include <drm/drm_device.h>
+
 struct drm_file;
 struct drm_gem_object;
 struct drm_master;
@@ -173,8 +174,6 @@ struct drm_driver {
 	 */
 	void (*release) (struct drm_device *);
 
-	int (*set_busid)(struct drm_device *dev, struct drm_master *master);
-
 	/**
 	 * @get_vblank_counter:
 	 *
@@ -392,6 +391,11 @@ struct drm_driver {
 	 */
 	void (*master_drop)(struct drm_device *dev, struct drm_file *file_priv);
 
+	/**
+	 * @debugfs_init:
+	 *
+	 * Allows drivers to create driver-specific debugfs files.
+	 */
 	int (*debugfs_init)(struct drm_minor *minor);
 
 	/**
@@ -410,7 +414,18 @@ struct drm_driver {
 	 */
 	void (*gem_free_object_unlocked) (struct drm_gem_object *obj);
 
+	/**
+	 * @gem_open_object:
+	 *
+	 * Driver hook called upon gem handle creation
+	 */
 	int (*gem_open_object) (struct drm_gem_object *, struct drm_file *);
+
+	/**
+	 * @gem_close_object:
+	 *
+	 * Driver hook called upon gem handle release
+	 */
 	void (*gem_close_object) (struct drm_gem_object *, struct drm_file *);
 
 	/**
@@ -423,19 +438,34 @@ struct drm_driver {
 						    size_t size);
 
 	/* prime: */
-	/* export handle -> fd (see drm_gem_prime_handle_to_fd() helper) */
+	/**
+	 * @prime_handle_to_fd:
+	 *
+	 * export handle -> fd (see drm_gem_prime_handle_to_fd() helper)
+	 */
 	int (*prime_handle_to_fd)(struct drm_device *dev, struct drm_file *file_priv,
 				uint32_t handle, uint32_t flags, int *prime_fd);
-	/* import fd -> handle (see drm_gem_prime_fd_to_handle() helper) */
+	/**
+	 * @prime_fd_to_handle:
+	 *
+	 * import fd -> handle (see drm_gem_prime_fd_to_handle() helper)
+	 */
 	int (*prime_fd_to_handle)(struct drm_device *dev, struct drm_file *file_priv,
 				int prime_fd, uint32_t *handle);
-	/* export GEM -> dmabuf */
+	/**
+	 * @gem_prime_export:
+	 *
+	 * export GEM -> dmabuf
+	 */
 	struct dma_buf * (*gem_prime_export)(struct drm_device *dev,
 				struct drm_gem_object *obj, int flags);
-	/* import dmabuf -> GEM */
+	/**
+	 * @gem_prime_import:
+	 *
+	 * import dmabuf -> GEM
+	 */
 	struct drm_gem_object * (*gem_prime_import)(struct drm_device *dev,
 				struct dma_buf *dma_buf);
-	/* low-level interface used by drm_gem_prime_{import,export} */
 	int (*gem_prime_pin)(struct drm_gem_object *obj);
 	void (*gem_prime_unpin)(struct drm_gem_object *obj);
 	struct reservation_object * (*gem_prime_res_obj)(
@@ -507,19 +537,46 @@ struct drm_driver {
 			    struct drm_device *dev,
 			    uint32_t handle);
 
-	/* Driver private ops for this object */
+	/**
+	 * @gem_vm_ops: Driver private ops for this object
+	 */
 	const struct vm_operations_struct *gem_vm_ops;
 
+	/** @major: driver major number */
 	int major;
+	/** @minor: driver minor number */
 	int minor;
+	/** @patchlevel: driver patch level */
 	int patchlevel;
+	/** @name: driver name */
 	char *name;
+	/** @desc: driver description */
 	char *desc;
+	/** @date: driver date */
 	char *date;
 
+	/** @driver_features: driver features */
 	u32 driver_features;
+
+	/**
+	 * @ioctls:
+	 *
+	 * Array of driver-private IOCTL description entries. See the chapter on
+	 * :ref:`IOCTL support in the userland interfaces
+	 * chapter<drm_driver_ioctl>` for the full details.
+	 */
+
 	const struct drm_ioctl_desc *ioctls;
+	/** @num_ioctls: Number of entries in @ioctls. */
 	int num_ioctls;
+
+	/**
+	 * @fops:
+	 *
+	 * File operations for the DRM device node. See the discussion in
+	 * :ref:`file operations<drm_driver_fops>` for in-depth coverage and
+	 * some examples.
+	 */
 	const struct file_operations *fops;
 
 	/* Everything below here is for legacy driver, never use! */
@@ -557,7 +614,24 @@ void drm_dev_unregister(struct drm_device *dev);
 void drm_dev_ref(struct drm_device *dev);
 void drm_dev_unref(struct drm_device *dev);
 void drm_put_dev(struct drm_device *dev);
-void drm_unplug_dev(struct drm_device *dev);
+void drm_dev_unplug(struct drm_device *dev);
+
+/**
+ * drm_dev_is_unplugged - is a DRM device unplugged
+ * @dev: DRM device
+ *
+ * This function checks whether a hotpluggable device has been unplugged.
+ * Unplugging itself is signalled through drm_dev_unplug(). If a device is
+ * unplugged, these two functions guarantee that any store before calling
+ * drm_dev_unplug() is visible to callers of this function after it completes.
+ */
+static inline int drm_dev_is_unplugged(struct drm_device *dev)
+{
+	int ret = atomic_read(&dev->unplugged);
+	smp_rmb();
+	return ret;
+}
+
 
 int drm_dev_set_unique(struct drm_device *dev, const char *name);
 
diff --git a/include/drm/drm_edid.h b/include/drm/drm_edid.h
index 7b9f48b..1e1908a 100644
--- a/include/drm/drm_edid.h
+++ b/include/drm/drm_edid.h
@@ -213,6 +213,14 @@ struct detailed_timing {
 #define DRM_EDID_HDMI_DC_30               (1 << 4)
 #define DRM_EDID_HDMI_DC_Y444             (1 << 3)
 
+/* YCBCR 420 deep color modes */
+#define DRM_EDID_YCBCR420_DC_48		  (1 << 6)
+#define DRM_EDID_YCBCR420_DC_36		  (1 << 5)
+#define DRM_EDID_YCBCR420_DC_30		  (1 << 4)
+#define DRM_EDID_YCBCR420_DC_MASK (DRM_EDID_YCBCR420_DC_48 | \
+				    DRM_EDID_YCBCR420_DC_36 | \
+				    DRM_EDID_YCBCR420_DC_30)
+
 /* ELD Header Block */
 #define DRM_ELD_HEADER_BLOCK_SIZE	4
 
@@ -343,7 +351,8 @@ drm_load_edid_firmware(struct drm_connector *connector)
 
 int
 drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
-					 const struct drm_display_mode *mode);
+					 const struct drm_display_mode *mode,
+					 bool is_hdmi2_sink);
 int
 drm_hdmi_vendor_infoframe_from_display_mode(struct hdmi_vendor_infoframe *frame,
 					    const struct drm_display_mode *mode);
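A minimal userspace sketch of how the DRM_EDID_YCBCR420_DC_* bits added above could be tested against a capability byte. The macro values are copied from the hunk; the helper ycbcr420_max_bpc() is hypothetical, and the 16/12/10 bits-per-component mapping assumes the 48/36/30 suffixes count bits per pixel over three components.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the drm_edid.h hunk above. */
#define DRM_EDID_YCBCR420_DC_48   (1 << 6)
#define DRM_EDID_YCBCR420_DC_36   (1 << 5)
#define DRM_EDID_YCBCR420_DC_30   (1 << 4)
#define DRM_EDID_YCBCR420_DC_MASK (DRM_EDID_YCBCR420_DC_48 | \
				   DRM_EDID_YCBCR420_DC_36 | \
				   DRM_EDID_YCBCR420_DC_30)

/*
 * Hypothetical helper (not a kernel function): return the deepest
 * bits-per-component advertised in dc_byte, or 8 if no deep color bit
 * is set. Bits outside the mask are ignored.
 */
static int ycbcr420_max_bpc(uint8_t dc_byte)
{
	uint8_t dc = dc_byte & DRM_EDID_YCBCR420_DC_MASK;

	if (dc & DRM_EDID_YCBCR420_DC_48)
		return 16;
	if (dc & DRM_EDID_YCBCR420_DC_36)
		return 12;
	if (dc & DRM_EDID_YCBCR420_DC_30)
		return 10;
	return 8;
}

/* Usage: a byte with only the 36 and 30 bits set tops out at 12 bpc. */
static void ycbcr420_example(void)
{
	assert(ycbcr420_max_bpc(DRM_EDID_YCBCR420_DC_36 |
				DRM_EDID_YCBCR420_DC_30) == 12);
}
```

Masking first keeps unrelated bits in the same byte from influencing the result.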
diff --git a/include/drm/drm_fb_cma_helper.h b/include/drm/drm_fb_cma_helper.h
index 199a63f..a323781 100644
--- a/include/drm/drm_fb_cma_helper.h
+++ b/include/drm/drm_fb_cma_helper.h
@@ -24,9 +24,9 @@ void drm_fbdev_cma_fini(struct drm_fbdev_cma *fbdev_cma);
 
 void drm_fbdev_cma_restore_mode(struct drm_fbdev_cma *fbdev_cma);
 void drm_fbdev_cma_hotplug_event(struct drm_fbdev_cma *fbdev_cma);
-void drm_fbdev_cma_set_suspend(struct drm_fbdev_cma *fbdev_cma, int state);
+void drm_fbdev_cma_set_suspend(struct drm_fbdev_cma *fbdev_cma, bool state);
 void drm_fbdev_cma_set_suspend_unlocked(struct drm_fbdev_cma *fbdev_cma,
-					int state);
+					bool state);
 
 void drm_fb_cma_destroy(struct drm_framebuffer *fb);
 int drm_fb_cma_create_handle(struct drm_framebuffer *fb,
diff --git a/include/drm/drm_fb_helper.h b/include/drm/drm_fb_helper.h
index 119e5e4..33fe9592 100644
--- a/include/drm/drm_fb_helper.h
+++ b/include/drm/drm_fb_helper.h
@@ -85,38 +85,6 @@ struct drm_fb_helper_surface_size {
  */
 struct drm_fb_helper_funcs {
 	/**
-	 * @gamma_set:
-	 *
-	 * Set the given gamma LUT register on the given CRTC.
-	 *
-	 * This callback is optional.
-	 *
-	 * FIXME:
-	 *
-	 * This callback is functionally redundant with the core gamma table
-	 * support and simply exists because the fbdev hasn't yet been
-	 * refactored to use the core gamma table interfaces.
-	 */
-	void (*gamma_set)(struct drm_crtc *crtc, u16 red, u16 green,
-			  u16 blue, int regno);
-	/**
-	 * @gamma_get:
-	 *
-	 * Read the given gamma LUT register on the given CRTC, used to save the
-	 * current LUT when force-restoring the fbdev for e.g. kdbg.
-	 *
-	 * This callback is optional.
-	 *
-	 * FIXME:
-	 *
-	 * This callback is functionally redundant with the core gamma table
-	 * support and simply exists because the fbdev hasn't yet been
-	 * refactored to use the core gamma table interfaces.
-	 */
-	void (*gamma_get)(struct drm_crtc *crtc, u16 *red, u16 *green,
-			  u16 *blue, int regno);
-
-	/**
 	 * @fb_probe:
 	 *
 	 * Driver callback to allocate and initialize the fbdev info structure.
@@ -169,7 +137,6 @@ struct drm_fb_helper_connector {
  * @crtc_info: per-CRTC helper state (mode, x/y offset, etc)
  * @connector_count: number of connected connectors
  * @connector_info_alloc_count: size of connector_info
- * @connector_info: array of per-connector information
  * @funcs: driver callbacks for fb helper
  * @fbdev: emulated fbdev device info struct
  * @pseudo_palette: fake palette of 16 colors
@@ -191,6 +158,12 @@ struct drm_fb_helper {
 	struct drm_fb_helper_crtc *crtc_info;
 	int connector_count;
 	int connector_info_alloc_count;
+	/**
+	 * @connector_info:
+	 *
+	 * Array of per-connector information. Do not iterate directly, but use
+	 * drm_fb_helper_for_each_connector.
+	 */
 	struct drm_fb_helper_connector **connector_info;
 	const struct drm_fb_helper_funcs *funcs;
 	struct fb_info *fbdev;
@@ -201,6 +174,18 @@ struct drm_fb_helper {
 	struct work_struct resume_work;
 
 	/**
+	 * @lock:
+	 *
+	 * Top-level FBDEV helper lock. This protects all internal data
+	 * structures and lists, such as @connector_info and @crtc_info.
+	 *
+	 * FIXME: fbdev emulation locking is a mess and long term we want to
+	 * protect all helper internal state with this lock as well as reduce
+	 * core KMS locking as much as possible.
+	 */
+	struct mutex lock;
+
+	/**
 	 * @kernel_fb_list:
 	 *
 	 * Entry on the global kernel_fb_helper_list, used for kgdb entry/exit.
@@ -215,6 +200,29 @@ struct drm_fb_helper {
 	 * needs to be reprobe when fbdev is in control again.
 	 */
 	bool delayed_hotplug;
+
+	/**
+	 * @deferred_setup:
+	 *
+	 * If no outputs are connected (disconnected or unknown) the FB helper
+	 * code will defer setup until at least one of the outputs shows up.
+	 * This field keeps track of the status so that setup can be retried
+	 * at every hotplug event until it succeeds eventually.
+	 *
+	 * Protected by @lock.
+	 */
+	bool deferred_setup;
+
+	/**
+	 * @preferred_bpp:
+	 *
+	 * Temporary storage for the driver's preferred BPP setting passed to
+	 * FB helper initialization. This needs to be tracked so that deferred
+	 * FB helper setup can pass this on.
+	 *
+	 * See also: @deferred_setup
+	 */
+	int preferred_bpp;
 };
 
 /**
diff --git a/include/drm/drm_framebuffer.h b/include/drm/drm_framebuffer.h
index 5244f05..b6996dd 100644
--- a/include/drm/drm_framebuffer.h
+++ b/include/drm/drm_framebuffer.h
@@ -190,6 +190,13 @@ struct drm_framebuffer {
 	 * @filp_head: Placed on &drm_file.fbs, protected by &drm_file.fbs_lock.
 	 */
 	struct list_head filp_head;
+	/**
+	 * @obj: GEM objects backing the framebuffer, one per plane (optional).
+	 *
+	 * This is used by the GEM framebuffer helpers, see e.g.
+	 * drm_gem_fb_create().
+	 */
+	struct drm_gem_object *obj[4];
 };
 
 #define obj_to_fb(x) container_of(x, struct drm_framebuffer, base)
diff --git a/include/drm/drm_gem.h b/include/drm/drm_gem.h
index 663d803..9c55c2a 100644
--- a/include/drm/drm_gem.h
+++ b/include/drm/drm_gem.h
@@ -131,21 +131,6 @@ struct drm_gem_object {
 	uint32_t write_domain;
 
 	/**
-	 * @pending_read_domains:
-	 *
-	 * While validating an exec operation, the
-	 * new read/write domain values are computed here.
-	 * They will be transferred to the above values
-	 * at the point that any cache flushing occurs
-	 */
-	uint32_t pending_read_domains;
-
-	/**
-	 * @pending_write_domain: Write domain similar to @pending_read_domains.
-	 */
-	uint32_t pending_write_domain;
-
-	/**
 	 * @dma_buf:
 	 *
 	 * dma-buf associated with this GEM object.
@@ -317,6 +302,8 @@ void drm_gem_put_pages(struct drm_gem_object *obj, struct page **pages,
 		bool dirty, bool accessed);
 
 struct drm_gem_object *drm_gem_object_lookup(struct drm_file *filp, u32 handle);
+int drm_gem_dumb_map_offset(struct drm_file *file, struct drm_device *dev,
+			    u32 handle, u64 *offset);
 int drm_gem_dumb_destroy(struct drm_file *file,
 			 struct drm_device *dev,
 			 uint32_t handle);
diff --git a/include/drm/drm_gem_cma_helper.h b/include/drm/drm_gem_cma_helper.h
index b42529e..58a739b 100644
--- a/include/drm/drm_gem_cma_helper.h
+++ b/include/drm/drm_gem_cma_helper.h
@@ -73,11 +73,6 @@ int drm_gem_cma_dumb_create(struct drm_file *file_priv,
 			    struct drm_device *drm,
 			    struct drm_mode_create_dumb *args);
 
-/* map memory region for DRM framebuffer to user space */
-int drm_gem_cma_dumb_map_offset(struct drm_file *file_priv,
-				struct drm_device *drm, u32 handle,
-				u64 *offset);
-
 /* set vm_flags and we can change the VM attribute to other one at here */
 int drm_gem_cma_mmap(struct file *filp, struct vm_area_struct *vma);
 
diff --git a/include/drm/drm_gem_framebuffer_helper.h b/include/drm/drm_gem_framebuffer_helper.h
new file mode 100644
index 0000000..db9cfa0
--- /dev/null
+++ b/include/drm/drm_gem_framebuffer_helper.h
@@ -0,0 +1,37 @@
+#ifndef __DRM_GEM_FB_HELPER_H__
+#define __DRM_GEM_FB_HELPER_H__
+
+struct drm_device;
+struct drm_file;
+struct drm_fb_helper_surface_size;
+struct drm_framebuffer;
+struct drm_framebuffer_funcs;
+struct drm_gem_object;
+struct drm_mode_fb_cmd2;
+struct drm_plane;
+struct drm_plane_state;
+
+struct drm_gem_object *drm_gem_fb_get_obj(struct drm_framebuffer *fb,
+					  unsigned int plane);
+void drm_gem_fb_destroy(struct drm_framebuffer *fb);
+int drm_gem_fb_create_handle(struct drm_framebuffer *fb, struct drm_file *file,
+			     unsigned int *handle);
+
+struct drm_framebuffer *
+drm_gem_fb_create_with_funcs(struct drm_device *dev, struct drm_file *file,
+			     const struct drm_mode_fb_cmd2 *mode_cmd,
+			     const struct drm_framebuffer_funcs *funcs);
+struct drm_framebuffer *
+drm_gem_fb_create(struct drm_device *dev, struct drm_file *file,
+		  const struct drm_mode_fb_cmd2 *mode_cmd);
+
+int drm_gem_fb_prepare_fb(struct drm_plane *plane,
+			  struct drm_plane_state *state);
+
+struct drm_framebuffer *
+drm_gem_fbdev_fb_create(struct drm_device *dev,
+			struct drm_fb_helper_surface_size *sizes,
+			unsigned int pitch_align, struct drm_gem_object *obj,
+			const struct drm_framebuffer_funcs *funcs);
+
+#endif
diff --git a/include/drm/drm_mode_config.h b/include/drm/drm_mode_config.h
index 4298171..1b37368 100644
--- a/include/drm/drm_mode_config.h
+++ b/include/drm/drm_mode_config.h
@@ -757,6 +757,12 @@ struct drm_mode_config {
 	 */
 	bool allow_fb_modifiers;
 
+	/**
+	 * @modifiers_property: Plane property to list supported
+	 * modifier/format combinations.
+	 */
+	struct drm_property *modifiers_property;
+
 	/* cursor size */
 	uint32_t cursor_width, cursor_height;
 
diff --git a/include/drm/drm_modes.h b/include/drm/drm_modes.h
index 94ac771..9f3421c 100644
--- a/include/drm/drm_modes.h
+++ b/include/drm/drm_modes.h
@@ -80,6 +80,7 @@ struct videomode;
  * @MODE_ONE_SIZE: only one resolution is supported
  * @MODE_NO_REDUCED: monitor doesn't accept reduced blanking
  * @MODE_NO_STEREO: stereo modes not supported
+ * @MODE_NO_420: ycbcr 420 modes not supported
  * @MODE_STALE: mode has become stale
  * @MODE_BAD: unspecified reason
  * @MODE_ERROR: error condition
@@ -124,6 +125,7 @@ enum drm_mode_status {
 	MODE_ONE_SIZE,
 	MODE_NO_REDUCED,
 	MODE_NO_STEREO,
+	MODE_NO_420,
 	MODE_STALE = -3,
 	MODE_BAD = -2,
 	MODE_ERROR = -1
@@ -450,6 +452,12 @@ int drm_mode_convert_umode(struct drm_display_mode *out,
 			   const struct drm_mode_modeinfo *in);
 void drm_mode_probed_add(struct drm_connector *connector, struct drm_display_mode *mode);
 void drm_mode_debug_printmodeline(const struct drm_display_mode *mode);
+bool drm_mode_is_420_only(const struct drm_display_info *display,
+			  const struct drm_display_mode *mode);
+bool drm_mode_is_420_also(const struct drm_display_info *display,
+			  const struct drm_display_mode *mode);
+bool drm_mode_is_420(const struct drm_display_info *display,
+		     const struct drm_display_mode *mode);
 
 struct drm_display_mode *drm_cvt_mode(struct drm_device *dev,
 				      int hdisplay, int vdisplay, int vrefresh,
@@ -496,6 +504,9 @@ bool drm_mode_equal_no_clocks_no_stereo(const struct drm_display_mode *mode1,
 enum drm_mode_status drm_mode_validate_basic(const struct drm_display_mode *mode);
 enum drm_mode_status drm_mode_validate_size(const struct drm_display_mode *mode,
 					    int maxX, int maxY);
+enum drm_mode_status
+drm_mode_validate_ycbcr420(const struct drm_display_mode *mode,
+			   struct drm_connector *connector);
 void drm_mode_prune_invalid(struct drm_device *dev,
 			    struct list_head *mode_list, bool verbose);
 void drm_mode_sort(struct list_head *mode_list);
diff --git a/include/drm/drm_modeset_helper_vtables.h b/include/drm/drm_modeset_helper_vtables.h
index 85984b2..c55cf3f 100644
--- a/include/drm/drm_modeset_helper_vtables.h
+++ b/include/drm/drm_modeset_helper_vtables.h
@@ -71,7 +71,7 @@ struct drm_crtc_helper_funcs {
 	 * This callback is used by the legacy CRTC helpers.  Atomic helpers
 	 * also support using this hook for enabling and disabling a CRTC to
 	 * facilitate transitions to atomic, but it is deprecated. Instead
-	 * @enable and @disable should be used.
+	 * @atomic_enable and @atomic_disable should be used.
 	 */
 	void (*dpms)(struct drm_crtc *crtc, int mode);
 
@@ -85,8 +85,8 @@ struct drm_crtc_helper_funcs {
 	 *
 	 * This callback is used by the legacy CRTC helpers.  Atomic helpers
 	 * also support using this hook for disabling a CRTC to facilitate
-	 * transitions to atomic, but it is deprecated. Instead @disable should
-	 * be used.
+	 * transitions to atomic, but it is deprecated. Instead @atomic_disable
+	 * should be used.
 	 */
 	void (*prepare)(struct drm_crtc *crtc);
 
@@ -100,8 +100,8 @@ struct drm_crtc_helper_funcs {
 	 *
 	 * This callback is used by the legacy CRTC helpers.  Atomic helpers
 	 * also support using this hook for enabling a CRTC to facilitate
-	 * transitions to atomic, but it is deprecated. Instead @enable should
-	 * be used.
+	 * transitions to atomic, but it is deprecated. Instead @atomic_enable
+	 * should be used.
 	 */
 	void (*commit)(struct drm_crtc *crtc);
 
@@ -222,7 +222,7 @@ struct drm_crtc_helper_funcs {
 	 * pipeline is suspended using either DPMS or the new "ACTIVE" property.
 	 * Which means register values set in this callback might get reset when
 	 * the CRTC is suspended, but not restored.  Such drivers should instead
-	 * move all their CRTC setup into the @enable callback.
+	 * move all their CRTC setup into the @atomic_enable callback.
 	 *
 	 * This callback is optional.
 	 */
@@ -267,22 +267,6 @@ struct drm_crtc_helper_funcs {
 				    enum mode_set_atomic);
 
 	/**
-	 * @load_lut:
-	 *
-	 * Load a LUT prepared with the &drm_fb_helper_funcs.gamma_set vfunc.
-	 *
-	 * This callback is optional and is only used by the fbdev emulation
-	 * helpers.
-	 *
-	 * FIXME:
-	 *
-	 * This callback is functionally redundant with the core gamma table
-	 * support and simply exists because the fbdev hasn't yet been
-	 * refactored to use the core gamma table interfaces.
-	 */
-	void (*load_lut)(struct drm_crtc *crtc);
-
-	/**
 	 * @disable:
 	 *
 	 * This callback should be used to disable the CRTC. With the atomic
@@ -297,7 +281,7 @@ struct drm_crtc_helper_funcs {
 	 * Atomic drivers don't need to implement it if there's no need to
 	 * disable anything at the CRTC level. To ensure that runtime PM
 	 * handling (using either DPMS or the new "ACTIVE" property) works
-	 * @disable must be the inverse of @enable for atomic drivers.
+	 * @disable must be the inverse of @atomic_enable for atomic drivers.
 	 * Atomic drivers should consider to use @atomic_disable instead of
 	 * this one.
 	 *
@@ -316,24 +300,6 @@ struct drm_crtc_helper_funcs {
 	void (*disable)(struct drm_crtc *crtc);
 
 	/**
-	 * @enable:
-	 *
-	 * This callback should be used to enable the CRTC. With the atomic
-	 * drivers it is called before all encoders connected to this CRTC are
-	 * enabled through the encoder's own &drm_encoder_helper_funcs.enable
-	 * hook.  If that sequence is too simple drivers can just add their own
-	 * hooks and call it from this CRTC callback here by looping over all
-	 * encoders connected to it using for_each_encoder_on_crtc().
-	 *
-	 * This hook is used only by atomic helpers, for symmetry with @disable.
-	 * Atomic drivers don't need to implement it if there's no need to
-	 * enable anything at the CRTC level. To ensure that runtime PM handling
-	 * (using either DPMS or the new "ACTIVE" property) works
-	 * @enable must be the inverse of @disable for atomic drivers.
-	 */
-	void (*enable)(struct drm_crtc *crtc);
-
-	/**
 	 * @atomic_check:
 	 *
 	 * Drivers should check plane-update related CRTC constraints in this
@@ -433,6 +399,30 @@ struct drm_crtc_helper_funcs {
 			     struct drm_crtc_state *old_crtc_state);
 
 	/**
+	 * @atomic_enable:
+	 *
+	 * This callback should be used to enable the CRTC. With the atomic
+	 * drivers it is called before all encoders connected to this CRTC are
+	 * enabled through the encoder's own &drm_encoder_helper_funcs.enable
+	 * hook.  If that sequence is too simple drivers can just add their own
+	 * hooks and call it from this CRTC callback here by looping over all
+	 * encoders connected to it using for_each_encoder_on_crtc().
+	 *
+	 * This hook is used only by atomic helpers, for symmetry with
+	 * @atomic_disable. Atomic drivers don't need to implement it if there's
+	 * no need to enable anything at the CRTC level. To ensure that runtime
+	 * PM handling (using either DPMS or the new "ACTIVE" property) works
+	 * @atomic_enable must be the inverse of @atomic_disable for atomic
+	 * drivers.
+	 *
+	 * Drivers can use the @old_crtc_state input parameter if the operations
+	 * needed to enable the CRTC don't depend solely on the new state but
+	 * also on the transition between the old state and the new state.
+	 */
+	void (*atomic_enable)(struct drm_crtc *crtc,
+			      struct drm_crtc_state *old_crtc_state);
+
+	/**
 	 * @atomic_disable:
 	 *
 	 * This callback should be used to disable the CRTC. With the atomic
@@ -1129,6 +1119,56 @@ struct drm_plane_helper_funcs {
 	 */
 	void (*atomic_disable)(struct drm_plane *plane,
 			       struct drm_plane_state *old_state);
+
+	/**
+	 * @atomic_async_check:
+	 *
+	 * Drivers should set this function pointer to check if the plane state
+	 * can be updated in an async fashion. Here async means "not vblank
+	 * synchronized".
+	 *
+	 * This hook is called by drm_atomic_async_check() to establish if a
+	 * given update can be committed asynchronously, that is, if it can
+	 * jump ahead of the state currently queued for update.
+	 *
+	 * RETURNS:
+	 *
+	 * Returns 0 on success. Any error return indicates that the update
+	 * cannot be applied asynchronously.
+	 */
+	int (*atomic_async_check)(struct drm_plane *plane,
+				  struct drm_plane_state *state);
+
+	/**
+	 * @atomic_async_update:
+	 *
+	 * Drivers should set this function pointer to perform asynchronous
+	 * updates of planes, that is, jump ahead of the currently queued
+	 * state and update the plane. Here async means "not vblank
+	 * synchronized".
+	 *
+	 * This hook is called by drm_atomic_helper_async_commit().
+	 *
+	 * An async update will happen on legacy cursor updates. An async
+	 * update won't happen if there is an outstanding commit modifying
+	 * the same plane.
+	 *
+	 * Note that unlike &drm_plane_helper_funcs.atomic_update this hook
+	 * takes the new &drm_plane_state as parameter. When doing async_update
+	 * drivers shouldn't replace the &drm_plane_state but update the
+	 * current one with the new plane configurations in the new
+	 * plane_state.
+	 *
+	 * FIXME:
+	 *  - It only works for single plane updates
+	 *  - Async Pageflips are not supported yet
+	 *  - Some hw might still scan out the old buffer until the next
+	 *    vblank, however we let go of the fb references as soon as
+	 *    we run this hook. For now drivers must implement their own workers
+	 *    for deferring if needed, until a common solution is created.
+	 */
+	void (*atomic_async_update)(struct drm_plane *plane,
+				    struct drm_plane_state *new_state);
 };
 
 /**
@@ -1169,7 +1209,8 @@ struct drm_mode_config_helper_funcs {
 	 * After the atomic update is committed to the hardware this hook needs
 	 * to call drm_atomic_helper_commit_hw_done(). Then wait for the upate
 	 * to be executed by the hardware, for example using
-	 * drm_atomic_helper_wait_for_vblanks(), and then clean up the old
+	 * drm_atomic_helper_wait_for_vblanks() or
+	 * drm_atomic_helper_wait_for_flip_done(), and then clean up the old
 	 * framebuffers using drm_atomic_helper_cleanup_planes().
 	 *
 	 * When disabling a CRTC this hook _must_ stall for the commit to
diff --git a/include/drm/drm_pci.h b/include/drm/drm_pci.h
index 4579fac..6745990 100644
--- a/include/drm/drm_pci.h
+++ b/include/drm/drm_pci.h
@@ -43,13 +43,12 @@ struct drm_dma_handle *drm_pci_alloc(struct drm_device *dev, size_t size,
 				     size_t align);
 void drm_pci_free(struct drm_device *dev, struct drm_dma_handle * dmah);
 
-int drm_pci_init(struct drm_driver *driver, struct pci_driver *pdriver);
-void drm_pci_exit(struct drm_driver *driver, struct pci_driver *pdriver);
+int drm_legacy_pci_init(struct drm_driver *driver, struct pci_driver *pdriver);
+void drm_legacy_pci_exit(struct drm_driver *driver, struct pci_driver *pdriver);
 #ifdef CONFIG_PCI
 int drm_get_pci_dev(struct pci_dev *pdev,
 		    const struct pci_device_id *ent,
 		    struct drm_driver *driver);
-int drm_pci_set_busid(struct drm_device *dev, struct drm_master *master);
 #else
 static inline int drm_get_pci_dev(struct pci_dev *pdev,
 				  const struct pci_device_id *ent,
@@ -57,12 +56,6 @@ static inline int drm_get_pci_dev(struct pci_dev *pdev,
 {
 	return -ENOSYS;
 }
-
-static inline int drm_pci_set_busid(struct drm_device *dev,
-				    struct drm_master *master)
-{
-	return -ENOSYS;
-}
 #endif
 
 #define DRM_PCIE_SPEED_25 1
diff --git a/include/drm/drm_plane.h b/include/drm/drm_plane.h
index 9ab3e70..73f90f9 100644
--- a/include/drm/drm_plane.h
+++ b/include/drm/drm_plane.h
@@ -233,11 +233,9 @@ struct drm_plane_funcs {
 	 * This is the legacy entry point to update a property attached to the
 	 * plane.
 	 *
-	 * Drivers implementing atomic modeset should use
-	 * drm_atomic_helper_plane_set_property() to implement this hook.
-	 *
 	 * This callback is optional if the driver does not support any legacy
-	 * driver-private properties.
+	 * driver-private properties. For atomic drivers it is not used because
+	 * property handling is done entirely in the DRM core.
 	 *
 	 * RETURNS:
 	 *
@@ -392,6 +390,22 @@ struct drm_plane_funcs {
 	 */
 	void (*atomic_print_state)(struct drm_printer *p,
 				   const struct drm_plane_state *state);
+
+	/**
+	 * @format_mod_supported:
+	 *
+	 * This optional hook is used by the DRM core to determine if the given
+	 * format/modifier combination is valid for the plane. This allows the
+	 * DRM to generate the correct format bitmask (which formats apply to
+	 * which modifier).
+	 *
+	 * Returns:
+	 *
+	 * True if the given modifier is valid for that format on the plane.
+	 * False otherwise.
+	 */
+	bool (*format_mod_supported)(struct drm_plane *plane, uint32_t format,
+				     uint64_t modifier);
 };
 
 /**
@@ -487,6 +501,9 @@ struct drm_plane {
 	unsigned int format_count;
 	bool format_default;
 
+	uint64_t *modifiers;
+	unsigned int modifier_count;
+
 	struct drm_crtc *crtc;
 	struct drm_framebuffer *fb;
 
@@ -527,13 +544,14 @@ struct drm_plane {
 
 #define obj_to_plane(x) container_of(x, struct drm_plane, base)
 
-__printf(8, 9)
+__printf(9, 10)
 int drm_universal_plane_init(struct drm_device *dev,
 			     struct drm_plane *plane,
 			     uint32_t possible_crtcs,
 			     const struct drm_plane_funcs *funcs,
 			     const uint32_t *formats,
 			     unsigned int format_count,
+			     const uint64_t *format_modifiers,
 			     enum drm_plane_type type,
 			     const char *name, ...);
 int drm_plane_init(struct drm_device *dev,
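The @format_mod_supported kernel-doc above says the hook lets the DRM generate a format bitmask per modifier. The following userspace sketch shows one way such a mask could be built. It is an illustration under stated assumptions, not the core's implementation: the callback type drops the struct drm_plane argument, and build_format_mask(), demo_format_mod_supported() and the format/modifier values are all made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the format_mod_supported() hook, without the
 * struct drm_plane argument of the real signature.
 */
typedef bool (*format_mod_supported_fn)(uint32_t format, uint64_t modifier);

/*
 * Build a bitmask with bit i set when formats[i] is valid for the given
 * modifier. This sketch handles at most 64 formats per mask.
 */
static uint64_t build_format_mask(const uint32_t *formats, size_t count,
				  uint64_t modifier,
				  format_mod_supported_fn supported)
{
	uint64_t mask = 0;
	size_t i;

	for (i = 0; i < count && i < 64; i++)
		if (supported(formats[i], modifier))
			mask |= UINT64_C(1) << i;

	return mask;
}

/*
 * Made-up driver policy: modifier 0 accepts every format, modifier 1 only
 * accepts format code 0x20, anything else is rejected.
 */
static bool demo_format_mod_supported(uint32_t format, uint64_t modifier)
{
	if (modifier == 0)
		return true;
	return modifier == 1 && format == 0x20;
}
```

With formats {0x10, 0x20, 0x30}, modifier 0 yields a mask with all three bits set and modifier 1 yields only the bit for index 1.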
diff --git a/include/drm/drm_property.h b/include/drm/drm_property.h
index 619868dc..37355c6 100644
--- a/include/drm/drm_property.h
+++ b/include/drm/drm_property.h
@@ -273,6 +273,8 @@ int drm_property_replace_global_blob(struct drm_device *dev,
 				     const void *data,
 				     struct drm_mode_object *obj_holds_id,
 				     struct drm_property *prop_holds_id);
+bool drm_property_replace_blob(struct drm_property_blob **blob,
+			       struct drm_property_blob *new_blob);
 struct drm_property_blob *drm_property_blob_get(struct drm_property_blob *blob);
 void drm_property_blob_put(struct drm_property_blob *blob);
 
diff --git a/include/drm/drm_scdc_helper.h b/include/drm/drm_scdc_helper.h
index c25122b..f92eb20 100644
--- a/include/drm/drm_scdc_helper.h
+++ b/include/drm/drm_scdc_helper.h
@@ -131,31 +131,6 @@ static inline int drm_scdc_writeb(struct i2c_adapter *adapter, u8 offset,
 
 bool drm_scdc_get_scrambling_status(struct i2c_adapter *adapter);
 
-/**
- * drm_scdc_set_scrambling - enable scrambling
- * @adapter: I2C adapter for DDC channel
- * @enable: bool to indicate if scrambling is to be enabled/disabled
- *
- * Writes the TMDS config register over SCDC channel, and:
- * enables scrambling when enable = 1
- * disables scrambling when enable = 0
- *
- * Returns:
- * True if scrambling is set/reset successfully, false otherwise.
- */
 bool drm_scdc_set_scrambling(struct i2c_adapter *adapter, bool enable);
-
-/**
- * drm_scdc_set_high_tmds_clock_ratio - set TMDS clock ratio
- * @adapter: I2C adapter for DDC channel
- * @set: ret or reset the high clock ratio
- *
- * Writes to the TMDS config register over SCDC channel, and:
- * sets TMDS clock ratio to 1/40 when set = 1
- * sets TMDS clock ratio to 1/10 when set = 0
- *
- * Returns:
- * True if write is successful, false otherwise.
- */
 bool drm_scdc_set_high_tmds_clock_ratio(struct i2c_adapter *adapter, bool set);
 #endif
diff --git a/include/drm/drm_simple_kms_helper.h b/include/drm/drm_simple_kms_helper.h
index 2d36538..6d9adbb 100644
--- a/include/drm/drm_simple_kms_helper.h
+++ b/include/drm/drm_simple_kms_helper.h
@@ -122,6 +122,7 @@ int drm_simple_display_pipe_init(struct drm_device *dev,
 			struct drm_simple_display_pipe *pipe,
 			const struct drm_simple_display_pipe_funcs *funcs,
 			const uint32_t *formats, unsigned int format_count,
+			const uint64_t *format_modifiers,
 			struct drm_connector *connector);
 
 #endif /* __LINUX_DRM_SIMPLE_KMS_HELPER_H */
diff --git a/include/drm/drm_syncobj.h b/include/drm/drm_syncobj.h
index 89976da..c00fee5 100644
--- a/include/drm/drm_syncobj.h
+++ b/include/drm/drm_syncobj.h
@@ -28,6 +28,8 @@
 
 #include "linux/dma-fence.h"
 
+struct drm_syncobj_cb;
+
 /**
  * struct drm_syncobj - sync object.
  *
@@ -43,15 +45,47 @@ struct drm_syncobj {
 	/**
 	 * @fence:
 	 * NULL or a pointer to the fence bound to this object.
+	 *
+	 * This field should not be used directly.  Use drm_syncobj_fence_get
+	 * and drm_syncobj_replace_fence instead.
 	 */
 	struct dma_fence *fence;
 	/**
+	 * @cb_list:
+	 * List of callbacks to call when the fence gets replaced
+	 */
+	struct list_head cb_list;
+	/**
+	 * @lock:
+	 * locks cb_list and write-locks fence.
+	 */
+	spinlock_t lock;
+	/**
 	 * @file:
 	 * a file backing for this syncobj.
 	 */
 	struct file *file;
 };
 
+typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
+				   struct drm_syncobj_cb *cb);
+
+/**
+ * struct drm_syncobj_cb - callback for drm_syncobj_add_callback
+ * @node: used by drm_syncobj_add_callback to append this struct to
+ *	  syncobj::cb_list
+ * @func: drm_syncobj_func_t to call
+ *
+ * This struct will be initialized by drm_syncobj_add_callback, additional
+ * data can be passed along by embedding drm_syncobj_cb in another struct.
+ * The callback will get called the next time drm_syncobj_replace_fence is
+ * called.
+ */
+struct drm_syncobj_cb {
+	struct list_head node;
+	drm_syncobj_func_t func;
+};
+
 void drm_syncobj_free(struct kref *kref);
 
 /**
@@ -77,13 +111,30 @@ drm_syncobj_put(struct drm_syncobj *obj)
 	kref_put(&obj->refcount, drm_syncobj_free);
 }
 
+static inline struct dma_fence *
+drm_syncobj_fence_get(struct drm_syncobj *syncobj)
+{
+	struct dma_fence *fence;
+
+	rcu_read_lock();
+	fence = dma_fence_get_rcu_safe(&syncobj->fence);
+	rcu_read_unlock();
+
+	return fence;
+}
+
 struct drm_syncobj *drm_syncobj_find(struct drm_file *file_private,
 				     u32 handle);
+void drm_syncobj_add_callback(struct drm_syncobj *syncobj,
+			      struct drm_syncobj_cb *cb,
+			      drm_syncobj_func_t func);
+void drm_syncobj_remove_callback(struct drm_syncobj *syncobj,
+				 struct drm_syncobj_cb *cb);
 void drm_syncobj_replace_fence(struct drm_syncobj *syncobj,
 			       struct dma_fence *fence);
-int drm_syncobj_fence_get(struct drm_file *file_private,
-			  u32 handle,
-			  struct dma_fence **fence);
+int drm_syncobj_find_fence(struct drm_file *file_private,
+			   u32 handle,
+			   struct dma_fence **fence);
 void drm_syncobj_free(struct kref *kref);
 
 #endif
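The struct drm_syncobj_cb kernel-doc above notes that additional data can be passed along by embedding the callback struct in another struct. A userspace sketch of that embedding pattern follows. struct list_head and container_of() are simplified stand-ins for the kernel versions, struct my_waiter and my_waiter_func() are hypothetical, and the callback is invoked by hand rather than through drm_syncobj_add_callback() and drm_syncobj_replace_fence().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins; the real definitions live in the kernel headers. */
struct list_head { struct list_head *next, *prev; };
struct drm_syncobj;
struct drm_syncobj_cb;

typedef void (*drm_syncobj_func_t)(struct drm_syncobj *syncobj,
				   struct drm_syncobj_cb *cb);

struct drm_syncobj_cb {
	struct list_head node;
	drm_syncobj_func_t func;
};

/* Simplified container_of(), without the kernel's type checking. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical driver state that wraps the callback. */
struct my_waiter {
	int fence_replaced;		/* extra data carried with the cb */
	struct drm_syncobj_cb cb;
};

/* Recover the wrapping struct from the cb pointer and update its data. */
static void my_waiter_func(struct drm_syncobj *syncobj,
			   struct drm_syncobj_cb *cb)
{
	struct my_waiter *w = container_of(cb, struct my_waiter, cb);

	(void)syncobj;
	w->fence_replaced++;
}
```

The callback only receives the struct drm_syncobj_cb pointer, so the offset arithmetic in container_of() is what gives it access to the surrounding data.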
diff --git a/include/drm/drm_vblank.h b/include/drm/drm_vblank.h
index 4cde473..7fba9ef 100644
--- a/include/drm/drm_vblank.h
+++ b/include/drm/drm_vblank.h
@@ -168,8 +168,7 @@ void drm_crtc_wait_one_vblank(struct drm_crtc *crtc);
 void drm_crtc_vblank_off(struct drm_crtc *crtc);
 void drm_crtc_vblank_reset(struct drm_crtc *crtc);
 void drm_crtc_vblank_on(struct drm_crtc *crtc);
-void drm_vblank_cleanup(struct drm_device *dev);
-u32 drm_accurate_vblank_count(struct drm_crtc *crtc);
+u32 drm_crtc_accurate_vblank_count(struct drm_crtc *crtc);
 
 bool drm_calc_vbltimestamp_from_scanoutpos(struct drm_device *dev,
 					   unsigned int pipe, int *max_error,
diff --git a/include/drm/tinydrm/mipi-dbi.h b/include/drm/tinydrm/mipi-dbi.h
index d137b16e..83346dd 100644
--- a/include/drm/tinydrm/mipi-dbi.h
+++ b/include/drm/tinydrm/mipi-dbi.h
@@ -62,11 +62,7 @@ mipi_dbi_from_tinydrm(struct tinydrm_device *tdev)
 }
 
 int mipi_dbi_spi_init(struct spi_device *spi, struct mipi_dbi *mipi,
-		      struct gpio_desc *dc,
-		      const struct drm_simple_display_pipe_funcs *pipe_funcs,
-		      struct drm_driver *driver,
-		      const struct drm_display_mode *mode,
-		      unsigned int rotation);
+		      struct gpio_desc *dc);
 int mipi_dbi_init(struct device *dev, struct mipi_dbi *mipi,
 		  const struct drm_simple_display_pipe_funcs *pipe_funcs,
 		  struct drm_driver *driver,
diff --git a/include/drm/tinydrm/tinydrm-helpers.h b/include/drm/tinydrm/tinydrm-helpers.h
index 9b9b6cf..d554ded 100644
--- a/include/drm/tinydrm/tinydrm-helpers.h
+++ b/include/drm/tinydrm/tinydrm-helpers.h
@@ -43,6 +43,8 @@ void tinydrm_swab16(u16 *dst, void *vaddr, struct drm_framebuffer *fb,
 void tinydrm_xrgb8888_to_rgb565(u16 *dst, void *vaddr,
 				struct drm_framebuffer *fb,
 				struct drm_clip_rect *clip, bool swap);
+void tinydrm_xrgb8888_to_gray8(u8 *dst, void *vaddr, struct drm_framebuffer *fb,
+			       struct drm_clip_rect *clip);
 
 struct backlight_device *tinydrm_of_find_backlight(struct device *dev);
 int tinydrm_enable_backlight(struct backlight_device *backlight);
diff --git a/include/drm/tinydrm/tinydrm.h b/include/drm/tinydrm/tinydrm.h
index 00b800d..4774fe3 100644
--- a/include/drm/tinydrm/tinydrm.h
+++ b/include/drm/tinydrm/tinydrm.h
@@ -56,9 +56,7 @@ pipe_to_tinydrm(struct drm_simple_display_pipe *pipe)
 	.gem_prime_vmap		= drm_gem_cma_prime_vmap, \
 	.gem_prime_vunmap	= drm_gem_cma_prime_vunmap, \
 	.gem_prime_mmap		= drm_gem_cma_prime_mmap, \
-	.dumb_create		= drm_gem_cma_dumb_create, \
-	.dumb_map_offset	= drm_gem_cma_dumb_map_offset, \
-	.dumb_destroy		= drm_gem_dumb_destroy
+	.dumb_create		= drm_gem_cma_dumb_create
 
 /**
  * TINYDRM_MODE - tinydrm display mode
diff --git a/include/drm/ttm/ttm_bo_driver.h b/include/drm/ttm/ttm_bo_driver.h
index 990d529..5f821a9b 100644
--- a/include/drm/ttm/ttm_bo_driver.h
+++ b/include/drm/ttm/ttm_bo_driver.h
@@ -229,13 +229,14 @@ struct ttm_mem_type_manager_func {
 	 * struct ttm_mem_type_manager member debug
 	 *
 	 * @man: Pointer to a memory type manager.
-	 * @prefix: Prefix to be used in printout to identify the caller.
+	 * @printer: Printer to use for the debug printout.
 	 *
 	 * This function is called to print out the state of the memory
 	 * type manager to aid debugging of out-of-memory conditions.
 	 * It may not be called from within atomic context.
 	 */
-	void (*debug)(struct ttm_mem_type_manager *man, const char *prefix);
+	void (*debug)(struct ttm_mem_type_manager *man,
+		      struct drm_printer *printer);
 };
 
 /**
@@ -472,6 +473,23 @@ struct ttm_bo_driver {
 	 */
 	unsigned long (*io_mem_pfn)(struct ttm_buffer_object *bo,
 				    unsigned long page_offset);
+
+	/**
+	 * Read/write memory buffers for ptrace access
+	 *
+	 * @bo: the BO to access
+	 * @offset: the offset from the start of the BO
+	 * @buf: pointer to source/destination buffer
+	 * @len: number of bytes to copy
+	 * @write: whether to read (0) from or write (non-0) to BO
+	 *
+	 * If successful, this function should return the number of
+	 * bytes copied, -EIO otherwise. If the number of bytes
+	 * returned is < len, the function may be called again with
+	 * the remainder of the buffer to copy.
+	 */
+	int (*access_memory)(struct ttm_buffer_object *bo, unsigned long offset,
+			     void *buf, int len, int write);
 };
 
 /**
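Note on the access_memory contract above: the comment says a short return (fewer than len bytes) may be followed by another call for the remainder. A minimal userspace sketch of such a caller loop follows. Everything in it is an assumption made for illustration: struct fake_bo, chunked_access() and access_all() are hypothetical stand-ins, not TTM code, and the 4-byte chunk size exists only to force short returns.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object: a flat byte array. */
struct fake_bo {
	unsigned char *data;
	unsigned long size;
};

/* Same shape as the new callback, but on the stand-in type. */
typedef int (*access_fn)(struct fake_bo *bo, unsigned long offset,
			 void *buf, int len, int write);

/* Example callback: copies at most 4 bytes per call, -EIO if out of range. */
static int chunked_access(struct fake_bo *bo, unsigned long offset,
			  void *buf, int len, int write)
{
	int n = len < 4 ? len : 4;

	if (offset >= bo->size || (unsigned long)n > bo->size - offset)
		return -EIO;
	if (write)
		memcpy(bo->data + offset, buf, n);
	else
		memcpy(buf, bo->data + offset, n);
	return n;
}

/* Caller loop: repeat until len bytes are copied or an error is returned. */
static int access_all(struct fake_bo *bo, access_fn fn, unsigned long offset,
		      void *buf, int len, int write)
{
	int done = 0;

	assert(len >= 0);
	while (done < len) {
		int ret = fn(bo, offset + done, (char *)buf + done,
			     len - done, write);

		if (ret < 0)
			return ret;	/* propagate -EIO */
		if (ret == 0)
			break;		/* no progress: stop instead of spinning */
		done += ret;
	}
	return done;
}
```

The loop advances both the BO offset and the buffer pointer by the number of bytes already copied, which is one reading of "called again with the remainder of the buffer".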
diff --git a/include/linux/ata.h b/include/linux/ata.h
index e65ae4b..c7a3538 100644
--- a/include/linux/ata.h
+++ b/include/linux/ata.h
@@ -60,7 +60,8 @@ enum {
 	ATA_ID_FW_REV		= 23,
 	ATA_ID_PROD		= 27,
 	ATA_ID_MAX_MULTSECT	= 47,
-	ATA_ID_DWORD_IO		= 48,
+	ATA_ID_DWORD_IO		= 48,	/* before ATA-8 */
+	ATA_ID_TRUSTED		= 48,	/* ATA-8 and later */
 	ATA_ID_CAPABILITY	= 49,
 	ATA_ID_OLD_PIO_MODES	= 51,
 	ATA_ID_OLD_DMA_MODES	= 52,
@@ -889,6 +890,13 @@ static inline bool ata_id_has_dword_io(const u16 *id)
 	return id[ATA_ID_DWORD_IO] & (1 << 0);
 }
 
+static inline bool ata_id_has_trusted(const u16 *id)
+{
+	if (ata_id_major_version(id) <= 7)
+		return false;
+	return id[ATA_ID_TRUSTED] & (1 << 0);
+}
+
 static inline bool ata_id_has_unload(const u16 *id)
 {
 	if (ata_id_major_version(id) >= 7 &&
diff --git a/include/linux/blk-mq-rdma.h b/include/linux/blk-mq-rdma.h
new file mode 100644
index 0000000..b4ade19
--- /dev/null
+++ b/include/linux/blk-mq-rdma.h
@@ -0,0 +1,10 @@
+#ifndef _LINUX_BLK_MQ_RDMA_H
+#define _LINUX_BLK_MQ_RDMA_H
+
+struct blk_mq_tag_set;
+struct ib_device;
+
+int blk_mq_rdma_map_queues(struct blk_mq_tag_set *set,
+		struct ib_device *dev, int first_vec);
+
+#endif /* _LINUX_BLK_MQ_RDMA_H */
diff --git a/include/linux/compat.h b/include/linux/compat.h
index 5a6a109..3fc4333 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -27,7 +27,7 @@
 #endif
 
 #ifndef __SC_DELOUSE
-#define __SC_DELOUSE(t,v) ((t)(unsigned long)(v))
+#define __SC_DELOUSE(t,v) ((__force t)(unsigned long)(v))
 #endif
 
 #define COMPAT_SYSCALL_DEFINE0(name) \
@@ -365,10 +365,10 @@ asmlinkage ssize_t compat_sys_pwritev(compat_ulong_t fd,
 		compat_ulong_t vlen, u32 pos_low, u32 pos_high);
 asmlinkage ssize_t compat_sys_preadv2(compat_ulong_t fd,
 		const struct compat_iovec __user *vec,
-		compat_ulong_t vlen, u32 pos_low, u32 pos_high, int flags);
+		compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
 asmlinkage ssize_t compat_sys_pwritev2(compat_ulong_t fd,
 		const struct compat_iovec __user *vec,
-		compat_ulong_t vlen, u32 pos_low, u32 pos_high, int flags);
+		compat_ulong_t vlen, u32 pos_low, u32 pos_high, rwf_t flags);
 
 #ifdef __ARCH_WANT_COMPAT_SYS_PREADV64
 asmlinkage long compat_sys_preadv64(unsigned long fd,
@@ -382,6 +382,18 @@ asmlinkage long compat_sys_pwritev64(unsigned long fd,
 		unsigned long vlen, loff_t pos);
 #endif
 
+#ifdef __ARCH_WANT_COMPAT_SYS_PREADV64V2
+asmlinkage long compat_sys_readv64v2(unsigned long fd,
+		const struct compat_iovec __user *vec,
+		unsigned long vlen, loff_t pos, rwf_t flags);
+#endif
+
+#ifdef __ARCH_WANT_COMPAT_SYS_PWRITEV64V2
+asmlinkage long compat_sys_pwritev64v2(unsigned long fd,
+		const struct compat_iovec __user *vec,
+		unsigned long vlen, loff_t pos, rwf_t flags);
+#endif
+
 asmlinkage long compat_sys_lseek(unsigned int, compat_off_t, unsigned int);
 
 asmlinkage long compat_sys_execve(const char __user *filename, const compat_uptr_t __user *argv,
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index eca8ad7..043b60d 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -517,7 +517,8 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
 # define __compiletime_error_fallback(condition) do { } while (0)
 #endif
 
-#define __compiletime_assert(condition, msg, prefix, suffix)		\
+#ifdef __OPTIMIZE__
+# define __compiletime_assert(condition, msg, prefix, suffix)		\
 	do {								\
 		bool __cond = !(condition);				\
 		extern void prefix ## suffix(void) __compiletime_error(msg); \
@@ -525,6 +526,9 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
 			prefix ## suffix();				\
 		__compiletime_error_fallback(__cond);			\
 	} while (0)
+#else
+# define __compiletime_assert(condition, msg, prefix, suffix) do { } while (0)
+#endif
 
 #define _compiletime_assert(condition, msg, prefix, suffix) \
 	__compiletime_assert(condition, msg, prefix, suffix)
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 1473455..4f2b3b2 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -549,46 +549,29 @@ void *dm_vcalloc(unsigned long nmemb, unsigned long elem_size);
  *---------------------------------------------------------------*/
 #define DM_NAME "device-mapper"
 
-#ifdef CONFIG_PRINTK
-extern struct ratelimit_state dm_ratelimit_state;
-
-#define dm_ratelimit()	__ratelimit(&dm_ratelimit_state)
-#else
-#define dm_ratelimit()	0
-#endif
+#define DM_RATELIMIT(pr_func, fmt, ...)					\
+do {									\
+	static DEFINE_RATELIMIT_STATE(rs, DEFAULT_RATELIMIT_INTERVAL,	\
+				      DEFAULT_RATELIMIT_BURST);		\
+									\
+	if (__ratelimit(&rs))						\
+		pr_func(DM_FMT(fmt), ##__VA_ARGS__);			\
+} while (0)
 
 #define DM_FMT(fmt) DM_NAME ": " DM_MSG_PREFIX ": " fmt "\n"
 
 #define DMCRIT(fmt, ...) pr_crit(DM_FMT(fmt), ##__VA_ARGS__)
 
 #define DMERR(fmt, ...) pr_err(DM_FMT(fmt), ##__VA_ARGS__)
-#define DMERR_LIMIT(fmt, ...)						\
-do {									\
-	if (dm_ratelimit())						\
-		DMERR(fmt, ##__VA_ARGS__);				\
-} while (0)
-
+#define DMERR_LIMIT(fmt, ...) DM_RATELIMIT(pr_err, fmt, ##__VA_ARGS__)
 #define DMWARN(fmt, ...) pr_warn(DM_FMT(fmt), ##__VA_ARGS__)
-#define DMWARN_LIMIT(fmt, ...)						\
-do {									\
-	if (dm_ratelimit())						\
-		DMWARN(fmt, ##__VA_ARGS__);				\
-} while (0)
-
+#define DMWARN_LIMIT(fmt, ...) DM_RATELIMIT(pr_warn, fmt, ##__VA_ARGS__)
 #define DMINFO(fmt, ...) pr_info(DM_FMT(fmt), ##__VA_ARGS__)
-#define DMINFO_LIMIT(fmt, ...)						\
-do {									\
-	if (dm_ratelimit())						\
-		DMINFO(fmt, ##__VA_ARGS__);				\
-} while (0)
+#define DMINFO_LIMIT(fmt, ...) DM_RATELIMIT(pr_info, fmt, ##__VA_ARGS__)
 
 #ifdef CONFIG_DM_DEBUG
 #define DMDEBUG(fmt, ...) printk(KERN_DEBUG DM_FMT(fmt), ##__VA_ARGS__)
-#define DMDEBUG_LIMIT(fmt, ...)						\
-do {									\
-	if (dm_ratelimit())						\
-		DMDEBUG(fmt, ##__VA_ARGS__);				\
-} while (0)
+#define DMDEBUG_LIMIT(fmt, ...) DM_RATELIMIT(pr_debug, fmt, ##__VA_ARGS__)
 #else
 #define DMDEBUG(fmt, ...) no_printk(fmt, ##__VA_ARGS__)
 #define DMDEBUG_LIMIT(fmt, ...) no_printk(fmt, ##__VA_ARGS__)
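Note on the DM_RATELIMIT change above: the old macros shared one extern dm_ratelimit_state, while the new macro declares a static state inside its own do/while block, so each expansion gets separate state. The sketch below shows only that per-call-site effect in plain userspace C. It is a simplification under stated assumptions: CALLSITE_LIMIT, site_a() and site_b() are hypothetical names, and a plain burst counter replaces the kernel's time-based __ratelimit().

```c
#include <assert.h>

/*
 * Hypothetical analogue of DM_RATELIMIT: the static counter lives inside
 * the macro body, so every expansion has its own budget.
 */
#define CALLSITE_LIMIT(burst, hits)				\
do {								\
	static int used;	/* one instance per expansion */	\
								\
	if (used < (burst)) {					\
		used++;						\
		(hits)++;					\
	}							\
} while (0)

/* Two independent call sites; each returns 1 if its message would print. */
static int site_a(void)
{
	int hits = 0;

	CALLSITE_LIMIT(2, hits);
	return hits;
}

static int site_b(void)
{
	int hits = 0;

	CALLSITE_LIMIT(2, hits);
	return hits;
}

/* Usage: exhausting site_a leaves site_b's budget untouched. */
static void callsite_demo(void)
{
	assert(site_a() == 1);
	assert(site_a() == 1);
	assert(site_a() == 0);	/* site_a budget spent */
	assert(site_b() == 1);	/* site_b unaffected */
}
```

With a single shared state, as in the removed code, a noisy call site could suppress messages from every other site; the per-expansion static avoids that at the cost of one state object per call site.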
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 0a186c4..1718950 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -338,6 +338,19 @@ dma_fence_is_signaled(struct dma_fence *fence)
 }
 
 /**
+ * __dma_fence_is_later - return if f1 is chronologically later than f2
+ * @f1:	[in]	the first fence's seqno
+ * @f2:	[in]	the second fence's seqno from the same context
+ *
+ * Returns true if f1 is chronologically later than f2. Both fences must be
+ * from the same context, since a seqno is not common across contexts.
+ */
+static inline bool __dma_fence_is_later(u32 f1, u32 f2)
+{
+	return (int)(f1 - f2) > 0;
+}
+
+/**
  * dma_fence_is_later - return if f1 is chronologically later than f2
  * @f1:	[in]	the first fence from the same context
  * @f2:	[in]	the second fence from the same context
@@ -351,7 +364,7 @@ static inline bool dma_fence_is_later(struct dma_fence *f1,
 	if (WARN_ON(f1->context != f2->context))
 		return false;
 
-	return (int)(f1->seqno - f2->seqno) > 0;
+	return __dma_fence_is_later(f1->seqno, f2->seqno);
 }
 
 /**
@@ -418,8 +431,8 @@ int dma_fence_get_status(struct dma_fence *fence);
 static inline void dma_fence_set_error(struct dma_fence *fence,
 				       int error)
 {
-	BUG_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags));
-	BUG_ON(error >= 0 || error < -MAX_ERRNO);
+	WARN_ON(test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags));
+	WARN_ON(error >= 0 || error < -MAX_ERRNO);
 
 	fence->error = error;
 }
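Note on __dma_fence_is_later() above: the comparison is done on the difference of the two sequence numbers rather than on the numbers themselves, which keeps the ordering correct across a 32-bit wraparound as long as the two values are less than 2^31 apart. A minimal userspace sketch, assuming fixed-width types in place of the kernel's u32/int (seqno_is_later() and seqno_selfcheck() are hypothetical names):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the helper; the name is illustrative only. */
static bool seqno_is_later(uint32_t f1, uint32_t f2)
{
	/*
	 * The unsigned difference wraps modulo 2^32. Viewed as a signed
	 * value it is positive when f1 is ahead of f2 by less than 2^31.
	 * The conversion of large values to int32_t is implementation-
	 * defined before C23; common compilers use two's complement.
	 */
	return (int32_t)(uint32_t)(f1 - f2) > 0;
}

/* Usage: ordinary ordering, equality, and ordering across the wrap. */
static void seqno_selfcheck(void)
{
	assert(seqno_is_later(5, 3));
	assert(!seqno_is_later(3, 5));
	assert(!seqno_is_later(7, 7));
	assert(seqno_is_later(2, 0xfffffffeu));	/* 2 comes after the wrap */
}
```

A direct f1 > f2 test would report 2 as earlier than 0xfffffffe, which is the case the subtraction form handles.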
diff --git a/include/linux/fs.h b/include/linux/fs.h
index cbfe127..2625fc4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -72,6 +72,8 @@ extern int leases_enable, lease_break_time;
 extern int sysctl_protected_symlinks;
 extern int sysctl_protected_hardlinks;
 
+typedef __kernel_rwf_t rwf_t;
+
 struct buffer_head;
 typedef int (get_block_t)(struct inode *inode, sector_t iblock,
 			struct buffer_head *bh_result, int create);
@@ -1758,9 +1760,9 @@ extern ssize_t __vfs_write(struct file *, const char __user *, size_t, loff_t *)
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_readv(struct file *, const struct iovec __user *,
-		unsigned long, loff_t *, int);
+		unsigned long, loff_t *, rwf_t);
 extern ssize_t vfs_writev(struct file *, const struct iovec __user *,
-		unsigned long, loff_t *, int);
+		unsigned long, loff_t *, rwf_t);
 extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
 				   loff_t, size_t, unsigned int);
 extern int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
@@ -2874,9 +2876,9 @@ extern ssize_t generic_file_direct_write(struct kiocb *, struct iov_iter *);
 extern ssize_t generic_perform_write(struct file *, struct iov_iter *, loff_t);
 
 ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
-		int flags);
+		rwf_t flags);
 ssize_t vfs_iter_write(struct file *file, struct iov_iter *iter, loff_t *ppos,
-		int flags);
+		rwf_t flags);
 
 /* fs/block_dev.c */
 extern ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to);
@@ -3143,7 +3145,7 @@ static inline int iocb_flags(struct file *file)
 	return res;
 }
 
-static inline int kiocb_set_rw_flags(struct kiocb *ki, int flags)
+static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 {
 	if (unlikely(flags & ~RWF_SUPPORTED))
 		return -EOPNOTSUPP;
diff --git a/include/linux/genalloc.h b/include/linux/genalloc.h
index 29d4385..6dfec4d 100644
--- a/include/linux/genalloc.h
+++ b/include/linux/genalloc.h
@@ -38,12 +38,13 @@ struct device_node;
 struct gen_pool;
 
 /**
- * Allocation callback function type definition
+ * typedef genpool_algo_t: Allocation callback function type definition
  * @map: Pointer to bitmap
  * @size: The bitmap size in bits
  * @start: The bitnumber to start searching at
  * @nr: The number of zeroed bits we're looking for
- * @data: optional additional data used by @genpool_algo_t
+ * @data: optional additional data used by the callback
+ * @pool: the pool being allocated from
  */
 typedef unsigned long (*genpool_algo_t)(unsigned long *map,
 			unsigned long size,
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index a2f6707..0e84971 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -126,17 +126,11 @@ extern struct group_info init_groups;
 #endif
 
 #ifdef CONFIG_PREEMPT_RCU
-#define INIT_TASK_RCU_TREE_PREEMPT()					\
-	.rcu_blocked_node = NULL,
-#else
-#define INIT_TASK_RCU_TREE_PREEMPT(tsk)
-#endif
-#ifdef CONFIG_PREEMPT_RCU
 #define INIT_TASK_RCU_PREEMPT(tsk)					\
 	.rcu_read_lock_nesting = 0,					\
 	.rcu_read_unlock_special.s = 0,					\
 	.rcu_node_entry = LIST_HEAD_INIT(tsk.rcu_node_entry),		\
-	INIT_TASK_RCU_TREE_PREEMPT()
+	.rcu_blocked_node = NULL,
 #else
 #define INIT_TASK_RCU_PREEMPT(tsk)
 #endif
diff --git a/include/linux/mfd/da9052/da9052.h b/include/linux/mfd/da9052/da9052.h
index ce9230a..ae5b663 100644
--- a/include/linux/mfd/da9052/da9052.h
+++ b/include/linux/mfd/da9052/da9052.h
@@ -45,6 +45,12 @@
 #define DA9052_ADC_TJUNC	8
 #define DA9052_ADC_VBBAT	9
 
+/* TSI channel has its own 4 channel mux */
+#define DA9052_ADC_TSI_XP	70
+#define DA9052_ADC_TSI_XN	71
+#define DA9052_ADC_TSI_YP	72
+#define DA9052_ADC_TSI_YN	73
+
 #define DA9052_IRQ_DCIN	0
 #define DA9052_IRQ_VBUS	1
 #define DA9052_IRQ_DCINREM	2
diff --git a/include/linux/mfd/da9052/reg.h b/include/linux/mfd/da9052/reg.h
index 5010f97..76780ea 100644
--- a/include/linux/mfd/da9052/reg.h
+++ b/include/linux/mfd/da9052/reg.h
@@ -690,7 +690,10 @@
 /* TSI CONTROL REGISTER B BITS */
 #define DA9052_TSICONTB_ADCREF		0X80
 #define DA9052_TSICONTB_TSIMAN		0X40
-#define DA9052_TSICONTB_TSIMUX		0X30
+#define DA9052_TSICONTB_TSIMUX_XP	0X00
+#define DA9052_TSICONTB_TSIMUX_YP	0X10
+#define DA9052_TSICONTB_TSIMUX_XN	0X20
+#define DA9052_TSICONTB_TSIMUX_YN	0X30
 #define DA9052_TSICONTB_TSISEL3	0X08
 #define DA9052_TSICONTB_TSISEL2	0X04
 #define DA9052_TSICONTB_TSISEL1	0X02
@@ -705,8 +708,14 @@
 /* TSI CO-ORDINATE LSB RESULT REGISTER BITS */
 #define DA9052_TSILSB_PENDOWN		0X40
 #define DA9052_TSILSB_TSIZL		0X30
+#define DA9052_TSILSB_TSIZL_SHIFT	4
+#define DA9052_TSILSB_TSIZL_BITS	2
 #define DA9052_TSILSB_TSIYL		0X0C
+#define DA9052_TSILSB_TSIYL_SHIFT	2
+#define DA9052_TSILSB_TSIYL_BITS	2
 #define DA9052_TSILSB_TSIXL		0X03
+#define DA9052_TSILSB_TSIXL_SHIFT	0
+#define DA9052_TSILSB_TSIXL_BITS	2
 
 /* TSI Z MEASUREMENT MSB RESULT REGISTER BIT */
 #define DA9052_TSIZMSB_TSIZM		0XFF
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index b54517c..c8a63e1 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -428,6 +428,12 @@ enum mlx4_steer_type {
 	MLX4_NUM_STEERS
 };
 
+enum mlx4_resource_usage {
+	MLX4_RES_USAGE_NONE,
+	MLX4_RES_USAGE_DRIVER,
+	MLX4_RES_USAGE_USER_VERBS,
+};
+
 enum {
 	MLX4_NUM_FEXCH          = 64 * 1024,
 };
@@ -749,6 +755,7 @@ struct mlx4_cq {
 	} tasklet_ctx;
 	int		reset_notify_added;
 	struct list_head	reset_notify;
+	u8			usage;
 };
 
 struct mlx4_qp {
@@ -758,6 +765,7 @@ struct mlx4_qp {
 
 	atomic_t		refcount;
 	struct completion	free;
+	u8			usage;
 };
 
 struct mlx4_srq {
@@ -1121,7 +1129,7 @@ int mlx4_cq_alloc(struct mlx4_dev *dev, int nent, struct mlx4_mtt *mtt,
 		  unsigned vector, int collapsed, int timestamp_en);
 void mlx4_cq_free(struct mlx4_dev *dev, struct mlx4_cq *cq);
 int mlx4_qp_reserve_range(struct mlx4_dev *dev, int cnt, int align,
-			  int *base, u8 flags);
+			  int *base, u8 flags, u8 usage);
 void mlx4_qp_release_range(struct mlx4_dev *dev, int base_qpn, int cnt);
 
 int mlx4_qp_alloc(struct mlx4_dev *dev, int qpn, struct mlx4_qp *qp);
@@ -1418,7 +1426,7 @@ int mlx4_get_phys_port_id(struct mlx4_dev *dev);
 int mlx4_wol_read(struct mlx4_dev *dev, u64 *config, int port);
 int mlx4_wol_write(struct mlx4_dev *dev, u64 config, int port);
 
-int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx);
+int mlx4_counter_alloc(struct mlx4_dev *dev, u32 *idx, u8 usage);
 void mlx4_counter_free(struct mlx4_dev *dev, u32 idx);
 int mlx4_get_default_counter_index(struct mlx4_dev *dev, int port);
 
diff --git a/include/linux/mlx5/device.h b/include/linux/mlx5/device.h
index f31a0b5..c13d71d 100644
--- a/include/linux/mlx5/device.h
+++ b/include/linux/mlx5/device.h
@@ -290,6 +290,7 @@ enum mlx5_event {
 	MLX5_EVENT_TYPE_GPIO_EVENT	   = 0x15,
 	MLX5_EVENT_TYPE_PORT_MODULE_EVENT  = 0x16,
 	MLX5_EVENT_TYPE_REMOTE_CONFIG	   = 0x19,
+	MLX5_EVENT_TYPE_GENERAL_EVENT	   = 0x22,
 	MLX5_EVENT_TYPE_PPS_EVENT          = 0x25,
 
 	MLX5_EVENT_TYPE_DB_BF_CONGESTION   = 0x1a,
@@ -305,6 +306,10 @@ enum mlx5_event {
 };
 
 enum {
+	MLX5_GENERAL_SUBTYPE_DELAY_DROP_TIMEOUT = 0x1,
+};
+
+enum {
 	MLX5_PORT_CHANGE_SUBTYPE_DOWN		= 1,
 	MLX5_PORT_CHANGE_SUBTYPE_ACTIVE		= 4,
 	MLX5_PORT_CHANGE_SUBTYPE_INITIALIZED	= 5,
@@ -968,7 +973,7 @@ enum mlx5_cap_type {
 	MLX5_CAP_ATOMIC,
 	MLX5_CAP_ROCE,
 	MLX5_CAP_IPOIB_OFFLOADS,
-	MLX5_CAP_EOIB_OFFLOADS,
+	MLX5_CAP_IPOIB_ENHANCED_OFFLOADS,
 	MLX5_CAP_FLOW_TABLE,
 	MLX5_CAP_ESWITCH_FLOW_TABLE,
 	MLX5_CAP_ESWITCH,
@@ -1011,6 +1016,10 @@ enum mlx5_mcam_feature_groups {
 	MLX5_GET(per_protocol_networking_offload_caps,\
 		 mdev->caps.hca_max[MLX5_CAP_ETHERNET_OFFLOADS], cap)
 
+#define MLX5_CAP_IPOIB_ENHANCED(mdev, cap) \
+	MLX5_GET(per_protocol_networking_offload_caps,\
+		 mdev->caps.hca_cur[MLX5_CAP_IPOIB_ENHANCED_OFFLOADS], cap)
+
 #define MLX5_CAP_ROCE(mdev, cap) \
 	MLX5_GET(roce_cap, mdev->caps.hca_cur[MLX5_CAP_ROCE], cap)
 
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index df6ce59..b3fc9d5 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -162,6 +162,13 @@ enum dbg_rsc_type {
 	MLX5_DBG_RSC_CQ,
 };
 
+enum port_state_policy {
+	MLX5_POLICY_DOWN	= 0,
+	MLX5_POLICY_UP		= 1,
+	MLX5_POLICY_FOLLOW	= 2,
+	MLX5_POLICY_INVALID	= 0xffffffff
+};
+
 struct mlx5_field_desc {
 	struct dentry	       *dent;
 	int			i;
@@ -185,6 +192,7 @@ enum mlx5_dev_event {
 	MLX5_DEV_EVENT_GUID_CHANGE,
 	MLX5_DEV_EVENT_CLIENT_REREG,
 	MLX5_DEV_EVENT_PPS,
+	MLX5_DEV_EVENT_DELAY_DROP_TIMEOUT,
 };
 
 enum mlx5_port_status {
@@ -291,7 +299,7 @@ struct mlx5_cmd {
 	struct semaphore pages_sem;
 	int	mode;
 	struct mlx5_cmd_work_ent *ent_arr[MLX5_MAX_COMMANDS];
-	struct pci_pool *pool;
+	struct dma_pool *pool;
 	struct mlx5_cmd_debug dbg;
 	struct cmd_msg_cache cache[MLX5_NUM_COMMAND_CACHES];
 	int checksum_disabled;
@@ -410,6 +418,7 @@ enum mlx5_res_type {
 	MLX5_RES_SQ	= MLX5_EVENT_QUEUE_TYPE_SQ,
 	MLX5_RES_SRQ	= 3,
 	MLX5_RES_XSRQ	= 4,
+	MLX5_RES_XRQ	= 5,
 };
 
 struct mlx5_core_rsc_common {
@@ -525,6 +534,9 @@ struct mlx5_mkey_table {
 
 struct mlx5_vf_context {
 	int	enabled;
+	u64	port_guid;
+	u64	node_guid;
+	enum port_state_policy	policy;
 };
 
 struct mlx5_core_sriov {
@@ -534,7 +546,6 @@ struct mlx5_core_sriov {
 };
 
 struct mlx5_irq_info {
-	cpumask_var_t mask;
 	char name[MLX5_MAX_IRQ_NAME];
 };
 
@@ -597,7 +608,6 @@ struct mlx5_port_module_event_stats {
 struct mlx5_priv {
 	char			name[MLX5_MAX_NAME_LEN];
 	struct mlx5_eq_table	eq_table;
-	struct msix_entry	*msix_arr;
 	struct mlx5_irq_info	*irq_info;
 
 	/* pages stuff */
@@ -673,9 +683,7 @@ enum mlx5_device_state {
 };
 
 enum mlx5_interface_state {
-	MLX5_INTERFACE_STATE_DOWN = BIT(0),
-	MLX5_INTERFACE_STATE_UP = BIT(1),
-	MLX5_INTERFACE_STATE_SHUTDOWN = BIT(2),
+	MLX5_INTERFACE_STATE_UP = BIT(0),
 };
 
 enum mlx5_pci_status {
@@ -842,13 +850,6 @@ struct mlx5_pas {
 	u8	log_sz;
 };
 
-enum port_state_policy {
-	MLX5_POLICY_DOWN	= 0,
-	MLX5_POLICY_UP		= 1,
-	MLX5_POLICY_FOLLOW	= 2,
-	MLX5_POLICY_INVALID	= 0xffffffff
-};
-
 enum phy_port_state {
 	MLX5_AAA_111
 };
@@ -1091,7 +1092,7 @@ enum {
 };
 
 enum {
-	MAX_UMR_CACHE_ENTRY = 20,
+	MR_CACHE_LAST_STD_ENTRY = 20,
 	MLX5_IMR_MTT_CACHE_ENTRY,
 	MLX5_IMR_KSM_CACHE_ENTRY,
 	MAX_MR_CACHE_ENTRIES
@@ -1185,4 +1186,10 @@ enum {
 	MLX5_TRIGGERED_CMD_COMP = (u64)1 << 32,
 };
 
+static inline const struct cpumask *
+mlx5_get_vector_affinity(struct mlx5_core_dev *dev, int vector)
+{
+	return pci_irq_get_affinity(dev->pdev, MLX5_EQ_VEC_COMP_BASE + vector);
+}
+
 #endif /* MLX5_DRIVER_H */
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index 3030121..b7338a2 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -200,6 +200,7 @@ enum {
 	MLX5_CMD_OP_QUERY_SQ                      = 0x907,
 	MLX5_CMD_OP_CREATE_RQ                     = 0x908,
 	MLX5_CMD_OP_MODIFY_RQ                     = 0x909,
+	MLX5_CMD_OP_SET_DELAY_DROP_PARAMS         = 0x910,
 	MLX5_CMD_OP_DESTROY_RQ                    = 0x90a,
 	MLX5_CMD_OP_QUERY_RQ                      = 0x90b,
 	MLX5_CMD_OP_CREATE_RMP                    = 0x90c,
@@ -294,8 +295,10 @@ struct mlx5_ifc_flow_table_fields_supported_bits {
 	u8         inner_tcp_dport[0x1];
 	u8         inner_tcp_flags[0x1];
 	u8         reserved_at_37[0x9];
+	u8         reserved_at_40[0x1a];
+	u8         bth_dst_qp[0x1];
 
-	u8         reserved_at_40[0x40];
+	u8         reserved_at_5b[0x25];
 };
 
 struct mlx5_ifc_flow_table_prop_layout_bits {
@@ -431,7 +434,9 @@ struct mlx5_ifc_fte_match_set_misc_bits {
 	u8         reserved_at_100[0xc];
 	u8         inner_ipv6_flow_label[0x14];
 
-	u8         reserved_at_120[0xe0];
+	u8         reserved_at_120[0x28];
+	u8         bth_dst_qp[0x18];
+	u8         reserved_at_160[0xa0];
 };
 
 struct mlx5_ifc_cmd_pas_bits {
@@ -599,7 +604,7 @@ struct mlx5_ifc_per_protocol_networking_offload_caps_bits {
 	u8         rss_ind_tbl_cap[0x4];
 	u8         reg_umr_sq[0x1];
 	u8         scatter_fcs[0x1];
-	u8         reserved_at_1a[0x1];
+	u8         enhanced_multi_pkt_send_wqe[0x1];
 	u8         tunnel_lso_const_out_ip_id[0x1];
 	u8         reserved_at_1c[0x2];
 	u8         tunnel_statless_gre[0x1];
@@ -840,7 +845,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         retransmission_q_counters[0x1];
 	u8         reserved_at_183[0x1];
 	u8         modify_rq_counter_set_id[0x1];
-	u8         reserved_at_185[0x1];
+	u8         rq_delay_drop[0x1];
 	u8         max_qp_cnt[0xa];
 	u8         pkey_table_size[0x10];
 
@@ -857,7 +862,7 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         pcam_reg[0x1];
 	u8         local_ca_ack_delay[0x5];
 	u8         port_module_event[0x1];
-	u8         reserved_at_1b1[0x1];
+	u8         enhanced_error_q_counters[0x1];
 	u8         ports_check[0x1];
 	u8         reserved_at_1b3[0x1];
 	u8         disable_link_up[0x1];
@@ -873,7 +878,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         max_tc[0x4];
 	u8         reserved_at_1d0[0x1];
 	u8         dcbx[0x1];
-	u8         reserved_at_1d2[0x3];
+	u8         general_notification_event[0x1];
+	u8         reserved_at_1d3[0x2];
 	u8         fpga[0x1];
 	u8         rol_s[0x1];
 	u8         rol_g[0x1];
@@ -1016,7 +1022,8 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         log_max_wq_sz[0x5];
 
 	u8         nic_vport_change_event[0x1];
-	u8         reserved_at_3e1[0xa];
+	u8         disable_local_lb[0x1];
+	u8         reserved_at_3e2[0x9];
 	u8         log_max_vlan_list[0x5];
 	u8         reserved_at_3f0[0x3];
 	u8         log_max_current_mc_list[0x5];
@@ -1187,7 +1194,8 @@ struct mlx5_ifc_cong_control_r_roce_ecn_np_bits {
 
 	u8         reserved_at_c0[0x12];
 	u8         cnp_dscp[0x6];
-	u8         reserved_at_d8[0x5];
+	u8         reserved_at_d8[0x4];
+	u8         cnp_prio_mode[0x1];
 	u8         cnp_802p_prio[0x3];
 
 	u8         reserved_at_e0[0x720];
@@ -2015,6 +2023,10 @@ enum {
 };
 
 enum {
+	MLX5_QPC_OFFLOAD_TYPE_RNDV  = 0x1,
+};
+
+enum {
 	MLX5_QPC_END_PADDING_MODE_SCATTER_AS_IS                = 0x0,
 	MLX5_QPC_END_PADDING_MODE_PAD_TO_CACHE_LINE_ALIGNMENT  = 0x1,
 };
@@ -2057,7 +2069,8 @@ struct mlx5_ifc_qpc_bits {
 	u8         st[0x8];
 	u8         reserved_at_10[0x3];
 	u8         pm_state[0x2];
-	u8         reserved_at_15[0x7];
+	u8         reserved_at_15[0x3];
+	u8         offload_type[0x4];
 	u8         end_padding_mode[0x2];
 	u8         reserved_at_1e[0x2];
 
@@ -2437,7 +2450,7 @@ struct mlx5_ifc_sqc_bits {
 	u8         cd_master[0x1];
 	u8         fre[0x1];
 	u8         flush_in_error_en[0x1];
-	u8         reserved_at_4[0x1];
+	u8         allow_multi_pkt_send_wqe[0x1];
 	u8	   min_wqe_inline_mode[0x3];
 	u8         state[0x4];
 	u8         reg_umr[0x1];
@@ -2515,7 +2528,7 @@ enum {
 
 struct mlx5_ifc_rqc_bits {
 	u8         rlky[0x1];
-	u8         reserved_at_1[0x1];
+	u8	   delay_drop_en[0x1];
 	u8         scatter_fcs[0x1];
 	u8         vsd[0x1];
 	u8         mem_rq_type[0x4];
@@ -2562,7 +2575,9 @@ struct mlx5_ifc_rmpc_bits {
 struct mlx5_ifc_nic_vport_context_bits {
 	u8         reserved_at_0[0x5];
 	u8         min_wqe_inline_mode[0x3];
-	u8         reserved_at_8[0x17];
+	u8         reserved_at_8[0x15];
+	u8         disable_mc_local_lb[0x1];
+	u8         disable_uc_local_lb[0x1];
 	u8         roce_en[0x1];
 
 	u8         arm_change_event[0x1];
@@ -3000,7 +3015,7 @@ struct mlx5_ifc_xrqc_bits {
 
 	struct mlx5_ifc_tag_matching_topology_context_bits tag_matching_topology_context;
 
-	u8         reserved_at_180[0x880];
+	u8         reserved_at_180[0x280];
 
 	struct mlx5_ifc_wq_bits wq;
 };
@@ -3947,7 +3962,47 @@ struct mlx5_ifc_query_q_counter_out_bits {
 
 	u8         local_ack_timeout_err[0x20];
 
-	u8         reserved_at_320[0x4e0];
+	u8         reserved_at_320[0xa0];
+
+	u8         resp_local_length_error[0x20];
+
+	u8         req_local_length_error[0x20];
+
+	u8         resp_local_qp_error[0x20];
+
+	u8         local_operation_error[0x20];
+
+	u8         resp_local_protection[0x20];
+
+	u8         req_local_protection[0x20];
+
+	u8         resp_cqe_error[0x20];
+
+	u8         req_cqe_error[0x20];
+
+	u8         req_mw_binding[0x20];
+
+	u8         req_bad_response[0x20];
+
+	u8         req_remote_invalid_request[0x20];
+
+	u8         resp_remote_invalid_request[0x20];
+
+	u8         req_remote_access_errors[0x20];
+
+	u8	   resp_remote_access_errors[0x20];
+
+	u8         req_remote_operation_errors[0x20];
+
+	u8         req_transport_retries_exceeded[0x20];
+
+	u8         cq_overflow[0x20];
+
+	u8         resp_cqe_flush_error[0x20];
+
+	u8         req_cqe_flush_error[0x20];
+
+	u8         reserved_at_620[0x1e0];
 };
 
 struct mlx5_ifc_query_q_counter_in_bits {
@@ -5229,7 +5284,9 @@ struct mlx5_ifc_modify_nic_vport_context_out_bits {
 };
 
 struct mlx5_ifc_modify_nic_vport_field_select_bits {
-	u8         reserved_at_0[0x16];
+	u8         reserved_at_0[0x14];
+	u8         disable_uc_local_lb[0x1];
+	u8         disable_mc_local_lb[0x1];
 	u8         node_guid[0x1];
 	u8         port_guid[0x1];
 	u8         min_inline[0x1];
@@ -5847,6 +5904,28 @@ struct mlx5_ifc_destroy_rq_in_bits {
 	u8         reserved_at_60[0x20];
 };
 
+struct mlx5_ifc_set_delay_drop_params_in_bits {
+	u8         opcode[0x10];
+	u8         reserved_at_10[0x10];
+
+	u8         reserved_at_20[0x10];
+	u8         op_mod[0x10];
+
+	u8         reserved_at_40[0x20];
+
+	u8         reserved_at_60[0x10];
+	u8         delay_drop_timeout[0x10];
+};
+
+struct mlx5_ifc_set_delay_drop_params_out_bits {
+	u8         status[0x8];
+	u8         reserved_at_8[0x18];
+
+	u8         syndrome[0x20];
+
+	u8         reserved_at_40[0x40];
+};
+
 struct mlx5_ifc_destroy_rmp_out_bits {
 	u8         status[0x8];
 	u8         reserved_at_8[0x18];
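Note on the mlx5_ifc.h hunks above: each reserved_at_N name encodes the bit offset of the field, and each [0x..] is a width in bits, so carving a named field out of a reserved range should leave the end offset of the range unchanged. The sketch below checks that arithmetic for the fte_match_set_misc hunk. It is an illustrative helper only: struct ifc_field and ifc_end_offset() are hypothetical and not part of the driver.

```c
#include <assert.h>
#include <stddef.h>

/* One field: its name and its width in bits, as written in [0x..]. */
struct ifc_field {
	const char *name;
	unsigned int bits;
};

/* Return the bit offset just past the last of 'count' fields. */
static unsigned int ifc_end_offset(unsigned int start,
				   const struct ifc_field *fields,
				   size_t count)
{
	unsigned int off = start;
	size_t i;

	assert(fields != NULL || count == 0);
	for (i = 0; i < count; i++)
		off += fields[i].bits;
	return off;
}

/* Fields that replace reserved_at_120[0xe0] in fte_match_set_misc. */
static const struct ifc_field misc_tail[] = {
	{ "reserved_at_120", 0x28 },
	{ "bth_dst_qp",      0x18 },
	{ "reserved_at_160", 0xa0 },
};
```

Starting at 0x120, the first two fields end at 0x160, matching the name reserved_at_160, and all three end at 0x200, the same offset the original 0xe0-bit reserved field ended at.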
diff --git a/include/linux/mlx5/qp.h b/include/linux/mlx5/qp.h
index f378dc0..66d19b6 100644
--- a/include/linux/mlx5/qp.h
+++ b/include/linux/mlx5/qp.h
@@ -560,6 +560,9 @@ int mlx5_core_destroy_qp(struct mlx5_core_dev *dev,
 int mlx5_core_qp_query(struct mlx5_core_dev *dev, struct mlx5_core_qp *qp,
 		       u32 *out, int outlen);
 
+int mlx5_core_set_delay_drop(struct mlx5_core_dev *dev,
+			     u32 timeout_usec);
+
 int mlx5_core_xrcd_alloc(struct mlx5_core_dev *dev, u32 *xrcdn);
 int mlx5_core_xrcd_dealloc(struct mlx5_core_dev *dev, u32 xrcdn);
 void mlx5_init_qp_table(struct mlx5_core_dev *dev);
diff --git a/include/linux/mlx5/srq.h b/include/linux/mlx5/srq.h
index 1cde0fd..24ff23e 100644
--- a/include/linux/mlx5/srq.h
+++ b/include/linux/mlx5/srq.h
@@ -38,6 +38,7 @@
 enum {
 	MLX5_SRQ_FLAG_ERR    = (1 << 0),
 	MLX5_SRQ_FLAG_WQ_SIG = (1 << 1),
+	MLX5_SRQ_FLAG_RNDV   = (1 << 2),
 };
 
 struct mlx5_srq_attr {
@@ -56,6 +57,10 @@ struct mlx5_srq_attr {
 	u32 user_index;
 	u64 db_record;
 	__be64 *pas;
+	u32 tm_log_list_size;
+	u32 tm_next_tag;
+	u32 tm_hw_phase_cnt;
+	u32 tm_sw_phase_cnt;
 };
 
 struct mlx5_core_dev;
diff --git a/include/linux/mlx5/vport.h b/include/linux/mlx5/vport.h
index 656c70b..aaa0bb9 100644
--- a/include/linux/mlx5/vport.h
+++ b/include/linux/mlx5/vport.h
@@ -114,5 +114,6 @@ int mlx5_core_modify_hca_vport_context(struct mlx5_core_dev *dev,
 				       u8 other_vport, u8 port_num,
 				       int vf,
 				       struct mlx5_hca_vport_context *req);
-
+int mlx5_nic_vport_update_local_lb(struct mlx5_core_dev *mdev, bool enable);
+int mlx5_nic_vport_query_local_lb(struct mlx5_core_dev *mdev, bool *status);
 #endif /* __MLX5_VPORT_H__ */
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 46b9ac5..c1f6c95 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1260,6 +1260,7 @@ int copy_page_range(struct mm_struct *dst, struct mm_struct *src,
 void unmap_mapping_range(struct address_space *mapping,
 		loff_t const holebegin, loff_t const holelen, int even_cows);
 int follow_pte_pmd(struct mm_struct *mm, unsigned long address,
+			     unsigned long *start, unsigned long *end,
 			     pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
 	unsigned long *pfn);
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index c91b3bc..7b2e31b 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -95,17 +95,6 @@ struct mmu_notifier_ops {
 			   pte_t pte);
 
 	/*
-	 * Before this is invoked any secondary MMU is still ok to
-	 * read/write to the page previously pointed to by the Linux
-	 * pte because the page hasn't been freed yet and it won't be
-	 * freed until this returns. If required set_page_dirty has to
-	 * be called internally to this method.
-	 */
-	void (*invalidate_page)(struct mmu_notifier *mn,
-				struct mm_struct *mm,
-				unsigned long address);
-
-	/*
 	 * invalidate_range_start() and invalidate_range_end() must be
 	 * paired and are called only when the mmap_sem and/or the
 	 * locks protecting the reverse maps are held. If the subsystem
@@ -220,8 +209,6 @@ extern int __mmu_notifier_test_young(struct mm_struct *mm,
 				     unsigned long address);
 extern void __mmu_notifier_change_pte(struct mm_struct *mm,
 				      unsigned long address, pte_t pte);
-extern void __mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address);
 extern void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 				  unsigned long start, unsigned long end);
 extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
@@ -268,13 +255,6 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm,
 		__mmu_notifier_change_pte(mm, address, pte);
 }
 
-static inline void mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address)
-{
-	if (mm_has_notifiers(mm))
-		__mmu_notifier_invalidate_page(mm, address);
-}
-
 static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 				  unsigned long start, unsigned long end)
 {
@@ -442,11 +422,6 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm,
 {
 }
 
-static inline void mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address)
-{
-}
-
 static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 				  unsigned long start, unsigned long end)
 {
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 779b235..c99ba79 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3866,6 +3866,8 @@ int netdev_walk_all_upper_dev_rcu(struct net_device *dev,
 bool netdev_has_upper_dev_all_rcu(struct net_device *dev,
 				  struct net_device *upper_dev);
 
+bool netdev_has_any_upper_dev(struct net_device *dev);
+
 void *netdev_lower_get_next_private(struct net_device *dev,
 				    struct list_head **iter);
 void *netdev_lower_get_next_private_rcu(struct net_device *dev,
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 25d8225..8efff88 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -254,7 +254,7 @@ enum {
 	NVME_CTRL_VWC_PRESENT			= 1 << 0,
 	NVME_CTRL_OACS_SEC_SUPP                 = 1 << 0,
 	NVME_CTRL_OACS_DIRECTIVES		= 1 << 5,
-	NVME_CTRL_OACS_DBBUF_SUPP		= 1 << 7,
+	NVME_CTRL_OACS_DBBUF_SUPP		= 1 << 8,
 };
 
 struct nvme_lbaf {
diff --git a/include/linux/pci.h b/include/linux/pci.h
index f958d07..da05e5d 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -731,6 +731,7 @@ struct pci_driver {
 	void (*shutdown) (struct pci_dev *dev);
 	int (*sriov_configure) (struct pci_dev *dev, int num_vfs); /* PF pdev */
 	const struct pci_error_handlers *err_handler;
+	const struct attribute_group **groups;
 	struct device_driver	driver;
 	struct pci_dynids dynids;
 };
diff --git a/include/linux/platform_data/omap_drm.h b/include/linux/platform_data/omap_drm.h
deleted file mode 100644
index f4e4a23..0000000
--- a/include/linux/platform_data/omap_drm.h
+++ /dev/null
@@ -1,53 +0,0 @@
-/*
- * DRM/KMS platform data for TI OMAP platforms
- *
- * Copyright (C) 2012 Texas Instruments
- * Author: Rob Clark <rob.clark@linaro.org>
- *
- * This program is free software; you can redistribute it and/or modify it
- * under the terms of the GNU General Public License version 2 as published by
- * the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but WITHOUT
- * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
- * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
- * more details.
- *
- * You should have received a copy of the GNU General Public License along with
- * this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef __PLATFORM_DATA_OMAP_DRM_H__
-#define __PLATFORM_DATA_OMAP_DRM_H__
-
-/*
- * Optional platform data to configure the default configuration of which
- * pipes/overlays/CRTCs are used.. if this is not provided, then instead the
- * first CONFIG_DRM_OMAP_NUM_CRTCS are used, and they are each connected to
- * one manager, with priority given to managers that are connected to
- * detected devices.  Remaining overlays are used as video planes.  This
- * should be a good default behavior for most cases, but yet there still
- * might be times when you wish to do something different.
- */
-struct omap_kms_platform_data {
-	/* overlays to use as CRTCs: */
-	int ovl_cnt;
-	const int *ovl_ids;
-
-	/* overlays to use as video planes: */
-	int pln_cnt;
-	const int *pln_ids;
-
-	int mgr_cnt;
-	const int *mgr_ids;
-
-	int dev_cnt;
-	const char **dev_names;
-};
-
-struct omap_drm_platform_data {
-	uint32_t omaprev;
-	struct omap_kms_platform_data *kms_pdata;
-};
-
-#endif /* __PLATFORM_DATA_OMAP_DRM_H__ */
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index f816fc7..96f1baf 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -58,8 +58,6 @@ void call_rcu(struct rcu_head *head, rcu_callback_t func);
 void call_rcu_bh(struct rcu_head *head, rcu_callback_t func);
 void call_rcu_sched(struct rcu_head *head, rcu_callback_t func);
 void synchronize_sched(void);
-void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
-void synchronize_rcu_tasks(void);
 void rcu_barrier_tasks(void);
 
 #ifdef CONFIG_PREEMPT_RCU
@@ -105,11 +103,13 @@ static inline int rcu_preempt_depth(void)
 
 /* Internal to kernel */
 void rcu_init(void);
+extern int rcu_scheduler_active __read_mostly;
 void rcu_sched_qs(void);
 void rcu_bh_qs(void);
 void rcu_check_callbacks(int user);
 void rcu_report_dead(unsigned int cpu);
 void rcu_cpu_starting(unsigned int cpu);
+void rcutree_migrate_callbacks(int cpu);
 
 #ifdef CONFIG_RCU_STALL_COMMON
 void rcu_sysrq_start(void);
@@ -164,8 +164,6 @@ static inline void rcu_init_nohz(void) { }
  * macro rather than an inline function to avoid #include hell.
  */
 #ifdef CONFIG_TASKS_RCU
-#define TASKS_RCU(x) x
-extern struct srcu_struct tasks_rcu_exit_srcu;
 #define rcu_note_voluntary_context_switch_lite(t) \
 	do { \
 		if (READ_ONCE((t)->rcu_tasks_holdout)) \
@@ -176,10 +174,17 @@ extern struct srcu_struct tasks_rcu_exit_srcu;
 		rcu_all_qs(); \
 		rcu_note_voluntary_context_switch_lite(t); \
 	} while (0)
+void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
+void synchronize_rcu_tasks(void);
+void exit_tasks_rcu_start(void);
+void exit_tasks_rcu_finish(void);
 #else /* #ifdef CONFIG_TASKS_RCU */
-#define TASKS_RCU(x) do { } while (0)
 #define rcu_note_voluntary_context_switch_lite(t)	do { } while (0)
 #define rcu_note_voluntary_context_switch(t)		rcu_all_qs()
+#define call_rcu_tasks call_rcu_sched
+#define synchronize_rcu_tasks synchronize_sched
+static inline void exit_tasks_rcu_start(void) { }
+static inline void exit_tasks_rcu_finish(void) { }
 #endif /* #else #ifdef CONFIG_TASKS_RCU */
 
 /**
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index 5becbbc..b3dbf95 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -116,13 +116,11 @@ static inline void rcu_irq_exit_irqson(void) { }
 static inline void rcu_irq_enter_irqson(void) { }
 static inline void rcu_irq_exit(void) { }
 static inline void exit_rcu(void) { }
-
-#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU)
-extern int rcu_scheduler_active __read_mostly;
+#ifdef CONFIG_SRCU
 void rcu_scheduler_starting(void);
-#else /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU) */
+#else /* #ifdef CONFIG_SRCU */
 static inline void rcu_scheduler_starting(void) { }
-#endif /* #else #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU) */
+#endif /* #else #ifdef CONFIG_SRCU */
 static inline void rcu_end_inkernel_boot(void) { }
 static inline bool rcu_is_watching(void) { return true; }
 
diff --git a/include/linux/reservation.h b/include/linux/reservation.h
index 156cfd3..21fc84d 100644
--- a/include/linux/reservation.h
+++ b/include/linux/reservation.h
@@ -254,6 +254,9 @@ int reservation_object_get_fences_rcu(struct reservation_object *obj,
 				      unsigned *pshared_count,
 				      struct dma_fence ***pshared);
 
+int reservation_object_copy_fences(struct reservation_object *dst,
+				   struct reservation_object *src);
+
 long reservation_object_wait_timeout_rcu(struct reservation_object *obj,
 					 bool wait_all, bool intr,
 					 unsigned long timeout);
diff --git a/include/linux/sched.h b/include/linux/sched.h
index c05ac5f..b57c7be 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -589,9 +589,10 @@ struct task_struct {
 
 #ifdef CONFIG_TASKS_RCU
 	unsigned long			rcu_tasks_nvcsw;
-	bool				rcu_tasks_holdout;
-	struct list_head		rcu_tasks_holdout_list;
+	u8				rcu_tasks_holdout;
+	u8				rcu_tasks_idx;
 	int				rcu_tasks_idle_cpu;
+	struct list_head		rcu_tasks_holdout_list;
 #endif /* #ifdef CONFIG_TASKS_RCU */
 
 	struct sched_info		sched_info;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index dbe29b6..d67a818 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -973,7 +973,23 @@ int __must_check skb_to_sgvec_nomark(struct sk_buff *skb, struct scatterlist *sg
 int __must_check skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg,
 			      int offset, int len);
 int skb_cow_data(struct sk_buff *skb, int tailbits, struct sk_buff **trailer);
-int skb_pad(struct sk_buff *skb, int pad);
+int __skb_pad(struct sk_buff *skb, int pad, bool free_on_error);
+
+/**
+ *	skb_pad			-	zero pad the tail of an skb
+ *	@skb: buffer to pad
+ *	@pad: space to pad
+ *
+ *	Ensure that a buffer is followed by a padding area that is zero
+ *	filled. Used by network drivers which may DMA or transfer data
+ *	beyond the buffer end onto the wire.
+ *
+ *	May return error in out of memory cases. The skb is freed on error.
+ */
+static inline int skb_pad(struct sk_buff *skb, int pad)
+{
+	return __skb_pad(skb, pad, true);
+}
 #define dev_kfree_skb(a)	consume_skb(a)
 
 int skb_append_datato_frags(struct sock *sk, struct sk_buff *skb,
@@ -2825,6 +2841,31 @@ static inline int skb_padto(struct sk_buff *skb, unsigned int len)
  *	skb_put_padto - increase size and pad an skbuff up to a minimal size
  *	@skb: buffer to pad
  *	@len: minimal length
+ *	@free_on_error: free buffer on error
+ *
+ *	Pads up a buffer to ensure the trailing bytes exist and are
+ *	blanked. If the buffer already contains sufficient data it
+ *	is untouched. Otherwise it is extended. Returns zero on
+ *	success. The skb is freed on error if @free_on_error is true.
+ */
+static inline int __skb_put_padto(struct sk_buff *skb, unsigned int len,
+				  bool free_on_error)
+{
+	unsigned int size = skb->len;
+
+	if (unlikely(size < len)) {
+		len -= size;
+		if (__skb_pad(skb, len, free_on_error))
+			return -ENOMEM;
+		__skb_put(skb, len);
+	}
+	return 0;
+}
+
+/**
+ *	skb_put_padto - increase size and pad an skbuff up to a minimal size
+ *	@skb: buffer to pad
+ *	@len: minimal length
  *
  *	Pads up a buffer to ensure the trailing bytes exist and are
  *	blanked. If the buffer already contains sufficient data it
@@ -2833,15 +2874,7 @@ static inline int skb_padto(struct sk_buff *skb, unsigned int len)
  */
 static inline int skb_put_padto(struct sk_buff *skb, unsigned int len)
 {
-	unsigned int size = skb->len;
-
-	if (unlikely(size < len)) {
-		len -= size;
-		if (skb_pad(skb, len))
-			return -ENOMEM;
-		__skb_put(skb, len);
-	}
-	return 0;
+	return __skb_put_padto(skb, len, true);
 }
 
 static inline int skb_add_data(struct sk_buff *skb,
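Editorial note, not part of the patch: the control flow of __skb_put_padto() in the hunk above can be tried out in userspace. The sketch below is only an analogue under stated assumptions: `struct buf`, `buf_alloc()`, `buf_free()` and `buf_put_padto()` are hypothetical names, not kernel APIs, and a fixed capacity stands in for the allocation failure that __skb_pad() can report. It mirrors the "zero-fill and extend when shorter than len, otherwise leave untouched" logic and the free_on_error choice.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>

/* Hypothetical stand-in for an skb: a heap buffer with length and capacity. */
struct buf {
	unsigned char *data;
	unsigned int len;
	unsigned int cap;
};

/* Hypothetical helper: allocate a buffer holding len bytes of 0xff. */
static struct buf *buf_alloc(unsigned int len, unsigned int cap)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->data = malloc(cap);
	if (!b->data) {
		free(b);
		return NULL;
	}
	memset(b->data, 0xff, len);
	b->len = len;
	b->cap = cap;
	return b;
}

static void buf_free(struct buf *b)
{
	free(b->data);
	free(b);
}

/*
 * Same shape as __skb_put_padto(): if the buffer is shorter than len,
 * zero-fill the tail and grow it; otherwise leave it untouched.
 * On failure (no room, standing in for -ENOMEM) the buffer is freed
 * only when free_on_error is true.
 */
static int buf_put_padto(struct buf *b, unsigned int len, bool free_on_error)
{
	unsigned int size = b->len;

	if (size < len) {
		if (len > b->cap) {
			if (free_on_error)
				buf_free(b);
			return -ENOMEM;
		}
		memset(b->data + size, 0, len - size);
		b->len = len;
	}
	return 0;
}
```

With free_on_error set to false the caller keeps ownership of the buffer after a failure, which appears to be the reason the patch adds the flag; the existing skb_put_padto() wrapper passes true and so keeps its old behaviour.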
diff --git a/include/linux/spinlock.h b/include/linux/spinlock.h
index d9510e8..ef018a6 100644
--- a/include/linux/spinlock.h
+++ b/include/linux/spinlock.h
@@ -130,12 +130,6 @@ do {								\
 #define smp_mb__before_spinlock()	smp_wmb()
 #endif
 
-/**
- * raw_spin_unlock_wait - wait until the spinlock gets unlocked
- * @lock: the spinlock in question.
- */
-#define raw_spin_unlock_wait(lock)	arch_spin_unlock_wait(&(lock)->raw_lock)
-
 #ifdef CONFIG_DEBUG_SPINLOCK
  extern void do_raw_spin_lock(raw_spinlock_t *lock) __acquires(lock);
 #define do_raw_spin_lock_flags(lock, flags) do_raw_spin_lock(lock)
@@ -369,31 +363,6 @@ static __always_inline int spin_trylock_irq(spinlock_t *lock)
 	raw_spin_trylock_irqsave(spinlock_check(lock), flags); \
 })
 
-/**
- * spin_unlock_wait - Interpose between successive critical sections
- * @lock: the spinlock whose critical sections are to be interposed.
- *
- * Semantically this is equivalent to a spin_lock() immediately
- * followed by a spin_unlock().  However, most architectures have
- * more efficient implementations in which the spin_unlock_wait()
- * cannot block concurrent lock acquisition, and in some cases
- * where spin_unlock_wait() does not write to the lock variable.
- * Nevertheless, spin_unlock_wait() can have high overhead, so if
- * you feel the need to use it, please check to see if there is
- * a better way to get your job done.
- *
- * The ordering guarantees provided by spin_unlock_wait() are:
- *
- * 1.  All accesses preceding the spin_unlock_wait() happen before
- *     any accesses in later critical sections for this same lock.
- * 2.  All accesses following the spin_unlock_wait() happen after
- *     any accesses in earlier critical sections for this same lock.
- */
-static __always_inline void spin_unlock_wait(spinlock_t *lock)
-{
-	raw_spin_unlock_wait(&lock->rlock);
-}
-
 static __always_inline int spin_is_locked(spinlock_t *lock)
 {
 	return raw_spin_is_locked(&lock->rlock);
diff --git a/include/linux/spinlock_up.h b/include/linux/spinlock_up.h
index 0d9848d..612fb530 100644
--- a/include/linux/spinlock_up.h
+++ b/include/linux/spinlock_up.h
@@ -26,11 +26,6 @@
 #ifdef CONFIG_DEBUG_SPINLOCK
 #define arch_spin_is_locked(x)		((x)->slock == 0)
 
-static inline void arch_spin_unlock_wait(arch_spinlock_t *lock)
-{
-	smp_cond_load_acquire(&lock->slock, VAL);
-}
-
 static inline void arch_spin_lock(arch_spinlock_t *lock)
 {
 	lock->slock = 0;
@@ -73,7 +68,6 @@ static inline void arch_spin_unlock(arch_spinlock_t *lock)
 
 #else /* DEBUG_SPINLOCK */
 #define arch_spin_is_locked(lock)	((void)(lock), 0)
-#define arch_spin_unlock_wait(lock)	do { barrier(); (void)(lock); } while (0)
 /* for sched/core.c and kernel_lock.c: */
 # define arch_spin_lock(lock)		do { barrier(); (void)(lock); } while (0)
 # define arch_spin_lock_flags(lock, flags)	do { barrier(); (void)(lock); } while (0)
diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index cfbfc54..261471f 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -87,4 +87,17 @@ static inline void srcu_barrier(struct srcu_struct *sp)
 	synchronize_srcu(sp);
 }
 
+/* Defined here to avoid size increase for non-torture kernels. */
+static inline void srcu_torture_stats_print(struct srcu_struct *sp,
+					    char *tt, char *tf)
+{
+	int idx;
+
+	idx = READ_ONCE(sp->srcu_idx) & 0x1;
+	pr_alert("%s%s Tiny SRCU per-CPU(idx=%d): (%hd,%hd)\n",
+		 tt, tf, idx,
+		 READ_ONCE(sp->srcu_lock_nesting[!idx]),
+		 READ_ONCE(sp->srcu_lock_nesting[idx]));
+}
+
 #endif
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index 42973f7..a949f4f 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -104,8 +104,6 @@ struct srcu_struct {
 #define SRCU_STATE_SCAN1	1
 #define SRCU_STATE_SCAN2	2
 
-void process_srcu(struct work_struct *work);
-
 #define __SRCU_STRUCT_INIT(name)					\
 	{								\
 		.sda = &name##_srcu_data,				\
@@ -141,5 +139,6 @@ void process_srcu(struct work_struct *work);
 
 void synchronize_srcu_expedited(struct srcu_struct *sp);
 void srcu_barrier(struct srcu_struct *sp);
+void srcu_torture_stats_print(struct srcu_struct *sp, char *tt, char *tf);
 
 #endif
diff --git a/include/linux/swait.h b/include/linux/swait.h
index c1f9c62..4a4e180 100644
--- a/include/linux/swait.h
+++ b/include/linux/swait.h
@@ -169,4 +169,59 @@ do {									\
 	__ret;								\
 })
 
+#define __swait_event_idle(wq, condition)				\
+	(void)___swait_event(wq, condition, TASK_IDLE, 0, schedule())
+
+/**
+ * swait_event_idle - wait without system load contribution
+ * @wq: the waitqueue to wait on
+ * @condition: a C expression for the event to wait for
+ *
+ * The process is put to sleep (TASK_IDLE) until the @condition evaluates to
+ * true. The @condition is checked each time the waitqueue @wq is woken up.
+ *
+ * This function is mostly used when a kthread or workqueue waits for some
+ * condition and doesn't want to contribute to system load. Signals are
+ * ignored.
+ */
+#define swait_event_idle(wq, condition)					\
+do {									\
+	if (condition)							\
+		break;							\
+	__swait_event_idle(wq, condition);				\
+} while (0)
+
+#define __swait_event_idle_timeout(wq, condition, timeout)		\
+	___swait_event(wq, ___wait_cond_timeout(condition),		\
+		       TASK_IDLE, timeout,				\
+		       __ret = schedule_timeout(__ret))
+
+/**
+ * swait_event_idle_timeout - wait up to timeout without load contribution
+ * @wq: the waitqueue to wait on
+ * @condition: a C expression for the event to wait for
+ * @timeout: timeout at which we'll give up in jiffies
+ *
+ * The process is put to sleep (TASK_IDLE) until the @condition evaluates to
+ * true. The @condition is checked each time the waitqueue @wq is woken up.
+ *
+ * This function is mostly used when a kthread or workqueue waits for some
+ * condition and doesn't want to contribute to system load. Signals are
+ * ignored.
+ *
+ * Returns:
+ * 0 if the @condition evaluated to %false after the @timeout elapsed,
+ * 1 if the @condition evaluated to %true after the @timeout elapsed,
+ * or the remaining jiffies (at least 1) if the @condition evaluated
+ * to %true before the @timeout elapsed.
+ */
+#define swait_event_idle_timeout(wq, condition, timeout)		\
+({									\
+	long __ret = timeout;						\
+	if (!___wait_cond_timeout(condition))				\
+		__ret = __swait_event_idle_timeout(wq,			\
+						   condition, timeout);	\
+	__ret;								\
+})
+
 #endif /* _LINUX_SWAIT_H */
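Editorial note, not part of the patch: the return contract documented for swait_event_idle_timeout() above has three cases, which the small model below spells out. `idle_timeout_result()` is a hypothetical function written for illustration; it does not sleep or touch a wait queue, it only maps "final condition value" and "jiffies remaining" to the documented return value.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the documented return contract. `cond` is the final
 * value of the condition and `remaining` is the number of jiffies left when
 * the wait ends. A false condition only ends the wait once the timeout has
 * elapsed, so that case always yields 0.
 */
static long idle_timeout_result(bool cond, long remaining)
{
	if (!cond)
		return 0;		/* timed out, condition still false */
	if (remaining == 0)
		return 1;		/* condition true, but only after the timeout */
	return remaining;		/* condition true with time to spare */
}
```

A caller can therefore treat any non-zero return as "the condition holds" and zero as "timed out".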
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 3cb15ea..138c945 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -100,11 +100,12 @@ union bpf_attr;
 #define __MAP(n,...) __MAP##n(__VA_ARGS__)
 
 #define __SC_DECL(t, a)	t a
-#define __TYPE_IS_L(t)	(__same_type((t)0, 0L))
-#define __TYPE_IS_UL(t)	(__same_type((t)0, 0UL))
-#define __TYPE_IS_LL(t) (__same_type((t)0, 0LL) || __same_type((t)0, 0ULL))
+#define __TYPE_AS(t, v)	__same_type((__force t)0, v)
+#define __TYPE_IS_L(t)	(__TYPE_AS(t, 0L))
+#define __TYPE_IS_UL(t)	(__TYPE_AS(t, 0UL))
+#define __TYPE_IS_LL(t) (__TYPE_AS(t, 0LL) || __TYPE_AS(t, 0ULL))
 #define __SC_LONG(t, a) __typeof(__builtin_choose_expr(__TYPE_IS_LL(t), 0LL, 0L)) a
-#define __SC_CAST(t, a)	(t) a
+#define __SC_CAST(t, a)	(__force t) a
 #define __SC_ARGS(t, a)	a
 #define __SC_TEST(t, a) (void)BUILD_BUG_ON_ZERO(!__TYPE_IS_LL(t) && sizeof(t) > sizeof(long))
 
@@ -578,12 +579,12 @@ asmlinkage long sys_preadv(unsigned long fd, const struct iovec __user *vec,
 			   unsigned long vlen, unsigned long pos_l, unsigned long pos_h);
 asmlinkage long sys_preadv2(unsigned long fd, const struct iovec __user *vec,
 			    unsigned long vlen, unsigned long pos_l, unsigned long pos_h,
-			    int flags);
+			    rwf_t flags);
 asmlinkage long sys_pwritev(unsigned long fd, const struct iovec __user *vec,
 			    unsigned long vlen, unsigned long pos_l, unsigned long pos_h);
 asmlinkage long sys_pwritev2(unsigned long fd, const struct iovec __user *vec,
 			    unsigned long vlen, unsigned long pos_l, unsigned long pos_h,
-			    int flags);
+			    rwf_t flags);
 asmlinkage long sys_getcwd(char __user *buf, unsigned long size);
 asmlinkage long sys_mkdir(const char __user *pathname, umode_t mode);
 asmlinkage long sys_chdir(const char __user *filename);
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index acdd6f9..20ef8e6 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -156,7 +156,7 @@ copy_to_user(void __user *to, const void *from, unsigned long n)
 }
 #ifdef CONFIG_COMPAT
 static __always_inline unsigned long __must_check
-copy_in_user(void __user *to, const void *from, unsigned long n)
+copy_in_user(void __user *to, const void __user *from, unsigned long n)
 {
 	might_fault();
 	if (access_ok(VERIFY_WRITE, to, n) && access_ok(VERIFY_READ, from, n))
diff --git a/include/media/vsp1.h b/include/media/vsp1.h
index c837383..68a8abe 100644
--- a/include/media/vsp1.h
+++ b/include/media/vsp1.h
@@ -34,11 +34,12 @@ struct vsp1_du_lif_config {
 	unsigned int width;
 	unsigned int height;
 
-	void (*callback)(void *);
+	void (*callback)(void *, bool);
 	void *callback_data;
 };
 
-int vsp1_du_setup_lif(struct device *dev, const struct vsp1_du_lif_config *cfg);
+int vsp1_du_setup_lif(struct device *dev, unsigned int pipe_index,
+		      const struct vsp1_du_lif_config *cfg);
 
 struct vsp1_du_atomic_config {
 	u32 pixelformat;
@@ -50,10 +51,11 @@ struct vsp1_du_atomic_config {
 	unsigned int zpos;
 };
 
-void vsp1_du_atomic_begin(struct device *dev);
-int vsp1_du_atomic_update(struct device *dev, unsigned int rpf,
+void vsp1_du_atomic_begin(struct device *dev, unsigned int pipe_index);
+int vsp1_du_atomic_update(struct device *dev, unsigned int pipe_index,
+			  unsigned int rpf,
 			  const struct vsp1_du_atomic_config *cfg);
-void vsp1_du_atomic_flush(struct device *dev);
+void vsp1_du_atomic_flush(struct device *dev, unsigned int pipe_index);
 int vsp1_du_map_sg(struct device *dev, struct sg_table *sgt);
 void vsp1_du_unmap_sg(struct device *dev, struct sg_table *sgt);
 
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 1a88008..af509f8 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -70,6 +70,7 @@ struct fib6_node {
 	__u16			fn_flags;
 	int			fn_sernum;
 	struct rt6_info		*rr_ptr;
+	struct rcu_head		rcu;
 };
 
 #ifndef CONFIG_IPV6_SUBTREES
@@ -104,7 +105,7 @@ struct rt6_info {
 	 * the same cache line.
 	 */
 	struct fib6_table		*rt6i_table;
-	struct fib6_node		*rt6i_node;
+	struct fib6_node __rcu		*rt6i_node;
 
 	struct in6_addr			rt6i_gateway;
 
@@ -167,13 +168,40 @@ static inline void rt6_update_expires(struct rt6_info *rt0, int timeout)
 	rt0->rt6i_flags |= RTF_EXPIRES;
 }
 
+/* Safely read fn->fn_sernum for the passed-in rt and store the
+ * result in the passed-in cookie.
+ * Return true if the cookie could be read safely,
+ * false otherwise.
+ */
+static inline bool rt6_get_cookie_safe(const struct rt6_info *rt,
+				       u32 *cookie)
+{
+	struct fib6_node *fn;
+	bool status = false;
+
+	rcu_read_lock();
+	fn = rcu_dereference(rt->rt6i_node);
+
+	if (fn) {
+		*cookie = fn->fn_sernum;
+		status = true;
+	}
+
+	rcu_read_unlock();
+	return status;
+}
+
 static inline u32 rt6_get_cookie(const struct rt6_info *rt)
 {
+	u32 cookie = 0;
+
 	if (rt->rt6i_flags & RTF_PCPU ||
 	    (unlikely(!list_empty(&rt->rt6i_uncached)) && rt->dst.from))
 		rt = (struct rt6_info *)(rt->dst.from);
 
-	return rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0;
+	rt6_get_cookie_safe(rt, &cookie);
+
+	return cookie;
 }
 
 static inline void ip6_rt_put(struct rt6_info *rt)
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 67f815e..c1109cd 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -101,6 +101,13 @@ struct Qdisc {
 	spinlock_t		busylock ____cacheline_aligned_in_smp;
 };
 
+static inline void qdisc_refcount_inc(struct Qdisc *qdisc)
+{
+	if (qdisc->flags & TCQ_F_BUILTIN)
+		return;
+	refcount_inc(&qdisc->refcnt);
+}
+
 static inline bool qdisc_is_running(const struct Qdisc *qdisc)
 {
 	return (raw_read_seqcount(&qdisc->running) & 1) ? true : false;
diff --git a/include/net/tcp.h b/include/net/tcp.h
index ada65e7..f642a39 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1004,9 +1004,7 @@ void tcp_get_default_congestion_control(char *name);
 void tcp_get_available_congestion_control(char *buf, size_t len);
 void tcp_get_allowed_congestion_control(char *buf, size_t len);
 int tcp_set_allowed_congestion_control(char *allowed);
-int tcp_set_congestion_control(struct sock *sk, const char *name, bool load);
-void tcp_reinit_congestion_control(struct sock *sk,
-				   const struct tcp_congestion_ops *ca);
+int tcp_set_congestion_control(struct sock *sk, const char *name, bool load, bool reinit);
 u32 tcp_slow_start(struct tcp_sock *tp, u32 acked);
 void tcp_cong_avoid_ai(struct tcp_sock *tp, u32 w, u32 acked);
 
diff --git a/include/net/udp.h b/include/net/udp.h
index 586de4b..626c2d8 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -260,7 +260,7 @@ static inline struct sk_buff *skb_recv_udp(struct sock *sk, unsigned int flags,
 }
 
 void udp_v4_early_demux(struct sk_buff *skb);
-void udp_sk_rx_dst_set(struct sock *sk, struct dst_entry *dst);
+bool udp_sk_rx_dst_set(struct sock *sk, struct dst_entry *dst);
 int udp_get_port(struct sock *sk, unsigned short snum,
 		 int (*saddr_cmp)(const struct sock *,
 				  const struct sock *));
diff --git a/include/rdma/ib_addr.h b/include/rdma/ib_addr.h
index b73a14e..ec5008c 100644
--- a/include/rdma/ib_addr.h
+++ b/include/rdma/ib_addr.h
@@ -172,7 +172,8 @@ static inline int rdma_ip2gid(struct sockaddr *addr, union ib_gid *gid)
 				       (struct in6_addr *)gid);
 		break;
 	case AF_INET6:
-		memcpy(gid->raw, &((struct sockaddr_in6 *)addr)->sin6_addr, 16);
+		*(struct in6_addr *)&gid->raw =
+			((struct sockaddr_in6 *)addr)->sin6_addr;
 		break;
 	default:
 		return -EINVAL;
@@ -304,7 +305,13 @@ static inline void rdma_get_ll_mac(struct in6_addr *addr, u8 *mac)
 
 static inline int rdma_is_multicast_addr(struct in6_addr *addr)
 {
-	return addr->s6_addr[0] == 0xff;
+	u32 ipv4_addr;
+
+	if (addr->s6_addr[0] == 0xff)
+		return 1;
+
+	memcpy(&ipv4_addr, addr->s6_addr + 12, 4);
+	return (ipv6_addr_v4mapped(addr) && ipv4_is_multicast(ipv4_addr));
 }
 
 static inline void rdma_get_mcast_mac(struct in6_addr *addr, u8 *mac)
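Editorial note, not part of the patch: the updated rdma_is_multicast_addr() above now also accepts IPv4-mapped IPv6 addresses whose embedded IPv4 part is multicast. The userspace sketch below shows the same test using the POSIX macros IN6_IS_ADDR_V4MAPPED() and IN_MULTICAST() in place of the kernel helpers ipv6_addr_v4mapped() and ipv4_is_multicast(). `is_multicast_addr()` and `is_multicast_str()` are hypothetical names chosen for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace analogue of the updated check: an address is multicast if it is
 * a native IPv6 multicast address (ff00::/8) or an IPv4-mapped address whose
 * embedded IPv4 part is multicast (224.0.0.0/4).
 */
static bool is_multicast_addr(const struct in6_addr *addr)
{
	uint32_t ipv4_addr;

	if (addr->s6_addr[0] == 0xff)
		return true;

	/* The last four bytes carry the IPv4 address in network byte order. */
	memcpy(&ipv4_addr, addr->s6_addr + 12, 4);
	return IN6_IS_ADDR_V4MAPPED(addr) && IN_MULTICAST(ntohl(ipv4_addr));
}

/* Hypothetical wrapper: returns 1/0 for a parsed address, -1 on parse error. */
static int is_multicast_str(const char *text)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -1;
	return is_multicast_addr(&addr);
}
```

Note that IN_MULTICAST() expects a host-order value, hence the ntohl(); the kernel's ipv4_is_multicast() takes the address in network byte order directly.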
diff --git a/include/rdma/ib_hdrs.h b/include/rdma/ib_hdrs.h
index 5519f31..c124d51 100644
--- a/include/rdma/ib_hdrs.h
+++ b/include/rdma/ib_hdrs.h
@@ -193,8 +193,12 @@ static inline void put_ib_ateth_compare(u64 val, struct ib_atomic_eth *ateth)
 #define IB_LNH_MASK		3
 #define IB_SC_MASK		0xf
 #define IB_SC_SHIFT		12
+#define IB_SC5_MASK		0x10
 #define IB_SL_MASK		0xf
 #define IB_SL_SHIFT		4
+#define IB_SL_SHIFT		4
+#define IB_LVER_MASK	0xf
+#define IB_LVER_SHIFT	8
 
 static inline u8 ib_get_lnh(struct ib_header *hdr)
 {
@@ -206,6 +210,11 @@ static inline u8 ib_get_sc(struct ib_header *hdr)
 	return ((be16_to_cpu(hdr->lrh[0]) >> IB_SC_SHIFT) & IB_SC_MASK);
 }
 
+static inline bool ib_is_sc5(u16 sc5)
+{
+	return !!(sc5 & IB_SC5_MASK);
+}
+
 static inline u8 ib_get_sl(struct ib_header *hdr)
 {
 	return ((be16_to_cpu(hdr->lrh[0]) >> IB_SL_SHIFT) & IB_SL_MASK);
@@ -221,6 +230,27 @@ static inline u16 ib_get_slid(struct ib_header *hdr)
 	return (be16_to_cpu(hdr->lrh[3]));
 }
 
+static inline u8 ib_get_lver(struct ib_header *hdr)
+{
+	return (u8)((be16_to_cpu(hdr->lrh[0]) >> IB_LVER_SHIFT) &
+		   IB_LVER_MASK);
+}
+
+static inline u16 ib_get_len(struct ib_header *hdr)
+{
+	return (u16)(be16_to_cpu(hdr->lrh[2]));
+}
+
+static inline u32 ib_get_qkey(struct ib_other_headers *ohdr)
+{
+	return be32_to_cpu(ohdr->u.ud.deth[0]);
+}
+
+static inline u32 ib_get_sqpn(struct ib_other_headers *ohdr)
+{
+	return ((be32_to_cpu(ohdr->u.ud.deth[1])) & IB_QPN_MASK);
+}
+
 /*
  * BTH
  */
@@ -229,6 +259,14 @@ static inline u16 ib_get_slid(struct ib_header *hdr)
 #define IB_BTH_PAD_MASK	3
 #define IB_BTH_PKEY_MASK	0xffff
 #define IB_BTH_PAD_SHIFT	20
+#define IB_BTH_A_MASK		1
+#define IB_BTH_A_SHIFT		31
+#define IB_BTH_M_MASK		1
+#define IB_BTH_M_SHIFT		22
+#define IB_BTH_SE_MASK		1
+#define IB_BTH_SE_SHIFT	23
+#define IB_BTH_TVER_MASK	0xf
+#define IB_BTH_TVER_SHIFT	16
 
 static inline u8 ib_bth_get_pad(struct ib_other_headers *ohdr)
 {
@@ -247,4 +285,50 @@ static inline u8 ib_bth_get_opcode(struct ib_other_headers *ohdr)
 		   IB_BTH_OPCODE_MASK);
 }
 
+static inline u8 ib_bth_get_ackreq(struct ib_other_headers *ohdr)
+{
+	return (u8)((be32_to_cpu(ohdr->bth[2]) >> IB_BTH_A_SHIFT) &
+		   IB_BTH_A_MASK);
+}
+
+static inline u8 ib_bth_get_migreq(struct ib_other_headers *ohdr)
+{
+	return (u8)((be32_to_cpu(ohdr->bth[0]) >> IB_BTH_M_SHIFT) &
+		    IB_BTH_M_MASK);
+}
+
+static inline u8 ib_bth_get_se(struct ib_other_headers *ohdr)
+{
+	return (u8)((be32_to_cpu(ohdr->bth[0]) >> IB_BTH_SE_SHIFT) &
+		    IB_BTH_SE_MASK);
+}
+
+static inline u32 ib_bth_get_psn(struct ib_other_headers *ohdr)
+{
+	return (u32)(be32_to_cpu(ohdr->bth[2]));
+}
+
+static inline u32 ib_bth_get_qpn(struct ib_other_headers *ohdr)
+{
+	return (u32)((be32_to_cpu(ohdr->bth[1])) & IB_QPN_MASK);
+}
+
+static inline u8 ib_bth_get_becn(struct ib_other_headers *ohdr)
+{
+	return (u8)((be32_to_cpu(ohdr->bth[1]) >> IB_BECN_SHIFT) &
+		     IB_BECN_MASK);
+}
+
+static inline u8 ib_bth_get_fecn(struct ib_other_headers *ohdr)
+{
+	return (u8)((be32_to_cpu(ohdr->bth[1]) >> IB_FECN_SHIFT) &
+		    IB_FECN_MASK);
+}
+
+static inline u8 ib_bth_get_tver(struct ib_other_headers *ohdr)
+{
+	return (u8)((be32_to_cpu(ohdr->bth[0]) >> IB_BTH_TVER_SHIFT) &
+		    IB_BTH_TVER_MASK);
+}
+
 #endif                          /* IB_HDRS_H */
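Editorial note, not part of the patch: the new ib_bth_get_*() accessors above all follow one pattern, namely convert a big-endian header word to CPU order, shift, then mask. The userspace sketch below reproduces that pattern for the first BTH word, with ntohl() standing in for be32_to_cpu(). The shift and mask values are copied from the hunk; `bth0_field()` is a hypothetical helper, not a kernel function.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Shift/mask pairs copied from the hunk above. */
#define IB_BTH_M_MASK		1
#define IB_BTH_M_SHIFT		22
#define IB_BTH_SE_MASK		1
#define IB_BTH_SE_SHIFT		23
#define IB_BTH_TVER_MASK	0xf
#define IB_BTH_TVER_SHIFT	16

/* Hypothetical helper: pull one field out of a big-endian 32-bit word. */
static uint8_t bth0_field(uint32_t be_word, unsigned int shift, uint32_t mask)
{
	return (uint8_t)((ntohl(be_word) >> shift) & mask);
}
```

For a word built as htonl((1u << 23) | (0x5u << 16)), the solicited-event bit reads back as 1, the migration bit as 0 and the transport version as 5.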
diff --git a/include/rdma/ib_marshall.h b/include/rdma/ib_marshall.h
index 68cef3b..8ebf84a 100644
--- a/include/rdma/ib_marshall.h
+++ b/include/rdma/ib_marshall.h
@@ -38,10 +38,12 @@
 #include <rdma/ib_user_verbs.h>
 #include <rdma/ib_user_sa.h>
 
-void ib_copy_qp_attr_to_user(struct ib_uverbs_qp_attr *dst,
+void ib_copy_qp_attr_to_user(struct ib_device *device,
+			     struct ib_uverbs_qp_attr *dst,
 			     struct ib_qp_attr *src);
 
-void ib_copy_ah_attr_to_user(struct ib_uverbs_ah_attr *dst,
+void ib_copy_ah_attr_to_user(struct ib_device *device,
+			     struct ib_uverbs_ah_attr *dst,
 			     struct rdma_ah_attr *src);
 
 void ib_copy_path_rec_to_user(struct ib_user_path_rec *dst,
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 88c32ab..e6df680 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -64,6 +64,8 @@
 #include <linux/cgroup_rdma.h>
 #include <uapi/rdma/ib_user_verbs.h>
 
+#define IB_FW_VERSION_NAME_MAX	ETHTOOL_FWVERS_LEN
+
 extern struct workqueue_struct *ib_wq;
 extern struct workqueue_struct *ib_comp_wq;
 
@@ -168,7 +170,7 @@ enum ib_device_cap_flags {
 	IB_DEVICE_UD_AV_PORT_ENFORCE		= (1 << 6),
 	IB_DEVICE_CURR_QP_STATE_MOD		= (1 << 7),
 	IB_DEVICE_SHUTDOWN_PORT			= (1 << 8),
-	IB_DEVICE_INIT_TYPE			= (1 << 9),
+	/* Not in use, former INIT_TYPE		= (1 << 9),*/
 	IB_DEVICE_PORT_ACTIVE_EVENT		= (1 << 10),
 	IB_DEVICE_SYS_IMAGE_GUID		= (1 << 11),
 	IB_DEVICE_RC_RNR_NAK_GEN		= (1 << 12),
@@ -183,7 +185,7 @@ enum ib_device_cap_flags {
 	 * which will always contain a usable lkey.
 	 */
 	IB_DEVICE_LOCAL_DMA_LKEY		= (1 << 15),
-	IB_DEVICE_RESERVED /* old SEND_W_INV */	= (1 << 16),
+	/* Reserved, old SEND_W_INV		= (1 << 16),*/
 	IB_DEVICE_MEM_WINDOW			= (1 << 17),
 	/*
 	 * Devices should set IB_DEVICE_UD_IP_SUM if they support
@@ -218,7 +220,7 @@ enum ib_device_cap_flags {
 	 * of I/O operations with single completion queue managed
 	 * by hardware.
 	 */
-	IB_DEVICE_CROSS_CHANNEL		= (1 << 27),
+	IB_DEVICE_CROSS_CHANNEL			= (1 << 27),
 	IB_DEVICE_MANAGED_FLOW_STEERING		= (1 << 29),
 	IB_DEVICE_SIGNATURE_HANDOVER		= (1 << 30),
 	IB_DEVICE_ON_DEMAND_PAGING		= (1ULL << 31),
@@ -278,6 +280,24 @@ struct ib_rss_caps {
 	u32 max_rwq_indirection_table_size;
 };
 
+enum ib_tm_cap_flags {
+	/* Support tag matching on RC transport */
+	IB_TM_CAP_RC		    = 1 << 0,
+};
+
+struct ib_xrq_caps {
+	/* Max size of RNDV header */
+	u32 max_rndv_hdr_size;
+	/* Max number of entries in tag matching list */
+	u32 max_num_tags;
+	/* From enum ib_tm_cap_flags */
+	u32 flags;
+	/* Max number of outstanding list operations */
+	u32 max_ops;
+	/* Max number of SGE in tag matching entry */
+	u32 max_sge;
+};
+
 enum ib_cq_creation_flags {
 	IB_CQ_FLAGS_TIMESTAMP_COMPLETION   = 1 << 0,
 	IB_CQ_FLAGS_IGNORE_OVERRUN	   = 1 << 1,
@@ -338,6 +358,7 @@ struct ib_device_attr {
 	struct ib_rss_caps	rss_caps;
 	u32			max_wq_type_rq;
 	u32			raw_packet_caps; /* Use ib_raw_packet_caps enum */
+	struct ib_xrq_caps	xrq_caps;
 };
 
 enum ib_mtu {
@@ -549,8 +570,8 @@ struct ib_port_attr {
 	u32			bad_pkey_cntr;
 	u32			qkey_viol_cntr;
 	u16			pkey_tbl_len;
-	u16			lid;
-	u16			sm_lid;
+	u32			sm_lid;
+	u32			lid;
 	u8			lmc;
 	u8			max_vl_num;
 	u8			sm_sl;
@@ -577,7 +598,8 @@ struct ib_device_modify {
 enum ib_port_modify_flags {
 	IB_PORT_SHUTDOWN		= 1,
 	IB_PORT_INIT_TYPE		= (1<<2),
-	IB_PORT_RESET_QKEY_CNTR		= (1<<3)
+	IB_PORT_RESET_QKEY_CNTR		= (1<<3),
+	IB_PORT_OPA_MASK_CHG		= (1<<4)
 };
 
 struct ib_port_modify {
@@ -664,6 +686,8 @@ union rdma_network_hdr {
 	};
 };
 
+#define IB_QPN_MASK		0xFFFFFF
+
 enum {
 	IB_MULTICAST_QPN = 0xffffff
 };
@@ -859,6 +883,7 @@ struct roce_ah_attr {
 struct opa_ah_attr {
 	u32			dlid;
 	u8			src_path_bits;
+	bool			make_grd;
 };
 
 struct rdma_ah_attr {
@@ -948,7 +973,7 @@ struct ib_wc {
 	u32			src_qp;
 	int			wc_flags;
 	u16			pkey_index;
-	u16			slid;
+	u32			slid;
 	u8			sl;
 	u8			dlid_path_bits;
 	u8			port_num;	/* valid only for DR SMPs on switches */
@@ -966,9 +991,16 @@ enum ib_cq_notify_flags {
 
 enum ib_srq_type {
 	IB_SRQT_BASIC,
-	IB_SRQT_XRC
+	IB_SRQT_XRC,
+	IB_SRQT_TM,
 };
 
+static inline bool ib_srq_has_cq(enum ib_srq_type srq_type)
+{
+	return srq_type == IB_SRQT_XRC ||
+	       srq_type == IB_SRQT_TM;
+}
+
 enum ib_srq_attr_mask {
 	IB_SRQ_MAX_WR	= 1 << 0,
 	IB_SRQ_LIMIT	= 1 << 1,
@@ -986,11 +1018,17 @@ struct ib_srq_init_attr {
 	struct ib_srq_attr	attr;
 	enum ib_srq_type	srq_type;
 
-	union {
-		struct {
-			struct ib_xrcd *xrcd;
-			struct ib_cq   *cq;
-		} xrc;
+	struct {
+		struct ib_cq   *cq;
+		union {
+			struct {
+				struct ib_xrcd *xrcd;
+			} xrc;
+
+			struct {
+				u32		max_num_tags;
+			} tag_matching;
+		};
 	} ext;
 };
 
@@ -1059,6 +1097,7 @@ enum ib_qp_create_flags {
 	/* FREE					= 1 << 7, */
 	IB_QP_CREATE_SCATTER_FCS		= 1 << 8,
 	IB_QP_CREATE_CVLAN_STRIPPING		= 1 << 9,
+	IB_QP_CREATE_SOURCE_QPN			= 1 << 10,
 	/* reserve bits 26-31 for low level drivers' internal use */
 	IB_QP_CREATE_RESERVED_START		= 1 << 26,
 	IB_QP_CREATE_RESERVED_END		= 1 << 31,
@@ -1086,6 +1125,7 @@ struct ib_qp_init_attr {
 	 */
 	u8			port_num;
 	struct ib_rwq_ind_table *rwq_ind_tbl;
+	u32			source_qpn;
 };
 
 struct ib_qp_open_attr {
@@ -1527,12 +1567,14 @@ struct ib_srq {
 	enum ib_srq_type	srq_type;
 	atomic_t		usecnt;
 
-	union {
-		struct {
-			struct ib_xrcd *xrcd;
-			struct ib_cq   *cq;
-			u32		srq_num;
-		} xrc;
+	struct {
+		struct ib_cq   *cq;
+		union {
+			struct {
+				struct ib_xrcd *xrcd;
+				u32		srq_num;
+			} xrc;
+		};
 	} ext;
 };
 
@@ -1546,6 +1588,10 @@ enum ib_raw_packet_caps {
 	IB_RAW_PACKET_CAP_SCATTER_FCS		= (1 << 1),
 	/* Checksum offloads are supported (for both send and receive). */
 	IB_RAW_PACKET_CAP_IP_CSUM		= (1 << 2),
+	/* When a packet is received for an RQ with no receive WQEs, the
+	 * packet processing is delayed.
+	 */
+	IB_RAW_PACKET_CAP_DELAY_DROP		= (1 << 3),
 };
 
 enum ib_wq_type {
@@ -1574,6 +1620,7 @@ struct ib_wq {
 enum ib_wq_flags {
 	IB_WQ_FLAGS_CVLAN_STRIPPING	= 1 << 0,
 	IB_WQ_FLAGS_SCATTER_FCS		= 1 << 1,
+	IB_WQ_FLAGS_DELAY_DROP		= 1 << 2,
 };
 
 struct ib_wq_init_attr {
@@ -2289,6 +2336,8 @@ struct ib_device {
 	struct rdmacg_device         cg_device;
 #endif
 
+	u32                          index;
+
 	/**
 	 * The following mandatory functions are used only at device
 	 * registration.  Keep functions such as these at the end of this
@@ -2296,7 +2345,11 @@ struct ib_device {
 	 * in fast paths.
 	 */
 	int (*get_port_immutable)(struct ib_device *, u8, struct ib_port_immutable *);
-	void (*get_dev_fw_str)(struct ib_device *, char *str, size_t str_len);
+	void (*get_dev_fw_str)(struct ib_device *, char *str);
+	const struct cpumask *(*get_vector_affinity)(struct ib_device *ibdev,
+						     int comp_vector);
+
+	struct uverbs_root_spec		*specs_root;
 };
 
 struct ib_client {
@@ -2332,7 +2385,7 @@ struct ib_client {
 struct ib_device *ib_alloc_device(size_t size);
 void ib_dealloc_device(struct ib_device *device);
 
-void ib_get_device_fw_str(struct ib_device *device, char *str, size_t str_len);
+void ib_get_device_fw_str(struct ib_device *device, char *str);
 
 int ib_register_device(struct ib_device *device,
 		       int (*port_callback)(struct ib_device *,
@@ -2396,8 +2449,8 @@ int ib_modify_qp_is_ok(enum ib_qp_state cur_state, enum ib_qp_state next_state,
 		       enum ib_qp_type type, enum ib_qp_attr_mask mask,
 		       enum rdma_link_layer ll);
 
-int ib_register_event_handler  (struct ib_event_handler *event_handler);
-int ib_unregister_event_handler(struct ib_event_handler *event_handler);
+void ib_register_event_handler(struct ib_event_handler *event_handler);
+void ib_unregister_event_handler(struct ib_event_handler *event_handler);
 void ib_dispatch_event(struct ib_event *event);
 
 int ib_query_port(struct ib_device *device,
@@ -3556,6 +3609,7 @@ void ib_drain_qp(struct ib_qp *qp);
 
 int ib_resolve_eth_dmac(struct ib_device *device,
 			struct rdma_ah_attr *ah_attr);
+int ib_get_eth_speed(struct ib_device *dev, u8 port_num, u8 *speed, u8 *width);
 
 static inline u8 *rdma_ah_retrieve_dmac(struct rdma_ah_attr *attr)
 {
@@ -3609,6 +3663,20 @@ static inline u8 rdma_ah_get_path_bits(const struct rdma_ah_attr *attr)
 	return 0;
 }
 
+static inline void rdma_ah_set_make_grd(struct rdma_ah_attr *attr,
+					bool make_grd)
+{
+	if (attr->type == RDMA_AH_ATTR_TYPE_OPA)
+		attr->opa.make_grd = make_grd;
+}
+
+static inline bool rdma_ah_get_make_grd(const struct rdma_ah_attr *attr)
+{
+	if (attr->type == RDMA_AH_ATTR_TYPE_OPA)
+		return attr->opa.make_grd;
+	return false;
+}
+
 static inline void rdma_ah_set_port_num(struct rdma_ah_attr *attr, u8 port_num)
 {
 	attr->port_num = port_num;
@@ -3707,4 +3775,52 @@ static inline enum rdma_ah_attr_type rdma_ah_find_type(struct ib_device *dev,
 	else
 		return RDMA_AH_ATTR_TYPE_IB;
 }
+
+/**
+ * ib_lid_cpu16 - Return lid in 16bit CPU encoding.
+ *     In the current implementation the only way to
+ *     get the 32bit lid is from other sources for OPA.
+ *     For IB, lids will always be 16bits so cast the
+ *     value accordingly.
+ *
+ * @lid: A 32bit LID
+ */
+static inline u16 ib_lid_cpu16(u32 lid)
+{
+	WARN_ON_ONCE(lid & 0xFFFF0000);
+	return (u16)lid;
+}
+
+/**
+ * ib_lid_be16 - Return lid in 16bit BE encoding.
+ *
+ * @lid: A 32bit LID
+ */
+static inline __be16 ib_lid_be16(u32 lid)
+{
+	WARN_ON_ONCE(lid & 0xFFFF0000);
+	return cpu_to_be16((u16)lid);
+}
+
+/**
+ * ib_get_vector_affinity - Get the affinity mappings of a given completion
+ *   vector
+ * @device:         the rdma device
+ * @comp_vector:    index of completion vector
+ *
+ * Returns NULL on failure, otherwise a corresponding cpu map of the
+ * completion vector (returns all-cpus map if the device driver doesn't
+ * implement get_vector_affinity).
+ */
+static inline const struct cpumask *
+ib_get_vector_affinity(struct ib_device *device, int comp_vector)
+{
+	if (comp_vector < 0 || comp_vector >= device->num_comp_vectors ||
+	    !device->get_vector_affinity)
+		return NULL;
+
+	return device->get_vector_affinity(device, comp_vector);
+
+}
+
 #endif /* IB_VERBS_H */
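
The ib_lid_cpu16() helper added above narrows a 32-bit LID to 16 bits and warns when the upper half is non-zero. A minimal userspace sketch of that check follows. The demo_* names are hypothetical stand-ins and not part of the kernel API, and the kernel's WARN_ON_ONCE() is replaced by a plain predicate the caller can test.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the WARN_ON_ONCE(lid & 0xFFFF0000) test:
 * true when the LID fits in the 16-bit IB LID space.
 */
static inline bool demo_lid_fits_16(uint32_t lid)
{
	return (lid & 0xFFFF0000u) == 0;
}

/* Hypothetical stand-in for ib_lid_cpu16(): keep only the low 16 bits. */
static inline uint16_t demo_lid_cpu16(uint32_t lid)
{
	return (uint16_t)lid;
}
```

A caller that may see extended (OPA) LIDs would test demo_lid_fits_16() first and narrow only when it returns true, since the cast silently discards the upper bits.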
diff --git a/include/rdma/opa_addr.h b/include/rdma/opa_addr.h
index eace28f1..e6e90f1 100644
--- a/include/rdma/opa_addr.h
+++ b/include/rdma/opa_addr.h
@@ -48,8 +48,21 @@
 #ifndef OPA_ADDR_H
 #define OPA_ADDR_H
 
+#include <rdma/opa_smi.h>
+
 #define	OPA_SPECIAL_OUI		(0x00066AULL)
 #define OPA_MAKE_ID(x)          (cpu_to_be64(OPA_SPECIAL_OUI << 40 | (x)))
+#define OPA_TO_IB_UCAST_LID(x) (((x) >= be16_to_cpu(IB_MULTICAST_LID_BASE)) \
+				? 0 : x)
+#define OPA_GID_INDEX		0x1
+/**
+ * 0xF8 - 4 bits of multicast range and 1 bit for collective range
+ * Example: For 24 bit LID space,
+ * Multicast range: 0xF00000 to 0xF7FFFF
+ * Collective range: 0xF80000 to 0xFFFFFE
+ */
+#define OPA_MCAST_NR 0x4 /* Number of top bits set */
+#define OPA_COLLECTIVE_NR 0x1 /* Number of bits after MCAST_NR */
 
 /**
  * ib_is_opa_gid: Returns true if the top 24 bits of the gid
@@ -59,7 +72,7 @@
  *
  * @gid: The Global identifier
  */
-static inline bool ib_is_opa_gid(union ib_gid *gid)
+static inline bool ib_is_opa_gid(const union ib_gid *gid)
 {
 	return ((be64_to_cpu(gid->global.interface_id) >> 40) ==
 		OPA_SPECIAL_OUI);
@@ -72,8 +85,33 @@ static inline bool ib_is_opa_gid(union ib_gid *gid)
  *
  * @gid: The Global identifier
  */
-static inline u32 opa_get_lid_from_gid(union ib_gid *gid)
+static inline u32 opa_get_lid_from_gid(const union ib_gid *gid)
 {
 	return be64_to_cpu(gid->global.interface_id) & 0xFFFFFFFF;
 }
+
+/**
+ * opa_is_extended_lid: Returns true if dlid or slid are
+ * extended.
+ *
+ * @dlid: The DLID
+ * @slid: The SLID
+ */
+static inline bool opa_is_extended_lid(u32 dlid, u32 slid)
+{
+	if ((be32_to_cpu(dlid) >=
+	     be16_to_cpu(IB_MULTICAST_LID_BASE)) ||
+	    (be32_to_cpu(slid) >=
+	     be16_to_cpu(IB_MULTICAST_LID_BASE)))
+		return true;
+	else
+		return false;
+}
+
+/* Get multicast lid base */
+static inline u32 opa_get_mcast_base(u32 nr_top_bits)
+{
+	return (be32_to_cpu(OPA_LID_PERMISSIVE) << (32 - nr_top_bits));
+}
+
 #endif /* OPA_ADDR_H */
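
The opa_get_mcast_base() helper above derives a range base by shifting the permissive LID left so that only the requested number of top bits stay set. The sketch below reproduces that arithmetic in userspace, assuming the permissive LID is all-ones (0xFFFFFFFF) once converted to CPU byte order. The DEMO_* and demo_* names are hypothetical. Note that it works on the full 32-bit LID space, whereas the comment in the header illustrates a 24-bit space.

```c
#include <stdint.h>

/* Assumption: CPU-order value of the permissive LID is all-ones. */
#define DEMO_LID_PERMISSIVE	0xFFFFFFFFu
#define DEMO_MCAST_NR		4u	/* mirrors OPA_MCAST_NR */
#define DEMO_COLLECTIVE_NR	1u	/* mirrors OPA_COLLECTIVE_NR */

/*
 * Keep the top nr_top_bits bits set and clear the rest.
 * Valid for nr_top_bits in 1..32; a shift by 32 would be undefined.
 */
static inline uint32_t demo_mcast_base(uint32_t nr_top_bits)
{
	return DEMO_LID_PERMISSIVE << (32 - nr_top_bits);
}
```

With these definitions, demo_mcast_base(DEMO_MCAST_NR) gives the multicast base 0xF0000000 and demo_mcast_base(DEMO_MCAST_NR + DEMO_COLLECTIVE_NR) gives the collective base 0xF8000000.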
diff --git a/include/rdma/opa_vnic.h b/include/rdma/opa_vnic.h
index 39d6890..0c07a70 100644
--- a/include/rdma/opa_vnic.h
+++ b/include/rdma/opa_vnic.h
@@ -54,9 +54,6 @@
 
 #include <rdma/ib_verbs.h>
 
-/* VNIC uses 16B header format */
-#define OPA_VNIC_L2_TYPE    0x2
-
 /* 16 header bytes + 2 reserved bytes */
 #define OPA_VNIC_L2_HDR_LEN   (16 + 2)
 
diff --git a/include/rdma/rdma_netlink.h b/include/rdma/rdma_netlink.h
index 348c102..2d87859 100644
--- a/include/rdma/rdma_netlink.h
+++ b/include/rdma/rdma_netlink.h
@@ -5,29 +5,43 @@
 #include <linux/netlink.h>
 #include <uapi/rdma/rdma_netlink.h>
 
-struct ibnl_client_cbs {
+struct rdma_nl_cbs {
+	int (*doit)(struct sk_buff *skb, struct nlmsghdr *nlh,
+		    struct netlink_ext_ack *extack);
 	int (*dump)(struct sk_buff *skb, struct netlink_callback *nlcb);
-	struct module *module;
+	u8 flags;
 };
 
-/**
- * Add a a client to the list of IB netlink exporters.
- * @index: Index of the added client
- * @nops: Number of supported ops by the added client.
- * @cb_table: A table for op->callback
- *
- * Returns 0 on success or a negative error code.
+enum rdma_nl_flags {
+	/* Require CAP_NET_ADMIN */
+	RDMA_NL_ADMIN_PERM	= 1 << 0,
+};
+
+/* Define this module as providing netlink services for NETLINK_RDMA, with
+ * index _index.  Since the client indexes were set up in a uapi header as an
+ * enum and we do not want to change that, the user must supply the expanded
+ * constant as well and the compiler checks they are the same.
  */
-int ibnl_add_client(int index, int nops,
-		    const struct ibnl_client_cbs cb_table[]);
+#define MODULE_ALIAS_RDMA_NETLINK(_index, _val)                                \
+	static inline void __chk_##_index(void)                                \
+	{                                                                      \
+		BUILD_BUG_ON(_index != _val);                                  \
+	}                                                                      \
+	MODULE_ALIAS("rdma-netlink-subsys-" __stringify(_val))
+
+/**
+ * Register client in RDMA netlink.
+ * @index: Index of the added client
+ * @cb_table: A table for op->callback
+ */
+void rdma_nl_register(unsigned int index,
+		      const struct rdma_nl_cbs cb_table[]);
 
 /**
  * Remove a client from IB netlink.
  * @index: Index of the removed IB client.
- *
- * Returns 0 on success or a negative error code.
  */
-int ibnl_remove_client(int index);
+void rdma_nl_unregister(unsigned int index);
 
 /**
  * Put a new message in a supplied skb.
@@ -56,22 +70,32 @@ int ibnl_put_attr(struct sk_buff *skb, struct nlmsghdr *nlh,
 /**
  * Send the supplied skb to a specific userspace PID.
  * @skb: The netlink skb
- * @nlh: Header of the netlink message to send
  * @pid: Userspace netlink process ID
  * Returns 0 on success or a negative error code.
  */
-int ibnl_unicast(struct sk_buff *skb, struct nlmsghdr *nlh,
-			__u32 pid);
+int rdma_nl_unicast(struct sk_buff *skb, u32 pid);
+
+/**
+ * Send, with wait/1 retry, the supplied skb to a specific userspace PID.
+ * @skb: The netlink skb
+ * @pid: Userspace netlink process ID
+ * Returns 0 on success or a negative error code.
+ */
+int rdma_nl_unicast_wait(struct sk_buff *skb, __u32 pid);
 
 /**
  * Send the supplied skb to a netlink group.
  * @skb: The netlink skb
- * @nlh: Header of the netlink message to send
  * @group: Netlink group ID
  * @flags: allocation flags
  * Returns 0 on success or a negative error code.
  */
-int ibnl_multicast(struct sk_buff *skb, struct nlmsghdr *nlh,
-			unsigned int group, gfp_t flags);
+int rdma_nl_multicast(struct sk_buff *skb, unsigned int group, gfp_t flags);
 
+/**
+ * Check if there are any listeners to the netlink group
+ * @group: the netlink group ID
+ * Returns 0 on success or a negative value if there are no listeners.
+ */
+int rdma_nl_chk_listeners(unsigned int group);
 #endif /* _RDMA_NETLINK_H */
diff --git a/include/rdma/rdma_vt.h b/include/rdma/rdma_vt.h
index 55af692..1ba84a7 100644
--- a/include/rdma/rdma_vt.h
+++ b/include/rdma/rdma_vt.h
@@ -57,11 +57,21 @@
 #include <linux/list.h>
 #include <linux/hash.h>
 #include <rdma/ib_verbs.h>
+#include <rdma/ib_mad.h>
 #include <rdma/rdmavt_mr.h>
 #include <rdma/rdmavt_qp.h>
 
 #define RVT_MAX_PKEY_VALUES 16
 
+#define RVT_MAX_TRAP_LEN 100 /* Limit pending trap list */
+#define RVT_MAX_TRAP_LISTS 5 /*((IB_NOTICE_TYPE_INFO & 0x0F) + 1)*/
+#define RVT_TRAP_TIMEOUT 4096 /* 4.096 usec */
+
+struct trap_list {
+	u32 list_len;
+	struct list_head list;
+};
+
 struct rvt_ibport {
 	struct rvt_qp __rcu *qp[2];
 	struct ib_mad_agent *send_agent;	/* agent for SMI (traps) */
@@ -75,12 +85,13 @@ struct rvt_ibport {
 	__be64 mkey;
 	u64 tid;
 	u32 port_cap_flags;
+	u16 port_cap3_flags;
 	u32 pma_sample_start;
 	u32 pma_sample_interval;
 	__be16 pma_counter_select[5];
 	u16 pma_tag;
 	u16 mkey_lease_period;
-	u16 sm_lid;
+	u32 sm_lid;
 	u8 sm_sl;
 	u8 mkeyprot;
 	u8 subnet_timeout;
@@ -127,6 +138,13 @@ struct rvt_ibport {
 	u16 *pkey_table;
 
 	struct rvt_ah *sm_ah;
+
+	/*
+	 * Keep a list of traps that have not been repressed.  They will be
+	 * resent based on trap_timer.
+	 */
+	struct trap_list trap_lists[RVT_MAX_TRAP_LISTS];
+	struct timer_list trap_timer;
 };
 
 #define RVT_CQN_MAX 16 /* maximum length of cq name */
@@ -514,7 +532,8 @@ int rvt_invalidate_rkey(struct rvt_qp *qp, u32 rkey);
 int rvt_rkey_ok(struct rvt_qp *qp, struct rvt_sge *sge,
 		u32 len, u64 vaddr, u32 rkey, int acc);
 int rvt_lkey_ok(struct rvt_lkey_table *rkt, struct rvt_pd *pd,
-		struct rvt_sge *isge, struct ib_sge *sge, int acc);
+		struct rvt_sge *isge, struct rvt_sge *last_sge,
+		struct ib_sge *sge, int acc);
 struct rvt_mcast *rvt_mcast_find(struct rvt_ibport *ibp, union ib_gid *mgid,
 				 u16 lid);
 
diff --git a/include/rdma/rdmavt_mr.h b/include/rdma/rdmavt_mr.h
index f418bd5..72a3856 100644
--- a/include/rdma/rdmavt_mr.h
+++ b/include/rdma/rdmavt_mr.h
@@ -191,4 +191,7 @@ static inline void rvt_skip_sge(struct rvt_sge_state *ss, u32 length,
 	}
 }
 
+bool rvt_ss_has_lkey(struct rvt_sge_state *ss, u32 lkey);
+bool rvt_mr_has_lkey(struct rvt_mregion *mr, u32 lkey);
+
 #endif          /* DEF_RDMAVT_INCMRH */
diff --git a/include/rdma/rdmavt_qp.h b/include/rdma/rdmavt_qp.h
index d664d2e..0eed3d8 100644
--- a/include/rdma/rdmavt_qp.h
+++ b/include/rdma/rdmavt_qp.h
@@ -277,7 +277,6 @@ struct rvt_qp {
 
 	unsigned long timeout_jiffies;  /* computed from timeout */
 
-	enum ib_mtu path_mtu;
 	int srate_mbps;		/* s_srate (below) converted to Mbit/s */
 	pid_t pid;		/* pid for user mode QPs */
 	u32 remote_qpn;
@@ -396,7 +395,7 @@ struct rvt_srq {
 #define RVT_QPNMAP_ENTRIES          (RVT_QPN_MAX / PAGE_SIZE / BITS_PER_BYTE)
 #define RVT_BITS_PER_PAGE           (PAGE_SIZE * BITS_PER_BYTE)
 #define RVT_BITS_PER_PAGE_MASK      (RVT_BITS_PER_PAGE - 1)
-#define RVT_QPN_MASK		    0xFFFFFF
+#define RVT_QPN_MASK		    IB_QPN_MASK
 
 /*
  * QPN-map pages start out as NULL, they get allocated upon
@@ -674,4 +673,34 @@ void rvt_del_timers_sync(struct rvt_qp *qp);
 void rvt_stop_rc_timers(struct rvt_qp *qp);
 void rvt_add_retry_timer(struct rvt_qp *qp);
 
+/**
+ * struct rvt_qp_iter - the iterator for QPs
+ * @qp: the current QP
+ *
+ * This structure defines the current iterator
+ * state for sequenced access to all QPs relative
+ * to an rvt_dev_info.
+ */
+struct rvt_qp_iter {
+	struct rvt_qp *qp;
+	/* private: backpointer */
+	struct rvt_dev_info *rdi;
+	/* private: callback routine */
+	void (*cb)(struct rvt_qp *qp, u64 v);
+	/* private: for arg to callback routine */
+	u64 v;
+	/* private: number of SMI,GSI QPs for device */
+	int specials;
+	/* private: current iterator index */
+	int n;
+};
+
+struct rvt_qp_iter *rvt_qp_iter_init(struct rvt_dev_info *rdi,
+				     u64 v,
+				     void (*cb)(struct rvt_qp *qp, u64 v));
+int rvt_qp_iter_next(struct rvt_qp_iter *iter);
+void rvt_qp_iter(struct rvt_dev_info *rdi,
+		 u64 v,
+		 void (*cb)(struct rvt_qp *qp, u64 v));
+void rvt_qp_mr_clean(struct rvt_qp *qp, u32 lkey);
 #endif          /* DEF_RDMAVT_INCQP_H */
diff --git a/include/rdma/uverbs_ioctl.h b/include/rdma/uverbs_ioctl.h
new file mode 100644
index 0000000..6da4407
--- /dev/null
+++ b/include/rdma/uverbs_ioctl.h
@@ -0,0 +1,438 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef _UVERBS_IOCTL_
+#define _UVERBS_IOCTL_
+
+#include <rdma/uverbs_types.h>
+#include <linux/uaccess.h>
+#include <rdma/rdma_user_ioctl.h>
+#include <rdma/ib_user_ioctl_verbs.h>
+
+/*
+ * =======================================
+ *	Verbs action specifications
+ * =======================================
+ */
+
+enum uverbs_attr_type {
+	UVERBS_ATTR_TYPE_NA,
+	UVERBS_ATTR_TYPE_PTR_IN,
+	UVERBS_ATTR_TYPE_PTR_OUT,
+	UVERBS_ATTR_TYPE_IDR,
+	UVERBS_ATTR_TYPE_FD,
+};
+
+enum uverbs_obj_access {
+	UVERBS_ACCESS_READ,
+	UVERBS_ACCESS_WRITE,
+	UVERBS_ACCESS_NEW,
+	UVERBS_ACCESS_DESTROY
+};
+
+enum {
+	UVERBS_ATTR_SPEC_F_MANDATORY	= 1U << 0,
+	/* Support extending attributes by length */
+	UVERBS_ATTR_SPEC_F_MIN_SZ	= 1U << 1,
+};
+
+struct uverbs_attr_spec {
+	enum uverbs_attr_type		type;
+	union {
+		u16				len;
+		struct {
+			/*
+			 * higher bits mean the namespace and lower bits mean
+			 * the type id within the namespace.
+			 */
+			u16			obj_type;
+			u8			access;
+		} obj;
+	};
+	/* Combination of bits from enum UVERBS_ATTR_SPEC_F_XXXX */
+	u8				flags;
+};
+
+struct uverbs_attr_spec_hash {
+	size_t				num_attrs;
+	unsigned long			*mandatory_attrs_bitmask;
+	struct uverbs_attr_spec		attrs[0];
+};
+
+struct uverbs_attr_bundle;
+struct ib_uverbs_file;
+
+enum {
+	/*
+	 * Action marked with this flag creates a context (or root for all
+	 * objects).
+	 */
+	UVERBS_ACTION_FLAG_CREATE_ROOT = 1U << 0,
+};
+
+struct uverbs_method_spec {
+	/* Combination of bits from enum UVERBS_ACTION_FLAG_XXXX */
+	u32						flags;
+	size_t						num_buckets;
+	size_t						num_child_attrs;
+	int (*handler)(struct ib_device *ib_dev, struct ib_uverbs_file *ufile,
+		       struct uverbs_attr_bundle *ctx);
+	struct uverbs_attr_spec_hash		*attr_buckets[0];
+};
+
+struct uverbs_method_spec_hash {
+	size_t					num_methods;
+	struct uverbs_method_spec		*methods[0];
+};
+
+struct uverbs_object_spec {
+	const struct uverbs_obj_type		*type_attrs;
+	size_t					num_buckets;
+	struct uverbs_method_spec_hash		*method_buckets[0];
+};
+
+struct uverbs_object_spec_hash {
+	size_t					num_objects;
+	struct uverbs_object_spec		*objects[0];
+};
+
+struct uverbs_root_spec {
+	size_t					num_buckets;
+	struct uverbs_object_spec_hash		*object_buckets[0];
+};
+
+/*
+ * =======================================
+ *	Verbs definitions
+ * =======================================
+ */
+
+struct uverbs_attr_def {
+	u16                           id;
+	struct uverbs_attr_spec       attr;
+};
+
+struct uverbs_method_def {
+	u16                                  id;
+	/* Combination of bits from enum UVERBS_ACTION_FLAG_XXXX */
+	u32				     flags;
+	size_t				     num_attrs;
+	const struct uverbs_attr_def * const (*attrs)[];
+	int (*handler)(struct ib_device *ib_dev, struct ib_uverbs_file *ufile,
+		       struct uverbs_attr_bundle *ctx);
+};
+
+struct uverbs_object_def {
+	u16					 id;
+	const struct uverbs_obj_type	        *type_attrs;
+	size_t				         num_methods;
+	const struct uverbs_method_def * const (*methods)[];
+};
+
+struct uverbs_object_tree_def {
+	size_t					 num_objects;
+	const struct uverbs_object_def * const (*objects)[];
+};
+
+#define UA_FLAGS(_flags)  .flags = _flags
+#define __UVERBS_ATTR0(_id, _len, _type, ...)                           \
+	((const struct uverbs_attr_def)				  \
+	 {.id = _id, .attr = {.type = _type, {.len = _len}, .flags = 0, } })
+#define __UVERBS_ATTR1(_id, _len, _type, _flags)                        \
+	((const struct uverbs_attr_def)				  \
+	 {.id = _id, .attr = {.type = _type, {.len = _len}, _flags, } })
+#define __UVERBS_ATTR(_id, _len, _type, _flags, _n, ...)		\
+	__UVERBS_ATTR##_n(_id, _len, _type, _flags)
+/*
+ * With newer compilers, UVERBS_ATTR could be simplified by declaring it as
+ * [_id] = {.type = _type, .len = _len, ##__VA_ARGS__}
+ * But since we support older compilers too, we need the more complex code.
+ */
+#define UVERBS_ATTR(_id, _len, _type, ...)				\
+	__UVERBS_ATTR(_id, _len, _type, ##__VA_ARGS__, 1, 0)
+#define UVERBS_ATTR_PTR_IN_SZ(_id, _len, ...)				\
+	UVERBS_ATTR(_id, _len, UVERBS_ATTR_TYPE_PTR_IN, ##__VA_ARGS__)
+/* If sizeof(_type) <= sizeof(u64), this will be inlined rather than a pointer */
+#define UVERBS_ATTR_PTR_IN(_id, _type, ...)				\
+	UVERBS_ATTR_PTR_IN_SZ(_id, sizeof(_type), ##__VA_ARGS__)
+#define UVERBS_ATTR_PTR_OUT_SZ(_id, _len, ...)				\
+	UVERBS_ATTR(_id, _len, UVERBS_ATTR_TYPE_PTR_OUT, ##__VA_ARGS__)
+#define UVERBS_ATTR_PTR_OUT(_id, _type, ...)				\
+	UVERBS_ATTR_PTR_OUT_SZ(_id, sizeof(_type), ##__VA_ARGS__)
+
+/*
+ * With newer compilers, UVERBS_ATTR_IDR (and FD) could be simplified by declaring
+ * it as
+ * {.id = _id,								\
+ *  .attr = {.type = _obj_class,						\
+ *         .obj = {.obj_type = _idr_type,				\
+ *                       .access = _access                              \
+ *                }, ##__VA_ARGS__ } }
+ * But since we support older compilers too, we need the more complex code.
+ */
+#define ___UVERBS_ATTR_OBJ0(_id, _obj_class, _obj_type, _access, ...)\
+	((const struct uverbs_attr_def)					\
+	{.id = _id,							\
+	 .attr = {.type = _obj_class,					\
+		  {.obj = {.obj_type = _obj_type, .access = _access } },\
+		  .flags = 0} })
+#define ___UVERBS_ATTR_OBJ1(_id, _obj_class, _obj_type, _access, _flags)\
+	((const struct uverbs_attr_def)					\
+	{.id = _id,							\
+	.attr = {.type = _obj_class,					\
+		 {.obj = {.obj_type = _obj_type, .access = _access} },	\
+		  _flags} })
+#define ___UVERBS_ATTR_OBJ(_id, _obj_class, _obj_type, _access, _flags, \
+			   _n, ...)					\
+	___UVERBS_ATTR_OBJ##_n(_id, _obj_class, _obj_type, _access, _flags)
+#define __UVERBS_ATTR_OBJ(_id, _obj_class, _obj_type, _access, ...)	\
+	___UVERBS_ATTR_OBJ(_id, _obj_class, _obj_type, _access,		\
+			   ##__VA_ARGS__, 1, 0)
+#define UVERBS_ATTR_IDR(_id, _idr_type, _access, ...)			 \
+	__UVERBS_ATTR_OBJ(_id, UVERBS_ATTR_TYPE_IDR, _idr_type, _access,\
+			  ##__VA_ARGS__)
+#define UVERBS_ATTR_FD(_id, _fd_type, _access, ...)			\
+	__UVERBS_ATTR_OBJ(_id, UVERBS_ATTR_TYPE_FD, _fd_type,		\
+			  (_access) + BUILD_BUG_ON_ZERO(		\
+				(_access) != UVERBS_ACCESS_NEW &&	\
+				(_access) != UVERBS_ACCESS_READ),	\
+			  ##__VA_ARGS__)
+#define DECLARE_UVERBS_ATTR_SPEC(_name, ...)				\
+	const struct uverbs_attr_def _name = __VA_ARGS__
+
+#define _UVERBS_METHOD_ATTRS_SZ(...)					\
+	(sizeof((const struct uverbs_attr_def * const []){__VA_ARGS__}) /\
+	 sizeof(const struct uverbs_attr_def *))
+#define _UVERBS_METHOD(_id, _handler, _flags, ...)			\
+	((const struct uverbs_method_def) {				\
+	 .id = _id,							\
+	 .flags = _flags,						\
+	 .handler = _handler,						\
+	 .num_attrs = _UVERBS_METHOD_ATTRS_SZ(__VA_ARGS__),		\
+	 .attrs = &(const struct uverbs_attr_def * const []){__VA_ARGS__} })
+#define DECLARE_UVERBS_METHOD(_name, _id, _handler, ...)		\
+	const struct uverbs_method_def _name =				\
+		_UVERBS_METHOD(_id, _handler, 0, ##__VA_ARGS__)
+#define DECLARE_UVERBS_CTX_METHOD(_name, _id, _handler, _flags, ...)	\
+	const struct uverbs_method_def _name =				\
+		_UVERBS_METHOD(_id, _handler,				\
+			       UVERBS_ACTION_FLAG_CREATE_ROOT,		\
+			       ##__VA_ARGS__)
+#define _UVERBS_OBJECT_METHODS_SZ(...)					\
+	(sizeof((const struct uverbs_method_def * const []){__VA_ARGS__}) / \
+	 sizeof(const struct uverbs_method_def *))
+#define _UVERBS_OBJECT(_id, _type_attrs, ...)				\
+	((const struct uverbs_object_def) {				\
+	 .id = _id,							\
+	 .type_attrs = _type_attrs,					\
+	 .num_methods = _UVERBS_OBJECT_METHODS_SZ(__VA_ARGS__),		\
+	 .methods = &(const struct uverbs_method_def * const []){__VA_ARGS__} })
+#define DECLARE_UVERBS_OBJECT(_name, _id, _type_attrs, ...)		\
+	const struct uverbs_object_def _name =				\
+		_UVERBS_OBJECT(_id, _type_attrs, ##__VA_ARGS__)
+#define _UVERBS_TREE_OBJECTS_SZ(...)					\
+	(sizeof((const struct uverbs_object_def * const []){__VA_ARGS__}) / \
+	 sizeof(const struct uverbs_object_def *))
+#define _UVERBS_OBJECT_TREE(...)					\
+	((const struct uverbs_object_tree_def) {			\
+	 .num_objects = _UVERBS_TREE_OBJECTS_SZ(__VA_ARGS__),		\
+	 .objects = &(const struct uverbs_object_def * const []){__VA_ARGS__} })
+#define DECLARE_UVERBS_OBJECT_TREE(_name, ...)				\
+	const struct uverbs_object_tree_def _name =			\
+		_UVERBS_OBJECT_TREE(__VA_ARGS__)
+
+/* =================================================
+ *              Parsing infrastructure
+ * =================================================
+ */
+
+struct uverbs_ptr_attr {
+	union {
+		u64		data;
+		void	__user *ptr;
+	};
+	u16		len;
+	/* Combination of bits from enum UVERBS_ATTR_F_XXXX */
+	u16		flags;
+};
+
+struct uverbs_obj_attr {
+	/* pointer to the kernel descriptor -> type, access, etc */
+	const struct uverbs_obj_type	*type;
+	struct ib_uobject		*uobject;
+	/* fd or id in idr of this object */
+	int				id;
+};
+
+struct uverbs_attr {
+	/*
+	 * pointer to the user-space given attribute, in order to write the
+	 * new uobject's id or update flags.
+	 */
+	struct ib_uverbs_attr __user	*uattr;
+	union {
+		struct uverbs_ptr_attr	ptr_attr;
+		struct uverbs_obj_attr	obj_attr;
+	};
+};
+
+struct uverbs_attr_bundle_hash {
+	/* if bit i is set, it means attrs[i] contains valid information */
+	unsigned long *valid_bitmap;
+	size_t num_attrs;
+	/*
+	 * arrays of attributes, each element corresponds to the specification
+	 * of the attribute in the same index.
+	 */
+	struct uverbs_attr *attrs;
+};
+
+struct uverbs_attr_bundle {
+	size_t				num_buckets;
+	struct uverbs_attr_bundle_hash  hash[];
+};
+
+static inline bool uverbs_attr_is_valid_in_hash(const struct uverbs_attr_bundle_hash *attrs_hash,
+						unsigned int idx)
+{
+	return test_bit(idx, attrs_hash->valid_bitmap);
+}
+
+static inline bool uverbs_attr_is_valid(const struct uverbs_attr_bundle *attrs_bundle,
+					unsigned int idx)
+{
+	u16 idx_bucket = idx >>	UVERBS_ID_NS_SHIFT;
+
+	if (attrs_bundle->num_buckets <= idx_bucket)
+		return false;
+
+	return uverbs_attr_is_valid_in_hash(&attrs_bundle->hash[idx_bucket],
+					    idx & ~UVERBS_ID_NS_MASK);
+}
+
+static inline const struct uverbs_attr *uverbs_attr_get(const struct uverbs_attr_bundle *attrs_bundle,
+							u16 idx)
+{
+	u16 idx_bucket = idx >>	UVERBS_ID_NS_SHIFT;
+
+	if (!uverbs_attr_is_valid(attrs_bundle, idx))
+		return ERR_PTR(-ENOENT);
+
+	return &attrs_bundle->hash[idx_bucket].attrs[idx & ~UVERBS_ID_NS_MASK];
+}
+
+static inline int uverbs_copy_to(const struct uverbs_attr_bundle *attrs_bundle,
+				 size_t idx, const void *from)
+{
+	const struct uverbs_attr *attr = uverbs_attr_get(attrs_bundle, idx);
+	u16 flags;
+
+	if (IS_ERR(attr))
+		return PTR_ERR(attr);
+
+	flags = attr->ptr_attr.flags | UVERBS_ATTR_F_VALID_OUTPUT;
+	return (!copy_to_user(attr->ptr_attr.ptr, from, attr->ptr_attr.len) &&
+		!put_user(flags, &attr->uattr->flags)) ? 0 : -EFAULT;
+}
+
+static inline int _uverbs_copy_from(void *to, size_t to_size,
+				    const struct uverbs_attr_bundle *attrs_bundle,
+				    size_t idx)
+{
+	const struct uverbs_attr *attr = uverbs_attr_get(attrs_bundle, idx);
+
+	if (IS_ERR(attr))
+		return PTR_ERR(attr);
+
+	if (to_size <= sizeof(((struct ib_uverbs_attr *)0)->data))
+		memcpy(to, &attr->ptr_attr.data, attr->ptr_attr.len);
+	else if (copy_from_user(to, attr->ptr_attr.ptr, attr->ptr_attr.len))
+		return -EFAULT;
+
+	return 0;
+}
+
+#define uverbs_copy_from(to, attrs_bundle, idx)				      \
+	_uverbs_copy_from(to, sizeof(*(to)), attrs_bundle, idx)
+
+/* =================================================
+ *	 Definitions -> Specs infrastructure
+ * =================================================
+ */
+
+/*
+ * uverbs_alloc_spec_tree - Merges common and driver-specific features
+ *	into one parsing tree against which every uverbs command is parsed.
+ *
+ * @num_trees: Number of trees in the array @trees.
+ * @trees: Array of pointers to tree root definitions to merge. Each such tree
+ *	   possibly contains objects, methods and attributes definitions.
+ *
+ * Returns:
+ *	uverbs_root_spec *: The root of the merged parsing tree.
+ *	On error, we return an error code. Error is checked via IS_ERR.
+ *
+ * The following merges could take place:
+ * a. Two trees representing the same method with different handlers
+ *	-> We take the handler of the tree whose handler != NULL
+ *	   and whose index in the trees array is greater. The rationale
+ *	   is that developers are expected to first merge common trees and then
+ *	   merge trees that give specialized behaviour.
+ * b. Two trees representing the same object with different
+ *    type_attrs (struct uverbs_obj_type):
+ *	-> We take the type_attrs of the tree whose type_attrs != NULL
+ *	   and whose index in the trees array is greater. This could be used
+ *	   in order to override the free function, allocation size, etc.
+ * c. Two trees representing the same method attribute (same id but possibly
+ *    different attributes):
+ *	-> ERROR (-ENOENT), we believe that's not the programmer's intent.
+ *
+ * An object without any methods is considered invalid and will abort the
+ * function with -ENOENT error.
+ */
+#if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)
+struct uverbs_root_spec *uverbs_alloc_spec_tree(unsigned int num_trees,
+						const struct uverbs_object_tree_def **trees);
+void uverbs_free_spec_tree(struct uverbs_root_spec *root);
+#else
+static inline struct uverbs_root_spec *uverbs_alloc_spec_tree(unsigned int num_trees,
+							      const struct uverbs_object_tree_def **trees)
+{
+	return NULL;
+}
+
+static inline void uverbs_free_spec_tree(struct uverbs_root_spec *root)
+{
+}
+#endif
+
+#endif
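
The UVERBS_ATTR() family in uverbs_ioctl.h above accepts an optional trailing flags argument. It picks a 0-argument or 1-argument variant by appending "1, 0" after `##__VA_ARGS__` and pasting whichever digit lands in the selector slot onto the macro name. The sketch below isolates that trick. The struct and DEMO_* macro names are hypothetical, and it relies on the GNU `##__VA_ARGS__` extension (GCC and Clang), as the header itself does.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct uverbs_attr_def. */
struct demo_attr {
	uint16_t id;
	uint8_t flags;
};

/* Variant used when no flags were supplied: flags default to 0. */
#define DEMO_ATTR0(_id, ...)					\
	((struct demo_attr){ .id = (_id), .flags = 0 })
/* Variant used when exactly one flags argument was supplied. */
#define DEMO_ATTR1(_id, _flags)					\
	((struct demo_attr){ .id = (_id), .flags = (_flags) })
/*
 * _n receives 0 when the caller passed no flags (the "1" slides into
 * _flags) and 1 when a flags argument pushed "1" into the _n slot.
 */
#define DEMO_ATTR_PICK(_id, _flags, _n, ...)			\
	DEMO_ATTR##_n(_id, _flags)
#define DEMO_ATTR(_id, ...)					\
	DEMO_ATTR_PICK(_id, ##__VA_ARGS__, 1, 0)
```

Here DEMO_ATTR(7) expands through DEMO_ATTR0 and yields flags 0, while DEMO_ATTR(7, 3) expands through DEMO_ATTR1 and yields flags 3.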
diff --git a/include/rdma/uverbs_std_types.h b/include/rdma/uverbs_std_types.h
index 7771ce9..5f8e20b 100644
--- a/include/rdma/uverbs_std_types.h
+++ b/include/rdma/uverbs_std_types.h
@@ -34,19 +34,35 @@
 #define _UVERBS_STD_TYPES__
 
 #include <rdma/uverbs_types.h>
+#include <rdma/uverbs_ioctl.h>
+#include <rdma/ib_user_ioctl_verbs.h>
 
-extern const struct uverbs_obj_fd_type uverbs_type_attrs_comp_channel;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_cq;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_qp;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_rwq_ind_table;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_wq;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_srq;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_ah;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_flow;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_mr;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_mw;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_pd;
-extern const struct uverbs_obj_idr_type uverbs_type_attrs_xrcd;
+#if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)
+extern const struct uverbs_object_def uverbs_object_comp_channel;
+extern const struct uverbs_object_def uverbs_object_cq;
+extern const struct uverbs_object_def uverbs_object_qp;
+extern const struct uverbs_object_def uverbs_object_rwq_ind_table;
+extern const struct uverbs_object_def uverbs_object_wq;
+extern const struct uverbs_object_def uverbs_object_srq;
+extern const struct uverbs_object_def uverbs_object_ah;
+extern const struct uverbs_object_def uverbs_object_flow;
+extern const struct uverbs_object_def uverbs_object_mr;
+extern const struct uverbs_object_def uverbs_object_mw;
+extern const struct uverbs_object_def uverbs_object_pd;
+extern const struct uverbs_object_def uverbs_object_xrcd;
+extern const struct uverbs_object_def uverbs_object_device;
+
+extern const struct uverbs_object_tree_def uverbs_default_objects;
+static inline const struct uverbs_object_tree_def *uverbs_default_get_objects(void)
+{
+	return &uverbs_default_objects;
+}
+#else
+static inline const struct uverbs_object_tree_def *uverbs_default_get_objects(void)
+{
+	return NULL;
+}
+#endif
 
 static inline struct ib_uobject *__uobj_get(const struct uverbs_obj_type *type,
 					    bool write,
@@ -56,22 +72,22 @@ static inline struct ib_uobject *__uobj_get(const struct uverbs_obj_type *type,
 	return rdma_lookup_get_uobject(type, ucontext, id, write);
 }
 
-#define uobj_get_type(_type) uverbs_type_attrs_##_type.type
+#define uobj_get_type(_object) uverbs_object_##_object.type_attrs
 
 #define uobj_get_read(_type, _id, _ucontext)				\
-	 __uobj_get(&(_type), false, _ucontext, _id)
+	 __uobj_get(_type, false, _ucontext, _id)
 
-#define uobj_get_obj_read(_type, _id, _ucontext)			\
+#define uobj_get_obj_read(_object, _id, _ucontext)			\
 ({									\
-	struct ib_uobject *uobj =					\
-		__uobj_get(&uobj_get_type(_type),			\
+	struct ib_uobject *__uobj =					\
+		__uobj_get(uverbs_object_##_object.type_attrs,		\
 			   false, _ucontext, _id);			\
 									\
-	(struct ib_##_type *)(IS_ERR(uobj) ? NULL : uobj->object);	\
+	(struct ib_##_object *)(IS_ERR(__uobj) ? NULL : __uobj->object);\
 })
 
 #define uobj_get_write(_type, _id, _ucontext)				\
-	 __uobj_get(&(_type), true, _ucontext, _id)
+	 __uobj_get(_type, true, _ucontext, _id)
 
 static inline void uobj_put_read(struct ib_uobject *uobj)
 {
@@ -108,7 +124,7 @@ static inline struct ib_uobject *__uobj_alloc(const struct uverbs_obj_type *type
 }
 
 #define uobj_alloc(_type, ucontext)	\
-	__uobj_alloc(&(_type), ucontext)
+	__uobj_alloc(_type, ucontext)
 
 #endif
 
diff --git a/include/rdma/uverbs_types.h b/include/rdma/uverbs_types.h
index 351ea18..cc04ec6 100644
--- a/include/rdma/uverbs_types.h
+++ b/include/rdma/uverbs_types.h
@@ -129,6 +129,7 @@ struct ib_uobject *rdma_alloc_begin_uobject(const struct uverbs_obj_type *type,
 void rdma_alloc_abort_uobject(struct ib_uobject *uobj);
 int __must_check rdma_remove_commit_uobject(struct ib_uobject *uobj);
 int rdma_alloc_commit_uobject(struct ib_uobject *uobj);
+int rdma_explicit_destroy(struct ib_uobject *uobject);
 
 struct uverbs_obj_fd_type {
 	/*
@@ -151,22 +152,30 @@ extern const struct uverbs_obj_type_class uverbs_fd_class;
 
 #define UVERBS_BUILD_BUG_ON(cond) (sizeof(char[1 - 2 * !!(cond)]) -	\
 				   sizeof(char))
-#define UVERBS_TYPE_ALLOC_FD(_size, _order)				 \
-	{								 \
-		.destroy_order = _order,				 \
-		.type_class = &uverbs_fd_class,				 \
-		.obj_size = (_size) +					 \
-			  UVERBS_BUILD_BUG_ON((_size) <			 \
-					      sizeof(struct ib_uobject_file)),\
-	}
-#define UVERBS_TYPE_ALLOC_IDR_SZ(_size, _order)				\
-	{								\
+#define UVERBS_TYPE_ALLOC_FD(_order, _obj_size, _context_closed, _fops, _name, _flags)\
+	((&((const struct uverbs_obj_fd_type)				\
+	 {.type = {							\
+		.destroy_order = _order,				\
+		.type_class = &uverbs_fd_class,				\
+		.obj_size = (_obj_size) +				\
+			UVERBS_BUILD_BUG_ON((_obj_size) < sizeof(struct ib_uobject_file)), \
+	 },								\
+	 .context_closed = _context_closed,				\
+	 .fops = _fops,							\
+	 .name = _name,							\
+	 .flags = _flags}))->type)
+#define UVERBS_TYPE_ALLOC_IDR_SZ(_size, _order, _destroy_object)	\
+	((&((const struct uverbs_obj_idr_type)				\
+	 {.type = {							\
 		.destroy_order = _order,				\
 		.type_class = &uverbs_idr_class,			\
 		.obj_size = (_size) +					\
-			  UVERBS_BUILD_BUG_ON((_size) <			\
-					      sizeof(struct ib_uobject)), \
-	}
-#define UVERBS_TYPE_ALLOC_IDR(_order)					\
-	 UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uobject), _order)
+			UVERBS_BUILD_BUG_ON((_size) <			\
+					    sizeof(struct ib_uobject))	\
+	 },								\
+	 .destroy_object = _destroy_object,}))->type)
+#define UVERBS_TYPE_ALLOC_IDR(_order, _destroy_object)			\
+	 UVERBS_TYPE_ALLOC_IDR_SZ(sizeof(struct ib_uobject), _order,	\
+				  _destroy_object)
+
 #endif
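Note on UVERBS_BUILD_BUG_ON as used by the allocation macros above: the expression evaluates to 0 when the condition is false and fails to compile when it is true, so adding it to obj_size costs nothing at run time. A minimal sketch of the same trick outside the kernel follows; every demo_* and DEMO_* name is hypothetical and exists only for illustration.

```c
#include <stddef.h>

/* Same shape as UVERBS_BUILD_BUG_ON: char[1 - 2 * !!(cond)] is char[1]
 * when cond is false and char[-1] (a compile error) when cond is true,
 * so the expression is either 0 or the build stops. */
#define DEMO_BUILD_BUG_ON(cond) \
	(sizeof(char[1 - 2 * !!(cond)]) - sizeof(char))

/* Hypothetical stand-ins for struct ib_uobject and a larger wrapper. */
struct demo_base { long refcount; void *object; };
struct demo_derived { struct demo_base base; int extra; };

/* Adds 0 to the size, but refuses to compile if the size is smaller
 * than the base object it must embed. */
#define DEMO_OBJ_SIZE(_size) \
	((_size) + DEMO_BUILD_BUG_ON((_size) < sizeof(struct demo_base)))

size_t demo_derived_obj_size(void)
{
	return DEMO_OBJ_SIZE(sizeof(struct demo_derived));
}
```

Replacing the argument with sizeof(char) in demo_derived_obj_size() should make the build fail, which is the point of the check.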
diff --git a/include/sound/omap-hdmi-audio.h b/include/sound/omap-hdmi-audio.h
index 1df2ff6..0e495ed 100644
--- a/include/sound/omap-hdmi-audio.h
+++ b/include/sound/omap-hdmi-audio.h
@@ -39,7 +39,7 @@ struct omap_hdmi_audio_ops {
 /* HDMI audio initalization data */
 struct omap_hdmi_audio_pdata {
 	struct device *dev;
-	enum omapdss_version dss_version;
+	unsigned int version;
 	phys_addr_t audio_dma_addr;
 
 	const struct omap_hdmi_audio_ops *ops;
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 91dc089..e91ae1f 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -703,6 +703,7 @@ TRACE_EVENT(rcu_batch_end,
  * at the beginning and end of the read, respectively.  Note that the
  * callback address can be NULL.
  */
+#define RCUTORTURENAME_LEN 8
 TRACE_EVENT(rcu_torture_read,
 
 	TP_PROTO(const char *rcutorturename, struct rcu_head *rhp,
@@ -711,7 +712,7 @@ TRACE_EVENT(rcu_torture_read,
 	TP_ARGS(rcutorturename, rhp, secs, c_old, c),
 
 	TP_STRUCT__entry(
-		__field(const char *, rcutorturename)
+		__field(char, rcutorturename[RCUTORTURENAME_LEN])
 		__field(struct rcu_head *, rhp)
 		__field(unsigned long, secs)
 		__field(unsigned long, c_old)
@@ -719,7 +720,9 @@ TRACE_EVENT(rcu_torture_read,
 	),
 
 	TP_fast_assign(
-		__entry->rcutorturename = rcutorturename;
+		strncpy(__entry->rcutorturename, rcutorturename,
+			RCUTORTURENAME_LEN);
+		__entry->rcutorturename[RCUTORTURENAME_LEN - 1] = 0;
 		__entry->rhp = rhp;
 		__entry->secs = secs;
 		__entry->c_old = c_old;
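Note on the rcu_torture_read change above: the trace entry now stores a bounded copy of the name instead of the caller's pointer. strncpy() does not terminate the destination when the source fills the buffer, which is why the patch writes the final byte explicitly. A minimal user-space sketch of the same pattern, with hypothetical demo_* names:

```c
#include <string.h>

#define DEMO_NAME_LEN 8	/* mirrors RCUTORTURENAME_LEN for illustration */

/* Hypothetical stand-in for the trace entry's fixed-size name field. */
struct demo_entry {
	char name[DEMO_NAME_LEN];
};

/* Copy at most DEMO_NAME_LEN bytes, then force NUL termination so the
 * stored name is always a valid C string, truncated if necessary. */
void demo_entry_set_name(struct demo_entry *e, const char *src)
{
	strncpy(e->name, src, DEMO_NAME_LEN);
	e->name[DEMO_NAME_LEN - 1] = 0;
}
```

Names longer than seven characters are silently truncated; shorter names are padded with NUL bytes by strncpy().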
diff --git a/include/uapi/drm/armada_drm.h b/include/uapi/drm/armada_drm.h
index 72e326f..0cb9324 100644
--- a/include/uapi/drm/armada_drm.h
+++ b/include/uapi/drm/armada_drm.h
@@ -23,27 +23,27 @@ extern "C" {
 	DRM_##dir(DRM_COMMAND_BASE + DRM_ARMADA_##name, struct drm_armada_##str)
 
 struct drm_armada_gem_create {
-	uint32_t handle;
-	uint32_t size;
+	__u32 handle;
+	__u32 size;
 };
 #define DRM_IOCTL_ARMADA_GEM_CREATE \
 	ARMADA_IOCTL(IOWR, GEM_CREATE, gem_create)
 
 struct drm_armada_gem_mmap {
-	uint32_t handle;
-	uint32_t pad;
-	uint64_t offset;
-	uint64_t size;
-	uint64_t addr;
+	__u32 handle;
+	__u32 pad;
+	__u64 offset;
+	__u64 size;
+	__u64 addr;
 };
 #define DRM_IOCTL_ARMADA_GEM_MMAP \
 	ARMADA_IOCTL(IOWR, GEM_MMAP, gem_mmap)
 
 struct drm_armada_gem_pwrite {
-	uint64_t ptr;
-	uint32_t handle;
-	uint32_t offset;
-	uint32_t size;
+	__u64 ptr;
+	__u32 handle;
+	__u32 offset;
+	__u32 size;
 };
 #define DRM_IOCTL_ARMADA_GEM_PWRITE \
 	ARMADA_IOCTL(IOW, GEM_PWRITE, gem_pwrite)
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index 101593a..97677cd 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -700,6 +700,7 @@ struct drm_prime_handle {
 
 struct drm_syncobj_create {
 	__u32 handle;
+#define DRM_SYNCOBJ_CREATE_SIGNALED (1 << 0)
 	__u32 flags;
 };
 
@@ -718,6 +719,24 @@ struct drm_syncobj_handle {
 	__u32 pad;
 };
 
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_ALL (1 << 0)
+#define DRM_SYNCOBJ_WAIT_FLAGS_WAIT_FOR_SUBMIT (1 << 1)
+struct drm_syncobj_wait {
+	__u64 handles;
+	/* absolute timeout */
+	__s64 timeout_nsec;
+	__u32 count_handles;
+	__u32 flags;
+	__u32 first_signaled; /* only valid when not waiting all */
+	__u32 pad;
+};
+
+struct drm_syncobj_array {
+	__u64 handles;
+	__u32 count_handles;
+	__u32 pad;
+};
+
 #if defined(__cplusplus)
 }
 #endif
@@ -840,6 +859,9 @@ extern "C" {
 #define DRM_IOCTL_SYNCOBJ_DESTROY	DRM_IOWR(0xC0, struct drm_syncobj_destroy)
 #define DRM_IOCTL_SYNCOBJ_HANDLE_TO_FD	DRM_IOWR(0xC1, struct drm_syncobj_handle)
 #define DRM_IOCTL_SYNCOBJ_FD_TO_HANDLE	DRM_IOWR(0xC2, struct drm_syncobj_handle)
+#define DRM_IOCTL_SYNCOBJ_WAIT		DRM_IOWR(0xC3, struct drm_syncobj_wait)
+#define DRM_IOCTL_SYNCOBJ_RESET		DRM_IOWR(0xC4, struct drm_syncobj_array)
+#define DRM_IOCTL_SYNCOBJ_SIGNAL	DRM_IOWR(0xC5, struct drm_syncobj_array)
 
 /**
  * Device specific ioctls should only be in their respective headers
diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h
index 7586c46..3ad838d 100644
--- a/include/uapi/drm/drm_fourcc.h
+++ b/include/uapi/drm/drm_fourcc.h
@@ -185,6 +185,8 @@ extern "C" {
 #define DRM_FORMAT_MOD_VENDOR_BROADCOM 0x07
 /* add more to the end as needed */
 
+#define DRM_FORMAT_RESERVED	      ((1ULL << 56) - 1)
+
 #define fourcc_mod_code(vendor, val) \
 	((((__u64)DRM_FORMAT_MOD_VENDOR_## vendor) << 56) | (val & 0x00ffffffffffffffULL))
 
@@ -197,6 +199,15 @@ extern "C" {
  */
 
 /*
+ * Invalid Modifier
+ *
+ * This modifier can be used as a sentinel to terminate the format modifiers
+ * list, or to initialize a variable with an invalid modifier. It might also be
+ * used to report an error back to userspace for certain APIs.
+ */
+#define DRM_FORMAT_MOD_INVALID	fourcc_mod_code(NONE, DRM_FORMAT_RESERVED)
+
+/*
  * Linear Layout
  *
  * Just plain linear layout. Note that this is different from no specifying any
@@ -253,6 +264,26 @@ extern "C" {
 #define I915_FORMAT_MOD_Yf_TILED fourcc_mod_code(INTEL, 3)
 
 /*
+ * Intel color control surface (CCS) for render compression
+ *
+ * The framebuffer format must be one of the 8:8:8:8 RGB formats.
+ * The main surface will be plane index 0 and must be Y/Yf-tiled,
+ * the CCS will be plane index 1.
+ *
+ * Each CCS tile matches a 1024x512 pixel area of the main surface.
+ * To match certain aspects of the 3D hardware the CCS is
+ * considered to be made up of normal 128Bx32 Y tiles. Thus
+ * the CCS pitch must be specified in multiples of 128 bytes.
+ *
+ * In reality the CCS tile appears to be a 64Bx64 Y tile, composed
+ * of QWORD (8 bytes) chunks instead of OWORD (16 bytes) chunks.
+ * But that fact is not relevant unless the memory is accessed
+ * directly.
+ */
+#define I915_FORMAT_MOD_Y_TILED_CCS	fourcc_mod_code(INTEL, 4)
+#define I915_FORMAT_MOD_Yf_TILED_CCS	fourcc_mod_code(INTEL, 5)
+
+/*
  * Tiled, NV12MT, grouped in 64 (pixels) x 32 (lines) -sized macroblocks
  *
  * Macroblocks are laid in a Z-shape, and each pixel data is following the
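Note on the modifier encoding used above: fourcc_mod_code() places the vendor ID in the top 8 bits and the vendor-specific value in the low 56 bits, so DRM_FORMAT_MOD_INVALID is vendor NONE with every value bit set. The sketch below restates that packing with plain functions. The DEMO_* vendor constants are assumed to match DRM_FORMAT_MOD_VENDOR_NONE and DRM_FORMAT_MOD_VENDOR_INTEL, which are not visible in this hunk.

```c
#include <stdint.h>

#define DEMO_VENDOR_NONE	0x00ULL	/* assumed value, see note above */
#define DEMO_VENDOR_INTEL	0x01ULL	/* assumed value, see note above */
#define DEMO_RESERVED		((1ULL << 56) - 1)

/* Pack a vendor ID (top 8 bits) and a 56-bit value into one modifier. */
static inline uint64_t demo_mod_code(uint64_t vendor, uint64_t val)
{
	return (vendor << 56) | (val & 0x00ffffffffffffffULL);
}

/* Recover the vendor ID from a packed modifier. */
static inline uint8_t demo_mod_vendor(uint64_t modifier)
{
	return (uint8_t)(modifier >> 56);
}

/* Recover the vendor-specific value from a packed modifier. */
static inline uint64_t demo_mod_value(uint64_t modifier)
{
	return modifier & 0x00ffffffffffffffULL;
}
```

Under these assumptions, demo_mod_code(DEMO_VENDOR_INTEL, 4) corresponds to I915_FORMAT_MOD_Y_TILED_CCS.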
diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h
index 403339f..54fc38c 100644
--- a/include/uapi/drm/drm_mode.h
+++ b/include/uapi/drm/drm_mode.h
@@ -712,6 +712,56 @@ struct drm_mode_atomic {
 	__u64 user_data;
 };
 
+struct drm_format_modifier_blob {
+#define FORMAT_BLOB_CURRENT 1
+	/* Version of this blob format */
+	__u32 version;
+
+	/* Flags */
+	__u32 flags;
+
+	/* Number of fourcc formats supported */
+	__u32 count_formats;
+
+	/* Where in this blob the formats exist (in bytes) */
+	__u32 formats_offset;
+
+	/* Number of drm_format_modifiers */
+	__u32 count_modifiers;
+
+	/* Where in this blob the modifiers exist (in bytes) */
+	__u32 modifiers_offset;
+
+	/* __u32 formats[] */
+	/* struct drm_format_modifier modifiers[] */
+};
+
+struct drm_format_modifier {
+	/* Bitmask of formats in get_plane format list this info applies to. The
+	 * offset selects which window of 64 formats (bits) the mask covers.
+	 *
+	 * Some examples:
+	 * In today's world with < 65 formats, where formats 0 and 2 are
+	 * supported:
+	 * 0x0000000000000005
+	 *		  ^-offset = 0, formats = 5
+	 *
+	 * If the number of formats grew to 128, and formats 98-102 are
+	 * supported with the modifier:
+	 *
+	 * 0x0000007c00000000 0000000000000000
+	 *		  ^
+	 *		  |__offset = 64, formats = 0x7c00000000
+	 *
+	 */
+	__u64 formats;
+	__u32 offset;
+	__u32 pad;
+
+	/* The modifier that applies to the >get_plane format list bitmask. */
+	__u64 modifier;
+};
+
 /**
  * Create a new 'blob' data property, copying length bytes from data pointer,
  * and returning new blob ID.
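Note on struct drm_format_modifier above: each entry covers a window of 64 format indices starting at offset, and bit (index - offset) of formats marks format index as supported. The sketch below shows how such a mask could be built and queried. It is not the kernel's implementation, and the demo_* names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical mirror of struct drm_format_modifier. */
struct demo_format_modifier {
	uint64_t formats;	/* bitmask relative to offset */
	uint32_t offset;	/* index of the format that bit 0 refers to */
	uint32_t pad;
	uint64_t modifier;
};

/* Mark format index idx as supported. Returns 1 on success, 0 when idx
 * lies outside this entry's 64-format window. */
int demo_modifier_add_format(struct demo_format_modifier *m, uint32_t idx)
{
	if (idx < m->offset || idx - m->offset >= 64)
		return 0;
	m->formats |= 1ULL << (idx - m->offset);
	return 1;
}

/* Report whether format index idx is marked as supported. */
int demo_modifier_has_format(const struct demo_format_modifier *m,
			     uint32_t idx)
{
	if (idx < m->offset || idx - m->offset >= 64)
		return 0;
	return (int)((m->formats >> (idx - m->offset)) & 1);
}
```

With offset 0, formats 0 and 2 give a mask of 0x5. With offset 64, formats 98 through 102 set bits 34 through 38.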
diff --git a/include/uapi/drm/i915_drm.h b/include/uapi/drm/i915_drm.h
index 7ccbd6a..6598fb7 100644
--- a/include/uapi/drm/i915_drm.h
+++ b/include/uapi/drm/i915_drm.h
@@ -260,6 +260,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_I915_GEM_CONTEXT_GETPARAM	0x34
 #define DRM_I915_GEM_CONTEXT_SETPARAM	0x35
 #define DRM_I915_PERF_OPEN		0x36
+#define DRM_I915_PERF_ADD_CONFIG	0x37
+#define DRM_I915_PERF_REMOVE_CONFIG	0x38
 
 #define DRM_IOCTL_I915_INIT		DRM_IOW( DRM_COMMAND_BASE + DRM_I915_INIT, drm_i915_init_t)
 #define DRM_IOCTL_I915_FLUSH		DRM_IO ( DRM_COMMAND_BASE + DRM_I915_FLUSH)
@@ -315,6 +317,8 @@ typedef struct _drm_i915_sarea {
 #define DRM_IOCTL_I915_GEM_CONTEXT_GETPARAM	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_GETPARAM, struct drm_i915_gem_context_param)
 #define DRM_IOCTL_I915_GEM_CONTEXT_SETPARAM	DRM_IOWR (DRM_COMMAND_BASE + DRM_I915_GEM_CONTEXT_SETPARAM, struct drm_i915_gem_context_param)
 #define DRM_IOCTL_I915_PERF_OPEN	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_OPEN, struct drm_i915_perf_open_param)
+#define DRM_IOCTL_I915_PERF_ADD_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_ADD_CONFIG, struct drm_i915_perf_oa_config)
+#define DRM_IOCTL_I915_PERF_REMOVE_CONFIG	DRM_IOW(DRM_COMMAND_BASE + DRM_I915_PERF_REMOVE_CONFIG, __u64)
 
 /* Allow drivers to submit batchbuffers directly to hardware, relying
  * on the security mechanisms provided by hardware.
@@ -431,6 +435,11 @@ typedef struct drm_i915_irq_wait {
  */
 #define I915_PARAM_HAS_EXEC_BATCH_FIRST	 48
 
+/* Query whether DRM_I915_GEM_EXECBUFFER2 supports supplying an array of
+ * drm_i915_gem_exec_fence structures.  See I915_EXEC_FENCE_ARRAY.
+ */
+#define I915_PARAM_HAS_EXEC_FENCE_ARRAY  49
+
 typedef struct drm_i915_getparam {
 	__s32 param;
 	/*
@@ -812,6 +821,17 @@ struct drm_i915_gem_exec_object2 {
 	__u64 rsvd2;
 };
 
+struct drm_i915_gem_exec_fence {
+	/**
+	 * User's handle for a drm_syncobj to wait on or signal.
+	 */
+	__u32 handle;
+
+#define I915_EXEC_FENCE_WAIT            (1<<0)
+#define I915_EXEC_FENCE_SIGNAL          (1<<1)
+	__u32 flags;
+};
+
 struct drm_i915_gem_execbuffer2 {
 	/**
 	 * List of gem_exec_object2 structs
@@ -826,7 +846,11 @@ struct drm_i915_gem_execbuffer2 {
 	__u32 DR1;
 	__u32 DR4;
 	__u32 num_cliprects;
-	/** This is a struct drm_clip_rect *cliprects */
+	/**
+	 * This is a struct drm_clip_rect *cliprects if I915_EXEC_FENCE_ARRAY
+	 * is not set.  If I915_EXEC_FENCE_ARRAY is set, then this is a
+	 * struct drm_i915_gem_exec_fence *fences.
+	 */
 	__u64 cliprects_ptr;
 #define I915_EXEC_RING_MASK              (7<<0)
 #define I915_EXEC_DEFAULT                (0<<0)
@@ -927,7 +951,14 @@ struct drm_i915_gem_execbuffer2 {
  * element).
  */
 #define I915_EXEC_BATCH_FIRST		(1<<18)
-#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_BATCH_FIRST<<1))
+
+/* Setting I915_EXEC_FENCE_ARRAY implies that num_cliprects and cliprects_ptr
+ * define an array of drm_i915_gem_exec_fence structures which specify a
+ * set of dma fences to wait upon or signal.
+ */
+#define I915_EXEC_FENCE_ARRAY   (1<<19)
+
+#define __I915_EXEC_UNKNOWN_FLAGS (-(I915_EXEC_FENCE_ARRAY<<1))
 
 #define I915_EXEC_CONTEXT_ID_MASK	(0xffffffff)
 #define i915_execbuffer2_set_context_id(eb2, context) \
@@ -1467,6 +1498,22 @@ enum drm_i915_perf_record_type {
 	DRM_I915_PERF_RECORD_MAX /* non-ABI */
 };
 
+/**
+ * Structure to upload perf dynamic configuration into the kernel.
+ */
+struct drm_i915_perf_oa_config {
+	/** String formatted like "%08x-%04x-%04x-%04x-%012x" */
+	char uuid[36];
+
+	__u32 n_mux_regs;
+	__u32 n_boolean_regs;
+	__u32 n_flex_regs;
+
+	__u64 __user mux_regs_ptr;
+	__u64 __user boolean_regs_ptr;
+	__u64 __user flex_regs_ptr;
+};
+
 #if defined(__cplusplus)
 }
 #endif
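Note on __I915_EXEC_UNKNOWN_FLAGS above: negating the highest known flag shifted left by one yields a mask with every higher bit set, so the definition only needs to name the newest flag. A minimal sketch of that arithmetic, with hypothetical DEMO_* and demo_* names and a validity check that is an illustration rather than the driver's code:

```c
#include <stdint.h>

#define DEMO_EXEC_BATCH_FIRST	(1 << 18)
#define DEMO_EXEC_FENCE_ARRAY	(1 << 19)	/* highest known flag */

/* -(1 << 20) in two's complement has bit 20 and every bit above it set,
 * and bits 0..19 clear. */
#define DEMO_EXEC_UNKNOWN_FLAGS	(-(DEMO_EXEC_FENCE_ARRAY << 1))

/* Returns 1 when flags contains only known bits, 0 otherwise. The
 * negative int mask is sign-extended to 64 bits by the conversion. */
int demo_exec_flags_valid(uint64_t flags)
{
	return (flags & DEMO_EXEC_UNKNOWN_FLAGS) == 0;
}
```

Adding a new flag at bit 20 would only require changing which flag the mask definition names.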
diff --git a/include/uapi/drm/qxl_drm.h b/include/uapi/drm/qxl_drm.h
index 7eef422..880999d2 100644
--- a/include/uapi/drm/qxl_drm.h
+++ b/include/uapi/drm/qxl_drm.h
@@ -80,8 +80,8 @@ struct drm_qxl_reloc {
 };
 
 struct drm_qxl_command {
-	__u64	 __user command; /* void* */
-	__u64	 __user relocs; /* struct drm_qxl_reloc* */
+	__u64		command; /* void* */
+	__u64		relocs; /* struct drm_qxl_reloc* */
 	__u32		type;
 	__u32		command_size;
 	__u32		relocs_num;
@@ -91,7 +91,7 @@ struct drm_qxl_command {
 struct drm_qxl_execbuffer {
 	__u32		flags;		/* for future use */
 	__u32		commands_num;
-	__u64	 __user commands;	/* struct drm_qxl_command* */
+	__u64		commands;	/* struct drm_qxl_command* */
 };
 
 struct drm_qxl_update_area {
diff --git a/include/uapi/drm/vc4_drm.h b/include/uapi/drm/vc4_drm.h
index 6ac4c5c..afae870 100644
--- a/include/uapi/drm/vc4_drm.h
+++ b/include/uapi/drm/vc4_drm.h
@@ -40,6 +40,7 @@ extern "C" {
 #define DRM_VC4_GET_PARAM                         0x07
 #define DRM_VC4_SET_TILING                        0x08
 #define DRM_VC4_GET_TILING                        0x09
+#define DRM_VC4_LABEL_BO                          0x0a
 
 #define DRM_IOCTL_VC4_SUBMIT_CL           DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_SUBMIT_CL, struct drm_vc4_submit_cl)
 #define DRM_IOCTL_VC4_WAIT_SEQNO          DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_WAIT_SEQNO, struct drm_vc4_wait_seqno)
@@ -51,6 +52,7 @@ extern "C" {
 #define DRM_IOCTL_VC4_GET_PARAM           DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GET_PARAM, struct drm_vc4_get_param)
 #define DRM_IOCTL_VC4_SET_TILING          DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_SET_TILING, struct drm_vc4_set_tiling)
 #define DRM_IOCTL_VC4_GET_TILING          DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_GET_TILING, struct drm_vc4_get_tiling)
+#define DRM_IOCTL_VC4_LABEL_BO            DRM_IOWR(DRM_COMMAND_BASE + DRM_VC4_LABEL_BO, struct drm_vc4_label_bo)
 
 struct drm_vc4_submit_rcl_surface {
 	__u32 hindex; /* Handle index, or ~0 if not present. */
@@ -153,6 +155,16 @@ struct drm_vc4_submit_cl {
 	__u32 pad:24;
 
 #define VC4_SUBMIT_CL_USE_CLEAR_COLOR			(1 << 0)
+/* By default, the kernel gets to choose the order that the tiles are
+ * rendered in.  If this is set, then the tiles will be rendered in a
+ * raster order, with the right-to-left vs left-to-right and
+ * top-to-bottom vs bottom-to-top dictated by
+ * VC4_SUBMIT_CL_RCL_ORDER_INCREASING_*.  This allows overlapping
+ * blits to be implemented using the 3D engine.
+ */
+#define VC4_SUBMIT_CL_FIXED_RCL_ORDER			(1 << 1)
+#define VC4_SUBMIT_CL_RCL_ORDER_INCREASING_X		(1 << 2)
+#define VC4_SUBMIT_CL_RCL_ORDER_INCREASING_Y		(1 << 3)
 	__u32 flags;
 
 	/* Returned value of the seqno of this render job (for the
@@ -292,6 +304,7 @@ struct drm_vc4_get_hang_state {
 #define DRM_VC4_PARAM_SUPPORTS_BRANCHES		3
 #define DRM_VC4_PARAM_SUPPORTS_ETC1		4
 #define DRM_VC4_PARAM_SUPPORTS_THREADED_FS	5
+#define DRM_VC4_PARAM_SUPPORTS_FIXED_RCL_ORDER	6
 
 struct drm_vc4_get_param {
 	__u32 param;
@@ -311,6 +324,15 @@ struct drm_vc4_set_tiling {
 	__u64 modifier;
 };
 
+/**
+ * struct drm_vc4_label_bo - Attach a name to a BO for debug purposes.
+ */
+struct drm_vc4_label_bo {
+	__u32 handle;
+	__u32 len;
+	__u64 name;
+};
+
 #if defined(__cplusplus)
 }
 #endif
diff --git a/include/uapi/drm/vmwgfx_drm.h b/include/uapi/drm/vmwgfx_drm.h
index d9dfde9..0bc784f 100644
--- a/include/uapi/drm/vmwgfx_drm.h
+++ b/include/uapi/drm/vmwgfx_drm.h
@@ -297,13 +297,17 @@ union drm_vmw_surface_reference_arg {
  * @version: Allows expanding the execbuf ioctl parameters without breaking
  * backwards compatibility, since user-space will always tell the kernel
  * which version it uses.
- * @flags: Execbuf flags. None currently.
+ * @flags: Execbuf flags.
+ * @imported_fence_fd:  FD for a fence imported from another device
  *
  * Argument to the DRM_VMW_EXECBUF Ioctl.
  */
 
 #define DRM_VMW_EXECBUF_VERSION 2
 
+#define DRM_VMW_EXECBUF_FLAG_IMPORT_FENCE_FD (1 << 0)
+#define DRM_VMW_EXECBUF_FLAG_EXPORT_FENCE_FD (1 << 1)
+
 struct drm_vmw_execbuf_arg {
 	__u64 commands;
 	__u32 command_size;
@@ -312,7 +316,7 @@ struct drm_vmw_execbuf_arg {
 	__u32 version;
 	__u32 flags;
 	__u32 context_handle;
-	__u32 pad64;
+	__s32 imported_fence_fd;
 };
 
 /**
@@ -328,6 +332,7 @@ struct drm_vmw_execbuf_arg {
  * @passed_seqno: The highest seqno number processed by the hardware
  * so far. This can be used to mark user-space fence objects as signaled, and
  * to determine whether a fence seqno might be stale.
+ * @fd: FD associated with the fence, -1 if not exported
  * @error: This member should've been set to -EFAULT on submission.
  * The following actions should be take on completion:
  * error == -EFAULT: Fence communication failed. The host is synchronized.
@@ -345,7 +350,7 @@ struct drm_vmw_fence_rep {
 	__u32 mask;
 	__u32 seqno;
 	__u32 passed_seqno;
-	__u32 pad64;
+	__s32 fd;
 	__s32 error;
 };
 
diff --git a/include/uapi/linux/aio_abi.h b/include/uapi/linux/aio_abi.h
index a2d4a8a..a04adbc 100644
--- a/include/uapi/linux/aio_abi.h
+++ b/include/uapi/linux/aio_abi.h
@@ -28,6 +28,7 @@
 #define __LINUX__AIO_ABI_H
 
 #include <linux/types.h>
+#include <linux/fs.h>
 #include <asm/byteorder.h>
 
 typedef __kernel_ulong_t aio_context_t;
@@ -62,14 +63,6 @@ struct io_event {
 	__s64		res2;		/* secondary result */
 };
 
-#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
-#define PADDED(x,y)	x, y
-#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
-#define PADDED(x,y)	y, x
-#else
-#error edit for your odd byteorder.
-#endif
-
 /*
  * we always use a 64bit off_t when communicating
  * with userland.  its up to libraries to do the
@@ -79,8 +72,16 @@ struct io_event {
 struct iocb {
 	/* these are internal to the kernel/libc. */
 	__u64	aio_data;	/* data to be returned in event's data */
-	__u32	PADDED(aio_key, aio_rw_flags);
-				/* the kernel sets aio_key to the req # */
+
+#if defined(__BYTE_ORDER) ? __BYTE_ORDER == __LITTLE_ENDIAN : defined(__LITTLE_ENDIAN)
+	__u32	aio_key;	/* the kernel sets aio_key to the req # */
+	__kernel_rwf_t aio_rw_flags;	/* RWF_* flags */
+#elif defined(__BYTE_ORDER) ? __BYTE_ORDER == __BIG_ENDIAN : defined(__BIG_ENDIAN)
+	__kernel_rwf_t aio_rw_flags;	/* RWF_* flags */
+	__u32	aio_key;	/* the kernel sets aio_key to the req # */
+#else
+#error edit for your odd byteorder.
+#endif
 
 	/* common fields */
 	__u16	aio_lio_opcode;	/* see IOCB_CMD_ above */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index b7495d0..56235dd 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -358,13 +358,25 @@ struct fscrypt_key {
 #define SYNC_FILE_RANGE_WRITE		2
 #define SYNC_FILE_RANGE_WAIT_AFTER	4
 
-/* flags for preadv2/pwritev2: */
-#define RWF_HIPRI			0x00000001 /* high priority request, poll if possible */
-#define RWF_DSYNC			0x00000002 /* per-IO O_DSYNC */
-#define RWF_SYNC			0x00000004 /* per-IO O_SYNC */
-#define RWF_NOWAIT			0x00000008 /* per-IO, return -EAGAIN if operation would block */
+/*
+ * Flags for preadv2/pwritev2:
+ */
 
-#define RWF_SUPPORTED			(RWF_HIPRI | RWF_DSYNC | RWF_SYNC |\
-					 RWF_NOWAIT)
+typedef int __bitwise __kernel_rwf_t;
+
+/* high priority request, poll if possible */
+#define RWF_HIPRI	((__force __kernel_rwf_t)0x00000001)
+
+/* per-IO O_DSYNC */
+#define RWF_DSYNC	((__force __kernel_rwf_t)0x00000002)
+
+/* per-IO O_SYNC */
+#define RWF_SYNC	((__force __kernel_rwf_t)0x00000004)
+
+/* per-IO, return -EAGAIN if operation would block */
+#define RWF_NOWAIT	((__force __kernel_rwf_t)0x00000008)
+
+/* mask of flags supported by the kernel */
+#define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT)
 
 #endif /* _UAPI_LINUX_FS_H */
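Note on RWF_SUPPORTED above: the mask lets a caller reject any per-IO flag it does not understand with a single test. The sketch below uses plain int constants instead of the __bitwise type, and the choice of -EOPNOTSUPP as the error is an assumption made for illustration. All demo_* and DEMO_* names are hypothetical.

```c
#include <errno.h>

/* Plain-int mirrors of the RWF_* values in the hunk above. */
#define DEMO_RWF_HIPRI	0x00000001
#define DEMO_RWF_DSYNC	0x00000002
#define DEMO_RWF_SYNC	0x00000004
#define DEMO_RWF_NOWAIT	0x00000008

#define DEMO_RWF_SUPPORTED \
	(DEMO_RWF_HIPRI | DEMO_RWF_DSYNC | DEMO_RWF_SYNC | DEMO_RWF_NOWAIT)

/* Returns 0 when every bit in flags is a known RWF flag, or a negative
 * errno value (assumed here to be -EOPNOTSUPP) otherwise. */
int demo_check_rwf(int flags)
{
	if (flags & ~DEMO_RWF_SUPPORTED)
		return -EOPNOTSUPP;
	return 0;
}
```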
diff --git a/include/uapi/linux/kfd_ioctl.h b/include/uapi/linux/kfd_ioctl.h
index d683342..7b4567b 100644
--- a/include/uapi/linux/kfd_ioctl.h
+++ b/include/uapi/linux/kfd_ioctl.h
@@ -232,6 +232,35 @@ struct kfd_ioctl_wait_events_args {
 	uint32_t wait_result;		/* from KFD */
 };
 
+struct kfd_ioctl_set_scratch_backing_va_args {
+	uint64_t va_addr;	/* to KFD */
+	uint32_t gpu_id;	/* to KFD */
+	uint32_t pad;
+};
+
+struct kfd_ioctl_get_tile_config_args {
+	/* to KFD: pointer to tile array */
+	uint64_t tile_config_ptr;
+	/* to KFD: pointer to macro tile array */
+	uint64_t macro_tile_config_ptr;
+	/* to KFD: array size allocated by user mode
+	 * from KFD: array size filled by kernel
+	 */
+	uint32_t num_tile_configs;
+	/* to KFD: array size allocated by user mode
+	 * from KFD: array size filled by kernel
+	 */
+	uint32_t num_macro_tile_configs;
+
+	uint32_t gpu_id;		/* to KFD */
+	uint32_t gb_addr_config;	/* from KFD */
+	uint32_t num_banks;		/* from KFD */
+	uint32_t num_ranks;		/* from KFD */
+	/* struct size can be extended later if needed
+	 * without breaking ABI compatibility
+	 */
+};
+
 #define AMDKFD_IOCTL_BASE 'K'
 #define AMDKFD_IO(nr)			_IO(AMDKFD_IOCTL_BASE, nr)
 #define AMDKFD_IOR(nr, type)		_IOR(AMDKFD_IOCTL_BASE, nr, type)
@@ -286,7 +315,13 @@ struct kfd_ioctl_wait_events_args {
 #define AMDKFD_IOC_DBG_WAVE_CONTROL		\
 		AMDKFD_IOW(0x10, struct kfd_ioctl_dbg_wave_control_args)
 
+#define AMDKFD_IOC_SET_SCRATCH_BACKING_VA	\
+		AMDKFD_IOWR(0x11, struct kfd_ioctl_set_scratch_backing_va_args)
+
+#define AMDKFD_IOC_GET_TILE_CONFIG                                      \
+		AMDKFD_IOWR(0x12, struct kfd_ioctl_get_tile_config_args)
+
 #define AMDKFD_COMMAND_START		0x01
-#define AMDKFD_COMMAND_END		0x11
+#define AMDKFD_COMMAND_END		0x13
 
 #endif
diff --git a/include/uapi/linux/membarrier.h b/include/uapi/linux/membarrier.h
index e0b108b..6d47b32 100644
--- a/include/uapi/linux/membarrier.h
+++ b/include/uapi/linux/membarrier.h
@@ -40,14 +40,33 @@
  *                          (non-running threads are de facto in such a
  *                          state). This covers threads from all processes
  *                          running on the system. This command returns 0.
+ * @MEMBARRIER_CMD_PRIVATE_EXPEDITED:
+ *                          Execute a memory barrier on each running
+ *                          thread belonging to the same process as the current
+ *                          thread. Upon return from system call, the
+ *                          caller thread is guaranteed that all its running
+ *                          sibling threads have passed through a state
+ *                          where all memory accesses to user-space
+ *                          addresses match program order between entry
+ *                          to and return from the system call
+ *                          (non-running threads are de facto in such a
+ *                          state). This only covers threads from the
+ *                          same process as the caller thread. This
+ *                          command returns 0. The "expedited" commands
+ *                          complete faster than the non-expedited ones
+ *                          and never block, but have the downside of
+ *                          causing extra overhead.
  *
  * Command to be passed to the membarrier system call. The commands need to
  * be a single bit each, except for MEMBARRIER_CMD_QUERY which is assigned to
  * the value 0.
  */
 enum membarrier_cmd {
-	MEMBARRIER_CMD_QUERY = 0,
-	MEMBARRIER_CMD_SHARED = (1 << 0),
+	MEMBARRIER_CMD_QUERY			= 0,
+	MEMBARRIER_CMD_SHARED			= (1 << 0),
+	/* reserved for MEMBARRIER_CMD_SHARED_EXPEDITED (1 << 1) */
+	/* reserved for MEMBARRIER_CMD_PRIVATE (1 << 2) */
+	MEMBARRIER_CMD_PRIVATE_EXPEDITED	= (1 << 3),
 };
 
 #endif /* _UAPI_LINUX_MEMBARRIER_H */
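Note on the command encoding above: because each command other than MEMBARRIER_CMD_QUERY is a single bit, a caller can test a bitmask of supported commands and fall back when the expedited variant is missing. The sketch below assumes such a mask has already been obtained (for example from MEMBARRIER_CMD_QUERY) and only shows the selection logic. The demo_* names are hypothetical and the enum is a local mirror of the values in the hunk.

```c
/* Local mirror of enum membarrier_cmd as extended by the hunk above. */
enum demo_membarrier_cmd {
	DEMO_CMD_QUERY			= 0,
	DEMO_CMD_SHARED			= (1 << 0),
	DEMO_CMD_PRIVATE_EXPEDITED	= (1 << 3),
};

/* Pick the preferred barrier command from a mask of supported commands.
 * A negative mask is treated as a failed query. Returns the command to
 * use, or -1 when no usable command is available. */
int demo_membarrier_pick(int supported_mask)
{
	if (supported_mask < 0)
		return -1;
	if (supported_mask & DEMO_CMD_PRIVATE_EXPEDITED)
		return DEMO_CMD_PRIVATE_EXPEDITED;
	if (supported_mask & DEMO_CMD_SHARED)
		return DEMO_CMD_SHARED;
	return -1;
}
```

Preferring the private expedited command reflects the trade-off described in the comment: it completes faster and is limited to the caller's process, at the cost of extra overhead.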
diff --git a/include/uapi/linux/ndctl.h b/include/uapi/linux/ndctl.h
index 6d3c542..3f03567 100644
--- a/include/uapi/linux/ndctl.h
+++ b/include/uapi/linux/ndctl.h
@@ -145,43 +145,6 @@ struct nd_cmd_clear_error {
 	__u64 cleared;
 } __packed;
 
-struct nd_cmd_trans_spa {
-	__u64 spa;
-	__u32 status;
-	__u8  flags;
-	__u8  _reserved[3];
-	__u64 trans_length;
-	__u32 num_nvdimms;
-	struct nd_nvdimm_device {
-		__u32 nfit_device_handle;
-		__u32 _reserved;
-		__u64 dpa;
-	} __packed devices[0];
-
-} __packed;
-
-struct nd_cmd_ars_err_inj {
-	__u64 err_inj_spa_range_base;
-	__u64 err_inj_spa_range_length;
-	__u8  err_inj_options;
-	__u32 status;
-} __packed;
-
-struct nd_cmd_ars_err_inj_clr {
-	__u64 err_inj_clr_spa_range_base;
-	__u64 err_inj_clr_spa_range_length;
-	__u32 status;
-} __packed;
-
-struct nd_cmd_ars_err_inj_stat {
-	__u32 status;
-	__u32 inj_err_rec_count;
-	struct nd_error_stat_query_record {
-		__u64 err_inj_stat_spa_range_base;
-		__u64 err_inj_stat_spa_range_length;
-	} __packed record[0];
-} __packed;
-
 enum {
 	ND_CMD_IMPLEMENTED = 0,
 
diff --git a/include/uapi/rdma/ib_user_ioctl_verbs.h b/include/uapi/rdma/ib_user_ioctl_verbs.h
new file mode 100644
index 0000000..842792e
--- /dev/null
+++ b/include/uapi/rdma/ib_user_ioctl_verbs.h
@@ -0,0 +1,84 @@
+/*
+ * Copyright (c) 2017, Mellanox Technologies inc.  All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses.  You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ *     Redistribution and use in source and binary forms, with or
+ *     without modification, are permitted provided that the following
+ *     conditions are met:
+ *
+ *      - Redistributions of source code must retain the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer.
+ *
+ *      - Redistributions in binary form must reproduce the above
+ *        copyright notice, this list of conditions and the following
+ *        disclaimer in the documentation and/or other materials
+ *        provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ */
+
+#ifndef IB_USER_IOCTL_VERBS_H
+#define IB_USER_IOCTL_VERBS_H
+
+#include <rdma/rdma_user_ioctl.h>
+
+#define UVERBS_UDATA_DRIVER_DATA_NS	1
+#define UVERBS_UDATA_DRIVER_DATA_FLAG	(1UL << UVERBS_ID_NS_SHIFT)
+
+enum uverbs_default_objects {
+	UVERBS_OBJECT_DEVICE, /* No instances of DEVICE are allowed */
+	UVERBS_OBJECT_PD,
+	UVERBS_OBJECT_COMP_CHANNEL,
+	UVERBS_OBJECT_CQ,
+	UVERBS_OBJECT_QP,
+	UVERBS_OBJECT_SRQ,
+	UVERBS_OBJECT_AH,
+	UVERBS_OBJECT_MR,
+	UVERBS_OBJECT_MW,
+	UVERBS_OBJECT_FLOW,
+	UVERBS_OBJECT_XRCD,
+	UVERBS_OBJECT_RWQ_IND_TBL,
+	UVERBS_OBJECT_WQ,
+	UVERBS_OBJECT_LAST,
+};
+
+enum {
+	UVERBS_UHW_IN = UVERBS_UDATA_DRIVER_DATA_FLAG,
+	UVERBS_UHW_OUT,
+};
+
+enum uverbs_create_cq_cmd_attr_ids {
+	CREATE_CQ_HANDLE,
+	CREATE_CQ_CQE,
+	CREATE_CQ_USER_HANDLE,
+	CREATE_CQ_COMP_CHANNEL,
+	CREATE_CQ_COMP_VECTOR,
+	CREATE_CQ_FLAGS,
+	CREATE_CQ_RESP_CQE,
+};
+
+enum uverbs_destroy_cq_cmd_attr_ids {
+	DESTROY_CQ_HANDLE,
+	DESTROY_CQ_RESP,
+};
+
+enum uverbs_actions_cq_ops {
+	UVERBS_CQ_CREATE,
+	UVERBS_CQ_DESTROY,
+};
+
+#endif
+
diff --git a/include/uapi/rdma/ib_user_verbs.h b/include/uapi/rdma/ib_user_verbs.h
index 270c350..9a0b647 100644
--- a/include/uapi/rdma/ib_user_verbs.h
+++ b/include/uapi/rdma/ib_user_verbs.h
@@ -236,6 +236,20 @@ struct ib_uverbs_rss_caps {
 	__u32 reserved;
 };
 
+struct ib_uverbs_tm_caps {
+	/* Max size of rendezvous request message */
+	__u32 max_rndv_hdr_size;
+	/* Max number of entries in tag matching list */
+	__u32 max_num_tags;
+	/* TM flags */
+	__u32 flags;
+	/* Max number of outstanding list operations */
+	__u32 max_ops;
+	/* Max number of SGE in tag matching entry */
+	__u32 max_sge;
+	__u32 reserved;
+};
+
 struct ib_uverbs_ex_query_device_resp {
 	struct ib_uverbs_query_device_resp base;
 	__u32 comp_mask;
@@ -247,6 +261,7 @@ struct ib_uverbs_ex_query_device_resp {
 	struct ib_uverbs_rss_caps rss_caps;
 	__u32  max_wq_type_rq;
 	__u32 raw_packet_caps;
+	struct ib_uverbs_tm_caps xrq_caps;
 };
 
 struct ib_uverbs_query_port {
@@ -578,7 +593,7 @@ struct ib_uverbs_ex_create_qp {
 	__u32 comp_mask;
 	__u32 create_flags;
 	__u32 rwq_ind_tbl_handle;
-	__u32  reserved1;
+	__u32  source_qpn;
 };
 
 struct ib_uverbs_open_qp {
@@ -1024,7 +1039,7 @@ struct ib_uverbs_create_xsrq {
 	__u32 max_wr;
 	__u32 max_sge;
 	__u32 srq_limit;
-	__u32 reserved;
+	__u32 max_num_tags;
 	__u32 xrcd_handle;
 	__u32 cq_handle;
 	__u64 driver_data[0];
diff --git a/include/uapi/rdma/mlx4-abi.h b/include/uapi/rdma/mlx4-abi.h
index af43175..c55f60e 100644
--- a/include/uapi/rdma/mlx4-abi.h
+++ b/include/uapi/rdma/mlx4-abi.h
@@ -95,13 +95,63 @@ struct mlx4_ib_create_srq_resp {
 	__u32	reserved;
 };
 
+struct mlx4_ib_create_qp_rss {
+	__u64   rx_hash_fields_mask;
+	__u8    rx_hash_function;
+	__u8    reserved[7];
+	__u8    rx_hash_key[40];
+	__u32   comp_mask;
+	__u32   reserved1;
+};
+
 struct mlx4_ib_create_qp {
 	__u64	buf_addr;
 	__u64	db_addr;
 	__u8	log_sq_bb_count;
 	__u8	log_sq_stride;
 	__u8	sq_no_prefetch;
-	__u8	reserved[5];
+	__u8	reserved;
+	__u32	inl_recv_sz;
+};
+
+struct mlx4_ib_create_wq {
+	__u64	buf_addr;
+	__u64	db_addr;
+	__u8	log_range_size;
+	__u8	reserved[3];
+	__u32   comp_mask;
+};
+
+struct mlx4_ib_modify_wq {
+	__u32	comp_mask;
+	__u32	reserved;
+};
+
+struct mlx4_ib_create_rwq_ind_tbl_resp {
+	__u32	response_length;
+	__u32	reserved;
+};
+
+/* RX Hash function flags */
+enum mlx4_ib_rx_hash_function_flags {
+	MLX4_IB_RX_HASH_FUNC_TOEPLITZ	= 1 << 0,
+};
+
+/*
+ * RX Hash flags. These flags select which fields of an incoming packet
+ * participate in RX Hash. Each flag represents a certain packet field;
+ * when the flag is set, the field that the flag represents will
+ * participate in the RX Hash calculation.
+ */
+enum mlx4_ib_rx_hash_fields {
+	MLX4_IB_RX_HASH_SRC_IPV4	= 1 << 0,
+	MLX4_IB_RX_HASH_DST_IPV4	= 1 << 1,
+	MLX4_IB_RX_HASH_SRC_IPV6	= 1 << 2,
+	MLX4_IB_RX_HASH_DST_IPV6	= 1 << 3,
+	MLX4_IB_RX_HASH_SRC_PORT_TCP	= 1 << 4,
+	MLX4_IB_RX_HASH_DST_PORT_TCP	= 1 << 5,
+	MLX4_IB_RX_HASH_SRC_PORT_UDP	= 1 << 6,
+	MLX4_IB_RX_HASH_DST_PORT_UDP	= 1 << 7
 };
 
 #endif /* MLX4_ABI_USER_H */
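Note on the mlx4 RSS additions above: the field flags are single bits, so a request is expressed by OR-ing them into rx_hash_fields_mask. The following is a minimal userspace sketch of that composition, not driver code. All sketch_/SKETCH_ names are hypothetical local mirrors of the definitions in the hunk, and whether the driver accepts a particular combination is not shown in this diff.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical local mirrors of the uapi values added in the hunk above. */
enum sketch_rx_hash_function_flags {
	SKETCH_RX_HASH_FUNC_TOEPLITZ	= 1 << 0,
};

enum sketch_rx_hash_fields {
	SKETCH_RX_HASH_SRC_IPV4		= 1 << 0,
	SKETCH_RX_HASH_DST_IPV4		= 1 << 1,
	SKETCH_RX_HASH_SRC_IPV6		= 1 << 2,
	SKETCH_RX_HASH_DST_IPV6		= 1 << 3,
	SKETCH_RX_HASH_SRC_PORT_TCP	= 1 << 4,
	SKETCH_RX_HASH_DST_PORT_TCP	= 1 << 5,
	SKETCH_RX_HASH_SRC_PORT_UDP	= 1 << 6,
	SKETCH_RX_HASH_DST_PORT_UDP	= 1 << 7
};

/* Same field order and sizes as struct mlx4_ib_create_qp_rss. */
struct sketch_create_qp_rss {
	uint64_t rx_hash_fields_mask;
	uint8_t  rx_hash_function;
	uint8_t  reserved[7];
	uint8_t  rx_hash_key[40];
	uint32_t comp_mask;
	uint32_t reserved1;
};

/*
 * Hypothetical helper: request hashing over the IPv4 addresses and
 * TCP ports with the Toeplitz function. Reserved fields stay zero.
 */
static void sketch_fill_rss_tcp4(struct sketch_create_qp_rss *rss,
				 const uint8_t key[40])
{
	memset(rss, 0, sizeof(*rss));
	rss->rx_hash_fields_mask = SKETCH_RX_HASH_SRC_IPV4 |
				   SKETCH_RX_HASH_DST_IPV4 |
				   SKETCH_RX_HASH_SRC_PORT_TCP |
				   SKETCH_RX_HASH_DST_PORT_TCP;
	rss->rx_hash_function = SKETCH_RX_HASH_FUNC_TOEPLITZ;
	memcpy(rss->rx_hash_key, key, sizeof(rss->rx_hash_key));
}
```

The explicit reserved[7] padding keeps the 40-byte key 8-byte aligned and brings the mirrored structure to 64 bytes on common ABIs.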
diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
index 0b3d308..1791bf1 100644
--- a/include/uapi/rdma/mlx5-abi.h
+++ b/include/uapi/rdma/mlx5-abi.h
@@ -168,6 +168,28 @@ struct mlx5_packet_pacing_caps {
 	__u32 reserved;
 };
 
+enum mlx5_ib_mpw_caps {
+	MPW_RESERVED		= 1 << 0,
+	MLX5_IB_ALLOW_MPW	= 1 << 1,
+	MLX5_IB_SUPPORT_EMPW	= 1 << 2,
+};
+
+enum mlx5_ib_sw_parsing_offloads {
+	MLX5_IB_SW_PARSING = 1 << 0,
+	MLX5_IB_SW_PARSING_CSUM = 1 << 1,
+	MLX5_IB_SW_PARSING_LSO = 1 << 2,
+};
+
+struct mlx5_ib_sw_parsing_caps {
+	__u32 sw_parsing_offloads; /* enum mlx5_ib_sw_parsing_offloads */
+
+	/* Corresponding bit will be set if qp type from
+	 * 'enum ib_qp_type' is supported, e.g.
+	 * supported_qpts |= 1 << IB_QPT_RAW_PACKET
+	 */
+	__u32 supported_qpts;
+};
+
 struct mlx5_ib_query_device_resp {
 	__u32	comp_mask;
 	__u32	response_length;
@@ -177,6 +199,7 @@ struct mlx5_ib_query_device_resp {
 	struct	mlx5_packet_pacing_caps packet_pacing_caps;
 	__u32	mlx5_ib_support_multi_pkt_send_wqes;
 	__u32	reserved;
+	struct mlx5_ib_sw_parsing_caps sw_parsing_caps;
 };
 
 struct mlx5_ib_create_cq {
diff --git a/include/uapi/rdma/qedr-abi.h b/include/uapi/rdma/qedr-abi.h
index 75c270d..54b6435 100644
--- a/include/uapi/rdma/qedr-abi.h
+++ b/include/uapi/rdma/qedr-abi.h
@@ -49,6 +49,9 @@ struct qedr_alloc_ucontext_resp {
 	__u32 sges_per_recv_wr;
 	__u32 sges_per_srq_wr;
 	__u32 max_cqes;
+	__u8 dpm_enabled;
+	__u8 wids_enabled;
+	__u16 wid_count;
 };
 
 struct qedr_alloc_pd_ureq {
diff --git a/include/uapi/rdma/rdma_netlink.h b/include/uapi/rdma/rdma_netlink.h
index 02fe839..861440a 100644
--- a/include/uapi/rdma/rdma_netlink.h
+++ b/include/uapi/rdma/rdma_netlink.h
@@ -8,7 +8,7 @@ enum {
 	RDMA_NL_IWCM,
 	RDMA_NL_RSVD,
 	RDMA_NL_LS,	/* RDMA Local Services */
-	RDMA_NL_I40IW,
+	RDMA_NL_NLDEV,	/* RDMA device interface */
 	RDMA_NL_NUM_CLIENTS
 };
 
@@ -222,4 +222,86 @@ struct rdma_nla_ls_gid {
 	__u8		gid[16];
 };
 
+enum rdma_nldev_command {
+	RDMA_NLDEV_CMD_UNSPEC,
+
+	RDMA_NLDEV_CMD_GET, /* can dump */
+	RDMA_NLDEV_CMD_SET,
+	RDMA_NLDEV_CMD_NEW,
+	RDMA_NLDEV_CMD_DEL,
+
+	RDMA_NLDEV_CMD_PORT_GET, /* can dump */
+	RDMA_NLDEV_CMD_PORT_SET,
+	RDMA_NLDEV_CMD_PORT_NEW,
+	RDMA_NLDEV_CMD_PORT_DEL,
+
+	RDMA_NLDEV_NUM_OPS
+};
+
+enum rdma_nldev_attr {
+	/* don't change the order or add anything between, this is ABI! */
+	RDMA_NLDEV_ATTR_UNSPEC,
+
+	/* Identifier for ib_device */
+	RDMA_NLDEV_ATTR_DEV_INDEX,		/* u32 */
+
+	RDMA_NLDEV_ATTR_DEV_NAME,		/* string */
+	/*
+	 * Device index together with port index are identifiers
+	 * for port/link properties.
+	 *
+	 * For RDMA_NLDEV_CMD_GET command, port index will return number
+	 * of available ports in ib_device, while for port specific operations,
+	 * it will be real port index as it appears in sysfs. Port index follows
+	 * sysfs notation and starts from 1 for the first port.
+	 */
+	RDMA_NLDEV_ATTR_PORT_INDEX,		/* u32 */
+
+	/*
+	 * Device and port capabilities
+	 */
+	RDMA_NLDEV_ATTR_CAP_FLAGS,		/* u64 */
+
+	/*
+	 * FW version
+	 */
+	RDMA_NLDEV_ATTR_FW_VERSION,		/* string */
+
+	/*
+	 * Node GUID (in host byte order) associated with the RDMA device.
+	 */
+	RDMA_NLDEV_ATTR_NODE_GUID,			/* u64 */
+
+	/*
+	 * System image GUID (in host byte order) associated with
+	 * this RDMA device and other devices which are part of a
+	 * single system.
+	 */
+	RDMA_NLDEV_ATTR_SYS_IMAGE_GUID,		/* u64 */
+
+	/*
+	 * Subnet prefix (in host byte order)
+	 */
+	RDMA_NLDEV_ATTR_SUBNET_PREFIX,		/* u64 */
+
+	/*
+	 * Local Identifier (LID).
+	 * According to the IB specification, it is a 16-bit address assigned
+	 * by the Subnet Manager. Extended to be 32-bit for OmniPath users.
+	 */
+	RDMA_NLDEV_ATTR_LID,			/* u32 */
+	RDMA_NLDEV_ATTR_SM_LID,			/* u32 */
+
+	/*
+	 * LID mask control (LMC)
+	 */
+	RDMA_NLDEV_ATTR_LMC,			/* u8 */
+
+	RDMA_NLDEV_ATTR_PORT_STATE,		/* u8 */
+	RDMA_NLDEV_ATTR_PORT_PHYS_STATE,	/* u8 */
+
+	RDMA_NLDEV_ATTR_DEV_NODE_TYPE,		/* u8 */
+
+	RDMA_NLDEV_ATTR_MAX
+};
 #endif /* _UAPI_RDMA_NETLINK_H */
diff --git a/include/uapi/rdma/rdma_user_ioctl.h b/include/uapi/rdma/rdma_user_ioctl.h
index 9388125..165a27e 100644
--- a/include/uapi/rdma/rdma_user_ioctl.h
+++ b/include/uapi/rdma/rdma_user_ioctl.h
@@ -43,6 +43,39 @@
 /* Legacy name, for user space application which already use it */
 #define IB_IOCTL_MAGIC		RDMA_IOCTL_MAGIC
 
+#define RDMA_VERBS_IOCTL \
+	_IOWR(RDMA_IOCTL_MAGIC, 1, struct ib_uverbs_ioctl_hdr)
+
+#define UVERBS_ID_NS_MASK 0xF000
+#define UVERBS_ID_NS_SHIFT 12
+
+enum {
+	/* User input */
+	UVERBS_ATTR_F_MANDATORY = 1U << 0,
+	/*
+	 * Valid output bit should be ignored and considered set in
+	 * mandatory fields. This bit is kernel output.
+	 */
+	UVERBS_ATTR_F_VALID_OUTPUT = 1U << 1,
+};
+
+struct ib_uverbs_attr {
+	__u16 attr_id;		/* command specific type attribute */
+	__u16 len;		/* only for pointers */
+	__u16 flags;		/* combination of UVERBS_ATTR_F_XXXX */
+	__u16 reserved;
+	__u64 data;		/* ptr to command, inline data or idr/fd */
+};
+
+struct ib_uverbs_ioctl_hdr {
+	__u16 length;
+	__u16 object_id;
+	__u16 method_id;
+	__u16 num_attrs;
+	__u64 reserved;
+	struct ib_uverbs_attr  attrs[0];
+};
+
 /*
  * General blocks assignments
  * It is closed on purpose do not expose it it user space
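Note on the attribute ID layout above: UVERBS_ID_NS_MASK and UVERBS_ID_NS_SHIFT reserve the top four bits of a 16-bit attribute ID for a namespace, and UVERBS_UDATA_DRIVER_DATA_FLAG places UVERBS_UHW_IN and UVERBS_UHW_OUT in namespace 1. The sketch below shows that decoding in userspace C. The SKETCH_ macros mirror the values in the diff; the two helper functions are hypothetical, and treating the low 12 bits as an index within the namespace is an assumption drawn from the mask, not something this diff states.

```c
#include <assert.h>
#include <stdint.h>

/* Local mirrors of the values defined in the diff above. */
#define SKETCH_ID_NS_MASK		0xF000
#define SKETCH_ID_NS_SHIFT		12
#define SKETCH_DRIVER_DATA_FLAG		(1UL << SKETCH_ID_NS_SHIFT)

enum {
	SKETCH_UHW_IN = SKETCH_DRIVER_DATA_FLAG,	/* 0x1000 */
	SKETCH_UHW_OUT,					/* 0x1001 */
};

/* Hypothetical helper: namespace held in the top four bits of the ID. */
static unsigned int sketch_attr_ns(uint16_t attr_id)
{
	return (attr_id & SKETCH_ID_NS_MASK) >> SKETCH_ID_NS_SHIFT;
}

/*
 * Hypothetical helper: the remaining low 12 bits, read here as the
 * index of the attribute within its namespace (an assumption).
 */
static unsigned int sketch_attr_index(uint16_t attr_id)
{
	return attr_id & (uint16_t)~SKETCH_ID_NS_MASK;
}
```

With these helpers, SKETCH_UHW_OUT decodes to namespace 1 and index 1, while a common attribute such as ID 3 decodes to namespace 0 and index 3.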
diff --git a/include/uapi/rdma/vmw_pvrdma-abi.h b/include/uapi/rdma/vmw_pvrdma-abi.h
index c8c1d2d..c6569b0 100644
--- a/include/uapi/rdma/vmw_pvrdma-abi.h
+++ b/include/uapi/rdma/vmw_pvrdma-abi.h
@@ -125,7 +125,8 @@ enum pvrdma_wc_flags {
 	PVRDMA_WC_IP_CSUM_OK		= 1 << 3,
 	PVRDMA_WC_WITH_SMAC		= 1 << 4,
 	PVRDMA_WC_WITH_VLAN		= 1 << 5,
-	PVRDMA_WC_FLAGS_MAX		= PVRDMA_WC_WITH_VLAN,
+	PVRDMA_WC_WITH_NETWORK_HDR_TYPE	= 1 << 6,
+	PVRDMA_WC_FLAGS_MAX		= PVRDMA_WC_WITH_NETWORK_HDR_TYPE,
 };
 
 struct pvrdma_alloc_ucontext_resp {
@@ -283,7 +284,8 @@ struct pvrdma_cqe {
 	__u8 dlid_path_bits;
 	__u8 port_num;
 	__u8 smac[6];
-	__u8 reserved2[7]; /* Pad to next power of 2 (64). */
+	__u8 network_hdr_type;
+	__u8 reserved2[6]; /* Pad to next power of 2 (64). */
 };
 
 #endif /* __VMW_PVRDMA_ABI_H__ */
diff --git a/init/main.c b/init/main.c
index 052481f..7ec20cf 100644
--- a/init/main.c
+++ b/init/main.c
@@ -651,8 +651,8 @@ asmlinkage __visible void __init start_kernel(void)
 	}
 #endif
 	page_ext_init();
-	debug_objects_mem_init();
 	kmemleak_init();
+	debug_objects_mem_init();
 	setup_per_cpu_pageset();
 	numa_policy_init();
 	if (late_time_init)
diff --git a/ipc/sem.c b/ipc/sem.c
index 38371e9..c6c5037 100644
--- a/ipc/sem.c
+++ b/ipc/sem.c
@@ -2091,7 +2091,8 @@ void exit_sem(struct task_struct *tsk)
 			 * possibility where we exit while freeary() didn't
 			 * finish unlocking sem_undo_list.
 			 */
-			spin_unlock_wait(&ulp->lock);
+			spin_lock(&ulp->lock);
+			spin_unlock(&ulp->lock);
 			rcu_read_unlock();
 			break;
 		}
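Note on the ipc/sem.c hunk above: spin_unlock_wait() is replaced by a spin_lock()/spin_unlock() pair. Taking and immediately dropping the lock also waits for any current holder to finish, and it inherits the ordinary acquire and release ordering of the lock operations. The following userspace sketch illustrates the idiom with a toy C11 test-and-set lock. It is not the kernel's spinlock implementation, and every sketch_ name is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Toy test-and-set lock, standing in for a kernel spinlock. */
struct sketch_spinlock {
	atomic_flag flag;
};

static void sketch_spin_lock(struct sketch_spinlock *l)
{
	/* Spin until the flag was previously clear; acquire ordering. */
	while (atomic_flag_test_and_set_explicit(&l->flag,
						 memory_order_acquire))
		;
}

static void sketch_spin_unlock(struct sketch_spinlock *l)
{
	/* Release ordering publishes the critical section's writes. */
	atomic_flag_clear_explicit(&l->flag, memory_order_release);
}

/*
 * Wait until whoever holds the lock right now has released it.
 * The empty critical section is the whole point: once the lock has
 * been acquired, the previous holder's critical section is complete
 * and its writes are visible to the caller.
 */
static void sketch_wait_for_holder(struct sketch_spinlock *l)
{
	sketch_spin_lock(l);
	sketch_spin_unlock(l);
}
```

The cost relative to a pure wait is that the waiter briefly owns the lock, so it can delay other contenders. In exchange the ordering guarantees are those of the plain lock operations, with no special cases to reason about.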
diff --git a/kernel/Makefile b/kernel/Makefile
index 4cb8e8b..9c323a6 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -108,7 +108,6 @@
 obj-$(CONFIG_JUMP_LABEL) += jump_label.o
 obj-$(CONFIG_CONTEXT_TRACKING) += context_tracking.o
 obj-$(CONFIG_TORTURE_TEST) += torture.o
-obj-$(CONFIG_MEMBARRIER) += membarrier.o
 
 obj-$(CONFIG_HAS_IOMEM) += memremap.o
 
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 4fb4631..d11c818 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -652,12 +652,27 @@ static void pcpu_copy_value(struct bpf_htab *htab, void __percpu *pptr,
 	}
 }
 
+static bool fd_htab_map_needs_adjust(const struct bpf_htab *htab)
+{
+	return htab->map.map_type == BPF_MAP_TYPE_HASH_OF_MAPS &&
+	       BITS_PER_LONG == 64;
+}
+
+static u32 htab_size_value(const struct bpf_htab *htab, bool percpu)
+{
+	u32 size = htab->map.value_size;
+
+	if (percpu || fd_htab_map_needs_adjust(htab))
+		size = round_up(size, 8);
+	return size;
+}
+
 static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 					 void *value, u32 key_size, u32 hash,
 					 bool percpu, bool onallcpus,
 					 struct htab_elem *old_elem)
 {
-	u32 size = htab->map.value_size;
+	u32 size = htab_size_value(htab, percpu);
 	bool prealloc = htab_is_prealloc(htab);
 	struct htab_elem *l_new, **pl_new;
 	void __percpu *pptr;
@@ -696,9 +711,6 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 
 	memcpy(l_new->key, key, key_size);
 	if (percpu) {
-		/* round up value_size to 8 bytes */
-		size = round_up(size, 8);
-
 		if (prealloc) {
 			pptr = htab_elem_get_ptr(l_new, key_size);
 		} else {
@@ -1209,17 +1221,9 @@ const struct bpf_map_ops htab_lru_percpu_map_ops = {
 
 static struct bpf_map *fd_htab_map_alloc(union bpf_attr *attr)
 {
-	struct bpf_map *map;
-
 	if (attr->value_size != sizeof(u32))
 		return ERR_PTR(-EINVAL);
-
-	/* pointer is stored internally */
-	attr->value_size = sizeof(void *);
-	map = htab_map_alloc(attr);
-	attr->value_size = sizeof(u32);
-
-	return map;
+	return htab_map_alloc(attr);
 }
 
 static void fd_htab_map_free(struct bpf_map *map)
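Note on the hashtab.c hunks above: instead of temporarily overwriting attr->value_size with sizeof(void *), the element size is now computed by htab_size_value(), which rounds the value size up to 8 bytes for per-CPU maps and for hash-of-maps on 64-bit kernels, where a pointer is stored internally for each user-visible u32. A minimal userspace sketch of that sizing rule follows. The sketch_ functions and their parameters are hypothetical stand-ins for the kernel helpers and for BITS_PER_LONG.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Round x up to a multiple of y, where y must be a power of two. */
static uint32_t sketch_round_up_pow2(uint32_t x, uint32_t y)
{
	return (x + y - 1) & ~(y - 1);
}

/*
 * Mirrors the decision in htab_size_value(): per-CPU values, and
 * hash-of-maps values on a 64-bit build, occupy a multiple of 8 bytes.
 * bits_per_long stands in for the kernel's BITS_PER_LONG.
 */
static uint32_t sketch_htab_size_value(uint32_t value_size, bool percpu,
				       bool is_hash_of_maps,
				       unsigned int bits_per_long)
{
	uint32_t size = value_size;

	if (percpu || (is_hash_of_maps && bits_per_long == 64))
		size = sketch_round_up_pow2(size, 8);
	return size;
}
```

For the 4-byte value that fd_htab_map_alloc() requires, this yields 8 bytes of storage on a 64-bit build and 4 bytes on a 32-bit build, while the value_size seen by user space stays sizeof(u32).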
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 8d51516..87a1213 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -1892,6 +1892,7 @@ static struct cftype files[] = {
 	{
 		.name = "memory_pressure",
 		.read_u64 = cpuset_read_u64,
+		.private = FILE_MEMORY_PRESSURE,
 	},
 
 	{
diff --git a/kernel/cpu.c b/kernel/cpu.c
index eee0331..bfbd649 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -650,6 +650,7 @@ static int takedown_cpu(unsigned int cpu)
 	__cpu_die(cpu);
 
 	tick_cleanup_dead_cpu(cpu);
+	rcutree_migrate_callbacks(cpu);
 	return 0;
 }
 
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 0e137f9..267f6ef 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1262,8 +1262,6 @@ void uprobe_end_dup_mmap(void)
 
 void uprobe_dup_mmap(struct mm_struct *oldmm, struct mm_struct *newmm)
 {
-	newmm->uprobes_state.xol_area = NULL;
-
 	if (test_bit(MMF_HAS_UPROBES, &oldmm->flags)) {
 		set_bit(MMF_HAS_UPROBES, &newmm->flags);
 		/* unconditionally, dup_mmap() skips VM_DONTCOPY vmas */
diff --git a/kernel/exit.c b/kernel/exit.c
index c5548fa..f9ef3ec 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -764,7 +764,6 @@ void __noreturn do_exit(long code)
 {
 	struct task_struct *tsk = current;
 	int group_dead;
-	TASKS_RCU(int tasks_rcu_i);
 
 	profile_task_exit(tsk);
 	kcov_task_exit(tsk);
@@ -819,7 +818,8 @@ void __noreturn do_exit(long code)
 	 * Ensure that we must observe the pi_state in exit_mm() ->
 	 * mm_release() -> exit_pi_state_list().
 	 */
-	raw_spin_unlock_wait(&tsk->pi_lock);
+	raw_spin_lock_irq(&tsk->pi_lock);
+	raw_spin_unlock_irq(&tsk->pi_lock);
 
 	if (unlikely(in_atomic())) {
 		pr_info("note: %s[%d] exited with preempt_count %d\n",
@@ -881,9 +881,7 @@ void __noreturn do_exit(long code)
 	 */
 	flush_ptrace_hw_breakpoint(tsk);
 
-	TASKS_RCU(preempt_disable());
-	TASKS_RCU(tasks_rcu_i = __srcu_read_lock(&tasks_rcu_exit_srcu));
-	TASKS_RCU(preempt_enable());
+	exit_tasks_rcu_start();
 	exit_notify(tsk, group_dead);
 	proc_exit_connector(tsk);
 	mpol_put_task_policy(tsk);
@@ -918,7 +916,7 @@ void __noreturn do_exit(long code)
 	if (tsk->nr_dirtied)
 		__this_cpu_add(dirty_throttle_leaks, tsk->nr_dirtied);
 	exit_rcu();
-	TASKS_RCU(__srcu_read_unlock(&tasks_rcu_exit_srcu, tasks_rcu_i));
+	exit_tasks_rcu_finish();
 
 	do_task_dead();
 }
diff --git a/kernel/fork.c b/kernel/fork.c
index cbbea27..b7e9e57 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -785,6 +785,13 @@ static void mm_init_owner(struct mm_struct *mm, struct task_struct *p)
 #endif
 }
 
+static void mm_init_uprobes_state(struct mm_struct *mm)
+{
+#ifdef CONFIG_UPROBES
+	mm->uprobes_state.xol_area = NULL;
+#endif
+}
+
 static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	struct user_namespace *user_ns)
 {
@@ -812,6 +819,7 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	mm->pmd_huge_pte = NULL;
 #endif
+	mm_init_uprobes_state(mm);
 
 	if (current->mm) {
 		mm->flags = current->mm->flags & MMF_INIT_MASK;
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 26db528..1c19edf 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -637,6 +637,7 @@ int kthread_worker_fn(void *worker_ptr)
 		schedule();
 
 	try_to_freeze();
+	cond_resched();
 	goto repeat;
 }
 EXPORT_SYMBOL_GPL(kthread_worker_fn);
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index fd24153..294294c 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -268,123 +268,6 @@ static __always_inline u32  __pv_wait_head_or_lock(struct qspinlock *lock,
 #define queued_spin_lock_slowpath	native_queued_spin_lock_slowpath
 #endif
 
-/*
- * Various notes on spin_is_locked() and spin_unlock_wait(), which are
- * 'interesting' functions:
- *
- * PROBLEM: some architectures have an interesting issue with atomic ACQUIRE
- * operations in that the ACQUIRE applies to the LOAD _not_ the STORE (ARM64,
- * PPC). Also qspinlock has a similar issue per construction, the setting of
- * the locked byte can be unordered acquiring the lock proper.
- *
- * This gets to be 'interesting' in the following cases, where the /should/s
- * end up false because of this issue.
- *
- *
- * CASE 1:
- *
- * So the spin_is_locked() correctness issue comes from something like:
- *
- *   CPU0				CPU1
- *
- *   global_lock();			local_lock(i)
- *     spin_lock(&G)			  spin_lock(&L[i])
- *     for (i)				  if (!spin_is_locked(&G)) {
- *       spin_unlock_wait(&L[i]);	    smp_acquire__after_ctrl_dep();
- *					    return;
- *					  }
- *					  // deal with fail
- *
- * Where it is important CPU1 sees G locked or CPU0 sees L[i] locked such
- * that there is exclusion between the two critical sections.
- *
- * The load from spin_is_locked(&G) /should/ be constrained by the ACQUIRE from
- * spin_lock(&L[i]), and similarly the load(s) from spin_unlock_wait(&L[i])
- * /should/ be constrained by the ACQUIRE from spin_lock(&G).
- *
- * Similarly, later stuff is constrained by the ACQUIRE from CTRL+RMB.
- *
- *
- * CASE 2:
- *
- * For spin_unlock_wait() there is a second correctness issue, namely:
- *
- *   CPU0				CPU1
- *
- *   flag = set;
- *   smp_mb();				spin_lock(&l)
- *   spin_unlock_wait(&l);		if (!flag)
- *					  // add to lockless list
- *					spin_unlock(&l);
- *   // iterate lockless list
- *
- * Which wants to ensure that CPU1 will stop adding bits to the list and CPU0
- * will observe the last entry on the list (if spin_unlock_wait() had ACQUIRE
- * semantics etc..)
- *
- * Where flag /should/ be ordered against the locked store of l.
- */
-
-/*
- * queued_spin_lock_slowpath() can (load-)ACQUIRE the lock before
- * issuing an _unordered_ store to set _Q_LOCKED_VAL.
- *
- * This means that the store can be delayed, but no later than the
- * store-release from the unlock. This means that simply observing
- * _Q_LOCKED_VAL is not sufficient to determine if the lock is acquired.
- *
- * There are two paths that can issue the unordered store:
- *
- *  (1) clear_pending_set_locked():	*,1,0 -> *,0,1
- *
- *  (2) set_locked():			t,0,0 -> t,0,1 ; t != 0
- *      atomic_cmpxchg_relaxed():	t,0,0 -> 0,0,1
- *
- * However, in both cases we have other !0 state we've set before to queue
- * ourseves:
- *
- * For (1) we have the atomic_cmpxchg_acquire() that set _Q_PENDING_VAL, our
- * load is constrained by that ACQUIRE to not pass before that, and thus must
- * observe the store.
- *
- * For (2) we have a more intersting scenario. We enqueue ourselves using
- * xchg_tail(), which ends up being a RELEASE. This in itself is not
- * sufficient, however that is followed by an smp_cond_acquire() on the same
- * word, giving a RELEASE->ACQUIRE ordering. This again constrains our load and
- * guarantees we must observe that store.
- *
- * Therefore both cases have other !0 state that is observable before the
- * unordered locked byte store comes through. This means we can use that to
- * wait for the lock store, and then wait for an unlock.
- */
-#ifndef queued_spin_unlock_wait
-void queued_spin_unlock_wait(struct qspinlock *lock)
-{
-	u32 val;
-
-	for (;;) {
-		val = atomic_read(&lock->val);
-
-		if (!val) /* not locked, we're done */
-			goto done;
-
-		if (val & _Q_LOCKED_MASK) /* locked, go wait for unlock */
-			break;
-
-		/* not locked, but pending, wait until we observe the lock */
-		cpu_relax();
-	}
-
-	/* any unlock is good */
-	while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
-		cpu_relax();
-
-done:
-	smp_acquire__after_ctrl_dep();
-}
-EXPORT_SYMBOL(queued_spin_unlock_wait);
-#endif
-
 #endif /* _GEN_PV_LOCK_SLOWPATH */
 
 /**
diff --git a/kernel/membarrier.c b/kernel/membarrier.c
deleted file mode 100644
index 9f9284f..0000000
--- a/kernel/membarrier.c
+++ /dev/null
@@ -1,70 +0,0 @@
-/*
- * Copyright (C) 2010, 2015 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
- *
- * membarrier system call
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- */
-
-#include <linux/syscalls.h>
-#include <linux/membarrier.h>
-#include <linux/tick.h>
-
-/*
- * Bitmask made from a "or" of all commands within enum membarrier_cmd,
- * except MEMBARRIER_CMD_QUERY.
- */
-#define MEMBARRIER_CMD_BITMASK	(MEMBARRIER_CMD_SHARED)
-
-/**
- * sys_membarrier - issue memory barriers on a set of threads
- * @cmd:   Takes command values defined in enum membarrier_cmd.
- * @flags: Currently needs to be 0. For future extensions.
- *
- * If this system call is not implemented, -ENOSYS is returned. If the
- * command specified does not exist, or if the command argument is invalid,
- * this system call returns -EINVAL. For a given command, with flags argument
- * set to 0, this system call is guaranteed to always return the same value
- * until reboot.
- *
- * All memory accesses performed in program order from each targeted thread
- * is guaranteed to be ordered with respect to sys_membarrier(). If we use
- * the semantic "barrier()" to represent a compiler barrier forcing memory
- * accesses to be performed in program order across the barrier, and
- * smp_mb() to represent explicit memory barriers forcing full memory
- * ordering across the barrier, we have the following ordering table for
- * each pair of barrier(), sys_membarrier() and smp_mb():
- *
- * The pair ordering is detailed as (O: ordered, X: not ordered):
- *
- *                        barrier()   smp_mb() sys_membarrier()
- *        barrier()          X           X            O
- *        smp_mb()           X           O            O
- *        sys_membarrier()   O           O            O
- */
-SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
-{
-	/* MEMBARRIER_CMD_SHARED is not compatible with nohz_full. */
-	if (tick_nohz_full_enabled())
-		return -ENOSYS;
-	if (unlikely(flags))
-		return -EINVAL;
-	switch (cmd) {
-	case MEMBARRIER_CMD_QUERY:
-		return MEMBARRIER_CMD_BITMASK;
-	case MEMBARRIER_CMD_SHARED:
-		if (num_online_cpus() > 1)
-			synchronize_sched();
-		return 0;
-	default:
-		return -EINVAL;
-	}
-}
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index be90c94..9210379 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -69,8 +69,7 @@
 	  This option selects the full-fledged version of SRCU.
 
 config TASKS_RCU
-	bool
-	default n
+	def_bool PREEMPT
 	select SRCU
 	help
 	  This option enables a task-based RCU implementation that uses
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 808b8c8..e4b43fe 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -356,22 +356,10 @@ do {									\
 
 #ifdef CONFIG_TINY_RCU
 /* Tiny RCU doesn't expedite, as its purpose in life is instead to be tiny. */
-static inline bool rcu_gp_is_normal(void)  /* Internal RCU use. */
-{
-	return true;
-}
-static inline bool rcu_gp_is_expedited(void)  /* Internal RCU use. */
-{
-	return false;
-}
-
-static inline void rcu_expedite_gp(void)
-{
-}
-
-static inline void rcu_unexpedite_gp(void)
-{
-}
+static inline bool rcu_gp_is_normal(void) { return true; }
+static inline bool rcu_gp_is_expedited(void) { return false; }
+static inline void rcu_expedite_gp(void) { }
+static inline void rcu_unexpedite_gp(void) { }
 #else /* #ifdef CONFIG_TINY_RCU */
 bool rcu_gp_is_normal(void);     /* Internal RCU use. */
 bool rcu_gp_is_expedited(void);  /* Internal RCU use. */
@@ -419,12 +407,8 @@ static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
 	*gpnum = 0;
 	*completed = 0;
 }
-static inline void rcutorture_record_test_transition(void)
-{
-}
-static inline void rcutorture_record_progress(unsigned long vernum)
-{
-}
+static inline void rcutorture_record_test_transition(void) { }
+static inline void rcutorture_record_progress(unsigned long vernum) { }
 #ifdef CONFIG_RCU_TRACE
 void do_trace_rcu_torture_read(const char *rcutorturename,
 			       struct rcu_head *rhp,
@@ -460,92 +444,20 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
 #endif
 
 #ifdef CONFIG_TINY_RCU
-
-/*
- * Return the number of grace periods started.
- */
-static inline unsigned long rcu_batches_started(void)
-{
-	return 0;
-}
-
-/*
- * Return the number of bottom-half grace periods started.
- */
-static inline unsigned long rcu_batches_started_bh(void)
-{
-	return 0;
-}
-
-/*
- * Return the number of sched grace periods started.
- */
-static inline unsigned long rcu_batches_started_sched(void)
-{
-	return 0;
-}
-
-/*
- * Return the number of grace periods completed.
- */
-static inline unsigned long rcu_batches_completed(void)
-{
-	return 0;
-}
-
-/*
- * Return the number of bottom-half grace periods completed.
- */
-static inline unsigned long rcu_batches_completed_bh(void)
-{
-	return 0;
-}
-
-/*
- * Return the number of sched grace periods completed.
- */
-static inline unsigned long rcu_batches_completed_sched(void)
-{
-	return 0;
-}
-
-/*
- * Return the number of expedited grace periods completed.
- */
-static inline unsigned long rcu_exp_batches_completed(void)
-{
-	return 0;
-}
-
-/*
- * Return the number of expedited sched grace periods completed.
- */
-static inline unsigned long rcu_exp_batches_completed_sched(void)
-{
-	return 0;
-}
-
-static inline unsigned long srcu_batches_completed(struct srcu_struct *sp)
-{
-	return 0;
-}
-
-static inline void rcu_force_quiescent_state(void)
-{
-}
-
-static inline void rcu_bh_force_quiescent_state(void)
-{
-}
-
-static inline void rcu_sched_force_quiescent_state(void)
-{
-}
-
-static inline void show_rcu_gp_kthreads(void)
-{
-}
-
+static inline unsigned long rcu_batches_started(void) { return 0; }
+static inline unsigned long rcu_batches_started_bh(void) { return 0; }
+static inline unsigned long rcu_batches_started_sched(void) { return 0; }
+static inline unsigned long rcu_batches_completed(void) { return 0; }
+static inline unsigned long rcu_batches_completed_bh(void) { return 0; }
+static inline unsigned long rcu_batches_completed_sched(void) { return 0; }
+static inline unsigned long rcu_exp_batches_completed(void) { return 0; }
+static inline unsigned long rcu_exp_batches_completed_sched(void) { return 0; }
+static inline unsigned long
+srcu_batches_completed(struct srcu_struct *sp) { return 0; }
+static inline void rcu_force_quiescent_state(void) { }
+static inline void rcu_bh_force_quiescent_state(void) { }
+static inline void rcu_sched_force_quiescent_state(void) { }
+static inline void show_rcu_gp_kthreads(void) { }
 #else /* #ifdef CONFIG_TINY_RCU */
 extern unsigned long rcutorture_testseq;
 extern unsigned long rcutorture_vernum;
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 2b62a38..7649fcd 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -36,24 +36,6 @@ void rcu_cblist_init(struct rcu_cblist *rclp)
 }
 
 /*
- * Debug function to actually count the number of callbacks.
- * If the number exceeds the limit specified, return -1.
- */
-long rcu_cblist_count_cbs(struct rcu_cblist *rclp, long lim)
-{
-	int cnt = 0;
-	struct rcu_head **rhpp = &rclp->head;
-
-	for (;;) {
-		if (!*rhpp)
-			return cnt;
-		if (++cnt > lim)
-			return -1;
-		rhpp = &(*rhpp)->next;
-	}
-}
-
-/*
  * Dequeue the oldest rcu_head structure from the specified callback
  * list.  This function assumes that the callback is non-lazy, but
  * the caller can later invoke rcu_cblist_dequeued_lazy() if it
@@ -103,17 +85,6 @@ void rcu_segcblist_disable(struct rcu_segcblist *rsclp)
 }
 
 /*
- * Is the specified segment of the specified rcu_segcblist structure
- * empty of callbacks?
- */
-bool rcu_segcblist_segempty(struct rcu_segcblist *rsclp, int seg)
-{
-	if (seg == RCU_DONE_TAIL)
-		return &rsclp->head == rsclp->tails[RCU_DONE_TAIL];
-	return rsclp->tails[seg - 1] == rsclp->tails[seg];
-}
-
-/*
  * Does the specified rcu_segcblist structure contain callbacks that
  * are ready to be invoked?
  */
@@ -134,50 +105,6 @@ bool rcu_segcblist_pend_cbs(struct rcu_segcblist *rsclp)
 }
 
 /*
- * Dequeue and return the first ready-to-invoke callback.  If there
- * are no ready-to-invoke callbacks, return NULL.  Disables interrupts
- * to avoid interference.  Does not protect from interference from other
- * CPUs or tasks.
- */
-struct rcu_head *rcu_segcblist_dequeue(struct rcu_segcblist *rsclp)
-{
-	unsigned long flags;
-	int i;
-	struct rcu_head *rhp;
-
-	local_irq_save(flags);
-	if (!rcu_segcblist_ready_cbs(rsclp)) {
-		local_irq_restore(flags);
-		return NULL;
-	}
-	rhp = rsclp->head;
-	BUG_ON(!rhp);
-	rsclp->head = rhp->next;
-	for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++) {
-		if (rsclp->tails[i] != &rhp->next)
-			break;
-		rsclp->tails[i] = &rsclp->head;
-	}
-	smp_mb(); /* Dequeue before decrement for rcu_barrier(). */
-	WRITE_ONCE(rsclp->len, rsclp->len - 1);
-	local_irq_restore(flags);
-	return rhp;
-}
-
-/*
- * Account for the fact that a previously dequeued callback turned out
- * to be marked as lazy.
- */
-void rcu_segcblist_dequeued_lazy(struct rcu_segcblist *rsclp)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	rsclp->len_lazy--;
-	local_irq_restore(flags);
-}
-
-/*
  * Return a pointer to the first callback in the specified rcu_segcblist
  * structure.  This is useful for diagnostics.
  */
@@ -203,17 +130,6 @@ struct rcu_head *rcu_segcblist_first_pend_cb(struct rcu_segcblist *rsclp)
 }
 
 /*
- * Does the specified rcu_segcblist structure contain callbacks that
- * have not yet been processed beyond having been posted, that is,
- * does it contain callbacks in its last segment?
- */
-bool rcu_segcblist_new_cbs(struct rcu_segcblist *rsclp)
-{
-	return rcu_segcblist_is_enabled(rsclp) &&
-	       !rcu_segcblist_restempty(rsclp, RCU_NEXT_READY_TAIL);
-}
-
-/*
  * Enqueue the specified callback onto the specified rcu_segcblist
  * structure, updating accounting as needed.  Note that the ->len
  * field may be accessed locklessly, hence the WRITE_ONCE().
@@ -503,3 +419,27 @@ bool rcu_segcblist_future_gp_needed(struct rcu_segcblist *rsclp,
 			return true;
 	return false;
 }
+
+/*
+ * Merge the source rcu_segcblist structure into the destination
+ * rcu_segcblist structure, then initialize the source.  Any pending
+ * callbacks from the source get to start over.  It is best to
+ * advance and accelerate both the destination and the source
+ * before merging.
+ */
+void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
+			 struct rcu_segcblist *src_rsclp)
+{
+	struct rcu_cblist donecbs;
+	struct rcu_cblist pendcbs;
+
+	rcu_cblist_init(&donecbs);
+	rcu_cblist_init(&pendcbs);
+	rcu_segcblist_extract_count(src_rsclp, &donecbs);
+	rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
+	rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
+	rcu_segcblist_insert_count(dst_rsclp, &donecbs);
+	rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
+	rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
+	rcu_segcblist_init(src_rsclp);
+}
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 6e36e36..581c12b 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -31,29 +31,7 @@ static inline void rcu_cblist_dequeued_lazy(struct rcu_cblist *rclp)
 	rclp->len_lazy--;
 }
 
-/*
- * Interim function to return rcu_cblist head pointer.  Longer term, the
- * rcu_cblist will be used more pervasively, removing the need for this
- * function.
- */
-static inline struct rcu_head *rcu_cblist_head(struct rcu_cblist *rclp)
-{
-	return rclp->head;
-}
-
-/*
- * Interim function to return rcu_cblist head pointer.  Longer term, the
- * rcu_cblist will be used more pervasively, removing the need for this
- * function.
- */
-static inline struct rcu_head **rcu_cblist_tail(struct rcu_cblist *rclp)
-{
-	WARN_ON_ONCE(!rclp->head);
-	return rclp->tail;
-}
-
 void rcu_cblist_init(struct rcu_cblist *rclp);
-long rcu_cblist_count_cbs(struct rcu_cblist *rclp, long lim);
 struct rcu_head *rcu_cblist_dequeue(struct rcu_cblist *rclp);
 
 /*
@@ -134,14 +112,10 @@ static inline struct rcu_head **rcu_segcblist_tail(struct rcu_segcblist *rsclp)
 
 void rcu_segcblist_init(struct rcu_segcblist *rsclp);
 void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
-bool rcu_segcblist_segempty(struct rcu_segcblist *rsclp, int seg);
 bool rcu_segcblist_ready_cbs(struct rcu_segcblist *rsclp);
 bool rcu_segcblist_pend_cbs(struct rcu_segcblist *rsclp);
-struct rcu_head *rcu_segcblist_dequeue(struct rcu_segcblist *rsclp);
-void rcu_segcblist_dequeued_lazy(struct rcu_segcblist *rsclp);
 struct rcu_head *rcu_segcblist_first_cb(struct rcu_segcblist *rsclp);
 struct rcu_head *rcu_segcblist_first_pend_cb(struct rcu_segcblist *rsclp);
-bool rcu_segcblist_new_cbs(struct rcu_segcblist *rsclp);
 void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
 			   struct rcu_head *rhp, bool lazy);
 bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
@@ -162,3 +136,5 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq);
 bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq);
 bool rcu_segcblist_future_gp_needed(struct rcu_segcblist *rsclp,
 				    unsigned long seq);
+void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
+			 struct rcu_segcblist *src_rsclp);
diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index 3cc1811..1f87a02 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -317,8 +317,6 @@ static struct rcu_perf_ops sched_ops = {
 	.name		= "sched"
 };
 
-#ifdef CONFIG_TASKS_RCU
-
 /*
  * Definitions for RCU-tasks perf testing.
  */
@@ -346,24 +344,11 @@ static struct rcu_perf_ops tasks_ops = {
 	.name		= "tasks"
 };
 
-#define RCUPERF_TASKS_OPS &tasks_ops,
-
 static bool __maybe_unused torturing_tasks(void)
 {
 	return cur_ops == &tasks_ops;
 }
 
-#else /* #ifdef CONFIG_TASKS_RCU */
-
-#define RCUPERF_TASKS_OPS
-
-static bool __maybe_unused torturing_tasks(void)
-{
-	return false;
-}
-
-#endif /* #else #ifdef CONFIG_TASKS_RCU */
-
 /*
  * If performance tests complete, wait for shutdown to commence.
  */
@@ -658,7 +643,7 @@ rcu_perf_init(void)
 	int firsterr = 0;
 	static struct rcu_perf_ops *perf_ops[] = {
 		&rcu_ops, &rcu_bh_ops, &srcu_ops, &srcud_ops, &sched_ops,
-		RCUPERF_TASKS_OPS
+		&tasks_ops,
 	};
 
 	if (!torture_init_begin(perf_type, verbose, &perf_runnable))
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index b8f7f8c..45f2ffbc 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -199,7 +199,8 @@ MODULE_PARM_DESC(torture_runnable, "Start rcutorture at boot");
 static u64 notrace rcu_trace_clock_local(void)
 {
 	u64 ts = trace_clock_local();
-	unsigned long __maybe_unused ts_rem = do_div(ts, NSEC_PER_USEC);
+
+	(void)do_div(ts, NSEC_PER_USEC);
 	return ts;
 }
 #else /* #ifdef CONFIG_RCU_TRACE */
@@ -496,7 +497,7 @@ static struct rcu_torture_ops rcu_busted_ops = {
 	.fqs		= NULL,
 	.stats		= NULL,
 	.irq_capable	= 1,
-	.name		= "rcu_busted"
+	.name		= "busted"
 };
 
 /*
@@ -522,7 +523,7 @@ static void srcu_read_delay(struct torture_random_state *rrsp)
 
 	delay = torture_random(rrsp) %
 		(nrealreaders * 2 * longdelay * uspertick);
-	if (!delay)
+	if (!delay && in_task())
 		schedule_timeout_interruptible(longdelay);
 	else
 		rcu_read_delay(rrsp);
@@ -561,44 +562,7 @@ static void srcu_torture_barrier(void)
 
 static void srcu_torture_stats(void)
 {
-	int __maybe_unused cpu;
-	int idx;
-
-#ifdef CONFIG_TREE_SRCU
-	idx = srcu_ctlp->srcu_idx & 0x1;
-	pr_alert("%s%s Tree SRCU per-CPU(idx=%d):",
-		 torture_type, TORTURE_FLAG, idx);
-	for_each_possible_cpu(cpu) {
-		unsigned long l0, l1;
-		unsigned long u0, u1;
-		long c0, c1;
-		struct srcu_data *counts;
-
-		counts = per_cpu_ptr(srcu_ctlp->sda, cpu);
-		u0 = counts->srcu_unlock_count[!idx];
-		u1 = counts->srcu_unlock_count[idx];
-
-		/*
-		 * Make sure that a lock is always counted if the corresponding
-		 * unlock is counted.
-		 */
-		smp_rmb();
-
-		l0 = counts->srcu_lock_count[!idx];
-		l1 = counts->srcu_lock_count[idx];
-
-		c0 = l0 - u0;
-		c1 = l1 - u1;
-		pr_cont(" %d(%ld,%ld)", cpu, c0, c1);
-	}
-	pr_cont("\n");
-#elif defined(CONFIG_TINY_SRCU)
-	idx = READ_ONCE(srcu_ctlp->srcu_idx) & 0x1;
-	pr_alert("%s%s Tiny SRCU per-CPU(idx=%d): (%hd,%hd)\n",
-		 torture_type, TORTURE_FLAG, idx,
-		 READ_ONCE(srcu_ctlp->srcu_lock_nesting[!idx]),
-		 READ_ONCE(srcu_ctlp->srcu_lock_nesting[idx]));
-#endif
+	srcu_torture_stats_print(srcu_ctlp, torture_type, TORTURE_FLAG);
 }
 
 static void srcu_torture_synchronize_expedited(void)
@@ -620,6 +584,7 @@ static struct rcu_torture_ops srcu_ops = {
 	.call		= srcu_torture_call,
 	.cb_barrier	= srcu_torture_barrier,
 	.stats		= srcu_torture_stats,
+	.irq_capable	= 1,
 	.name		= "srcu"
 };
 
@@ -652,6 +617,7 @@ static struct rcu_torture_ops srcud_ops = {
 	.call		= srcu_torture_call,
 	.cb_barrier	= srcu_torture_barrier,
 	.stats		= srcu_torture_stats,
+	.irq_capable	= 1,
 	.name		= "srcud"
 };
 
@@ -696,8 +662,6 @@ static struct rcu_torture_ops sched_ops = {
 	.name		= "sched"
 };
 
-#ifdef CONFIG_TASKS_RCU
-
 /*
  * Definitions for RCU-tasks torture testing.
  */
@@ -735,24 +699,11 @@ static struct rcu_torture_ops tasks_ops = {
 	.name		= "tasks"
 };
 
-#define RCUTORTURE_TASKS_OPS &tasks_ops,
-
 static bool __maybe_unused torturing_tasks(void)
 {
 	return cur_ops == &tasks_ops;
 }
 
-#else /* #ifdef CONFIG_TASKS_RCU */
-
-#define RCUTORTURE_TASKS_OPS
-
-static bool __maybe_unused torturing_tasks(void)
-{
-	return false;
-}
-
-#endif /* #else #ifdef CONFIG_TASKS_RCU */
-
 /*
  * RCU torture priority-boost testing.  Runs one real-time thread per
  * CPU for moderate bursts, repeatedly registering RCU callbacks and
@@ -1114,6 +1065,11 @@ rcu_torture_fakewriter(void *arg)
 	return 0;
 }
 
+static void rcu_torture_timer_cb(struct rcu_head *rhp)
+{
+	kfree(rhp);
+}
+
 /*
  * RCU torture reader from timer handler.  Dereferences rcu_torture_current,
  * incrementing the corresponding element of the pipeline array.  The
@@ -1176,6 +1132,14 @@ static void rcu_torture_timer(unsigned long unused)
 	__this_cpu_inc(rcu_torture_batch[completed]);
 	preempt_enable();
 	cur_ops->readunlock(idx);
+
+	/* Test call_rcu() invocation from interrupt handler. */
+	if (cur_ops->call) {
+		struct rcu_head *rhp = kmalloc(sizeof(*rhp), GFP_NOWAIT);
+
+		if (rhp)
+			cur_ops->call(rhp, rcu_torture_timer_cb);
+	}
 }
 
 /*
@@ -1354,11 +1318,12 @@ rcu_torture_stats_print(void)
 		srcutorture_get_gp_data(cur_ops->ttype, srcu_ctlp,
 					&flags, &gpnum, &completed);
 		wtp = READ_ONCE(writer_task);
-		pr_alert("??? Writer stall state %s(%d) g%lu c%lu f%#x ->state %#lx\n",
+		pr_alert("??? Writer stall state %s(%d) g%lu c%lu f%#x ->state %#lx cpu %d\n",
 			 rcu_torture_writer_state_getname(),
 			 rcu_torture_writer_state,
 			 gpnum, completed, flags,
-			 wtp == NULL ? ~0UL : wtp->state);
+			 wtp == NULL ? ~0UL : wtp->state,
+			 wtp == NULL ? -1 : (int)task_cpu(wtp));
 		show_rcu_gp_kthreads();
 		rcu_ftrace_dump(DUMP_ALL);
 	}
@@ -1749,7 +1714,7 @@ rcu_torture_init(void)
 	int firsterr = 0;
 	static struct rcu_torture_ops *torture_ops[] = {
 		&rcu_ops, &rcu_bh_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops,
-		&sched_ops, RCUTORTURE_TASKS_OPS
+		&sched_ops, &tasks_ops,
 	};
 
 	if (!torture_init_begin(torture_type, verbose, &torture_runnable))
diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
index 1a1c104..76ac5f5 100644
--- a/kernel/rcu/srcutiny.c
+++ b/kernel/rcu/srcutiny.c
@@ -33,6 +33,8 @@
 #include "rcu_segcblist.h"
 #include "rcu.h"
 
+int rcu_scheduler_active __read_mostly;
+
 static int init_srcu_struct_fields(struct srcu_struct *sp)
 {
 	sp->srcu_lock_nesting[0] = 0;
@@ -193,3 +195,9 @@ void synchronize_srcu(struct srcu_struct *sp)
 	destroy_rcu_head_on_stack(&rs.head);
 }
 EXPORT_SYMBOL_GPL(synchronize_srcu);
+
+/* Lockdep diagnostics.  */
+void __init rcu_scheduler_starting(void)
+{
+	rcu_scheduler_active = RCU_SCHEDULER_RUNNING;
+}
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index d0ca524..729a870 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -51,6 +51,7 @@ module_param(counter_wrap_check, ulong, 0444);
 
 static void srcu_invoke_callbacks(struct work_struct *work);
 static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay);
+static void process_srcu(struct work_struct *work);
 
 /*
  * Initialize SRCU combining tree.  Note that statically allocated
@@ -896,6 +897,15 @@ static void __synchronize_srcu(struct srcu_struct *sp, bool do_norm)
 	__call_srcu(sp, &rcu.head, wakeme_after_rcu, do_norm);
 	wait_for_completion(&rcu.completion);
 	destroy_rcu_head_on_stack(&rcu.head);
+
+	/*
+	 * Make sure that later code is ordered after the SRCU grace
+	 * period.  This pairs with the raw_spin_lock_irq_rcu_node()
+	 * in srcu_invoke_callbacks().  Unlike Tree RCU, this is needed
+	 * because the current CPU might have been totally uninvolved with
+	 * (and thus unordered against) that grace period.
+	 */
+	smp_mb();
 }
 
 /**
@@ -1194,7 +1204,7 @@ static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay)
 /*
  * This is the work-queue function that handles SRCU grace periods.
  */
-void process_srcu(struct work_struct *work)
+static void process_srcu(struct work_struct *work)
 {
 	struct srcu_struct *sp;
 
@@ -1203,7 +1213,6 @@ void process_srcu(struct work_struct *work)
 	srcu_advance_state(sp);
 	srcu_reschedule(sp, srcu_get_delay(sp));
 }
-EXPORT_SYMBOL_GPL(process_srcu);
 
 void srcutorture_get_gp_data(enum rcutorture_type test_type,
 			     struct srcu_struct *sp, int *flags,
@@ -1217,6 +1226,43 @@ void srcutorture_get_gp_data(enum rcutorture_type test_type,
 }
 EXPORT_SYMBOL_GPL(srcutorture_get_gp_data);
 
+void srcu_torture_stats_print(struct srcu_struct *sp, char *tt, char *tf)
+{
+	int cpu;
+	int idx;
+	unsigned long s0 = 0, s1 = 0;
+
+	idx = sp->srcu_idx & 0x1;
+	pr_alert("%s%s Tree SRCU per-CPU(idx=%d):", tt, tf, idx);
+	for_each_possible_cpu(cpu) {
+		unsigned long l0, l1;
+		unsigned long u0, u1;
+		long c0, c1;
+		struct srcu_data *counts;
+
+		counts = per_cpu_ptr(sp->sda, cpu);
+		u0 = counts->srcu_unlock_count[!idx];
+		u1 = counts->srcu_unlock_count[idx];
+
+		/*
+		 * Make sure that a lock is always counted if the corresponding
+		 * unlock is counted.
+		 */
+		smp_rmb();
+
+		l0 = counts->srcu_lock_count[!idx];
+		l1 = counts->srcu_lock_count[idx];
+
+		c0 = l0 - u0;
+		c1 = l1 - u1;
+		pr_cont(" %d(%ld,%ld)", cpu, c0, c1);
+		s0 += c0;
+		s1 += c1;
+	}
+	pr_cont(" T(%ld,%ld)\n", s0, s1);
+}
+EXPORT_SYMBOL_GPL(srcu_torture_stats_print);
+
 static int __init srcu_bootup_announce(void)
 {
 	pr_info("Hierarchical SRCU implementation.\n");
diff --git a/kernel/rcu/tiny.c b/kernel/rcu/tiny.c
index f848896..a64eee0 100644
--- a/kernel/rcu/tiny.c
+++ b/kernel/rcu/tiny.c
@@ -56,8 +56,6 @@ static struct rcu_ctrlblk rcu_bh_ctrlblk = {
 	.curtail	= &rcu_bh_ctrlblk.rcucblist,
 };
 
-#include "tiny_plugin.h"
-
 void rcu_barrier_bh(void)
 {
 	wait_rcu_gp(call_rcu_bh);
diff --git a/kernel/rcu/tiny_plugin.h b/kernel/rcu/tiny_plugin.h
deleted file mode 100644
index f0a01b2..0000000
--- a/kernel/rcu/tiny_plugin.h
+++ /dev/null
@@ -1,47 +0,0 @@
-/*
- * Read-Copy Update mechanism for mutual exclusion, the Bloatwatch edition
- * Internal non-public definitions that provide either classic
- * or preemptible semantics.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, you can access it online at
- * http://www.gnu.org/licenses/gpl-2.0.html.
- *
- * Copyright (c) 2010 Linaro
- *
- * Author: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
- */
-
-#if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU)
-#include <linux/kernel_stat.h>
-
-int rcu_scheduler_active __read_mostly;
-EXPORT_SYMBOL_GPL(rcu_scheduler_active);
-
-/*
- * During boot, we forgive RCU lockdep issues.  After this function is
- * invoked, we start taking RCU lockdep issues seriously.  Note that unlike
- * Tree RCU, Tiny RCU transitions directly from RCU_SCHEDULER_INACTIVE
- * to RCU_SCHEDULER_RUNNING, skipping the RCU_SCHEDULER_INIT stage.
- * The reason for this is that Tiny RCU does not need kthreads, so does
- * not have to care about the fact that the scheduler is half-initialized
- * at a certain phase of the boot process.  Unless SRCU is in the mix.
- */
-void __init rcu_scheduler_starting(void)
-{
-	WARN_ON(nr_context_switches() > 0);
-	rcu_scheduler_active = IS_ENABLED(CONFIG_SRCU)
-		? RCU_SCHEDULER_INIT : RCU_SCHEDULER_RUNNING;
-}
-
-#endif /* #if defined(CONFIG_DEBUG_LOCK_ALLOC) || defined(CONFIG_SRCU) */
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 51d4c3a..84fe966 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -97,9 +97,6 @@ struct rcu_state sname##_state = { \
 	.gp_state = RCU_GP_IDLE, \
 	.gpnum = 0UL - 300UL, \
 	.completed = 0UL - 300UL, \
-	.orphan_lock = __RAW_SPIN_LOCK_UNLOCKED(&sname##_state.orphan_lock), \
-	.orphan_pend = RCU_CBLIST_INITIALIZER(sname##_state.orphan_pend), \
-	.orphan_done = RCU_CBLIST_INITIALIZER(sname##_state.orphan_done), \
 	.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
 	.name = RCU_STATE_NAME(sname), \
 	.abbr = sabbr, \
@@ -843,13 +840,9 @@ static void rcu_eqs_enter(bool user)
  */
 void rcu_idle_enter(void)
 {
-	unsigned long flags;
-
-	local_irq_save(flags);
+	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_idle_enter() invoked with irqs enabled!!!");
 	rcu_eqs_enter(false);
-	local_irq_restore(flags);
 }
-EXPORT_SYMBOL_GPL(rcu_idle_enter);
 
 #ifdef CONFIG_NO_HZ_FULL
 /**
@@ -862,7 +855,8 @@ EXPORT_SYMBOL_GPL(rcu_idle_enter);
  */
 void rcu_user_enter(void)
 {
-	rcu_eqs_enter(1);
+	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_user_enter() invoked with irqs enabled!!!");
+	rcu_eqs_enter(true);
 }
 #endif /* CONFIG_NO_HZ_FULL */
 
@@ -955,8 +949,10 @@ static void rcu_eqs_exit(bool user)
 	if (oldval & DYNTICK_TASK_NEST_MASK) {
 		rdtp->dynticks_nesting += DYNTICK_TASK_NEST_VALUE;
 	} else {
+		__this_cpu_inc(disable_rcu_irq_enter);
 		rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
 		rcu_eqs_exit_common(oldval, user);
+		__this_cpu_dec(disable_rcu_irq_enter);
 	}
 }
 
@@ -979,7 +975,6 @@ void rcu_idle_exit(void)
 	rcu_eqs_exit(false);
 	local_irq_restore(flags);
 }
-EXPORT_SYMBOL_GPL(rcu_idle_exit);
 
 #ifdef CONFIG_NO_HZ_FULL
 /**
@@ -1358,12 +1353,13 @@ static void rcu_check_gp_kthread_starvation(struct rcu_state *rsp)
 	j = jiffies;
 	gpa = READ_ONCE(rsp->gp_activity);
 	if (j - gpa > 2 * HZ) {
-		pr_err("%s kthread starved for %ld jiffies! g%lu c%lu f%#x %s(%d) ->state=%#lx\n",
+		pr_err("%s kthread starved for %ld jiffies! g%lu c%lu f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
 		       rsp->name, j - gpa,
 		       rsp->gpnum, rsp->completed,
 		       rsp->gp_flags,
 		       gp_state_getname(rsp->gp_state), rsp->gp_state,
-		       rsp->gp_kthread ? rsp->gp_kthread->state : ~0);
+		       rsp->gp_kthread ? rsp->gp_kthread->state : ~0,
+		       rsp->gp_kthread ? task_cpu(rsp->gp_kthread) : -1);
 		if (rsp->gp_kthread) {
 			sched_show_task(rsp->gp_kthread);
 			wake_up_process(rsp->gp_kthread);
@@ -2067,8 +2063,8 @@ static bool rcu_gp_init(struct rcu_state *rsp)
 }
 
 /*
- * Helper function for wait_event_interruptible_timeout() wakeup
- * at force-quiescent-state time.
+ * Helper function for swait_event_idle() wakeup at force-quiescent-state
+ * time.
  */
 static bool rcu_gp_fqs_check_wake(struct rcu_state *rsp, int *gfp)
 {
@@ -2206,9 +2202,8 @@ static int __noreturn rcu_gp_kthread(void *arg)
 					       READ_ONCE(rsp->gpnum),
 					       TPS("reqwait"));
 			rsp->gp_state = RCU_GP_WAIT_GPS;
-			swait_event_interruptible(rsp->gp_wq,
-						 READ_ONCE(rsp->gp_flags) &
-						 RCU_GP_FLAG_INIT);
+			swait_event_idle(rsp->gp_wq, READ_ONCE(rsp->gp_flags) &
+						     RCU_GP_FLAG_INIT);
 			rsp->gp_state = RCU_GP_DONE_GPS;
 			/* Locking provides needed memory barrier. */
 			if (rcu_gp_init(rsp))
@@ -2239,7 +2234,7 @@ static int __noreturn rcu_gp_kthread(void *arg)
 					       READ_ONCE(rsp->gpnum),
 					       TPS("fqswait"));
 			rsp->gp_state = RCU_GP_WAIT_FQS;
-			ret = swait_event_interruptible_timeout(rsp->gp_wq,
+			ret = swait_event_idle_timeout(rsp->gp_wq,
 					rcu_gp_fqs_check_wake(rsp, &gf), j);
 			rsp->gp_state = RCU_GP_DOING_FQS;
 			/* Locking provides needed memory barriers. */
@@ -2409,6 +2404,8 @@ rcu_report_qs_rnp(unsigned long mask, struct rcu_state *rsp,
 			return;
 		}
 		WARN_ON_ONCE(oldmask); /* Any child must be all zeroed! */
+		WARN_ON_ONCE(rnp->level != rcu_num_lvls - 1 &&
+			     rcu_preempt_blocked_readers_cgp(rnp));
 		rnp->qsmask &= ~mask;
 		trace_rcu_quiescent_state_report(rsp->name, rnp->gpnum,
 						 mask, rnp->qsmask, rnp->level,
@@ -2563,85 +2560,6 @@ rcu_check_quiescent_state(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
- * Send the specified CPU's RCU callbacks to the orphanage.  The
- * specified CPU must be offline, and the caller must hold the
- * ->orphan_lock.
- */
-static void
-rcu_send_cbs_to_orphanage(int cpu, struct rcu_state *rsp,
-			  struct rcu_node *rnp, struct rcu_data *rdp)
-{
-	lockdep_assert_held(&rsp->orphan_lock);
-
-	/* No-CBs CPUs do not have orphanable callbacks. */
-	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) || rcu_is_nocb_cpu(rdp->cpu))
-		return;
-
-	/*
-	 * Orphan the callbacks.  First adjust the counts.  This is safe
-	 * because _rcu_barrier() excludes CPU-hotplug operations, so it
-	 * cannot be running now.  Thus no memory barrier is required.
-	 */
-	rdp->n_cbs_orphaned += rcu_segcblist_n_cbs(&rdp->cblist);
-	rcu_segcblist_extract_count(&rdp->cblist, &rsp->orphan_done);
-
-	/*
-	 * Next, move those callbacks still needing a grace period to
-	 * the orphanage, where some other CPU will pick them up.
-	 * Some of the callbacks might have gone partway through a grace
-	 * period, but that is too bad.  They get to start over because we
-	 * cannot assume that grace periods are synchronized across CPUs.
-	 */
-	rcu_segcblist_extract_pend_cbs(&rdp->cblist, &rsp->orphan_pend);
-
-	/*
-	 * Then move the ready-to-invoke callbacks to the orphanage,
-	 * where some other CPU will pick them up.  These will not be
-	 * required to pass though another grace period: They are done.
-	 */
-	rcu_segcblist_extract_done_cbs(&rdp->cblist, &rsp->orphan_done);
-
-	/* Finally, disallow further callbacks on this CPU.  */
-	rcu_segcblist_disable(&rdp->cblist);
-}
-
-/*
- * Adopt the RCU callbacks from the specified rcu_state structure's
- * orphanage.  The caller must hold the ->orphan_lock.
- */
-static void rcu_adopt_orphan_cbs(struct rcu_state *rsp, unsigned long flags)
-{
-	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
-
-	lockdep_assert_held(&rsp->orphan_lock);
-
-	/* No-CBs CPUs are handled specially. */
-	if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) ||
-	    rcu_nocb_adopt_orphan_cbs(rsp, rdp, flags))
-		return;
-
-	/* Do the accounting first. */
-	rdp->n_cbs_adopted += rsp->orphan_done.len;
-	if (rsp->orphan_done.len_lazy != rsp->orphan_done.len)
-		rcu_idle_count_callbacks_posted();
-	rcu_segcblist_insert_count(&rdp->cblist, &rsp->orphan_done);
-
-	/*
-	 * We do not need a memory barrier here because the only way we
-	 * can get here if there is an rcu_barrier() in flight is if
-	 * we are the task doing the rcu_barrier().
-	 */
-
-	/* First adopt the ready-to-invoke callbacks, then the done ones. */
-	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rsp->orphan_done);
-	WARN_ON_ONCE(rsp->orphan_done.head);
-	rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rsp->orphan_pend);
-	WARN_ON_ONCE(rsp->orphan_pend.head);
-	WARN_ON_ONCE(rcu_segcblist_empty(&rdp->cblist) !=
-		     !rcu_segcblist_n_cbs(&rdp->cblist));
-}
-
-/*
  * Trace the fact that this CPU is going offline.
  */
 static void rcu_cleanup_dying_cpu(struct rcu_state *rsp)
@@ -2704,14 +2622,12 @@ static void rcu_cleanup_dead_rnp(struct rcu_node *rnp_leaf)
 
 /*
  * The CPU has been completely removed, and some other CPU is reporting
- * this fact from process context.  Do the remainder of the cleanup,
- * including orphaning the outgoing CPU's RCU callbacks, and also
- * adopting them.  There can only be one CPU hotplug operation at a time,
- * so no other CPU can be attempting to update rcu_cpu_kthread_task.
+ * this fact from process context.  Do the remainder of the cleanup.
+ * There can only be one CPU hotplug operation at a time, so no need for
+ * explicit locking.
  */
 static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
 {
-	unsigned long flags;
 	struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
 	struct rcu_node *rnp = rdp->mynode;  /* Outgoing CPU's rdp & rnp. */
 
@@ -2720,18 +2636,6 @@ static void rcu_cleanup_dead_cpu(int cpu, struct rcu_state *rsp)
 
 	/* Adjust any no-longer-needed kthreads. */
 	rcu_boost_kthread_setaffinity(rnp, -1);
-
-	/* Orphan the dead CPU's callbacks, and adopt them if appropriate. */
-	raw_spin_lock_irqsave(&rsp->orphan_lock, flags);
-	rcu_send_cbs_to_orphanage(cpu, rsp, rnp, rdp);
-	rcu_adopt_orphan_cbs(rsp, flags);
-	raw_spin_unlock_irqrestore(&rsp->orphan_lock, flags);
-
-	WARN_ONCE(rcu_segcblist_n_cbs(&rdp->cblist) != 0 ||
-		  !rcu_segcblist_empty(&rdp->cblist),
-		  "rcu_cleanup_dead_cpu: Callbacks on offline CPU %d: qlen=%lu, 1stCB=%p\n",
-		  cpu, rcu_segcblist_n_cbs(&rdp->cblist),
-		  rcu_segcblist_first_cb(&rdp->cblist));
 }
 
 /*
@@ -3569,10 +3473,11 @@ static void rcu_barrier_callback(struct rcu_head *rhp)
 	struct rcu_state *rsp = rdp->rsp;
 
 	if (atomic_dec_and_test(&rsp->barrier_cpu_count)) {
-		_rcu_barrier_trace(rsp, "LastCB", -1, rsp->barrier_sequence);
+		_rcu_barrier_trace(rsp, TPS("LastCB"), -1,
+				   rsp->barrier_sequence);
 		complete(&rsp->barrier_completion);
 	} else {
-		_rcu_barrier_trace(rsp, "CB", -1, rsp->barrier_sequence);
+		_rcu_barrier_trace(rsp, TPS("CB"), -1, rsp->barrier_sequence);
 	}
 }
 
@@ -3584,14 +3489,15 @@ static void rcu_barrier_func(void *type)
 	struct rcu_state *rsp = type;
 	struct rcu_data *rdp = raw_cpu_ptr(rsp->rda);
 
-	_rcu_barrier_trace(rsp, "IRQ", -1, rsp->barrier_sequence);
+	_rcu_barrier_trace(rsp, TPS("IRQ"), -1, rsp->barrier_sequence);
 	rdp->barrier_head.func = rcu_barrier_callback;
 	debug_rcu_head_queue(&rdp->barrier_head);
 	if (rcu_segcblist_entrain(&rdp->cblist, &rdp->barrier_head, 0)) {
 		atomic_inc(&rsp->barrier_cpu_count);
 	} else {
 		debug_rcu_head_unqueue(&rdp->barrier_head);
-		_rcu_barrier_trace(rsp, "IRQNQ", -1, rsp->barrier_sequence);
+		_rcu_barrier_trace(rsp, TPS("IRQNQ"), -1,
+				   rsp->barrier_sequence);
 	}
 }
 
@@ -3605,14 +3511,15 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	struct rcu_data *rdp;
 	unsigned long s = rcu_seq_snap(&rsp->barrier_sequence);
 
-	_rcu_barrier_trace(rsp, "Begin", -1, s);
+	_rcu_barrier_trace(rsp, TPS("Begin"), -1, s);
 
 	/* Take mutex to serialize concurrent rcu_barrier() requests. */
 	mutex_lock(&rsp->barrier_mutex);
 
 	/* Did someone else do our work for us? */
 	if (rcu_seq_done(&rsp->barrier_sequence, s)) {
-		_rcu_barrier_trace(rsp, "EarlyExit", -1, rsp->barrier_sequence);
+		_rcu_barrier_trace(rsp, TPS("EarlyExit"), -1,
+				   rsp->barrier_sequence);
 		smp_mb(); /* caller's subsequent code after above check. */
 		mutex_unlock(&rsp->barrier_mutex);
 		return;
@@ -3620,7 +3527,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 
 	/* Mark the start of the barrier operation. */
 	rcu_seq_start(&rsp->barrier_sequence);
-	_rcu_barrier_trace(rsp, "Inc1", -1, rsp->barrier_sequence);
+	_rcu_barrier_trace(rsp, TPS("Inc1"), -1, rsp->barrier_sequence);
 
 	/*
 	 * Initialize the count to one rather than to zero in order to
@@ -3643,10 +3550,10 @@ static void _rcu_barrier(struct rcu_state *rsp)
 		rdp = per_cpu_ptr(rsp->rda, cpu);
 		if (rcu_is_nocb_cpu(cpu)) {
 			if (!rcu_nocb_cpu_needs_barrier(rsp, cpu)) {
-				_rcu_barrier_trace(rsp, "OfflineNoCB", cpu,
+				_rcu_barrier_trace(rsp, TPS("OfflineNoCB"), cpu,
 						   rsp->barrier_sequence);
 			} else {
-				_rcu_barrier_trace(rsp, "OnlineNoCB", cpu,
+				_rcu_barrier_trace(rsp, TPS("OnlineNoCB"), cpu,
 						   rsp->barrier_sequence);
 				smp_mb__before_atomic();
 				atomic_inc(&rsp->barrier_cpu_count);
@@ -3654,11 +3561,11 @@ static void _rcu_barrier(struct rcu_state *rsp)
 					   rcu_barrier_callback, rsp, cpu, 0);
 			}
 		} else if (rcu_segcblist_n_cbs(&rdp->cblist)) {
-			_rcu_barrier_trace(rsp, "OnlineQ", cpu,
+			_rcu_barrier_trace(rsp, TPS("OnlineQ"), cpu,
 					   rsp->barrier_sequence);
 			smp_call_function_single(cpu, rcu_barrier_func, rsp, 1);
 		} else {
-			_rcu_barrier_trace(rsp, "OnlineNQ", cpu,
+			_rcu_barrier_trace(rsp, TPS("OnlineNQ"), cpu,
 					   rsp->barrier_sequence);
 		}
 	}
@@ -3675,7 +3582,7 @@ static void _rcu_barrier(struct rcu_state *rsp)
 	wait_for_completion(&rsp->barrier_completion);
 
 	/* Mark the end of the barrier operation. */
-	_rcu_barrier_trace(rsp, "Inc2", -1, rsp->barrier_sequence);
+	_rcu_barrier_trace(rsp, TPS("Inc2"), -1, rsp->barrier_sequence);
 	rcu_seq_end(&rsp->barrier_sequence);
 
 	/* Other rcu_barrier() invocations can now safely proceed. */
@@ -3777,8 +3684,6 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
 	 */
 	rnp = rdp->mynode;
 	raw_spin_lock_rcu_node(rnp);		/* irqs already disabled. */
-	if (!rdp->beenonline)
-		WRITE_ONCE(rsp->ncpus, READ_ONCE(rsp->ncpus) + 1);
 	rdp->beenonline = true;	 /* We have now been online. */
 	rdp->gpnum = rnp->completed; /* Make CPU later note any new GP. */
 	rdp->completed = rnp->completed;
@@ -3882,6 +3787,8 @@ void rcu_cpu_starting(unsigned int cpu)
 {
 	unsigned long flags;
 	unsigned long mask;
+	int nbits;
+	unsigned long oldmask;
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
 	struct rcu_state *rsp;
@@ -3892,9 +3799,15 @@ void rcu_cpu_starting(unsigned int cpu)
 		mask = rdp->grpmask;
 		raw_spin_lock_irqsave_rcu_node(rnp, flags);
 		rnp->qsmaskinitnext |= mask;
+		oldmask = rnp->expmaskinitnext;
 		rnp->expmaskinitnext |= mask;
+		oldmask ^= rnp->expmaskinitnext;
+		nbits = bitmap_weight(&oldmask, BITS_PER_LONG);
+		/* Allow lockless access for expedited grace periods. */
+		smp_store_release(&rsp->ncpus, rsp->ncpus + nbits); /* ^^^ */
 		raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
 	}
+	smp_mb(); /* Ensure RCU read-side usage follows above initialization. */
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -3937,6 +3850,50 @@ void rcu_report_dead(unsigned int cpu)
 	for_each_rcu_flavor(rsp)
 		rcu_cleanup_dying_idle_cpu(cpu, rsp);
 }
+
+/* Migrate the dead CPU's callbacks to the current CPU. */
+static void rcu_migrate_callbacks(int cpu, struct rcu_state *rsp)
+{
+	unsigned long flags;
+	struct rcu_data *my_rdp;
+	struct rcu_data *rdp = per_cpu_ptr(rsp->rda, cpu);
+	struct rcu_node *rnp_root = rcu_get_root(rdp->rsp);
+
+	if (rcu_is_nocb_cpu(cpu) || rcu_segcblist_empty(&rdp->cblist))
+		return;  /* No callbacks to migrate. */
+
+	local_irq_save(flags);
+	my_rdp = this_cpu_ptr(rsp->rda);
+	if (rcu_nocb_adopt_orphan_cbs(my_rdp, rdp, flags)) {
+		local_irq_restore(flags);
+		return;
+	}
+	raw_spin_lock_rcu_node(rnp_root); /* irqs already disabled. */
+	rcu_advance_cbs(rsp, rnp_root, rdp); /* Leverage recent GPs. */
+	rcu_advance_cbs(rsp, rnp_root, my_rdp); /* Assign GP to pending CBs. */
+	rcu_segcblist_merge(&my_rdp->cblist, &rdp->cblist);
+	WARN_ON_ONCE(rcu_segcblist_empty(&my_rdp->cblist) !=
+		     !rcu_segcblist_n_cbs(&my_rdp->cblist));
+	raw_spin_unlock_irqrestore_rcu_node(rnp_root, flags);
+	WARN_ONCE(rcu_segcblist_n_cbs(&rdp->cblist) != 0 ||
+		  !rcu_segcblist_empty(&rdp->cblist),
+		  "rcu_cleanup_dead_cpu: Callbacks on offline CPU %d: qlen=%lu, 1stCB=%p\n",
+		  cpu, rcu_segcblist_n_cbs(&rdp->cblist),
+		  rcu_segcblist_first_cb(&rdp->cblist));
+}
+
+/*
+ * The outgoing CPU has just passed through the dying-idle state,
+ * and we are being invoked from the CPU that was IPIed to continue the
+ * offline operation.  We need to migrate the outgoing CPU's callbacks.
+ */
+void rcutree_migrate_callbacks(int cpu)
+{
+	struct rcu_state *rsp;
+
+	for_each_rcu_flavor(rsp)
+		rcu_migrate_callbacks(cpu, rsp);
+}
 #endif
 
 /*
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 9af0f31..8e1f285 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -219,8 +219,6 @@ struct rcu_data {
 					/* qlen at last check for QS forcing */
 	unsigned long	n_cbs_invoked;	/* count of RCU cbs invoked. */
 	unsigned long	n_nocbs_invoked; /* count of no-CBs RCU cbs invoked. */
-	unsigned long   n_cbs_orphaned; /* RCU cbs orphaned by dying CPU */
-	unsigned long   n_cbs_adopted;  /* RCU cbs adopted from dying CPU */
 	unsigned long	n_force_qs_snap;
 					/* did other CPU force QS recently? */
 	long		blimit;		/* Upper limit on a processed batch */
@@ -268,7 +266,9 @@ struct rcu_data {
 	struct rcu_head **nocb_follower_tail;
 	struct swait_queue_head nocb_wq; /* For nocb kthreads to sleep on. */
 	struct task_struct *nocb_kthread;
+	raw_spinlock_t nocb_lock;	/* Guard following pair of fields. */
 	int nocb_defer_wakeup;		/* Defer wakeup of nocb_kthread. */
+	struct timer_list nocb_timer;	/* Enforce finite deferral. */
 
 	/* The following fields are used by the leader, hence own cacheline. */
 	struct rcu_head *nocb_gp_head ____cacheline_internodealigned_in_smp;
@@ -350,15 +350,6 @@ struct rcu_state {
 
 	/* End of fields guarded by root rcu_node's lock. */
 
-	raw_spinlock_t orphan_lock ____cacheline_internodealigned_in_smp;
-						/* Protect following fields. */
-	struct rcu_cblist orphan_pend;		/* Orphaned callbacks that */
-						/*  need a grace period. */
-	struct rcu_cblist orphan_done;		/* Orphaned callbacks that */
-						/*  are ready to invoke. */
-						/* (Contains counts.) */
-	/* End of fields guarded by orphan_lock. */
-
 	struct mutex barrier_mutex;		/* Guards barrier fields. */
 	atomic_t barrier_cpu_count;		/* # CPUs waiting on. */
 	struct completion barrier_completion;	/* Wake at barrier end. */
@@ -495,7 +486,7 @@ static void rcu_nocb_gp_cleanup(struct swait_queue_head *sq);
 static void rcu_init_one_nocb(struct rcu_node *rnp);
 static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp,
 			    bool lazy, unsigned long flags);
-static bool rcu_nocb_adopt_orphan_cbs(struct rcu_state *rsp,
+static bool rcu_nocb_adopt_orphan_cbs(struct rcu_data *my_rdp,
 				      struct rcu_data *rdp,
 				      unsigned long flags);
 static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp);
diff --git a/kernel/rcu/tree_exp.h b/kernel/rcu/tree_exp.h
index dd21ca4..46d61b5 100644
--- a/kernel/rcu/tree_exp.h
+++ b/kernel/rcu/tree_exp.h
@@ -73,7 +73,7 @@ static void sync_exp_reset_tree_hotplug(struct rcu_state *rsp)
 	unsigned long flags;
 	unsigned long mask;
 	unsigned long oldmask;
-	int ncpus = READ_ONCE(rsp->ncpus);
+	int ncpus = smp_load_acquire(&rsp->ncpus); /* Order against locking. */
 	struct rcu_node *rnp;
 	struct rcu_node *rnp_up;
 
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 908b309..55bde94 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -180,6 +180,8 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
 	struct task_struct *t = current;
 
 	lockdep_assert_held(&rnp->lock);
+	WARN_ON_ONCE(rdp->mynode != rnp);
+	WARN_ON_ONCE(rnp->level != rcu_num_lvls - 1);
 
 	/*
 	 * Decide where to queue the newly blocked task.  In theory,
@@ -261,6 +263,10 @@ static void rcu_preempt_ctxt_queue(struct rcu_node *rnp, struct rcu_data *rdp)
 		rnp->gp_tasks = &t->rcu_node_entry;
 	if (!rnp->exp_tasks && (blkd_state & RCU_EXP_BLKD))
 		rnp->exp_tasks = &t->rcu_node_entry;
+	WARN_ON_ONCE(!(blkd_state & RCU_GP_BLKD) !=
+		     !(rnp->qsmask & rdp->grpmask));
+	WARN_ON_ONCE(!(blkd_state & RCU_EXP_BLKD) !=
+		     !(rnp->expmask & rdp->grpmask));
 	raw_spin_unlock_rcu_node(rnp); /* interrupts remain disabled. */
 
 	/*
@@ -482,6 +488,7 @@ void rcu_read_unlock_special(struct task_struct *t)
 		rnp = t->rcu_blocked_node;
 		raw_spin_lock_rcu_node(rnp); /* irqs already disabled. */
 		WARN_ON_ONCE(rnp != t->rcu_blocked_node);
+		WARN_ON_ONCE(rnp->level != rcu_num_lvls - 1);
 		empty_norm = !rcu_preempt_blocked_readers_cgp(rnp);
 		empty_exp = sync_rcu_preempt_exp_done(rnp);
 		smp_mb(); /* ensure expedited fastpath sees end of RCU c-s. */
@@ -495,10 +502,10 @@ void rcu_read_unlock_special(struct task_struct *t)
 		if (&t->rcu_node_entry == rnp->exp_tasks)
 			rnp->exp_tasks = np;
 		if (IS_ENABLED(CONFIG_RCU_BOOST)) {
-			if (&t->rcu_node_entry == rnp->boost_tasks)
-				rnp->boost_tasks = np;
 			/* Snapshot ->boost_mtx ownership w/rnp->lock held. */
 			drop_boost_mutex = rt_mutex_owner(&rnp->boost_mtx) == t;
+			if (&t->rcu_node_entry == rnp->boost_tasks)
+				rnp->boost_tasks = np;
 		}
 
 		/*
@@ -636,10 +643,17 @@ static int rcu_print_task_exp_stall(struct rcu_node *rnp)
  */
 static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
 {
+	struct task_struct *t;
+
 	RCU_LOCKDEP_WARN(preemptible(), "rcu_preempt_check_blocked_tasks() invoked with preemption enabled!!!\n");
 	WARN_ON_ONCE(rcu_preempt_blocked_readers_cgp(rnp));
-	if (rcu_preempt_has_tasks(rnp))
+	if (rcu_preempt_has_tasks(rnp)) {
 		rnp->gp_tasks = rnp->blkd_tasks.next;
+		t = container_of(rnp->gp_tasks, struct task_struct,
+				 rcu_node_entry);
+		trace_rcu_unlock_preempted_task(TPS("rcu_preempt-GPS"),
+						rnp->gpnum, t->pid);
+	}
 	WARN_ON_ONCE(rnp->qsmask);
 }
 
@@ -1788,23 +1802,62 @@ bool rcu_is_nocb_cpu(int cpu)
 }
 
 /*
- * Kick the leader kthread for this NOCB group.
+ * Kick the leader kthread for this NOCB group.  Caller holds ->nocb_lock
+ * and this function releases it.
  */
-static void wake_nocb_leader(struct rcu_data *rdp, bool force)
+static void __wake_nocb_leader(struct rcu_data *rdp, bool force,
+			       unsigned long flags)
+	__releases(rdp->nocb_lock)
 {
 	struct rcu_data *rdp_leader = rdp->nocb_leader;
 
-	if (!READ_ONCE(rdp_leader->nocb_kthread))
+	lockdep_assert_held(&rdp->nocb_lock);
+	if (!READ_ONCE(rdp_leader->nocb_kthread)) {
+		raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 		return;
-	if (READ_ONCE(rdp_leader->nocb_leader_sleep) || force) {
+	}
+	if (rdp_leader->nocb_leader_sleep || force) {
 		/* Prior smp_mb__after_atomic() orders against prior enqueue. */
 		WRITE_ONCE(rdp_leader->nocb_leader_sleep, false);
+		del_timer(&rdp->nocb_timer);
+		raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 		smp_mb(); /* ->nocb_leader_sleep before swake_up(). */
 		swake_up(&rdp_leader->nocb_wq);
+	} else {
+		raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 	}
 }
 
 /*
+ * Kick the leader kthread for this NOCB group, but caller has not
+ * acquired locks.
+ */
+static void wake_nocb_leader(struct rcu_data *rdp, bool force)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+	__wake_nocb_leader(rdp, force, flags);
+}
+
+/*
+ * Arrange to wake the leader kthread for this NOCB group at some
+ * future time when it is safe to do so.
+ */
+static void wake_nocb_leader_defer(struct rcu_data *rdp, int waketype,
+				   const char *reason)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+	if (rdp->nocb_defer_wakeup == RCU_NOCB_WAKE_NOT)
+		mod_timer(&rdp->nocb_timer, jiffies + 1);
+	WRITE_ONCE(rdp->nocb_defer_wakeup, waketype);
+	trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, reason);
+	raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
+}
+
+/*
  * Does the specified CPU need an RCU callback for the specified flavor
  * of rcu_barrier()?
  */
@@ -1891,11 +1944,8 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp,
 			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
 					    TPS("WakeEmpty"));
 		} else {
-			WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE);
-			/* Store ->nocb_defer_wakeup before ->rcu_urgent_qs. */
-			smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true);
-			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
-					    TPS("WakeEmptyIsDeferred"));
+			wake_nocb_leader_defer(rdp, RCU_NOCB_WAKE,
+					       TPS("WakeEmptyIsDeferred"));
 		}
 		rdp->qlen_last_fqs_check = 0;
 	} else if (len > rdp->qlen_last_fqs_check + qhimark) {
@@ -1905,11 +1955,8 @@ static void __call_rcu_nocb_enqueue(struct rcu_data *rdp,
 			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
 					    TPS("WakeOvf"));
 		} else {
-			WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_FORCE);
-			/* Store ->nocb_defer_wakeup before ->rcu_urgent_qs. */
-			smp_store_release(this_cpu_ptr(&rcu_dynticks.rcu_urgent_qs), true);
-			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
-					    TPS("WakeOvfIsDeferred"));
+			wake_nocb_leader_defer(rdp, RCU_NOCB_WAKE_FORCE,
+					       TPS("WakeOvfIsDeferred"));
 		}
 		rdp->qlen_last_fqs_check = LONG_MAX / 2;
 	} else {
@@ -1961,30 +2008,19 @@ static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp,
  * Adopt orphaned callbacks on a no-CBs CPU, or return 0 if this is
  * not a no-CBs CPU.
  */
-static bool __maybe_unused rcu_nocb_adopt_orphan_cbs(struct rcu_state *rsp,
+static bool __maybe_unused rcu_nocb_adopt_orphan_cbs(struct rcu_data *my_rdp,
 						     struct rcu_data *rdp,
 						     unsigned long flags)
 {
-	long ql = rsp->orphan_done.len;
-	long qll = rsp->orphan_done.len_lazy;
-
-	/* If this is not a no-CBs CPU, tell the caller to do it the old way. */
+	RCU_LOCKDEP_WARN(!irqs_disabled(), "rcu_nocb_adopt_orphan_cbs() invoked with irqs enabled!!!");
 	if (!rcu_is_nocb_cpu(smp_processor_id()))
-		return false;
-
-	/* First, enqueue the donelist, if any.  This preserves CB ordering. */
-	if (rsp->orphan_done.head) {
-		__call_rcu_nocb_enqueue(rdp, rcu_cblist_head(&rsp->orphan_done),
-					rcu_cblist_tail(&rsp->orphan_done),
-					ql, qll, flags);
-	}
-	if (rsp->orphan_pend.head) {
-		__call_rcu_nocb_enqueue(rdp, rcu_cblist_head(&rsp->orphan_pend),
-					rcu_cblist_tail(&rsp->orphan_pend),
-					ql, qll, flags);
-	}
-	rcu_cblist_init(&rsp->orphan_done);
-	rcu_cblist_init(&rsp->orphan_pend);
+		return false; /* Not NOCBs CPU, caller must migrate CBs. */
+	__call_rcu_nocb_enqueue(my_rdp, rcu_segcblist_head(&rdp->cblist),
+				rcu_segcblist_tail(&rdp->cblist),
+				rcu_segcblist_n_cbs(&rdp->cblist),
+				rcu_segcblist_n_lazy_cbs(&rdp->cblist), flags);
+	rcu_segcblist_init(&rdp->cblist);
+	rcu_segcblist_disable(&rdp->cblist);
 	return true;
 }
 
@@ -2031,6 +2067,7 @@ static void rcu_nocb_wait_gp(struct rcu_data *rdp)
 static void nocb_leader_wait(struct rcu_data *my_rdp)
 {
 	bool firsttime = true;
+	unsigned long flags;
 	bool gotcbs;
 	struct rcu_data *rdp;
 	struct rcu_head **tail;
@@ -2039,13 +2076,17 @@ static void nocb_leader_wait(struct rcu_data *my_rdp)
 
 	/* Wait for callbacks to appear. */
 	if (!rcu_nocb_poll) {
-		trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, "Sleep");
+		trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, TPS("Sleep"));
 		swait_event_interruptible(my_rdp->nocb_wq,
 				!READ_ONCE(my_rdp->nocb_leader_sleep));
-		/* Memory barrier handled by smp_mb() calls below and repoll. */
+		raw_spin_lock_irqsave(&my_rdp->nocb_lock, flags);
+		my_rdp->nocb_leader_sleep = true;
+		WRITE_ONCE(my_rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
+		del_timer(&my_rdp->nocb_timer);
+		raw_spin_unlock_irqrestore(&my_rdp->nocb_lock, flags);
 	} else if (firsttime) {
 		firsttime = false; /* Don't drown trace log with "Poll"! */
-		trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, "Poll");
+		trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu, TPS("Poll"));
 	}
 
 	/*
@@ -2054,7 +2095,7 @@ static void nocb_leader_wait(struct rcu_data *my_rdp)
 	 * nocb_gp_head, where they await a grace period.
 	 */
 	gotcbs = false;
-	smp_mb(); /* wakeup before ->nocb_head reads. */
+	smp_mb(); /* wakeup and _sleep before ->nocb_head reads. */
 	for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_follower) {
 		rdp->nocb_gp_head = READ_ONCE(rdp->nocb_head);
 		if (!rdp->nocb_gp_head)
@@ -2066,56 +2107,41 @@ static void nocb_leader_wait(struct rcu_data *my_rdp)
 		gotcbs = true;
 	}
 
-	/*
-	 * If there were no callbacks, sleep a bit, rescan after a
-	 * memory barrier, and go retry.
-	 */
+	/* No callbacks?  Sleep a bit if polling, and go retry.  */
 	if (unlikely(!gotcbs)) {
-		if (!rcu_nocb_poll)
-			trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu,
-					    "WokeEmpty");
 		WARN_ON(signal_pending(current));
-		schedule_timeout_interruptible(1);
-
-		/* Rescan in case we were a victim of memory ordering. */
-		my_rdp->nocb_leader_sleep = true;
-		smp_mb();  /* Ensure _sleep true before scan. */
-		for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_follower)
-			if (READ_ONCE(rdp->nocb_head)) {
-				/* Found CB, so short-circuit next wait. */
-				my_rdp->nocb_leader_sleep = false;
-				break;
-			}
+		if (rcu_nocb_poll) {
+			schedule_timeout_interruptible(1);
+		} else {
+			trace_rcu_nocb_wake(my_rdp->rsp->name, my_rdp->cpu,
+					    TPS("WokeEmpty"));
+		}
 		goto wait_again;
 	}
 
 	/* Wait for one grace period. */
 	rcu_nocb_wait_gp(my_rdp);
 
-	/*
-	 * We left ->nocb_leader_sleep unset to reduce cache thrashing.
-	 * We set it now, but recheck for new callbacks while
-	 * traversing our follower list.
-	 */
-	my_rdp->nocb_leader_sleep = true;
-	smp_mb(); /* Ensure _sleep true before scan of ->nocb_head. */
-
 	/* Each pass through the following loop wakes a follower, if needed. */
 	for (rdp = my_rdp; rdp; rdp = rdp->nocb_next_follower) {
-		if (READ_ONCE(rdp->nocb_head))
+		if (!rcu_nocb_poll &&
+		    READ_ONCE(rdp->nocb_head) &&
+		    READ_ONCE(my_rdp->nocb_leader_sleep)) {
+			raw_spin_lock_irqsave(&my_rdp->nocb_lock, flags);
 			my_rdp->nocb_leader_sleep = false;/* No need to sleep.*/
+			raw_spin_unlock_irqrestore(&my_rdp->nocb_lock, flags);
+		}
 		if (!rdp->nocb_gp_head)
 			continue; /* No CBs, so no need to wake follower. */
 
 		/* Append callbacks to follower's "done" list. */
-		tail = xchg(&rdp->nocb_follower_tail, rdp->nocb_gp_tail);
+		raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+		tail = rdp->nocb_follower_tail;
+		rdp->nocb_follower_tail = rdp->nocb_gp_tail;
 		*tail = rdp->nocb_gp_head;
-		smp_mb__after_atomic(); /* Store *tail before wakeup. */
+		raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 		if (rdp != my_rdp && tail == &rdp->nocb_follower_head) {
-			/*
-			 * List was empty, wake up the follower.
-			 * Memory barriers supplied by atomic_long_add().
-			 */
+			/* List was empty, so wake up the follower.  */
 			swake_up(&rdp->nocb_wq);
 		}
 	}
@@ -2131,28 +2157,16 @@ static void nocb_leader_wait(struct rcu_data *my_rdp)
  */
 static void nocb_follower_wait(struct rcu_data *rdp)
 {
-	bool firsttime = true;
-
 	for (;;) {
-		if (!rcu_nocb_poll) {
-			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
-					    "FollowerSleep");
-			swait_event_interruptible(rdp->nocb_wq,
-						 READ_ONCE(rdp->nocb_follower_head));
-		} else if (firsttime) {
-			/* Don't drown trace log with "Poll"! */
-			firsttime = false;
-			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, "Poll");
-		}
+		trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("FollowerSleep"));
+		swait_event_interruptible(rdp->nocb_wq,
+					 READ_ONCE(rdp->nocb_follower_head));
 		if (smp_load_acquire(&rdp->nocb_follower_head)) {
 			/* ^^^ Ensure CB invocation follows _head test. */
 			return;
 		}
-		if (!rcu_nocb_poll)
-			trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu,
-					    "WokeEmpty");
 		WARN_ON(signal_pending(current));
-		schedule_timeout_interruptible(1);
+		trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WokeEmpty"));
 	}
 }
 
@@ -2165,6 +2179,7 @@ static void nocb_follower_wait(struct rcu_data *rdp)
 static int rcu_nocb_kthread(void *arg)
 {
 	int c, cl;
+	unsigned long flags;
 	struct rcu_head *list;
 	struct rcu_head *next;
 	struct rcu_head **tail;
@@ -2179,11 +2194,14 @@ static int rcu_nocb_kthread(void *arg)
 			nocb_follower_wait(rdp);
 
 		/* Pull the ready-to-invoke callbacks onto local list. */
-		list = READ_ONCE(rdp->nocb_follower_head);
+		raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+		list = rdp->nocb_follower_head;
+		rdp->nocb_follower_head = NULL;
+		tail = rdp->nocb_follower_tail;
+		rdp->nocb_follower_tail = &rdp->nocb_follower_head;
+		raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 		BUG_ON(!list);
-		trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, "WokeNonEmpty");
-		WRITE_ONCE(rdp->nocb_follower_head, NULL);
-		tail = xchg(&rdp->nocb_follower_tail, &rdp->nocb_follower_head);
+		trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("WokeNonEmpty"));
 
 		/* Each pass through the following loop invokes a callback. */
 		trace_rcu_batch_start(rdp->rsp->name,
@@ -2226,18 +2244,39 @@ static int rcu_nocb_need_deferred_wakeup(struct rcu_data *rdp)
 }
 
 /* Do a deferred wakeup of rcu_nocb_kthread(). */
-static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
+static void do_nocb_deferred_wakeup_common(struct rcu_data *rdp)
 {
+	unsigned long flags;
 	int ndw;
 
-	if (!rcu_nocb_need_deferred_wakeup(rdp))
+	raw_spin_lock_irqsave(&rdp->nocb_lock, flags);
+	if (!rcu_nocb_need_deferred_wakeup(rdp)) {
+		raw_spin_unlock_irqrestore(&rdp->nocb_lock, flags);
 		return;
+	}
 	ndw = READ_ONCE(rdp->nocb_defer_wakeup);
 	WRITE_ONCE(rdp->nocb_defer_wakeup, RCU_NOCB_WAKE_NOT);
-	wake_nocb_leader(rdp, ndw == RCU_NOCB_WAKE_FORCE);
+	__wake_nocb_leader(rdp, ndw == RCU_NOCB_WAKE_FORCE, flags);
 	trace_rcu_nocb_wake(rdp->rsp->name, rdp->cpu, TPS("DeferredWake"));
 }
 
+/* Do a deferred wakeup of rcu_nocb_kthread() from a timer handler. */
+static void do_nocb_deferred_wakeup_timer(unsigned long x)
+{
+	do_nocb_deferred_wakeup_common((struct rcu_data *)x);
+}
+
+/*
+ * Do a deferred wakeup of rcu_nocb_kthread() from fastpath.
+ * This means we do an inexact common-case check.  Note that if
+ * we miss, ->nocb_timer will eventually clean things up.
+ */
+static void do_nocb_deferred_wakeup(struct rcu_data *rdp)
+{
+	if (rcu_nocb_need_deferred_wakeup(rdp))
+		do_nocb_deferred_wakeup_common(rdp);
+}
+
 void __init rcu_init_nohz(void)
 {
 	int cpu;
@@ -2287,6 +2326,9 @@ static void __init rcu_boot_init_nocb_percpu_data(struct rcu_data *rdp)
 	rdp->nocb_tail = &rdp->nocb_head;
 	init_swait_queue_head(&rdp->nocb_wq);
 	rdp->nocb_follower_tail = &rdp->nocb_follower_head;
+	raw_spin_lock_init(&rdp->nocb_lock);
+	setup_timer(&rdp->nocb_timer, do_nocb_deferred_wakeup_timer,
+		    (unsigned long)rdp);
 }
 
 /*
@@ -2459,7 +2501,7 @@ static bool __call_rcu_nocb(struct rcu_data *rdp, struct rcu_head *rhp,
 	return false;
 }
 
-static bool __maybe_unused rcu_nocb_adopt_orphan_cbs(struct rcu_state *rsp,
+static bool __maybe_unused rcu_nocb_adopt_orphan_cbs(struct rcu_data *my_rdp,
 						     struct rcu_data *rdp,
 						     unsigned long flags)
 {
diff --git a/kernel/rcu/update.c b/kernel/rcu/update.c
index 00e77c4..5033b66 100644
--- a/kernel/rcu/update.c
+++ b/kernel/rcu/update.c
@@ -568,7 +568,7 @@ static DECLARE_WAIT_QUEUE_HEAD(rcu_tasks_cbs_wq);
 static DEFINE_RAW_SPINLOCK(rcu_tasks_cbs_lock);
 
 /* Track exiting tasks in order to allow them to be waited for. */
-DEFINE_SRCU(tasks_rcu_exit_srcu);
+DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
 
 /* Control stall timeouts.  Disable with <= 0, otherwise jiffies till stall. */
 #define RCU_TASK_STALL_TIMEOUT (HZ * 60 * 10)
@@ -875,6 +875,22 @@ static void rcu_spawn_tasks_kthread(void)
 	mutex_unlock(&rcu_tasks_kthread_mutex);
 }
 
+/* Do the srcu_read_lock() for the above synchronize_srcu().  */
+void exit_tasks_rcu_start(void)
+{
+	preempt_disable();
+	current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
+	preempt_enable();
+}
+
+/* Do the srcu_read_unlock() for the above synchronize_srcu().  */
+void exit_tasks_rcu_finish(void)
+{
+	preempt_disable();
+	__srcu_read_unlock(&tasks_rcu_exit_srcu, current->rcu_tasks_idx);
+	preempt_enable();
+}
+
 #endif /* #ifdef CONFIG_TASKS_RCU */
 
 #ifndef CONFIG_TINY_RCU
diff --git a/kernel/sched/Makefile b/kernel/sched/Makefile
index 53f0164..78f5493 100644
--- a/kernel/sched/Makefile
+++ b/kernel/sched/Makefile
@@ -25,3 +25,4 @@
 obj-$(CONFIG_CGROUP_CPUACCT) += cpuacct.o
 obj-$(CONFIG_CPU_FREQ) += cpufreq.o
 obj-$(CONFIG_CPU_FREQ_GOV_SCHEDUTIL) += cpufreq_schedutil.o
+obj-$(CONFIG_MEMBARRIER) += membarrier.o
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index 13fc5ae..c9524d2 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -300,6 +300,8 @@ EXPORT_SYMBOL(try_wait_for_completion);
  */
 bool completion_done(struct completion *x)
 {
+	unsigned long flags;
+
 	if (!READ_ONCE(x->done))
 		return false;
 
@@ -307,14 +309,9 @@ bool completion_done(struct completion *x)
 	 * If ->done, we need to wait for complete() to release ->wait.lock
 	 * otherwise we can end up freeing the completion before complete()
 	 * is done referencing it.
-	 *
-	 * The RMB pairs with complete()'s RELEASE of ->wait.lock and orders
-	 * the loads of ->done and ->wait.lock such that we cannot observe
-	 * the lock before complete() acquires it while observing the ->done
-	 * after it's acquired the lock.
 	 */
-	smp_rmb();
-	spin_unlock_wait(&x->wait.lock);
+	spin_lock_irqsave(&x->wait.lock, flags);
+	spin_unlock_irqrestore(&x->wait.lock, flags);
 	return true;
 }
 EXPORT_SYMBOL(completion_done);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0869b20..e053c31 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -951,8 +951,13 @@ struct migration_arg {
 static struct rq *__migrate_task(struct rq *rq, struct rq_flags *rf,
 				 struct task_struct *p, int dest_cpu)
 {
-	if (unlikely(!cpu_active(dest_cpu)))
-		return rq;
+	if (p->flags & PF_KTHREAD) {
+		if (unlikely(!cpu_online(dest_cpu)))
+			return rq;
+	} else {
+		if (unlikely(!cpu_active(dest_cpu)))
+			return rq;
+	}
 
 	/* Affinity changed (again). */
 	if (!cpumask_test_cpu(dest_cpu, &p->cpus_allowed))
@@ -2635,6 +2640,16 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	prev_state = prev->state;
 	vtime_task_switch(prev);
 	perf_event_task_sched_in(prev, current);
+	/*
+	 * The membarrier system call requires a full memory barrier
+	 * after storing to rq->curr, before going back to user-space.
+	 *
+	 * TODO: This smp_mb__after_unlock_lock can go away if PPC ends
+	 * up adding a full barrier to switch_mm(), or we should figure
+	 * out if a smp_mb__after_unlock_lock is really the proper API
+	 * to use.
+	 */
+	smp_mb__after_unlock_lock();
 	finish_lock_switch(rq, prev);
 	finish_arch_post_lock_switch();
 
@@ -3324,6 +3339,21 @@ static void __sched notrace __schedule(bool preempt)
 	if (likely(prev != next)) {
 		rq->nr_switches++;
 		rq->curr = next;
+		/*
+		 * The membarrier system call requires each architecture
+		 * to have a full memory barrier after updating
+		 * rq->curr, before returning to user-space. For TSO
+		 * (e.g. x86), the architecture must provide its own
+		 * barrier in switch_mm(). For weakly ordered machines
+		 * for which spin_unlock() acts as a full memory
+		 * barrier, finish_lock_switch() in common code takes
+		 * care of this barrier. For weakly ordered machines for
+		 * which spin_unlock() acts as a RELEASE barrier (only
+		 * arm64 and PowerPC), arm64 has a full barrier in
+		 * switch_to(), and PowerPC has
+		 * smp_mb__after_unlock_lock() before
+		 * finish_lock_switch().
+		 */
 		++*switch_count;
 
 		trace_sched_switch(preempt, prev, next);
@@ -3352,8 +3382,8 @@ void __noreturn do_task_dead(void)
 	 * To avoid it, we have to wait for releasing tsk->pi_lock which
 	 * is held by try_to_wake_up()
 	 */
-	smp_mb();
-	raw_spin_unlock_wait(&current->pi_lock);
+	raw_spin_lock_irq(&current->pi_lock);
+	raw_spin_unlock_irq(&current->pi_lock);
 
 	/* Causes final put_task_struct in finish_task_switch(): */
 	__set_current_state(TASK_DEAD);
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
new file mode 100644
index 0000000..a92fddc
--- /dev/null
+++ b/kernel/sched/membarrier.c
@@ -0,0 +1,152 @@
+/*
+ * Copyright (C) 2010-2017 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
+ *
+ * membarrier system call
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/syscalls.h>
+#include <linux/membarrier.h>
+#include <linux/tick.h>
+#include <linux/cpumask.h>
+
+#include "sched.h"	/* for cpu_rq(). */
+
+/*
+ * Bitmask made from a "or" of all commands within enum membarrier_cmd,
+ * except MEMBARRIER_CMD_QUERY.
+ */
+#define MEMBARRIER_CMD_BITMASK	\
+	(MEMBARRIER_CMD_SHARED | MEMBARRIER_CMD_PRIVATE_EXPEDITED)
+
+static void ipi_mb(void *info)
+{
+	smp_mb();	/* IPIs should be serializing but paranoid. */
+}
+
+static void membarrier_private_expedited(void)
+{
+	int cpu;
+	bool fallback = false;
+	cpumask_var_t tmpmask;
+
+	if (num_online_cpus() == 1)
+		return;
+
+	/*
+	 * Matches memory barriers around rq->curr modification in
+	 * scheduler.
+	 */
+	smp_mb();	/* system call entry is not a mb. */
+
+	/*
+	 * Expedited membarrier commands guarantee that they won't
+	 * block, hence the GFP_NOWAIT allocation flag and fallback
+	 * implementation.
+	 */
+	if (!zalloc_cpumask_var(&tmpmask, GFP_NOWAIT)) {
+		/* Fallback for OOM. */
+		fallback = true;
+	}
+
+	cpus_read_lock();
+	for_each_online_cpu(cpu) {
+		struct task_struct *p;
+
+		/*
+		 * Skipping the current CPU is OK even though we can be
+		 * migrated at any point. The current CPU, at the point
+		 * where we read raw_smp_processor_id(), is ensured to
+		 * be in program order with respect to the caller
+		 * thread. Therefore, we can skip this CPU from the
+		 * iteration.
+		 */
+		if (cpu == raw_smp_processor_id())
+			continue;
+		rcu_read_lock();
+		p = task_rcu_dereference(&cpu_rq(cpu)->curr);
+		if (p && p->mm == current->mm) {
+			if (!fallback)
+				__cpumask_set_cpu(cpu, tmpmask);
+			else
+				smp_call_function_single(cpu, ipi_mb, NULL, 1);
+		}
+		rcu_read_unlock();
+	}
+	if (!fallback) {
+		smp_call_function_many(tmpmask, ipi_mb, NULL, 1);
+		free_cpumask_var(tmpmask);
+	}
+	cpus_read_unlock();
+
+	/*
+	 * Memory barrier on the caller thread _after_ we finished
+	 * waiting for the last IPI. Matches memory barriers around
+	 * rq->curr modification in scheduler.
+	 */
+	smp_mb();	/* exit from system call is not a mb */
+}
+
+/**
+ * sys_membarrier - issue memory barriers on a set of threads
+ * @cmd:   Takes command values defined in enum membarrier_cmd.
+ * @flags: Currently needs to be 0. For future extensions.
+ *
+ * If this system call is not implemented, -ENOSYS is returned. If the
+ * command specified does not exist, is not available on the running
+ * kernel, or if the command argument is invalid, this system call
+ * returns -EINVAL. For a given command, with flags argument set to 0,
+ * this system call is guaranteed to always return the same value until
+ * reboot.
+ *
+ * All memory accesses performed in program order from each targeted thread
+ * are guaranteed to be ordered with respect to sys_membarrier(). If we use
+ * the semantic "barrier()" to represent a compiler barrier forcing memory
+ * accesses to be performed in program order across the barrier, and
+ * smp_mb() to represent explicit memory barriers forcing full memory
+ * ordering across the barrier, we have the following ordering table for
+ * each pair of barrier(), sys_membarrier() and smp_mb():
+ *
+ * The pair ordering is detailed as (O: ordered, X: not ordered):
+ *
+ *                        barrier()   smp_mb() sys_membarrier()
+ *        barrier()          X           X            O
+ *        smp_mb()           X           O            O
+ *        sys_membarrier()   O           O            O
+ */
+SYSCALL_DEFINE2(membarrier, int, cmd, int, flags)
+{
+	if (unlikely(flags))
+		return -EINVAL;
+	switch (cmd) {
+	case MEMBARRIER_CMD_QUERY:
+	{
+		int cmd_mask = MEMBARRIER_CMD_BITMASK;
+
+		if (tick_nohz_full_enabled())
+			cmd_mask &= ~MEMBARRIER_CMD_SHARED;
+		return cmd_mask;
+	}
+	case MEMBARRIER_CMD_SHARED:
+		/* MEMBARRIER_CMD_SHARED is not compatible with nohz_full. */
+		if (tick_nohz_full_enabled())
+			return -EINVAL;
+		if (num_online_cpus() > 1)
+			synchronize_sched();
+		return 0;
+	case MEMBARRIER_CMD_PRIVATE_EXPEDITED:
+		membarrier_private_expedited();
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
diff --git a/kernel/task_work.c b/kernel/task_work.c
index d513051..836a72a 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -96,20 +96,16 @@ void task_work_run(void)
 		 * work->func() can do task_work_add(), do not set
 		 * work_exited unless the list is empty.
 		 */
+		raw_spin_lock_irq(&task->pi_lock);
 		do {
 			work = READ_ONCE(task->task_works);
 			head = !work && (task->flags & PF_EXITING) ?
 				&work_exited : NULL;
 		} while (cmpxchg(&task->task_works, work, head) != work);
+		raw_spin_unlock_irq(&task->pi_lock);
 
 		if (!work)
 			break;
-		/*
-		 * Synchronize with task_work_cancel(). It can't remove
-		 * the first entry == work, cmpxchg(task_works) should
-		 * fail, but it can play with *work and other entries.
-		 */
-		raw_spin_unlock_wait(&task->pi_lock);
 
 		do {
 			next = work->next;
diff --git a/kernel/time/timekeeping.c b/kernel/time/timekeeping.c
index cedafa0..7e7e61c 100644
--- a/kernel/time/timekeeping.c
+++ b/kernel/time/timekeeping.c
@@ -637,9 +637,7 @@ static inline void tk_update_ktime_data(struct timekeeper *tk)
 	tk->ktime_sec = seconds;
 
 	/* Update the monotonic raw base */
-	seconds = tk->raw_sec;
-	nsec = (u32)(tk->tkr_raw.xtime_nsec >> tk->tkr_raw.shift);
-	tk->tkr_raw.base = ns_to_ktime(seconds * NSEC_PER_SEC + nsec);
+	tk->tkr_raw.base = ns_to_ktime(tk->raw_sec * NSEC_PER_SEC);
 }
 
 /* must hold timekeeper_lock */
diff --git a/kernel/torture.c b/kernel/torture.c
index 55de965..637e172 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -117,7 +117,7 @@ bool torture_offline(int cpu, long *n_offl_attempts, long *n_offl_successes,
 				 torture_type, cpu);
 		(*n_offl_successes)++;
 		delta = jiffies - starttime;
-		sum_offl += delta;
+		*sum_offl += delta;
 		if (*min_offl < 0) {
 			*min_offl = delta;
 			*max_offl = delta;
diff --git a/lib/assoc_array.c b/lib/assoc_array.c
index 59fd7c0..155c55d 100644
--- a/lib/assoc_array.c
+++ b/lib/assoc_array.c
@@ -1,6 +1,6 @@
 /* Generic associative array implementation.
  *
- * See Documentation/assoc_array.txt for information.
+ * See Documentation/core-api/assoc_array.rst for information.
  *
  * Copyright (C) 2013 Red Hat, Inc. All Rights Reserved.
  * Written by David Howells (dhowells@redhat.com)
diff --git a/lib/debugobjects.c b/lib/debugobjects.c
index 17afb04..2f5349c 100644
--- a/lib/debugobjects.c
+++ b/lib/debugobjects.c
@@ -18,6 +18,7 @@
 #include <linux/debugfs.h>
 #include <linux/slab.h>
 #include <linux/hash.h>
+#include <linux/kmemleak.h>
 
 #define ODEBUG_HASH_BITS	14
 #define ODEBUG_HASH_SIZE	(1 << ODEBUG_HASH_BITS)
@@ -110,6 +111,7 @@ static void fill_pool(void)
 		if (!new)
 			return;
 
+		kmemleak_ignore(new);
 		raw_spin_lock_irqsave(&pool_lock, flags);
 		hlist_add_head(&new->node, &obj_pool);
 		debug_objects_allocated++;
@@ -1080,6 +1082,7 @@ static int __init debug_objects_replace_static_objects(void)
 		obj = kmem_cache_zalloc(obj_cache, GFP_KERNEL);
 		if (!obj)
 			goto free;
+		kmemleak_ignore(obj);
 		hlist_add_head(&obj->node, &objects);
 	}
 
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 5a0f75a..eead4b3 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -364,11 +364,11 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
 	}
 
 	miter.consumed = lzeros;
-	sg_miter_stop(&miter);
 
 	nbytes -= lzeros;
 	nbits = nbytes * 8;
 	if (nbits > MAX_EXTERN_MPI_BITS) {
+		sg_miter_stop(&miter);
 		pr_info("MPI: mpi too large (%u bits)\n", nbits);
 		return NULL;
 	}
@@ -376,6 +376,8 @@ MPI mpi_read_raw_from_sgl(struct scatterlist *sgl, unsigned int nbytes)
 	if (nbytes > 0)
 		nbits -= count_leading_zeros(*buff) - (BITS_PER_LONG - 8);
 
+	sg_miter_stop(&miter);
+
 	nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
 	val = mpi_alloc(nlimbs);
 	if (!val)
diff --git a/lib/radix-tree.c b/lib/radix-tree.c
index 898e879..3527eb3 100644
--- a/lib/radix-tree.c
+++ b/lib/radix-tree.c
@@ -2022,6 +2022,7 @@ void radix_tree_iter_delete(struct radix_tree_root *root,
 	if (__radix_tree_delete(root, iter->node, slot))
 		iter->index = iter->next_index;
 }
+EXPORT_SYMBOL(radix_tree_iter_delete);
 
 /**
  * radix_tree_delete_item - delete an item from a radix tree
diff --git a/mm/madvise.c b/mm/madvise.c
index 23ed525..4d7d1e5 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -613,6 +613,7 @@ static int madvise_inject_error(int behavior,
 		unsigned long start, unsigned long end)
 {
 	struct page *page;
+	struct zone *zone;
 
 	if (!capable(CAP_SYS_ADMIN))
 		return -EPERM;
@@ -646,6 +647,11 @@ static int madvise_inject_error(int behavior,
 		if (ret)
 			return ret;
 	}
+
+	/* Ensure that all poisoned pages are removed from per-cpu lists */
+	for_each_populated_zone(zone)
+		drain_all_pages(zone);
+
 	return 0;
 }
 #endif
diff --git a/mm/memory.c b/mm/memory.c
index fe2fba2..56e48e4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4008,7 +4008,8 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 #endif /* __PAGETABLE_PMD_FOLDED */
 
 static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
-		pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp)
+			    unsigned long *start, unsigned long *end,
+			    pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -4035,17 +4036,29 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 		if (!pmdpp)
 			goto out;
 
+		if (start && end) {
+			*start = address & PMD_MASK;
+			*end = *start + PMD_SIZE;
+			mmu_notifier_invalidate_range_start(mm, *start, *end);
+		}
 		*ptlp = pmd_lock(mm, pmd);
 		if (pmd_huge(*pmd)) {
 			*pmdpp = pmd;
 			return 0;
 		}
 		spin_unlock(*ptlp);
+		if (start && end)
+			mmu_notifier_invalidate_range_end(mm, *start, *end);
 	}
 
 	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
 		goto out;
 
+	if (start && end) {
+		*start = address & PAGE_MASK;
+		*end = *start + PAGE_SIZE;
+		mmu_notifier_invalidate_range_start(mm, *start, *end);
+	}
 	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
 	if (!pte_present(*ptep))
 		goto unlock;
@@ -4053,6 +4066,8 @@ static int __follow_pte_pmd(struct mm_struct *mm, unsigned long address,
 	return 0;
 unlock:
 	pte_unmap_unlock(ptep, *ptlp);
+	if (start && end)
+		mmu_notifier_invalidate_range_end(mm, *start, *end);
 out:
 	return -EINVAL;
 }
@@ -4064,20 +4079,21 @@ static inline int follow_pte(struct mm_struct *mm, unsigned long address,
 
 	/* (void) is needed to make gcc happy */
 	(void) __cond_lock(*ptlp,
-			   !(res = __follow_pte_pmd(mm, address, ptepp, NULL,
-					   ptlp)));
+			   !(res = __follow_pte_pmd(mm, address, NULL, NULL,
+						    ptepp, NULL, ptlp)));
 	return res;
 }
 
 int follow_pte_pmd(struct mm_struct *mm, unsigned long address,
+			     unsigned long *start, unsigned long *end,
 			     pte_t **ptepp, pmd_t **pmdpp, spinlock_t **ptlp)
 {
 	int res;
 
 	/* (void) is needed to make gcc happy */
 	(void) __cond_lock(*ptlp,
-			   !(res = __follow_pte_pmd(mm, address, ptepp, pmdpp,
-					   ptlp)));
+			   !(res = __follow_pte_pmd(mm, address, start, end,
+						    ptepp, pmdpp, ptlp)));
 	return res;
 }
 EXPORT_SYMBOL(follow_pte_pmd);
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 54ca545..3142852 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -174,20 +174,6 @@ void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address,
 	srcu_read_unlock(&srcu, id);
 }
 
-void __mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address)
-{
-	struct mmu_notifier *mn;
-	int id;
-
-	id = srcu_read_lock(&srcu);
-	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
-		if (mn->ops->invalidate_page)
-			mn->ops->invalidate_page(mn, mm, address);
-	}
-	srcu_read_unlock(&srcu, id);
-}
-
 void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 				  unsigned long start, unsigned long end)
 {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7a58eb5..1423da8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3291,10 +3291,13 @@ __alloc_pages_may_oom(gfp_t gfp_mask, unsigned int order,
 	/*
 	 * Go through the zonelist yet one more time, keep very high watermark
 	 * here, this is only to catch a parallel oom killing, we must fail if
-	 * we're still under heavy pressure.
+	 * we're still under heavy pressure. But make sure that this reclaim
+	 * attempt does not depend on a __GFP_DIRECT_RECLAIM && !__GFP_NORETRY
+	 * allocation, which will never fail because oom_lock is already held.
 	 */
-	page = get_page_from_freelist(gfp_mask | __GFP_HARDWALL, order,
-					ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
+	page = get_page_from_freelist((gfp_mask | __GFP_HARDWALL) &
+				      ~__GFP_DIRECT_RECLAIM, order,
+				      ALLOC_WMARK_HIGH|ALLOC_CPUSET, ac);
 	if (page)
 		goto out;
 
diff --git a/mm/rmap.c b/mm/rmap.c
index c1286d4..c570f82 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -887,11 +887,21 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 		.address = address,
 		.flags = PVMW_SYNC,
 	};
+	unsigned long start = address, end;
 	int *cleaned = arg;
-	bool invalidation_needed = false;
+
+	/*
+	 * We have to assume the worst case, i.e. pmd, for invalidation. Note
+	 * that the page cannot be freed from this function.
+	 */
+	end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page)));
+	mmu_notifier_invalidate_range_start(vma->vm_mm, start, end);
 
 	while (page_vma_mapped_walk(&pvmw)) {
+		unsigned long cstart, cend;
 		int ret = 0;
+
+		cstart = address = pvmw.address;
 		if (pvmw.pte) {
 			pte_t entry;
 			pte_t *pte = pvmw.pte;
@@ -899,11 +909,12 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 			if (!pte_dirty(*pte) && !pte_write(*pte))
 				continue;
 
-			flush_cache_page(vma, pvmw.address, pte_pfn(*pte));
-			entry = ptep_clear_flush(vma, pvmw.address, pte);
+			flush_cache_page(vma, address, pte_pfn(*pte));
+			entry = ptep_clear_flush(vma, address, pte);
 			entry = pte_wrprotect(entry);
 			entry = pte_mkclean(entry);
-			set_pte_at(vma->vm_mm, pvmw.address, pte, entry);
+			set_pte_at(vma->vm_mm, address, pte, entry);
+			cend = cstart + PAGE_SIZE;
 			ret = 1;
 		} else {
 #ifdef CONFIG_TRANSPARENT_HUGE_PAGECACHE
@@ -913,11 +924,13 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
 				continue;
 
-			flush_cache_page(vma, pvmw.address, page_to_pfn(page));
-			entry = pmdp_huge_clear_flush(vma, pvmw.address, pmd);
+			flush_cache_page(vma, address, page_to_pfn(page));
+			entry = pmdp_huge_clear_flush(vma, address, pmd);
 			entry = pmd_wrprotect(entry);
 			entry = pmd_mkclean(entry);
-			set_pmd_at(vma->vm_mm, pvmw.address, pmd, entry);
+			set_pmd_at(vma->vm_mm, address, pmd, entry);
+			cstart &= PMD_MASK;
+			cend = cstart + PMD_SIZE;
 			ret = 1;
 #else
 			/* unexpected pmd-mapped page? */
@@ -926,15 +939,12 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 		}
 
 		if (ret) {
+			mmu_notifier_invalidate_range(vma->vm_mm, cstart, cend);
 			(*cleaned)++;
-			invalidation_needed = true;
 		}
 	}
 
-	if (invalidation_needed) {
-		mmu_notifier_invalidate_range(vma->vm_mm, address,
-				address + (1UL << compound_order(page)));
-	}
+	mmu_notifier_invalidate_range_end(vma->vm_mm, start, end);
 
 	return true;
 }
@@ -1328,7 +1338,8 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	};
 	pte_t pteval;
 	struct page *subpage;
-	bool ret = true, invalidation_needed = false;
+	bool ret = true;
+	unsigned long start = address, end;
 	enum ttu_flags flags = (enum ttu_flags)arg;
 
 	/* munlock has nothing to gain from examining un-locked vmas */
@@ -1340,6 +1351,14 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 				flags & TTU_MIGRATION, page);
 	}
 
+	/*
+	 * We have to assume the worst case, i.e. pmd, for invalidation. Note
+	 * that the page cannot be freed in this function, as the caller of
+	 * try_to_unmap() must hold a reference on the page.
+	 */
+	end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page)));
+	mmu_notifier_invalidate_range_start(vma->vm_mm, start, end);
+
 	while (page_vma_mapped_walk(&pvmw)) {
 		/*
 		 * If the page is mlock()d, we cannot swap it out.
@@ -1368,9 +1387,11 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		VM_BUG_ON_PAGE(!pvmw.pte, page);
 
 		subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
+		address = pvmw.address;
+
 
 		if (!(flags & TTU_IGNORE_ACCESS)) {
-			if (ptep_clear_flush_young_notify(vma, pvmw.address,
+			if (ptep_clear_flush_young_notify(vma, address,
 						pvmw.pte)) {
 				ret = false;
 				page_vma_mapped_walk_done(&pvmw);
@@ -1379,7 +1400,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		}
 
 		/* Nuke the page table entry. */
-		flush_cache_page(vma, pvmw.address, pte_pfn(*pvmw.pte));
+		flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
 		if (should_defer_flush(mm, flags)) {
 			/*
 			 * We clear the PTE but do not flush so potentially
@@ -1389,12 +1410,11 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			 * transition on a cached TLB entry is written through
 			 * and traps if the PTE is unmapped.
 			 */
-			pteval = ptep_get_and_clear(mm, pvmw.address,
-						    pvmw.pte);
+			pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 
 			set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
 		} else {
-			pteval = ptep_clear_flush(vma, pvmw.address, pvmw.pte);
+			pteval = ptep_clear_flush(vma, address, pvmw.pte);
 		}
 
 		/* Move the dirty bit to the page. Now the pte is gone. */
@@ -1409,12 +1429,12 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			if (PageHuge(page)) {
 				int nr = 1 << compound_order(page);
 				hugetlb_count_sub(nr, mm);
-				set_huge_swap_pte_at(mm, pvmw.address,
+				set_huge_swap_pte_at(mm, address,
 						     pvmw.pte, pteval,
 						     vma_mmu_pagesize(vma));
 			} else {
 				dec_mm_counter(mm, mm_counter(page));
-				set_pte_at(mm, pvmw.address, pvmw.pte, pteval);
+				set_pte_at(mm, address, pvmw.pte, pteval);
 			}
 
 		} else if (pte_unused(pteval)) {
@@ -1438,7 +1458,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
-			set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
+			set_pte_at(mm, address, pvmw.pte, swp_pte);
 		} else if (PageAnon(page)) {
 			swp_entry_t entry = { .val = page_private(subpage) };
 			pte_t swp_pte;
@@ -1449,6 +1469,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			if (unlikely(PageSwapBacked(page) != PageSwapCache(page))) {
 				WARN_ON_ONCE(1);
 				ret = false;
+				/* We have to invalidate as we cleared the pte */
 				page_vma_mapped_walk_done(&pvmw);
 				break;
 			}
@@ -1464,7 +1485,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 				 * If the page was redirtied, it cannot be
 				 * discarded. Remap the page to page table.
 				 */
-				set_pte_at(mm, pvmw.address, pvmw.pte, pteval);
+				set_pte_at(mm, address, pvmw.pte, pteval);
 				SetPageSwapBacked(page);
 				ret = false;
 				page_vma_mapped_walk_done(&pvmw);
@@ -1472,7 +1493,7 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			}
 
 			if (swap_duplicate(entry) < 0) {
-				set_pte_at(mm, pvmw.address, pvmw.pte, pteval);
+				set_pte_at(mm, address, pvmw.pte, pteval);
 				ret = false;
 				page_vma_mapped_walk_done(&pvmw);
 				break;
@@ -1488,18 +1509,18 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 			swp_pte = swp_entry_to_pte(entry);
 			if (pte_soft_dirty(pteval))
 				swp_pte = pte_swp_mksoft_dirty(swp_pte);
-			set_pte_at(mm, pvmw.address, pvmw.pte, swp_pte);
+			set_pte_at(mm, address, pvmw.pte, swp_pte);
 		} else
 			dec_mm_counter(mm, mm_counter_file(page));
 discard:
 		page_remove_rmap(subpage, PageHuge(page));
 		put_page(page);
-		invalidation_needed = true;
+		mmu_notifier_invalidate_range(mm, address,
+					      address + PAGE_SIZE);
 	}
 
-	if (invalidation_needed)
-		mmu_notifier_invalidate_range(mm, address,
-				address + (1UL << compound_order(page)));
+	mmu_notifier_invalidate_range_end(vma->vm_mm, start, end);
+
 	return ret;
 }
 
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index 861ae2a..5a7be3b 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -53,6 +53,9 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	brstats->tx_bytes += skb->len;
 	u64_stats_update_end(&brstats->syncp);
 
+#ifdef CONFIG_NET_SWITCHDEV
+	skb->offload_fwd_mark = 0;
+#endif
 	BR_INPUT_SKB_CB(skb)->brdev = dev;
 
 	skb_reset_mac_header(skb);
diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c
index 181a44d..f6b1c7d 100644
--- a/net/bridge/br_switchdev.c
+++ b/net/bridge/br_switchdev.c
@@ -115,7 +115,7 @@ br_switchdev_fdb_call_notifiers(bool adding, const unsigned char *mac,
 void
 br_switchdev_fdb_notify(const struct net_bridge_fdb_entry *fdb, int type)
 {
-	if (!fdb->added_by_user)
+	if (!fdb->added_by_user || !fdb->dst)
 		return;
 
 	switch (type) {
diff --git a/net/core/datagram.c b/net/core/datagram.c
index a21ca8d..8c2f448 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -362,7 +362,7 @@ int __sk_queue_drop_skb(struct sock *sk, struct sk_buff_head *sk_queue,
 	if (flags & MSG_PEEK) {
 		err = -ENOENT;
 		spin_lock_bh(&sk_queue->lock);
-		if (skb == skb_peek(sk_queue)) {
+		if (skb->next) {
 			__skb_unlink(skb, sk_queue);
 			refcount_dec(&skb->users);
 			if (destructor)
diff --git a/net/core/dev.c b/net/core/dev.c
index ce15a06..86b4b0a 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5289,6 +5289,7 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock)
 	 * Ideally, a new ndo_busy_poll_stop() could avoid another round.
 	 */
 	rc = napi->poll(napi, BUSY_POLL_BUDGET);
+	trace_napi_poll(napi, rc, BUSY_POLL_BUDGET);
 	netpoll_poll_unlock(have_poll_lock);
 	if (rc == BUSY_POLL_BUDGET)
 		__napi_schedule(napi);
@@ -5667,12 +5668,13 @@ EXPORT_SYMBOL(netdev_has_upper_dev_all_rcu);
  * Find out if a device is linked to an upper device and return true in case
  * it is. The caller must hold the RTNL lock.
  */
-static bool netdev_has_any_upper_dev(struct net_device *dev)
+bool netdev_has_any_upper_dev(struct net_device *dev)
 {
 	ASSERT_RTNL();
 
 	return !list_empty(&dev->adj_list.upper);
 }
+EXPORT_SYMBOL(netdev_has_any_upper_dev);
 
 /**
  * netdev_master_upper_dev_get - Get master upper device
diff --git a/net/core/filter.c b/net/core/filter.c
index 6280a60..1699749 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -2836,15 +2836,12 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
 		   sk->sk_prot->setsockopt == tcp_setsockopt) {
 		if (optname == TCP_CONGESTION) {
 			char name[TCP_CA_NAME_MAX];
+			bool reinit = bpf_sock->op > BPF_SOCK_OPS_NEEDS_ECN;
 
 			strncpy(name, optval, min_t(long, optlen,
 						    TCP_CA_NAME_MAX-1));
 			name[TCP_CA_NAME_MAX-1] = 0;
-			ret = tcp_set_congestion_control(sk, name, false);
-			if (!ret && bpf_sock->op > BPF_SOCK_OPS_NEEDS_ECN)
-				/* replacing an existing ca */
-				tcp_reinit_congestion_control(sk,
-					inet_csk(sk)->icsk_ca_ops);
+			ret = tcp_set_congestion_control(sk, name, false, reinit);
 		} else {
 			struct tcp_sock *tp = tcp_sk(sk);
 
@@ -2872,7 +2869,6 @@ BPF_CALL_5(bpf_setsockopt, struct bpf_sock_ops_kern *, bpf_sock,
 				ret = -EINVAL;
 			}
 		}
-		ret = -EINVAL;
 #endif
 	} else {
 		ret = -EINVAL;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f990eb8..e075566 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1363,18 +1363,20 @@ struct sk_buff *skb_copy_expand(const struct sk_buff *skb,
 EXPORT_SYMBOL(skb_copy_expand);
 
 /**
- *	skb_pad			-	zero pad the tail of an skb
+ *	__skb_pad		-	zero pad the tail of an skb
  *	@skb: buffer to pad
  *	@pad: space to pad
+ *	@free_on_error: free buffer on error
  *
  *	Ensure that a buffer is followed by a padding area that is zero
  *	filled. Used by network drivers which may DMA or transfer data
  *	beyond the buffer end onto the wire.
  *
- *	May return error in out of memory cases. The skb is freed on error.
+ *	May return error in out of memory cases. The skb is freed on error
+ *	if @free_on_error is true.
  */
 
-int skb_pad(struct sk_buff *skb, int pad)
+int __skb_pad(struct sk_buff *skb, int pad, bool free_on_error)
 {
 	int err;
 	int ntail;
@@ -1403,10 +1405,11 @@ int skb_pad(struct sk_buff *skb, int pad)
 	return 0;
 
 free_skb:
-	kfree_skb(skb);
+	if (free_on_error)
+		kfree_skb(skb);
 	return err;
 }
-EXPORT_SYMBOL(skb_pad);
+EXPORT_SYMBOL(__skb_pad);
 
 /**
  *	pskb_put - add data to the tail of a potentially fragmented buffer
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index c442051..20bc9c5 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -577,7 +577,7 @@ static int dsa_dst_parse(struct dsa_switch_tree *dst)
 			return err;
 	}
 
-	if (!dst->cpu_dp->netdev) {
+	if (!dst->cpu_dp) {
 		pr_warn("Tree has no master device\n");
 		return -EINVAL;
 	}
diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c
index de66ca8..fcd90f7 100644
--- a/net/dsa/tag_ksz.c
+++ b/net/dsa/tag_ksz.c
@@ -42,7 +42,8 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)
 	padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
 
 	if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
-		if (skb_put_padto(skb, skb->len + padlen))
+		/* Let dsa_slave_xmit() free skb */
+		if (__skb_put_padto(skb, skb->len + padlen, false))
 			return NULL;
 
 		nskb = skb;
@@ -60,12 +61,13 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)
 					 skb_transport_header(skb) - skb->head);
 		skb_copy_and_csum_dev(skb, skb_put(nskb, skb->len));
 
-		if (skb_put_padto(nskb, nskb->len + padlen)) {
-			kfree_skb(nskb);
+		/* Let skb_put_padto() free nskb, and let dsa_slave_xmit() free
+		 * skb
+		 */
+		if (skb_put_padto(nskb, nskb->len + padlen))
 			return NULL;
-		}
 
-		kfree_skb(skb);
+		consume_skb(skb);
 	}
 
 	tag = skb_put(nskb, KSZ_INGRESS_TAG_LEN);
diff --git a/net/dsa/tag_trailer.c b/net/dsa/tag_trailer.c
index b09e562..9c7b1d7 100644
--- a/net/dsa/tag_trailer.c
+++ b/net/dsa/tag_trailer.c
@@ -40,7 +40,7 @@ static struct sk_buff *trailer_xmit(struct sk_buff *skb, struct net_device *dev)
 	skb_set_network_header(nskb, skb_network_header(skb) - skb->head);
 	skb_set_transport_header(nskb, skb_transport_header(skb) - skb->head);
 	skb_copy_and_csum_dev(skb, skb_put(nskb, skb->len));
-	kfree_skb(skb);
+	consume_skb(skb);
 
 	if (padlen) {
 		skb_put_zero(nskb, padlen);
diff --git a/net/hsr/hsr_device.c b/net/hsr/hsr_device.c
index 4e7bdb2..172d830 100644
--- a/net/hsr/hsr_device.c
+++ b/net/hsr/hsr_device.c
@@ -314,7 +314,8 @@ static void send_hsr_supervision_frame(struct hsr_port *master,
 	hsr_sp = skb_put(skb, sizeof(struct hsr_sup_payload));
 	ether_addr_copy(hsr_sp->MacAddressA, master->dev->dev_addr);
 
-	skb_put_padto(skb, ETH_ZLEN + HSR_HLEN);
+	if (skb_put_padto(skb, ETH_ZLEN + HSR_HLEN))
+		return;
 
 	hsr_forward_skb(skb, master);
 	return;
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 0cbee0a..df68963 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -258,7 +258,7 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 		esp_output_udp_encap(x, skb, esp);
 
 	if (!skb_cloned(skb)) {
-		if (tailen <= skb_availroom(skb)) {
+		if (tailen <= skb_tailroom(skb)) {
 			nfrags = 1;
 			trailer = skb;
 			tail = skb_tail_pointer(trailer);
@@ -292,8 +292,6 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 
 			kunmap_atomic(vaddr);
 
-			spin_unlock_bh(&x->lock);
-
 			nfrags = skb_shinfo(skb)->nr_frags;
 
 			__skb_fill_page_desc(skb, nfrags, page, pfrag->offset,
@@ -301,6 +299,9 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 			skb_shinfo(skb)->nr_frags = ++nfrags;
 
 			pfrag->offset = pfrag->offset + allocsize;
+
+			spin_unlock_bh(&x->lock);
+
 			nfrags++;
 
 			skb->len += tailen;
@@ -381,7 +382,7 @@ int esp_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 		           (unsigned char *)esph - skb->data,
 		           assoclen + ivlen + esp->clen + alen);
 	if (unlikely(err < 0))
-		goto error;
+		goto error_free;
 
 	if (!esp->inplace) {
 		int allocsize;
@@ -392,7 +393,7 @@ int esp_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 		spin_lock_bh(&x->lock);
 		if (unlikely(!skb_page_frag_refill(allocsize, pfrag, GFP_ATOMIC))) {
 			spin_unlock_bh(&x->lock);
-			goto error;
+			goto error_free;
 		}
 
 		skb_shinfo(skb)->nr_frags = 1;
@@ -409,7 +410,7 @@ int esp_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 			           (unsigned char *)esph - skb->data,
 			           assoclen + ivlen + esp->clen + alen);
 		if (unlikely(err < 0))
-			goto error;
+			goto error_free;
 	}
 
 	if ((x->props.flags & XFRM_STATE_ESN))
@@ -442,8 +443,9 @@ int esp_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 
 	if (sg != dsg)
 		esp_ssg_unref(x, tmp);
-	kfree(tmp);
 
+error_free:
+	kfree(tmp);
 error:
 	return err;
 }
@@ -695,8 +697,10 @@ static int esp_input(struct xfrm_state *x, struct sk_buff *skb)
 
 	sg_init_table(sg, nfrags);
 	err = skb_to_sgvec(skb, sg, 0, skb->len);
-	if (unlikely(err < 0))
+	if (unlikely(err < 0)) {
+		kfree(tmp);
 		goto out;
+	}
 
 	skb->ip_summed = CHECKSUM_NONE;
 
diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c
index e066601..5011232 100644
--- a/net/ipv4/esp4_offload.c
+++ b/net/ipv4/esp4_offload.c
@@ -257,7 +257,7 @@ static int esp_xmit(struct xfrm_state *x, struct sk_buff *skb,  netdev_features_
 	esp.seqno = cpu_to_be64(xo->seq.low + ((u64)xo->seq.hi << 32));
 
 	err = esp_output_tail(x, skb, &esp);
-	if (err < 0)
+	if (err)
 		return err;
 
 	secpath_reset(skb);
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 0bc3c3d..9e9d9af 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -268,14 +268,14 @@ unsigned int arpt_do_table(struct sk_buff *skb,
 		acpar.targinfo = t->data;
 		verdict = t->u.kernel.target->target(skb, &acpar);
 
-		/* Target might have changed stuff. */
-		arp = arp_hdr(skb);
-
-		if (verdict == XT_CONTINUE)
+		if (verdict == XT_CONTINUE) {
+			/* Target might have changed stuff. */
+			arp = arp_hdr(skb);
 			e = arpt_next_entry(e);
-		else
+		} else {
 			/* Verdict */
 			break;
+		}
 	} while (!acpar.hotdrop);
 	xt_write_recseq_end(addend);
 	local_bh_enable();
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 2a55a40..622ed28 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -352,13 +352,14 @@ ipt_do_table(struct sk_buff *skb,
 		acpar.targinfo = t->data;
 
 		verdict = t->u.kernel.target->target(skb, &acpar);
-		/* Target might have changed stuff. */
-		ip = ip_hdr(skb);
-		if (verdict == XT_CONTINUE)
+		if (verdict == XT_CONTINUE) {
+			/* Target might have changed stuff. */
+			ip = ip_hdr(skb);
 			e = ipt_next_entry(e);
-		else
+		} else {
 			/* Verdict */
 			break;
+		}
 	} while (!acpar.hotdrop);
 
 	xt_write_recseq_end(addend);
diff --git a/net/ipv4/netfilter/ipt_CLUSTERIP.c b/net/ipv4/netfilter/ipt_CLUSTERIP.c
index 7d72dec..efaa04d 100644
--- a/net/ipv4/netfilter/ipt_CLUSTERIP.c
+++ b/net/ipv4/netfilter/ipt_CLUSTERIP.c
@@ -117,7 +117,8 @@ clusterip_config_entry_put(struct net *net, struct clusterip_config *c)
 		 * functions are also incrementing the refcount on their own,
 		 * so it's safe to remove the entry even if it's in use. */
 #ifdef CONFIG_PROC_FS
-		proc_remove(c->pde);
+		if (cn->procdir)
+			proc_remove(c->pde);
 #endif
 		return;
 	}
@@ -815,6 +816,7 @@ static void clusterip_net_exit(struct net *net)
 #ifdef CONFIG_PROC_FS
 	struct clusterip_net *cn = net_generic(net, clusterip_net_id);
 	proc_remove(cn->procdir);
+	cn->procdir = NULL;
 #endif
 	nf_unregister_net_hook(net, &cip_arp_ops);
 }
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 71ce33d..a3e91b5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2481,7 +2481,7 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
 		name[val] = 0;
 
 		lock_sock(sk);
-		err = tcp_set_congestion_control(sk, name, true);
+		err = tcp_set_congestion_control(sk, name, true, true);
 		release_sock(sk);
 		return err;
 	}
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index fde983f..421ea1b 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -189,8 +189,8 @@ void tcp_init_congestion_control(struct sock *sk)
 		INET_ECN_dontxmit(sk);
 }
 
-void tcp_reinit_congestion_control(struct sock *sk,
-				   const struct tcp_congestion_ops *ca)
+static void tcp_reinit_congestion_control(struct sock *sk,
+					  const struct tcp_congestion_ops *ca)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
 
@@ -338,7 +338,7 @@ int tcp_set_allowed_congestion_control(char *val)
  * tcp_reinit_congestion_control (if the current congestion control was
  * already initialized.
  */
-int tcp_set_congestion_control(struct sock *sk, const char *name, bool load)
+int tcp_set_congestion_control(struct sock *sk, const char *name, bool load, bool reinit)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	const struct tcp_congestion_ops *ca;
@@ -360,9 +360,18 @@ int tcp_set_congestion_control(struct sock *sk, const char *name, bool load)
 	if (!ca) {
 		err = -ENOENT;
 	} else if (!load) {
-		icsk->icsk_ca_ops = ca;
-		if (!try_module_get(ca->owner))
+		const struct tcp_congestion_ops *old_ca = icsk->icsk_ca_ops;
+
+		if (try_module_get(ca->owner)) {
+			if (reinit) {
+				tcp_reinit_congestion_control(sk, ca);
+			} else {
+				icsk->icsk_ca_ops = ca;
+				module_put(old_ca->owner);
+			}
+		} else {
 			err = -EBUSY;
+		}
 	} else if (!((ca->flags & TCP_CONG_NON_RESTRICTED) ||
 		     ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN))) {
 		err = -EPERM;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cd1d044..6234480 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1176,7 +1176,7 @@ static void udp_set_dev_scratch(struct sk_buff *skb)
 	scratch->csum_unnecessary = !!skb_csum_unnecessary(skb);
 	scratch->is_linear = !skb_is_nonlinear(skb);
 #endif
-	if (likely(!skb->_skb_refdst))
+	if (likely(!skb->_skb_refdst && !skb_sec_path(skb)))
 		scratch->_tsize_state |= UDP_SKB_IS_STATELESS;
 }
 
@@ -1929,14 +1929,16 @@ static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 /* For TCP sockets, sk_rx_dst is protected by socket lock
  * For UDP, we use xchg() to guard against concurrent changes.
  */
-void udp_sk_rx_dst_set(struct sock *sk, struct dst_entry *dst)
+bool udp_sk_rx_dst_set(struct sock *sk, struct dst_entry *dst)
 {
 	struct dst_entry *old;
 
 	if (dst_hold_safe(dst)) {
 		old = xchg(&sk->sk_rx_dst, dst);
 		dst_release(old);
+		return old != dst;
 	}
+	return false;
 }
 EXPORT_SYMBOL(udp_sk_rx_dst_set);
 
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 3c46e95..936e9ab 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -5556,7 +5556,7 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
 		 * our DAD process, so we don't need
 		 * to do it again
 		 */
-		if (!(ifp->rt->rt6i_node))
+		if (!rcu_access_pointer(ifp->rt->rt6i_node))
 			ip6_ins_rt(ifp->rt);
 		if (ifp->idev->cnf.forwarding)
 			addrconf_join_anycast(ifp);
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 9ed3547..ab64f36 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -226,7 +226,7 @@ int esp6_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
 	int tailen = esp->tailen;
 
 	if (!skb_cloned(skb)) {
-		if (tailen <= skb_availroom(skb)) {
+		if (tailen <= skb_tailroom(skb)) {
 			nfrags = 1;
 			trailer = skb;
 			tail = skb_tail_pointer(trailer);
@@ -260,8 +260,6 @@ int esp6_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
 
 			kunmap_atomic(vaddr);
 
-			spin_unlock_bh(&x->lock);
-
 			nfrags = skb_shinfo(skb)->nr_frags;
 
 			__skb_fill_page_desc(skb, nfrags, page, pfrag->offset,
@@ -269,6 +267,9 @@ int esp6_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
 			skb_shinfo(skb)->nr_frags = ++nfrags;
 
 			pfrag->offset = pfrag->offset + allocsize;
+
+			spin_unlock_bh(&x->lock);
+
 			nfrags++;
 
 			skb->len += tailen;
@@ -345,7 +346,7 @@ int esp6_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
 		           (unsigned char *)esph - skb->data,
 		           assoclen + ivlen + esp->clen + alen);
 	if (unlikely(err < 0))
-		goto error;
+		goto error_free;
 
 	if (!esp->inplace) {
 		int allocsize;
@@ -356,7 +357,7 @@ int esp6_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
 		spin_lock_bh(&x->lock);
 		if (unlikely(!skb_page_frag_refill(allocsize, pfrag, GFP_ATOMIC))) {
 			spin_unlock_bh(&x->lock);
-			goto error;
+			goto error_free;
 		}
 
 		skb_shinfo(skb)->nr_frags = 1;
@@ -373,7 +374,7 @@ int esp6_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
 			           (unsigned char *)esph - skb->data,
 			           assoclen + ivlen + esp->clen + alen);
 		if (unlikely(err < 0))
-			goto error;
+			goto error_free;
 	}
 
 	if ((x->props.flags & XFRM_STATE_ESN))
@@ -406,8 +407,9 @@ int esp6_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
 
 	if (sg != dsg)
 		esp_ssg_unref(x, tmp);
-	kfree(tmp);
 
+error_free:
+	kfree(tmp);
 error:
 	return err;
 }
diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c
index f02f131..1cf437f 100644
--- a/net/ipv6/esp6_offload.c
+++ b/net/ipv6/esp6_offload.c
@@ -286,7 +286,7 @@ static int esp6_xmit(struct xfrm_state *x, struct sk_buff *skb,  netdev_features
 	esp.seqno = cpu_to_be64(xo->seq.low + ((u64)xo->seq.hi << 32));
 
 	err = esp6_output_tail(x, skb, &esp);
-	if (err < 0)
+	if (err)
 		return err;
 
 	secpath_reset(skb);
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 5cc0ea0..e1c85bb 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -148,11 +148,23 @@ static struct fib6_node *node_alloc(void)
 	return fn;
 }
 
-static void node_free(struct fib6_node *fn)
+static void node_free_immediate(struct fib6_node *fn)
 {
 	kmem_cache_free(fib6_node_kmem, fn);
 }
 
+static void node_free_rcu(struct rcu_head *head)
+{
+	struct fib6_node *fn = container_of(head, struct fib6_node, rcu);
+
+	kmem_cache_free(fib6_node_kmem, fn);
+}
+
+static void node_free(struct fib6_node *fn)
+{
+	call_rcu(&fn->rcu, node_free_rcu);
+}
+
 static void rt6_free_pcpu(struct rt6_info *non_pcpu_rt)
 {
 	int cpu;
@@ -601,9 +613,9 @@ static struct fib6_node *fib6_add_1(struct fib6_node *root,
 
 		if (!in || !ln) {
 			if (in)
-				node_free(in);
+				node_free_immediate(in);
 			if (ln)
-				node_free(ln);
+				node_free_immediate(ln);
 			return ERR_PTR(-ENOMEM);
 		}
 
@@ -877,7 +889,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 
 		rt->dst.rt6_next = iter;
 		*ins = rt;
-		rt->rt6i_node = fn;
+		rcu_assign_pointer(rt->rt6i_node, fn);
 		atomic_inc(&rt->rt6i_ref);
 		if (!info->skip_notify)
 			inet6_rt_notify(RTM_NEWROUTE, rt, info, nlflags);
@@ -903,7 +915,7 @@ static int fib6_add_rt2node(struct fib6_node *fn, struct rt6_info *rt,
 			return err;
 
 		*ins = rt;
-		rt->rt6i_node = fn;
+		rcu_assign_pointer(rt->rt6i_node, fn);
 		rt->dst.rt6_next = iter->dst.rt6_next;
 		atomic_inc(&rt->rt6i_ref);
 		if (!info->skip_notify)
@@ -1038,7 +1050,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 				   root, and then (in failure) stale node
 				   in main tree.
 				 */
-				node_free(sfn);
+				node_free_immediate(sfn);
 				err = PTR_ERR(sn);
 				goto failure;
 			}
@@ -1468,8 +1480,9 @@ static void fib6_del_route(struct fib6_node *fn, struct rt6_info **rtp,
 
 int fib6_del(struct rt6_info *rt, struct nl_info *info)
 {
+	struct fib6_node *fn = rcu_dereference_protected(rt->rt6i_node,
+				    lockdep_is_held(&rt->rt6i_table->tb6_lock));
 	struct net *net = info->nl_net;
-	struct fib6_node *fn = rt->rt6i_node;
 	struct rt6_info **rtp;
 
 #if RT6_DEBUG >= 2
@@ -1658,7 +1671,9 @@ static int fib6_clean_node(struct fib6_walker *w)
 			if (res) {
 #if RT6_DEBUG >= 2
 				pr_debug("%s: del failed: rt=%p@%p err=%d\n",
-					 __func__, rt, rt->rt6i_node, res);
+					 __func__, rt,
+					 rcu_access_pointer(rt->rt6i_node),
+					 res);
 #endif
 				continue;
 			}
@@ -1780,8 +1795,10 @@ static int fib6_age(struct rt6_info *rt, void *arg)
 		}
 		gc_args->more++;
 	} else if (rt->rt6i_flags & RTF_CACHE) {
+		if (time_after_eq(now, rt->dst.lastuse + gc_args->timeout))
+			rt->dst.obsolete = DST_OBSOLETE_KILL;
 		if (atomic_read(&rt->dst.__refcnt) == 1 &&
-		    time_after_eq(now, rt->dst.lastuse + gc_args->timeout)) {
+		    rt->dst.obsolete == DST_OBSOLETE_KILL) {
 			RT6_TRACE("aging clone %p\n", rt);
 			return -1;
 		} else if (rt->rt6i_flags & RTF_GATEWAY) {
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 02d795f..a5e466d 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -242,7 +242,6 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 			pktopt = xchg(&np->pktoptions, NULL);
 			kfree_skb(pktopt);
 
-			sk->sk_destruct = inet_sock_destruct;
 			/*
 			 * ... and add it to the refcnt debug socks count
 			 * in the new family. -acme
diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
index abb2c30..a338bbc 100644
--- a/net/ipv6/output_core.c
+++ b/net/ipv6/output_core.c
@@ -86,7 +86,6 @@ int ip6_find_1stfragopt(struct sk_buff *skb, u8 **nexthdr)
 
 	while (offset <= packet_len) {
 		struct ipv6_opt_hdr *exthdr;
-		unsigned int len;
 
 		switch (**nexthdr) {
 
@@ -112,10 +111,9 @@ int ip6_find_1stfragopt(struct sk_buff *skb, u8 **nexthdr)
 
 		exthdr = (struct ipv6_opt_hdr *)(skb_network_header(skb) +
 						 offset);
-		len = ipv6_optlen(exthdr);
-		if (len + offset >= IPV6_MAXPLEN)
+		offset += ipv6_optlen(exthdr);
+		if (offset > IPV6_MAXPLEN)
 			return -EINVAL;
-		offset += len;
 		*nexthdr = &exthdr->nexthdr;
 	}
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 94d6a13..2d0e779 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -440,7 +440,8 @@ static bool rt6_check_expired(const struct rt6_info *rt)
 		if (time_after(jiffies, rt->dst.expires))
 			return true;
 	} else if (rt->dst.from) {
-		return rt6_check_expired((struct rt6_info *) rt->dst.from);
+		return rt->dst.obsolete != DST_OBSOLETE_FORCE_CHK ||
+		       rt6_check_expired((struct rt6_info *)rt->dst.from);
 	}
 	return false;
 }
@@ -1289,7 +1290,9 @@ static void rt6_dst_from_metrics_check(struct rt6_info *rt)
 
 static struct dst_entry *rt6_check(struct rt6_info *rt, u32 cookie)
 {
-	if (!rt->rt6i_node || (rt->rt6i_node->fn_sernum != cookie))
+	u32 rt_cookie = 0;
+
+	if (!rt6_get_cookie_safe(rt, &rt_cookie) || rt_cookie != cookie)
 		return NULL;
 
 	if (rt6_check_expired(rt))
@@ -1357,8 +1360,14 @@ static void ip6_link_failure(struct sk_buff *skb)
 		if (rt->rt6i_flags & RTF_CACHE) {
 			if (dst_hold_safe(&rt->dst))
 				ip6_del_rt(rt);
-		} else if (rt->rt6i_node && (rt->rt6i_flags & RTF_DEFAULT)) {
-			rt->rt6i_node->fn_sernum = -1;
+		} else {
+			struct fib6_node *fn;
+
+			rcu_read_lock();
+			fn = rcu_dereference(rt->rt6i_node);
+			if (fn && (rt->rt6i_flags & RTF_DEFAULT))
+				fn->fn_sernum = -1;
+			rcu_read_unlock();
 		}
 	}
 }
@@ -1375,7 +1384,8 @@ static void rt6_do_update_pmtu(struct rt6_info *rt, u32 mtu)
 static bool rt6_cache_allowed_for_pmtu(const struct rt6_info *rt)
 {
 	return !(rt->rt6i_flags & RTF_CACHE) &&
-		(rt->rt6i_flags & RTF_PCPU || rt->rt6i_node);
+		(rt->rt6i_flags & RTF_PCPU ||
+		 rcu_access_pointer(rt->rt6i_node));
 }
 
 static void __ip6_rt_update_pmtu(struct dst_entry *dst, const struct sock *sk,
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 20039c8..d688622 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -768,6 +768,15 @@ static int __udp6_lib_mcast_deliver(struct net *net, struct sk_buff *skb,
 	return 0;
 }
 
+static void udp6_sk_rx_dst_set(struct sock *sk, struct dst_entry *dst)
+{
+	if (udp_sk_rx_dst_set(sk, dst)) {
+		const struct rt6_info *rt = (const struct rt6_info *)dst;
+
+		inet6_sk(sk)->rx_dst_cookie = rt6_get_cookie(rt);
+	}
+}
+
 int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 		   int proto)
 {
@@ -817,7 +826,7 @@ int __udp6_lib_rcv(struct sk_buff *skb, struct udp_table *udptable,
 		int ret;
 
 		if (unlikely(sk->sk_rx_dst != dst))
-			udp_sk_rx_dst_set(sk, dst);
+			udp6_sk_rx_dst_set(sk, dst);
 
 		ret = udpv6_queue_rcv_skb(sk, skb);
 		sock_put(sk);
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index da49191..4abf628 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1383,6 +1383,10 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
 	if (!csk)
 		return -EINVAL;
 
+	/* We must prevent loops or risk deadlock ! */
+	if (csk->sk_family == PF_KCM)
+		return -EOPNOTSUPP;
+
 	psock = kmem_cache_zalloc(kcm_psockp, GFP_KERNEL);
 	if (!psock)
 		return -ENOMEM;
diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index b0c2d4a..90165a6 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -113,7 +113,6 @@ struct l2tp_net {
 	spinlock_t l2tp_session_hlist_lock;
 };
 
-static void l2tp_tunnel_free(struct l2tp_tunnel *tunnel);
 
 static inline struct l2tp_tunnel *l2tp_tunnel(struct sock *sk)
 {
@@ -127,39 +126,6 @@ static inline struct l2tp_net *l2tp_pernet(const struct net *net)
 	return net_generic(net, l2tp_net_id);
 }
 
-/* Tunnel reference counts. Incremented per session that is added to
- * the tunnel.
- */
-static inline void l2tp_tunnel_inc_refcount_1(struct l2tp_tunnel *tunnel)
-{
-	refcount_inc(&tunnel->ref_count);
-}
-
-static inline void l2tp_tunnel_dec_refcount_1(struct l2tp_tunnel *tunnel)
-{
-	if (refcount_dec_and_test(&tunnel->ref_count))
-		l2tp_tunnel_free(tunnel);
-}
-#ifdef L2TP_REFCNT_DEBUG
-#define l2tp_tunnel_inc_refcount(_t)					\
-do {									\
-	pr_debug("l2tp_tunnel_inc_refcount: %s:%d %s: cnt=%d\n",	\
-		 __func__, __LINE__, (_t)->name,			\
-		 refcount_read(&_t->ref_count));			\
-	l2tp_tunnel_inc_refcount_1(_t);					\
-} while (0)
-#define l2tp_tunnel_dec_refcount(_t)					\
-do {									\
-	pr_debug("l2tp_tunnel_dec_refcount: %s:%d %s: cnt=%d\n",	\
-		 __func__, __LINE__, (_t)->name,			\
-		 refcount_read(&_t->ref_count));			\
-	l2tp_tunnel_dec_refcount_1(_t);					\
-} while (0)
-#else
-#define l2tp_tunnel_inc_refcount(t) l2tp_tunnel_inc_refcount_1(t)
-#define l2tp_tunnel_dec_refcount(t) l2tp_tunnel_dec_refcount_1(t)
-#endif
-
 /* Session hash global list for L2TPv3.
  * The session_id SHOULD be random according to RFC3931, but several
  * L2TP implementations use incrementing session_ids.  So we do a real
@@ -229,6 +195,27 @@ l2tp_session_id_hash(struct l2tp_tunnel *tunnel, u32 session_id)
 	return &tunnel->session_hlist[hash_32(session_id, L2TP_HASH_BITS)];
 }
 
+/* Lookup a tunnel. A new reference is held on the returned tunnel. */
+struct l2tp_tunnel *l2tp_tunnel_get(const struct net *net, u32 tunnel_id)
+{
+	const struct l2tp_net *pn = l2tp_pernet(net);
+	struct l2tp_tunnel *tunnel;
+
+	rcu_read_lock_bh();
+	list_for_each_entry_rcu(tunnel, &pn->l2tp_tunnel_list, list) {
+		if (tunnel->tunnel_id == tunnel_id) {
+			l2tp_tunnel_inc_refcount(tunnel);
+			rcu_read_unlock_bh();
+
+			return tunnel;
+		}
+	}
+	rcu_read_unlock_bh();
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(l2tp_tunnel_get);
+
 /* Lookup a session. A new reference is held on the returned session.
  * Optionally calls session->ref() too if do_ref is true.
  */
@@ -1348,17 +1335,6 @@ static void l2tp_udp_encap_destroy(struct sock *sk)
 	}
 }
 
-/* Really kill the tunnel.
- * Come here only when all sessions have been cleared from the tunnel.
- */
-static void l2tp_tunnel_free(struct l2tp_tunnel *tunnel)
-{
-	BUG_ON(refcount_read(&tunnel->ref_count) != 0);
-	BUG_ON(tunnel->sock != NULL);
-	l2tp_info(tunnel, L2TP_MSG_CONTROL, "%s: free...\n", tunnel->name);
-	kfree_rcu(tunnel, rcu);
-}
-
 /* Workqueue tunnel deletion function */
 static void l2tp_tunnel_del_work(struct work_struct *work)
 {
@@ -1844,6 +1820,8 @@ struct l2tp_session *l2tp_session_create(int priv_size, struct l2tp_tunnel *tunn
 
 		l2tp_session_set_header_len(session, tunnel->version);
 
+		refcount_set(&session->ref_count, 1);
+
 		err = l2tp_session_add_to_tunnel(tunnel, session);
 		if (err) {
 			kfree(session);
@@ -1851,10 +1829,6 @@ struct l2tp_session *l2tp_session_create(int priv_size, struct l2tp_tunnel *tunn
 			return ERR_PTR(err);
 		}
 
-		/* Bump the reference count. The session context is deleted
-		 * only when this drops to zero.
-		 */
-		refcount_set(&session->ref_count, 1);
 		l2tp_tunnel_inc_refcount(tunnel);
 
 		/* Ensure tunnel socket isn't deleted */
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index cdb6e33..9101297 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -231,6 +231,8 @@ static inline struct l2tp_tunnel *l2tp_sock_to_tunnel(struct sock *sk)
 	return tunnel;
 }
 
+struct l2tp_tunnel *l2tp_tunnel_get(const struct net *net, u32 tunnel_id);
+
 struct l2tp_session *l2tp_session_get(const struct net *net,
 				      struct l2tp_tunnel *tunnel,
 				      u32 session_id, bool do_ref);
@@ -269,6 +271,17 @@ int l2tp_nl_register_ops(enum l2tp_pwtype pw_type,
 void l2tp_nl_unregister_ops(enum l2tp_pwtype pw_type);
 int l2tp_ioctl(struct sock *sk, int cmd, unsigned long arg);
 
+static inline void l2tp_tunnel_inc_refcount(struct l2tp_tunnel *tunnel)
+{
+	refcount_inc(&tunnel->ref_count);
+}
+
+static inline void l2tp_tunnel_dec_refcount(struct l2tp_tunnel *tunnel)
+{
+	if (refcount_dec_and_test(&tunnel->ref_count))
+		kfree_rcu(tunnel, rcu);
+}
+
 /* Session reference counts. Incremented when code obtains a reference
  * to a session.
  */
diff --git a/net/l2tp/l2tp_netlink.c b/net/l2tp/l2tp_netlink.c
index 12cfcd0..57427d4 100644
--- a/net/l2tp/l2tp_netlink.c
+++ b/net/l2tp/l2tp_netlink.c
@@ -65,10 +65,12 @@ static struct l2tp_session *l2tp_nl_session_get(struct genl_info *info,
 		   (info->attrs[L2TP_ATTR_CONN_ID])) {
 		tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);
 		session_id = nla_get_u32(info->attrs[L2TP_ATTR_SESSION_ID]);
-		tunnel = l2tp_tunnel_find(net, tunnel_id);
-		if (tunnel)
+		tunnel = l2tp_tunnel_get(net, tunnel_id);
+		if (tunnel) {
 			session = l2tp_session_get(net, tunnel, session_id,
 						   do_ref);
+			l2tp_tunnel_dec_refcount(tunnel);
+		}
 	}
 
 	return session;
@@ -271,8 +273,8 @@ static int l2tp_nl_cmd_tunnel_delete(struct sk_buff *skb, struct genl_info *info
 	}
 	tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);
 
-	tunnel = l2tp_tunnel_find(net, tunnel_id);
-	if (tunnel == NULL) {
+	tunnel = l2tp_tunnel_get(net, tunnel_id);
+	if (!tunnel) {
 		ret = -ENODEV;
 		goto out;
 	}
@@ -282,6 +284,8 @@ static int l2tp_nl_cmd_tunnel_delete(struct sk_buff *skb, struct genl_info *info
 
 	(void) l2tp_tunnel_delete(tunnel);
 
+	l2tp_tunnel_dec_refcount(tunnel);
+
 out:
 	return ret;
 }
@@ -299,8 +303,8 @@ static int l2tp_nl_cmd_tunnel_modify(struct sk_buff *skb, struct genl_info *info
 	}
 	tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);
 
-	tunnel = l2tp_tunnel_find(net, tunnel_id);
-	if (tunnel == NULL) {
+	tunnel = l2tp_tunnel_get(net, tunnel_id);
+	if (!tunnel) {
 		ret = -ENODEV;
 		goto out;
 	}
@@ -311,6 +315,8 @@ static int l2tp_nl_cmd_tunnel_modify(struct sk_buff *skb, struct genl_info *info
 	ret = l2tp_tunnel_notify(&l2tp_nl_family, info,
 				 tunnel, L2TP_CMD_TUNNEL_MODIFY);
 
+	l2tp_tunnel_dec_refcount(tunnel);
+
 out:
 	return ret;
 }
@@ -438,34 +444,37 @@ static int l2tp_nl_cmd_tunnel_get(struct sk_buff *skb, struct genl_info *info)
 
 	if (!info->attrs[L2TP_ATTR_CONN_ID]) {
 		ret = -EINVAL;
-		goto out;
+		goto err;
 	}
 
 	tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);
 
-	tunnel = l2tp_tunnel_find(net, tunnel_id);
-	if (tunnel == NULL) {
-		ret = -ENODEV;
-		goto out;
-	}
-
 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
 	if (!msg) {
 		ret = -ENOMEM;
-		goto out;
+		goto err;
+	}
+
+	tunnel = l2tp_tunnel_get(net, tunnel_id);
+	if (!tunnel) {
+		ret = -ENODEV;
+		goto err_nlmsg;
 	}
 
 	ret = l2tp_nl_tunnel_send(msg, info->snd_portid, info->snd_seq,
 				  NLM_F_ACK, tunnel, L2TP_CMD_TUNNEL_GET);
 	if (ret < 0)
-		goto err_out;
+		goto err_nlmsg_tunnel;
+
+	l2tp_tunnel_dec_refcount(tunnel);
 
 	return genlmsg_unicast(net, msg, info->snd_portid);
 
-err_out:
+err_nlmsg_tunnel:
+	l2tp_tunnel_dec_refcount(tunnel);
+err_nlmsg:
 	nlmsg_free(msg);
-
-out:
+err:
 	return ret;
 }
 
@@ -509,8 +518,9 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, struct genl_info *inf
 		ret = -EINVAL;
 		goto out;
 	}
+
 	tunnel_id = nla_get_u32(info->attrs[L2TP_ATTR_CONN_ID]);
-	tunnel = l2tp_tunnel_find(net, tunnel_id);
+	tunnel = l2tp_tunnel_get(net, tunnel_id);
 	if (!tunnel) {
 		ret = -ENODEV;
 		goto out;
@@ -518,24 +528,24 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, struct genl_info *inf
 
 	if (!info->attrs[L2TP_ATTR_SESSION_ID]) {
 		ret = -EINVAL;
-		goto out;
+		goto out_tunnel;
 	}
 	session_id = nla_get_u32(info->attrs[L2TP_ATTR_SESSION_ID]);
 
 	if (!info->attrs[L2TP_ATTR_PEER_SESSION_ID]) {
 		ret = -EINVAL;
-		goto out;
+		goto out_tunnel;
 	}
 	peer_session_id = nla_get_u32(info->attrs[L2TP_ATTR_PEER_SESSION_ID]);
 
 	if (!info->attrs[L2TP_ATTR_PW_TYPE]) {
 		ret = -EINVAL;
-		goto out;
+		goto out_tunnel;
 	}
 	cfg.pw_type = nla_get_u16(info->attrs[L2TP_ATTR_PW_TYPE]);
 	if (cfg.pw_type >= __L2TP_PWTYPE_MAX) {
 		ret = -EINVAL;
-		goto out;
+		goto out_tunnel;
 	}
 
 	if (tunnel->version > 2) {
@@ -557,7 +567,7 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, struct genl_info *inf
 			u16 len = nla_len(info->attrs[L2TP_ATTR_COOKIE]);
 			if (len > 8) {
 				ret = -EINVAL;
-				goto out;
+				goto out_tunnel;
 			}
 			cfg.cookie_len = len;
 			memcpy(&cfg.cookie[0], nla_data(info->attrs[L2TP_ATTR_COOKIE]), len);
@@ -566,7 +576,7 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, struct genl_info *inf
 			u16 len = nla_len(info->attrs[L2TP_ATTR_PEER_COOKIE]);
 			if (len > 8) {
 				ret = -EINVAL;
-				goto out;
+				goto out_tunnel;
 			}
 			cfg.peer_cookie_len = len;
 			memcpy(&cfg.peer_cookie[0], nla_data(info->attrs[L2TP_ATTR_PEER_COOKIE]), len);
@@ -609,7 +619,7 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, struct genl_info *inf
 	if ((l2tp_nl_cmd_ops[cfg.pw_type] == NULL) ||
 	    (l2tp_nl_cmd_ops[cfg.pw_type]->session_create == NULL)) {
 		ret = -EPROTONOSUPPORT;
-		goto out;
+		goto out_tunnel;
 	}
 
 	/* Check that pseudowire-specific params are present */
@@ -619,7 +629,7 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, struct genl_info *inf
 	case L2TP_PWTYPE_ETH_VLAN:
 		if (!info->attrs[L2TP_ATTR_VLAN_ID]) {
 			ret = -EINVAL;
-			goto out;
+			goto out_tunnel;
 		}
 		break;
 	case L2TP_PWTYPE_ETH:
@@ -647,6 +657,8 @@ static int l2tp_nl_cmd_session_create(struct sk_buff *skb, struct genl_info *inf
 		}
 	}
 
+out_tunnel:
+	l2tp_tunnel_dec_refcount(tunnel);
 out:
 	return ret;
 }
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 9979f46..51390fe 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -96,19 +96,26 @@ static struct conntrack_gc_work conntrack_gc_work;
 
 void nf_conntrack_lock(spinlock_t *lock) __acquires(lock)
 {
+	/* 1) Acquire the lock */
 	spin_lock(lock);
-	while (unlikely(nf_conntrack_locks_all)) {
-		spin_unlock(lock);
 
-		/*
-		 * Order the 'nf_conntrack_locks_all' load vs. the
-		 * spin_unlock_wait() loads below, to ensure
-		 * that 'nf_conntrack_locks_all_lock' is indeed held:
-		 */
-		smp_rmb(); /* spin_lock(&nf_conntrack_locks_all_lock) */
-		spin_unlock_wait(&nf_conntrack_locks_all_lock);
-		spin_lock(lock);
-	}
+	/* 2) read nf_conntrack_locks_all, with ACQUIRE semantics
+	 * It pairs with the smp_store_release() in nf_conntrack_all_unlock()
+	 */
+	if (likely(smp_load_acquire(&nf_conntrack_locks_all) == false))
+		return;
+
+	/* fast path failed, unlock */
+	spin_unlock(lock);
+
+	/* Slow path 1) get global lock */
+	spin_lock(&nf_conntrack_locks_all_lock);
+
+	/* Slow path 2) get the lock we want */
+	spin_lock(lock);
+
+	/* Slow path 3) release the global lock */
+	spin_unlock(&nf_conntrack_locks_all_lock);
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_lock);
 
@@ -149,28 +156,27 @@ static void nf_conntrack_all_lock(void)
 	int i;
 
 	spin_lock(&nf_conntrack_locks_all_lock);
+
 	nf_conntrack_locks_all = true;
 
-	/*
-	 * Order the above store of 'nf_conntrack_locks_all' against
-	 * the spin_unlock_wait() loads below, such that if
-	 * nf_conntrack_lock() observes 'nf_conntrack_locks_all'
-	 * we must observe nf_conntrack_locks[] held:
-	 */
-	smp_mb(); /* spin_lock(&nf_conntrack_locks_all_lock) */
-
 	for (i = 0; i < CONNTRACK_LOCKS; i++) {
-		spin_unlock_wait(&nf_conntrack_locks[i]);
+		spin_lock(&nf_conntrack_locks[i]);
+
+		/* This spin_unlock provides the "release" to ensure that
+		 * nf_conntrack_locks_all==true is visible to everyone that
+		 * acquired spin_lock(&nf_conntrack_locks[]).
+		 */
+		spin_unlock(&nf_conntrack_locks[i]);
 	}
 }
 
 static void nf_conntrack_all_unlock(void)
 {
-	/*
-	 * All prior stores must be complete before we clear
+	/* All prior stores must be complete before we clear
 	 * 'nf_conntrack_locks_all'. Otherwise nf_conntrack_lock()
 	 * might observe the false value but not the entire
-	 * critical section:
+	 * critical section.
+	 * It pairs with the smp_load_acquire() in nf_conntrack_lock()
 	 */
 	smp_store_release(&nf_conntrack_locks_all, false);
 	spin_unlock(&nf_conntrack_locks_all_lock);
diff --git a/net/netfilter/nf_nat_core.c b/net/netfilter/nf_nat_core.c
index eb54178..b1d3740 100644
--- a/net/netfilter/nf_nat_core.c
+++ b/net/netfilter/nf_nat_core.c
@@ -441,7 +441,7 @@ nf_nat_setup_info(struct nf_conn *ct,
 		else
 			ct->status |= IPS_DST_NAT;
 
-		if (nfct_help(ct))
+		if (nfct_help(ct) && !nfct_seqadj(ct))
 			if (!nfct_seqadj_ext_add(ct))
 				return NF_DROP;
 	}
diff --git a/net/netfilter/nft_compat.c b/net/netfilter/nft_compat.c
index f5a7cb6..b89f4f6 100644
--- a/net/netfilter/nft_compat.c
+++ b/net/netfilter/nft_compat.c
@@ -305,7 +305,7 @@ static int nft_target_validate(const struct nft_ctx *ctx,
 		const struct nf_hook_ops *ops = &basechain->ops[0];
 
 		hook_mask = 1 << ops->hooknum;
-		if (!(hook_mask & target->hooks))
+		if (target->hooks && !(hook_mask & target->hooks))
 			return -EINVAL;
 
 		ret = nft_compat_chain_validate_dependency(target->table,
@@ -484,7 +484,7 @@ static int nft_match_validate(const struct nft_ctx *ctx,
 		const struct nf_hook_ops *ops = &basechain->ops[0];
 
 		hook_mask = 1 << ops->hooknum;
-		if (!(hook_mask & match->hooks))
+		if (match->hooks && !(hook_mask & match->hooks))
 			return -EINVAL;
 
 		ret = nft_compat_chain_validate_dependency(match->table,
diff --git a/net/netfilter/nft_limit.c b/net/netfilter/nft_limit.c
index 18dd57a..14538b1 100644
--- a/net/netfilter/nft_limit.c
+++ b/net/netfilter/nft_limit.c
@@ -65,19 +65,23 @@ static int nft_limit_init(struct nft_limit *limit,
 	limit->nsecs = unit * NSEC_PER_SEC;
 	if (limit->rate == 0 || limit->nsecs < unit)
 		return -EOVERFLOW;
-	limit->tokens = limit->tokens_max = limit->nsecs;
 
-	if (tb[NFTA_LIMIT_BURST]) {
-		u64 rate;
-
+	if (tb[NFTA_LIMIT_BURST])
 		limit->burst = ntohl(nla_get_be32(tb[NFTA_LIMIT_BURST]));
+	else
+		limit->burst = 0;
 
-		rate = limit->rate + limit->burst;
-		if (rate < limit->rate)
-			return -EOVERFLOW;
+	if (limit->rate + limit->burst < limit->rate)
+		return -EOVERFLOW;
 
-		limit->rate = rate;
-	}
+	/* The token bucket size limits the number of tokens that can be
+	 * accumulated. tokens_max specifies the bucket size.
+	 * tokens_max = unit * (rate + burst) / rate.
+	 */
+	limit->tokens = div_u64(limit->nsecs * (limit->rate + limit->burst),
+				limit->rate);
+	limit->tokens_max = limit->tokens;
+
 	if (tb[NFTA_LIMIT_FLAGS]) {
 		u32 flags = ntohl(nla_get_be32(tb[NFTA_LIMIT_FLAGS]));
 
@@ -95,9 +99,8 @@ static int nft_limit_dump(struct sk_buff *skb, const struct nft_limit *limit,
 {
 	u32 flags = limit->invert ? NFT_LIMIT_F_INV : 0;
 	u64 secs = div_u64(limit->nsecs, NSEC_PER_SEC);
-	u64 rate = limit->rate - limit->burst;
 
-	if (nla_put_be64(skb, NFTA_LIMIT_RATE, cpu_to_be64(rate),
+	if (nla_put_be64(skb, NFTA_LIMIT_RATE, cpu_to_be64(limit->rate),
 			 NFTA_LIMIT_PAD) ||
 	    nla_put_be64(skb, NFTA_LIMIT_UNIT, cpu_to_be64(secs),
 			 NFTA_LIMIT_PAD) ||
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 008a45c..1c61af9 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2191,6 +2191,7 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 	struct timespec ts;
 	__u32 ts_status;
 	bool is_drop_n_account = false;
+	bool do_vnet = false;
 
 	/* struct tpacket{2,3}_hdr is aligned to a multiple of TPACKET_ALIGNMENT.
 	 * We may add members to them until current aligned size without forcing
@@ -2241,8 +2242,10 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 		netoff = TPACKET_ALIGN(po->tp_hdrlen +
 				       (maclen < 16 ? 16 : maclen)) +
 				       po->tp_reserve;
-		if (po->has_vnet_hdr)
+		if (po->has_vnet_hdr) {
 			netoff += sizeof(struct virtio_net_hdr);
+			do_vnet = true;
+		}
 		macoff = netoff - maclen;
 	}
 	if (po->tp_version <= TPACKET_V2) {
@@ -2259,8 +2262,10 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 					skb_set_owner_r(copy_skb, sk);
 			}
 			snaplen = po->rx_ring.frame_size - macoff;
-			if ((int)snaplen < 0)
+			if ((int)snaplen < 0) {
 				snaplen = 0;
+				do_vnet = false;
+			}
 		}
 	} else if (unlikely(macoff + snaplen >
 			    GET_PBDQC_FROM_RB(&po->rx_ring)->max_frame_len)) {
@@ -2273,6 +2278,7 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 		if (unlikely((int)snaplen < 0)) {
 			snaplen = 0;
 			macoff = GET_PBDQC_FROM_RB(&po->rx_ring)->max_frame_len;
+			do_vnet = false;
 		}
 	}
 	spin_lock(&sk->sk_receive_queue.lock);
@@ -2298,7 +2304,7 @@ static int tpacket_rcv(struct sk_buff *skb, struct net_device *dev,
 	}
 	spin_unlock(&sk->sk_receive_queue.lock);
 
-	if (po->has_vnet_hdr) {
+	if (do_vnet) {
 		if (virtio_net_hdr_from_skb(skb, h.raw + macoff -
 					    sizeof(struct virtio_net_hdr),
 					    vio_le(), true)) {
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 9fd44c2..6c5ea84 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -215,9 +215,15 @@ static void tcf_chain_flush(struct tcf_chain *chain)
 
 static void tcf_chain_destroy(struct tcf_chain *chain)
 {
-	list_del(&chain->list);
-	tcf_chain_flush(chain);
-	kfree(chain);
+	/* May be already removed from the list by the previous call. */
+	if (!list_empty(&chain->list))
+		list_del_init(&chain->list);
+
+	/* A reference might still be held when we get here from
+	 * tcf_block_put(). Wait for the user to drop it before freeing.
+	 */
+	if (!chain->refcnt)
+		kfree(chain);
 }
 
 struct tcf_chain *tcf_chain_get(struct tcf_block *block, u32 chain_index,
@@ -288,8 +294,10 @@ void tcf_block_put(struct tcf_block *block)
 	if (!block)
 		return;
 
-	list_for_each_entry_safe(chain, tmp, &block->chain_list, list)
+	list_for_each_entry_safe(chain, tmp, &block->chain_list, list) {
+		tcf_chain_flush(chain);
 		tcf_chain_destroy(chain);
+	}
 	kfree(block);
 }
 EXPORT_SYMBOL(tcf_block_put);
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index a3fa144..4fb5a322 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -836,7 +836,7 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 
 			old = dev_graft_qdisc(dev_queue, new);
 			if (new && i > 0)
-				refcount_inc(&new->refcnt);
+				qdisc_refcount_inc(new);
 
 			if (!ingress)
 				qdisc_destroy(old);
@@ -847,7 +847,7 @@ static int qdisc_graft(struct net_device *dev, struct Qdisc *parent,
 			notify_and_destroy(net, skb, n, classid,
 					   dev->qdisc, new);
 			if (new && !new->ops->attach)
-				refcount_inc(&new->refcnt);
+				qdisc_refcount_inc(new);
 			dev->qdisc = new ? : &noop_qdisc;
 
 			if (new && new->ops->attach)
@@ -1256,7 +1256,7 @@ static int tc_modify_qdisc(struct sk_buff *skb, struct nlmsghdr *n,
 				if (q == p ||
 				    (p && check_loop(q, p, 0)))
 					return -ELOOP;
-				refcount_inc(&q->refcnt);
+				qdisc_refcount_inc(q);
 				goto graft;
 			} else {
 				if (!q)
diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
index 780db43..156c8a3 100644
--- a/net/sched/sch_cbq.c
+++ b/net/sched/sch_cbq.c
@@ -1139,6 +1139,13 @@ static int cbq_init(struct Qdisc *sch, struct nlattr *opt)
 	struct tc_ratespec *r;
 	int err;
 
+	qdisc_watchdog_init(&q->watchdog, sch);
+	hrtimer_init(&q->delay_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED);
+	q->delay_timer.function = cbq_undelay;
+
+	if (!opt)
+		return -EINVAL;
+
 	err = nla_parse_nested(tb, TCA_CBQ_MAX, opt, cbq_policy, NULL);
 	if (err < 0)
 		return err;
@@ -1177,9 +1184,6 @@ static int cbq_init(struct Qdisc *sch, struct nlattr *opt)
 	q->link.avpkt = q->link.allot/2;
 	q->link.minidle = -0x7FFFFFFF;
 
-	qdisc_watchdog_init(&q->watchdog, sch);
-	hrtimer_init(&q->delay_timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED);
-	q->delay_timer.function = cbq_undelay;
 	q->toplevel = TC_CBQ_MAXLEVEL;
 	q->now = psched_get_time();
 
diff --git a/net/sched/sch_fq_codel.c b/net/sched/sch_fq_codel.c
index 337f2d6..2c0c05f 100644
--- a/net/sched/sch_fq_codel.c
+++ b/net/sched/sch_fq_codel.c
@@ -491,10 +491,8 @@ static int fq_codel_init(struct Qdisc *sch, struct nlattr *opt)
 		if (!q->flows)
 			return -ENOMEM;
 		q->backlogs = kvzalloc(q->flows_cnt * sizeof(u32), GFP_KERNEL);
-		if (!q->backlogs) {
-			kvfree(q->flows);
+		if (!q->backlogs)
 			return -ENOMEM;
-		}
 		for (i = 0; i < q->flows_cnt; i++) {
 			struct fq_codel_flow *flow = q->flows + i;
 
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 57ba406..4ba6da5 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -785,7 +785,7 @@ static void attach_default_qdiscs(struct net_device *dev)
 	    dev->priv_flags & IFF_NO_QUEUE) {
 		netdev_for_each_tx_queue(dev, attach_one_default_qdisc, NULL);
 		dev->qdisc = txq->qdisc_sleeping;
-		refcount_inc(&dev->qdisc->refcnt);
+		qdisc_refcount_inc(dev->qdisc);
 	} else {
 		qdisc = qdisc_create_dflt(txq, &mq_qdisc_ops, TC_H_ROOT);
 		if (qdisc) {
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c
index fd15200..11ab8da 100644
--- a/net/sched/sch_hfsc.c
+++ b/net/sched/sch_hfsc.c
@@ -1418,6 +1418,8 @@ hfsc_init_qdisc(struct Qdisc *sch, struct nlattr *opt)
 	struct tc_hfsc_qopt *qopt;
 	int err;
 
+	qdisc_watchdog_init(&q->watchdog, sch);
+
 	if (opt == NULL || nla_len(opt) < sizeof(*qopt))
 		return -EINVAL;
 	qopt = nla_data(opt);
@@ -1430,7 +1432,7 @@ hfsc_init_qdisc(struct Qdisc *sch, struct nlattr *opt)
 
 	err = tcf_block_get(&q->root.block, &q->root.filter_list);
 	if (err)
-		goto err_tcf;
+		return err;
 
 	q->root.cl_common.classid = sch->handle;
 	q->root.refcnt  = 1;
@@ -1448,13 +1450,7 @@ hfsc_init_qdisc(struct Qdisc *sch, struct nlattr *opt)
 	qdisc_class_hash_insert(&q->clhash, &q->root.cl_common);
 	qdisc_class_hash_grow(sch, &q->clhash);
 
-	qdisc_watchdog_init(&q->watchdog, sch);
-
 	return 0;
-
-err_tcf:
-	qdisc_class_hash_destroy(&q->clhash);
-	return err;
 }
 
 static int
diff --git a/net/sched/sch_hhf.c b/net/sched/sch_hhf.c
index 51d3ba6..73a53c0 100644
--- a/net/sched/sch_hhf.c
+++ b/net/sched/sch_hhf.c
@@ -477,6 +477,9 @@ static void hhf_destroy(struct Qdisc *sch)
 		kvfree(q->hhf_valid_bits[i]);
 	}
 
+	if (!q->hh_flows)
+		return;
+
 	for (i = 0; i < HH_FLOWS_CNT; i++) {
 		struct hh_flow_state *flow, *next;
 		struct list_head *head = &q->hh_flows[i];
diff --git a/net/sched/sch_htb.c b/net/sched/sch_htb.c
index 5d65ec5..5bf5177 100644
--- a/net/sched/sch_htb.c
+++ b/net/sched/sch_htb.c
@@ -1017,6 +1017,9 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt)
 	int err;
 	int i;
 
+	qdisc_watchdog_init(&q->watchdog, sch);
+	INIT_WORK(&q->work, htb_work_func);
+
 	if (!opt)
 		return -EINVAL;
 
@@ -1041,8 +1044,6 @@ static int htb_init(struct Qdisc *sch, struct nlattr *opt)
 	for (i = 0; i < TC_HTB_NUMPRIO; i++)
 		INIT_LIST_HEAD(q->drops + i);
 
-	qdisc_watchdog_init(&q->watchdog, sch);
-	INIT_WORK(&q->work, htb_work_func);
 	qdisc_skb_head_init(&q->direct_queue);
 
 	if (tb[TCA_HTB_DIRECT_QLEN])
diff --git a/net/sched/sch_multiq.c b/net/sched/sch_multiq.c
index f143b7b..9c454f5 100644
--- a/net/sched/sch_multiq.c
+++ b/net/sched/sch_multiq.c
@@ -257,12 +257,7 @@ static int multiq_init(struct Qdisc *sch, struct nlattr *opt)
 	for (i = 0; i < q->max_bands; i++)
 		q->queues[i] = &noop_qdisc;
 
-	err = multiq_tune(sch, opt);
-
-	if (err)
-		kfree(q->queues);
-
-	return err;
+	return multiq_tune(sch, opt);
 }
 
 static int multiq_dump(struct Qdisc *sch, struct sk_buff *skb)
diff --git a/net/sched/sch_netem.c b/net/sched/sch_netem.c
index 1b3dd61..14d1724 100644
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -933,11 +933,11 @@ static int netem_init(struct Qdisc *sch, struct nlattr *opt)
 	struct netem_sched_data *q = qdisc_priv(sch);
 	int ret;
 
+	qdisc_watchdog_init(&q->watchdog, sch);
+
 	if (!opt)
 		return -EINVAL;
 
-	qdisc_watchdog_init(&q->watchdog, sch);
-
 	q->loss_model = CLG_RANDOM;
 	ret = netem_change(sch, opt);
 	if (ret)
diff --git a/net/sched/sch_sfq.c b/net/sched/sch_sfq.c
index 82469ef..fc69fc5 100644
--- a/net/sched/sch_sfq.c
+++ b/net/sched/sch_sfq.c
@@ -716,13 +716,13 @@ static int sfq_init(struct Qdisc *sch, struct nlattr *opt)
 	int i;
 	int err;
 
+	setup_deferrable_timer(&q->perturb_timer, sfq_perturbation,
+			       (unsigned long)sch);
+
 	err = tcf_block_get(&q->block, &q->filter_list);
 	if (err)
 		return err;
 
-	setup_deferrable_timer(&q->perturb_timer, sfq_perturbation,
-			       (unsigned long)sch);
-
 	for (i = 0; i < SFQ_MAX_DEPTH + 1; i++) {
 		q->dep[i].next = i + SFQ_MAX_FLOWS;
 		q->dep[i].prev = i + SFQ_MAX_FLOWS;
diff --git a/net/sched/sch_tbf.c b/net/sched/sch_tbf.c
index b2e4b6a..493270f 100644
--- a/net/sched/sch_tbf.c
+++ b/net/sched/sch_tbf.c
@@ -425,12 +425,13 @@ static int tbf_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct tbf_sched_data *q = qdisc_priv(sch);
 
+	qdisc_watchdog_init(&q->watchdog, sch);
+	q->qdisc = &noop_qdisc;
+
 	if (opt == NULL)
 		return -EINVAL;
 
 	q->t_c = ktime_get_ns();
-	qdisc_watchdog_init(&q->watchdog, sch);
-	q->qdisc = &noop_qdisc;
 
 	return tbf_change(sch, opt);
 }
diff --git a/net/sctp/sctp_diag.c b/net/sctp/sctp_diag.c
index 9a64721..e99518e 100644
--- a/net/sctp/sctp_diag.c
+++ b/net/sctp/sctp_diag.c
@@ -70,7 +70,8 @@ static int inet_diag_msg_sctpladdrs_fill(struct sk_buff *skb,
 
 	info = nla_data(attr);
 	list_for_each_entry_rcu(laddr, address_list, list) {
-		memcpy(info, &laddr->a, addrlen);
+		memcpy(info, &laddr->a, sizeof(laddr->a));
+		memset(info + sizeof(laddr->a), 0, addrlen - sizeof(laddr->a));
 		info += addrlen;
 	}
 
@@ -93,7 +94,9 @@ static int inet_diag_msg_sctpaddrs_fill(struct sk_buff *skb,
 	info = nla_data(attr);
 	list_for_each_entry(from, &asoc->peer.transport_addr_list,
 			    transports) {
-		memcpy(info, &from->ipaddr, addrlen);
+		memcpy(info, &from->ipaddr, sizeof(from->ipaddr));
+		memset(info + sizeof(from->ipaddr), 0,
+		       addrlen - sizeof(from->ipaddr));
 		info += addrlen;
 	}
 
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 1db478e..8d76086 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4538,8 +4538,7 @@ int sctp_get_sctp_info(struct sock *sk, struct sctp_association *asoc,
 	info->sctpi_ictrlchunks = asoc->stats.ictrlchunks;
 
 	prim = asoc->peer.primary_path;
-	memcpy(&info->sctpi_p_address, &prim->ipaddr,
-	       sizeof(struct sockaddr_storage));
+	memcpy(&info->sctpi_p_address, &prim->ipaddr, sizeof(prim->ipaddr));
 	info->sctpi_p_state = prim->state;
 	info->sctpi_p_cwnd = prim->cwnd;
 	info->sctpi_p_srtt = prim->srtt;
diff --git a/net/tipc/bearer.c b/net/tipc/bearer.c
index 767e053..89cd061 100644
--- a/net/tipc/bearer.c
+++ b/net/tipc/bearer.c
@@ -65,6 +65,8 @@ static struct tipc_bearer *bearer_get(struct net *net, int bearer_id)
 }
 
 static void bearer_disable(struct net *net, struct tipc_bearer *b);
+static int tipc_l2_rcv_msg(struct sk_buff *skb, struct net_device *dev,
+			   struct packet_type *pt, struct net_device *orig_dev);
 
 /**
  * tipc_media_find - locates specified media object by name
@@ -428,6 +430,10 @@ int tipc_enable_l2_media(struct net *net, struct tipc_bearer *b,
 
 	/* Associate TIPC bearer with L2 bearer */
 	rcu_assign_pointer(b->media_ptr, dev);
+	b->pt.dev = dev;
+	b->pt.type = htons(ETH_P_TIPC);
+	b->pt.func = tipc_l2_rcv_msg;
+	dev_add_pack(&b->pt);
 	memset(&b->bcast_addr, 0, sizeof(b->bcast_addr));
 	memcpy(b->bcast_addr.value, dev->broadcast, b->media->hwaddr_len);
 	b->bcast_addr.media_id = b->media->type_id;
@@ -447,6 +453,7 @@ void tipc_disable_l2_media(struct tipc_bearer *b)
 	struct net_device *dev;
 
 	dev = (struct net_device *)rtnl_dereference(b->media_ptr);
+	dev_remove_pack(&b->pt);
 	RCU_INIT_POINTER(dev->tipc_ptr, NULL);
 	synchronize_net();
 	dev_put(dev);
@@ -594,11 +601,12 @@ static int tipc_l2_rcv_msg(struct sk_buff *skb, struct net_device *dev,
 	struct tipc_bearer *b;
 
 	rcu_read_lock();
-	b = rcu_dereference_rtnl(dev->tipc_ptr);
+	b = rcu_dereference_rtnl(dev->tipc_ptr) ?:
+		rcu_dereference_rtnl(orig_dev->tipc_ptr);
 	if (likely(b && test_bit(0, &b->up) &&
 		   (skb->pkt_type <= PACKET_MULTICAST))) {
 		skb->next = NULL;
-		tipc_rcv(dev_net(dev), skb, b);
+		tipc_rcv(dev_net(b->pt.dev), skb, b);
 		rcu_read_unlock();
 		return NET_RX_SUCCESS;
 	}
@@ -659,11 +667,6 @@ static int tipc_l2_device_event(struct notifier_block *nb, unsigned long evt,
 	return NOTIFY_OK;
 }
 
-static struct packet_type tipc_packet_type __read_mostly = {
-	.type = htons(ETH_P_TIPC),
-	.func = tipc_l2_rcv_msg,
-};
-
 static struct notifier_block notifier = {
 	.notifier_call  = tipc_l2_device_event,
 	.priority	= 0,
@@ -671,19 +674,12 @@ static struct notifier_block notifier = {
 
 int tipc_bearer_setup(void)
 {
-	int err;
-
-	err = register_netdevice_notifier(&notifier);
-	if (err)
-		return err;
-	dev_add_pack(&tipc_packet_type);
-	return 0;
+	return register_netdevice_notifier(&notifier);
 }
 
 void tipc_bearer_cleanup(void)
 {
 	unregister_netdevice_notifier(&notifier);
-	dev_remove_pack(&tipc_packet_type);
 }
 
 void tipc_bearer_stop(struct net *net)
diff --git a/net/tipc/bearer.h b/net/tipc/bearer.h
index 635c908..e07a55a 100644
--- a/net/tipc/bearer.h
+++ b/net/tipc/bearer.h
@@ -131,6 +131,7 @@ struct tipc_media {
  * @name: bearer name (format = media:interface)
  * @media: ptr to media structure associated with bearer
  * @bcast_addr: media address used in broadcasting
+ * @pt: packet type for bearer
  * @rcu: rcu struct for tipc_bearer
  * @priority: default link priority for bearer
  * @window: default window size for bearer
@@ -151,6 +152,7 @@ struct tipc_bearer {
 	char name[TIPC_MAX_BEARER_NAME];
 	struct tipc_media *media;
 	struct tipc_media_addr bcast_addr;
+	struct packet_type pt;
 	struct rcu_head rcu;
 	u32 priority;
 	u32 window;
diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index dcd90e6..6ef379f 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -479,13 +479,14 @@ bool tipc_msg_make_bundle(struct sk_buff **skb,  struct tipc_msg *msg,
 bool tipc_msg_reverse(u32 own_node,  struct sk_buff **skb, int err)
 {
 	struct sk_buff *_skb = *skb;
-	struct tipc_msg *hdr = buf_msg(_skb);
+	struct tipc_msg *hdr;
 	struct tipc_msg ohdr;
-	int dlen = min_t(uint, msg_data_sz(hdr), MAX_FORWARD_SIZE);
+	int dlen;
 
 	if (skb_linearize(_skb))
 		goto exit;
 	hdr = buf_msg(_skb);
+	dlen = min_t(uint, msg_data_sz(hdr), MAX_FORWARD_SIZE);
 	if (msg_dest_droppable(hdr))
 		goto exit;
 	if (msg_errcode(hdr))
@@ -511,6 +512,8 @@ bool tipc_msg_reverse(u32 own_node,  struct sk_buff **skb, int err)
 	    pskb_expand_head(_skb, BUF_HEADROOM, BUF_TAILROOM, GFP_ATOMIC))
 		goto exit;
 
+	/* reassign after skb header modifications */
+	hdr = buf_msg(_skb);
 	/* Now reverse the concerned fields */
 	msg_set_errcode(hdr, err);
 	msg_set_non_seq(hdr, 0);
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 9b4dcb6..7dd2233 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1126,8 +1126,8 @@ int tipc_node_get_linkname(struct net *net, u32 bearer_id, u32 addr,
 		strncpy(linkname, tipc_link_name(link), len);
 		err = 0;
 	}
-exit:
 	tipc_node_read_unlock(node);
+exit:
 	tipc_node_put(node);
 	return err;
 }
@@ -1557,6 +1557,8 @@ void tipc_rcv(struct net *net, struct sk_buff *skb, struct tipc_bearer *b)
 
 	/* Check/update node state before receiving */
 	if (unlikely(skb)) {
+		if (unlikely(skb_linearize(skb)))
+			goto discard;
 		tipc_node_write_lock(n);
 		if (tipc_node_check_state(n, skb, bearer_id, &xmitq)) {
 			if (le->link) {
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 101e359..d50edd6 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2255,8 +2255,8 @@ void tipc_sk_reinit(struct net *net)
 
 	do {
 		tsk = ERR_PTR(rhashtable_walk_start(&iter));
-		if (tsk)
-			continue;
+		if (IS_ERR(tsk))
+			goto walk_stop;
 
 		while ((tsk = rhashtable_walk_next(&iter)) && !IS_ERR(tsk)) {
 			spin_lock_bh(&tsk->sk.sk_lock.slock);
@@ -2265,7 +2265,7 @@ void tipc_sk_reinit(struct net *net)
 			msg_set_orignode(msg, tn->own_addr);
 			spin_unlock_bh(&tsk->sk.sk_lock.slock);
 		}
-
+walk_stop:
 		rhashtable_walk_stop(&iter);
 	} while (tsk == ERR_PTR(-EAGAIN));
 }
diff --git a/net/tipc/subscr.c b/net/tipc/subscr.c
index 0bf91cd..be3d9e3 100644
--- a/net/tipc/subscr.c
+++ b/net/tipc/subscr.c
@@ -52,7 +52,6 @@ struct tipc_subscriber {
 	struct list_head subscrp_list;
 };
 
-static void tipc_subscrp_delete(struct tipc_subscription *sub);
 static void tipc_subscrb_put(struct tipc_subscriber *subscriber);
 
 /**
@@ -197,15 +196,19 @@ static void tipc_subscrb_subscrp_delete(struct tipc_subscriber *subscriber,
 {
 	struct list_head *subscription_list = &subscriber->subscrp_list;
 	struct tipc_subscription *sub, *temp;
+	u32 timeout;
 
 	spin_lock_bh(&subscriber->lock);
 	list_for_each_entry_safe(sub, temp, subscription_list,  subscrp_list) {
 		if (s && memcmp(s, &sub->evt.s, sizeof(struct tipc_subscr)))
 			continue;
 
-		tipc_nametbl_unsubscribe(sub);
-		list_del(&sub->subscrp_list);
-		tipc_subscrp_delete(sub);
+		timeout = htohl(sub->evt.s.timeout, sub->swap);
+		if (timeout == TIPC_WAIT_FOREVER || del_timer(&sub->timer)) {
+			tipc_nametbl_unsubscribe(sub);
+			list_del(&sub->subscrp_list);
+			tipc_subscrp_put(sub);
+		}
 
 		if (s)
 			break;
@@ -236,18 +239,12 @@ static void tipc_subscrb_delete(struct tipc_subscriber *subscriber)
 	tipc_subscrb_put(subscriber);
 }
 
-static void tipc_subscrp_delete(struct tipc_subscription *sub)
-{
-	u32 timeout = htohl(sub->evt.s.timeout, sub->swap);
-
-	if (timeout == TIPC_WAIT_FOREVER || del_timer(&sub->timer))
-		tipc_subscrp_put(sub);
-}
-
 static void tipc_subscrp_cancel(struct tipc_subscr *s,
 				struct tipc_subscriber *subscriber)
 {
+	tipc_subscrb_get(subscriber);
 	tipc_subscrb_subscrp_delete(subscriber, s);
+	tipc_subscrb_put(subscriber);
 }
 
 static struct tipc_subscription *tipc_subscrp_create(struct net *net,
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index ff61d85..69b16ee 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2226,7 +2226,6 @@ struct dst_entry *xfrm_lookup(struct net *net, struct dst_entry *dst_orig,
 				goto no_transform;
 			}
 
-			dst_hold(&xdst->u.dst);
 			route = xdst->route;
 		}
 	}
@@ -3308,9 +3307,15 @@ int xfrm_migrate(const struct xfrm_selector *sel, u8 dir, u8 type,
 	struct xfrm_state *x_new[XFRM_MAX_DEPTH];
 	struct xfrm_migrate *mp;
 
+	/* Stage 0 - sanity checks */
 	if ((err = xfrm_migrate_check(m, num_migrate)) < 0)
 		goto out;
 
+	if (dir >= XFRM_POLICY_MAX) {
+		err = -EINVAL;
+		goto out;
+	}
+
 	/* Stage 1 - find policy */
 	if ((pol = xfrm_migrate_policy_find(sel, dir, type, net)) == NULL) {
 		err = -ENOENT;
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 6c0956d..a792eff 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -1620,6 +1620,7 @@ int
 xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n,
 	       unsigned short family, struct net *net)
 {
+	int i;
 	int err = 0;
 	struct xfrm_state_afinfo *afinfo = xfrm_state_get_afinfo(family);
 	if (!afinfo)
@@ -1628,6 +1629,9 @@ xfrm_tmpl_sort(struct xfrm_tmpl **dst, struct xfrm_tmpl **src, int n,
 	spin_lock_bh(&net->xfrm.xfrm_state_lock); /*FIXME*/
 	if (afinfo->tmpl_sort)
 		err = afinfo->tmpl_sort(dst, src, n);
+	else
+		for (i = 0; i < n; i++)
+			dst[i] = src[i];
 	spin_unlock_bh(&net->xfrm.xfrm_state_lock);
 	rcu_read_unlock();
 	return err;
@@ -1638,6 +1642,7 @@ int
 xfrm_state_sort(struct xfrm_state **dst, struct xfrm_state **src, int n,
 		unsigned short family)
 {
+	int i;
 	int err = 0;
 	struct xfrm_state_afinfo *afinfo = xfrm_state_get_afinfo(family);
 	struct net *net = xs_net(*src);
@@ -1648,6 +1653,9 @@ xfrm_state_sort(struct xfrm_state **dst, struct xfrm_state **src, int n,
 	spin_lock_bh(&net->xfrm.xfrm_state_lock);
 	if (afinfo->state_sort)
 		err = afinfo->state_sort(dst, src, n);
+	else
+		for (i = 0; i < n; i++)
+			dst[i] = src[i];
 	spin_unlock_bh(&net->xfrm.xfrm_state_lock);
 	rcu_read_unlock();
 	return err;
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index 2be4c6a..9391ced 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -796,7 +796,7 @@ static int copy_user_offload(struct xfrm_state_offload *xso, struct sk_buff *skb
 		return -EMSGSIZE;
 
 	xuo = nla_data(attr);
-
+	memset(xuo, 0, sizeof(*xuo));
 	xuo->ifindex = xso->dev->ifindex;
 	xuo->flags = xso->flags;
 
@@ -1869,6 +1869,7 @@ static int build_aevent(struct sk_buff *skb, struct xfrm_state *x, const struct
 		return -EMSGSIZE;
 
 	id = nlmsg_data(nlh);
+	memset(&id->sa_id, 0, sizeof(id->sa_id));
 	memcpy(&id->sa_id.daddr, &x->id.daddr, sizeof(x->id.daddr));
 	id->sa_id.spi = x->id.spi;
 	id->sa_id.family = x->props.family;
@@ -2578,6 +2579,8 @@ static int build_expire(struct sk_buff *skb, struct xfrm_state *x, const struct
 	ue = nlmsg_data(nlh);
 	copy_to_user_state(x, &ue->state);
 	ue->hard = (c->data.hard != 0) ? 1 : 0;
+	/* clear the padding bytes */
+	memset(&ue->hard + 1, 0, sizeof(*ue) - offsetofend(typeof(*ue), hard));
 
 	err = xfrm_mark_put(skb, &x->mark);
 	if (err)
@@ -2715,6 +2718,7 @@ static int xfrm_notify_sa(struct xfrm_state *x, const struct km_event *c)
 		struct nlattr *attr;
 
 		id = nlmsg_data(nlh);
+		memset(id, 0, sizeof(*id));
 		memcpy(&id->daddr, &x->id.daddr, sizeof(id->daddr));
 		id->spi = x->id.spi;
 		id->family = x->props.family;
diff --git a/scripts/dtc/checks.c b/scripts/dtc/checks.c
index 4b72b53..62ea8f8 100644
--- a/scripts/dtc/checks.c
+++ b/scripts/dtc/checks.c
@@ -873,7 +873,7 @@ static void check_simple_bus_reg(struct check *c, struct dt_info *dti, struct no
 	while (size--)
 		reg = (reg << 32) | fdt32_to_cpu(*(cells++));
 
-	snprintf(unit_addr, sizeof(unit_addr), "%zx", reg);
+	snprintf(unit_addr, sizeof(unit_addr), "%llx", (unsigned long long)reg);
 	if (!streq(unitname, unit_addr))
 		FAIL(c, dti, "Node %s simple-bus unit address format error, expected \"%s\"",
 		     node->fullpath, unit_addr);
diff --git a/scripts/kernel-doc b/scripts/kernel-doc
index 6e36b78..9d3eafe 100755
--- a/scripts/kernel-doc
+++ b/scripts/kernel-doc
@@ -2226,6 +2226,7 @@
     if ($x =~ /enum\s+(\w+)\s*{(.*)}/) {
 	$declaration_name = $1;
 	my $members = $2;
+	$members =~ s/\s+$//;
 
 	foreach my $arg (split ',', $members) {
 	    $arg =~ s/^\s*(\w+).*/$1/;
@@ -2766,6 +2767,9 @@
 
     while (1) {
 	if ( $x =~ /([^{};]*)([{};])(.*)/ ) {
+            if( length $prototype ) {
+                $prototype .= " "
+            }
 	    $prototype .= $1 . $2;
 	    ($2 eq '{') && $brcount++;
 	    ($2 eq '}') && $brcount--;
diff --git a/scripts/sphinx-pre-install b/scripts/sphinx-pre-install
new file mode 100755
index 0000000..677756a
--- /dev/null
+++ b/scripts/sphinx-pre-install
@@ -0,0 +1,609 @@
+#!/usr/bin/perl
+use strict;
+
+# Copyright (c) 2017 Mauro Carvalho Chehab <mchehab@kernel.org>
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License
+# as published by the Free Software Foundation; either version 2
+# of the License, or (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+my $virtenv_dir = "sphinx_1.4";
+my $requirement_file = "Documentation/sphinx/requirements.txt";
+
+#
+# Static vars
+#
+
+my %missing;
+my $system_release;
+my $need = 0;
+my $optional = 0;
+my $need_symlink = 0;
+my $need_sphinx = 0;
+my $install = "";
+
+#
+# Command line arguments
+#
+
+my $pdf = 1;
+my $virtualenv = 1;
+
+#
+# List of required texlive packages on Fedora and OpenSuse
+#
+
+my %texlive = (
+	'adjustbox.sty'      => 'texlive-adjustbox',
+	'amsfonts.sty'       => 'texlive-amsfonts',
+	'amsmath.sty'        => 'texlive-amsmath',
+	'amssymb.sty'        => 'texlive-amsfonts',
+	'amsthm.sty'         => 'texlive-amscls',
+	'anyfontsize.sty'    => 'texlive-anyfontsize',
+	'atbegshi.sty'       => 'texlive-oberdiek',
+	'bm.sty'             => 'texlive-tools',
+	'capt-of.sty'        => 'texlive-capt-of',
+	'cmap.sty'           => 'texlive-cmap',
+	'ecrm1000.tfm'       => 'texlive-ec',
+	'eqparbox.sty'       => 'texlive-eqparbox',
+	'eu1enc.def'         => 'texlive-euenc',
+	'fancybox.sty'       => 'texlive-fancybox',
+	'fancyvrb.sty'       => 'texlive-fancyvrb',
+	'float.sty'          => 'texlive-float',
+	'fncychap.sty'       => 'texlive-fncychap',
+	'footnote.sty'       => 'texlive-mdwtools',
+	'framed.sty'         => 'texlive-framed',
+	'luatex85.sty'       => 'texlive-luatex85',
+	'multirow.sty'       => 'texlive-multirow',
+	'needspace.sty'      => 'texlive-needspace',
+	'palatino.sty'       => 'texlive-psnfss',
+	'parskip.sty'        => 'texlive-parskip',
+	'polyglossia.sty'    => 'texlive-polyglossia',
+	'tabulary.sty'       => 'texlive-tabulary',
+	'threeparttable.sty' => 'texlive-threeparttable',
+	'titlesec.sty'       => 'texlive-titlesec',
+	'ucs.sty'            => 'texlive-ucs',
+	'upquote.sty'        => 'texlive-upquote',
+	'wrapfig.sty'        => 'texlive-wrapfig',
+);
+
+#
+# Subroutines that check if a feature exists
+#
+
+sub check_missing(%)
+{
+	my %map = %{$_[0]};
+
+	foreach my $prog (sort keys %missing) {
+		my $is_optional = $missing{$prog};
+
+		if ($is_optional) {
+			print "Warning: better to also install \"$prog\".\n";
+		} else {
+			print "ERROR: please install \"$prog\", otherwise, build won't work.\n";
+		}
+		if (defined($map{$prog})) {
+			$install .= " " . $map{$prog};
+		} else {
+			$install .= " " . $prog;
+		}
+	}
+
+	$install =~ s/^\s//;
+}
+
+sub add_package($$)
+{
+	my $package = shift;
+	my $is_optional = shift;
+
+	$missing{$package} = $is_optional;
+	if ($is_optional) {
+		$optional++;
+	} else {
+		$need++;
+	}
+}
+
+sub check_missing_file($$$)
+{
+	my $file = shift;
+	my $package = shift;
+	my $is_optional = shift;
+
+	return if(-e $file);
+
+	add_package($package, $is_optional);
+}
+
+sub findprog($)
+{
+	foreach(split(/:/, $ENV{PATH})) {
+		return "$_/$_[0]" if(-x "$_/$_[0]");
+	}
+}
+
+sub check_program($$)
+{
+	my $prog = shift;
+	my $is_optional = shift;
+
+	return if findprog($prog);
+
+	add_package($prog, $is_optional);
+}
+
+sub check_perl_module($$)
+{
+	my $prog = shift;
+	my $is_optional = shift;
+
+	my $err = system("perl -M$prog -e 1 2>/dev/null >/dev/null");
+	return if ($err == 0);
+
+	add_package($prog, $is_optional);
+}
+
+sub check_python_module($$)
+{
+	my $prog = shift;
+	my $is_optional = shift;
+
+	my $err = system("python3 -c 'import $prog' 2>/dev/null >/dev/null");
+	return if ($err == 0);
+	$err = system("python -c 'import $prog' 2>/dev/null >/dev/null");
+	return if ($err == 0);
+
+	add_package($prog, $is_optional);
+}
+
+sub check_rpm_missing($$)
+{
+	my @pkgs = @{$_[0]};
+	my $is_optional = $_[1];
+
+	foreach my $prog(@pkgs) {
+		my $err = system("rpm -q '$prog' 2>/dev/null >/dev/null");
+		add_package($prog, $is_optional) if ($err);
+	}
+}
+
+sub check_pacman_missing($$)
+{
+	my @pkgs = @{$_[0]};
+	my $is_optional = $_[1];
+
+	foreach my $prog(@pkgs) {
+		my $err = system("pacman -Q '$prog' 2>/dev/null >/dev/null");
+		add_package($prog, $is_optional) if ($err);
+	}
+}
+
+sub check_missing_tex($)
+{
+	my $is_optional = shift;
+	my $kpsewhich = findprog("kpsewhich");
+
+	foreach my $prog(keys %texlive) {
+		my $package = $texlive{$prog};
+		if (!$kpsewhich) {
+			add_package($package, $is_optional);
+			next;
+		}
+		my $file = qx($kpsewhich $prog);
+		add_package($package, $is_optional) if ($file =~ /^\s*$/);
+	}
+}
+
+sub check_sphinx()
+{
+	return if findprog("sphinx-build");
+
+	if (findprog("sphinx-build-3")) {
+		$need_symlink = 1;
+		return;
+	}
+
+	if ($virtualenv) {
+		my $prog = findprog("virtualenv-3");
+		$prog = findprog("virtualenv-3.5") if (!$prog);
+
+		check_program("virtualenv", 0) if (!$prog);
+		$need_sphinx = 1;
+	} else {
+		add_package("python-sphinx", 0);
+	}
+}
+
+#
+# Ancillary subroutines
+#
+
+sub catcheck($)
+{
+	my $res = "";
+	$res = qx(cat $_[0]) if (-r $_[0]);
+	return $res;
+}
+
+sub which($)
+{
+	my $file = shift;
+	my @path = split ":", $ENV{PATH};
+
+	foreach my $dir(@path) {
+		my $name = $dir.'/'.$file;
+		return $name if (-x $name );
+	}
+	return undef;
+}
+
+#
+# Subroutines that check distro-specific hints
+#
+
+sub give_debian_hints()
+{
+	my %map = (
+		"python-sphinx"		=> "python3-sphinx",
+		"sphinx_rtd_theme"	=> "python3-sphinx-rtd-theme",
+		"virtualenv"		=> "virtualenv",
+		"dot"			=> "graphviz",
+		"convert"		=> "imagemagick",
+		"Pod::Usage"		=> "perl-modules",
+		"xelatex"		=> "texlive-xetex",
+		"rsvg-convert"		=> "librsvg2-bin",
+	);
+
+	if ($pdf) {
+		check_missing_file("/usr/share/fonts/truetype/dejavu/DejaVuSans.ttf",
+				   "fonts-dejavu", 1);
+	}
+
+	check_program("dvipng", 1) if ($pdf);
+	check_missing(\%map);
+
+	return if (!$need && !$optional);
+	printf("You should run:\n\n\tsudo apt-get install $install\n");
+}
+
+sub give_redhat_hints()
+{
+	my %map = (
+		"python-sphinx"		=> "python3-sphinx",
+		"sphinx_rtd_theme"	=> "python3-sphinx_rtd_theme",
+		"virtualenv"		=> "python3-virtualenv",
+		"dot"			=> "graphviz",
+		"convert"		=> "ImageMagick",
+		"Pod::Usage"		=> "perl-Pod-Usage",
+		"xelatex"		=> "texlive-xetex-bin",
+		"rsvg-convert"		=> "librsvg2-tools",
+	);
+
+	my @fedora26_opt_pkgs = (
+		"graphviz-gd",		# Fedora 26: needed for PDF support
+	);
+
+	my @fedora_tex_pkgs = (
+		"texlive-collection-fontsrecommended",
+		"texlive-collection-latex",
+		"dejavu-sans-fonts",
+		"dejavu-serif-fonts",
+		"dejavu-sans-mono-fonts",
+	);
+
+	#
+	# Checks valid for RHEL/CentOS version 7.x.
+	#
+	if ($system_release !~ /Fedora/) {
+		$map{"virtualenv"} = "python-virtualenv";
+	}
+
+	my $release;
+
+	$release = $1 if ($system_release =~ /Fedora\s+release\s+(\d+)/);
+
+	check_rpm_missing(\@fedora26_opt_pkgs, 1) if ($pdf && $release >= 26);
+	check_rpm_missing(\@fedora_tex_pkgs, 1) if ($pdf);
+	check_missing_tex(1) if ($pdf);
+	check_missing(\%map);
+
+	return if (!$need && !$optional);
+
+	if ($release >= 18) {
+		# dnf, for Fedora 18+
+		printf("You should run:\n\n\tsudo dnf install -y $install\n");
+	} else {
+		# yum, for RHEL (and clones) or Fedora version < 18
+		printf("You should run:\n\n\tsudo yum install -y $install\n");
+	}
+}
+
+sub give_opensuse_hints()
+{
+	my %map = (
+		"python-sphinx"		=> "python3-sphinx",
+		"sphinx_rtd_theme"	=> "python3-sphinx_rtd_theme",
+		"virtualenv"		=> "python3-virtualenv",
+		"dot"			=> "graphviz",
+		"convert"		=> "ImageMagick",
+		"Pod::Usage"		=> "perl-Pod-Usage",
+		"xelatex"		=> "texlive-xetex-bin",
+		"rsvg-convert"		=> "rsvg-view",
+	);
+
+	my @suse_tex_pkgs = (
+		"texlive-babel-english",
+		"texlive-caption",
+		"texlive-colortbl",
+		"texlive-courier",
+		"texlive-dvips",
+		"texlive-helvetic",
+		"texlive-makeindex",
+		"texlive-metafont",
+		"texlive-metapost",
+		"texlive-palatino",
+		"texlive-preview",
+		"texlive-times",
+		"texlive-zapfchan",
+		"texlive-zapfding",
+	);
+
+	check_rpm_missing(\@suse_tex_pkgs, 1) if ($pdf);
+	check_missing_tex(1) if ($pdf);
+	check_missing(\%map);
+
+	return if (!$need && !$optional);
+	printf("You should run:\n\n\tsudo zypper install --no-recommends $install\n");
+}
+
+sub give_mageia_hints()
+{
+	my %map = (
+		"python-sphinx"		=> "python3-sphinx",
+		"sphinx_rtd_theme"	=> "python3-sphinx_rtd_theme",
+		"virtualenv"		=> "python3-virtualenv",
+		"dot"			=> "graphviz",
+		"convert"		=> "ImageMagick",
+		"Pod::Usage"		=> "perl-Pod-Usage",
+		"xelatex"		=> "texlive",
+		"rsvg-convert"		=> "librsvg2-tools",
+	);
+
+	my @tex_pkgs = (
+		"texlive-fontsextra",
+	);
+
+	check_rpm_missing(\@tex_pkgs, 1) if ($pdf);
+	check_missing(\%map);
+
+	return if (!$need && !$optional);
+	printf("You should run:\n\n\tsudo urpmi $install\n");
+}
+
+sub give_arch_linux_hints()
+{
+	my %map = (
+		"sphinx_rtd_theme"	=> "python-sphinx_rtd_theme",
+		"virtualenv"		=> "python-virtualenv",
+		"dot"			=> "graphviz",
+		"convert"		=> "imagemagick",
+		"xelatex"		=> "texlive-bin",
+		"rsvg-convert"		=> "extra/librsvg",
+	);
+
+	my @archlinux_tex_pkgs = (
+		"texlive-core",
+		"texlive-latexextra",
+		"ttf-dejavu",
+	);
+	check_pacman_missing(\@archlinux_tex_pkgs, 1) if ($pdf);
+	check_missing(\%map);
+
+	return if (!$need && !$optional);
+	printf("You should run:\n\n\tsudo pacman -S $install\n");
+}
+
+sub give_gentoo_hints()
+{
+	my %map = (
+		"sphinx_rtd_theme"	=> "dev-python/sphinx_rtd_theme",
+		"virtualenv"		=> "dev-python/virtualenv",
+		"dot"			=> "media-gfx/graphviz",
+		"convert"		=> "media-gfx/imagemagick",
+		"xelatex"		=> "dev-texlive/texlive-xetex media-fonts/dejavu",
+		"rsvg-convert"		=> "gnome-base/librsvg",
+	);
+
+	check_missing_file("/usr/share/fonts/dejavu/DejaVuSans.ttf",
+			   "media-fonts/dejavu", 1) if ($pdf);
+
+	check_missing(\%map);
+
+	return if (!$need && !$optional);
+
+	printf("You should run:\n\n");
+	printf("\tsudo su -c 'echo \"media-gfx/imagemagick svg png\" > /etc/portage/package.use/imagemagick'\n");
+	printf("\tsudo su -c 'echo \"media-gfx/graphviz cairo pdf\" > /etc/portage/package.use/graphviz'\n");
+	printf("\tsudo emerge --ask $install\n");
+
+}
+
+sub check_distros()
+{
+	# Distro-specific hints
+	if ($system_release =~ /Red Hat Enterprise Linux/) {
+		give_redhat_hints;
+		return;
+	}
+	if ($system_release =~ /CentOS/) {
+		give_redhat_hints;
+		return;
+	}
+	if ($system_release =~ /Scientific Linux/) {
+		give_redhat_hints;
+		return;
+	}
+	if ($system_release =~ /Oracle Linux Server/) {
+		give_redhat_hints;
+		return;
+	}
+	if ($system_release =~ /Fedora/) {
+		give_redhat_hints;
+		return;
+	}
+	if ($system_release =~ /Ubuntu/) {
+		give_debian_hints;
+		return;
+	}
+	if ($system_release =~ /Debian/) {
+		give_debian_hints;
+		return;
+	}
+	if ($system_release =~ /openSUSE/) {
+		give_opensuse_hints;
+		return;
+	}
+	if ($system_release =~ /Mageia/) {
+		give_mageia_hints;
+		return;
+	}
+	if ($system_release =~ /Arch Linux/) {
+		give_arch_linux_hints;
+		return;
+	}
+	if ($system_release =~ /Gentoo/) {
+		give_gentoo_hints;
+		return;
+	}
+
+	#
+	# Fall back to generic hint code for other distros.
+	# That's far from ideal, especially for LaTeX dependencies.
+	#
+	my %map = (
+		"sphinx-build" => "sphinx"
+	);
+	check_missing_tex(1) if ($pdf);
+	check_missing(\%map);
+	print "I don't know distro $system_release.\n";
+	print "So, I can't provide you a hint with the install procedure.\n";
+	print "There are likely missing dependencies.\n";
+}
+
+#
+# Common dependencies
+#
+
+sub check_needs()
+{
+	if ($system_release) {
+		print "Detected OS: $system_release.\n";
+	} else {
+		print "Unknown OS\n";
+	}
+
+	# RHEL 7.x and clones have Sphinx version 1.1.x and incomplete texlive
+	if (($system_release =~ /Red Hat Enterprise Linux/) ||
+	    ($system_release =~ /CentOS/) ||
+	    ($system_release =~ /Scientific Linux/) ||
+	    ($system_release =~ /Oracle Linux Server/)) {
+		$virtualenv = 1;
+		$pdf = 0;
+
+		printf("NOTE: On this distro, Sphinx and TexLive shipped versions are incompatible\n");
+		printf("with doc build. So, use Sphinx via a Python virtual environment.\n\n");
+		printf("This script can't install a TexLive version that would provide PDF.\n");
+	}
+
+	# Check for needed programs/tools
+	check_sphinx();
+	check_perl_module("Pod::Usage", 0);
+	check_program("make", 0);
+	check_program("gcc", 0);
+	check_python_module("sphinx_rtd_theme", 1) if (!$virtualenv);
+	check_program("xelatex", 1) if ($pdf);
+	check_program("dot", 1);
+	check_program("convert", 1);
+	check_program("rsvg-convert", 1) if ($pdf);
+
+	check_distros();
+
+	if ($need_symlink) {
+		printf "\tsudo ln -sf %s /usr/bin/sphinx-build\n\n",
+		       which("sphinx-build-3");
+	}
+	if ($need_sphinx) {
+		my $activate = "$virtenv_dir/bin/activate";
+		if (-e "$ENV{'PWD'}/$activate") {
+			printf "\nNeed to activate virtualenv with:\n";
+			printf "\t. $activate\n";
+		} else {
+			my $virtualenv = findprog("virtualenv-3");
+			$virtualenv = findprog("virtualenv-3.5") if (!$virtualenv);
+			$virtualenv = findprog("virtualenv") if (!$virtualenv);
+			$virtualenv = "virtualenv" if (!$virtualenv);
+
+			printf "\t$virtualenv $virtenv_dir\n";
+			printf "\t. $activate\n";
+			printf "\tpip install -r $requirement_file\n";
+			$need++;
+		}
+	}
+	printf "\n";
+
+	print "All optional dependencies are met.\n" if (!$optional);
+
+	if ($need == 1) {
+		die "Can't build as $need mandatory dependency is missing";
+	} elsif ($need) {
+		die "Can't build as $need mandatory dependencies are missing";
+	}
+
+	print "Needed package dependencies are met.\n";
+}
+
+#
+# Main
+#
+
+while (@ARGV) {
+	my $arg = shift(@ARGV);
+
+	if ($arg eq "--no-virtualenv") {
+		$virtualenv = 0;
+	} elsif ($arg eq "--no-pdf"){
+		$pdf = 0;
+	} else {
+		print "Usage:\n\t$0 [--no-virtualenv] [--no-pdf]\n\n";
+		exit -1;
+	}
+}
+
+#
+# Determine the system type. There's no standard unique way that would
+# work with all distros with a minimal package install. So, several
+# methods are used here.
+#
+# By default, it uses the lsb_release program. If that is not available,
+# it falls back to reading the known locations where the distro name is
+# stored.
+#
+
+$system_release = qx(lsb_release -d) if which("lsb_release");
+$system_release =~ s/Description:\s*// if ($system_release);
+$system_release = catcheck("/etc/system-release") if !$system_release;
+$system_release = catcheck("/etc/redhat-release") if !$system_release;
+$system_release = catcheck("/etc/lsb-release") if !$system_release;
+$system_release = catcheck("/etc/gentoo-release") if !$system_release;
+$system_release = catcheck("/etc/issue") if !$system_release;
+$system_release =~ s/\s+$//;
+
+check_needs;
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 22995cb..cf0433f 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3064,6 +3064,7 @@ int snd_pcm_kernel_ioctl(struct snd_pcm_substream *substream,
 {
 	snd_pcm_uframes_t *frames = arg;
 	snd_pcm_sframes_t result;
+	int err;
 	
 	switch (cmd) {
 	case SNDRV_PCM_IOCTL_FORWARD:
@@ -3083,7 +3084,10 @@ int snd_pcm_kernel_ioctl(struct snd_pcm_substream *substream,
 	case SNDRV_PCM_IOCTL_START:
 		return snd_pcm_start_lock_irq(substream);
 	case SNDRV_PCM_IOCTL_DRAIN:
-		return snd_pcm_drain(substream, NULL);
+		snd_power_lock(substream->pcm->card);
+		err = snd_pcm_drain(substream, NULL);
+		snd_power_unlock(substream->pcm->card);
+		return err;
 	case SNDRV_PCM_IOCTL_DROP:
 		return snd_pcm_drop(substream);
 	case SNDRV_PCM_IOCTL_DELAY:
diff --git a/sound/soc/codecs/rt5670.c b/sound/soc/codecs/rt5670.c
index 0ec7985..054b613 100644
--- a/sound/soc/codecs/rt5670.c
+++ b/sound/soc/codecs/rt5670.c
@@ -567,7 +567,7 @@ int rt5670_set_jack_detect(struct snd_soc_codec *codec,
 
 	rt5670->jack = jack;
 	rt5670->hp_gpio.gpiod_dev = codec->dev;
-	rt5670->hp_gpio.name = "headphone detect";
+	rt5670->hp_gpio.name = "headset";
 	rt5670->hp_gpio.report = SND_JACK_HEADSET |
 		SND_JACK_BTN_0 | SND_JACK_BTN_1 | SND_JACK_BTN_2;
 	rt5670->hp_gpio.debounce_time = 150;
diff --git a/sound/soc/generic/simple-card-utils.c b/sound/soc/generic/simple-card-utils.c
index 7d7ab4a..d72f7d5 100644
--- a/sound/soc/generic/simple-card-utils.c
+++ b/sound/soc/generic/simple-card-utils.c
@@ -132,7 +132,7 @@ int asoc_simple_card_parse_card_name(struct snd_soc_card *card,
 
 	/* Parse the card name from DT */
 	ret = snd_soc_of_parse_card_name(card, "label");
-	if (ret < 0) {
+	if (ret < 0 || !card->name) {
 		char prop[128];
 
 		snprintf(prop, sizeof(prop), "%sname", prefix);
diff --git a/sound/soc/intel/boards/cht_bsw_rt5672.c b/sound/soc/intel/boards/cht_bsw_rt5672.c
index bc2a52d..f597d55 100644
--- a/sound/soc/intel/boards/cht_bsw_rt5672.c
+++ b/sound/soc/intel/boards/cht_bsw_rt5672.c
@@ -184,6 +184,13 @@ static int cht_aif1_hw_params(struct snd_pcm_substream *substream,
 	return 0;
 }
 
+static const struct acpi_gpio_params headset_gpios = { 0, 0, false };
+
+static const struct acpi_gpio_mapping cht_rt5672_gpios[] = {
+	{ "headset-gpios", &headset_gpios, 1 },
+	{},
+};
+
 static int cht_codec_init(struct snd_soc_pcm_runtime *runtime)
 {
 	int ret;
@@ -191,6 +198,9 @@ static int cht_codec_init(struct snd_soc_pcm_runtime *runtime)
 	struct snd_soc_codec *codec = codec_dai->codec;
 	struct cht_mc_private *ctx = snd_soc_card_get_drvdata(runtime->card);
 
+	if (devm_acpi_dev_add_driver_gpios(codec->dev, cht_rt5672_gpios))
+		dev_warn(runtime->dev, "Unable to add GPIO mapping table\n");
+
 	/* TDM 4 slots 24 bit, set Rx & Tx bitmask to 4 active slots */
 	ret = snd_soc_dai_set_tdm_slot(codec_dai, 0xF, 0xF, 4, 24);
 	if (ret < 0) {
diff --git a/sound/soc/omap/omap-hdmi-audio.c b/sound/soc/omap/omap-hdmi-audio.c
index 888133f..3e9cc48 100644
--- a/sound/soc/omap/omap-hdmi-audio.c
+++ b/sound/soc/omap/omap-hdmi-audio.c
@@ -337,14 +337,11 @@ static int omap_hdmi_audio_probe(struct platform_device *pdev)
 	ad->dma_data.addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
 	mutex_init(&ad->current_stream_lock);
 
-	switch (ha->dss_version) {
-	case OMAPDSS_VER_OMAP4430_ES1:
-	case OMAPDSS_VER_OMAP4430_ES2:
-	case OMAPDSS_VER_OMAP4:
+	switch (ha->version) {
+	case 4:
 		dai_drv = &omap4_hdmi_dai;
 		break;
-	case OMAPDSS_VER_OMAP5:
-	case OMAPDSS_VER_DRA7xx:
+	case 5:
 		dai_drv = &omap5_hdmi_dai;
 		break;
 	default:
diff --git a/tools/testing/selftests/rcutorture/bin/config_override.sh b/tools/testing/selftests/rcutorture/bin/config_override.sh
new file mode 100755
index 0000000..49fa517
--- /dev/null
+++ b/tools/testing/selftests/rcutorture/bin/config_override.sh
@@ -0,0 +1,61 @@
+#!/bin/bash
+#
+# config_override.sh base override
+#
+# Combines base and override, removing any Kconfig options from base
+# that conflict with any in override, concatenating what remains and
+# sending the result to standard output.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, you can access it online at
+# http://www.gnu.org/licenses/gpl-2.0.html.
+#
+# Copyright (C) IBM Corporation, 2017
+#
+# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
+
+base=$1
+if test -r $base
+then
+	:
+else
+	echo Base file $base unreadable!!!
+	exit 1
+fi
+
+override=$2
+if test -r $override
+then
+	:
+else
+	echo Override file $override unreadable!!!
+	exit 1
+fi
+
+T=/tmp/config_override.sh.$$
+trap 'rm -rf $T' 0
+mkdir $T
+
+sed < $override -e 's/^/grep -v "/' -e 's/=.*$/="/' |
+	awk '
+	{
+		if (last)
+			print last " |";
+		last = $0;
+	}
+	END {
+		if (last)
+			print last;
+	}' > $T/script
+sh $T/script < $base
+cat $override
diff --git a/tools/testing/selftests/rcutorture/bin/functions.sh b/tools/testing/selftests/rcutorture/bin/functions.sh
index 1426a9b..07a1377 100644
--- a/tools/testing/selftests/rcutorture/bin/functions.sh
+++ b/tools/testing/selftests/rcutorture/bin/functions.sh
@@ -66,9 +66,34 @@
 
 # configfrag_boot_cpus bootparam-string config-fragment-file config-cpus
 #
-# Decreases number of CPUs based on any maxcpus= boot parameters specified.
+# Decreases number of CPUs based on any nr_cpus= boot parameters specified.
 configfrag_boot_cpus () {
 	local bootargs="`configfrag_boot_params "$1" "$2"`"
+	local nr_cpus
+	if echo "${bootargs}" | grep -q 'nr_cpus=[0-9]'
+	then
+		nr_cpus="`echo "${bootargs}" | sed -e 's/^.*nr_cpus=\([0-9]*\).*$/\1/'`"
+		if test "$3" -gt "$nr_cpus"
+		then
+			echo $nr_cpus
+		else
+			echo $3
+		fi
+	else
+		echo $3
+	fi
+}
+
+# configfrag_boot_maxcpus bootparam-string config-fragment-file config-cpus
+#
+# Decreases number of CPUs based on any maxcpus= boot parameters specified.
+# This allows tests where additional CPUs come online later during the
+# test run.  However, the torture parameters will be set based on the
+# number of CPUs initially present, so the scripting should schedule
+# test runs based on the maxcpus= boot parameter controlling the initial
+# number of CPUs instead of on the ultimate number of CPUs.
+configfrag_boot_maxcpus () {
+	local bootargs="`configfrag_boot_params "$1" "$2"`"
 	local maxcpus
 	if echo "${bootargs}" | grep -q 'maxcpus=[0-9]'
 	then
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-build.sh b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
index c29f2ec..46752c1 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-build.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
@@ -2,7 +2,7 @@
 #
 # Build a kvm-ready Linux kernel from the tree in the current directory.
 #
-# Usage: kvm-build.sh config-template build-dir more-configs
+# Usage: kvm-build.sh config-template build-dir
 #
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -34,24 +34,17 @@
 	echo "kvm-build.sh :$builddir: Not a writable directory, cannot build into it"
 	exit 1
 fi
-moreconfigs=${3}
-if test -z "$moreconfigs" -o ! -r "$moreconfigs"
-then
-	echo "kvm-build.sh :$moreconfigs: Not a readable file"
-	exit 1
-fi
 
 T=/tmp/test-linux.sh.$$
 trap 'rm -rf $T' 0
 mkdir $T
 
-grep -v 'CONFIG_[A-Z]*_TORTURE_TEST=' < ${config_template} > $T/config
+cp ${config_template} $T/config
 cat << ___EOF___ >> $T/config
 CONFIG_INITRAMFS_SOURCE="$TORTURE_INITRD"
 CONFIG_VIRTIO_PCI=y
 CONFIG_VIRTIO_CONSOLE=y
 ___EOF___
-cat $moreconfigs >> $T/config
 
 configinit.sh $T/config O=$builddir
 retval=$?
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
index 93eede4..0af36a7 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
@@ -40,7 +40,7 @@
 
 T=/tmp/kvm-test-1-run.sh.$$
 trap 'rm -rf $T' 0
-touch $T
+mkdir $T
 
 . $KVM/bin/functions.sh
 . $CONFIGFRAG/ver_functions.sh
@@ -60,37 +60,33 @@
 	echo "kvm-test-1-run.sh :$resdir: Not a writable directory, cannot store results into it"
 	exit 1
 fi
-cp $config_template $resdir/ConfigFragment
 echo ' ---' `date`: Starting build
 echo ' ---' Kconfig fragment at: $config_template >> $resdir/log
+touch $resdir/ConfigFragment.input $resdir/ConfigFragment
 if test -r "$config_dir/CFcommon"
 then
-	cat < $config_dir/CFcommon >> $T
+	echo " --- $config_dir/CFcommon" >> $resdir/ConfigFragment.input
+	cat < $config_dir/CFcommon >> $resdir/ConfigFragment.input
+	config_override.sh $config_dir/CFcommon $config_template > $T/Kc1
+	grep '#CHECK#' $config_dir/CFcommon >> $resdir/ConfigFragment
+else
+	cp $config_template $T/Kc1
 fi
-# Optimizations below this point
-# CONFIG_USB=n
-# CONFIG_SECURITY=n
-# CONFIG_NFS_FS=n
-# CONFIG_SOUND=n
-# CONFIG_INPUT_JOYSTICK=n
-# CONFIG_INPUT_TABLET=n
-# CONFIG_INPUT_TOUCHSCREEN=n
-# CONFIG_INPUT_MISC=n
-# CONFIG_INPUT_MOUSE=n
-# # CONFIG_NET=n # disables console access, so accept the slower build.
-# CONFIG_SCSI=n
-# CONFIG_ATA=n
-# CONFIG_FAT_FS=n
-# CONFIG_MSDOS_FS=n
-# CONFIG_VFAT_FS=n
-# CONFIG_ISO9660_FS=n
-# CONFIG_QUOTA=n
-# CONFIG_HID=n
-# CONFIG_CRYPTO=n
-# CONFIG_PCCARD=n
-# CONFIG_PCMCIA=n
-# CONFIG_CARDBUS=n
-# CONFIG_YENTA=n
+echo " --- $config_template" >> $resdir/ConfigFragment.input
+cat $config_template >> $resdir/ConfigFragment.input
+grep '#CHECK#' $config_template >> $resdir/ConfigFragment
+if test -n "$TORTURE_KCONFIG_ARG"
+then
+	echo $TORTURE_KCONFIG_ARG | tr -s " " "\012" > $T/cmdline
+	echo " --- --kconfig argument" >> $resdir/ConfigFragment.input
+	cat $T/cmdline >> $resdir/ConfigFragment.input
+	config_override.sh $T/Kc1 $T/cmdline > $T/Kc2
+	# Note that "#CHECK#" is not permitted on commandline.
+else
+	cp $T/Kc1 $T/Kc2
+fi
+cat $T/Kc2 >> $resdir/ConfigFragment
+
 base_resdir=`echo $resdir | sed -e 's/\.[0-9]\+$//'`
 if test "$base_resdir" != "$resdir" -a -f $base_resdir/bzImage -a -f $base_resdir/vmlinux
 then
@@ -100,7 +96,9 @@
 	KERNEL=$base_resdir/${BOOT_IMAGE##*/} # use the last component of ${BOOT_IMAGE}
 	ln -s $base_resdir/Make*.out $resdir  # for kvm-recheck.sh
 	ln -s $base_resdir/.config $resdir  # for kvm-recheck.sh
-elif kvm-build.sh $config_template $builddir $T
+	# Arch-independent indicator
+	touch $resdir/builtkernel
+elif kvm-build.sh $T/Kc2 $builddir
 then
 	# Had to build a kernel for this test.
 	QEMU="`identify_qemu $builddir/vmlinux`"
@@ -112,6 +110,8 @@
 	then
 		cp $builddir/$BOOT_IMAGE $resdir
 		KERNEL=$resdir/${BOOT_IMAGE##*/}
+		# Arch-independent indicator
+		touch $resdir/builtkernel
 	else
 		echo No identifiable boot image, not running KVM, see $resdir.
 		echo Do the torture scripts know about your architecture?
@@ -149,7 +149,7 @@
 
 # Generate -smp qemu argument.
 qemu_args="-enable-kvm -nographic $qemu_args"
-cpu_count=`configNR_CPUS.sh $config_template`
+cpu_count=`configNR_CPUS.sh $resdir/ConfigFragment`
 cpu_count=`configfrag_boot_cpus "$boot_args" "$config_template" "$cpu_count"`
 vcpus=`identify_qemu_vcpus`
 if test $cpu_count -gt $vcpus
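Reviewer note: kvm-test-1-run.sh now layers Kconfig fragments in two steps (CFcommon overridden by the scenario file, then that result overridden by any --kconfig argument), calling config_override.sh for each step. The source of config_override.sh is not part of this diff, so the following is only a rough sketch of what such an override merge could look like. The function name override_kconfig is hypothetical, and the base-then-override argument order is inferred from the call sites above.

```shell
#!/bin/sh
# Hypothetical stand-in for config_override.sh (its source is not shown
# in this diff): print the base fragment minus any option that the
# override fragment also sets, then print the override fragment.
# usage: override_kconfig base-file override-file
override_kconfig () {
	awk -F= '
	FILENAME == ARGV[1] {		# first file read: the override fragment
		if ($1 ~ /^CONFIG_/)
			seen[$1] = 1
		next
	}
	!($1 in seen)			# second file: keep non-conflicting base lines
	' "$2" "$1"
	cat "$2"
}

# Small demonstration with throwaway files.
demo=$(mktemp -d)
printf 'CONFIG_SMP=y\nCONFIG_NR_CPUS=8\n' > "$demo/base"
printf 'CONFIG_NR_CPUS=43\n' > "$demo/override"
override_kconfig "$demo/base" "$demo/override"
rm -rf "$demo"
```

In the demonstration, CONFIG_SMP=y survives from the base file and CONFIG_NR_CPUS takes the override value 43. Lines such as "#CHECK#CONFIG_..." do not start with CONFIG_ and are passed through untouched by this sketch.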
diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
index 50091de..b55895f 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
@@ -41,6 +41,7 @@
 TORTURE_DEFCONFIG=defconfig
 TORTURE_BOOT_IMAGE=""
 TORTURE_INITRD="$KVM/initrd"; export TORTURE_INITRD
+TORTURE_KCONFIG_ARG=""
 TORTURE_KMAKE_ARG=""
 TORTURE_SHUTDOWN_GRACE=180
 TORTURE_SUITE=rcu
@@ -65,6 +66,7 @@
 	echo "       --duration minutes"
 	echo "       --interactive"
 	echo "       --jitter N [ maxsleep (us) [ maxspin (us) ] ]"
+	echo "       --kconfig Kconfig-options"
 	echo "       --kmake-arg kernel-make-arguments"
 	echo "       --mac nn:nn:nn:nn:nn:nn"
 	echo "       --no-initrd"
@@ -129,6 +131,11 @@
 		jitter="$2"
 		shift
 		;;
+	--kconfig)
+		checkarg --kconfig "(Kconfig options)" $# "$2" '^CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\( CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\)*$' '^error$'
+		TORTURE_KCONFIG_ARG="$2"
+		shift
+		;;
 	--kmake-arg)
 		checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
 		TORTURE_KMAKE_ARG="$2"
@@ -205,6 +212,7 @@
 	then
 		cpu_count=`configNR_CPUS.sh $CONFIGFRAG/$CF1`
 		cpu_count=`configfrag_boot_cpus "$TORTURE_BOOTARGS" "$CONFIGFRAG/$CF1" "$cpu_count"`
+		cpu_count=`configfrag_boot_maxcpus "$TORTURE_BOOTARGS" "$CONFIGFRAG/$CF1" "$cpu_count"`
 		for ((cur_rep=0;cur_rep<$config_reps;cur_rep++))
 		do
 			echo $CF1 $cpu_count >> $T/cfgcpu
@@ -275,6 +283,7 @@
 TORTURE_BUILDONLY="$TORTURE_BUILDONLY"; export TORTURE_BUILDONLY
 TORTURE_DEFCONFIG="$TORTURE_DEFCONFIG"; export TORTURE_DEFCONFIG
 TORTURE_INITRD="$TORTURE_INITRD"; export TORTURE_INITRD
+TORTURE_KCONFIG_ARG="$TORTURE_KCONFIG_ARG"; export TORTURE_KCONFIG_ARG
 TORTURE_KMAKE_ARG="$TORTURE_KMAKE_ARG"; export TORTURE_KMAKE_ARG
 TORTURE_QEMU_CMD="$TORTURE_QEMU_CMD"; export TORTURE_QEMU_CMD
 TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE"; export TORTURE_QEMU_INTERACTIVE
@@ -324,6 +333,7 @@
 {
 	print "echo ----Start batch " batchnum ": `date`";
 	print "echo ----Start batch " batchnum ": `date` >> " rd "/log";
+	print "needqemurun="
 	jn=1
 	for (j = first; j < pastlast; j++) {
 		builddir=KVM "/b" jn
@@ -359,10 +369,11 @@
 	for (j = 1; j < jn; j++) {
 		builddir=KVM "/b" j
 		print "rm -f " builddir ".ready"
-		print "if test -z \"$TORTURE_BUILDONLY\""
+		print "if test -f \"" rd cfr[j] "/builtkernel\""
 		print "then"
-		print "\techo ----", cfr[j], cpusr[j] ovf ": Starting kernel. `date`";
-		print "\techo ----", cfr[j], cpusr[j] ovf ": Starting kernel. `date` >> " rd "/log";
+		print "\techo ----", cfr[j], cpusr[j] ovf ": Kernel present. `date`";
+		print "\techo ----", cfr[j], cpusr[j] ovf ": Kernel present. `date` >> " rd "/log";
+		print "\tneedqemurun=1"
 		print "fi"
 	}
 	njitter = 0;
@@ -377,13 +388,22 @@
 		njitter = 0;
 		print "echo Build-only run, so suppressing jitter >> " rd "/log"
 	}
-	for (j = 0; j < njitter; j++)
-		print "jitter.sh " j " " dur " " ja[2] " " ja[3] "&"
-	print "wait"
-	print "if test -z \"$TORTURE_BUILDONLY\""
+	if (TORTURE_BUILDONLY) {
+		print "needqemurun="
+	}
+	print "if test -n \"$needqemurun\""
 	print "then"
+	print "\techo ---- Starting kernels. `date`";
+	print "\techo ---- Starting kernels. `date` >> " rd "/log";
+	for (j = 0; j < njitter; j++)
+		print "\tjitter.sh " j " " dur " " ja[2] " " ja[3] "&"
+	print "\twait"
 	print "\techo ---- All kernel runs complete. `date`";
 	print "\techo ---- All kernel runs complete. `date` >> " rd "/log";
+	print "else"
+	print "\twait"
+	print "\techo ---- No kernel runs. `date`";
+	print "\techo ---- No kernel runs. `date` >> " rd "/log";
 	print "fi"
 	for (j = 1; j < jn; j++) {
 		builddir=KVM "/b" j
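Reviewer note: the --kconfig case added to kvm.sh above accepts only a space-separated list of CONFIG_NAME=value settings, where the value is y, n, m, or a decimal number. The sketch below exercises that same pattern outside of kvm.sh. The helper names valid_kconfig_arg and split_kconfig_arg are hypothetical, the real checkarg function is not shown in this diff, and the pattern relies on GNU grep's backslash extensions to basic regular expressions.

```shell
#!/bin/sh
# Pattern copied from the --kconfig case in kvm.sh (GNU grep BRE syntax).
kconfig_re='^CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\( CONFIG_[A-Z0-9_]\+=\([ynm]\|[0-9]\+\)\)*$'

# Hypothetical helper, not kvm.sh's checkarg: succeed only if the argument
# is a space-separated list of CONFIG_NAME=value settings.
valid_kconfig_arg () {
	echo "$1" | grep -q -e "$kconfig_re"
}

# Split an accepted argument into one option per line, using the same tr
# invocation that kvm-test-1-run.sh applies to TORTURE_KCONFIG_ARG.
split_kconfig_arg () {
	echo $1 | tr -s " " "\012"
}

for arg in "CONFIG_NR_CPUS=8 CONFIG_PROVE_LOCKING=y" "CONFIG_FOO=maybe"
do
	if valid_kconfig_arg "$arg"
	then
		split_kconfig_arg "$arg"
	else
		echo "rejected: $arg"
	fi
done
```

The first argument is accepted and printed one option per line; the second is rejected because "maybe" is neither y, n, m, nor a number.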
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/BUSTED.boot b/tools/testing/selftests/rcutorture/configs/rcu/BUSTED.boot
index 6804f9d..be7728d 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/BUSTED.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/BUSTED.boot
@@ -1 +1 @@
-rcutorture.torture_type=rcu_busted
+rcutorture.torture_type=busted
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-C.boot b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-C.boot
deleted file mode 100644
index 84a7d51..0000000
--- a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-C.boot
+++ /dev/null
@@ -1 +0,0 @@
-rcutorture.torture_type=srcud
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-u b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-u
index 6bc24e9..c15ada8 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/SRCU-u
+++ b/tools/testing/selftests/rcutorture/configs/rcu/SRCU-u
@@ -4,6 +4,7 @@
 CONFIG_PREEMPT=n
 #CHECK#CONFIG_TINY_SRCU=y
 CONFIG_RCU_TRACE=n
-CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_DEBUG_LOCK_ALLOC=y
+CONFIG_PROVE_LOCKING=y
 CONFIG_DEBUG_OBJECTS_RCU_HEAD=n
 CONFIG_PREEMPT_COUNT=n
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot b/tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot
index 1d14e13..9f3a4d2 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot
+++ b/tools/testing/selftests/rcutorture/configs/rcu/TREE01.boot
@@ -1,4 +1,4 @@
-rcutorture.torture_type=rcu_bh maxcpus=8
+rcutorture.torture_type=rcu_bh maxcpus=8 nr_cpus=43
 rcutree.gp_preinit_delay=3
 rcutree.gp_init_delay=3
 rcutree.gp_cleanup_delay=3
diff --git a/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt b/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt
index 9ad3f89..af6fca0 100644
--- a/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt
+++ b/tools/testing/selftests/rcutorture/doc/TREE_RCU-kconfig.txt
@@ -69,11 +69,11 @@
 CONFIG_PREEMPT_RCU
 CONFIG_TREE_RCU
 CONFIG_TINY_RCU
+CONFIG_TASKS_RCU
 
 	These are controlled by CONFIG_PREEMPT and/or CONFIG_SMP.
 
 CONFIG_SRCU
-CONFIG_TASKS_RCU
 
 	Selected by CONFIG_RCU_TORTURE_TEST, so cannot disable.
 
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 15252d7..4d81f6d 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -322,47 +322,6 @@ static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn)
 	return container_of(mn, struct kvm, mmu_notifier);
 }
 
-static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
-					     struct mm_struct *mm,
-					     unsigned long address)
-{
-	struct kvm *kvm = mmu_notifier_to_kvm(mn);
-	int need_tlb_flush, idx;
-
-	/*
-	 * When ->invalidate_page runs, the linux pte has been zapped
-	 * already but the page is still allocated until
-	 * ->invalidate_page returns. So if we increase the sequence
-	 * here the kvm page fault will notice if the spte can't be
-	 * established because the page is going to be freed. If
-	 * instead the kvm page fault establishes the spte before
-	 * ->invalidate_page runs, kvm_unmap_hva will release it
-	 * before returning.
-	 *
-	 * The sequence increase only need to be seen at spin_unlock
-	 * time, and not at spin_lock time.
-	 *
-	 * Increasing the sequence after the spin_unlock would be
-	 * unsafe because the kvm page fault could then establish the
-	 * pte after kvm_unmap_hva returned, without noticing the page
-	 * is going to be freed.
-	 */
-	idx = srcu_read_lock(&kvm->srcu);
-	spin_lock(&kvm->mmu_lock);
-
-	kvm->mmu_notifier_seq++;
-	need_tlb_flush = kvm_unmap_hva(kvm, address) | kvm->tlbs_dirty;
-	/* we've to flush the tlb before the pages can be freed */
-	if (need_tlb_flush)
-		kvm_flush_remote_tlbs(kvm);
-
-	spin_unlock(&kvm->mmu_lock);
-
-	kvm_arch_mmu_notifier_invalidate_page(kvm, address);
-
-	srcu_read_unlock(&kvm->srcu, idx);
-}
-
 static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
 					struct mm_struct *mm,
 					unsigned long address,
@@ -510,7 +469,6 @@ static void kvm_mmu_notifier_release(struct mmu_notifier *mn,
 }
 
 static const struct mmu_notifier_ops kvm_mmu_notifier_ops = {
-	.invalidate_page	= kvm_mmu_notifier_invalidate_page,
 	.invalidate_range_start	= kvm_mmu_notifier_invalidate_range_start,
 	.invalidate_range_end	= kvm_mmu_notifier_invalidate_range_end,
 	.clear_flush_young	= kvm_mmu_notifier_clear_flush_young,