igt/gem_ctx_switch: Measure qlen for timing loops

Submitted by Chris Wilson on March 16, 2018, 4:03 p.m.

Details

Message ID 20180316160325.15554-1-chris@chris-wilson.co.uk
State New
Series "igt/gem_ctx_switch: Measure qlen for timing loops"

Commit Message

Chris Wilson March 16, 2018, 4:03 p.m.
Some platforms may execute the heavy workload very slowly, such that
a batch of 1024 takes tens of seconds, immediately overrunning the 5s
timeout on a pass. Added up over a few dozen passes, this turns a
120 second test into 10 minutes. Counter this by doing a warmup loop to
estimate the appropriate queue length (qlen) for timing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
---
 tests/gem_ctx_switch.c | 26 +++++++++++++++++++++++---
 1 file changed, 23 insertions(+), 3 deletions(-)

Patch

diff --git a/tests/gem_ctx_switch.c b/tests/gem_ctx_switch.c
index 4efece14..8d164398 100644
--- a/tests/gem_ctx_switch.c
+++ b/tests/gem_ctx_switch.c
@@ -141,7 +141,7 @@  static void all(int fd, uint32_t handle, unsigned flags, int timeout)
 	const char *name[16];
 	uint32_t contexts[65];
 	unsigned int nengine;
-	int n;
+	int n, qlen;
 
 	nengine = 0;
 	for_each_physical_engine(fd, e) {
@@ -165,6 +165,25 @@  static void all(int fd, uint32_t handle, unsigned flags, int timeout)
 	execbuf.flags |= LOCAL_I915_EXEC_NO_RELOC;
 	igt_require(__gem_execbuf(fd, &execbuf) == 0);
 	gem_sync(fd, handle);
+
+	qlen = 64;
+	for (n = 0; n < nengine; n++) {
+		uint64_t saved = execbuf.flags;
+		struct timespec tv = {};
+
+		execbuf.flags |= engine[n];
+
+		igt_nsec_elapsed(&tv);
+		for (int loop = 0; loop < qlen; loop++)
+			gem_execbuf(fd, &execbuf);
+		gem_sync(fd, handle);
+
+		execbuf.flags = saved;
+
+		qlen = qlen * timeout * 1e9 / igt_nsec_elapsed(&tv) / 8 + 1;
+	}
+	igt_info("Using timing depth of %d batches\n", qlen);
+
 	execbuf.buffers_ptr = to_user_pointer(obj);
 	execbuf.buffer_count = 2;
 
@@ -184,11 +203,12 @@  static void all(int fd, uint32_t handle, unsigned flags, int timeout)
 
 				clock_gettime(CLOCK_MONOTONIC, &start);
 				do {
-					for (int loop = 0; loop < 1024; loop++) {
+					for (int loop = 0; loop < qlen; loop++) {
 						execbuf.rsvd1 = contexts[loop % nctx];
 						gem_execbuf(fd, &execbuf);
 					}
-					count += 1024;
+					count += qlen;
+					gem_sync(fd, obj[0].handle);
 					clock_gettime(CLOCK_MONOTONIC, &now);
 				} while (elapsed(&start, &now) < timeout);
 				gem_sync(fd, obj[0].handle);
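
The warmup arithmetic in the second hunk can be checked in isolation. The following is a minimal sketch, not part of the patch: `estimate_qlen` and `QLEN_DIVISOR` are hypothetical names introduced here for illustration, and the caller is assumed to pass a non-zero elapsed time. It reproduces the expression `qlen * timeout * 1e9 / elapsed / 8 + 1`, which rescales the warmup depth so that one pass of the timing loop should take roughly an eighth of the timeout.

```c
#include <stdint.h>

/* Hypothetical name for the literal 8 in the patch: the new queue
 * length is sized so one pass takes about timeout / QLEN_DIVISOR. */
#define QLEN_DIVISOR 8

/*
 * Hypothetical helper mirroring the patch's expression.
 *
 * qlen:       number of batches submitted during the warmup
 * timeout_s:  per-pass timeout in seconds
 * elapsed_ns: measured time for the warmup, assumed to be non-zero
 *
 * The multiplication by 1e9 promotes the expression to double, as in
 * the patch; the result is truncated back to int, and the trailing
 * + 1 keeps the queue length at one batch or more.
 */
static int estimate_qlen(int qlen, int timeout_s, uint64_t elapsed_ns)
{
	return qlen * timeout_s * 1e9 / elapsed_ns / QLEN_DIVISOR + 1;
}
```

With the patch's starting depth of 64 and the 5s timeout from the commit message, a warmup that takes 1 second yields a depth of 41, while a slow platform that needs 40 seconds for the same 64 batches is cut down to 2.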

Comments

Joonas Lahtinen March 23, 2018, 10:27 a.m.
Quoting Chris Wilson (2018-03-16 18:03:25)
> Some platforms may execute the heavy workload very slowly, such that
> a batch of 1024 takes tens of seconds, immediately overrunning the 5s
> timeout on a pass. Added up over a few dozen passes, this turns a
> 120 second test into 10 minutes. Counter this by doing a warmup loop to
> estimate the appropriate queue length (qlen) for timing.
> 
> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>

#define MAGNIFICENT_FACTOR_8 (3 + 5) or maybe a comment.

Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>

Regards, Joonas