[v8,1/6] drm/i915: Add per context timelines to fence object

Submitted by John Harrison on May 12, 2016, 9:06 p.m.

Details

Message ID 1463087196-11688-2-git-send-email-John.C.Harrison@Intel.com
State New
Series "Convert requests to use struct fence" (rev 5) in Intel GFX


Commit Message

John Harrison May 12, 2016, 9:06 p.m.
From: John Harrison <John.C.Harrison@Intel.com>

The fence object used inside the request structure requires a sequence
number. Although this is not used by the i915 driver itself, it could
potentially be used by non-i915 code if the fence is passed outside of
the driver. This is the intention, as it allows external kernel drivers
and user applications to wait on batch buffer completion
asynchronously via the dma-buf fence API.

To ensure that such external users are not confused by strange things
happening with the seqno, this patch adds a per-context timeline
that can provide a guaranteed in-order seqno value for the fence. This
is safe because the scheduler will not re-order batch buffers within a
context - they are considered to be mutually dependent.

v2: New patch in series.

v3: Renamed/retyped timeline structure fields after review comments by
Tvrtko Ursulin.

Added context information to the timeline's name string for better
identification in debugfs output.

v5: Line wrapping and other white space fixes to keep style checker
happy.

v7: Updated to newer nightly (lots of ring -> engine renaming).

v8: Moved earlier in the patch series so it no longer needs to remove
the quick-hack timeline that was previously being added.

For: VIZ-5190
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.h         | 14 ++++++++++++
 drivers/gpu/drm/i915/i915_gem.c         | 40 +++++++++++++++++++++++++++++++++
 drivers/gpu/drm/i915/i915_gem_context.c | 14 ++++++++++++
 drivers/gpu/drm/i915/intel_lrc.c        |  8 +++++++
 4 files changed, 76 insertions(+)


diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index d5496ab..520480b 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -818,6 +818,19 @@  struct i915_ctx_hang_stats {
 	bool banned;
 };
 
+struct i915_fence_timeline {
+	char        name[32];
+	unsigned    fence_context;
+	unsigned    next;
+
+	struct intel_context *ctx;
+	struct intel_engine_cs *engine;
+};
+
+int i915_create_fence_timeline(struct drm_device *dev,
+			       struct intel_context *ctx,
+			       struct intel_engine_cs *engine);
+
 /* This must match up with the value previously used for execbuf2.rsvd1. */
 #define DEFAULT_CONTEXT_HANDLE 0
 
@@ -869,6 +882,7 @@  struct intel_context {
 		u64 lrc_desc;
 		uint32_t *lrc_reg_state;
 		bool initialised;
+		struct i915_fence_timeline fence_timeline;
 	} engine[I915_NUM_ENGINES];
 
 	struct list_head link;
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index c2548bd..99bd470 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -2723,6 +2723,46 @@  void i915_gem_request_free(struct kref *req_ref)
 	kmem_cache_free(req->i915->requests, req);
 }
 
+int i915_create_fence_timeline(struct drm_device *dev,
+			       struct intel_context *ctx,
+			       struct intel_engine_cs *engine)
+{
+	struct i915_fence_timeline *timeline;
+
+	timeline = &ctx->engine[engine->id].fence_timeline;
+
+	if (timeline->engine)
+		return 0;
+
+	timeline->fence_context = fence_context_alloc(1);
+
+	/*
+	 * Start the timeline from seqno 0 as this is a special value
+	 * that is reserved for invalid sync points.
+	 */
+	timeline->next       = 1;
+	timeline->ctx        = ctx;
+	timeline->engine     = engine;
+
+	snprintf(timeline->name, sizeof(timeline->name), "%d>%s:%d",
+		 timeline->fence_context, engine->name, ctx->user_handle);
+
+	return 0;
+}
+
+unsigned i915_fence_timeline_get_next_seqno(struct i915_fence_timeline *timeline)
+{
+	unsigned seqno;
+
+	seqno = timeline->next;
+
+	/* Reserve zero for invalid */
+	if (++timeline->next == 0)
+		timeline->next = 1;
+
+	return seqno;
+}
+
 static inline int
 __i915_gem_request_alloc(struct intel_engine_cs *engine,
 			 struct intel_context *ctx,
diff --git a/drivers/gpu/drm/i915/i915_gem_context.c b/drivers/gpu/drm/i915/i915_gem_context.c
index de8d878..e90762e 100644
--- a/drivers/gpu/drm/i915/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/i915_gem_context.c
@@ -311,6 +311,20 @@  i915_gem_create_context(struct drm_device *dev,
 	if (IS_ERR(ctx))
 		return ctx;
 
+	if (!i915.enable_execlists) {
+		struct intel_engine_cs *engine;
+
+		/* Create a per context timeline for fences */
+		for_each_engine(engine, to_i915(dev)) {
+			ret = i915_create_fence_timeline(dev, ctx, engine);
+			if (ret) {
+				DRM_ERROR("Fence timeline creation failed for legacy %s: %p\n",
+					  engine->name, ctx);
+				goto err_destroy;
+			}
+		}
+	}
+
 	if (is_global_default_ctx && ctx->legacy_hw_ctx.rcs_state) {
 		/* We may need to do things with the shrinker which
 		 * require us to immediately switch back to the default
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d876352..5e15fbe 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -2526,6 +2526,14 @@  static int execlists_context_deferred_alloc(struct intel_context *ctx,
 		goto error_ringbuf;
 	}
 
+	/* Create a per context timeline for fences */
+	ret = i915_create_fence_timeline(dev, ctx, engine);
+	if (ret) {
+		DRM_ERROR("Fence timeline creation failed for engine %s, ctx %p\n",
+			  engine->name, ctx);
+		goto error_ringbuf;
+	}
+
 	ctx->engine[engine->id].ringbuf = ringbuf;
 	ctx->engine[engine->id].state = ctx_obj;
 	ctx->engine[engine->id].initialised = engine->init_context == NULL;

Comments

On Thu, May 12, 2016 at 10:06:31PM +0100, John.C.Harrison@Intel.com wrote:
> [snip]
> @@ -869,6 +882,7 @@ struct intel_context {
>  		u64 lrc_desc;
>  		uint32_t *lrc_reg_state;
>  		bool initialised;
> +		struct i915_fence_timeline fence_timeline;

This is still fundamentally wrong. This i915_fence_timeline is both the
fence context and timeline. Our timelines are singular per vm, with a
fence context under each timeline per engine.

Please complete the legacy/execlists unification first so that we can
have timelines work correctly for the whole driver.
-Chris
On 13/05/2016 08:39, Chris Wilson wrote:
> On Thu, May 12, 2016 at 10:06:31PM +0100, John.C.Harrison@Intel.com wrote:
>> [snip]
>> @@ -869,6 +882,7 @@ struct intel_context {
>>   		u64 lrc_desc;
>>   		uint32_t *lrc_reg_state;
>>   		bool initialised;
>> +		struct i915_fence_timeline fence_timeline;
> This is still fundamentally wrong. This i915_fence_timeline is both the
> fence context and timeline. Our timelines are singular per vm, with a
> fence context under each timeline per engine.
As I said last time you mentioned 'per vm', please explain. Where does a 
virtual machine come in? Or is that not what you mean? What is the 
purpose of having multiple fence contexts within a single timeline? It 
will stop external users attempting to combine fences but it won't stop 
them from actually being out of order. The timeline needs to be per 
hardware context not simply per engine because the whole point is that 
requests should not be re-ordered once they have been allocated a point 
on a timeline. However, the purpose of the scheduler (which is what this 
series is preparation for) is to re-order requests between contexts. 
Thus the timeline must be per context to prevent requests ending up with 
out of order completions.


> Please complete the legacy/execlists unification first so that we can
> have timelines work correctly for the whole driver.
No. That is a much larger task that people are working towards. However, 
we urgently need a scheduler for specific customers to use right now. 
That means we need to get something working right now, not in some random 
number of years' time. If an intermediate step is less than perfect but 
still functional that is better than not having anything at all.

> -Chris
>
On 13-05-16 at 11:16, John Harrison wrote:
> On 13/05/2016 08:39, Chris Wilson wrote:
>> On Thu, May 12, 2016 at 10:06:31PM +0100, John.C.Harrison@Intel.com wrote:
>>> [snip]
>>>   @@ -869,6 +882,7 @@ struct intel_context {
>>>           u64 lrc_desc;
>>>           uint32_t *lrc_reg_state;
>>>           bool initialised;
>>> +        struct i915_fence_timeline fence_timeline;
>> This is still fundamentally wrong. This i915_fence_timeline is both the
>> fence context and timeline. Our timelines are singular per vm, with a
>> fence context under each timeline per engine.
> As I said last time you mentioned 'per vm', please explain. Where does a virtual machine come in? Or is that not what you mean? What is the purpose of having multiple fence contexts within a single timeline? It will stop external users attempting to combine fences but it won't stop them from actually being out of order. The timeline needs to be per hardware context not simply per engine because the whole point is that requests should not be re-ordered once they have been allocated a point on a timeline. However, the purpose of the scheduler (which is what this series is preparation for) is to re-order requests between contexts. Thus the timeline must be per context to prevent requests ending up with out of order completions.

Virtual memory, not machine. Guessing i915_address_space :-)

Fence contexts may not have to be bound to a single timeline, lets say you have 2 engines, running commands for Com1 and Com2.

ENG1: Com1.1
ENG2: Com2.1 Com1.2 Com1.3

then there's no harm if you end up with multiple fence contexts since they're independently executing.

>
>> Please complete the legacy/execlists unification first so that we can
>> have timelines work correctly for the whole driver.
> No. That is a much larger task that people are working towards. However, we urgently need a scheduler for specific customers to use right now. That means we need to get something working right now not in some random number of years time. If an intermediate step is less than perfect but still functional that is better than not having anything at all.
>
>> -Chris
>>
>
> _______________________________________________
> Intel-gfx mailing list
> Intel-gfx@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gfx
On 18/05/2016 13:22, Maarten Lankhorst wrote:
> On 13-05-16 at 11:16, John Harrison wrote:
>> On 13/05/2016 08:39, Chris Wilson wrote:
>>> On Thu, May 12, 2016 at 10:06:31PM +0100, John.C.Harrison@Intel.com wrote:
>>>> [snip]
>>>>    @@ -869,6 +882,7 @@ struct intel_context {
>>>>            u64 lrc_desc;
>>>>            uint32_t *lrc_reg_state;
>>>>            bool initialised;
>>>> +        struct i915_fence_timeline fence_timeline;
>>> This is still fundamentally wrong. This i915_fence_timeline is both the
>>> fence context and timeline. Our timelines are singular per vm, with a
>>> fence context under each timeline per engine.
>> As I said last time you mentioned 'per vm', please explain. Where does a virtual machine come in? Or is that not what you mean? What is the purpose of having multiple fence contexts within a single timeline? It will stop external users attempting to combine fences but it won't stop them from actually being out of order. The timeline needs to be per hardware context not simply per engine because the whole point is that requests should not be re-ordered once they have been allocated a point on a timeline. However, the purpose of the scheduler (which is what this series is preparation for) is to re-order requests between contexts. Thus the timeline must be per context to prevent requests ending up with out of order completions.
> Virtual memory, not machine. Guessing i915_address_space :-)
How do you have a timeline per 'virtual memory'? Guessing is pointless 
when trying to get patches written and accepted. A design is not 
complete when it is based on guesses about what the important 
stakeholders mean.

> Fence contexts may not have to be bound to a single timeline,
Fence contexts absolutely have to be bound to a single timeline. The 
context is used to say whether any two fences can be combined by 
numerical comparison of their timeline points. If they share a context 
but have points generated from different timelines then any such 
comparison is broken. The fence context is the timeline!

>   lets say you have 2 engines, running commands for Com1 and Com2.
>
> ENG1: Com1.1
> ENG2: Com2.1 Com1.2 Com1.3
>
> then there's no harm if you end up with multiple fence contexts since they're independently executing.
Not sure what you are meaning here. If Com1.1 and Com1.2 are sharing 
fence context '1' and have timeline values '1' and '2' then the system 
is broken. The fence framework could merge the two by simply discarding 
Com1.1 because '2' is greater than '1' (timeline value) and '1' is equal 
to '1' (context). Thus a user would end up waiting on only Com1.2, 
however, being executed concurrently on different engines means that 
Com1.2 might actually complete before Com1.1. And hence the user is 
broken as it now thinks everything is finished when Engine1 is still 
running its work.

>>> Please complete the legacy/execlists unification first so that we can
>>> have timelines work correctly for the whole driver.
>> No. That is a much larger task that people are working towards. However, we urgently need a scheduler for specific customers to use right now. That means we need to get something working right now not in some random number of years time. If an intermediate step is less than perfect but still functional that is better than not having anything at all.
>>
>>> -Chris
>>>
>
On 18-05-16 at 14:49, John Harrison wrote:
> On 18/05/2016 13:22, Maarten Lankhorst wrote:
>> On 13-05-16 at 11:16, John Harrison wrote:
>>> On 13/05/2016 08:39, Chris Wilson wrote:
>>>> On Thu, May 12, 2016 at 10:06:31PM +0100, John.C.Harrison@Intel.com wrote:
>>>>> [snip]
>>>>>    @@ -869,6 +882,7 @@ struct intel_context {
>>>>>            u64 lrc_desc;
>>>>>            uint32_t *lrc_reg_state;
>>>>>            bool initialised;
>>>>> +        struct i915_fence_timeline fence_timeline;
>>>> This is still fundamentally wrong. This i915_fence_timeline is both the
>>>> fence context and timeline. Our timelines are singular per vm, with a
>>>> fence context under each timeline per engine.
>>> As I said last time you mentioned 'per vm', please explain. Where does a virtual machine come in? Or is that not what you mean? What is the purpose of having multiple fence contexts within a single timeline? It will stop external users attempting to combine fences but it won't stop them from actually being out of order. The timeline needs to be per hardware context not simply per engine because the whole point is that requests should not be re-ordered once they have been allocated a point on a timeline. However, the purpose of the scheduler (which is what this series is preparation for) is to re-order requests between contexts. Thus the timeline must be per context to prevent requests ending up with out of order completions.
>> Virtual memory, not machine. Guessing i915_address_space :-)
> How do you have a timeline per 'virtual memory'? Guessing is pointless when trying to get patches written and accepted. A design is not complete when it is based on guesses about what the important stakeholders mean.
>
>> Fence contexts may not have to be bound to a single timeline,
> Fence contexts absolutely have to be bound to a single timeline. The context is used to say whether any two fences can be combined by numerical comparison of their timeline points. If they share a context but have points generated from different timelines then any such comparison is broken. The fence context is the timeline!
>
>>   lets say you have 2 engines, running commands for Com1 and Com2.
>>
>> ENG1: Com1.1
>> ENG2: Com2.1 Com1.2 Com1.3
>>
>> then there's no harm if you end up with multiple fence contexts since they're independently executing.
> Not sure what you are meaning here. If Com1.1 and Com1.2 are sharing fence context '1' and have timeline values '1' and '2' then the system is broken. The fence framework could merge the two by simply discarding Com1.1 because '2' is greater than '1' (timeline value) and '1' is equal to '1' (context). Thus a user would end up waiting on only Com1.2, however, being executed concurrently on different engines means that Com1.2 might actually complete before Com1.1. And hence the user is broken as it now thinks everything is finished when Engine1 is still running its work. 
Indeed. Not completely sure what the original comment meant here.