[RFC] mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables

Submitted by Ralph Campbell on June 8, 2019, 12:14 a.m.

Details

Message ID 20190608001452.7922-1-rcampbell@nvidia.com
State New
Headers show
Series "mm/hmm: pass mmu_notifier_range to sync_cpu_device_pagetables" ( rev: 1 ) in AMD X.Org drivers

Not browsing as part of any series.

Commit Message

Ralph Campbell June 8, 2019, 12:14 a.m.
HMM defines its own struct hmm_update which is passed to the
sync_cpu_device_pagetables() callback function. This is
sufficient when the only action is to invalidate. However,
a device may want to know the reason for the invalidation and
be able to see the new permissions on a range, update device access
rights or range statistics. Since sync_cpu_device_pagetables()
can be called from try_to_unmap(), the mmap_sem may not be held
and find_vma() is not safe to be called.
Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
to allow the full invalidation information to be used.

Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
---

I'm sending this out now since we are updating many of the HMM APIs
and I think it will be useful.


 drivers/gpu/drm/nouveau/nouveau_svm.c |  4 ++--
 include/linux/hmm.h                   | 27 ++-------------------------
 mm/hmm.c                              | 13 ++++---------
 3 files changed, 8 insertions(+), 36 deletions(-)

Patch hide | download patch | download mbox

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 8c92374afcf2..c34b98fafe2f 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -252,13 +252,13 @@  nouveau_svmm_invalidate(struct nouveau_svmm *svmm, u64 start, u64 limit)
 
 static int
 nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
-					const struct hmm_update *update)
+					const struct mmu_notifier_range *update)
 {
 	struct nouveau_svmm *svmm = container_of(mirror, typeof(*svmm), mirror);
 	unsigned long start = update->start;
 	unsigned long limit = update->end;
 
-	if (!update->blockable)
+	if (!mmu_notifier_range_blockable(update))
 		return -EAGAIN;
 
 	SVMM_DBG(svmm, "invalidate %016lx-%016lx", start, limit);
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 0fa8ea34ccef..07a2d38fde34 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -377,29 +377,6 @@  static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
 
 struct hmm_mirror;
 
-/*
- * enum hmm_update_event - type of update
- * @HMM_UPDATE_INVALIDATE: invalidate range (no indication as to why)
- */
-enum hmm_update_event {
-	HMM_UPDATE_INVALIDATE,
-};
-
-/*
- * struct hmm_update - HMM update information for callback
- *
- * @start: virtual start address of the range to update
- * @end: virtual end address of the range to update
- * @event: event triggering the update (what is happening)
- * @blockable: can the callback block/sleep ?
- */
-struct hmm_update {
-	unsigned long start;
-	unsigned long end;
-	enum hmm_update_event event;
-	bool blockable;
-};
-
 /*
  * struct hmm_mirror_ops - HMM mirror device operations callback
  *
@@ -420,7 +397,7 @@  struct hmm_mirror_ops {
 	/* sync_cpu_device_pagetables() - synchronize page tables
 	 *
 	 * @mirror: pointer to struct hmm_mirror
-	 * @update: update information (see struct hmm_update)
+	 * @update: update information (see struct mmu_notifier_range)
 	 * Return: -EAGAIN if update.blockable false and callback need to
 	 *          block, 0 otherwise.
 	 *
@@ -434,7 +411,7 @@  struct hmm_mirror_ops {
 	 * synchronous call.
 	 */
 	int (*sync_cpu_device_pagetables)(struct hmm_mirror *mirror,
-					  const struct hmm_update *update);
+				const struct mmu_notifier_range *update);
 };
 
 /*
diff --git a/mm/hmm.c b/mm/hmm.c
index 9aad3550f2bb..b49a43712554 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -164,7 +164,6 @@  static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
-	struct hmm_update update;
 	struct hmm_range *range;
 	unsigned long flags;
 	int ret = 0;
@@ -173,15 +172,10 @@  static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	if (!kref_get_unless_zero(&hmm->kref))
 		return 0;
 
-	update.start = nrange->start;
-	update.end = nrange->end;
-	update.event = HMM_UPDATE_INVALIDATE;
-	update.blockable = mmu_notifier_range_blockable(nrange);
-
 	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	hmm->notifiers++;
 	list_for_each_entry(range, &hmm->ranges, list) {
-		if (update.end < range->start || update.start >= range->end)
+		if (nrange->end < range->start || nrange->start >= range->end)
 			continue;
 
 		range->valid = false;
@@ -198,9 +192,10 @@  static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	list_for_each_entry(mirror, &hmm->mirrors, list) {
 		int rc;
 
-		rc = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
+		rc = mirror->ops->sync_cpu_device_pagetables(mirror, nrange);
 		if (rc) {
-			if (WARN_ON(update.blockable || rc != -EAGAIN))
+			if (WARN_ON(mmu_notifier_range_blockable(nrange) ||
+				    rc != -EAGAIN))
 				continue;
 			ret = -EAGAIN;
 			break;

Comments

On Sat, Jun 08, 2019 at 02:10:08AM -0700, Christoph Hellwig wrote:
> On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> > HMM defines its own struct hmm_update which is passed to the
> > sync_cpu_device_pagetables() callback function. This is
> > sufficient when the only action is to invalidate. However,
> > a device may want to know the reason for the invalidation and
> > be able to see the new permissions on a range, update device access
> > rights or range statistics. Since sync_cpu_device_pagetables()
> > can be called from try_to_unmap(), the mmap_sem may not be held
> > and find_vma() is not safe to be called.
> > Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> > to allow the full invalidation information to be used.
> > 
> > Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> > 
> > I'm sending this out now since we are updating many of the HMM APIs
> > and I think it will be useful.
> 
> This is the right thing to do.  But the really right thing is to just
> kill the hmm_mirror API entirely and move to mmu_notifiers.  At least
> for noveau this already is way simpler, although right now it defeats
> Jasons patch to avoid allocating the struct hmm in the fault path.
> But as said before that can be avoided by just killing struct hmm,
> which for many reasons is the right thing to do anyway.
> 
> I've got a series here, which is a bit broken (epecially the last
> patch can't work as-is), but should explain where I'm trying to head:
> 
> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/hmm-mirror-simplification

At least the current hmm approach does rely on the collision retry
locking scheme in struct hmm/struct hmm_range for the pagefault side
to work right.

So, before we can apply patch one in this series we need to fix
hmm_vma_fault() and all its varients. Otherwise the driver will be
broken.

I'm hoping to first define what this locking should be (see other
emails to Ralph) then, ideally, see if we can extend mmu notifiers to
get it directly withouth hmm stuff.

Then we apply your patch one and the hmm ops wrapper dies.

Jason
On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> HMM defines its own struct hmm_update which is passed to the
> sync_cpu_device_pagetables() callback function. This is
> sufficient when the only action is to invalidate. However,
> a device may want to know the reason for the invalidation and
> be able to see the new permissions on a range, update device access
> rights or range statistics. Since sync_cpu_device_pagetables()
> can be called from try_to_unmap(), the mmap_sem may not be held
> and find_vma() is not safe to be called.
> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> to allow the full invalidation information to be used.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> 
> I'm sending this out now since we are updating many of the HMM APIs
> and I think it will be useful.

I agree with CH that struct hmm_update seems particularly pointless
and we really should just use mmu_notifier_range directly.

We need to find out from the DRM folks if we can merge this kind of
stuff through hmm.git and then resolve any conflicts that might arise
in DRM tree or in nouveau tree?

But I would like to see this patch go in this cycle, thanks

Jason
On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> HMM defines its own struct hmm_update which is passed to the
> sync_cpu_device_pagetables() callback function. This is
> sufficient when the only action is to invalidate. However,
> a device may want to know the reason for the invalidation and
> be able to see the new permissions on a range, update device access
> rights or range statistics. Since sync_cpu_device_pagetables()
> can be called from try_to_unmap(), the mmap_sem may not be held
> and find_vma() is not safe to be called.
> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> to allow the full invalidation information to be used.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>

I don't disagree with Christoph or Jason but since I've been trying to sort out
where hmm does and does not fit any chance to remove a custom structure is a
good simplification IMO.  So...

Reviewed-by: Ira Weiny <ira.weiny@intel.com>

> ---
> 
> I'm sending this out now since we are updating many of the HMM APIs
> and I think it will be useful.
> 
> 
>  drivers/gpu/drm/nouveau/nouveau_svm.c |  4 ++--
>  include/linux/hmm.h                   | 27 ++-------------------------
>  mm/hmm.c                              | 13 ++++---------
>  3 files changed, 8 insertions(+), 36 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
> index 8c92374afcf2..c34b98fafe2f 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_svm.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
> @@ -252,13 +252,13 @@ nouveau_svmm_invalidate(struct nouveau_svmm *svmm, u64 start, u64 limit)
>  
>  static int
>  nouveau_svmm_sync_cpu_device_pagetables(struct hmm_mirror *mirror,
> -					const struct hmm_update *update)
> +					const struct mmu_notifier_range *update)
>  {
>  	struct nouveau_svmm *svmm = container_of(mirror, typeof(*svmm), mirror);
>  	unsigned long start = update->start;
>  	unsigned long limit = update->end;
>  
> -	if (!update->blockable)
> +	if (!mmu_notifier_range_blockable(update))
>  		return -EAGAIN;
>  
>  	SVMM_DBG(svmm, "invalidate %016lx-%016lx", start, limit);
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index 0fa8ea34ccef..07a2d38fde34 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -377,29 +377,6 @@ static inline uint64_t hmm_pfn_from_pfn(const struct hmm_range *range,
>  
>  struct hmm_mirror;
>  
> -/*
> - * enum hmm_update_event - type of update
> - * @HMM_UPDATE_INVALIDATE: invalidate range (no indication as to why)
> - */
> -enum hmm_update_event {
> -	HMM_UPDATE_INVALIDATE,
> -};
> -
> -/*
> - * struct hmm_update - HMM update information for callback
> - *
> - * @start: virtual start address of the range to update
> - * @end: virtual end address of the range to update
> - * @event: event triggering the update (what is happening)
> - * @blockable: can the callback block/sleep ?
> - */
> -struct hmm_update {
> -	unsigned long start;
> -	unsigned long end;
> -	enum hmm_update_event event;
> -	bool blockable;
> -};
> -
>  /*
>   * struct hmm_mirror_ops - HMM mirror device operations callback
>   *
> @@ -420,7 +397,7 @@ struct hmm_mirror_ops {
>  	/* sync_cpu_device_pagetables() - synchronize page tables
>  	 *
>  	 * @mirror: pointer to struct hmm_mirror
> -	 * @update: update information (see struct hmm_update)
> +	 * @update: update information (see struct mmu_notifier_range)
>  	 * Return: -EAGAIN if update.blockable false and callback need to
>  	 *          block, 0 otherwise.
>  	 *
> @@ -434,7 +411,7 @@ struct hmm_mirror_ops {
>  	 * synchronous call.
>  	 */
>  	int (*sync_cpu_device_pagetables)(struct hmm_mirror *mirror,
> -					  const struct hmm_update *update);
> +				const struct mmu_notifier_range *update);
>  };
>  
>  /*
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 9aad3550f2bb..b49a43712554 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -164,7 +164,6 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
>  {
>  	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
>  	struct hmm_mirror *mirror;
> -	struct hmm_update update;
>  	struct hmm_range *range;
>  	unsigned long flags;
>  	int ret = 0;
> @@ -173,15 +172,10 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
>  	if (!kref_get_unless_zero(&hmm->kref))
>  		return 0;
>  
> -	update.start = nrange->start;
> -	update.end = nrange->end;
> -	update.event = HMM_UPDATE_INVALIDATE;
> -	update.blockable = mmu_notifier_range_blockable(nrange);
> -
>  	spin_lock_irqsave(&hmm->ranges_lock, flags);
>  	hmm->notifiers++;
>  	list_for_each_entry(range, &hmm->ranges, list) {
> -		if (update.end < range->start || update.start >= range->end)
> +		if (nrange->end < range->start || nrange->start >= range->end)
>  			continue;
>  
>  		range->valid = false;
> @@ -198,9 +192,10 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
>  	list_for_each_entry(mirror, &hmm->mirrors, list) {
>  		int rc;
>  
> -		rc = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
> +		rc = mirror->ops->sync_cpu_device_pagetables(mirror, nrange);
>  		if (rc) {
> -			if (WARN_ON(update.blockable || rc != -EAGAIN))
> +			if (WARN_ON(mmu_notifier_range_blockable(nrange) ||
> +				    rc != -EAGAIN))
>  				continue;
>  			ret = -EAGAIN;
>  			break;
> -- 
> 2.20.1
>
On 6/8/19 4:41 AM, Jason Gunthorpe wrote:
> On Sat, Jun 08, 2019 at 02:10:08AM -0700, Christoph Hellwig wrote:
>> On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
>>> HMM defines its own struct hmm_update which is passed to the
>>> sync_cpu_device_pagetables() callback function. This is
>>> sufficient when the only action is to invalidate. However,
>>> a device may want to know the reason for the invalidation and
>>> be able to see the new permissions on a range, update device access
>>> rights or range statistics. Since sync_cpu_device_pagetables()
>>> can be called from try_to_unmap(), the mmap_sem may not be held
>>> and find_vma() is not safe to be called.
>>> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
>>> to allow the full invalidation information to be used.
>>>
>>> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
>>>
>>> I'm sending this out now since we are updating many of the HMM APIs
>>> and I think it will be useful.
>>
>> This is the right thing to do.  But the really right thing is to just
>> kill the hmm_mirror API entirely and move to mmu_notifiers.  At least
>> for noveau this already is way simpler, although right now it defeats
>> Jasons patch to avoid allocating the struct hmm in the fault path.
>> But as said before that can be avoided by just killing struct hmm,
>> which for many reasons is the right thing to do anyway.
>>
>> I've got a series here, which is a bit broken (epecially the last
>> patch can't work as-is), but should explain where I'm trying to head:
>>
>> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/hmm-mirror-simplification
> 
> At least the current hmm approach does rely on the collision retry
> locking scheme in struct hmm/struct hmm_range for the pagefault side
> to work right.
> 
> So, before we can apply patch one in this series we need to fix
> hmm_vma_fault() and all its varients. Otherwise the driver will be
> broken.
> 
> I'm hoping to first define what this locking should be (see other
> emails to Ralph) then, ideally, see if we can extend mmu notifiers to
> get it directly withouth hmm stuff.
> 
> Then we apply your patch one and the hmm ops wrapper dies.
> 

This all makes sense, and thanks for all this work to simplify and clarify
HMM. It's going to make it a lot easier to work with, when the dust settles.

thanks,
On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
> HMM defines its own struct hmm_update which is passed to the
> sync_cpu_device_pagetables() callback function. This is
> sufficient when the only action is to invalidate. However,
> a device may want to know the reason for the invalidation and
> be able to see the new permissions on a range, update device access
> rights or range statistics. Since sync_cpu_device_pagetables()
> can be called from try_to_unmap(), the mmap_sem may not be held
> and find_vma() is not safe to be called.
> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
> to allow the full invalidation information to be used.
> 
> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
> ---
> 
> I'm sending this out now since we are updating many of the HMM APIs
> and I think it will be useful.

This make so much sense, I'd like to apply this in hmm.git, is there
any objection?

Jason
On 7/2/19 12:53 PM, Jason Gunthorpe wrote:
> On Fri, Jun 07, 2019 at 05:14:52PM -0700, Ralph Campbell wrote:
>> HMM defines its own struct hmm_update which is passed to the
>> sync_cpu_device_pagetables() callback function. This is
>> sufficient when the only action is to invalidate. However,
>> a device may want to know the reason for the invalidation and
>> be able to see the new permissions on a range, update device access
>> rights or range statistics. Since sync_cpu_device_pagetables()
>> can be called from try_to_unmap(), the mmap_sem may not be held
>> and find_vma() is not safe to be called.
>> Pass the struct mmu_notifier_range to sync_cpu_device_pagetables()
>> to allow the full invalidation information to be used.
>>
>> Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
>> ---
>>
>> I'm sending this out now since we are updating many of the HMM APIs
>> and I think it will be useful.
> 
> This make so much sense, I'd like to apply this in hmm.git, is there
> any objection?
> 
> Jason
> 
Not from me. :-)

Thanks!
On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
> > I'm sending this out now since we are updating many of the HMM APIs
> > and I think it will be useful.
> 
> This make so much sense, I'd like to apply this in hmm.git, is there
> any objection?

As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
a better idea to wait a bit and apply it first thing for next merge
window?
On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:
> On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
> > > I'm sending this out now since we are updating many of the HMM APIs
> > > and I think it will be useful.
> > 
> > This make so much sense, I'd like to apply this in hmm.git, is there
> > any objection?
> 
> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> a better idea to wait a bit and apply it first thing for next merge
> window?

My thinking is that AMD GPU already has a monster conflict from this:

 int hmm_range_register(struct hmm_range *range,
-                      struct mm_struct *mm,
+                      struct hmm_mirror *mirror,
                       unsigned long start,
                       unsigned long end,
                       unsigned page_shift);

So, depending on how that is resolved we might want to do both API
changes at once.

Or we may have to revert the above change at this late date.

Waiting for AMDGPU team to discuss what process they want to use.

Jason
On Tue, Jul 02, 2019 at 10:59:16PM +0000, Jason Gunthorpe wrote:
> > As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> > a better idea to wait a bit and apply it first thing for next merge
> > window?
> 
> My thinking is that AMD GPU already has a monster conflict from this:
> 
>  int hmm_range_register(struct hmm_range *range,
> -                      struct mm_struct *mm,
> +                      struct hmm_mirror *mirror,
>                        unsigned long start,
>                        unsigned long end,
>                        unsigned page_shift);

Well, that seems like a relatively easy to fix conflict, at least as
long as you have the mirror easily available.  The notifier change
on the other hand basically requires rewriting about two dozen lines
of code entirely.
On 2019-07-02 6:59 p.m., Jason Gunthorpe wrote:
> On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:

>> On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:

>>>> I'm sending this out now since we are updating many of the HMM APIs

>>>> and I think it will be useful.

>>> This make so much sense, I'd like to apply this in hmm.git, is there

>>> any objection?

>> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be

>> a better idea to wait a bit and apply it first thing for next merge

>> window?

> My thinking is that AMD GPU already has a monster conflict from this:

>

>   int hmm_range_register(struct hmm_range *range,

> -                      struct mm_struct *mm,

> +                      struct hmm_mirror *mirror,

>                         unsigned long start,

>                         unsigned long end,

>                         unsigned page_shift);

>

> So, depending on how that is resolved we might want to do both API

> changes at once.


I just sent out a fix for the hmm_mirror API change.


>

> Or we may have to revert the above change at this late date.

>

> Waiting for AMDGPU team to discuss what process they want to use.


Yeah, I'm wondering what the process is myself. With HMM and driver 
development happening on different branches these kinds of API changes 
are painful. There seems to be a built-in assumption in the current 
process, that code flows mostly in one direction amd-staging-drm-next -> 
drm-next -> linux-next -> linux. That assumption is broken with HMM code 
evolving rapidly in both amdgpu and mm.

If we want to continue developing HMM driver changes in 
amd-staging-drm-next, we'll need to synchronize with hmm.git more 
frequently, both ways. I believe part of the problem is, that there is a 
fairly long lead-time from getting changes from amd-staging-drm-next 
into linux-next, as they are held for one release cycle in drm-next. 
Pushing HMM-related changes through drm-fixes may offer a kind of 
shortcut. Philip and my latest fixup is just bypassing drm-next 
completely and going straight into linux-next, though.

Regards,
   Felix


>

> Jason
On Wed, Jul 03, 2019 at 02:27:22AM +0000, Kuehling, Felix wrote:
> On 2019-07-02 6:59 p.m., Jason Gunthorpe wrote:
> > On Wed, Jul 03, 2019 at 12:49:12AM +0200, Christoph Hellwig wrote:
> >> On Tue, Jul 02, 2019 at 07:53:23PM +0000, Jason Gunthorpe wrote:
> >>>> I'm sending this out now since we are updating many of the HMM APIs
> >>>> and I think it will be useful.
> >>> This make so much sense, I'd like to apply this in hmm.git, is there
> >>> any objection?
> >> As this creates a somewhat hairy conflict for amdgpu, wouldn't it be
> >> a better idea to wait a bit and apply it first thing for next merge
> >> window?
> > My thinking is that AMD GPU already has a monster conflict from this:
> >
> >   int hmm_range_register(struct hmm_range *range,
> > -                      struct mm_struct *mm,
> > +                      struct hmm_mirror *mirror,
> >                         unsigned long start,
> >                         unsigned long end,
> >                         unsigned page_shift);
> >
> > So, depending on how that is resolved we might want to do both API
> > changes at once.
> 
> I just sent out a fix for the hmm_mirror API change.

I think if you follow my suggestion to apply a prep patch to AMD GPU
to make the conflict resolution simple, we should defer this patch
until next kernel for the reasons CH gave.

> > Or we may have to revert the above change at this late date.
> >
> > Waiting for AMDGPU team to discuss what process they want to use.
> 
> Yeah, I'm wondering what the process is myself. With HMM and driver 
> development happening on different branches these kinds of API changes 
> are painful. There seems to be a built-in assumption in the current 
> process, that code flows mostly in one direction amd-staging-drm-next -> 
> drm-next -> linux-next -> linux. That assumption is broken with HMM code 
> evolving rapidly in both amdgpu and mm.

It looks to me like AMD GPU uses a pull request model. So a goal as a
tree runner should be to work with the other trees (ie hmm.git, etc)
to minimize conflicts between the PR you will send and the PR other
trees will send.

Do not focus on linux-next, that is just an 'early warning system'
that conflicts are on the horizon, we knew about this one :) (well,
mostly, I was surprised how big it was, my bad)

So we must stay in co-ordination with patches in-flight on the list
and make the right decision, depending on the situation. Communication
here is key :)

We have lots of strategies available to deal with these situations.

> If we want to continue developing HMM driver changes in
> amd-staging-drm-next, we'll need to synchronize with hmm.git more 
> frequently, both ways.

It can't really go both ways. hmm.git has to be only the hmm topic,
otherwise it doesn't really work.

> I believe part of the problem is, that there is a fairly long
> lead-time from getting changes from amd-staging-drm-next into
> linux-next, as they are held for one release cycle in drm-next.
> Pushing HMM-related changes through drm-fixes may offer a kind of
> shortcut. Philip and my latest fixup is just bypassing drm-next
> completely and going straight into linux-next, though.

I'm not so familiar with the DRM work flow to give you advice on this.

Jason