iommu/amd: flush IOTLB for specific domains only (v2)

Submitted by Nath, Arindam on May 19, 2017, 10:02 a.m.

Details

Message ID 1495188151-14358-1-git-send-email-arindam.nath@amd.com
State New
Series "amd-iommu: can't boot with amdgpu, AMD-Vi: Completion-Wait loop timed out" ( rev: 3 ) in AMD X.Org drivers


Commit Message

Nath, Arindam May 19, 2017, 10:02 a.m.
From: Arindam Nath <arindam.nath@amd.com>

Change History
--------------

v2: changes suggested by Joerg
- add flush flag to improve efficiency of flush operation

v1:
- The idea behind flush queues is to defer the IOTLB flushing
  for domains for which the mappings are no longer valid. We
  add such domains in queue_add(), and when the queue size
  reaches FLUSH_QUEUE_SIZE, we perform __queue_flush().

  Since we have already taken the lock before __queue_flush()
  is called, we need to make sure the IOTLB flushing is
  performed as quickly as possible.

  In the current implementation, we perform IOTLB flushing
  for all domains, irrespective of which ones were actually
  added to the flush queue. This can be quite expensive,
  especially for domains for which unmapping is not required
  at this point.

  This patch makes use of the domain information in
  'struct flush_queue_entry' to make sure we only flush
  IOTLBs for domains that need it, skipping the others
  (a toy model of the pattern is sketched below).
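
  For illustration only, here is a minimal userspace toy model of
  the deferred-flush pattern described above. It is not the kernel
  code: the names only loosely mirror the patch, the queue is global
  rather than per-CPU, and all locking is omitted.

  #include <stdbool.h>
  #include <stdio.h>

  #define FLUSH_QUEUE_SIZE 4

  /* stand-in for 'struct protection_domain' */
  struct domain {
  	int id;
  	bool already_flushed;
  };

  /* stand-in for 'struct flush_queue_entry' */
  struct flush_entry {
  	struct domain *dom;
  };

  static struct flush_entry queue[FLUSH_QUEUE_SIZE];
  static int queue_next;

  static void domain_flush_tlb(struct domain *d)
  {
  	printf("flush IOTLB of domain %d\n", d->id);
  }

  static void __queue_flush(void)
  {
  	int idx;

  	/* Visit only the queued entries; the flag skips repeats. */
  	for (idx = 0; idx < queue_next; ++idx) {
  		struct domain *d = queue[idx].dom;

  		if (!d->already_flushed) {
  			d->already_flushed = true;
  			domain_flush_tlb(d);
  		}
  	}
  	queue_next = 0;
  }

  static void queue_add(struct domain *d)
  {
  	/* re-arm the flag so the next flush pass picks this domain up */
  	d->already_flushed = false;
  	queue[queue_next++].dom = d;
  	if (queue_next == FLUSH_QUEUE_SIZE)
  		__queue_flush();
  }

  int main(void)
  {
  	struct domain a = { .id = 1 }, b = { .id = 2 };

  	/* Four queued entries but two distinct domains: the flush
  	   pass issues exactly two flushes, not one per entry and
  	   not one per known domain in the system. */
  	queue_add(&a);
  	queue_add(&b);
  	queue_add(&a);
  	queue_add(&b);
  	return 0;
  }

  In the real driver the queue is per-CPU, entries are added under
  queue->lock, and the flag lives in 'struct protection_domain'; the
  sketch leaves all of that out.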

Suggested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Arindam Nath <arindam.nath@amd.com>
---
 drivers/iommu/amd_iommu.c       | 27 ++++++++++++++++++++-------
 drivers/iommu/amd_iommu_types.h |  2 ++
 2 files changed, 22 insertions(+), 7 deletions(-)

Patch

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 63cacf5..1edeebec 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -2227,15 +2227,26 @@  static struct iommu_group *amd_iommu_device_group(struct device *dev)
 
 static void __queue_flush(struct flush_queue *queue)
 {
-	struct protection_domain *domain;
-	unsigned long flags;
 	int idx;
 
-	/* First flush TLB of all known domains */
-	spin_lock_irqsave(&amd_iommu_pd_lock, flags);
-	list_for_each_entry(domain, &amd_iommu_pd_list, list)
-		domain_flush_tlb(domain);
-	spin_unlock_irqrestore(&amd_iommu_pd_lock, flags);
+	/* First flush TLB of all domains which were added to flush queue */
+	for (idx = 0; idx < queue->next; ++idx) {
+		struct flush_queue_entry *entry;
+
+		entry = queue->entries + idx;
+
+		/*
+		 * There might be cases where multiple IOVA entries for the
+		 * same domain are queued in the flush queue. To avoid
+		 * flushing the same domain again, we check whether the
+		 * flag is set or not. This improves the efficiency of
+		 * flush operation.
+		 */
+		if (!entry->dma_dom->domain.already_flushed) {
+			entry->dma_dom->domain.already_flushed = true;
+			domain_flush_tlb(&entry->dma_dom->domain);
+		}
+	}
 
 	/* Wait until flushes have completed */
 	domain_flush_complete(NULL);
@@ -2289,6 +2300,8 @@  static void queue_add(struct dma_ops_domain *dma_dom,
 	pages     = __roundup_pow_of_two(pages);
 	address >>= PAGE_SHIFT;
 
+	dma_dom->domain.already_flushed = false;
+
 	queue = get_cpu_ptr(&flush_queue);
 	spin_lock_irqsave(&queue->lock, flags);
 
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 4de8f41..4f5519d 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -454,6 +454,8 @@  struct protection_domain {
 	bool updated;		/* complete domain flush required */
 	unsigned dev_cnt;	/* devices assigned to this domain */
 	unsigned dev_iommu[MAX_IOMMUS]; /* per-IOMMU reference count */
+	bool already_flushed;	/* flag to avoid flushing the same domain again
+				   in a single invocation of __queue_flush() */
 };
 
 /*

Comments

On Fri, 2017-05-19 at 15:32 +0530, arindam.nath@amd.com wrote:
> From: Arindam Nath <arindam.nath@amd.com>
> [...]

Hi,

just a note, the patch also fixes "AMD-Vi: Completion-Wait loop timed
out" boot hang on my system (carrizo + iceland) [0,1,2]. Presumably the
old loop also tried to flush domains that included powered-off devices.

regards,
Jan

[0] https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/issues/20
[1] https://bugs.freedesktop.org/show_bug.cgi?id=101029
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1448121

>-----Original Message-----
>From: Jan Vesely [mailto:jv356@scarletmail.rutgers.edu] On Behalf Of Jan Vesely
>Sent: Friday, May 19, 2017 7:06 PM
>To: Nath, Arindam; iommu@lists.linux-foundation.org
>Cc: Bridgman, John; Joerg Roedel; amd-gfx@lists.freedesktop.org; drake@endlessm.com; Craig Stein; Suthikulpanit, Suravee; Deucher, Alexander; Kuehling, Felix; linux@endlessm.com; michel@daenzer.net
>Subject: Re: [PATCH] iommu/amd: flush IOTLB for specific domains only (v2)
>

>On Fri, 2017-05-19 at 15:32 +0530, arindam.nath@amd.com wrote:
>> From: Arindam Nath <arindam.nath@amd.com>
>>
>> Change History
>> --------------
>>
>> v2: changes suggested by Joerg
>> - add flush flag to improve efficiency of flush operation

Joerg, do you have any comments on the patch?

Thanks,
Arindam

On 19/05/17 07:02 PM, arindam.nath@amd.com wrote:
> From: Arindam Nath <arindam.nath@amd.com>
>
> [... v2 change history and patch snipped ...]
>
> Suggested-by: Joerg Roedel <joro@8bytes.org>
> Signed-off-by: Arindam Nath <arindam.nath@amd.com>

Please add these tags:

Fixes: b1516a14657a ("iommu/amd: Implement flush queue")
Cc: stable@vger.kernel.org
On 22/05/17 10:08 AM, Michel Dänzer wrote:
> On 19/05/17 07:02 PM, arindam.nath@amd.com wrote:
>> From: Arindam Nath <arindam.nath@amd.com>
>> [...]
> 
> Please add these tags:
> 
> Fixes: b1516a14657a ("iommu/amd: Implement flush queue")
> Cc: stable@vger.kernel.org

Also

Bugzilla: https://bugs.freedesktop.org/101029
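
Folding in the tags requested in this thread, the trailer block of the
commit message would then read roughly as follows (one plausible
ordering; final placement is the maintainer's call):

Fixes: b1516a14657a ("iommu/amd: Implement flush queue")
Bugzilla: https://bugs.freedesktop.org/101029
Suggested-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Arindam Nath <arindam.nath@amd.com>
Cc: stable@vger.kernel.org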