[Mesa-dev,2/2] RFC: radeon/compute: Limit allocations for VRAM-based chips to 3/4 VRAM

Submitted by Aaron Watry on June 7, 2017, 2:10 p.m.

Details

Message ID CAM+GqJY4QEt6iJ1XYvpqviZQ8ynbmYpTb23=DouWutA3X8shPA@mail.gmail.com
State New
Series "Series without cover letter" ( rev: 2 ) in Mesa


Commit Message

Aaron Watry June 7, 2017, 2:10 p.m.
On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák <maraeo@gmail.com> wrote:
> Hi Aaron,
>
> Can you make the change in radeon_drm_winsys.c instead?

Something like the following?

         ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);

--Aaron



>
> Thanks,
> Marek
>
> On Mon, Jun 5, 2017 at 2:32 AM, Aaron Watry <awatry@gmail.com> wrote:
>> The CL CTS queries the max allocation size, and then attempts to
>> allocate buffers of that size. If any of the VRAM is in use, this
>> causes errors in the radeon kernel module.
>>
>> It's a bit of a hack, but experimentally on my system, I can use 3/4
>> of the card's VRAM for a single global/constant buffer allocation given
>> current GUI/compositor use.
>>
>> If there's a way to get the actual amount of free VRAM, I'd love to hear about it.
>>
>> Also, I'm unsure if the radeon kernel module requires all allocated memory to be
>> contiguous. If so, then we'd need to be able to get at that value. I'm suspecting
>> that's not actually the case.
>>
>> For a 1GB Pitcairn (HD7850) this gets me from the reported clinfo values of:
>> Global memory size                              2143076352 (1.996GiB)
>> Max memory allocation                           1500153446 (1.397GiB)
>> Max constant buffer size                        1500153446 (1.397GiB)
>>
>> To:
>> Global memory size                              2143076352 (1.996GiB)
>> Max memory allocation                           805306368 (768MiB)
>> Max constant buffer size                        805306368 (768MiB)
>>
>> Fixes: OpenCL CTS test/conformance/api/min_max_mem_alloc_size,
>>        OpenCL CTS test/conformance/api/min_max_constant_buffer_size
>>
>> Signed-off-by: Aaron Watry <awatry@gmail.com>
>> ---
>>  src/gallium/drivers/radeon/r600_pipe_common.c | 17 +++++++++++++++--
>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
>> index 2c0cadb030..cdd4062fd3 100644
>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>> @@ -1144,8 +1144,21 @@ static int r600_get_compute_param(struct pipe_screen *screen,
>>                 if (ret) {
>>                         uint64_t *max_mem_alloc_size = ret;
>>
>> -                       *max_mem_alloc_size = rscreen->info.max_alloc_size;
>> -               }
>> +                       uint64_t max_alloc = rscreen->info.max_alloc_size;
>> +
>> +                       if (rscreen->info.has_dedicated_vram) {
>> +                               /* XXX: Hack to prevent system hangs...
>> +                                * Limit to 3/4 VRAM for any single allocation.
>> +                                * Prevents:
>> +                                *     radeon: Not enough memory for command submission.
>> +                                */
>> +                               *max_mem_alloc_size = MIN2(
>> +                                       rscreen->info.vram_size * 3 / 4, max_alloc
>> +                               );
>> +                       } else {
>> +                               *max_mem_alloc_size = max_alloc;
>> +                       }
>> +        }
>>                 return sizeof(uint64_t);
>>
>>         case PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY:
>> --
>> 2.11.0
>>
>> _______________________________________________
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
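For reference, the clinfo figures quoted in the original commit message can be reproduced with a few lines of C. This is only a sketch: it assumes the card exposes exactly 1 GiB of VRAM and that the reported global memory size (2143076352) is the value the existing 70% rule is applied to. The helper name is made up for illustration and is not a Mesa function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper mirroring the logic of the first version of the patch:
 * on chips with dedicated VRAM, cap a single allocation at 3/4 of VRAM;
 * otherwise report the winsys limit unchanged. */
static uint64_t cl_max_mem_alloc(uint64_t vram_size, uint64_t max_alloc_size,
                                 bool has_dedicated_vram)
{
    assert(vram_size > 0);
    if (has_dedicated_vram)
        return MIN2(vram_size * 3 / 4, max_alloc_size);
    return max_alloc_size;
}
```

With vram_size = 1073741824 and max_alloc_size = 1500153446, the helper gives 805306368 (768 MiB), matching the "To:" values above. The "before" value 1500153446 is 2143076352 * 0.7 truncated to an integer.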

Patch

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
index a485615ae4..44948f49ef 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
@@ -365,6 +365,8 @@  static bool do_winsys_init(struct radeon_drm_winsys *ws)
     /* Radeon allocates all buffers as contigous, which makes large allocations
      * unlikely to succeed. */
     ws->info.max_alloc_size = MAX2(ws->info.vram_size, ws->info.gart_size) * 0.7;
+    if (ws->info.has_dedicated_vram)
+        ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7, ws->info.max_alloc_size);
     if (ws->info.drm_minor < 40)

Comments

On Wed, Jun 7, 2017 at 4:10 PM, Aaron Watry <awatry@gmail.com> wrote:
> On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák <maraeo@gmail.com> wrote:
>> Hi Aaron,
>>
>> Can you make the change in radeon_drm_winsys.c instead?
>
> Something like the following?
>
> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> index a485615ae4..44948f49ef 100644
> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
> @@ -365,6 +365,8 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
>      /* Radeon allocates all buffers as contigous, which makes large allocations
>       * unlikely to succeed. */
>      ws->info.max_alloc_size = MAX2(ws->info.vram_size,
> ws->info.gart_size) * 0.7;
> +    if (ws->info.has_dedicated_vram)
> +        ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7,
> ws->info.max_alloc_size);
>      if (ws->info.drm_minor < 40)
>          ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);

Yes, feel free to push that.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

Marek


>
> --Aaron
>
>
>
>>
>> Thanks,
>> Marek
>>
>> On Mon, Jun 5, 2017 at 2:32 AM, Aaron Watry <awatry@gmail.com> wrote:
>>> The CL CTS queries the max allocation size, and then attempts to
>>> allocate buffers of that size. If any of the VRAM is in use, this
>>> causes errors in the radeon kernel module.
>>>
>>> It's a bit of a hack, but experimentally on my system, I can use 3/4
>>> of the card's VRAM for a single global/constant buffer allocation given
>>> current GUI/compositor use.
>>>
>>> If there's a way to get the actual amount of free VRAM, I'd love to hear about it.
>>>
>>> Also, I'm unsure if the radeon kernel module requires all allocated memory to be
>>> contiguous. If so, then we'd need to be able to get at that value. I'm suspecting
>>> that's not actually the case.
>>>
>>> For a 1GB Pitcairn (HD7850) this gets me from the reported clinfo values of:
>>> Global memory size                              2143076352 (1.996GiB)
>>> Max memory allocation                           1500153446 (1.397GiB)
>>> Max constant buffer size                        1500153446 (1.397GiB)
>>>
>>> To:
>>> Global memory size                              2143076352 (1.996GiB)
>>> Max memory allocation                           805306368 (768MiB)
>>> Max constant buffer size                        805306368 (768MiB)
>>>
>>> Fixes: OpenCL CTS test/conformance/api/min_max_mem_alloc_size,
>>>        OpenCL CTS test/conformance/api/min_max_constant_buffer_size
>>>
>>> Signed-off-by: Aaron Watry <awatry@gmail.com>
>>> ---
>>>  src/gallium/drivers/radeon/r600_pipe_common.c | 17 +++++++++++++++--
>>>  1 file changed, 15 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c b/src/gallium/drivers/radeon/r600_pipe_common.c
>>> index 2c0cadb030..cdd4062fd3 100644
>>> --- a/src/gallium/drivers/radeon/r600_pipe_common.c
>>> +++ b/src/gallium/drivers/radeon/r600_pipe_common.c
>>> @@ -1144,8 +1144,21 @@ static int r600_get_compute_param(struct pipe_screen *screen,
>>>                 if (ret) {
>>>                         uint64_t *max_mem_alloc_size = ret;
>>>
>>> -                       *max_mem_alloc_size = rscreen->info.max_alloc_size;
>>> -               }
>>> +                       uint64_t max_alloc = rscreen->info.max_alloc_size;
>>> +
>>> +                       if (rscreen->info.has_dedicated_vram) {
>>> +                               /* XXX: Hack to prevent system hangs...
>>> +                                * Limit to 3/4 VRAM for any single allocation.
>>> +                                * Prevents:
>>> +                                *     radeon: Not enough memory for command submission.
>>> +                                */
>>> +                               *max_mem_alloc_size = MIN2(
>>> +                                       rscreen->info.vram_size * 3 / 4, max_alloc
>>> +                               );
>>> +                       } else {
>>> +                               *max_mem_alloc_size = max_alloc;
>>> +                       }
>>> +        }
>>>                 return sizeof(uint64_t);
>>>
>>>         case PIPE_COMPUTE_CAP_MAX_CLOCK_FREQUENCY:
>>> --
>>> 2.11.0
>>>
>>> _______________________________________________
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

On 08/06/17 03:42 AM, Marek Olšák wrote:
> On Wed, Jun 7, 2017 at 4:10 PM, Aaron Watry <awatry@gmail.com> wrote:
>> On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák <maraeo@gmail.com> wrote:
>>>
>>> Can you make the change in radeon_drm_winsys.c instead?
>>
>> Something like the following?
>>
>> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>> index a485615ae4..44948f49ef 100644
>> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>> @@ -365,6 +365,8 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
>>      /* Radeon allocates all buffers as contigous, which makes large allocations
>>       * unlikely to succeed. */
>>      ws->info.max_alloc_size = MAX2(ws->info.vram_size,
>> ws->info.gart_size) * 0.7;
>> +    if (ws->info.has_dedicated_vram)
>> +        ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7,
>> ws->info.max_alloc_size);
>>      if (ws->info.drm_minor < 40)
>>          ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);
> 
> Yes, feel free to push that.

That also affects PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE, is that intended?

On Wed, Jun 7, 2017 at 9:15 PM, Michel Dänzer <michel@daenzer.net> wrote:
> On 08/06/17 03:42 AM, Marek Olšák wrote:
>> On Wed, Jun 7, 2017 at 4:10 PM, Aaron Watry <awatry@gmail.com> wrote:
>>> On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák <maraeo@gmail.com> wrote:
>>>>
>>>> Can you make the change in radeon_drm_winsys.c instead?
>>>
>>> Something like the following?
>>>
>>> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>> index a485615ae4..44948f49ef 100644
>>> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>> @@ -365,6 +365,8 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
>>>      /* Radeon allocates all buffers as contigous, which makes large allocations
>>>       * unlikely to succeed. */
>>>      ws->info.max_alloc_size = MAX2(ws->info.vram_size,
>>> ws->info.gart_size) * 0.7;
>>> +    if (ws->info.has_dedicated_vram)
>>> +        ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7,
>>> ws->info.max_alloc_size);
>>>      if (ws->info.drm_minor < 40)
>>>          ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);
>>
>> Yes, feel free to push that.
>
> That also affects PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE, is that intended?

Not necessarily.

Part of the reason that I had originally put this in
r600_pipe_common.c under the compute params was that I didn't feel
comfortable changing this for all workload types. There's evidence
that implies that the closed-source AMD CL runtime limits global
allocations to either 256MB or 1/4 VRAM (on a 1GB card), so 70% of the
max of GART/VRAM seems a bit high for us to report. I'll probably
check around a bit and see what the prevailing limits seem to be and
if lowering the absolute max might make sense here (for compute loads
only), as a failure to allocate the requested amount of memory seems
to result in system hangs shortly thereafter, and I'd like to get the
frequency of those occurrences down a bit.

--Aaron



> --
> Earthling Michel Dänzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer

On 08.06.2017 04:15, Michel Dänzer wrote:
> On 08/06/17 03:42 AM, Marek Olšák wrote:
>> On Wed, Jun 7, 2017 at 4:10 PM, Aaron Watry <awatry@gmail.com> wrote:
>>> On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák <maraeo@gmail.com> wrote:
>>>>
>>>> Can you make the change in radeon_drm_winsys.c instead?
>>>
>>> Something like the following?
>>>
>>> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>> index a485615ae4..44948f49ef 100644
>>> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>> @@ -365,6 +365,8 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
>>>       /* Radeon allocates all buffers as contigous, which makes large allocations
>>>        * unlikely to succeed. */
>>>       ws->info.max_alloc_size = MAX2(ws->info.vram_size,
>>> ws->info.gart_size) * 0.7;
>>> +    if (ws->info.has_dedicated_vram)
>>> +        ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7,
>>> ws->info.max_alloc_size);
>>>       if (ws->info.drm_minor < 40)
>>>           ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);
>>
>> Yes, feel free to push that.
> 
> That also affects PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE, is that intended?

Yes, that's actually good. We want to prevent applications from 
allocating buffers that are so large that we're likely to fail to put 
them in VRAM.

Cheers,
Nicolai

On Wed, Jun 7, 2017 at 11:12 PM, Aaron Watry <awatry@gmail.com> wrote:
> On Wed, Jun 7, 2017 at 9:15 PM, Michel Dänzer <michel@daenzer.net> wrote:
>> On 08/06/17 03:42 AM, Marek Olšák wrote:
>>> On Wed, Jun 7, 2017 at 4:10 PM, Aaron Watry <awatry@gmail.com> wrote:
>>>> On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák <maraeo@gmail.com> wrote:
>>>>>
>>>>> Can you make the change in radeon_drm_winsys.c instead?
>>>>
>>>> Something like the following?
>>>>
>>>> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>> index a485615ae4..44948f49ef 100644
>>>> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>> @@ -365,6 +365,8 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
>>>>      /* Radeon allocates all buffers as contigous, which makes large allocations
>>>>       * unlikely to succeed. */
>>>>      ws->info.max_alloc_size = MAX2(ws->info.vram_size,
>>>> ws->info.gart_size) * 0.7;
>>>> +    if (ws->info.has_dedicated_vram)
>>>> +        ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7,
>>>> ws->info.max_alloc_size);
>>>>      if (ws->info.drm_minor < 40)
>>>>          ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);
>>>
>>> Yes, feel free to push that.
>>
>> That also affects PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE, is that intended?
>
> Not necessarily.
>
> Part of the reason that I had originally put this in
> r600_pipe_common.c under the compute params was that I didn't feel
> comfortable changing this for all workload types. There's evidence
> that implies that the closed-source AMD CL runtime limits global
> allocations to either 256MB or 1/4 VRAM (on a 1GB card), so 70% of the
> max of GART/VRAM seems a bit high for us to report. I'll probably
> check around a bit and see what the prevailing limits seem to be and
> if lowering the absolute max might make sense here (for compute loads
> only), as a failure to allocate the requested amount of memory seems
> to result in system hangs shortly thereafter, and I'd like to get the
> frequency of those occurrences down a bit.

At least in Windows 10 using the AMD binary CL runtime, it reports
global memory size of 2GB and max allocation of 1GB for the 1GB card
that I've got.  Whether that's being calculated as max allocation =
VRAM size, or 50% of global memory size, is unknown. I'm not sure if
you can easily adjust the GART size in Windows. So my original theory
of 1/4 VRAM seems to be limited to other cards or older drivers/OSes.

Given that Marek/Nicolai want to stick this in radeon_drm_winsys.c,
I'm ok with putting it there.  I think it still makes sense to limit
the max allocation to a percentage of VRAM when the card has its own
memory available for the reasons already mentioned by Nicolai. Whether
70% is a good number is another question, but one thing at a time.

Any objections, Michel, or were you raising the point that it affects
the texture allocation sizes just to make sure we were aware?

--Aaron
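
The ambiguity described above can be shown with a little arithmetic. The sketch below assumes the "1GB" and "2GB" figures mean exactly 1 GiB of VRAM and 2 GiB of global memory. The function names are invented for this comparison and do not correspond to anything in the AMD runtime or in Mesa.

```c
#include <assert.h>
#include <stdint.h>

#define GiB ((uint64_t)1 << 30)
#define MiB ((uint64_t)1 << 20)

/* Hypothesis 1: max allocation equals the VRAM size. */
static uint64_t policy_vram(uint64_t vram, uint64_t global_mem)
{
    (void)global_mem;
    return vram;
}

/* Hypothesis 2: max allocation is half of the global memory size. */
static uint64_t policy_half_global(uint64_t vram, uint64_t global_mem)
{
    (void)vram;
    return global_mem / 2;
}

/* Earlier theory: max allocation is a quarter of VRAM. */
static uint64_t policy_quarter_vram(uint64_t vram, uint64_t global_mem)
{
    (void)global_mem;
    return vram / 4;
}
```

For vram = 1 * GiB and global_mem = 2 * GiB, the first two policies both give 1 GiB, so this single data point cannot tell them apart. The quarter-of-VRAM policy gives 256 * MiB, which is why "256MB" and "1/4 VRAM" were indistinguishable on a 1GB card.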

>
> --Aaron
>
>
>
>> --
>> Earthling Michel Dänzer               |               http://www.amd.com
>> Libre software enthusiast             |             Mesa and X developer

On 10/06/17 12:43 AM, Aaron Watry wrote:
> On Wed, Jun 7, 2017 at 11:12 PM, Aaron Watry <awatry@gmail.com> wrote:
>> On Wed, Jun 7, 2017 at 9:15 PM, Michel Dänzer <michel@daenzer.net> wrote:
>>> On 08/06/17 03:42 AM, Marek Olšák wrote:
>>>> On Wed, Jun 7, 2017 at 4:10 PM, Aaron Watry <awatry@gmail.com> wrote:
>>>>> On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák <maraeo@gmail.com> wrote:
>>>>>>
>>>>>> Can you make the change in radeon_drm_winsys.c instead?
>>>>>
>>>>> Something like the following?
>>>>>
>>>>> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>>> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>>> index a485615ae4..44948f49ef 100644
>>>>> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>>> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>>> @@ -365,6 +365,8 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
>>>>>      /* Radeon allocates all buffers as contigous, which makes large allocations
>>>>>       * unlikely to succeed. */
>>>>>      ws->info.max_alloc_size = MAX2(ws->info.vram_size,
>>>>> ws->info.gart_size) * 0.7;
>>>>> +    if (ws->info.has_dedicated_vram)
>>>>> +        ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7,
>>>>> ws->info.max_alloc_size);
>>>>>      if (ws->info.drm_minor < 40)
>>>>>          ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);
>>>>
>>>> Yes, feel free to push that.
>>>
>>> That also affects PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE, is that intended?
>>
>> Not necessarily.
>>
>> Part of the reason that I had originally put this in
>> r600_pipe_common.c under the compute params was that I didn't feel
>> comfortable changing this for all workload types. There's evidence
>> that implies that the closed-source AMD CL runtime limits global
>> allocations to either 256MB or 1/4 VRAM (on a 1GB card), so 70% of the
>> max of GART/VRAM seems a bit high for us to report. I'll probably
>> check around a bit and see what the prevailing limits seem to be and
>> if lowering the absolute max might make sense here (for compute loads
>> only), as a failure to allocate the requested amount of memory seems
>> to result in system hangs shortly thereafter, and I'd like to get the
>> frequency of those occurrences down a bit.
> 
> At least in Windows 10 using the AMD binary CL runtime, it reports
> global memory size of 2GB and max allocation of 1GB for the 1GB card
> that I've got.  Whether that's being calculated as max allocation =
> VRAM-size, or 50% of global memory size is an unknown. I'm not sure if
> you can easily adjust the gart size in windows. So my original theory
> of 1/4 VRAM seems to be limited to other cards or older drivers/OSes.
> 
> Given that Marek/Nicolai want to stick this in radeon_drm_winsys.c,
> I'm ok with putting it there.  I think it still makes sense to limit
> the max allocation to a percentage of VRAM when the card has its own
> memory available for the reasons already mentioned by Nicolai. Whether
> 70% is a good number is another question, but one thing at a time.
> 
> Any objections Michel, or were you just raising the point that it
> affected the texture allocation sizes just to make sure we were aware?

Right, just wanted to make sure everybody's aware.

On Mon, Jun 12, 2017 at 5:19 AM, Michel Dänzer <michel@daenzer.net> wrote:
> On 10/06/17 12:43 AM, Aaron Watry wrote:
>> On Wed, Jun 7, 2017 at 11:12 PM, Aaron Watry <awatry@gmail.com> wrote:
>>> On Wed, Jun 7, 2017 at 9:15 PM, Michel Dänzer <michel@daenzer.net> wrote:
>>>> On 08/06/17 03:42 AM, Marek Olšák wrote:
>>>>> On Wed, Jun 7, 2017 at 4:10 PM, Aaron Watry <awatry@gmail.com> wrote:
>>>>>> On Mon, Jun 5, 2017 at 3:07 PM, Marek Olšák <maraeo@gmail.com> wrote:
>>>>>>>
>>>>>>> Can you make the change in radeon_drm_winsys.c instead?
>>>>>>
>>>>>> Something like the following?
>>>>>>
>>>>>> diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>>>> b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>>>> index a485615ae4..44948f49ef 100644
>>>>>> --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>>>> +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c
>>>>>> @@ -365,6 +365,8 @@ static bool do_winsys_init(struct radeon_drm_winsys *ws)
>>>>>>      /* Radeon allocates all buffers as contigous, which makes large allocations
>>>>>>       * unlikely to succeed. */
>>>>>>      ws->info.max_alloc_size = MAX2(ws->info.vram_size,
>>>>>> ws->info.gart_size) * 0.7;
>>>>>> +    if (ws->info.has_dedicated_vram)
>>>>>> +        ws->info.max_alloc_size = MIN2(ws->info.vram_size * 0.7,
>>>>>> ws->info.max_alloc_size);
>>>>>>      if (ws->info.drm_minor < 40)
>>>>>>          ws->info.max_alloc_size = MIN2(ws->info.max_alloc_size, 256*1024*1024);
>>>>>
>>>>> Yes, feel free to push that.
>>>>
>>>> That also affects PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE, is that intended?
>>>
>>> Not necessarily.
>>>
>>> Part of the reason that I had originally put this in
>>> r600_pipe_common.c under the compute params was that I didn't feel
>>> comfortable changing this for all workload types. There's evidence
>>> that implies that the closed-source AMD CL runtime limits global
>>> allocations to either 256MB or 1/4 VRAM (on a 1GB card), so 70% of the
>>> max of GART/VRAM seems a bit high for us to report. I'll probably
>>> check around a bit and see what the prevailing limits seem to be and
>>> if lowering the absolute max might make sense here (for compute loads
>>> only), as a failure to allocate the requested amount of memory seems
>>> to result in system hangs shortly thereafter, and I'd like to get the
>>> frequency of those occurrences down a bit.
>>
>> At least in Windows 10 using the AMD binary CL runtime, it reports
>> global memory size of 2GB and max allocation of 1GB for the 1GB card
>> that I've got.  Whether that's being calculated as max allocation =
>> VRAM-size, or 50% of global memory size is an unknown. I'm not sure if
>> you can easily adjust the gart size in windows. So my original theory
>> of 1/4 VRAM seems to be limited to other cards or older drivers/OSes.
>>
>> Given that Marek/Nicolai want to stick this in radeon_drm_winsys.c,
>> I'm ok with putting it there.  I think it still makes sense to limit
>> the max allocation to a percentage of VRAM when the card has its own
>> memory available for the reasons already mentioned by Nicolai. Whether
>> 70% is a good number is another question, but one thing at a time.
>>
>> Any objections Michel, or were you just raising the point that it
>> affected the texture allocation sizes just to make sure we were aware?
>
> Right, just wanted to make sure everybody's aware.

GL_MAX_TEXTURE_BUFFER_SIZE can be as low as 64K.

Marek
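
As a rough illustration of why the lower limit is harmless for texture buffers: the OpenGL specification only requires GL_MAX_TEXTURE_BUFFER_SIZE to be at least 65536. The sketch below assumes the driver derives the reported value by clamping max_alloc_size to the range of int. That derivation, and both function names, are assumptions made for this example. The GL limit is also counted in texels rather than bytes, so this is only an order-of-magnitude check.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

/* Smallest GL_MAX_TEXTURE_BUFFER_SIZE an implementation may report. */
#define GL_SPEC_MIN_TEXTURE_BUFFER_SIZE 65536

/* Assumption: the reported cap is max_alloc_size clamped to INT_MAX. */
static int sketch_texture_buffer_cap(uint64_t max_alloc_size)
{
    return (int)MIN2(max_alloc_size, (uint64_t)INT_MAX);
}

/* Returns nonzero if the resulting cap still meets the spec minimum. */
static int sketch_cap_meets_gl_minimum(uint64_t max_alloc_size)
{
    return sketch_texture_buffer_cap(max_alloc_size) >=
           GL_SPEC_MIN_TEXTURE_BUFFER_SIZE;
}
```

Even 70% of a 1 GiB card (751619276) is several orders of magnitude above the required minimum.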