Mmap DMA memory uncached: “map pfn ram range req uncached-minus got write-back”











I am mapping DMA coherent memory from the kernel to user space. In user space I call mmap(); in the kernel driver I call dma_alloc_coherent() and then remap_pfn_range() to map the pages. This basically works: I can write data to the mapped area in my app and verify it in my kernel driver.



However, despite using dma_alloc_coherent() (which I expected to allocate uncached memory) and pgprot_noncached(), the kernel prints this message in dmesg:




map pfn ram range req uncached-minus for [mem 0xABC-0xCBA], got write-back




In my understanding, write-back is cached memory. But I need uncached memory for the DMA operation.



The Code (only showing the important parts):



User App



fd = open(dev_fn, O_RDWR | O_SYNC);
if (fd >= 0)  /* open() returns -1 on failure; 0 is a valid descriptor */
{
    mem = mmap(NULL, mmap_len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
}


For testing purposes I used mmap_len = getpagesize(), which is 4096.
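The user-space side above does not check the result of mmap(). A minimal sketch of the same open-and-map sequence with error handling could look like this; map_device() is a hypothetical helper name, not something from my driver or from any library:

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: open dev_fn and map len bytes of it, shared and
 * read/write. Returns the mapping, or NULL if open() or mmap() fails. */
void *map_device(const char *dev_fn, size_t len)
{
    assert(dev_fn != NULL);

    int fd = open(dev_fn, O_RDWR | O_SYNC);
    if (fd < 0)
        return NULL;

    void *mem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    /* The mapping keeps its own reference to the open file,
     * so the descriptor can be closed right away. */
    close(fd);

    /* mmap() reports failure with MAP_FAILED, not NULL. */
    return (mem == MAP_FAILED) ? NULL : mem;
}
```

A caller would use it as mem = map_device(dev_fn, getpagesize()); and release the mapping later with munmap(mem, getpagesize()).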



Kernel Driver



typedef struct
{
    size_t      mem_size;
    dma_addr_t  dma_addr;
    void       *cpu_addr;
} Dma_Priv;

/* dma_priv and gen_dev are set up elsewhere in the driver (not shown) */
static int fops_mmap(struct file *filp, struct vm_area_struct *vma)
{
    dma_priv->mem_size = vma->vm_end - vma->vm_start;

    dma_priv->cpu_addr = dma_alloc_coherent(&gen_dev,
                                            dma_priv->mem_size,
                                            &dma_priv->dma_addr,
                                            GFP_KERNEL);
    if (dma_priv->cpu_addr == NULL)
        return -ENOMEM;

    /* request an uncached user mapping, then map the buffer's pages */
    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
    return remap_pfn_range(vma,
                           vma->vm_start,
                           virt_to_phys(dma_priv->cpu_addr) >> PAGE_SHIFT,
                           dma_priv->mem_size,
                           vma->vm_page_prot);
}


Useful information I've found



1) PATting Linux: https://www.kernel.org/doc/ols/2008/ols2008v2-pages-135-144.pdf



Page 7 --> mmap with O_SYNC (uncached):




Applications can open /dev/mem with the O_SYNC flag and then do mmap
on it. With that, applications will be accessing that address with an
uncached memory type. mmap will succeed only if there is no other
conflicting mappings to the same region.




I used the flag, but it doesn't help.



Page 7 --> mmap without O_SYNC (uncached-minus):




mmap without O_SYNC, no existing mapping, and not a write-back region:
For an mmap that comes under this category, we use uncached-minus type
mapping. In the absence of any MTRR for this region, the effective
type will be uncached. But in cases where there is an MTRR, making
this region write-combine, then the effective type will be
write-combine.
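To make the rule in this quote concrete, here is a small user-space model of how a page-attribute request combines with an MTRR. It is only a sketch: the enum and function names are made up for illustration. The quote covers only the "no MTRR" and "write-combine MTRR" cases; the write-back case is my assumption based on the PAT/MTRR combining table in the Intel SDM, where UC- over a write-back MTRR still gives uncached.

```c
#include <assert.h>

/* Memory types, named after the terms used in the quoted text. */
enum mem_type { MT_UC, MT_WC, MT_WB };

/* Page-attribute requests a mapping can carry (only the two discussed here). */
enum pat_req { PAT_UC, PAT_UC_MINUS };

/* has_mtrr == 0 models "absence of any MTRR for this region";
 * in that case the mtrr argument is ignored. */
enum mem_type effective_type(enum pat_req req, int has_mtrr, enum mem_type mtrr)
{
    assert(mtrr == MT_UC || mtrr == MT_WC || mtrr == MT_WB);

    /* Strong UC ignores MTRRs; UC- yields only to a write-combine MTRR.
     * Assumption (Intel SDM): every other MTRR type leaves UC- uncached. */
    if (req == PAT_UC_MINUS && has_mtrr && mtrr == MT_WC)
        return MT_WC;
    return MT_UC;
}
```

For example, effective_type(PAT_UC_MINUS, 1, MT_WC) gives MT_WC, while effective_type(PAT_UC_MINUS, 1, MT_WB) gives MT_UC. Under these rules a UC- request never combines with an MTRR into write-back.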




2) pgprot_noncached()



In /arch/x86/include/asm/pgtable.h I found this:



#define pgprot_noncached(prot)                                  \
    ((boot_cpu_data.x86 > 3)                                    \
     ? (__pgprot(pgprot_val(prot) |                             \
                 cachemode2protval(_PAGE_CACHE_MODE_UC_MINUS))) \
     : (prot))


Is it possible that x86 always turns a noncached request into UC_MINUS, which, in combination with an MTRR, results in a cached write-back mapping?



I am using Ubuntu 16.04.1, Kernel: 4.10.0-40-generic.



EDIT: SOLVED



https://www.kernel.org/doc/Documentation/x86/pat.txt




Drivers wanting to export some pages to userspace do it by using mmap
interface and a combination of 1) pgprot_noncached() 2)
io_remap_pfn_range() or remap_pfn_range() or vmf_insert_pfn()



With PAT support, a new API pgprot_writecombine is being added. So,
drivers can continue to use the above sequence, with either
pgprot_noncached() or pgprot_writecombine() in step 1, followed by
step 2.



In addition, step 2 internally tracks the region as UC or WC in
memtype list in order to ensure no conflicting mapping.



Note that this set of APIs only works with IO (non RAM) regions. If
driver wants to export a RAM region, it has to do set_memory_uc() or
set_memory_wc() as step 0 above and also track the usage of those
pages and use set_memory_wb() before the page is freed to free pool.




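I added set_memory_uc() before pgprot_noncached(), and that solved it. As the quoted text says, the pages also have to be switched back with set_memory_wb() before they are freed.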



if (dma_priv->cpu_addr != NULL)
{
    /* step 0: mark the RAM pages uncached;
     * set_memory_uc() takes an unsigned long address and a page count */
    set_memory_uc((unsigned long)dma_priv->cpu_addr,
                  dma_priv->mem_size / PAGE_SIZE);
    vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
    remap_pfn_range(vma,
                    vma->vm_start,
                    virt_to_phys(dma_priv->cpu_addr) >> PAGE_SHIFT,
                    dma_priv->mem_size,
                    vma->vm_page_prot);
}









      linux linux-kernel linux-device-driver mmap














      edited Nov 8 at 10:43

























      asked Nov 7 at 19:20









      Gbo
