How to implement pagination for Cassandra by using keys?

I'm trying to implement a pagination feature for my app, which uses Cassandra as its backend.

CREATE TABLE sample (
    some_pk int,
    some_id int,
    name1 text,
    name2 text,
    value text,
    PRIMARY KEY (some_pk, some_id, name1, name2)
)
WITH CLUSTERING ORDER BY (some_id DESC);

I want to query 100 records, then store the last record's keys in memory so I can use them later.

+---------+---------+-------+-------+-------+
| some_pk | some_id | name1 | name2 | value |
+---------+---------+-------+-------+-------+
| 1       | 125     | x     | ''    | ''    |
+---------+---------+-------+-------+-------+
| 1       | 124     | a     | ''    | ''    |
+---------+---------+-------+-------+-------+
| 1       | 124     | b     | ''    | ''    |
+---------+---------+-------+-------+-------+
| 1       | 123     | y     | ''    | ''    |
+---------+---------+-------+-------+-------+

(For simplicity, I left some columns empty. The partition key, some_pk, is not important here.)

Let's assume my page size is 2.

select * from sample where some_pk=1 limit 2;

This returns the first 2 rows. Now I store the last record of the query result and run another query to get the next 2 rows.

This is the query that does not work, because of the restriction to a single non-EQ relation:

select * from sample where some_pk=1 and some_id <= 124 and name1 >= 'a' and name2 >= '' limit 2;

And this one returns wrong results, because some_id is in descending order while the name columns are in ascending order:

select * from sample where some_pk=1 and (some_id, name1, name2) <= (124, 'a', '') limit 2;
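To see why one tuple comparison cannot express "everything after the last row" with this clustering order, here is a small in-memory sketch. It assumes the tuple relation is evaluated as a plain lexicographic comparison of values; that is an assumption about the server, not something confirmed in this thread. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class MixedOrderSketch {

    /** Clustering columns of one row of the sample table. */
    static final class Key {
        final int someId;
        final String name1;
        final String name2;

        Key(int someId, String name1, String name2) {
            this.someId = someId;
            this.name1 = name1;
            this.name2 = name2;
        }

        @Override
        public String toString() {
            return someId + "/" + name1;
        }
    }

    /** Rows of partition some_pk = 1, listed in clustering order. */
    static List<Key> sampleRows() {
        List<Key> rows = new ArrayList<>();
        rows.add(new Key(125, "x", ""));
        rows.add(new Key(124, "a", ""));
        rows.add(new Key(124, "b", ""));
        rows.add(new Key(123, "y", ""));
        return rows;
    }

    /** Assumed semantics of the tuple relation: compare by value, every column ascending. */
    static int compareByValue(Key a, Key b) {
        if (a.someId != b.someId) {
            return Integer.compare(a.someId, b.someId);
        }
        int c = a.name1.compareTo(b.name1);
        return c != 0 ? c : a.name2.compareTo(b.name2);
    }

    /** Position in the table: some_id descending, then name1 and name2 ascending. */
    static int compareByClusteringOrder(Key a, Key b) {
        if (a.someId != b.someId) {
            return Integer.compare(b.someId, a.someId);
        }
        int c = a.name1.compareTo(b.name1);
        return c != 0 ? c : a.name2.compareTo(b.name2);
    }

    /** Imitates: (some_id, name1, name2) <= bound LIMIT limit. */
    static List<Key> tupleLessOrEqual(List<Key> rows, Key bound, int limit) {
        List<Key> result = new ArrayList<>();
        for (Key row : rows) {
            if (compareByValue(row, bound) <= 0 && result.size() < limit) {
                result.add(row);
            }
        }
        return result;
    }

    /** What the next page should contain: rows stored after 'last' in the table. */
    static List<Key> afterInClusteringOrder(List<Key> rows, Key last, int limit) {
        List<Key> result = new ArrayList<>();
        for (Key row : rows) {
            if (compareByClusteringOrder(row, last) > 0 && result.size() < limit) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Key last = new Key(124, "a", "");
        System.out.println("tuple relation: " + tupleLessOrEqual(sampleRows(), last, 2));
        System.out.println("wanted page:    " + afterInClusteringOrder(sampleRows(), last, 2));
    }
}
```

Under that assumption the tuple relation keeps the row (124, a), which was already read, and drops (124, b), because 'b' is greater than 'a' even though 124 is within the range.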


So I'm stuck. How can I implement pagination?










Tags: cassandra, datastax-java-driver, cqlsh, spring-data-cassandra

edited Nov 7 at 12:56
asked Nov 7 at 12:07 by Cory

2 Answers

You can run your second query like this:

select * from sample where some_pk = 1 and some_id <= 124 limit x;


After fetching the records, ignore the ones you have already read (you can do this because you stored the last record from the previous select query).

If you end up with an empty list after ignoring those records, you have iterated over all the records; otherwise, keep doing this to page through the rest.
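A minimal sketch of this over-fetch-and-skip idea, using an in-memory list in place of Cassandra. The names `KeysetPagingSketch`, `Row`, `queryFrom`, and `nextPage` are made up for illustration, and `queryFrom` only imitates the select above; it is not a driver call. One assumption worth stating: `x` has to be larger than the page size, because rows that share the last `some_id` and were already read come back again. The sketch simply grows the limit until a full page is left after skipping.

```java
import java.util.ArrayList;
import java.util.List;

class KeysetPagingSketch {

    /** One row of the sample table, reduced to its clustering columns. */
    static final class Row {
        final int someId;
        final String name1;
        final String name2;

        Row(int someId, String name1, String name2) {
            this.someId = someId;
            this.name1 = name1;
            this.name2 = name2;
        }

        @Override
        public String toString() {
            return someId + "/" + name1;
        }
    }

    /** Rows of partition some_pk = 1, already in clustering order. */
    static List<Row> sampleRows() {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(125, "x", ""));
        rows.add(new Row(124, "a", ""));
        rows.add(new Row(124, "b", ""));
        rows.add(new Row(123, "y", ""));
        return rows;
    }

    /** some_id DESC, then name1 ASC, name2 ASC (String order approximates CQL text order). */
    static int compareClustering(Row a, Row b) {
        if (a.someId != b.someId) {
            return Integer.compare(b.someId, a.someId);
        }
        int c = a.name1.compareTo(b.name1);
        return c != 0 ? c : a.name2.compareTo(b.name2);
    }

    /** Imitates: select * from sample where some_pk = 1 and some_id <= maxSomeId limit x. */
    static List<Row> queryFrom(List<Row> partition, int maxSomeId, int limit) {
        List<Row> result = new ArrayList<>();
        for (Row row : partition) {
            if (row.someId <= maxSomeId && result.size() < limit) {
                result.add(row);
            }
        }
        return result;
    }

    /** Returns the page after lastSeen, or the first page when lastSeen is null. */
    static List<Row> nextPage(List<Row> partition, Row lastSeen, int pageSize) {
        if (lastSeen == null) {
            return queryFrom(partition, Integer.MAX_VALUE, pageSize);
        }
        int limit = pageSize + 1; // at least one returned row (lastSeen) will be skipped
        while (true) {
            List<Row> fetched = queryFrom(partition, lastSeen.someId, limit);
            List<Row> page = new ArrayList<>();
            for (Row row : fetched) {
                // ignore rows that were already read on earlier pages
                if (compareClustering(row, lastSeen) > 0 && page.size() < pageSize) {
                    page.add(row);
                }
            }
            // done when the page is full or the partition has no more rows
            if (page.size() == pageSize || fetched.size() < limit) {
                return page;
            }
            limit *= 2; // too many already-read rows came back: fetch more
        }
    }

    public static void main(String[] args) {
        List<Row> partition = sampleRows();
        Row lastSeen = null;
        while (true) {
            List<Row> page = nextPage(partition, lastSeen, 2);
            if (page.isEmpty()) {
                break; // an empty page means everything has been read
            }
            System.out.println(page);
            lastSeen = page.get(page.size() - 1);
        }
    }
}
```

Comparing against the stored key, rather than skipping a fixed number of rows, keeps the skip step correct even if rows were inserted between two page requests.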







• This is approximately how paging is implemented in the drivers (see github.com/datastax/java-driver/tree/3.x/manual/paging, datastax.github.io/python-driver/query_paging.html). If you don't want to implement this yourself, you can just use those driver functions. – Justin Cameron, Nov 9 at 0:10



















You don't have to store any keys in memory, and you don't need to use LIMIT in your CQL query. Just use the paging capabilities of the DataStax driver in your application code, like this:

import com.datastax.driver.core.PagingState;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;
import java.util.Iterator;

// Response is assumed to be an application class, and session an open driver Session.
public Response getFromCassandra(Integer itemsPerPage, String pageIndex) {
    Response response = new Response();
    String query = "select * from sample where some_pk=1";
    // set the number of items we want per page (fetch size)
    Statement statement = new SimpleStatement(query).setFetchSize(itemsPerPage);
    // imagine page "0" indicates the first page, so if pageIndex is "0" there is no paging state
    if (!pageIndex.equals("0")) {
        statement.setPagingState(PagingState.fromString(pageIndex));
    }
    ResultSet rows = session.execute(statement); // execute the query
    // rows already fetched; this should be at most fetchSize (itemsPerPage)
    int numberOfRows = rows.getAvailableWithoutFetching();
    Iterator<Row> iterator = rows.iterator();
    while (numberOfRows-- > 0) {
        response.getRows().add(iterator.next());
    }
    PagingState pagingState = rows.getExecutionInfo().getPagingState();
    if (pagingState != null) { // there are still pages remaining
        response.setNextPageIndex(pagingState.toString());
    }
    return response;
}

Note that if you write the while loop like this instead:

while (iterator.hasNext()) {
    response.getRows().add(iterator.next());
}

the driver will first fetch as many rows as the fetch size we set, and then, as long as the query still matches rows in Cassandra, it will keep fetching until it has retrieved every matching row. That may not be what you want when implementing a pagination feature.
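The difference between the two loops can be imitated without a cluster. The sketch below is a simulation only: `FakeResultSet` is a hypothetical stand-in written for this example, not a driver class, and its `fetchCount` field counts pretend round trips to the server.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.NoSuchElementException;

class AutoPagingSketch {

    /** Hypothetical stand-in for a result set that fetches pages lazily. */
    static final class FakeResultSet {
        private final List<String> allRows;  // everything the query matches on the "server"
        private final int fetchSize;
        private final Deque<String> buffered = new ArrayDeque<>();
        private int nextOffset = 0;
        int fetchCount = 0;                  // simulated round trips so far

        FakeResultSet(List<String> allRows, int fetchSize) {
            this.allRows = allRows;
            this.fetchSize = fetchSize;
            fetchNextPage();                 // executing the query returns the first page
        }

        private void fetchNextPage() {
            int end = Math.min(nextOffset + fetchSize, allRows.size());
            buffered.addAll(allRows.subList(nextOffset, end));
            nextOffset = end;
            fetchCount++;
        }

        int availableWithoutFetching() {
            return buffered.size();
        }

        boolean hasNext() {
            if (buffered.isEmpty() && nextOffset < allRows.size()) {
                fetchNextPage();             // silently goes back for another page
            }
            return !buffered.isEmpty();
        }

        String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return buffered.poll();
        }
    }

    /** Counter-based loop: consumes only what the first fetch delivered. */
    static List<String> readOnePage(FakeResultSet rs) {
        List<String> page = new ArrayList<>();
        int remaining = rs.availableWithoutFetching();
        while (remaining-- > 0) {
            page.add(rs.next());
        }
        return page;
    }

    /** hasNext() loop: keeps fetching until every matching row has been read. */
    static List<String> readEverything(FakeResultSet rs) {
        List<String> rows = new ArrayList<>();
        while (rs.hasNext()) {
            rows.add(rs.next());
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> data = List.of("r1", "r2", "r3", "r4", "r5");

        FakeResultSet first = new FakeResultSet(data, 2);
        System.out.println(readOnePage(first) + " fetches=" + first.fetchCount);

        FakeResultSet second = new FakeResultSet(data, 2);
        System.out.println(readEverything(second) + " fetches=" + second.fetchCount);
    }
}
```

With five matching rows and a fetch size of 2, the counter-based loop stops after the first fetch, while the `hasNext()` loop triggers two more.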



Source: https://docs.datastax.com/en/developer/java-driver/3.2/manual/paging/






First answer: answered Nov 7 at 14:27 by Raj Parekh

Second answer: answered Nov 12 at 4:42 by Mis94
