Cast int96 timestamp from parquet to golang











I have this 12-byte array (a Parquet int96 timestamp) that I need to convert to a timestamp:

[128 76 69 116 64 7 0 0 48 131 37 0]

How do I convert it to a timestamp?

My understanding is that the first 8 bytes should be read as an int64 number of milliseconds since the epoch.







      go parquet






edited Nov 2 at 11:37 by dlsniper
asked Nov 1 at 14:53 by ZAky
























          2 Answers
The first 8 bytes are the time in nanoseconds, not milliseconds. They are not measured from the epoch either, but from midnight. The date part is stored separately in the last 4 bytes as a Julian day number.

Here is the result of an experiment I did earlier that may help. I stored '2000-01-01 12:34:56' as an int96 and dumped it with parquet-tools:

$ parquet-tools dump hdfs://path/to/parquet/file | tail -n 1
value 1: R:0 D:1 V:117253024523396126668760320

Since 117253024523396126668760320 = 0x60FD4B3229000059682500, the 12 bytes are 00 60 FD 4B 32 29 00 00 | 59 68 25 00, where | marks the boundary between the time and the date parts.

00 60 FD 4B 32 29 00 00 is the time part. We need to reverse the bytes because int96 timestamps use a reversed (little-endian) byte order, which gives 0x000029324BFD6000 = 45296 * 10^9 nanoseconds = 45296 seconds = 12 hours + 34 minutes + 56 seconds.

59 68 25 00 is the date part. Reversing the bytes gives 0x00256859 = 2451545 as the Julian day number, which corresponds to 2000-01-01.







answered Nov 1 at 15:37 by Zoltan
























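@Zoltan you definitely deserve the vote, although you didn't supply a Golang solution.

Thanks to you and to https://github.com/carlosjhr64/jd

I wrote a function func int96ToJulian(parquetDate []byte) time.Time

    // Requires the "encoding/binary" and "time" packages.
    func int96ToJulian(parquetDate []byte) time.Time {
        // First 8 bytes: nanoseconds since midnight (little-endian).
        nano := binary.LittleEndian.Uint64(parquetDate[:8])
        // Last 4 bytes: Julian day number (little-endian).
        dt := binary.LittleEndian.Uint32(parquetDate[8:])

        // Convert the Julian day number to a Gregorian calendar date
        // using integer arithmetic: i = year, j = month, k = day.
        l := dt + 68569
        n := 4 * l / 146097
        l = l - (146097*n+3)/4
        i := 4000 * (l + 1) / 1461001
        l = l - 1461*i/4 + 31
        j := 80 * l / 2447
        k := l - 2447*j/80
        l = j / 11
        j = j + 2 - 12*l
        i = 100*(n-49) + i + l

        // Midnight of that date in UTC, plus the time of day.
        tm := time.Date(int(i), time.Month(j), int(k), 0, 0, 0, 0, time.UTC)
        return tm.Add(time.Duration(nano))
    }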






edited Nov 5 at 3:44
answered Nov 3 at 17:48 by ZAky












I do not know any Go (which is why I did not provide any code in my answer), but your use of binary.BigEndian above suggests that there is a binary.LittleEndian as well (which I confirmed with a quick Google search). If you used that, you wouldn't have to reverse the bytes manually, since that's exactly what endianness means.
– Zoltan
Nov 4 at 8:09

















